Wednesday 26 November 2014

Wait-free synchronization

Paper here

This is a seminal paper from 1991 that changed the field of concurrency. Wait-free synchronization represented at the time a qualitative break with the traditional locking-based techniques for implementing concurrent objects.

A wait-free implementation of a concurrent data object is one that guarantees that any process can complete any operation in a finite number of steps, regardless the execution speed of other processes.

First, lets define what is a concurrent object. A concurrent object is a data structure shared by concurrent processes. The traditional approach to implement such objects centers around the use of critical sections, where only one process at a time is allowed to operate on the object. At the time, the use of locks was the only way to guarantee safe access to objects when dealing with concurrency. Critical sections are poorly suited for asynchronous, fault-tolerant systems: if a faulty process is halted or delayed in a critical section, non-faulty processes will also be unable to progress. If one thread attempts to acquire a lock that is already held by another thread, the thread will block until the lock is free.

The wait-free condition provides fault-tolerance: no process can be prevented from completing an operation by undetected halting failures of other processes, or by arbitrary variations in their speed.

The paper enumerates a list of shared objects that can be used to solve the consensus. In the table below we have a column of consensus number that indicates the maximum number of processes for which the object can solve a simple consensus problem, and the object column contains the synchronization primitives and objects that are supported.

Here it is another definition: The consensus number for object X is the largest n for which X solves consensus among n processes. If no largest n exists, the consensus number is said to be infinite. Also, in a system of n or more concurrent processes, it is impossible to construct a wait-free implementation of an object with consensus number n from an object with lower consensus number. E.g., it is possible to have consensus between 2 concurrent processes with a queue or a stack object that support test&set, swap, or fetch&add operations.

Consensus number Object
1 r/w registers
2 test&set, swap, fetch&add, queue, stack
... ...
2n-2 n-register assignment
... ...
memory-to-memory move and swap, augmented queue,
compare&swap, fetch&cons, sticky byte

It is impossible to implement consensus with atomic read and write (r/w) registers. Therefore, it is impossible to build wait-free implementation of common data type such as sets, queues, stacks, priority queues, or lists, and it is impossible to have most, if not all, classical synchronization primitives with registers, such as test&set, compare&swap, and fetch&add. E.g., if we have two concurrent processes P and Q that try to write in register r, both processes must first check if the register is already written or not. Both processes can check the variable at the same time before writing. Thus, P and Q will write in the same register and the register will be bivalent - both processes think that the register have their value, but just one process got it right. This is what is expected: at the end of consensus, both concurrent processes must have the same value. Also, it is impossible to construct a wait-free implementation of any object with consensus number greater than 1 using atomic r/w registers.

The test&set, compare&swap, and fetch&add operations are only possible with read-modify-write registers, or any other object with consensus number equal or bigger than 2.

Impossibility results

One fundamental problem of wait-free synchronization can be phrased as follows:

Given two concurrent objects X and Y, does there exist a wait-free implementation of X by Y?

To see that an object X with consensus number m is not possible to construct a wait-free implementation of any object Y with a consensus number n where n>m, we must look to the impossibility results.

Informally, a consensus protocol is a system of n processes that communicate through a set of shared objects {X1, . . . . , Xn}. Each processes start with an input value from some domain; they communicate with one another by applying operations to the shared objects; and they eventually agree (decide function) on a common input value and halt. A consensus protocol is required to be:

  1. consistent: distinct processes never decide on distinct values;
  2. wait-free: each process decides after a finite number of steps;
  3. valid: the common decision value is the input to some process.

It is an immediate consequence of the definitions that if Y implements X, and X solves n-process consensus, then Y also solves n-process consensus. Therefore, there is no object X with a consensus number m that can be used to implement a wait-free algorithm for Y that has a consensus number n, where n>m.

As I said before about atomic r/w registers, it is not possible to have consensus with this type of object, because all processes can overwrite inadvertently each other value. Thus, it is not possible to have consensus in another object with n>1 using atomic r/w registers.

Read-Modify-Write registers

I asserted that it is not possible to have an wait-free implementation with atomic r/w registers. But what about with read-modify-write registers (RMW)?

Many of the synchronization primitives can be expressed as RMW operations, which both read a memory location and write a new value into it simultaneously, either with a completely new value or some function of the previous value. These operations prevent race conditions in multi-threaded applications and they are used to build mutexes and semaphores. Read-modify-write instructions often produce unexpected results when used on I/O devices, as a write operation may not affect the same internal register that would be accessed in a read operation.

Let me describe here RMW and decide functions:

RMW(r: register, f:function) returns(value)
    previous := r
    r := f(r)
    return previous
end RMW


decide(input: value) returns(value)
   prefer[P] := input
   if RMW(r,f) = v
      then return prefer[P]
      else return prefer[Q]
   end if
end decide

Looking to the decide function, and supposing that there are two concurrent processes P and Q trying to write in register r, if the input that is suggested is the one returned from RMW, then it means that P has written first. Otherwise, Q was the first one to write in the register. The decide function prevents concurrent access to the register.

So, in what differs atomic r/w registers from RMW registers? What differs is the use of the function RMW and decide functions when dealing with registers. Simple atomic r/w registers are not enough to have consensus in a wait-free implementation.

Although read-modify-write registers are more powerful than atomic read/write registers, many common read-modify-write operations are still computationally weak. In particular, one cannot construct a wait-free solution to three process consensus using registers that support any combination of read, write, test&set, swap, and fetch&add operations. Therefore, there is no wait-free solution to three-process consensus using any combination of read-modify-write operations.

Compare & Swap (CAS)

CAS is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail.

The CAS operation is expressed in the following way in C language:

int compare_and_swap(int* reg, int oldval, int newval)
{
  ATOMIC();
  int old_reg_val = *reg;
  if (old_reg_val == oldval)
     *reg = newval;
  END_ATOMIC();
  return old_reg_val;
}

CAS is used for implementing synchronization primitives like semaphores and mutexes, likewise more sophisticated lock-free and wait-free algorithms. The author proved that CAS can implement more wait-free algorithms than with atomic read, write, or fetch-and-add. Algorithms built around CAS typically read some key memory location and remember the old value. Based on that old value, they compute some new value.

To implement a wait-free n-process consensus, we need a decide function that uses CAS operation and returns the correct result. This is shown in the next block of code.

decide(input: value) returns(value)
  first:=CAS(r,┴,input)
  if first=┴
    then return input
    else return first
  end if
end decide

In the decide function, the processes share a register r initialized to . Each process attempts to replace with its input; the decision value is established by the process that succeeds.

This protocol is clearly wait-free, since it contains no loops and the function will always end with a finite number of steps.

All the objects that implements CAS are universal because they are the objects with the biggest consensus number, and they can implement all objects and operations with a lower consensus number.

To proof this, we must follow this basic idea. We represent the operations (fetch&add, test&set, etc...) to be applied on the object as a linked list, where the sequence of cells represents the sequence of operations applied to the object.

Applying an operation p to the object in state s leaves the object in state s' and returns the result value r. At the end of operation, it must be necessary to execute a consensus to decide for the next operation that will be applied to the object. When the consensus is used to decide for the next operation to execute on the object, we are turning concurrent operations made by concurrent processes into linear operations. Therefore, the object will never halt the execution because of concurrency.

In conclusion, if CAS is a universal operation, then it can implement all operations with lower consensus number.

No comments:

Post a Comment