ELSEVIER
InformatioK Processing Letters
Information Processing Letters 48 (1993) 215-220
A simple correctness proof of the MCS contention-free Theodore
Johnson
*, Krishna
lock
Harathi
Computer and Information Sciences Department, 301 CSE, Uniuersity of Florida, Gainesville, FL 3261 I-2024, USA
Communicated by S.G. Akl Received 11 January 1993 Revised 3 September 1993
Abstract Mellor-Crummey memory locations.
and Scott present a spin-lock Their algorithm is equivalent
Key words:
exclusion;
that avoids network contention by having processors to a lock-free queue with a special access pattern.
spin on local
The authors provide a complex and unintuitive proof of the correctness of their algorithm. In this paper, we provide a simple proof that the MCS lock is a correct critical section solution, which provides insight into why the algorithm is correct. We show that each acquire-lock and release-lock operation has a decisive instruction which orders the operations with respect to each other. Mutual
Concurrency;
Data structures;
1. Introduction Mellor-Crummey and Scott present a simple spin-lock for mutual exclusion [7], which they call the MCS lock. Their algorithm has several desirable properties: it is fair, because it is equivalent to a FIFO queue ADT with special access pattern, and it is contention free, because each processor spins on a local variable. In contrast, testand-set spin waiting is not fair, and will severely limit performance [ 1,3]. The authors of the MCS locks provide a complex proof of the correctness of their algorithm. In [6], they show correctness through a series of algorithm transformations. Since this proof pro-
* Corresponding author. 0020-0190/93/$06.00
Parallel
processing;
vides little insight, they also provide intuitive correctness arguments. In this paper, we present a simple correctness proof for the MCS lock. We show that the underlying queue is decisive-instruction serializable, and that no operation accesses a garbage address. We conclude that the MCS lock is a correct solution to the critical section problem.
2. MCS-lock
algorithm
The code for the MCS lock, which is shown in Fig. 1, relies on two atomic read-modify-write operations. The swap instruction takes two parameters, L, a pointer to a memory location, and I, a value. The execution of swap c&L, I ) puts I into L and returns the old value of L (“&” is the C language’s address-of operator). The c o m-
0 1993 Elsevier Science Publishers B.V. All rights reserved
SSDI 0020-0190(93)E0174-I
Spin lock
216
T. Johnson. K. Harathi / Information
type
qnode= next:
record *qnode
locked: type
Boolean
lock=*qnode
If
I
//
in
points
to
shared
procedure pred:
queue
:= :=
pred
+
repeat
*lock,
I/
ptr
to
successor
to
in
queue
necessary tail
of
(in
to
I:
ni
the
queue
an
enclosing
invoking
scope)
processor
*qnode)
L :=
Lock
I
II
Link
behind
I/
Busy
wait
- lock
CL:
*Lock,
I:
return
/I
I->next->Locked
while
No
successor,
I->next=nit :=
false
I,
was
not to
free spin predecessor
for
lock
*qnode)
f/ - and - swap(L,
successor
lock
Prepare
compare
repeat
no for
/I
I->locked
release
Initially, Queue
/I
:= while
If /I
true
I->next=nit if
busy-waiting
allocated
I)
pred->next
procedure
record
ptr
II
accessible
nil
swap(L,
i->tocked
if
link Locally
//
*qnode
I->next pred if
a
memory
acquire_Lock(L:
var
Processing Letters 48 (1993) 215-220
No
known
successor
successor
nil) Lock
free
/I
Wait
for
//
Pass
Lock
Fig. 1. MCS-lock algorithm (from [7])
pa r e-a nd_s w a p instruction takes a third parameter X, a value. The execution of compare_and_swap(&L, I, X) puts I into L only if L currently contains the value X. The compare and swap instruction returns True if the testsucceeds, and F a t se if the test fails. Each processor that sets a lock first allocates a locally-accessible record. If process p inserts record r into the queue, then we say that r is p’s record, and p is r’s process. This record contains a boolean flag for spin waiting and a link for forming a queue. The tail of the queue is pointed to be L, and the head of the queue is implicitly pointed to by the lock holder. To set a lock, a processor executes the a c q u i r e-1 o c k procedure and adds its record, pointed to by I, to the end of the queue. The enqueuing is performed by executing the f e t c h and s tore instruction, which atomically stores-1 in c and returns L’s old value. If the processor has a predecessor in the queue, the processor completes the link; otherwise the processor is the lock holder. To release a lock, the processor must reset the busy-waiting bit of its successor. If the processor
has no successor, it sets L to nil atomically the compare_and_swap.
with
3. Correctness We show that the MCS lock is correct by showing that it maintains a queue, and the head of the queue is the process that holds the lock. The MCS lock is decisive-instruction serializable [8]. There are two operations on the MCS lock: acquire_lock and retease_lock. Eachoperation has a single decisive instruction, and corresponding to a concurrent execution g of the queue operations, there is an equivalent serial execution Pd such that if operation 0, executes its decisive instruction before operation 0, does in g, then 0, < 0, in Yd. The decisive-instruction serializability of the MCS lock greatly simplifies its correctness proof, because the equivalent queue is in a single state at any instant. In contrast, a concurrent data structure that is linearizable but not serializable might be in several states simultaneously [4].
T. Johnson, K. Harathi /Information
3.1. The queue ADT A queue is an sists of a finite set and dequeue. The dered. We write
Abstract Data Type that conQ and two operations: enqueue elements of Q are totally orthe state of a queue as Q =
(sr, q2,. . ., q,,), where q1 cp q2 cp . . . ce 9,. The enqueue operation is specified by enqueue((q,,
q,,...,q,),
+ (41, q?,,...>qn, The dequeue is specified by
4’) 4’).
operation
on a non-empty
queue
Processing Letters 48 (1993) 215-220
chronizing code. The actual memory may be weakly consistent. We refer the interested reader to [2] for a survey of consistency models. 3.3. Execution sequences The MCS lock executes correctly because it is presented with a special sequence of concurrent operations. We assume that each processor uses the acquire_tock and release-lock operations to synchronize access to a resource: acquire
r) - lock(L, critical section
release
dequeue((q,,
q2,...,q,))
211
Lock(L)
+ (qz,...,q,,)
where the return value is ql. A dequeue operation on an empty queue is undefined. We define two functions on a queue Q: head(Q) and tail(Q). The head function returns the first element in the queue and the tail function returns the last element in the queue. Corresponding to an actual MCS lock M, there is an abstract queue Q. Initially, both M and Q are empty. When process p with record r performs the decisive instruction for an a cQ changes state to enq u i r e 1 o c k operation, queue(a, r>. When a process performs the decisive instruction for a r e L e a s e 1 o c k operation, Q changes state to dequeue(Q). We will show that the actual MCS lock corresponds to the abstract queue. 3.2. Instruction sequences The parallel processors will be executing instructions simultaneously. To discuss the correctness of the algorithm, we need to place an ordering on the execution of the instructions of the different processes. We will assume that the memory of the parallel computer is sequentially consistent [5]. That is, there is a global sequence of instruction executions 57 such that every process observes its instruction executions to be consistent with XF’. When we discuss time in the definitions and proofs, time refers to the ordering imposed by 5Y. It is reasonable to assume sequential consistency because the MCS-lock is a syn-
If the MCS lock is correct, then it is a multiple-enqueue / single-dequeue queue. Any might exnumber of a c q u i r e 1 o c k operations ecute concurrently, bit we are guaranteed that (1) At most one r e t e a se t o c k operation executes at any given time. Let D, and D, be two different release executions, executing in the time intervals I, and I,, respectively. Then I, and I, do not overlap. (2) No re 1ease lot k operation executes between the time thata release lock sets L to N i 1 and the time that the first subsequent terminates. Let D be a c q u i r e 1o c k operation the execuzon of a re 1ease tot k operation that sets L to N I L, and let D complete its execution at time t,. Let E be the execution of the a c q u i r e_ 1o c k operation that performs the first decisive instruction after time t,. If E completes its execution at time t,, then no r e 1 e a se t o c k operation executes in the time interval (t,Ft,). 3.4. Queue invariants The MCS lock maintains three invariant conditions. Tail invariant: If the abstract queue is nonempty, then L points to the record at the tail of the abstract queue. If the queue is empty, L is Nil. We define the head process to be the process whose record is the head of the abstract queue. A process that is waiting for I - > 1o c ked to become false is blocked.
218
T. Johnson, K. Harathi / Information
Head invariant: If the head process is blocked, it will be unblocked by the time that the preceding r e 1 ease 1 oc k operation terminates (i.e., issues the last%struction in the procedure). Blocking inuariant: A non-head process will not exit the a c q u i r e - 1 o c k procedure. 3.5. Decisive instructions The decisive instruction in the acq u i r e 1 o c k procedure is the fetch ~ and _ store instruction. The decisive instruction in the r e 1ease lot k procedure is the reading of I - >ne x t-if the fetch returns a non-nil value, and is the compare_ and-swap instruction otherwise. Theorem 1. The MCS lock is decisive-instruction serializable. Proof. To show that the MCS lock is decisive-instruction serializable, we first show that the three invariants are always maintained. To show that the invariants are maintained, we assume that only the head process executes a r e 1 e a s e-1 o c k operation. We then show that this assumption holds by induction. Lemma 2. The tail invariant is always maintained. Proof. When a process performs the decisive instruction for an a cqu i r e Lot k operation, it sets the tail pointer (L) to its record. Therefore, acquire to c k operations maintain the tail invariant. A r e 1 e a s e t o c k operation modifies the tail pointer by the compare-and-swap instruction only. The compare _and -swap will succeed only if the queue contains exactly one element. Therefore, the r e 1 e a se - t o c k operation also maintains the queue invariant. q
If
only the head process executes a operation, the blocking invariant is always maintained. Lemma
3.
r e t ease
t ock
Proof. Suppose that the record of a process is in the queue but is not the head record. By the tail invariant, the process must have read a non-nil
Processing Letters 48 (1993) 215-220
tail pointer when it performed the decisive instruction for the a c q u i r e 1o c k operation. The process will therefore wait until the Lo c k e d field of its record, which is initialized to t rue, is set to f a 1 se. This bit will only be reset when the process of the predecessor record executes a r etea se-1 oc k operation. But, after the r et e a se-1 o c k decisive instruction, the process becomes the head process. 0 Lemma tained.
4. The head invariant is always main-
Proof. Suppose that when a head process P performs its decisive instruction on a r et e a s e_ Lo c k operation, dequeue(Q) is nonempty. In this case, the decisive instruction will report that tail pointer is not the process’ record. Then, P will wait until the ne x t field of its record, R is non-nil. By the tail invariant, this field will eventually point to R’s successor in Q, r (because of the execution of r’s process, p, in the acqu i r e_Loc k procedure). After P’s decisive instruction, r is the head record and p is the head process. Process P will reset the Lo c k ed bit of r, unblocking p before P completes. Suppose that when P performs its decisive instruction, Q becomes empty. If p is the process that executes the next decisive instruction (which must be for a a c q u i r e Lo c k operation), p will become the head process and will find that L is n i I. Since p finds L is N I L, p never blocks. 0 Lemma 5. A process executes a r e t e a se_ L o c k operation only when it is the head process. Proof. We proceed by induction on the ith decisive instruction. For the base case, consider the first operation that executes a decisive instruction. Since the queue is empty, the operation must be an a c q u i r e_ t o c k and the lemma holds. Suppose that the lemma holds for the ith decisive instruction. If the (i + 1)th decisive instruction is due to an a c q u i r e-t o c k, the lemma holds. Suppose that the (i + 11th decisive instruction is due to an r e t e a s e_L o c k operation. By
T. Johnson, K. Harathi / Information Processing Letters 48 (1993) 215-220
the execution sequence assumption, the process that executes this operation must have previously executed an a cqui t-e-1 oc k operation. By the blocking invariant, the process must be the head 0 process, so the lemma holds. Since the head invariant is always maintained, locks are granted in the order requested, where the order of requests is the order of the decisive instructions, and which is equivalent to the order in the abstract queue. 0 3.6. Garbage We next show that the queue does not access garbage locations. We assume when a process executes the a c q u i r e-1 o c k procedure, it first allocates a record. When a process exists the r e t e a s e-1 o c k procedure, it discards the record that was at the head of the queue (pointed to by I). For simplicity, we assume that a record is never reused. A record is valid during the time between its creation and its deletion. Theorem record.
6. No process accesses an invalid queue
Proof. By the blocking invariant, only the head process performs a r e 1 e a se-1 o c k operation, and the process dequeues its own record. Therefore, whenever a process access its record, it accesses a valid record. When a process executes the a c q u i r e_ t o c k operation, the process inserts a pointer to its record’s predecessor in Q (if one exists). The process of the predecessor record does not exit the r e 1 ease-to c k procedure until the next field of its record is modified. Therefore, no process accesses an invalid record when executing the acquire Lock operation. When a process executes the r e t e a se Lo c k operation, it modifies its record’s successorin Q. By the execution sequence assumption, this record remains in the queue until the process completes the operation. Therefore, no process access an invalid record when executing the ret e a s e-1 o c k operation. 0
219
4. Critical section solution A critical section solution is correct fies the following three criteria [9l:
if it satis-
1. At most one process executes in the critical section at any given time. 2. If no process is executing in the critical section, and at least one process wishes to enter the critical section, then a process enters the critical section in a finite amount of time. 3. If a process wishes to enter the critical section, then a finite number of processes enter first. We can see that the MCS lock is a correct solution to the critical section problem. We define the set of processes that are in or wish to enter the critical section as the set of processes whose records are in Q. By the blocking invariant, criteria 1 holds. By the head invariant, criteria 2 holds. The MCS lock is a decisive-instruction serializable FIFO queue, so criteria 3 holds.
5. Conclusion The MCS lock is a method of implementing mutual exclusion that avoids network contention. Further, the MCS lock is fair, unlike the simple test-and-set solution. We present a simple correctness proof for the MCS lock. We show that the MCS lock is equivalent to a decisive-instruction serializable queue, and show that several queue invariants always hold. Whereas the correctness proof provided by Mellor-Crummey and Scott is indirect and provides little insight, our correctness proof is direct and provides insight into the algorithm’s execution.
References 111T.E. Anderson,
The performance of spin lock alternatives for shared memory multiprocessors, IEEE Trans. Parallel Distributed Systems 1 (1) (1990) 6-16. 121M. Dubois, C. Scheurich and F. Briggs, Synchronization, coherence, and event ordering in multiprocessors, IEEE Computer 21 (2) (1988) 9-21. 131 R.R. Glenn, D.V. Pryor, J.M. Conroy and T. Johnson, Characterizing memory hotspots in a shared memory mimd
220
T. Johnson, K. Harathi / Information Processing Letters 48 (1993) 215-220
machine, in: Supercomputing ‘91 (IEEE and ACM SIGARCH, 1991). [4] M. Herlihy and J. Wing, Linearizability: A correctness condition for concurrent objects, ACM Trans. Programming Languages Systems 12 (3) (199) 463-492. [51 L. Lamport, How to make a multiprocessor computer that correctly executes multiprocess programs, IEEE Trans. Comput. 28 (9) (1979) 690-691. [6] J. Mellor-Crummey and M. Scott, Algorithms for scalable synchronization on shared-memory multiprocessors, Tech.
Rept. TR90-114, Dept. of Computer Science, Rice University, 1990. [7] J.M. Mellor-Crummey and M.L. Scott, Algorithms for scalable synchronization on shared-memory multiprocessors, ACM Trans. Comput. Systems. 9 (1) (1991) 21-65. [S] D. Shasha and N. Goodman, Concurrent search structure algorithms, ACM Trans. Database Systems 13 (1) (1988) 53-90. [9] A. Silberschatz, J. Peterson and P. Galvin, Operating System Concepts (Addison-Wesley, Reading, MA, 1991).