Salsa - Synchronization

> Salsa - Synchronization

Topics

Nachos

Interrupts

When you need to block a thread, think of using the Sleep() function that is a part of the Thread class.

[ Source code in code/threads | Lab| Quiz ]

Processes frequently need to communicate with other processes, e.g., the input of one process may depend on the output of another, or the processes may need to access the same resource and need to do so in an orderly manner. Here, we first discuss the general issues regarding synchronization and then talk about how to implement some synchronization primitives in Nachos.

Race Conditions and Mutual Exclusion

Two issues need to be considered regarding inter-process communication. Firstly, how to make sure that two processes do not get into each other's way when accessing a common resource and, secondly, how to sequence the processes when there are dependencies among them, e.g., process B waiting for process A's output.

Assume two processes, A and B, need to read and increment the value of a variable x whose initial value is 7. Thus, after both A and B have incremented x, its value should be 9. Assume process A reads the value 7 and before it can increment it is preempted by the CPU scheduler that now starts executing B. B reads the value 7 as well, increments it to 8 and stores it in x. Now, A gets the CPU again, increments the value it had read in to 8 and then stores it back. Thus, the value of x after both A and B have read and incremented it, is incorrect. Situations like this, where two or more processes are reading or writing some shared data and the final result depends on who runs precisely when, are called race conditions.

The key to avoiding race conditions is to provide mutual exclusion - some way of making sure that if one process is accessing a shared variable, other processes will be excluded from doing the same thing. The part of the program where shared variables are accessed is called the critical region or critical section. If execution of processes is arranged in a manner such that no two processes were in their critical section at the same time, race conditions are avoided.

Solutions - Hardware and Software

The simplest hardware solution is to have each process disable interrupts when it enters its critical section and enable them just before it leaves it. This solution is infeasible for two reasons. First, it is unwise to give user processes the power to turn off interrupts. It is possible that a process disables interrupts and then forgets to turn them on again. Second, in a multiprocessor system, disabling interrupts affects only the CPU that executed the disable instruction. The other ones will continue running and can access the shared resource/variable. Many machines therefore, provide special hardware instructions that allow the processes to either test and modify the content of a word, or to swap the contents of two words, atomically. These special instructions can be used to solve the critical section problem. Two commonly provided instructions are Test-and-Set and the Swap instruction. Either of these solutions can be used as a building block to solve more complex synchronization problems.

The software solutions build on the hardware solutions to provide higher-level syncrhonization support. Three software mechanisms are Semaphores, Locks and Monitors. We explain each of these briefly below and also discuss how to implement the first two in Nachos.

Semaphores

A semaphore S is an integer variable that can be accessed only through two atomic operations - P and V (Dutch for wait and signal respectively). These are defined as follows:


                      P(S): 
                           while S <= 0 do no-op;
                           S = S-1;

                      V(S): 
                           S = S + 1;

The testing and modification of the value of the semaphore in both the operations must be done atomically, i.e., when one process is testing or modifying the semaphore value, no other process can modify the value. To see how these operations can be used to synchronize two processes, assume two processes A and B, executing statements S₁ and S₂ respectively. The condition is that S₂ can be executed only after S₁. This can be achieved by having a semaphore, synch initialized to 0. A and B will now execute the following code:

Because synch is initialized to 0, B will execute S₂ only after A has signalled it by invoking V(synch) which is after S₁.

The disadvantage of the above implementation of semaphores is that it involves busy waiting - continuously testing a variable until some value appears. It should be avoided because it wastes CPU time. Only when there is an expectation that the wait will be short is busy waiting used. Taking this into account, we can redefine P() and V() as follows. When a process executes P() and finds that the semaphore is not positive, it must wait. However, rather than busy waiting, the process can block itself. The block operation places the process in a wait queue associated with the semaphore. Control is then transferred to the CPU scheduler, which selects another process to execute. A process that is blocked, waiting on a semaphore, should be restarted when some other process executes V() operation.

The important thing to realize about semaphores is that they work only if the operations P() and V() are atomic. In a uniprocessor environment, this can be achieved by simply disabling interrupts for the duration of these operations.

Note that busy waiting has not been completely eliminated - Instead of having a busy wait at the critical sections of user processes, busy wait is limited to the critical sections of P() and V(). These sections are short, if properly coded.

Implementation in Nachos

The semaphore implementation in Nachos is in the files synch.cc and synch.h in the threads directory. The operations P() and V() are made atomic by making sure that interrupts are disabled (we assume Nachos is running in a uniprocessor environment) for the duration of these operations. Each semaphore object has a queue associated with it - the list class has been used for this purpose (list.h and list.cc). In order to grab a semaphore, a process invokes the P() operation. If the semaphore is not available, the process goes to sleep by invoking the Sleep operation on itself. Upon a signal, it decrements the value of the semaphore and executes its critical section.

Before a process exits its critical section, it invokes V(). As part of this operation, a process waiting in the semaphore queue is removed and scheduled to start executing. This is done by placing it in the Ready List maintained by the scheduler. The process is appended to this list for later assignment to the CPU (ReadyToRun() in scheduler.cc.).

Locks

A lock is a synchronization primitive similar to semaphores, for providing mutual exclusion. The two atomic operations provided are Acquire and Release. A lock can either be free or busy; initially, a lock is free. Before accessing a shared variable, a process acquires the lock, and releases it after it is done accessing that variable. Thus, two processes A and B accessing a shared variable would execute the following code to synchronize their access to their critical sections:


               Lock->Acquire();
                  critical section
               Lock->Release();

By convention, a lock is released by the process that acquires it. Note that this is different from semaphores - it is not necessary that the process decrementing the semaphore value to say, x , will increment it to x+1 upon finishing with its critical section.

Implementation in Nachos

Implementation of locks and condition variables (see section below) is part of a lab. A skeleton definition of a Lock class is provided in synch.h, the requisite code has to be implemented by you.

There are three issues to consider here:

Atomicity of Acquire() and Release().
Provision of some wait mechanism if a process trying to acquire the lock cannot do so.
Making sure that only the process that acquires the lock will release the lock.

The issue of atomicity can be resolved in the same way as for semaphores - disabling interrupts. Thus, your implementation of Acquire() and Release should have instructions for disabling and enabling interrupts. Secondly, you need to make sure that the process that calls Release() is the same one that has invoked Acquire. For this, you should use the function isHeldByCurrentThread() which is part of the Lock class definition. To implement this function, think about the Nachos variable that describes the thread/process that is currently executing and see how that information can be used by it. You may need to define some other variables for this class to implement this functionality. The last issue is provision of a wait mechanism. You can of course, implement a queue for each lock in a way similar to the semaphore implementation. On the other hand, since the semaphore implementation is already given to you, you may simply want to use it to implement your locks.

The implementation of the above primitives may lead to a situation where two or more processes end up waiting indefinitely for an event that can only be caused by one of the waiting processes. This event may be the invocation of a V() operation. This state is called a deadlock. Another problem that may occur is starvation or indefinite blocking in which processes wait indefinitely within the semaphore.

Monitors and Condition Variables

Monitors were designed to make it easy to write programs that implemented synchronization correctly. Semaphores and locks have to be used carefully because a small error, like a misplaced P() may cause a deadlock or other forms of unpredictable and irreproducible behavior. A monitor is a high level primitive, provided as a programming language construct. This means that the compiler is aware of this construct and can handle calls to it in manner different than for other constructs.

A monitor is a collection of procedures, variables, and data structures grouped together in a special module or package. Following is an example monitor:


    monitor synch
       integer i;
       condition c;

       procedure producer(x);
       .
       .
       end;

       procedure consumer(x);
       .
       .
       end;
    end monitor;

A process may call the procedures within a monitor but cannot directly access the monitor's internal data structures. The most important property of a monitor is that only one process can be active in a monitor at any instant. When a process calls a monitor procedure, the procedure first checks if any other process is currently active within the monitor. If so, the calling process is suspended until the other process has left the monitor. If no other process is using the monitor, the calling process may enter.

Why is this good? Note that it is the compiler that is implementing the mutual exclusion that the processes need, not the processes themselves. Thus, it is much less likely that something will go wrong. Also, the person writing the monitor does not have to be aware of how the compiler arranges for mutual exclusion. It is sufficient to know that by turning all the critical regions into monitor procedures, no two processes will ever execute their critical regions at the same time.

Condition Variables

Monitors need to provide a way for the processes to block themselves when they cannot proceed. Condition variables provide this facility. As with locks and semaphores, two operations are provided with them - wait and signal. When a monitor procedure finds that it cannot continue, it does a wait on some condition variable. This causes the calling process say A to block. It also allows another process, say B, that had previously been prohibited from entering the monitor to enter now. B can now wake up another process by doing a signal on the condition variable that A was waiting on.

To avoid having two processes active inside the monitor at the same time, we need a rule specifying what happens after a signal. Should B wait till A leaves the monitor or should A wait until B leaves the monitor. Two schemes have been proposed - the Hoare style and the Mesa style. In the former, the signalling process gives up control over the monitor and the CPU to the woken process which starts running immediately and gives back control of the monitor to the signaller when it leaves its critical section. In the Mesa style, the woken thread is simply put on the ready list and it becomes the responsibility of the woken thread to re-enter the monitor.

An important thing to remember is that condition variables are not counters. They do not accumulate signals the way semaphores do. Thus, if a condition variable is signalled with no one waiting on it, the signal is lost.

Implementation in Nachos

Implementation of condition variables is part of a lab. As before, a class definition, Condition is provided in synch.h; it is your job to implement it.

All operations on a condition variable in Nachos must be made while the current process/thread has acquired a lock (see parameters to the functions Wait(), Signal() and Broadcast()). All accesses to a given condition variable must be protected by the same lock. This means that your implementation must enforce mutual exclusion among processes invoking the operations on condition variables.

Wait(), Signal() and Broadcast() have the following semantics:

Wait() releases the lock, gives up the CPU until signalled and then re-acquire the lock.
Signal() wakes up a thread if there are any waiting on the condition variable.
Broadcast() wakes up all threads waiting on the condition.

In Nachos, condition variables obey the Mesa semantics. When a Signal() or Broadcast() wakes up another thread/process, it simply puts the thread on the ready list, and it is the responsibility of the woken thread to re-acquire the lock. Note that this re-acquire should be taken care of within your implementation of Wait().

The consequences of using the Mesa semantics are that some other thread can acquire the lock and change the data structures, before the woken thread gets a chance to run.

Your implementation for condition variables must include a queue for processes waiting on the condition variable (you can use the List class for this purpose, defined in list.h). Thus, in your implementation for Wait(), the calling process should release the lock, append itself to the queue and sleep until it is woken up by another thread. Upon waking up, it should re-acquire the lock. For Signal(), the calling process should remove a thread from the condition queue and place it on the scheduler's ready list. The Broadcast() operation must do the same but for all the threads on the condition queue.

Of course, remember to make these operations atomic. It is also a good idea to check whether the process that is holding the lock is the one invoking the operations on the condition variable.

Source code in code/threads
Lab
Quiz

Last modified on Wednesday, 18-Feb-98 00:10:21 EST