When you need to block a thread, consider using the Sleep() function, which is part of the Thread class.
Processes frequently need to communicate with other processes, e.g., the input of one process may depend on the output of another, or the processes may need to access the same resource and need to do so in an orderly manner. Here, we first discuss the general issues regarding synchronization and then talk about how to implement some synchronization primitives in Nachos.
Assume two processes, A and B, each need to read and increment the value of a variable x whose initial value is 7. After both A and B have incremented x, its value should be 9. Suppose process A reads the value 7 and, before it can increment it, is preempted by the CPU scheduler, which now starts executing B. B also reads the value 7, increments it to 8, and stores it back in x. Now A gets the CPU again, increments the value it had read (7) to 8, and stores it back. Thus the value of x after both A and B have read and incremented it is 8, which is incorrect. Situations like this, where two or more processes are reading or writing shared data and the final result depends on who runs precisely when, are called race conditions.
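To make the interleaving concrete, here is a small sketch in ordinary C++ using std::thread (outside Nachos, purely for illustration; the unprotected increment is exactly the scenario described above):

    #include <thread>
    #include <iostream>

    int x = 7;                  // shared variable, not protected

    // Read, increment, and write back x. The three steps are not
    // atomic, so two threads can interleave between them.
    void increment() {
        int temp = x;           // A reads 7 ... B may also read 7 here
        temp = temp + 1;
        x = temp;               // one of the two increments can be lost
    }

    int main() {
        std::thread a(increment);
        std::thread b(increment);
        a.join();
        b.join();
        // Expected 9, but with an unlucky interleaving x can end up as 8.
        std::cout << "x = " << x << std::endl;
        return 0;
    }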
The key to avoiding race conditions is to provide mutual exclusion - some way of making sure that if one process is accessing a shared variable, other processes are excluded from doing the same thing. The part of the program where shared variables are accessed is called the critical region or critical section. If the execution of processes is arranged so that no two processes are in their critical sections at the same time, race conditions are avoided.
The software solutions build on the hardware solutions to provide higher-level synchronization support. Three software mechanisms are Semaphores, Locks and Monitors. We explain each of these briefly below and also discuss how to implement the first two in Nachos.
A semaphore S is an integer variable that, apart from initialization, is accessed only through two atomic operations, P() and V():

    P(S): while S <= 0 do no-op;
          S = S - 1;

    V(S): S = S + 1;

The testing and modification of the value of the semaphore in both operations must be done atomically, i.e., when one process is testing or modifying the semaphore value, no other process can modify it. To see how these operations can be used to synchronize two processes, assume two processes A and B, executing statements S1 and S2 respectively. The requirement is that S2 may be executed only after S1. This can be achieved with a semaphore, synch, initialized to 0. A and B execute the following code:
    A: S1;
       V(synch);

    B: P(synch);
       S2;

Because synch is initialized to 0, B will execute S2 only after A has signalled it by invoking V(synch), which happens after S1.
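In Nachos terms, the same pattern can be written with the Semaphore class declared in threads/synch.h. The sketch below assumes the usual Semaphore(name, initialValue) constructor and the P()/V() methods; S1 and S2 are stand-ins for the real statements:

    #include "synch.h"

    Semaphore *synch = new Semaphore("synch", 0);   // initialized to 0

    void ThreadA(int arg)
    {
        // S1 goes here
        synch->V();             // signal that S1 has completed
    }

    void ThreadB(int arg)
    {
        synch->P();             // blocks until A has invoked V(synch)
        // S2 goes here
    }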
The disadvantage of the above implementation of semaphores is that it involves busy waiting - continuously testing a variable until some value appears. Busy waiting wastes CPU time and should be avoided; it is used only when the wait is expected to be short. Taking this into account, we can redefine P() and V() as follows. When a process executes P() and finds that the semaphore is not positive, it must wait. However, rather than busy waiting, the process can block itself. The block operation places the process in a wait queue associated with the semaphore. Control is then transferred to the CPU scheduler, which selects another process to execute. A process that is blocked waiting on a semaphore should be restarted when some other process executes a V() operation.
The important thing to realize about semaphores is that they work only if the operations P() and V() are atomic. In a uniprocessor environment, this can be achieved by simply disabling interrupts for the duration of these operations.
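Putting the two ideas together, a blocking P() can be sketched roughly as in the Nachos semaphore code in threads/synch.cc. The interrupt, currentThread, and List interfaces below are assumed from the Nachos tree; treat this as an outline rather than the definitive implementation:

    void
    Semaphore::P()
    {
        // Disable interrupts so that testing and modifying the value
        // is atomic on a uniprocessor.
        IntStatus oldLevel = interrupt->SetLevel(IntOff);

        while (value == 0) {                        // semaphore not available:
            queue->Append((void *)currentThread);   // join the wait queue
            currentThread->Sleep();                 // and block (no busy waiting)
        }
        value--;                                    // semaphore available, consume it

        (void) interrupt->SetLevel(oldLevel);       // restore interrupts
    }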
Note that busy waiting has not been completely eliminated - instead of busy waiting in the critical sections of user processes, busy waiting is limited to the critical sections of P() and V(). These sections are short if properly coded.
Before a process exits its critical section, it invokes V(). As part of this operation, a process waiting in the semaphore queue is removed and scheduled to start executing. This is done by placing it in the Ready List maintained by the scheduler; the process is appended to this list for later assignment to the CPU (see ReadyToRun() in scheduler.cc).
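Correspondingly, V() can be sketched as follows, again assuming the Nachos Scheduler and List interfaces; the woken thread is simply handed to ReadyToRun():

    void
    Semaphore::V()
    {
        IntStatus oldLevel = interrupt->SetLevel(IntOff);

        Thread *thread = (Thread *)queue->Remove(); // wake one waiter, if any
        if (thread != NULL)
            scheduler->ReadyToRun(thread);          // put it on the ready list
        value++;

        (void) interrupt->SetLevel(oldLevel);
    }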
A lock provides mutual exclusion through two operations, Acquire() and Release(): a process acquires the lock before entering its critical section and releases it when leaving.

    Lock->Acquire();
      critical section
    Lock->Release();

By convention, a lock is released by the same process that acquired it. Note that this is different from semaphores - it is not necessary that the process decrementing the semaphore value to, say, x will be the one incrementing it to x+1 upon finishing its critical section.
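One way to implement this interface (a sketch only; the semaphore and owner members, and the use of ASSERT from utility.h, are assumptions rather than part of the given synch.h) is to build the lock on a binary semaphore and remember the holder so that Release() can enforce the convention above:

    class Lock {
      public:
        Lock(char *debugName) {
            semaphore = new Semaphore(debugName, 1);   // 1 == lock is free
            owner = NULL;
        }
        ~Lock() { delete semaphore; }

        void Acquire() {
            semaphore->P();                  // blocks until the lock is free
            owner = currentThread;           // remember who holds it
        }

        void Release() {
            ASSERT(isHeldByCurrentThread()); // only the holder may release
            owner = NULL;
            semaphore->V();
        }

        bool isHeldByCurrentThread() { return owner == currentThread; }

      private:
        Semaphore *semaphore;   // provides the actual mutual exclusion
        Thread    *owner;       // thread currently holding the lock
    };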
There are a few issues to consider here:
The implementation of the above primitives
may lead to a situation where two
or more processes end up waiting indefinitely for an event that can only be
caused by one of the waiting processes. This event may be the invocation of a
V() operation. This state is called a deadlock.
Another problem that may occur is starvation, or indefinite blocking, in which processes wait indefinitely within the semaphore's queue.
A monitor is a collection of procedures, variables, and data structures grouped together in a special module or package. Following is an example monitor:
    monitor synch
        integer i;
        condition c;

        procedure producer(x);
          .
          .
        end;

        procedure consumer(x);
          .
          .
        end;
    end monitor;

A process may call the procedures within a monitor but cannot directly access the monitor's internal data structures. The most important property of a monitor is that only one process can be active in a monitor at any instant. When a process calls a monitor procedure, the procedure first checks whether any other process is currently active within the monitor. If so, the calling process is suspended until the other process has left the monitor. If no other process is using the monitor, the calling process may enter.
Why is this good? Note that it is the compiler that is implementing the mutual exclusion that the processes need, not the processes themselves. Thus, it is much less likely that something will go wrong. Also, the person writing the monitor does not have to be aware of how the compiler arranges for mutual exclusion. It is sufficient to know that by turning all the critical regions into monitor procedures, no two processes will ever execute their critical regions at the same time.
To avoid having two processes active inside the monitor at the same time, we need a rule specifying what happens after a signal. Suppose process A signals a condition on which process B is waiting: should B wait until A leaves the monitor, or should A wait until B leaves? Two schemes have been proposed - the Hoare style and the Mesa style. In the former, the signalling process gives up control of the monitor and the CPU to the woken process, which starts running immediately and returns control of the monitor to the signaller when it leaves its critical section. In the Mesa style, the woken thread is simply put on the ready list, and it is the woken thread's responsibility to re-enter the monitor.
An important thing to remember is that condition variables are not counters. They do not accumulate signals the way semaphores do. Thus, if a condition variable is signalled with no one waiting on it, the signal is lost.
All operations on a condition variable in Nachos must be made while the current process/thread has acquired a lock (see parameters to the functions Wait(), Signal() and Broadcast()). All accesses to a given condition variable must be protected by the same lock. This means that your implementation must enforce mutual exclusion among processes invoking the operations on condition variables.
Wait(), Signal() and Broadcast() have the following semantics:
In Nachos, condition variables obey the Mesa semantics. When a Signal() or Broadcast() wakes up another thread/process, it simply puts the thread on the ready list, and it is the responsibility of the woken thread to re-acquire the lock. Note that this re-acquire should be taken care of within your implementation of Wait().
The consequence of using the Mesa semantics is that some other thread can acquire the lock and change the data structures before the woken thread gets a chance to run, so the woken thread should re-check the condition it was waiting for.
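In practice this means the waiting thread re-checks its condition in a loop; the names lock, condition, and dataAvailable below are placeholders, not part of Nachos:

    lock->Acquire();
    while (!dataAvailable)           // Mesa semantics: the condition may no longer
        condition->Wait(lock);       // hold by the time we run again, so re-check
    // ... use the shared data ...
    lock->Release();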
Your implementation for condition variables must include a queue for processes waiting on the condition variable (you can use the List class for this purpose, defined in list.h). Thus, in your implementation for Wait(), the calling process should release the lock, append itself to the queue and sleep until it is woken up by another thread. Upon waking up, it should re-acquire the lock. For Signal(), the calling process should remove a thread from the condition queue and place it on the scheduler's ready list. The Broadcast() operation must do the same but for all the threads on the condition queue.
Of course, remember to make these operations atomic. It is also a good idea to check whether the process that is holding the lock is the one invoking the operations on the condition variable.
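A possible shape for the three operations is sketched below. It assumes queue is a List* member of Condition, that the Interrupt, Scheduler, and Thread interfaces come from the Nachos tree, and that lock ownership is checked with an assumed isHeldByCurrentThread() helper; this is one reasonable approach, not the only correct one:

    void
    Condition::Wait(Lock *conditionLock)
    {
        IntStatus oldLevel = interrupt->SetLevel(IntOff);
        ASSERT(conditionLock->isHeldByCurrentThread());

        queue->Append((void *)currentThread);   // join the waiters
        conditionLock->Release();               // give up the lock while asleep
        currentThread->Sleep();                 // blocks until Signal()/Broadcast()

        (void) interrupt->SetLevel(oldLevel);
        conditionLock->Acquire();               // Mesa style: re-acquire on wake-up
    }

    void
    Condition::Signal(Lock *conditionLock)
    {
        IntStatus oldLevel = interrupt->SetLevel(IntOff);
        ASSERT(conditionLock->isHeldByCurrentThread());

        Thread *thread = (Thread *)queue->Remove();
        if (thread != NULL)
            scheduler->ReadyToRun(thread);      // woken thread resumes inside Wait()

        (void) interrupt->SetLevel(oldLevel);
    }

    void
    Condition::Broadcast(Lock *conditionLock)
    {
        // Wake every waiting thread; each re-acquires the lock in turn.
        while (!queue->IsEmpty())
            Signal(conditionLock);
    }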
Source code in code/threads