1 of 100

Unit 3 – Process Synchronization & Deadlocks

  • Critical Section Problem
  • Peterson’s Solution
  • Synchronization Hardware
  • Semaphores
  • Classical Problems of synchronization
    • Producer consumer problem
    • Readers writers problem
    • Dining philosophers problem
  • Monitors
  • System model
  • Deadlock characterization
  • Deadlock prevention
  • Deadlock avoidance
  • Deadlock detection
  • Deadlock recovery

2 of 100

  • Cooperating processes can either directly share a logical address space (that is, both code and data) or be allowed to share data only through files or messages. Concurrent access to shared data may result in data inconsistency.

  • Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes

  • Suppose that we wanted to provide a solution to the “producer-consumer” problem that fills all the buffers.

  • We can do so by having an integer variable “count” that keeps track of the number of full buffers.

  • Initially, count is set to 0.

  • It is incremented by the producer after it produces a new buffer.

  • It is decremented by the consumer after it consumes a buffer.

3 of 100

Producer:

while (true) {
    /* produce an item and put it in next_produced */
    while (count == BUFFER_SIZE)
        ; /* do nothing */

    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
    count++;
}

Consumer:

while (true) {
    while (count == 0)
        ; /* do nothing */

    next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;

    /* consume the item in next_consumed */
}

4 of 100

As an illustration, suppose that the value of the variable count is currently 5 and that the producer and consumer processes execute the statements "count++" and "count--" concurrently.

Following the execution of these two statements, the value of the variable count may be 4, 5, or 6! The only correct result, though, is count == 5, which is generated correctly if the producer and consumer execute separately.

count++ could be implemented in machine language as

R1 = count

R1 = R1 + 1

count = R1

count-- could be implemented as

R2 = count

R2 = R2 - 1

count = R2

5 of 100

Consider this execution interleaving with "count = 5" initially:

T0: producer executes R1 = count     {R1 = 5}
T1: producer executes R1 = R1 + 1    {R1 = 6}
T2: consumer executes R2 = count     {R2 = 5}
T3: consumer executes R2 = R2 - 1    {R2 = 4}
T4: producer executes count = R1     {count = 6}
T5: consumer executes count = R2     {count = 4}

We would arrive at this incorrect state because we allowed both processes to manipulate the variable count concurrently.

A situation like this, where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the access takes place, is called a race condition.

To guard against the race condition above, we need to ensure that only one process at a time can be manipulating the variable count.

To make such a guarantee, we require that the processes be synchronized in some way.
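The interleaving above can be replayed deterministically as a short script (Python here, since the slides use language-neutral pseudocode); each step performs exactly one of the machine-language operations:

```python
# Replay the T0..T5 interleaving of count++ and count-- starting from count = 5.
count = 5

R1 = count       # T0: producer  R1 = count      {R1 = 5}
R1 = R1 + 1      # T1: producer  R1 = R1 + 1     {R1 = 6}
R2 = count       # T2: consumer  R2 = count      {R2 = 5}
R2 = R2 - 1      # T3: consumer  R2 = R2 - 1     {R2 = 4}
count = R1       # T4: producer  count = R1      {count = 6}
count = R2       # T5: consumer  count = R2      {count = 4}

print(count)  # 4: the producer's update was lost
```

Swapping steps T4 and T5 would instead leave count == 6; only a serialized execution of the two statements yields the correct value 5.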

6 of 100

At a given point in time, many kernel-mode processes may be active in the operating system. As a result, the code implementing an operating system (kernel code) is subject to several possible race conditions.

Consider as an example a kernel data structure that maintains a list of all open files in the system. This list must be modified when a new file is opened or closed (adding the file to the list or removing it from the list). If two processes were to open files simultaneously, the separate updates to this list could result in a race condition.

Other kernel data structures that are prone to possible race conditions include structures for maintaining memory allocation, for maintaining process lists, and for interrupt handling.

It is up to kernel developers to ensure that the operating system is free from such race conditions.

7 of 100

Critical Section Problem

Consider a system of n processes {P0, P1, … Pn-1}.

Each process has a segment of code, called a critical section, in which the process may be changing common variables, updating a table, writing a file, and so on.

The important feature of the system is that, when one process is executing in its critical section, no other process is to be allowed to execute in its critical section. That is, no two processes are executing in their critical sections at the same time.

The critical-section problem is to design a protocol that the processes can use to cooperate.

Each process must request permission to enter its critical section.

The section of code implementing this request is the entry section.

The critical section may be followed by an exit section.

The remaining code is the remainder section.

8 of 100

General structure of process Pi

9 of 100

A solution to the critical-section problem must satisfy the following three requirements:

1. Mutual exclusion. If process Pi is executing in its critical section, then no other processes can be executing in their critical sections.

2. Progress. If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in the decision on which will enter its critical section next, and this selection cannot be postponed indefinitely.

3. Bounded waiting. There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.

10 of 100

Two general approaches are used to handle critical sections in operating systems: (1) preemptive kernels and (2) nonpreemptive kernels.

A preemptive kernel allows a process to be preempted while it is running in kernel mode.

A nonpreemptive kernel does not allow a process running in kernel mode to be preempted;

kernel-mode process will run until it exits kernel mode, blocks, or voluntarily yields control of the CPU.

Obviously, a nonpreemptive kernel is essentially free from race conditions on kernel data structures, as only one process is active in the kernel at a time.

We cannot say the same about preemptive kernels, so they must be carefully designed to ensure that shared kernel data are free from race conditions.

Preemptive kernels are especially difficult to design for SMP architectures, since in these environments it is possible for two kernel-mode processes to run simultaneously on different processors.

A preemptive kernel is more suitable for real-time programming, as it will allow a real-time process to preempt a process currently running in the kernel.

Furthermore, a preemptive kernel may be more responsive, since there is less risk that a kernel-mode process will run for an arbitrarily long period before relinquishing the processor to waiting processes.

11 of 100

Peterson's Solution

Peterson's solution is a classic software-based solution to the critical-section problem. Because of the way modern computer architectures perform basic machine-language instructions, such as load and store, there are no guarantees that Peterson's solution will work correctly on such architectures.

However, we present the solution because it provides a good algorithmic description of solving the critical-section problem and illustrates some of the complexities involved in designing software that addresses the requirements of mutual exclusion, progress, and bounded waiting.

Peterson's solution is restricted to two processes that alternate execution between their critical sections and remainder sections.

The processes are numbered P0 and P1. For convenience, when presenting Pi, we use Pj to denote the other process; that is, j equals 1 - i.

12 of 100

Peterson's solution requires two data items to be shared between the two processes:

int turn;

boolean flag[2];

The variable turn indicates whose turn it is to enter its critical section. That is, if turn == i, then process Pi is allowed to execute in its critical section.

The flag array is used to indicate if a process is ready to enter its critical section. For example, if flag [i] is true, this value indicates that Pi is ready to enter its critical section.

To enter the critical section, process Pi first sets flag [i] to be true and then sets turn to the value j, thereby asserting that if the other process wishes to enter the critical section, it can do so.

If both processes try to enter at the same time, turn will be set to both i and j at roughly the same time. Only one of these assignments will last; the other will occur but will be overwritten immediately.

The eventual value of turn decides which of the two processes is allowed to enter its critical section first.

13 of 100

The structure of process Pi in Peterson's solution:

do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j)
        ;

    /* critical section */

    flag[i] = false;

    /* remainder section */
} while (true);

We now prove that this solution is correct. We need to show that:

1. Mutual exclusion is preserved.

2. The progress requirement is satisfied.

3. The bounded-waiting requirement is met.
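As a sketch only, the algorithm can be exercised with two threads in Python, whose interpreter executes one bytecode at a time and so provides the sequentially consistent loads and stores that Peterson's solution assumes (the very guarantee the slides note is missing on modern hardware). The iteration count is arbitrary.

```python
import sys
import threading

sys.setswitchinterval(1e-4)  # switch threads often, to provoke interleavings

flag = [False, False]   # flag[i] == True: Pi is ready to enter its critical section
turn = 0                # which process may enter when both are ready
counter = 0             # shared data guarded by the algorithm
N = 500

def process(i):
    global turn, counter
    j = 1 - i
    for _ in range(N):
        flag[i] = True                # entry section
        turn = j
        while flag[j] and turn == j:
            pass                      # busy wait
        counter += 1                  # critical section (non-atomic read-modify-write)
        flag[i] = False               # exit section

threads = [threading.Thread(target=process, args=(i,)) for i in (0, 1)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 1000: no increment was lost
```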

14 of 100

To prove property 1, we note that each Pi enters its critical section only if either flag[j] == false or turn == i.

Also note that, if both processes were executing in their critical sections at the same time, then flag[0] == flag[1] == true.

These two observations imply that P0 and P1 could not have successfully executed their while statements at about the same time, since the value of turn can be either 0 or 1 but cannot be both.

Hence, one of the processes (say, Pj) must have successfully executed the while statement, whereas Pi had to execute at least one additional statement ("turn == j").

However, since, at that time, flag[j] == true and turn == j, and this condition will persist as long as Pj is in its critical section, the result follows: mutual exclusion is preserved.

15 of 100

To prove properties 2 and 3, we note that a process Pi can be prevented from entering the critical section only if it is stuck in the while loop with the condition flag[j] == true and turn == j; this loop is the only one possible.

If Pj is not ready to enter the critical section, then flag[j] == false, and Pi can enter its critical section.

If Pj has set flag[j] to true and is also executing in its while statement, then either turn == i or turn == j. If turn == i, then Pi will enter the critical section. If turn == j, then Pj will enter the critical section.

However, once Pj exits its critical section, it will reset flag[j] to false, allowing Pi to enter its critical section.

If Pj resets flag[j] to true, it must also set turn to i. Thus, since Pi does not change the value of the variable turn while executing the while statement, Pi will enter the critical section (progress) after at most one entry by Pj (bounded waiting).

16 of 100

Synchronization Hardware

In general, we can state that any solution to the critical-section problem requires a simple tool: a lock.

Race conditions are prevented by requiring that critical regions be protected by locks. That is, a process must acquire a lock before entering a critical section; it releases the lock when it exits the critical section.

Solution to Critical Section Problem using Locks.

 

do {
    acquire lock
        critical section
    release lock
        remainder section
} while (true);

 


17 of 100

The critical-section problem could be solved simply in a uniprocessor environment if we could prevent interrupts from occurring while a shared variable was being modified.

In this manner, we could be sure that the current sequence of instructions would be allowed to execute in order without preemption.

No other instructions would be run, so no unexpected modifications could be made to the shared variable. This is the approach taken by nonpreemptive kernels.

Unfortunately, this solution is not as feasible in a multiprocessor environment. Disabling interrupts on a multiprocessor can be time consuming, as the message is passed to all the processors. This message passing delays entry into each critical section, and system efficiency decreases.

Many modern computer systems therefore provide special hardware instructions that allow us either to test and modify the content of a word or to swap the contents of two words atomically, that is, as one uninterruptible unit.

We can use these special instructions to solve the critical-section problem in a relatively simple manner.

18 of 100

Test_and_Set Instruction

Definition:

boolean test_and_set(boolean *target)
{
    boolean rv = *target;
    *target = TRUE;
    return rv;
}

  1. Executed atomically
  2. Returns the original value of the passed parameter
  3. Sets the value of the passed parameter to TRUE

19 of 100

Solution using Test_and_Set Instruction

  • Shared Boolean variable lock, initialized to FALSE
  • Solution:

do {
    while (test_and_set(&lock))
        ; /* do nothing */

    /* critical section */

    lock = false;

    /* remainder section */
} while (true);

20 of 100

swap Instruction

The Swap () instruction, in contrast to the TestAndSet () instruction, operates on the contents of two words; it is defined as

void Swap(boolean *a, boolean *b)
{
    boolean temp = *a;
    *a = *b;
    *b = temp;
}

This is the definition of the Swap() instruction; it is executed atomically. If the machine supports the Swap() instruction, then mutual exclusion can be provided as follows.

A global Boolean variable lock is declared and is initialized to false. In addition, each process has a local Boolean variable key. The structure of process Pi is

21 of 100


do {
    key = TRUE;
    while (key == TRUE)
        Swap(&lock, &key);

    // critical section

    lock = FALSE;

    // remainder section
} while (TRUE);

Mutual-exclusion implementation with the Swap () instruction.

Although these algorithms satisfy the mutual-exclusion requirement, they do not satisfy the bounded-waiting requirement.

22 of 100

Another algorithm using the TestAndSet() instruction satisfies all the critical-section requirements. The common data structures are

boolean waiting[n];

boolean lock;

These data structures are initialized to false. To prove that the mutual-exclusion requirement is met, we note that process Pi can enter its critical section only if either waiting[i] == false or key == false.

The value of key can become false only if the TestAndSet() is executed. The first process to execute the TestAndSet() will find key == false; all others must wait.

The variable waiting[i] can become false only if another process leaves its critical section; only one waiting[i] is set to false, maintaining the mutual-exclusion requirement.

23 of 100

do {
    waiting[i] = TRUE;
    key = TRUE;
    while (waiting[i] && key)
        key = TestAndSet(&lock);
    waiting[i] = FALSE;

    // critical section

    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;

    if (j == i)
        lock = FALSE;
    else
        waiting[j] = FALSE;

    // remainder section
} while (TRUE);

Bounded-waiting mutual exclusion with TestAndSet().

24 of 100

To prove that the progress requirement is met, we note that the arguments presented for mutual exclusion also apply here, since a process exiting the critical section either sets lock to false or sets waiting[j] to false. Both allow a process that is waiting to enter its critical section to proceed.

To prove that the bounded-waiting requirement is met, we note that, when a process leaves its critical section, it scans the array waiting in the cyclic ordering (i + 1, i + 2, ..., n - 1, 0, ..., i - 1).

It designates the first process in this ordering that is in the entry section (waiting[j] == true) as the next one to enter the critical section. Any process waiting to enter its critical section will thus do so within n - 1 turns.

25 of 100

compare_and_swap Instruction

Definition:

int compare_and_swap(int *value, int expected, int new_value)
{
    int temp = *value;

    if (*value == expected)
        *value = new_value;

    return temp;
}

  1. Executed atomically
  2. Returns the original value of the passed parameter "value"
  3. Sets the variable "value" to "new_value", but only if "value" == "expected". That is, the swap takes place only under this condition.

26 of 100

Solution using compare_and_swap

Shared integer “lock” initialized to 0;

Solution:

do {
    while (compare_and_swap(&lock, 0, 1) != 0)
        ; /* do nothing */

    /* critical section */

    lock = 0;

    /* remainder section */
} while (true);
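The semantics on the previous slide can be checked directly. A minimal sketch, with a one-element list standing in for the int pointer since Python has no pointers; a real compare_and_swap would execute atomically in hardware:

```python
def compare_and_swap(cell, expected, new_value):
    """Return the original value of cell[0]; store new_value only if the
    original equals expected (cell is a 1-element list playing the role of int*)."""
    temp = cell[0]
    if cell[0] == expected:
        cell[0] = new_value
    return temp

lock = [0]

# First caller: lock holds 0 (== expected), so CAS stores 1 and returns the old 0.
assert compare_and_swap(lock, 0, 1) == 0 and lock[0] == 1

# Second caller: lock now holds 1 (!= expected 0), so CAS returns 1 and changes nothing.
assert compare_and_swap(lock, 0, 1) == 1 and lock[0] == 1

# Releasing the lock (lock = 0) lets the next CAS succeed again.
lock[0] = 0
assert compare_and_swap(lock, 0, 1) == 0
```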

27 of 100

MUTEX LOCKS

Previous solutions are complicated and generally inaccessible to application programmers

OS designers build software tools to solve critical section problem

Simplest is mutex lock

Protect a critical section by first calling acquire() to obtain the lock and then release() to give it up

Boolean variable indicating if lock is available or not

Calls to acquire() and release() must be atomic; they are usually implemented via hardware atomic instructions. This solution requires busy waiting, so the lock is also called a spinlock.

28 of 100

acquire() and release()

acquire() {
    while (!available)
        ; /* busy wait */
    available = false;
}

release() {

available = true;

}

do {

acquire lock

critical section

release lock

remainder section

} while (true);

29 of 100

Semaphores

The various hardware-based solutions to the critical-section problem (using the TestAndSet() and Swap() instructions) are complicated for application programmers to use.

To overcome this difficulty, we can use a synchronization tool called a semaphore.

A semaphore S is an integer variable that, apart from initialization, is accessed only through two standard atomic operations: wait() and signal().

The wait() operation was originally termed P (from the Dutch proberen, "to test"); signal() was originally called V (from verhogen, "to increment").

30 of 100

The definition of wait () is as follows:

wait(S) {
    while (S <= 0)
        ; // no-op
    S--;
}

The definition of signal() is as follows:

signal(S) {
    S++;
}

All the modifications to the integer value of the semaphore in the wait () and signal() operations must be executed indivisibly.

That is, when one process modifies the semaphore value, no other process can simultaneously modify that same semaphore value.

In addition, in the case of wait (S), the testing of the integer value of S (S<= 0), and its possible modification (S--), must also be executed without interruption.

31 of 100

Usage Of Semaphore

Operating systems often distinguish between counting and binary semaphores.

The value of a counting semaphore can range over an unrestricted domain.

The value of a binary semaphore can range only between 0 and 1.

On some systems, binary semaphores are known as mutex locks, as they are locks that provide mutual exclusion.

We can use binary semaphores to deal with the critical-section problem for multiple processes.

The n processes share a semaphore, mutex, initialized to 1.

32 of 100

Each process Pi is organized as

do {
    wait(mutex);

    // critical section

    signal(mutex);

    // remainder section
} while (TRUE);

Mutual-exclusion implementation with semaphores.

Counting semaphores can be used to control access to a given resource consisting of a finite number of instances.

The semaphore is initialized to the number of resources available.

Each process that wishes to use a resource performs a wait() operation on the semaphore (thereby decrementing the count).

When a process releases a resource, it performs a signal () operation (incrementing the count).

When the count for the semaphore goes to 0, all resources are being used. After that, processes that wish to use a resource will block until the count becomes greater than 0.

33 of 100

We can also use semaphores to solve various synchronization problems.

For example, consider two concurrently running processes: P1 with a statement S1 and P2 with a statement S2.

Suppose we require that S2 be executed only after S1 has completed.

We can implement this scheme readily by letting P1 and P2 share a common semaphore synch, initialized to 0, and by inserting the statement

S1;

signal(synch);

in process P1, and the statements

wait(synch);

S2;

in process P2. Because synch is initialized to 0, P2 will execute S2 only after P1 has invoked signal(synch), which is after statement S1 has been executed.
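This ordering scheme can be sketched with Python's threading.Semaphore playing the role of synch (acquire() corresponds to wait() and release() to signal()); the order list records which statement ran first:

```python
import threading

synch = threading.Semaphore(0)   # counterpart of semaphore synch, initialized to 0
order = []

def p1():
    order.append("S1")           # statement S1
    synch.release()              # signal(synch)

def p2():
    synch.acquire()              # wait(synch): blocks until P1 has signalled
    order.append("S2")           # statement S2

t2 = threading.Thread(target=p2)
t1 = threading.Thread(target=p1)
t2.start()                       # deliberately start the waiting side first
t1.start()
t1.join(); t2.join()
print(order)  # ['S1', 'S2'] regardless of scheduling
```

Even though P2's thread is started first, it blocks in wait(synch) until P1 signals, so S2 can never precede S1.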

34 of 100

Implementation Of Semaphore

The main disadvantage of the semaphore definition given here is that it requires busy waiting.

While a process is in its critical section, any other process that tries to enter its critical section must loop continuously in the entry code.

This continual looping is clearly a problem in a real multiprogramming system, where a single CPU is shared among many processes.

Busy waiting wastes CPU cycles that some other process might be able to use productively.

This type of semaphore is also called a spinlock because the process "spins" while waiting for the lock.

Spinlocks do have an advantage in that no context switch is required when a process must wait on a lock, and a context switch may take considerable time. Thus, when locks are expected to be held for short times, spinlocks are useful.

35 of 100

To overcome the need for busy waiting, we can modify the definition of the wait() and signal() semaphore operations.

When a process executes the wait () operation and finds that the semaphore value is not positive, it must wait.

However, rather than engaging in busy waiting, the process can block

itself.

The block operation places a process into a waiting queue associated with the semaphore, and the state of the process is switched to the waiting state. Then control is transferred to the CPU scheduler, which selects another process to execute.

A process that is blocked, waiting on a semaphore S, should be restarted when some other process executes a signal() operation.

The process is restarted by a wakeup () operation, which changes the process from the waiting state to the ready state. The process is then placed in the ready queue.

36 of 100

To implement semaphores under this definition, we define a semaphore as a "C" struct:

typedef struct {
    int value;
    struct process *list;
} semaphore;

Each semaphore has an integer value and a list of processes list. When a process must wait on a semaphore, it is added to the list of processes. A Signal() operation removes one process from the list of waiting processes and awakens that process.

The wait () semaphore operation can now be defined as

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block();
    }
}

37 of 100

The signal () semaphore operation can now be defined as

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}

The block () operation suspends the process that invokes it. The wakeup (P) operation resumes the execution of a blocked process P. These two operations are provided by the operating system as basic system calls.

Under the classical definition of semaphores with busy waiting, the semaphore value is never negative; this implementation, however, may have negative semaphore values.

If the semaphore value is negative, its magnitude is the number of processes waiting on that semaphore.

The list of waiting processes can be easily implemented by a link field in each process control block (PCB). Each semaphore contains an integer value and a pointer to a list of PCBs.
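A sketch of this definition in Python: the class mirrors the C struct (an integer value plus a list of waiting processes), with a condition variable standing in for the kernel's block() and wakeup() calls. The names are illustrative.

```python
import threading
import time

class Semaphore:
    """Blocking semaphore per the slides: value may go negative; its magnitude
    then counts the processes waiting on the semaphore."""
    def __init__(self, value):
        self.value = value
        self.waiters = []                    # the 'list' field: who is blocked
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            self.value -= 1
            if self.value < 0:
                me = threading.current_thread().name
                self.waiters.append(me)      # add this process to S->list
                while me in self.waiters:    # block() until a signal removes us
                    self._cond.wait()

    def signal(self):
        with self._cond:
            self.value += 1
            if self.value <= 0:
                self.waiters.pop(0)          # remove a process P from S->list (FIFO)
                self._cond.notify_all()      # wakeup(P)

S = Semaphore(1)
S.wait()                                     # value: 1 -> 0, no blocking

t = threading.Thread(target=S.wait, name="blocked-proc")
t.start()
time.sleep(0.2)                              # let the thread reach wait() and block
print(S.value)    # -1: its magnitude is the number of waiting processes

S.signal()                                   # value: -1 -> 0, wakes the waiter
t.join()
```

The FIFO pop(0) in signal() is what gives bounded waiting, exactly as the next slide's FIFO-queue remark describes.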

38 of 100

One way to add and remove processes from the list in a way that ensures bounded waiting is to use a FIFO queue, where the semaphore contains both head and tail pointers to the queue.

The critical aspect of semaphores is that they be executed atomically. We must guarantee that no two processes can execute wait () and signal() operations on the same semaphore at the same time.

This is a critical-section problem; and in a single-processor environment (that is, where only one CPU exists), we can solve it by simply inhibiting interrupts during the time the wait () and signal () operations are executing.

This scheme works in a single processor environment because, once interrupts are inhibited, instructions from different processes cannot be interleaved.

In a multiprocessor environment, interrupts must be disabled on every processor; Disabling interrupts on every processor can be a difficult task and furthermore can seriously diminish performance.

Therefore, SMP systems must provide alternative locking techniques-such as spinlocks-to ensure that wait () and signal() are performed atomically.

39 of 100

Deadlock and Starvation

The implementation of a semaphore with a waiting queue may result in a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes. When such a state is reached, these processes are said to be deadlocked.

 

To illustrate this, let us consider two processes, P0 and P1, each accessing two semaphores, S and Q, initialized to 1:

 

40 of 100

Let S and Q be two semaphores initialized to 1

P0              P1
wait(S);        wait(Q);
wait(Q);        wait(S);
...             ...
signal(S);      signal(Q);
signal(Q);      signal(S);

Suppose that P0 executes wait(S) and then P1 executes wait(Q). When P0 executes wait(Q), it must wait until P1 executes signal(Q); similarly, when P1 executes wait(S), it must wait until P0 executes signal(S). Since neither signal() operation can be executed, P0 and P1 are deadlocked.

 

Starvation, or indefinite blocking, is a related problem: a process may never be removed from the semaphore queue in which it is suspended.

41 of 100

Classical Problems of Synchronization

Bounded-Buffer Problem

Readers and Writers Problem

Dining-Philosophers Problem

Bounded-Buffer Problem

 Let us assume N buffers, each can hold only one item.

 

Semaphore mutex, initialized to 1, is used to provide mutual exclusion.

Semaphore full, initialized to 0, counts the full buffers.

Semaphore empty, initialized to N, counts the empty buffers.

42 of 100

The structure of the producer process

while (true) {

……….

produce an item

……….

wait (empty);

wait (mutex);

……….

add the item to the buffer

……….

signal (mutex);

signal (full);

  }

43 of 100

The structure of the consumer process

while (true) {

  wait (full);

  wait (mutex);

……….

  remove an item from buffer

……….

signal (mutex);

  signal (empty);

……….  

consume the removed item

……….

  }
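The two structures above can be run together as a sketch, with threading.Semaphore in the roles of empty, full, and mutex; the buffer size and item count here are arbitrary (in_ is used because in is a Python keyword):

```python
import threading

BUFFER_SIZE = 3
buffer = [None] * BUFFER_SIZE
in_, out = 0, 0
ITEMS = 10

mutex = threading.Semaphore(1)            # mutual exclusion on the buffer
empty = threading.Semaphore(BUFFER_SIZE)  # counts empty slots, starts at N
full = threading.Semaphore(0)             # counts full slots, starts at 0

consumed = []

def producer():
    global in_
    for item in range(ITEMS):
        empty.acquire()                   # wait(empty)
        mutex.acquire()                   # wait(mutex)
        buffer[in_] = item                # add the item to the buffer
        in_ = (in_ + 1) % BUFFER_SIZE
        mutex.release()                   # signal(mutex)
        full.release()                    # signal(full)

def consumer():
    global out
    for _ in range(ITEMS):
        full.acquire()                    # wait(full)
        mutex.acquire()                   # wait(mutex)
        item = buffer[out]                # remove an item from the buffer
        out = (out + 1) % BUFFER_SIZE
        mutex.release()                   # signal(mutex)
        empty.release()                   # signal(empty)
        consumed.append(item)             # consume the removed item

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
print(consumed)  # [0, 1, ..., 9]: FIFO order, nothing lost or duplicated
```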

44 of 100

Readers-Writers Problem

 A data set is shared among a number of concurrent processes

 

– Readers – only read the data set, do not perform any updates

 

– Writers – can both read and write the data set (perform the updates).

 

If two readers access the shared data simultaneously, there will be no problem. However, if a writer and some other process (reader or writer) access the shared data simultaneously, there may be a problem.

In the solution of the readers-writers problem, the reader processes share the following data structures:

Semaphore mutex, wrt;

int read_count;

where semaphores mutex and wrt are initialized to 1, and the integer read_count is initialized to 0.

 

45 of 100

The structure of a writer process

do {
    wait(wrt);
    ...
    /* writing is performed */
    ...
    signal(wrt);
} while (true);

First variation – no reader is kept waiting unless a writer has already obtained permission to use the shared object.

Second variation – once a writer is ready, it performs its write as soon as possible. Both variations may lead to starvation, which has led to even more variations. On some systems, the problem is solved by the kernel providing reader-writer locks.

46 of 100

The structure of a reader process

do {
    wait(mutex);
    read_count++;
    if (read_count == 1)
        wait(wrt);
    signal(mutex);
    ...
    /* reading is performed */
    ...
    wait(mutex);
    read_count--;
    if (read_count == 0)
        signal(wrt);
    signal(mutex);
} while (true);
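Both structures can be exercised together as a sketch in Python. Each writer performs a deliberately non-atomic read-modify-write, so the final value of data is correct only if wrt really serializes the writers; the thread counts are arbitrary.

```python
import threading

mutex = threading.Semaphore(1)   # protects read_count
wrt = threading.Semaphore(1)     # held by writers, or by the group of readers
read_count = 0
data = 0
snapshots = []

def writer(times):
    global data
    for _ in range(times):
        wrt.acquire()            # wait(wrt)
        tmp = data               # non-atomic read-modify-write of the data set
        data = tmp + 1
        wrt.release()            # signal(wrt)

def reader(times):
    global read_count
    for _ in range(times):
        mutex.acquire()
        read_count += 1
        if read_count == 1:      # first reader locks out the writers
            wrt.acquire()
        mutex.release()
        snapshots.append(data)   # reading is performed
        mutex.acquire()
        read_count -= 1
        if read_count == 0:      # last reader lets the writers back in
            wrt.release()
        mutex.release()

threads = [threading.Thread(target=writer, args=(50,)) for _ in range(2)]
threads += [threading.Thread(target=reader, args=(100,)) for _ in range(3)]
for t in threads: t.start()
for t in threads: t.join()
print(data)  # 100: the two writers' updates never interleaved
```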

47 of 100

Dining-Philosophers Problem

Consider five philosophers who spend their lives thinking and eating.

The philosophers share a circular table surrounded by five chairs, each belonging to one philosopher. In the centre of the table is a bowl of rice, and the table is laid with five single chopsticks . When a philosopher thinks, she does not interact with her colleagues.

From time to time, a philosopher gets hungry and tries to pick up the two chopsticks that are closest to her (the chopsticks that are between her and her left and right neighbours).

48 of 100

A philosopher may pick up only one chopstick at a time. Obviously, she cannot pick up a chopstick that is already in the hand of a neighbor.

When a hungry philosopher has both her chopsticks at the same time, she eats without releasing her chopsticks.

When she is finished eating, she puts down both of her chopsticks and starts thinking again.

it is an example of a large class of concurrency-control problems. It is a simple representation of the need to allocate several resources among several processes in a deadlock-free and starvation-free manner.

One simple solution is to represent each chopstick with a semaphore. A philosopher tries to grab a chopstick by executing a wait () operation on that semaphore; she releases her chopsticks by executing the signal () operation on the appropriate semaphores.

Thus, the shared data are

semaphore chopstick[5];

where all the elements of chopstick are initialized to 1.

49 of 100

The structure of Philosopher i:

do {
    wait(chopstick[i]);
    wait(chopstick[(i + 1) % 5]);

    // eat

    signal(chopstick[i]);
    signal(chopstick[(i + 1) % 5]);

    // think
} while (TRUE);

Although this solution guarantees that no two neighbors are eating simultaneously, it nevertheless must be rejected because it could create a deadlock.

Suppose that all five philosophers become hungry simultaneously and each grabs her left chopstick. All the elements of chopstick will now be equal to 0. When each philosopher tries to grab her right chopstick, she will be delayed forever.

50 of 100

Deadlock handling

We present several remedies to the dining-philosophers problem that ensure freedom from deadlock:

• Allow at most four philosophers to be sitting simultaneously at the table.

• Allow a philosopher to pick up her chopsticks only if both chopsticks are available (to do this, she must pick them up in a critical section).

• Use an asymmetric solution; that is, an odd philosopher picks up first her left chopstick and then her right chopstick, whereas an even philosopher picks up her right chopstick and then her left chopstick.

Finally, any satisfactory solution to the dining-philosophers problem must guard against the possibility that one of the philosophers will starve to death. A deadlock-free solution does not necessarily eliminate the possibility of starvation.
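As a sketch, the asymmetric remedy can be implemented with one semaphore per chopstick. Because odd and even philosophers acquire in opposite orders, the circular wait of the naive solution cannot form, so every thread terminates; the number of meals is arbitrary.

```python
import threading

N, MEALS = 5, 3
chopstick = [threading.Semaphore(1) for _ in range(N)]  # all initialized to 1
eaten = [0] * N

def philosopher(i):
    # chopstick[i] is philosopher i's left chopstick, chopstick[(i+1) % N] her right.
    # Asymmetry: odd philosophers take the left first, even ones the right first.
    if i % 2 == 1:
        first, second = i, (i + 1) % N
    else:
        first, second = (i + 1) % N, i
    for _ in range(MEALS):
        chopstick[first].acquire()       # wait() on the first chopstick
        chopstick[second].acquire()      # wait() on the second chopstick
        eaten[i] += 1                    # eat
        chopstick[second].release()      # signal() both chopsticks
        chopstick[first].release()
        # think

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
print(eaten)  # [3, 3, 3, 3, 3]: everyone ate; no deadlock occurred
```

With the symmetric order (everyone left-first) this program could hang forever, exactly as the earlier slide describes.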

51 of 100

Monitors

Although semaphores provide a convenient and effective mechanism for process synchronization, using them incorrectly can result in timing errors that are difficult to detect, since these errors happen only if some particular execution sequences take place and these sequences do not always occur.

We have seen an example of such errors in the use of counters in our solution to the producer-consumer problem.

Unfortunately, such timing errors can still occur when semaphores are used. To illustrate how, we review the semaphore solution to the critical-section problem.

All processes share a semaphore variable mutex, which is initialized to 1. Each process must execute wait (mutex) before entering the critical section and signal (mutex) afterward.

52 of 100

If this sequence is not observed, two processes may be in their critical sections simultaneously.

Let us examine the various difficulties that may result. Note that these difficulties will arise even if a single process is not well behaved.

Suppose that a process interchanges the order in which the wait() and signal() operations on the semaphore mutex are executed, resulting in the following execution:

signal(mutex);

critical section

wait(mutex);

In this situation, several processes may be executing in their critical sections simultaneously, violating the mutual-exclusion requirement.

This error may be discovered only if several processes are simultaneously active in their critical sections. Note that this situation may not always be reproducible.

53 of 100

• Suppose that a process replaces signal(mutex) with wait(mutex). That is, it executes

wait(mutex);

critical section

wait(mutex);

In this case, a deadlock will occur.

• Suppose that a process omits the wait(mutex), or the signal(mutex), or both. In this case, either mutual exclusion is violated or a deadlock will occur.

These examples illustrate that various types of errors can be generated easily when programmers use semaphores incorrectly to solve the critical-section problem.

To deal with such errors, researchers have developed a high-level language synchronization construct called the monitor.

54 of 100

Monitor is a high-level abstraction that provides a convenient and effective mechanism for process synchronization.

It is an abstract data type that encapsulates private data with public methods to operate on that data.

A monitor type presents a set of programmer-defined operations that are provided with mutual exclusion within the monitor.

The monitor type also contains the declaration of variables whose values define the state of an instance of that type, along with the bodies of procedures or functions that operate on those variables.

The syntax of a monitor is

monitor monitor-name

{

// shared variable declarations

procedure P1 (…) { … }

…

procedure Pn (…) { … }

initialization code (…) { … }

}

55 of 100

The representation of a monitor type cannot be used directly by the various processes.

Thus, a procedure defined within a monitor can access only those variables declared locally within the monitor and its formal parameters. Similarly, the local variables of a monitor can be accessed by only the local procedures.

The monitor construct ensures that only one process at a time can be active within the monitor.

Consequently, the programmer does not need to code this synchronization constraint explicitly

However, the monitor construct, as defined so far, is not sufficiently powerful for modeling some synchronization schemes.

For this purpose, we need to define additional synchronization mechanisms. These mechanisms are provided by the condition construct.

A programmer who needs to write a tailor-made synchronization scheme can define one or more variables of type condition:

56 of 100

Schematic view of a Monitor

57 of 100

condition x, y;

The only operations that can be invoked on a condition variable are wait() and signal().

The operation

x.wait();

means that the process invoking this operation is suspended until another process invokes

x.signal();

The x.signal() operation resumes exactly one suspended process. If no process is suspended, then the signal() operation has no effect; that is, the state of x is the same as if the operation had never been executed.

Contrast this operation with the signal() operation associated with semaphores, which always affects the state of the semaphore.

58 of 100

Monitor with Condition Variables

59 of 100

Condition Variables Choices

Now suppose that, when the x.signal() operation is invoked by a process P, there is a suspended process Q associated with condition x. Clearly, if the suspended process Q is allowed to resume its execution, the signaling process P must wait. Otherwise, both P and Q would be active simultaneously within the monitor.

Note, however, that both processes can conceptually continue with their execution. Two possibilities exist:

Signal and wait – P waits until Q either leaves the monitor or it waits for another condition

Signal and continue – Q waits until P either leaves the monitor or it waits for another condition

There are reasonable arguments in favor of adopting either option. On the one hand, since P was already executing in the monitor, the signal-and-continue method seems more reasonable.

60 of 100

On the other hand, if we allow thread P to continue, then by the time Q is resumed, the logical condition for which Q was waiting may no longer hold.

A compromise between these two choices was adopted in the language Concurrent Pascal. When thread P executes the signal operation, it immediately leaves the monitor. Hence, Q is immediately resumed.

61 of 100

Monitor Solution to Dining Philosophers

monitor DiningPhilosophers

{

enum {THINKING, HUNGRY, EATING} state[5];

condition self [5];

void pickup (int i) {

state[i] = HUNGRY;

test(i);

if (state[i] != EATING) self[i].wait();

}

void putdown (int i) {

state[i] = THINKING;

// test left and right neighbors

test((i + 4) % 5);

test((i + 1) % 5);

}

62 of 100

void test (int i) {

if ((state[(i + 4) % 5] != EATING) &&

(state[i] == HUNGRY) &&

(state[(i + 1) % 5] != EATING) ) {

state[i] = EATING ;

self[i].signal () ;

}

}

initialization_code() {

for (int i = 0; i < 5; i++)

state[i] = THINKING;

}

}

63 of 100

Each philosopher i invokes the operations pickup() and putdown() in the following sequence:

DiningPhilosophers.pickup(i);

EAT

DiningPhilosophers.putdown(i);

No deadlock, but starvation is possible

64 of 100

DEADLOCKS

Under the normal mode of operation, a process may utilize a resource in only the following sequence:

1. Request: If the request cannot be granted immediately (for example, the resource is being used by another process), then the requesting process must wait until it can acquire the resource.

2. Use: The process can operate on the resource (for example, if

the resource is a printer, the process can print on the printer).

3. Release: The process releases the resource.

A system table records whether each resource is free or allocated, and, if a resource is allocated, to which process. If a process requests a resource that is currently allocated to another process, it can be added to a queue of processes waiting for this resource.

65 of 100

To illustrate a deadlock state, we consider a system with three tape drives. Suppose that there are three processes, each holding one of these tape drives. If each process now requests another tape drive, the three processes will be in a deadlock state. Each is waiting for the event "tape drive is released," which can be caused only by one of the other waiting processes.

Similarly, consider a system with one printer and one tape drive. Suppose that process Pi is holding the tape drive and process Pj is holding the printer. If Pi requests the printer and Pj requests the tape drive, a deadlock occurs.

Deadlock Characterization

Necessary Conditions

A deadlock situation can arise if the following four conditions hold simultaneously in a system:

  1. Mutual exclusion: At least one resource must be held in a non-sharable mode; that is, only one process at a time can use the resource. If another process requests that resource, the requesting process must be delayed until the resource has been released.

66 of 100

2. Hold and wait: There must exist a process that is holding at least one resource and waiting to acquire additional resources that are currently being held by other processes.

3. No preemption: Resources cannot be preempted; that is, a resource can be released only voluntarily by the process holding it, after that process has completed its task.

4. Circular wait: There must exist a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2,..., Pn- 1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.

We emphasize that all four conditions must hold for a deadlock to occur. The circular-wait condition implies the hold-and-wait condition, so the four conditions are not completely independent.

67 of 100

Resource-Allocation Graph

Deadlocks can be described using a directed graph called a system resource-allocation graph. This graph consists of a set of vertices V and a set of edges E. The set of vertices V is partitioned into two different types of nodes: P = {P1, P2, ..., Pn}, the set consisting of all the active processes in the system, and R = {R1, R2, ..., Rm}, the set consisting of all resource types in the system.

A directed edge from process Pi to resource type Rj is denoted by Pi —> Rj; it signifies that process Pi has requested an instance of resource type Rj and is currently waiting for that resource. A directed edge from resource type Rj to process Pi is denoted by Rj —> Pi; it signifies that an instance of resource type Rj has been allocated to process Pi. A directed edge Pi —> Rj is called a request edge; a directed edge Rj —> Pi is called an assignment edge. Pictorially, we represent each process Pi as a circle, and each resource type Rj as a square. Since resource type Rj may have more than one instance, we represent each such instance as a dot within the square.

68 of 100

When process Pi requests an instance of resource type Rj, a request edge is inserted in the resource-allocation graph. When this request can be fulfilled, the request edge is instantaneously transformed to an assignment edge. When the process no longer needs access to the resource it releases the resource, and as a result the assignment edge is deleted.

Example: The resource-allocation graph shown above has the following sets P, R, and E:

 

P = { P1, P2, P3 }

R = { R1, R2, R3, R4 }

 

E = { P1 —> R1, P2 —> R3, R1 —> P2, R2 —> P2, R2 —> P1, R3 —> P3 }

 

• Resource instances:

One instance of resource type R1

Two instances of resource type R2

One instance of resource type R3

Three instances of resource type R4

69 of 100

Process states:

Process P1 is holding an instance of resource type R2, and is waiting for an instance of resource type R1.

Process P2 is holding an instance of R1 and R2, and is waiting for an instance of resource type R3.

Process P3 is holding an instance of R3.

70 of 100

Given the definition of a resource-allocation graph, it can be shown that, if the graph contains no cycles, then no process in the system is deadlocked. If, on the other hand, the graph contains a cycle, then a deadlock may exist.

If each resource type has several instances, then a cycle does not necessarily imply that a deadlock occurred. In this case, a cycle in the graph is a necessary but not a sufficient condition for the existence of deadlock.

To illustrate this concept, let us return to the resource-allocation graph depicted in the above figure. Suppose that process P3 requests an instance of resource type R2. A request edge P3 —> R2 is added to the graph. At this point, two minimal cycles exist in the system.

71 of 100

Processes P1, P2, and P3 are deadlocked. Process P2 is waiting for the resource R3, which is held by process P3. Process P3, on the other hand, is waiting for either process P1 or process P2 to release resource R2. In addition, process P1 is waiting for process P2 to release resource R1.

 

Now consider the following resource-allocation graph. In this example, we also have a cycle.

72 of 100

However, there is no deadlock. Observe that process P4 may release its instance of resource type R2. That resource can then be allocated to P3, breaking the cycle.

In summary, if a resource-allocation graph does not have a cycle, then the system is not in a deadlock state. On the other hand, if there is a cycle, then the system may or may not be in a deadlock state.

73 of 100

Methods for Handling Deadlocks

Principally, there are three different methods for dealing with the deadlock problem:

  • We can use a protocol to ensure that the system will never enter a deadlock state.
  • We can allow the system to enter a deadlock state and then recover.
  • We can ignore the problem altogether, and pretend that deadlocks never occur in the system.

To ensure that deadlocks never occur, the system can use either a deadlock prevention or a deadlock-avoidance scheme. Deadlock prevention is a set of methods for ensuring that at least one of the necessary conditions cannot hold.

Deadlock avoidance, on the other hand, requires that the operating system be given in advance additional information concerning which resources a process will request and use during its lifetime. With this additional knowledge, we can decide for each request whether or not the process should wait.

74 of 100

If a system does not employ either a deadlock-prevention or a deadlock avoidance algorithm, then a deadlock situation may occur. In this case, the system must provide an algorithm to detect deadlock and an algorithm to recover from the deadlock.

Deadlock Prevention

For a deadlock to occur, each of the four necessary conditions must hold. By ensuring that at least one of these conditions cannot hold, we can prevent the occurrence of a deadlock.

  1. Mutual Exclusion

The mutual-exclusion condition must hold for nonsharable resources. For example, a printer cannot be simultaneously shared by several processes. Sharable resources, on the other hand, do not require mutually exclusive access, and thus cannot be involved in a deadlock. Read-only files are a good example of a sharable resource. If several processes attempt to open a read-only file at the same time, they can be granted simultaneous access to the file. In general, however, we cannot prevent deadlocks by denying the mutual-exclusion condition: some resources are intrinsically nonsharable.

75 of 100

2. Hold and Wait

To ensure that the hold-and-wait condition never occurs in the system, we must guarantee that, whenever a process requests a resource, it does not hold any other resources. One protocol that can be used requires each process to request and be allocated all its resources before it begins execution.

An alternative protocol allows a process to request resources only when the process has none. A process may request some resources and use them. Before it can request any additional resources, however, it must release all the resources that it is currently allocated.

To illustrate the difference between these two protocols, we consider a process that copies data from a tape drive to a disk file, sorts the disk file, and then prints the results to a printer. If all resources must be requested at the beginning of the process, then the process must initially request the tape drive, disk file, and the printer. It will hold the printer for its entire execution, even though it needs the printer only at the end.

76 of 100

The second method allows the process to request initially only the tape drive and disk file. It copies from the tape drive to the disk, then releases both the tape drive and the disk file. The process must then again request the disk file and the printer. After copying the disk file to the printer, it releases these two resources and terminates.

There are two main disadvantages to these protocols. First, resource utilization may be low, since many of the resources may be allocated but unused for a long period. In the example given, for instance, we can release the tape drive and disk file, and then again request the disk file and printer, only if we can be sure that our data will remain on the disk file. If we cannot be assured that they will, then we must request all resources at the beginning for both protocols.

Second, starvation is possible. A process that needs several popular resources may have to wait indefinitely, because at least one of the resources that it needs is always allocated to some other process.

77 of 100

3. No Preemption

The third necessary condition is that there be no preemption of resources that have already been allocated. To ensure that this condition does not hold, we can use the following protocol. If a process that is holding some resources requests another resource that cannot be immediately allocated to it, then all resources are implicitly released. The preempted resources are added to the list of resources for which the process is waiting. The process will be restarted only when it can regain its old resources, as well as the new ones that it is requesting.

Alternatively, if a process requests some resources, we first check whether, they are available. If they are, we allocate them. If they are not available, we check whether they are allocated to some other process that is waiting for additional resources. If so, we preempt the desired resources from the waiting process and allocate them to the requesting process. If the resources are not either available or held by a waiting process, the requesting process must wait. While it is waiting, some of its resources may be preempted, but only if another process requests them.

78 of 100

4. Circular Wait

One way to ensure that the circular-wait condition never holds is to impose a total ordering of all resource types, and to require that each process requests resources in an increasing order of enumeration.

Let R = {R1, R2, ..., Rm} be the set of resource types. We assign to each resource type a unique integer number, which allows us to compare two resources and to determine whether one precedes another in our ordering. For example, the function F might be defined as follows:

F(tape drive) = 1, F(disk drive) = 5, F(printer) = 12.

We can now consider the following protocol to prevent deadlocks: Each process can request resources only in an increasing order of enumeration. That is, a process can initially request any number of instances of a resource type, say Ri. After that, the process can request instances of resource type Rj if and only if F(Rj) > F(Ri ). If several instances of the same resource type are needed, a single request for all of them must be issued.

79 of 100

If these two protocols are used, then the circular-wait condition cannot hold. We can demonstrate this fact by assuming that a circular wait exists (proof by contradiction). Let the set of processes involved in the circular wait be {P0, P1, ..., Pn}, where Pi is waiting for a resource Ri, which is held by process Pi+1. (Modulo arithmetic is used on the indexes, so that Pn is waiting for a resource Rn held by P0.) Then, since process Pi+1 is holding resource Ri while requesting resource Ri+1, we must have F(Ri) < F(Ri+1) for all i. But this condition means that

F(R0) < F(R1) < ... < F(Rn) < F(R0).

By transitivity, F(R0) < F(R0), which is impossible.

Therefore, there can be no circular wait.

80 of 100

Deadlock Avoidance

Possible side effects of preventing deadlocks are low device utilization and reduced system throughput. An alternative method for avoiding deadlocks is to require additional information about how resources are to be requested.

For example, in a system with one tape drive and one printer, we might be told that process P will request first the tape drive, and later the printer, before releasing both resources. Process Q, on the other hand, will request first the printer, and then the tape drive. With this knowledge of the complete sequence of requests and releases, the system can decide for each request whether or not the process should wait in order to avoid a possible future deadlock.

The simplest and most useful model requires that each process declare the maximum number of resources of each type that it may need. Given this a priori information, it is possible to construct an algorithm that ensures the system will never enter a deadlocked state. This defines the deadlock-avoidance approach.

81 of 100

Safe State

A state is safe if the system can allocate resources to each process (up to its maximum) in some order and still avoid a deadlock. More formally, a system is in a safe state only if there exists a safe sequence. A sequence of processes <P1, P2, ..., Pn> is a safe sequence for the current allocation state if, for each Pi, the resources that Pi can still request can be satisfied by the currently available resources plus the resources held by all Pj with j < i. If no such sequence exists, then the system state is said to be unsafe. A safe state is not a deadlock state. Conversely, a deadlock state is an unsafe state. Not all unsafe states are deadlocks, however; an unsafe state may lead to a deadlock.

To illustrate, we consider a system with 12 magnetic tape drives and 3 processes: P0, P1, and P2. Process P0 requires 10 tape drives, process P1 may need as many as 4, and process P2 may need up to 9 tape drives. Suppose that, at time t0, process P0 is holding 5 tape drives, process P1 is holding 2, and process P2 is holding 2 tape drives. (Thus, there are 3 free tape drives.)

82 of 100

At time t0, the system is in a safe state. The sequence <P1, P0, P2> satisfies the safety condition, since process P1 can immediately be allocated all its tape drives and then return them (the system will then have 5 available tape drives), then process P0 can get all its tape drives and return them (the system will then have 10 available tape drives), and finally process P2 could get all its tape drives and return them (the system will then have all 12 tape drives available).

Note that it is possible to go from a safe state to an unsafe state. Suppose that, at time T1, process P2 requests and is allocated 1 more tape drive. The system is no longer in a safe state. At this point, only process P1 can be allocated all its tape drives. When it returns them, the system will have only 4 available tape drives. Since process P0 is allocated 5 tape drives, but has a maximum of 10, it may then request 5 more tape drives. Since they are unavailable, process P0 must wait.

83 of 100

Similarly, process P2 may request an additional 6 tape drives and have to wait, resulting in a deadlock. Our mistake was in granting the request from process P2 for 1 more tape drive. If we had made P2 wait until either of the other processes had finished and released its resources, then we could have avoided the deadlock.

Given the concept of a safe state, we can define avoidance algorithms that ensure that the system will never deadlock. The idea is simply to ensure that the system will always remain in a safe state. Initially, the system is in a safe state. Whenever a process requests a resource that is currently available, the system must decide whether the resource can be allocated immediately or whether the process must wait. The request is granted only if the allocation leaves the system in a safe state.

Note that, in this scheme, if a process requests a resource that is currently available, it may still have to wait. Thus, resource utilization may be lower than it would be without a deadlock-avoidance algorithm.

84 of 100

Resource-Allocation Graph Algorithm

In a resource-allocation system with only one instance of each resource type, deadlock avoidance can be implemented with a variant of the resource-allocation graph. In addition to the request and assignment edges, we introduce a new type of edge, called a claim edge. A claim edge Pi —> Rj indicates that process Pi may request resource Rj at some time in the future. This edge resembles a request edge in direction, but is represented by a dashed line. When process Pi requests resource Rj, the claim edge Pi —> Rj is converted to a request edge. Similarly, when a resource Rj is released by Pi, the assignment edge Rj —> Pi is reconverted to a claim edge Pi —> Rj.

Suppose that process Pi requests resource Rj. The request can be granted only if converting the request edge Pi —> Rj to an assignment edge Rj —> Pi does not result in the formation of a cycle in the resource-allocation graph.

85 of 100

To illustrate this algorithm, we consider the resource-allocation graph above. Suppose that P2 requests R2. Although R2 is currently free, we cannot allocate it to P2, since this action will create a cycle in the graph. A cycle indicates that the system is in an unsafe state.

86 of 100

Banker's Algorithm

The resource-allocation graph algorithm is not applicable to a resource-allocation system with multiple instances of each resource type. The deadlock-avoidance algorithm that we describe next is applicable to such a system; it is commonly known as the banker's algorithm. The name was chosen because the algorithm could be used in a banking system to ensure that the bank never allocates its available cash in such a way that it can no longer satisfy the needs of all its customers. When a new process enters the system, it must declare the maximum number of instances of each resource type that it may need. This number may not exceed the total number of resources in the system. When a user requests a set of resources, the system must determine whether the allocation of these resources will leave the system in a safe state. If it will, the resources are allocated; otherwise, the process must wait until some other process releases enough resources.

Several data structures must be maintained to implement the banker's algorithm. These data structures encode the state of the resource-allocation system. Let n be the number of processes in the system and m be the number of resource types. We need the following data structures:

87 of 100

Available: A vector of length m indicates the number of available resources of each type. If Available[j] = k, there are k instances of resource type Rj available.

Max: An n x m matrix defines the maximum demand of each process. If Max[i,j] = k, then Pi may request at most k instances of resource type Rj.

Allocation: An n x m matrix defines the number of resources of each type currently allocated to each process. If Allocation[i,j]= k, then process Pi is currently allocated k instances of resource type Rj.

Need: An n x m matrix indicates the remaining resource need of each process. If Need[i,j] = k, then Pi may need k more instances of resource type Rj to complete its task. Note that Need[i,j] = Max[i,j] - Allocation[i,j].

88 of 100

Safety Algorithm 

1. Let Work and Finish be vectors of length m and n, respectively. Initialize:

Work = Available

Finish[i] = false for i = 0, 1, …, n − 1

2. Find an i such that both:

(a) Finish[i] == false

(b) Needi ≤ Work

If no such i exists, go to step 4

3. Work = Work + Allocationi; Finish[i] = true; go to step 2

4. If Finish[i] == true for all i, then the system is in a safe state

89 of 100

Resource-Request Algorithm

Requesti = request vector for process Pi. If Requesti[j] = k, then process Pi wants k instances of resource type Rj

1. If Requesti ≤ Needi, go to step 2. Otherwise, raise an error condition, since the process has exceeded its maximum claim

2. If Requesti ≤ Available, go to step 3. Otherwise, Pi must wait, since the resources are not available

3. Pretend to allocate the requested resources to Pi by modifying the state as follows:

Available = Available − Requesti

Allocationi = Allocationi + Requesti

Needi = Needi − Requesti

If the resulting resource-allocation state is safe, the transaction is completed and process Pi is allocated its resources.

• If safe ⇒ the resources are allocated to Pi

• If unsafe ⇒ Pi must wait, and the old resource-allocation state is restored

90 of 100

Example

Consider a system with five processes P0 through P4 and three resource types A, B, C. Resource type A has 10 instances, B has 5 instances, and resource type C has 7 instances. Suppose that, at time T0, the following snapshot of the system has been taken:

     Allocation   Max      Available

     A B C        A B C    A B C

P0   0 1 0        7 5 3    3 3 2

P1   2 0 0        3 2 2

P2   3 0 2        9 0 2

P3   2 1 1        2 2 2

P4   0 0 2        4 3 3

The content of the matrix Need is defined to be Max − Allocation:

     Need

     A B C

P0   7 4 3

P1   1 2 2

P2   6 0 0

P3   0 1 1

P4   4 3 1

We claim that the system is currently in a safe state. Indeed, the sequence <P1, P3, P4, P2, P0> satisfies the safety criteria.

91 of 100

Suppose now that process P1 requests one additional instance of resource type A and two instances of resource type C, so Request1 = (1,0,2). To decide whether this request can be immediately granted, we first check that Request1 ≤ Available (that is, (1,0,2) ≤ (3,3,2)), which is true. We then pretend that this request has been fulfilled, and we arrive at the following new state:

We must determine whether this new system state is safe. To do so, we execute our safety algorithm and find that the sequence <P1, P3, P4, P0, P2> satisfies our safety requirement. Hence, we can immediately grant the request of process P1.

A subsequent request for (3,3,0) by P4, however, cannot be granted, since the resources are not available.

A request for (0,2,0) by P0 cannot be granted either, even though the resources are available, since the resulting state is unsafe.

92 of 100

Deadlock Detection

If a system does not employ either a deadlock-prevention or a deadlock-avoidance algorithm, then a deadlock situation may occur. In this environment, the system must provide:

      • An algorithm that examines the state of the system to determine whether a deadlock has occurred and
      • An algorithm to recover from the deadlock

Single Instance of Each Resource Type

If all resources have only a single instance, then we can define a deadlock-detection algorithm that uses a variant of the resource-allocation graph, called a wait-for graph. We obtain this graph from the resource-allocation graph by removing the nodes of type resource and collapsing the appropriate edges. More precisely, an edge from Pi to Pj in a wait-for graph implies that process Pi is waiting for process Pj to release a resource that Pi needs. An edge Pi —> Pj exists in a wait-for graph if and only if the corresponding resource-allocation graph contains two edges Pi —> Rq and Rq —> Pj for some resource Rq.

93 of 100

For example, in Figure we present a resource-allocation graph and the corresponding wait-for graph. As before, a deadlock exists in the system if and only if the wait-for graph contains a cycle. To detect deadlocks, the system needs to maintain the wait-for graph and periodically to invoke an algorithm that searches for a cycle in the graph.

Several Instances of a Resource Type

The wait-for graph scheme is not applicable to a resource-allocation system with multiple instances of each resource type. The deadlock-detection algorithm that we describe next is applicable to such a system. The data structures used in the algorithm are similar to those used in the banker's algorithm.

94 of 100

Deadlock Detection Algorithm

1. Let Work and Finish be vectors of length m and n, respectively. Initialize:

(a) Work = Available

(b) For i = 1, 2, …, n, if Allocationi ≠ 0, then Finish[i] = false; otherwise, Finish[i] = true

2. Find an index i such that both:

(a) Finish[i] == false

(b) Requesti ≤ Work

If no such i exists, go to step 4

3. Work = Work + Allocationi; Finish[i] = true; go to step 2

4. If Finish[i] == false for some i, 1 ≤ i ≤ n, then the system is in a deadlocked state. Moreover, if Finish[i] == false, then Pi is deadlocked

95 of 100

To illustrate this algorithm, we consider a system with five processes P0 through P4 and three resource types A, B, C. Resource type A has 7 instances, resource type B has 2 instances, and resource type C has 6 instances. Suppose that, at time T0, we have the following resource-allocation state:

Five processes P0 through P4; three resource types A (7 instances), B (2 instances), and C (6 instances)

Snapshot at time T0:

     Allocation   Request   Available

     A B C        A B C     A B C

P0   0 1 0        0 0 0     0 0 0

P1   2 0 0        2 0 2

P2   3 0 3        0 0 0

P3   2 1 1        1 0 0

P4   0 0 2        0 0 2

96 of 100

We claim that the system is not in a deadlocked state. Indeed, if we execute our algorithm, we will find that the sequence <P0, P2, P3, P1, P4> will result in Finish[i] = true for all i.

P2 requests an additional instance of type C

     Request

     A B C

P0   0 0 0

P1   2 0 2

P2   0 0 1

P3   1 0 0

P4   0 0 2

State of system?

We can reclaim the resources held by process P0, but there are insufficient resources to fulfill the other processes' requests

Deadlock exists, consisting of processes P1, P2, P3, and P4

97 of 100

We claim that the system is now deadlocked. Although we can reclaim the resources held by process P0, the number of available resources is not sufficient to fulfill the requests of the other processes. Thus, a deadlock exists, consisting of processes P1, P2, P3, and P4.

Recovery from Deadlock

 

When a detection algorithm determines that a deadlock exists, several alternatives exist. One possibility is to inform the operator that a deadlock has occurred, and to let the operator deal with the deadlock manually. The other possibility is to let the system recover from the deadlock automatically. There are two options for breaking a deadlock. One solution is simply to abort one or more processes to break the circular wait. The second option is to preempt some resources from one or more of the deadlocked processes.

98 of 100

Process Termination

 

To eliminate deadlocks by aborting a process, we use one of two methods. In both methods, the system reclaims all resources allocated to the terminated processes.

 

Abort all deadlocked processes: This method clearly will break the deadlock cycle, but at great expense: these processes may have computed for a long time, and the results of these partial computations must be discarded and probably recomputed later.

Abort one process at a time until the deadlock cycle is eliminated: This method incurs considerable overhead, since, after each process is aborted, a deadlock-detection algorithm must be invoked to determine whether any processes are still deadlocked. 

99 of 100

Many factors may determine which process is chosen, including:

  1. What the priority of the process is
  2. How long the process has computed, and how much longer the process will compute before completing its designated task
  3. How many and what type of resources the process has used (for example, whether the resources are simple to preempt)
  4. How many more resources the process needs in order to complete
  5. How many processes will need to be terminated
  6. Whether the process is interactive or batch

Resource Preemption 

To eliminate deadlocks using resource preemption, we successively preempt some resources from processes and give these resources to other processes until the deadlock cycle is broken.

If preemption is required to deal with deadlocks, then three issues need to be addressed:

100 of 100

1. Selecting a victim: As in process termination, we must determine the order of preemption to minimize cost. Cost factors may include such parameters as the number of resources a deadlock process is holding, and the amount of time a deadlocked process has thus far consumed during its execution.

 

2. Rollback: If we preempt a resource from a process, that process cannot continue with its normal execution, since it is missing a needed resource. We must roll back the process to some safe state and restart it from that state. Since, in general, it is difficult to determine what a safe state is, the simplest solution is a total rollback: abort the process and then restart it.

 

3. Starvation: In a system where victim selection is based primarily on cost factors, it may happen that the same process is always picked as a victim. As a result, this process never completes its designated task, a starvation situation that must be dealt with in any practical system. Clearly, we must ensure that a process can be picked as a victim only a (small) finite number of times.