Home
/
Blog
/
Mutex and Semaphore in OS

Mutex and Semaphore in OS

What is Thread?

In operating systems, computer scientists mostly utilize threads to handle complex problems. Threads are pieces of code which represent a specified flow of execution through the process code. They have their separate program count to keep track of the execution of the next instruction. The system keeps track of every single thread, its functions, variables, and its execution stack.

The users also refer to the threads as the light-weight processes because they stimulate the process to achieve the desired outcome faster. As most computer systems nowadays have multi-processors, the idea behind using the threads in programs is to achieve parallelism while using the system’s resources efficiently. For instance, the threads have made it possible to open multiple tabs in the browsers where every tab does its works quickly and efficiently without disturbing other tabs’ working. The efficient execution of threads makes complex problems easy to solve and even give an optimal solution to easy ones. However, working with threads also raises the problem regarding efficient resource utilization. That is why the developers use mutexes and semaphores to control the proper execution of threads.

Note: Threads can create more threads and can communicate with each other as well.

What is Mutex?

Mutex is also known as the object of mutual exclusion. Mutex is a mechanism that locks a specified thread for a specified time. It enables only one thread to acquire it for a limited time. This specific time is called a critical section.

The developers use mutex inside the threads to avoid any drastic outcome of inefficient resource utilization, whether it is a variable, a function, an array, or a file. All the mutex objects have a unique name given to them since the beginning of the program.

Critical Section

While working with threads, it is highly probable that more than one processes have access to a single code segment. This code segment often includes variables and resources that are shared. It is essential to synchronize the process so that these shared resources remain consistent throughout the whole process. Otherwise, the results become unpredictable.

The common practice is that the thread acquires the mutex while entering the critical section and release it after the critical section is over. An example of mutes is the following:

wait (mutex);
 //Critical section
signal (mutex);

The figure below illustrates the two threads, i.e., thread1 and thread2. When the program calls both the threads, one of them who locks the mutex first gets to do the work inside its critical section while the other thread waits for the first thread to complete its specified work and release the mutex. Additionally, when the first thread unlocks the mutex, the second thread takes control and locks the mutex to do its work and then unlocks it afterwards. In the scenarios where the program calls the threads infinitely or multiple times, the threads then alternate the control between them.

Deadlock state in mutex

In multi-threaded or multi-processed programs, where synchronization is exceptionally crucial, sometimes the careless use of mutex can cause deadlock. It happens when threads hold each other’s mutex’s lock and depend on each other to unlock their locks first to proceed. These are the four conditions required to send the program into a deadlock state:

Mutual exclusion
Circular wait
No pre-emption
Hold and wait

Moreover, all these conditions mentioned above need to hold to send the program into a deadlock state. That is why the users must avoid at least one of these conditions to run their program smoothly. The most commonly used method is to lock and unlock the mutexes in a specific order throughout the program to avoid the circular wait.

Below is the sample code to elaborate the deadlock state in a better way:

//thread1
void *thread1(void* p){
            lock(&mutex1);
            lock(&mutex2);//doing some work in the thread
  int store = (int)p;
  store++;
            unlock(&mutex2);
            unlock(&mutex1);
}

//thread 2
void *thread2(void* p){
            lock(&mutex2);
            lock(&mutex1);

  int store = (int)p; //doing some work in the thread
  store++;
            unlock(&mutex1);
            unlock(&mutex2);
}

There are two mutexes, mutex1 and mutex2, and two thread functions, thread1, thread2, in this example. Both the threads lock the mutexes first and then do their work and, after that, unlock the mutexes and return the control. The above functions would work smoothly if one thread locks both mutexes first, then do its work and unlock both of them afterwards. Meanwhile, the other thread waits for the first thread to handover the control. However, in a multi-threading environment, both the threads’ functions can start simultaneously. In that case, thread1 and thread2 can lock the mutex1 and mutex2 respectively and go into infinite wait for the other thread to unlock the mutexes first; thus, a deadlock occurs. Therefore the programmers need to carefully order the mutexes keeping all the possible program flows in his mind.

What is Semaphore?

A semaphore is a variable that is applied to avoid multiple processes accessing a shared resource. For controlled usage of multiple shared resources between various threads, a semaphore contributes to maintaining the program’s concurrency. It proves helpful while working with a multitasking operating system.

Types of Semaphores

Semaphores are usually of two types. Following are the types of semaphores that assist in the process of synchronization:

1. Binary Semaphore

The binary semaphore is used for mutual exclusion and works almost the same as the mutex. It only takes two values, i.e., 0 and 1, and is often called a mutex. Although the implementation of binary semaphore and mutex is somehow similar, they both have different use purposes. The users should not mix the binary semaphore with the mutex because the semaphore is a signalling mechanism. In contrast, the mutex is a locking mechanism.

2. Counting Semaphore

The counting semaphore controls access to a resource. It is different from binary semaphore in a way that there is no known limit to its range. Semaphores help in solving problems of the critical section. These are integer variables known by the keywords: wait and signal.

Wait

The wait is a decrement operation that decreases the value of its provided parameter. The critical concept here is that wait does not decrease the negative value further. It decreases the argument by one if it is a positive integer. Otherwise, it doesn’t perform any action. Below is the simple illustration of the wait operation:

wait (argument){

    while( argument <= 0);
      s--;
}

Signal

Signal operation is the opposite of wait operation. It increases the value of the provided parameter. This operation sends an asynchronous notification to a specified thread to inform it of a specific event’s occurrence. The working behind the signal function is the following:

signal (argument)
{
    argument++;
}

Dining Philosophers problem

The dining philosopher problem is a famous scenario to understand the working of semaphore. This problem states that 5 philosophers sit on a circular table and share a rice bowl with only 5 chopsticks. However, each person needs two chopsticks to eat. In this situation, the philosophers and thinking and eating alternatively, so each philosopher waits for the two chopsticks to get free to eat and then put down both of them to think again. So, the problem is the order they should eat so that no one goes hungry, and there is no conflict in passing the chopsticks among them.

Solution

The problem’s solution is to declare five semaphores where each of them represents one chopstick, and the philosophers cannot disturb the chopsticks’ arrangement. In this way, whenever a philosopher picks chopsticks to eat, the program locks two consecutive chopstick semaphores. Additionally, the solution starts with the assumption that all the chopsticks are free, i.e., are in the bowl. All the semaphores’ initial values are one because the philosophers have not yet picked anything of them.

The following code represents the solution structure:

sempahore chopsticks[5] = {1,1,1,1,1};
do {
 wait ( chopsticks[p] );
 wait ( chopsticks [ (p + 1) % 5] );
 // eat rice
 signal ( chopsticks[p] );
 signal (chopsticks [ (p + 1) % 5] );
 // now think
} while (1);

As the values of the semaphores are initially 1, the program performs the first wait operation on the chopsticks[p] and chopsticks [(p+1) %5]. When these both go into a wait state, the philosopher eats rice and then puts down both together to think, and the other can use those chopsticks.

Lastly, there are a few things that the users must note about the solution:

Two neighbouring philosophers cannot eat at the same time.
Deadlock can occur if all the philosophers pick the chopsticks at the same time.
A philosopher should only pick chopsticks if both of them are available.