OpenMP Part-3: Learning Synchronization



What is Synchronization?

Synchronization is a technique that restricts a section of a program so that only one thread executes it at a time. Instead of all threads running that section simultaneously, they run it one after another. The sections being synchronized are usually referred to as Critical Sections. A Critical Section is a section of a multithreaded program that must run synchronously to produce correct results; most often it involves reading and writing a shared value.

Why do we need Synchronization?

As seen in the previous section, the shared value “counter”, which is read and written by all threads, gets its value overwritten because a thread does not know whether some other thread has updated it in the meantime. This is where synchronization comes in. Synchronization techniques let us make sure that the threads read and update counter one after the other, so that no update is overwritten.

Synchronization Techniques

There are many techniques for synchronizing critical sections in a parallel program. Broadly, there are two types of synchronization we can implement:

  • Barrier Synchronization: This works like a checkpoint in a racing game, except that every car must stop at the checkpoint and wait until all the other cars get there. Likewise, each thread waits at a Barrier until all other threads have arrived.
  • Mutual Exclusion: This involves defining a block of code that only one thread at a time can execute.

OpenMP provides a set of constructs that can be used to provide synchronization. They are mainly of two types:

  • High Level: Critical, Atomic, Barrier and Ordered
  • Low Level: Flush and Locks

In this series of articles we’ll focus on the Critical and Atomic constructs. Let’s make our previous counter program work correctly!

Let’s Code - SyncedCounter

For this example we borrow most of the code from the previous example and make a few changes to it.

#include <stdio.h>
#include <omp.h>

int main () {
    int counter = 0;

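    #pragma omp parallel
    {
        int thread_num = omp_get_thread_num();
        int num_of_threads = omp_get_num_threads();
        int my_count; /* this thread's private snapshot of counter */

        /* Only one thread at a time may run the block below. */
        #pragma omp critical
        {
            counter = counter + 1;
            my_count = counter; /* read inside the critical section to avoid a race */
        }
        printf("Thread ID: %d \t Total number of threads %d \t Counter Value: %d\n", thread_num, num_of_threads, my_count);
    }
    return 0;
}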
  • #pragma omp critical - this line uses OpenMP’s Critical construct to make sure that the statement or block that follows it is executed by only one thread at a time.

Surprise! On running the above code you’ll see that the maximum value of counter printed out is equal to the number of threads you have told OpenMP to use. The order in which the threads execute is still random, but we get the correct value for our counter variable. You can verify this by adding a print statement below the parallel block that outputs the final value of counter.

In this program our threads are created and parallel execution starts when we reach the #pragma omp parallel statement. All threads execute the lines that follow it simultaneously. When a thread reaches the #pragma omp critical statement, it checks whether another thread is currently inside the critical section. If so, it waits until that thread leaves; if not, it enters right away. Unlike a barrier, a critical section does not make threads wait for all other threads to arrive; it only makes them take turns executing the protected code, one at a time. Once a thread leaves the critical section, it continues with the rest of the parallel block alongside the other threads.

In the next module we’ll verify that parallelizing problems that involve little computation may actually increase execution time.