C++ Threads

When we write code and execute it, the code in execution is called a process. The operating system does a few things when we start executing the code: it assigns a stack and a heap to the process, and an entry is made in the Process Control Block (PCB), a data structure the operating system uses to manage all processes.

Most of our code is synchronous in nature, which means every line is executed sequentially, one after the other. This sequential approach has several downsides. Let’s explore them one by one:

  • Wait for disk operations
  • Wait for network responses
  • Slow or frozen user experience

Threads

We can split the code of a single process into multiple parts that can run independently by using Threads. Let’s see how threads solve the problems mentioned above.

Waiting for IO

Suppose there’s a text editor and we have written a 100-page movie script, but we have not formatted it yet. If we save the text, then, treating it as a 100 MB file, we have to wait a very long time for the text to be flushed to disk. This is because if a CPU operation takes 1 unit of time, a disk operation takes on the order of 15,000,000 to 45,000,000 units.

Now if we use threads to build the text editor, we can hand the save operation, which flushes the text to disk, to a separate thread. That thread saves the file to the hard disk while we continue formatting the text, and we can call the save operation again later.
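As a rough illustration of the idea, here is a minimal sketch; flush_to_disk and the document string are hypothetical stand-ins for the editor’s real save logic:

#include <fstream>
#include <string>
#include <thread>

// Hypothetical helper: writes the current document text to disk.
void flush_to_disk(std::string text) {
    std::ofstream out("script.txt");
    out << text;
}

int main() {
    std::string document = "...100 pages of unformatted script...";

    // Hand the slow disk write to a separate thread so the
    // main thread can keep handling edits and formatting.
    std::thread saver(flush_to_disk, document);  // the text is copied into the thread

    // ... continue formatting the document here ...

    saver.join();  // wait for the save to finish before exiting
    return 0;
}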

Poor User Experience

When we are not using threads, we have to wait, and waiting is something we do not want. Using threads helps enhance the user experience by making things feel faster and smoother.

Multi-Threaded Programming

Let’s see how we can use threads to write code. The traditional way of writing multi-threaded applications is to use POSIX threads, but we can use C++ threads, which are available out of the box since C++11. Comparatively, creating and using threads this way is a cakewalk (caution: only if you are confident about the concepts and implications of a complex multi-threaded program).

Creating Threads

We can use the thread library to create and use threads in C++ 11. Let’s create the simplest thread example.

#include <iostream>
#include <string>
#include <thread>

void hello(std::string name) {
    std::cout << "Hello " << name << std::endl;
}

int main() {
    std::thread t1(hello, "world");
    std::thread t2(hello, "everyone!");

    t1.join();
    t2.join();

    return 0;
}

Now if we run the program, we can see the output of the two "Hello" logs, but there is no guarantee of which one will be printed first. Don’t worry, we will see later how we can make threads work in sync or even enforce an execution order, but for this example, where we run independent threads, the execution order is not guaranteed.

The main() function runs on the main thread, which the operating system creates for the program when it starts.

Joining Threads

You must have noticed that we used t1.join();. The join() method blocks the calling thread until the thread finishes its work. We can also call joinable() to check whether a thread object still owns a running thread and can therefore be joined.
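For instance, a minimal sketch that guards the join() with a joinable() check:

#include <iostream>
#include <thread>

void work() {
    std::cout << "working..." << std::endl;
}

int main() {
    std::thread t(work);

    // join() only if the thread object still owns a running thread;
    // calling join() on a non-joinable thread throws std::system_error.
    if (t.joinable()) {
        t.join();
    }

    return 0;
}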

If a joinable thread object is destroyed without being joined (or detached), the C++ runtime calls std::terminate(). Joining the same thread twice throws std::system_error, which, if left uncaught, also terminates the program. Let’s have a look at the console output when we do not call join():

libc++abi: terminating
Hello world
[1] 9645 abort ./threads/thread

Detaching Threads

We can detach a thread from the parent thread as well. A detached thread becomes a daemon thread (or background thread): it continues to run independently in the background, without any way for the main program to wait for it using join() or manage it through the original std::thread object. If the main thread or the process exits while a detached thread is still running, the detached thread is terminated by the operating system.

Detaching is useful for fire-and-forget situations, such as performing an asynchronous, non-critical operation like sending a status report or logging a sensor reading.
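A minimal fire-and-forget sketch; log_reading here is a hypothetical non-critical background task:

#include <chrono>
#include <iostream>
#include <thread>

// Hypothetical non-critical background task.
void log_reading(int value) {
    std::cout << "sensor reading: " << value << std::endl;
}

int main() {
    std::thread logger(log_reading, 42);
    logger.detach();  // the thread now runs on its own; we can no longer join() it

    // Give the detached thread a chance to run before main() exits;
    // if main() returns first, the OS simply tears the thread down.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));

    return 0;
}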

Race Condition & Critical Section

This is one of the most important concepts in concurrency, caused by multiple threads operating on a shared piece of data. Let’s understand it with an example:

Suppose we have a function that is responsible for incrementing the value of a number. If we execute the function using two threads on the same variable, what do you think the output will be?

#include <iostream>
#include <thread>

void increment(int &num) {
    // Increment many times so the data race becomes
    // visible even on fast modern machines.
    for (int i = 0; i < 10000; ++i) {
        num++;
    }
}

int main() {
    int num = 0;

    std::thread t1(increment, std::ref(num));
    std::thread t2(increment, std::ref(num));

    t1.join();
    t2.join();

    std::cout << "Final number = " << num << std::endl;

    return 0;
}

In the above code, each of the two threads increments the number 10,000 times, and both run simultaneously, so the expected final value is 20,000.

But here’s the log:

g++ threads/02_race_threads.cpp -o threads/race
./threads/race
Final number = 15379
./threads/race
Final number = 15015
./threads/race
Final number = 14202
./threads/race
Final number = 14064
./threads/race
Final number = 16193

Mutexes

In the above example, two threads were racing to complete the increment on the shared variable num, and it led to data corruption because the critical section was not guarded. A mutex (mutual exclusion) is a synchronization primitive that locks the critical section so that only one thread can access it at a given time.

#include <iostream>
#include <thread>
#include <mutex>

std::mutex mut;

void increment(int &num) {
    // Lock the critical section so only one thread
    // can run the increment loop at a time.
    mut.lock();
    for (int i = 0; i < 10000; ++i) {
        num++;
    }
    mut.unlock();
}

int main() {
    int num = 0;

    std::thread t1(increment, std::ref(num));
    std::thread t2(increment, std::ref(num));

    t1.join();
    t2.join();

    std::cout << "Final number = " << num << std::endl;
}

We declared a global mutex mut, which is used inside the increment function to make it thread-safe by locking the critical section and thus avoiding the data race. If we execute the code with the mutex above, the result is as expected:

./threads/race
Final number = 20000
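As a side note, instead of calling lock() and unlock() by hand, the same guarantee can be expressed with std::lock_guard, which releases the mutex automatically when it goes out of scope. A minimal sketch of the same program using this RAII style:

#include <iostream>
#include <mutex>
#include <thread>

std::mutex mut;

void increment(int &num) {
    // lock_guard acquires the mutex here and releases it automatically
    // when `guard` goes out of scope, so no explicit unlock() is needed.
    std::lock_guard<std::mutex> guard(mut);
    for (int i = 0; i < 10000; ++i) {
        num++;
    }
}

int main() {
    int num = 0;
    std::thread t1(increment, std::ref(num));
    std::thread t2(increment, std::ref(num));
    t1.join();
    t2.join();
    std::cout << "Final number = " << num << std::endl;
    return 0;
}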

Semaphores

There is another mechanism that can be used to control race conditions on a critical section, called semaphores. A semaphore is a counter with rules that ensure only a certain number of threads can access a resource at the same time. There are two types of semaphores:

  • Binary Semaphore has only two values, 0 and 1. It acts like a mutex.
  • Counting Semaphore can be any non-negative integer and allows up to N threads to access a resource, as the sketch below shows.
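C++11 itself does not ship a standard semaphore type; std::counting_semaphore and std::binary_semaphore were added in C++20. A minimal sketch, assuming a C++20 compiler, where at most two threads use the resource at the same time:

#include <iostream>
#include <semaphore>
#include <thread>
#include <vector>

// At most 2 threads may be inside the guarded section at once.
std::counting_semaphore<2> slots(2);

void use_resource(int id) {
    slots.acquire();   // decrement the counter, blocking while it is 0
    std::cout << "thread " << id << " is using the resource\n";
    slots.release();   // increment the counter, letting a waiting thread in
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 5; ++i) {
        threads.emplace_back(use_resource, i);
    }
    for (auto &t : threads) {
        t.join();
    }
    return 0;
}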

Deadlock

Deadlocks are one of the scariest conditions in concurrent programming. A deadlock is a situation where two or more threads are permanently blocked, each waiting for a resource that another thread is holding. Since none of them can proceed, the program gets stuck forever.

#include <thread>
#include <mutex>

std::mutex a;
std::mutex b;

void print() {
    // This thread locks a first, then b...
    a.lock();
    b.lock();
}

void save() {
    // ...while this thread locks b first, then a.
    // If each thread grabs its first mutex before the other
    // finishes, both wait forever.
    b.lock();
    a.lock();
}

int main() {
    std::thread t1(print);
    std::thread t2(save);

    t1.join();
    t2.join();

    return 0;
}

On compiling and executing the code, the program stays stuck, demonstrating the deadlock scenario.
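One common way to break such a cycle is to acquire both mutexes in a single operation. A minimal sketch using std::lock(), which is available in C++11 and locks multiple mutexes without risking deadlock regardless of the order the threads name them in:

#include <mutex>
#include <thread>

std::mutex a;
std::mutex b;

void print() {
    // std::lock acquires both mutexes atomically with respect to deadlock.
    std::lock(a, b);
    std::lock_guard<std::mutex> lock_a(a, std::adopt_lock);
    std::lock_guard<std::mutex> lock_b(b, std::adopt_lock);
}

void save() {
    std::lock(b, a);
    std::lock_guard<std::mutex> lock_b(b, std::adopt_lock);
    std::lock_guard<std::mutex> lock_a(a, std::adopt_lock);
}

int main() {
    std::thread t1(print);
    std::thread t2(save);
    t1.join();
    t2.join();
    return 0;
}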

Thread Communication

In real-life programming, we often use multiple threads and share information between them. In C++11, there are three primary tools for thread communication:

  • Condition Variables
  • Promises and Futures
  • Atomic Variables

Condition Variable

A condition variable is a synchronization primitive that allows threads to block efficiently until a shared condition becomes true. It works together with a mutex and a predicate (flag), enabling threads to wait without consuming CPU cycles.

#include <mutex>
#include <condition_variable>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool ready = false;

void producer() {
    {
        // Update the shared flag under the lock.
        std::lock_guard<std::mutex> lock(m);
        ready = true;
    }
    // Wake up one waiting thread.
    cv.notify_one();
}

void consumer() {
    std::unique_lock<std::mutex> lock(m);
    // Block (without burning CPU) until the predicate becomes true.
    cv.wait(lock, [] { return ready; });
}

int main() {
    std::thread t1(producer), t2(consumer);

    t1.join();
    t2.join();

    return 0;
}

Promises & Futures

A promise/future pair provides one-time, thread-safe value transfer. A promise allows a thread to set a result or an exception, while the corresponding future allows another thread to retrieve that result using get(), which blocks until the value becomes available.

#include <future>
#include <thread>

void worker(std::promise<int> p) {
    // Fulfil the promise; this unblocks the future's get().
    p.set_value(10);
}

int main() {
    std::promise<int> p;
    std::future<int> f = p.get_future();

    // The promise is moved into the worker thread.
    std::thread t(worker, std::move(p));

    int result = f.get(); // blocks until the worker sets the value; result = 10
    t.join();

    return 0;
}

Atomic Variable

An atomic variable provides operations that are guaranteed to occur indivisibly across threads. This means that loads, stores, and increments happen without race conditions and without the need for an external mutex. Atomics are the foundation of lock-free programming and enable lightweight synchronization for simple shared state such as counters and flags.

#include <atomic>
#include <thread>

std::atomic<int> counter(0);

void add() {
    // Atomic read-modify-write: no mutex needed.
    counter.fetch_add(1);
}

int main() {
    std::thread t1(add), t2(add);
    t1.join();
    t2.join();

    return 0;
}

Closing Notes

That was a lot of concepts, code, and some mind-boggling cases, and this is only the start with threads; imagine the work and patience required to master such a fascinating domain of programming. I have been reading about threads and multi-threaded programming for the last two months, and I was holding off on writing about it because I was not sure how to start and how to end. Finally, I did it.

Let me know your feedback and suggestions. That will motivate me to write more such articles.

Happy Coding.
