POSIX Threads Programming in C++

Written by John Humphreys

C++ Threading and the POSIX Library

This article will cover the basic information that you need to know in order to make use of the POSIX threading library available for C and C++ on UNIX and most Linux systems.  
Before we start, I want to say that every time I show source code, it has been tested.  And for every modification, I will provide the complete code so you can just copy, paste, and try it out yourself.  This means some parts of the code examples are repeated... but it's that way just for your convenience!  I didn't put my output in here, but you shouldn't need it -- you have the compiler command, so take the twenty seconds to paste the code in and run it yourself.  Doing is learning, after all :)
Just to get it out of the way, POSIX stands for Portable Operating System Interface for UNIX.  POSIX is a set of standards (basically common APIs) that operating systems must support in order to be assigned the UNIX branding.   Most versions of Linux support the POSIX libraries as well.

The POSIX threading library is very commonly used for threading on supporting operating systems.  This article will tell you what a thread is, how it exists in the operating system, how to create threads and manage threads, how to synchronize threads, and how to enforce mutual exclusion with the POSIX libraries (if you don’t know what mutual exclusion and critical sections are, you will learn that too!).

So, what is a thread?

In case you're new to the concept, a thread is, in most cases, a very beneficial alternative to spawning a new process.  Programs you use every day make great use of threads whether you know it or not.  The most basic example: whenever a program has a lengthy task to perform (say, downloading a file or backing up a database), it will most likely spawn a thread to do that job.  If it didn't, the main thread (where your main function is running) would be stuck waiting for the task to complete, and the user interface would freeze.  If you've ever clicked something, had the program hang, and watched its window turn white or glitch, then you've seen an example of this!

General thread information

For the general bulleted list of thread information, I’m going to reference: YoLinux Posix Tutorial which provides this concise and very well thought-out list:

Thread operations include thread creation, termination, synchronization (joins, blocking), scheduling, data management and process interaction.
A thread does not maintain a list of created threads, nor does it know the thread that created it.
All threads within a process share the same address space.
Threads in the same process share:
    o  process instructions
    o  most data
    o  open files (descriptors)
    o  signals and signal handlers
    o  current working directory
    o  user and group IDs
Each thread has a unique:
    o  thread ID
    o  set of registers and stack pointer
    o  stack for local variables and return addresses
    o  signal mask
    o  priority
    o  return value: errno
pthread functions return "0" if OK.

Let’s get to the code!

Okay, so now you should know what a thread is and why you would use one… let's see how it's actually done!

The most important functions with regard to threads are:
1.  pthread_create(): use this to create a thread from a target function.
2.  pthread_join(): use this to make the parent thread (the one you're creating new threads in) wait for
     the completion of one of the child threads (the threads created in it).
3.  pthread_exit(): use this to terminate the execution of a thread (and optionally pass back a return value).

So, here's a quick example of thread use!  Please note that to compile the code on UNIX/Linux you need to add the
    -lpthread
library flag to your command, after the source file (the order matters with some linkers).  So, on my system I compile with: { g++ filename.cpp -lpthread }.

#include <iostream>
using namespace std;
#include <pthread.h>

//GLOBAL VARIABLES
int global_counter = 0;

//Prototype for the function we will spawn threads for.
void* IncreaseCounter(void*);

int main() {
	pthread_t t1;	//Line 1
	int t1return = pthread_create(&t1, NULL, IncreaseCounter, NULL); //Line 2
	pthread_join(t1, NULL);	//Line 3
		
	cout<< "The thread returned: " << t1return << endl;
	return 0;
}

//Definition for function we will spawn threads for.
void* IncreaseCounter(void* pointer) {
	for (int i = 0; i < 100; i++)
		global_counter++;
	cout<< "[In thread] The counter is at: " << global_counter << endl;
	return NULL;	//Thread functions must return a void*.
}



Now to describe the example…

The first thing to note is that we need the #include <pthread.h> header at the top of the program.  The second thing to note is that this will not work in Visual Studio if you're trying to build it on Windows; pthreads is a UNIX library (I'm sure there's a workaround, but I'm not looking into it since I run Linux VMs on my Windows computer).

The code has a global integer variable (it's accessible to the entire program, not just the main function) called global_counter.  As you can see from the top, the program has a prototype for just one function, and its style is void* function_name(void*).  This function is defined after main, and all it does is raise the value of the counter variable 100 times using a for loop.
Now for the threading…  In main we have four things to notice:

1.  We declare a thread variable using pthread_t variable_name.  The system will fill it with a
      unique ID which we can use to manage and keep track of the thread in the current scope.
2.  A thread is created (launched) from that variable using the pthread_create function, which
      takes 4 parameters.
       •  The three vital parameters are the first (the address of the thread variable from step 1),
           the third (the name of the function we want to be the starting point for the thread), and
           the fourth (the parameter passed to the thread's function).
       •  Note that the 2nd parameter (thread attributes) can be NULL for our purposes, and the 4th
           parameter can also be NULL if we are not passing any variables/parameters into our
           function (which in this example we didn't).
3.  We used the pthread_join function to make the main thread wait for the spawned thread.  If we
     didn't do this then the main thread could run to completion and the spawned thread would be
     terminated before finishing its task.  In this case that seems irrelevant, but try increasing the
     for loop count to 1000000 or so and you'll see that the thread never completes without the join line in main.
4.  As noted earlier, pthread_create returns 0 on success.  We checked our return variable at the
     end and it was in fact 0.

Passing parameters to a thread

Okay, now I’m going to show you how to pass parameters to a thread.  If you’re good with pointers you will have no issue with this, but it’s worth mentioning anyway.

Functions which can be launched from threads have a void pointer (void*) as their only parameter (check the code above).  The reason for this is that we can cast any parameter we want into a void pointer, and simply cast it back into the right type inside the thread function.  A second benefit is that if we need to pass multiple values to the thread, we can do it by passing a structure or class object of values (possibly a vector or other STL container, or something as simple as a basic struct).

So, by making a couple changes to our program we can pass a list of parameters to the thread function using this void pointer.  

The updated code is shown below.  Keep in mind that all we did was what was described in the previous paragraph, so take a look at the code, notice the differences (mainly the casting from and to void*), and try it out!  No changes required.

#include <iostream>
using namespace std;
#include <pthread.h>

//sCounterData structure.
struct sCounterData {
	public:
		const char* name;
		const char* purpose;
		int defnum;
};

//GLOBAL VARIABLES
int global_counter = 0;

//Prototype for the function we will spawn threads for.
void* IncreaseCounter(void*);

int main() {
	
	//Create the structure instance.
	sCounterData* c = new sCounterData();
	c->name = "some name";
	c->purpose = "some purpose";
	c->defnum = 99;
	
	pthread_t t1;
		
	//Create the thread with the 4th parameter as our structure cast to a void*
	int t1return = pthread_create(&t1, NULL, IncreaseCounter, (void*)c);
	pthread_join(t1, NULL);
		
	cout<< "The thread returned: " << t1return << endl;
	delete c;	//Free the structure now that the thread is done with it.
	return 0;
}

//Modified function to show struct parameters.
void* IncreaseCounter(void* pointer) {
	for (int i = 0; i < 100; i++)
		global_counter++;
	
	//Cast the void pointer back into the right object.
	sCounterData* cd = (sCounterData*)pointer; 
	
	//Print the data.
	cout<< endl << "[In thread] ";
	cout<< cd->name << endl << cd->purpose << endl << cd->defnum << endl;
	
	cout<< "[In thread] The counter is at: " << global_counter << endl;
	return NULL;
}



Shared access, race conditions, and critical sections

Okay, so now you know the basics about how to thread.  Next, you need to learn about problems with threading which you may not see coming (this is the important stuff!).

Whenever you have multiple asynchronous threads (asynchronous means they’re doing their own thing without any care about what the other threads are doing), you start to have to worry about accessing shared resources.  A shared resource may be something like a variable (like counter in our example), a file, or whatever else.  Now, the problem with allowing multiple things to access shared resources asynchronously is that they can mess each other up.

Here's an example… in our program we have our global_counter variable.  If we make 10 threads using the same function that we used already, they will all have that for loop incrementing the counter 100 times.  There's no guarantee that one thread won't start updating the counter at the same time as another, so two threads could grab the variable when its value is, say, 55, and both update it to 56 (when realistically they should have updated it to 56 and then 57).  Even worse, we might have one thread grab the counter at 2, all the other threads update it to, say, 5000, and then have the first thread finish by updating it to 3 (thus losing 4997 increments!).

This might sound funny, but it comes down to how your computer works at the machine level – an increment is not a single atomic operation (it's a load, an add, and a store), so threads can interleave in the middle.  This is really an operating-systems topic and it's easy to look up (search for race conditions online), but take my word for it for now!

So, if we had a few threads this could happen:
1.  global_counter = 0
2.  Thread1 increments the counter to 1
3.  Thread2 and Thread3 grab counter while it is at 1
4.  Thread2 and Thread3 increment counter to 2 (both move it up from 1).
5.  Now we’ve had 3 threads try to increment the counter but it’s only gone up by two values.

This scenario, where multiple threads use the same data and have conflicts, is called a race condition.

Race conditions arise in critical sections.  A critical section is any area of code that accesses a variable that is shared between 2 or more threads.  The hard part about using threads is learning to protect your critical sections so that race conditions or worse things (crashes due to inability to access a file, etc.) do not occur.

Mutual Exclusion

The best way to solve the critical section problem in normal programs is to ensure mutual exclusion.  Mutual exclusion essentially means that a shared resource can only be accessed by one thread at any one time.  So if we had 100 requests for a variable (say our global_counter), then they would be processed one after another, each waiting for the previous thread to finish with the counter.  This is called serial processing.

Before I show you how to use the POSIX library to protect your critical regions and ensure mutual exclusion in your code, I think it would be good to prove to you that there's a problem in the first place!  The following code is basically the same as earlier, but this time ten threads running our function are created, and the loop count is much, much higher.  If you run the program a few times you'll see that the value of the last counter to finish changes every time – and it should always be exactly 10 times the loop count, since each of the ten threads increments it that many times.  These discrepancies are due to the race conditions mentioned earlier.

Here’s the code to show the error:

#include <iostream>
using namespace std;
#include <pthread.h>
#include <vector>

//GLOBAL VARIABLES
int global_counter = 0;

//Prototype for the function we will spawn threads for.
void* IncreaseCounter(void*);

int main() {
	
	vector<pthread_t*> threadIDs;
	for (int i = 0; i < 10; i++) {
		pthread_t* t = new pthread_t();
		//Record the thread ID so we can wait on it later.
		threadIDs.push_back(t);
		pthread_create(t, NULL, IncreaseCounter, NULL);
	}		
		
	//Set main to wait for all threads to complete.
	for (int i = 0; i < 10; i++) {
		pthread_join(*threadIDs[i], NULL);
		delete threadIDs[i];	//Free the pthread_t allocated above.
	}
		
	return 0;
}

void* IncreaseCounter(void* pointer) {
	for (int i = 0; i < 100000000; i++)
		global_counter++;
	
	cout<< "[In thread] The counter is at: " << global_counter << endl;
	return NULL;
}



If you're confused by the vector<> syntax, it's a generic container from the Standard Template Library (STL) capable of holding any type you put in the <>.  I've basically just used it as a dynamic (size-changeable) array to hold the ten thread IDs, so that main can wait for all ten threads to finish using the pthread_join command as you see in the second for loop.

Mutexes

Okay, so to fix this problem and ensure mutual exclusion in the easiest way, you should add a mutex.  A mutex is defined with pthread_mutex_t [mutex_name] = PTHREAD_MUTEX_INITIALIZER.  It should be defined at the same scope as the shared resource (in our case, the counter).  After defining the mutex, you find each critical region where the shared resource is accessed and wrap it in a
     pthread_mutex_lock(&[mutex_name])
...and...
     pthread_mutex_unlock(&[mutex_name]).  
This ensures that each thread entering that portion of code (or any other portion of code using the counter which also has the locks around it) must first check whether the mutex is locked.  If it is, the thread waits for it to be unlocked before proceeding, at which point it acquires the lock itself and enters the critical region.

Here’s the updated code with the mutex handling added:

#include <iostream>
using namespace std;
#include <pthread.h>
#include <vector>

//GLOBAL VARIABLES AND LOCKING MECHANISMS.
pthread_mutex_t counterMutex = PTHREAD_MUTEX_INITIALIZER;
int global_counter = 0;

//Prototype for the function we will spawn threads for.
void* IncreaseCounter(void*);

int main() {	
	vector<pthread_t*> threadIDs;
	for (int i = 0; i < 10; i++) {
		pthread_t* t = new pthread_t();
		//Record the thread ID so we can wait on it later.
		threadIDs.push_back(t);
		pthread_create(t, NULL, IncreaseCounter, NULL);
	}		
		
	//Set main to wait for all threads to complete.
	for (int i = 0; i < 10; i++) {
		pthread_join(*threadIDs[i], NULL);
		delete threadIDs[i];	//Free the pthread_t allocated above.
	}
		
	return 0;
}

void* IncreaseCounter(void* pointer) {
	for (int i = 0; i < 100000000; i++) {
		pthread_mutex_lock(&counterMutex);
		global_counter++;
		pthread_mutex_unlock(&counterMutex);
	}
	
	pthread_mutex_lock(&counterMutex);
	cout<< "In thread #" << pthread_self() << 
		" --> The counter is at: " << global_counter << endl;
	pthread_mutex_unlock(&counterMutex);	
	return NULL;
}



As you can see, both regions where global_counter is accessed are surrounded by mutex lock and unlock lines.  This ensures that no race conditions occur, and the last output will in fact be the huge multiple of 10 that we were looking for.

One problem is that this code now runs very slowly… This is because of the locks – we are locking and unlocking the mutex 100000000 times in each thread – that is bound to slow the code down by an order of magnitude.

Just check out how much faster the code is if you delete all the mutex lock/unlock lines inside the loop and instead wrap the entire for loop in a single lock/unlock pair – each thread then acquires the lock only once – and the execution of all ten will be much, much faster.  This probably wouldn't serve our purposes though, as the point here is that the threads can execute asynchronously while using the same resources.  Realistically, if you wrap the whole for loop in a lock, it's pretty much the same as calling the function ten times in the same thread (the whole function is executed serially rather than just the one or two critical lines).

Please note one other change to the code.  We used pthread_self() to print the ID of the thread in question.  This was just to show that the system does differentiate between threads; it wasn't really required for this particular example.

Mutex issues

I just want to very quickly highlight an issue which occurs with mutexes if you are not very careful.  You may come across a time when you need more than one mutex around the same set of code (it does happen sometimes).

If you reach this point, be aware that if you use two lock statements in a row, one thread may pass through the first lock statement (thus holding that lock) and get stuck waiting at the second, while another thread holds the second lock and is stuck waiting for the first – so neither thread can ever reach the code that would release the lock the other one needs.

So each thread would be waiting for the other to release a lock.  This creates a conflict that is impossible to resolve in the code which is commonly referred to as a deadlock.

Thread condition variables

This will be the last topic.  There are other things you can do with the POSIX threading library, but you can accomplish almost anything you need with what you've been introduced to thus far.

Thread condition variables basically allow you to put a thread to sleep and wake it again based on a set condition.  Essentially, you can wait, do a timed wait, signal (which wakes one thread waiting on the condition variable), and broadcast (which wakes all threads waiting on it).

Things can get a lot more complicated when you decide to include condition variables.  For simpler purposes you probably won't need them, but if you decide that you do, you should do a fair amount of research into them.  You could write a whole article describing where and when to use them, so rather than try to cram it all into one section I'm going to suggest you take a look at an IBM article.  It helped me, it has a very well coded and lengthy example at the end showing timed-wait usage, and it's reasonably reader friendly.  Please see: pthread_cond_timedwait()--Timed Wait for Condition for the example code.

Conclusion

I hope this article helped you get a grasp on POSIX threads, why they're needed, and how to use them correctly.  If you have any follow-up questions, feel free to comment and I'll try to check up now and then.  I also highly recommend viewing the links provided here.  The first lists all kinds of tutorials recommended by IBM, and it can teach you pretty much everything if you dig around.  The second source inspired the layout of this article in terms of topics; though the examples and writing are different, if you found anything ambiguous here I'd recommend checking there!

Sources:
This talks about the POSIX thread library and its creation/improvements.  It’s just interesting background to know:
http://people.redhat.com/drepper/nptl-design.pdf 
This is a good textbook I used back in undergrad, if you're actually willing to purchase one.  I found it through the IBM site and very much approve!
    Programming with POSIX(R) Threads (Paperback)
    by David R. Butenhof
