Solved

pthread_join on Linux crashes with a segmentation fault

Posted on 2009-04-07
Last Modified: 2012-05-06
Hi,

I'm getting a segmentation fault when calling pthread_join on Linux.
In my program, the main function starts several threads that do all the work.
To prevent the main function from exiting prematurely, it calls pthread_join.
The segmentation fault occurs while waiting for the first thread in the list.
thread creation
 
//
// Call Unix / Linux native API to create thread
//
OT_THREAD_HANDLE OT_CreateThread(	
	void *(*start_routine)(void *), // thread function  
	void *arg,						// argument to thread function						 	
	OT_THREAD_HANDLE *CreatedThreadHandle		// not in use in windows code
)
{
	pthread_create(
			CreatedThreadHandle,
			NULL,
			start_routine,
			arg);
 
	//TODO: print thread details
 
	return CreatedThreadHandle;
}
 
pthread_join
//
// Waits until one or all of the specified objects are in the signaled state or the time-out interval elapses
//
OT_DWORD OT_WaitForThreads(
  OT_DWORD nCount,				
  OT_THREAD_HANDLE *pHandles,	
  OT_BOOL bWaitAll,			
  OT_DWORD dwMilliseconds		
)
{
	void *threadResult;
	int res,i;
 
	res = 0 ;
	for(i = 0 ; i < nCount ; i++)
	{
		res = pthread_join( pHandles[i] , &threadResult );
		if(res != 0) //error
		{
			break;
		}
	}
 
	return (OT_DWORD)res;
}

Question by:optimaltest
 
LVL 53

Expert Comment

by:Infinity08
ID: 24084867
(a) How is OT_THREAD_HANDLE defined ? It should be pthread_t.

(b) You don't check the return value of pthread_create ... How will you know if it failed ?

(c) Did you verify that pHandles actually contains nCount valid pthread_t's referring to existing threads ? Can you show the code that actually calls these functions.
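Point (b) could look roughly like the sketch below. It assumes OT_THREAD_HANDLE is an alias for pthread_t; OT_CreateThreadChecked and echo_thread are hypothetical names used only for illustration, not part of the poster's code.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the handle type is an alias for pthread_t. */
typedef pthread_t OT_THREAD_HANDLE;

/* Hypothetical variant of OT_CreateThread that reports failure.
   Returns 0 on success, otherwise the error number from pthread_create. */
int OT_CreateThreadChecked(void *(*start_routine)(void *),
                           void *arg,
                           OT_THREAD_HANDLE *createdThreadHandle)
{
    int res = pthread_create(createdThreadHandle, NULL, start_routine, arg);
    if (res != 0) {
        /* pthread_create returns the error number directly
           instead of setting errno. */
        fprintf(stderr, "pthread_create failed: %s\n", strerror(res));
    }
    return res;
}

/* Placeholder thread function for the sketch: hands its argument back. */
void *echo_thread(void *arg)
{
    return arg;
}
```

A caller would stop at the first nonzero result and later join only the threads that were actually created, since joining a handle that was never filled in is undefined behaviour.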
 
LVL 9

Expert Comment

by:Murugesan Nagarajan
ID: 24086064
use the following command:
               ulimit -c unlimited
The core file will be created.
let us know the output of the following command:
               gdb binaryName coreFileName
 
LVL 9

Expert Comment

by:Murugesan Nagarajan
ID: 24086098
gdb binaryFileName coreFileName
(gdb) where

 

Author Comment

by:optimaltest
ID: 24086158
Answer to Genius:
typedef pthread_t OT_THREAD_HANDLE;
 
LVL 53

Expert Comment

by:Infinity08
ID: 24086194
How about (b) and (c) ?

Generating the core file like murugesandins suggested will allow you to see the exact location of the segmentation fault, so that's a good idea too.
 

Author Comment

by:optimaltest
ID: 24086745
The core dump file is not created. The output of the where command is as follows:

Program received signal SIGSEGV, Segmentation fault.
0x007c777b in __deallocate_stack () from /lib/tls/libpthread.so.0
(gdb) where
#0  0x007c777b in __deallocate_stack () from /lib/tls/libpthread.so.0
#1  0x007c7d2b in __free_tcb () from /lib/tls/libpthread.so.0
#2  0x007c8d9f in pthread_join () from /lib/tls/libpthread.so.0
#3  0x0804b73a in OT_WaitForThreads (nCount=4, pHandles=0x81efef8, bWaitAll=1, dwMilliseconds=0) at OTThread.c:80
#4  0x0804a89e in EngineCleanUp () at Engine.c:106
#5  0x0804a962 in main () at Engine.c:138
(gdb)
 
LVL 53

Expert Comment

by:Infinity08
ID: 24086775
That seems to indicate that your heap got corrupted. That could have happened anywhere and anytime. Check whether you can spot any obvious buffer overflows, or other write operations to unallocated (or simply wrong) memory. If you can't spot it immediately, I suggest using a memory debugger to find the problem for you.
 
LVL 53

Expert Comment

by:Infinity08
ID: 24086793
That is, assuming you've already eliminated (b) and (c) I've mentioned earlier ... (you still haven't replied to that)
 

Author Comment

by:optimaltest
ID: 24087119
Answer for b:

There is no return value. The application crashes within the function.

Answer for c:

I verified that ThreadArray holds the expected number of threads. The code you asked for is attached.

BTW - Since the general code in the application is cross-platform, and since I see no problem with this application on Windows, I'm pretty sure this is a pure Linux / Unix pthread_join related issue.
 
LVL 53

Expert Comment

by:Infinity08
ID: 24087187
>> There is no return value. The application crashes within the function.

Of course there is a return value ... It returns int.
The crash you mentioned was not in pthread_create, but in pthread_join. pthread_create happens before pthread_join (or at least it should), and my remark was about pthread_create's return value not being checked. As long as you don't do so, you can't know whether the creation of the thread failed or not (which could cause problems later, including the crash you experience).


>> I verified that ThreadArray holds the expected number of threads.

And are all entries in the array valid ?


>> the code you asked for is attached.

You must have forgotten it, as I can't see it ;)


>> i'm pretty sure this is pure Linux / Unix pthread_join related issue.

I'm not ... There's a LOT of differences between a Windows and a Linux platform, so just because the problem occurs on one platform and not on the other, does not mean that you can immediately point to one function as the problem.
 

Author Comment

by:optimaltest
ID: 24087425
Here is the code:
for(i = 0 ; i < MAX_THREAD ; i++)
      {
            if(ThreadArray[i] == NULL)
            {
                  waitForAllThreads = 0;
                  break;
            }
      }

      if(      0 != waitForAllThreads)
      {
            // DON'T wait for the listen thread; it is blocked by the read command
            // Make sure the listen thread is always the last one in the handle array
            OT_WaitForThreads(MAX_THREAD-1, ThreadArray, 1 /*TRUE*/, INFINITE);
      }      
 
LVL 53

Expert Comment

by:Infinity08
ID: 24087506
You're comparing a pthread_t with NULL here ... :

>>             if(ThreadArray[i] == NULL)

pthread_t is not a pointer.

The code you posted also doesn't show how ThreadArray is constructed and filled.
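Since pthread_t is an opaque type with no portable "empty" value, one way to track which slots hold a live handle is a separate flag next to each handle. The following is only a sketch: ThreadSlot, start_slot, all_slots_started and noop_thread are hypothetical names, not taken from the posted code.

```c
#include <pthread.h>

/* A handle plus an explicit flag saying whether it is valid. */
typedef struct {
    pthread_t handle;   /* meaningful only when started != 0 */
    int started;        /* set to 1 after a successful pthread_create */
} ThreadSlot;

/* Creates the thread for one slot and records whether that worked.
   Returns 0 on success, otherwise the pthread_create error number. */
int start_slot(ThreadSlot *slot, void *(*fn)(void *), void *arg)
{
    int res = pthread_create(&slot->handle, NULL, fn, arg);
    slot->started = (res == 0);
    return res;
}

/* Returns 1 if every slot holds a started thread, 0 otherwise.
   This replaces comparing a pthread_t against NULL. */
int all_slots_started(const ThreadSlot *slots, int count)
{
    for (int i = 0; i < count; i++) {
        if (!slots[i].started)
            return 0;
    }
    return 1;
}

/* Placeholder thread function for the sketch. */
void *noop_thread(void *arg)
{
    return arg;
}
```

The array of slots should be zero-initialised before use so that every started flag begins at 0.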
 

Author Comment

by:optimaltest
ID: 24087593
LoggerAddMessage( PROXY_LOG_LEVEL_ANALYSIS , FLUSH_TO_DISK,"EngineInit - Start ...\n" ) ;
      QueueInit(&messageQ);
      QueueInit(&writeQ);
      QueueInit(&transmitQ);
      
      strcpy(messageQ.qName , "MessageQ");
      messageQ.hMutex = OT_CreateMutex(messageQ.qName);

      strcpy(writeQ.qName , "WriteQ");
      writeQ.hMutex = OT_CreateMutex(writeQ.qName);

      strcpy(transmitQ.qName , "TransmitQ");
      transmitQ.hMutex = OT_CreateMutex(transmitQ.qName);

      ThreadArray[WORKER_THREAD] = OT_CreateThread( MessageQWorkerThread,            // thread function
                                                        &messageQ,                                          // argument to thread function                                     
                                                        &ThreadArray[WORKER_THREAD]);                  // The pointer that will store the thread handle - in use only in non windows systems

      ThreadArray[WRITER_THREAD] = OT_CreateThread(      MessageQWriterThread,      // thread function
                                                            &writeQ,                                          // argument to thread function
                                                            &ThreadArray[WRITER_THREAD]);            // The pointer that will store the thread handle - in use only in non windows systems

      ThreadArray[TRANSMIT_THREAD] = OT_CreateThread(      MessageQTransmitThread,      // thread function
                                                            &transmitQ,                                          // argument to thread function
                                                            &ThreadArray[TRANSMIT_THREAD]);            // The pointer that will store the thread handle - in use only in non windows systems

      ThreadArray[CMT_THREAD] = OT_CreateThread(      CMTThread,                              // thread function
                                                                        &messageQ,                                          // argument to thread function
                                                                        &ThreadArray[TRANSMIT_THREAD]);            // The pointer that will store the thread handle - in use only in non windows systems

      
      // Make sure Listen thread is always the last one on the handle array - refer to the comment in the EngineCleanup
      ThreadArray[LISTEN_THREAD] = OT_CreateThread(      ListeningThread,            // thread function
                                                                              &messageQ,                                          // argument to thread function
                                                                              &ThreadArray[LISTEN_THREAD]);            // The pointer that will store the thread handle - in use only in non windows systems
 
LVL 53

Accepted Solution

by:
Infinity08 earned 500 total points
ID: 24087765
I assume that MAX_THREAD is equal to 5, and that WORKER_THREAD, WRITER_THREAD, TRANSMIT_THREAD, CMT_THREAD and LISTEN_THREAD are 0, 1, 2, 3 and 4 resp. ?

Ok.

I can see a typo here though :

>>       ThreadArray[CMT_THREAD] = OT_CreateThread(      CMTThread,                              // thread function
>>                                                                         &messageQ,                                          // argument to thread function
>>                                                                         &ThreadArray[TRANSMIT_THREAD]);            // The pointer that will store the thread handle - in use only in non windows systems

which should be :

      ThreadArray[CMT_THREAD] = OT_CreateThread(      CMTThread,                              // thread function
                                                                        &messageQ,                                          // argument to thread function
                                                                        &ThreadArray[CMT_THREAD]);            // The pointer that will store the thread handle - in use only in non windows systems


If the problem still persists after that, then do not forget to check the return value of each pthread_create, and to fix this :

>>             if(ThreadArray[i] == NULL)

as I remarked earlier.


If all of that doesn't get you closer to the solution, then it's time to bring out the memory debugger :)
 
LVL 39

Expert Comment

by:itsmeandnobodyelse
ID: 24087861
Just in case someone is interested in a formatted code snippet ...
   LoggerAddMessage( PROXY_LOG_LEVEL_ANALYSIS , FLUSH_TO_DISK,"EngineInit - Start ...\n" ) ;
   QueueInit(&messageQ);
 
   QueueInit(&writeQ);
   QueueInit(&transmitQ);
 
   strcpy(messageQ.qName , "MessageQ");
   messageQ.hMutex = OT_CreateMutex(messageQ.qName);
 
   strcpy(writeQ.qName , "WriteQ");
   writeQ.hMutex = OT_CreateMutex(writeQ.qName);
 
   strcpy(transmitQ.qName , "TransmitQ");
   transmitQ.hMutex = OT_CreateMutex(transmitQ.qName);
 
   ThreadArray[WORKER_THREAD] = 
      OT_CreateThread( MessageQWorkerThread,          
                      &messageQ,                      
                      &ThreadArray[WORKER_THREAD]);    
 
   ThreadArray[WRITER_THREAD] = 
      OT_CreateThread(MessageQWriterThread,           
                      &writeQ,                        
                      &ThreadArray[WRITER_THREAD]);    
 
   ThreadArray[TRANSMIT_THREAD] = 
      OT_CreateThread(MessageQTransmitThread,         
                      &transmitQ,                     
                      &ThreadArray[TRANSMIT_THREAD]);  
 
   ThreadArray[CMT_THREAD] = 
      OT_CreateThread(CMTThread,                      
                      &messageQ,                      
                      &ThreadArray[TRANSMIT_THREAD]);  
 
 
   ThreadArray[LISTEN_THREAD] = 
      OT_CreateThread(ListeningThread,                
                      &messageQ,                      
                      &ThreadArray[LISTEN_THREAD]);    

 
LVL 39

Expert Comment

by:itsmeandnobodyelse
ID: 24087908
>>>> I can see a typo here though :

Good catch, Infinity  ;-)

It is quite possible that

         ThreadArray[TRANSMIT_THREAD]

holds the wrong handle and

         ThreadArray[CMT_THREAD]

is left uninitialized.

That could be enough to crash the program.
 

Author Comment

by:optimaltest
ID: 24088756
Good catch with the typo. I'm sure it prevented some future problem, but it didn't fix this issue.

I've added a line "res = pthread_create(...)" to see the result. All of the calls return 0.
I can also see that ThreadArray has 5 consecutive values.

I'll start working with a memory debugger then.

// Number of threads in the application 
#define MAX_THREAD			5
// Assign IDs to the application threads
#define WORKER_THREAD		0
#define WRITER_THREAD		1
#define TRANSMIT_THREAD		2
#define CMT_THREAD			3
#define	LISTEN_THREAD		4

 
LVL 53

Expert Comment

by:Infinity08
ID: 24089362
>> I'll start working with a memory debugger then

Likely candidates to look for are buffer overflows.
 

Author Comment

by:optimaltest
ID: 24089616
I ran a valgrind memory check, and it helped me see that the thread handle stored through CreatedThreadHandle was actually being lost (and used redundantly).
I fixed that, and the program no longer crashes with a segmentation fault.
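
The final code is not shown in the thread, so the following is only one plausible reading of the fix. In the original OT_CreateThread, pthread_create stores the handle through CreatedThreadHandle, but the function then returns that pointer and the caller assigns it to the same ThreadArray slot, which would overwrite the real handle with an address. A minimal sketch of the alternative, where the array is filled only through the out-parameter and only created threads are joined, might look like this (DEMO_THREADS, demo_thread and run_and_join_all are hypothetical names):

```c
#include <pthread.h>

#define DEMO_THREADS 4   /* hypothetical thread count for this sketch */

/* Each thread increments only its own counter, so no locking is needed. */
static void *demo_thread(void *arg)
{
    int *counter = arg;
    *counter += 1;
    return NULL;
}

/* Starts DEMO_THREADS threads and joins every one that was created.
   Returns 0 on success, otherwise the first pthread error number. */
int run_and_join_all(int counters[DEMO_THREADS])
{
    pthread_t handles[DEMO_THREADS];
    int started = 0;
    int res = 0;

    for (int i = 0; i < DEMO_THREADS; i++) {
        /* pthread_create writes the handle through its first argument.
           Nothing else is assigned to handles[i] afterwards. */
        res = pthread_create(&handles[i], NULL, demo_thread, &counters[i]);
        if (res != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++) {
        int jres = pthread_join(handles[i], NULL);
        if (res == 0)
            res = jres;
    }
    return res;
}
```

With this shape there is no return value that could be mistaken for a handle, which is the kind of mix-up a memory checker tends to surface as an invalid read inside pthread_join.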
 
LVL 53

Expert Comment

by:Infinity08
ID: 24089742
Great :) Glad you found it !