Solved

Problem with multi-threads

Posted on 2006-06-08
21
943 Views
Last Modified: 2008-01-09
Hello,

This is VC++ 6.

I'm coding a multi-thread application, and have a problem. At random times, not all the threads ends. Sometimes a few of them stay running forever.

Before to go into the threads internal coding, I need to be sure the "framework" I'm using is ok.

So this is the basic source code, with all the unnecessary stuff removed.

Take into account this is the code for a DLL. Not sure if this is important. As you can see, there are two exported functions. The first one, StartSession(), creates and starts the threads. They run forever until the 2nd function, StopSession() is called.

Inside the While block in the ThreadProc() function, there's a bunch of If's with some function calls, but nothing that could make the thread to get hung.

I'm waiting for your comments.


Thanks in advance.


------->8-------

BOOL bRunning=FALSE;

ULONG ulTotalSlots=128; // Let's say it's a constant value, for simplicity

HANDLE *ahThread;

....

unsigned int __stdcall ThreadProc(void *PIndex) {
  while(bRunning) {
    ....
    }
  _endthreadex(0);
  return(0);
}

VOID WINAPI StartSession(VOID) {
ULONG ulK;
  if(!bRunning) {
      bRunning=TRUE;
      ahThread=(HANDLE *)calloc(ulTotalSlots,sizeof(HANDLE));
      for(ulK=0;ulK<ulTotalSlots;ulK++)
        ahThread[ulK]=(HANDLE)_beginthreadex(NULL,0,ThreadProc,(void *)ulK,0,NULL);
      }
}

VOID WINAPI StopSession(VOID) {
ULONG ulK;
  if(bRunning) {
    bRunning=FALSE;
    (void)WaitForMultipleObjects(ulTotalSlots,ahThread,TRUE,INFINITE);
    for(ulK=0;ulK<ulTotalSlots;ulK++)
      CloseHandle(ahThread[ulK]);
    free(ahThread);
    }
}
0
Comment
Question by:Alvein
  • 9
  • 5
  • 2
  • +1
21 Comments
 
LVL 16

Expert Comment

by:PaulCaswell
Comment Utility
Hi Alvein,

This code:

  if(!bRunning) {
      bRunning=TRUE;

Is not thread safe. If one thread sets 'bRunning' while the other is between these two lines of code your locking process will fail.

Either use semaphores or increment/decrement the lock value. Semaphores are safer, look up 'mutex' in the help system.

Paul
0
 

Author Comment

by:Alvein
Comment Utility
Hello,

Thanks for your reply.

I understand your reasoning, but, notice how the two if(!bRunning) statemets used in the code are inside two top level functions never called from a thread (at least not from the threads this DLL creates). In a few words, no thread sets the bRunning value. Just checks its value, and nothing more.
0
 
LVL 86

Expert Comment

by:jkr
Comment Utility
It *might* work here, but I agree with Paul. I simply wouldn't do it like that.You're talking about a framework, and there will be a moment where you might start to extend that framework still carring potentially dangerous legacy code, so do it right from the beginning, e.g. using a proper Win32 primitive like a sync object, e.g.

HANDLE g_hEvent;

BOOL IsRunning () {

    return WAIT_TIMEOUT == WaitForSingleObject(g_hEvent, 1);
}

void StopRunning () {

    SetEvent(g_hEvent);
}

void StartRunning () {

    ResetEvent(g_hEvent);
}

// ... setup the event properly by calling 'CreateEvent()'

unsigned int __stdcall ThreadProc(void *PIndex) {
  while(IsRunning()) {
    ....
    }
  _endthreadex(0);
  return(0);
}

VOID WINAPI StartSession(VOID) {
ULONG ulK;
  if(!bRunning) {
      StartRunning();
      ahThread=(HANDLE *)calloc(ulTotalSlots,sizeof(HANDLE));
      for(ulK=0;ulK<ulTotalSlots;ulK++)
        ahThread[ulK]=(HANDLE)_beginthreadex(NULL,0,ThreadProc,(void *)ulK,0,NULL);
      }
}

VOID WINAPI StopSession(VOID) {
ULONG ulK;
  if(bRunning) {
    StopRunning();
    (void)WaitForMultipleObjects(ulTotalSlots,ahThread,TRUE,INFINITE);
    for(ulK=0;ulK<ulTotalSlots;ulK++)
      CloseHandle(ahThread[ulK]);
    free(ahThread);
    }
}
0
 

Author Comment

by:Alvein
Comment Utility
Hi,

Well, thanks for the tips. Apart of the proposed changes, I will have to wait a big while to see if something happens (or does not). If the problem persist, I guess I will have to post the real ThreadProc(), but let's hope not.

....

There's a really weird thing I've seen by using this DLL. The caller does create a global array and pass the pointer to the DLL. The ThreadProc() function updates this array, but each thread has it's own array slot and does not interfere with the rest, so there's no need of to synchronize anything, right? Well, I added a "debug" field to the array base type, and I'm setting it to TRUE at the beginning of ThreadProc(), and resetting to FALSE near the end, just before _endthreadex().

When look the contents of that field, in the caller, after calling StopSession(), there's a bunch of slots with the debug field still being TRUE! only after about a second, everybody is FALSE. If StopSession() is visually a blocking function, what's the reason for some threads continue running (which is what I can infere of this behavior) for a little time after an explicit WaitForMultipleObjects(,,TRUE,INFINITE)?

Thanks again.
0
 
LVL 39

Expert Comment

by:itsmeandnobodyelse
Comment Utility
>>>> bRunning = TRUE;
>>>> Is not thread safe.

Actually, that doesn't matter in this case provided StopSession is not called prior to the last thread was started.

>>>> At random times, not all the threads ends.

Actually the code above seems safe - beside of one weak point that I will explain below - provided the global bool 'bRunning' is really global, what means that if there are more than one cpp file involved you would need to define the global variables in one cpp file - most  likely in the main module - and have an external reference in aller other cpp files. In case of dlls you would need to 'export' all global variables.

The week point I spoke of is the " bRunning=FALSE; " statement in the StopSession function. After bRunning is false and prior to calling WaitForMultipleObjects there is a (little) chance that one of the threads already exited. By calling endthreadex the handle already was closed (freed). So it wouldn't get to signal state anymore what might cause the WaitForMultipleObjects to hang infinite....

Generally, there are a few unnecessary code parts which I would recommend to change:

1. unsigned int __stdcall ThreadProc(void *PIndex)

When using beginthreadex the prototype of a thread procedure is "void threadproc(void*)". That is different thant to using CreatedThread. To keep it simple and clean you simply should take the prototype not making unneeded castings.

2. __stdcall WINAPI

You don't need the __stdcall specifier nor any other (equivalent) specifier like WINAPI  as long as you are not using mixed-language libraries. Note, __stdcall definitively isn't necessarty when a function has only one argument (cause the order of the arguments isn't different for one argument). All these specifiers are an inheritage of a long (and evil) past which should be left behind wherever possible (esspecially if you ever intend to write portable code.

3. _endthreadex(0);

Remove that statement as all threads correctly end without it if reaching (one of) the return statements in the thread procedure. As told above that statement most likely is responsibly for the hung up as it frees the thread handles.

4. calloc

If 'ulTotalSlots' really is a constant you should init a global array for the handles. If not you should define a maximum number of threads and create an array on the the stack like that:

  enum { MAX_THREADS = 256 };

  HANDLE ahthread[MAX_THREADS] = { NULL };

You still could use 'ulTotalSlots' as an upper bound. Note, in C++ you should create dynamic memory by using the 'new' operator. But here it isn't necessary (and an additional error source).

5. _beginthreadex

I couldn't see any need for beginthreadex as you didn't use any of the extra arguments. Use beginthread for simplicity reasons.

Regards, Alex







 
0
 

Author Comment

by:Alvein
Comment Utility
Hello,

After the last changes, SetEvent(), etc., the program is leaving more threads running with no apparent reason. Last night they were 7, compared with the 2 of the previous version.

@ itsmeandnobodyelse,

>>>> In case of dlls you would need to 'export' all global variables.

The bRunning variable is only used in the main module.

>>>> By calling endthreadex the handle already was closed (freed). So it wouldn't get to signal state anymore what might cause the WaitForMultipleObjects to hang infinite....

According to MSDN, _endthreadex() does not close the thread handle.

>>>> You don't need the __stdcall specifier nor any other (equivalent) specifier like WINAPI  as long as you are not using mixed-language libraries.

The caller is coded in VB.

>>>> As told above that statement most likely is responsibly for the hung up as it frees the thread handles.

I will try this as soon as I can. I'm not in front of my PC right now.

>>>> If 'ulTotalSlots' really is a constant you should init a global array for the handles.

It's a global variable and its value is set by the caller (thru another exported function). What's the problem with calloc() anyways?


Thanks for your comments.
0
 
LVL 39

Expert Comment

by:itsmeandnobodyelse
Comment Utility
>>>> According to MSDN, _endthreadex() does not close the thread handle.

You are right, but it may influence the signal status. Actually, you  should/could try to remove endthreadex statement. If you have no endless loop somewhere else it seems to me the only thing what could prevent the thread handle from getting signalled.

>>>> The caller is coded in VB.

As I told __stdcall has no influence on functions with less than two arguments. AFAIK VB has the same calling conventions than VC.

>>>> What's the problem with calloc() anyways?

No problem. You can't 'delete' storage that was allocated by malloc, calloc or realloc. And you can't 'free' memory allocated by 'new'. Using malloc/calloc/realloc needs a cast while 'new' is the C++ way. Furthermore for class types the constructor wouldn't be called if using 'alloc functions.. Generally, you shouldn't use C allocation in C++ cause there isn't any advantage (but some disadvantages). Moreover, as I told for the purpose above you don't need dynamic memory but should use an array of maximum size allocated on the stack (or using a class member).

Regards, Alex



0
 

Author Comment

by:Alvein
Comment Utility
Hello,

Well, after remove the _endthreadex() statement, the half of the total threads left running after a call to StopScrape(), so I continued using _endthreadex(). Not sure what follows now.

In the meanwhile, please help me to synchronize these two routines. They are called from the IPThread() code. At first sight, they are not able to make the thread code hangs. However, they access and modify a global resource, and I need to be sure that the multi-threading stuff is not skewing the results, and in fact, is what it's doing:

unsigned long AddToResults(char *szId,unsigned long ulTimestamp) {
unsigned long ulK;
  for(ulK=0;(ulK<ulTotalSlots)&&strcmp(arResults[ulK].szId,szId)&&arResults[ulK].ulTimestamp;ulK++);
  if(ulK<ulTotalSlots)
    if(!arResults[ulK].ulTimestamp) {
      strcpy(arResults[ulK].szId,szId);
      arResults[ulK].ulTimestamp=ulTimestamp;
      }
    else
      if(arResults[ulK].ulTimestamp<ulTimestamp)
        arResults[ulK].ulTimestamp=ulTimestamp;
  return(ulK);
}

int IsFoundInResults(char *szId) {
unsigned long ulK;
  for(ulK=0;(ulK<ulTotalSlots)&&strcmp(arResults[ulK].szId,szId)&&arResults[ulK].ulTimestamp;ulK++);
  return((ulK<ulTotalSlots)&&arResults[ulK].ulTimestamp);
}

Don't pay much attention to the code itself. AddToResults() inserts one item in a global array, and IsFoundInResults() is TRUE if a given item is found in that global array. I need to block that array the way only one thread at time can insert or search for items there. By not doing this, sometimes the array ends with duplicated items. I hope you get the point.


Thanks again.
0
Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

 

Author Comment

by:Alvein
Comment Utility
Paul,

It sounded like a new question, but read well, two functions called from the same thread code that has a problem. I'm trying to examine all the possible sources of failure. The functions does not behave like I want, so maybe there lies the bug. I'm not the expert in such stuff, I just want to give enough details for the rest of people can give a diagnose, since the first recommendations I've got are making the things worse.


Thanks.
0
 
LVL 39

Expert Comment

by:itsmeandnobodyelse
Comment Utility
>>>> I need to block that array the way only one thread at time can insert or search for items there.

For synchronisation you may use that little helper class:

// mutex.h
#ifndef MUTEX_H
#define MUTEX_H

#include <windows.h>

class Mutex
{
private:
    CRITICAL_SECTION mutex;
public:
    Mutex() {  InitializeCriticalSection(&mutex);  }

    // enter critical section and block others threads or wait
    void enter() { EnterCriticalSection(&mutex); }
   
    // leave critical section and release any other thread blocked
    void leave() { LeaveCriticalSection(&mutex); }
};

#endif

#endif // MUTEX_H


Use it by either defining a global (and shared) Mutex object or by adding it as a member to an already shared class (object).

// main.cpp

   ...

  #include "mutex.h"
  Mutex theMutex;   // the one and only

// thread.cpp

  #include "mutex.h"
  extern Mutex theMutex;


  ...

  unsigned long AddToResults(char *szId,unsigned long ulTimestamp)
  {
        theMutex.enter();

        // put here your original code
        ...

        theMutex.leave();
  }

  int IsFoundInResults(char *szId)
  {
        theMutex.enter();

        // put here your original code
        ...

        theMutex.leave();
  }


Note, if you have additional returns (or exceptions) you *have* to call 'theMutex.leave() ' prior to returning or you'll get a deadlock.

>>>> remove the _endthreadex() statement, the half of the total threads left running after a call to StopScrape()

How did you know that?

Note, after the call to WaitForMultipleObjects you should wait a little while (some milliseconds) before freeing the handles. The handle may get signalled though the thread actually hasn't terminated.

>>>> I'm trying to examine all the possible sources of failure.

Yes. If you are operating on (globally) shared structures and arrays where some were inserting while others were evaluating, there is a good chance of running into an infinite loop.

Regards, Alex


 


0
 

Author Comment

by:Alvein
Comment Utility
Hello,

>>>> For synchronisation you may use that little helper class

Ok. Thanks. I'll do the edits and tell you.

>>>> How did you know that?

I don't remember if it was exactly the half. I did not count it, but there was a lot of threads still running, seconds, minutes after a call to StopSession(). I'm using Sysinternals' Process Explorer to check this condition.

>>>> after the call to WaitForMultipleObjects you should wait a little while (some milliseconds) before freeing the handles.

Isn't INFINITE enough? ;)

>>>> If you are operating on (globally) shared structures and arrays where some were inserting while others were evaluating, there is a good chance of running into an infinite loop.

As I said, I'm just trying to examine everything. I don't see how these routines could get hung in an infinite loop, but who knows.


Thanks again.
0
 

Author Comment

by:Alvein
Comment Utility
BTW, isn't required to add a destructor with a call to DeleteCriticalSection()?
0
 
LVL 16

Accepted Solution

by:
PaulCaswell earned 100 total points
Comment Utility
Hi Alvein,

>>I don't see how these routines could get hung in an infinite loop, but who knows.
The hangs are usually caused by the data corruption resulting in a missed deadlock situation. I can see several places where these two together might mislead the caller. If that is sufficient to cause corruption then that may  be your symptom. The problem is still present.

Imagine, for example, AddToResults is called by two threads at once and the first one is interrupred part-way through one of the strcmps at the start, all sorts of strange things will start to happen. Even worse in the middle of an strcpy!

After this line:

 if(ulK<ulTotalSlots)

you have decided that this is the slot to place the data in. If another thread gets to this point in the code at the same time, both threads will attempt to update the data at the same time!

Paul
0
 
LVL 39

Expert Comment

by:itsmeandnobodyelse
Comment Utility
>>>> Isn't INFINITE enough? ;)

INFINITE only works as long as you are waiting. I thought, you already would reach the for loop that happens after the call of WaitForMultipleObjects. Then, all thread handles were signalled but some of the threads might be still not be removed from memory. If your main thread closes immediately after wait, some of the threads still might be terminated by force what could mean that not all resources were freed.


>>>> I'm using Sysinternals' Process Explorer to check this condition.

Actually, the condition easily could be verified by simply using a debugger. Put the breakpoint at the while condition that checks the stop flag and every thread *must* break there. If you don't know how to identify the threads you either could assign the current thread id to a temporary variable

    unsigned long threadID =  __threadid();
    while (bRunning)
     ...

or you make a copy of the thread function for any thread - adding a unique suffix to each function. Then, you need to set breakpoints to each of these functions:

   unsigned int __stdcall ThreadProc_A(void *PIndex) {
   while(bRunning)   // set the breakpoint here
   {

Actually, if your synchronisation is done well, *all* threads will recognize the stop flag and the call to WaitForMultipleObjects definitively returns.

>>>> isn't required to add a destructor with a call to DeleteCriticalSection()?

Yes, it would be better cause it would free all resources associated with. However, if the mutex was defined globally or as  a static class member resources were freed anyway.

Regards, Alex


0
 

Author Comment

by:Alvein
Comment Utility
I had to put a "Sleep(10000);" statement just after WaitForMultipleObjects(). That's ridicule. I've never been this way before. I don't have idea about what to do.
0
 

Author Comment

by:Alvein
Comment Utility
Look this, I've changed the line

(void)WaitForMultipleObjects(ulTotalSlots,ahThread,TRUE,INFINITE);

To

for(ulK=0;ulK<ulTotalSlots;ulK++)
  (void)WaitForSingleObject(ahThread[ulK],INFINITE);

And after some tests, the threads are ending ok. Aren't those lines the same???? The call to StopSession() is taking rather more time than expected, but who cares if the bug disappeared?
0
 
LVL 39

Assisted Solution

by:itsmeandnobodyelse
itsmeandnobodyelse earned 100 total points
Comment Utility
>>>> That's ridicule.

The only thing Sleep has to prevent is that the main thread terminates prior to any of the dependent threads (what would terminate the thread unsmooothfully) . You wouldn't need if it wasn't the final code sequence in your prog.

>>>> for(ulK=0;ulK<ulTotalSlots;ulK++)
>>>>  (void)WaitForSingleObject(ahThread[ulK],INFINITE);

The difference is, that you are waiting for one handle at one time only. If endthreadex can spoil the signal status of the associated thread handle in an undefined way - what is I suspected - the loop above minimizes that risk cause only one handle was considered at one time.

Note, I often made multi-threaded programs and *never* used endthreadex. Sometimes I used the WaitFor... functions like in your code, but mostly I passed a pointer to each thread where the thread simply could set a 'I've recognized termination'-flag. Then, in the final loop I only check if there is any recognized flag not set. Of course, these techniques should be be equivalent and using the WaitFor... functions might be the more professional way but polling some bool flags. However, I love pragmatical solutions that are working ... ;-)

Regards, Alex



0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

This article will show you some of the more useful Standard Template Library (STL) algorithms through the use of working examples.  You will learn about how these algorithms fit into the STL architecture, how they work with STL containers, and why t…
This article shows you how to optimize memory allocations in C++ using placement new. Applicable especially to usecases dealing with creation of large number of objects. A brief on problem: Lets take example problem for simplicity: - I have a G…
The goal of the video will be to teach the user the concept of local variables and scope. An example of a locally defined variable will be given as well as an explanation of what scope is in C++. The local variable and concept of scope will be relat…
The goal of the video will be to teach the user the difference and consequence of passing data by value vs passing data by reference in C++. An example of passing data by value as well as an example of passing data by reference will be be given. Bot…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now