onsight

pthread_join isn't successful

Hi experts,
I'm programming an application server that runs under Linux. The server is started with pthread_create(), and running it is no problem.
The problem is how to stop the server thread.

I try to stop it using sigaction with a handler function. The handler calls a member function that calls pthread_cancel() followed by pthread_join() (see code). pthread_join waits for the server thread to finish but never returns. The program is blocked! :-(
When I call the stop method from within the c_ApplServer class, pthread_join works successfully.
Thanks in advance.
Martin

Now see the code snippets.

void ServerRestart(int p_iSig);
void ServerStop(int p_iSig);

c_ApplServer g_oApplServer(DATASERVERCONFIGFILE);

int main()
{

      struct sigaction s_SigRestart, s_SigStop;

      //start dataserver
      if (g_oApplServer.start() != 0) exit(0);

      s_SigStop.sa_handler = ServerStop;
      sigemptyset(&s_SigStop.sa_mask);
      s_SigStop.sa_flags = 0;
      sigaction(SIGINT,&s_SigStop,NULL);
      sigaction(SIGTERM,&s_SigStop,NULL);
      sigaction(SIGQUIT,&s_SigStop,NULL);
      sigaction(SIGTSTP,&s_SigStop,NULL);

      //wait for signals

      while(1) pause();

      exit(0);

}
void ServerStop(int p_iSig)
{
      printf("Try to stop server\n");
      fflush(stdout);
      g_oApplServer.stop();
      exit(0);
}

Now the ServerCode snippet:
--------------------------


c_ApplServer::c_ApplServer(char *p_pcConfigFile)
{
      a_bServerRun = true;
}
c_ApplServer::~c_ApplServer()
{
}

int c_ApplServer::start(void)
{

      if(pthread_create(&a_ThreadServer,NULL,ThreadServerWatchFunction,this) != 0)
        {
      }
      return 0;
}

int c_ApplServer::restart(void)
{
      return 0;
}

int c_ApplServer::stop(void)
{
        int res;
        //a_bServerRun = false;
      sleep(1);

        printf("Waiting for Server thread to finish\n");
      fflush(stdout);
      if(pthread_cancel(a_ThreadServer)!= 0)
      {
        fprintf(stderr,"Cancel ThreadServer failed\n");
      }

      if(pthread_join(a_ThreadServer, NULL)!=0)
      {
        fprintf(stderr,"Thread joining failed\n");
      }
      fflush(stdout);

        return 0;
}

void *c_ApplServer::ThreadServerWatchFunction(void* p_pvArg)
{
      c_ApplServer *l_poApplServer = (c_ApplServer*)p_pvArg;
      l_poApplServer -> ThreadServerWatch(p_pvArg);

      return NULL;
}

void c_ApplServer::ThreadServerWatch(void* p_pvArg)
{

    //c_ApplServer* appl = (c_ApplServer*)p_pvArg;
    int res = pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);

    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);

    while(1)
    {
        printf("I'm running... \n");
        pthread_testcancel();
      usleep(500);
        pthread_testcancel();
    }

 }
oumer

I rewrote what you posted so that it could run, and it did: when I press Ctrl+C, the server stops. Could you try this program on your machine and see whether it runs? If it does, then something could be wrong with the other parts of your code that you didn't show; maybe you have set the cancel state to disabled somewhere, or something like that. Also, I didn't understand what you meant by "when I call the stop method within the applServer class pthread_join works successfully".



Oops, here is the code:

#include <iostream>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>   // exit()
#include <signal.h>
#include <unistd.h>

using namespace std;


//void ServerRestart(int p_iSig);
void ServerStop(int p_iSig);

class c_ApplServer
{
 public:
  bool a_bServerRun;
  pthread_t a_ThreadServer;
  c_ApplServer(char *);
  ~c_ApplServer();
  int start(void);
  int restart(void);
  int stop(void);
  static void *ThreadServerWatchFunction(void*);
  void ThreadServerWatch(void*);
};

c_ApplServer::c_ApplServer(char *p_pcConfigFile)
{
  a_bServerRun = true;
}

c_ApplServer::~c_ApplServer()
{
}

int c_ApplServer::start(void)
{
  if(pthread_create(&a_ThreadServer,NULL,ThreadServerWatchFunction,this) != 0)
    {
    }
  return 0;
}

int c_ApplServer::restart(void)
{
     return 0;
}

int c_ApplServer::stop(void)
{
  // int res;
  //a_bServerRun = false;
  sleep(1);
 
  printf("Waiting for Server thread to finish\n");
  fflush(stdout);
  if(pthread_cancel(a_ThreadServer)!= 0)
    {
      fprintf(stderr,"Cancel ThreadServer failed\n");
    }
 
  if(pthread_join(a_ThreadServer, NULL)!=0)
    {
      fprintf(stderr,"Thread joining failed\n");
    }
  fflush(stdout);
 
  return 0;
}

void *c_ApplServer::ThreadServerWatchFunction(void* p_pvArg)
{
  c_ApplServer *l_poApplServer = (c_ApplServer*)p_pvArg;
  l_poApplServer -> ThreadServerWatch(p_pvArg);
 
  return NULL;
}

void c_ApplServer::ThreadServerWatch(void* p_pvArg)
{

  //c_ApplServer* appl = (c_ApplServer*)p_pvArg;
  //int res;
  pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);

  pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);

  while(1)
    {
     
      printf("I'm running... \n");
      usleep(500);
    }
 
  return ;
}

c_ApplServer g_oApplServer(NULL);


void ServerStop(int p_iSig)
{

  printf("Try to stop server\n");
  fflush(stdout);
  g_oApplServer.stop();
  exit(0);
}


int main()
{
     struct sigaction s_SigStop;

     //start dataserver
     if (g_oApplServer.start() != 0) exit(0);

     s_SigStop.sa_handler = ServerStop;
     sigemptyset(&s_SigStop.sa_mask);
     s_SigStop.sa_flags = 0;
     sigaction(SIGINT,&s_SigStop,NULL);
     sigaction(SIGTERM,&s_SigStop,NULL);
     sigaction(SIGQUIT,&s_SigStop,NULL);
     sigaction(SIGTSTP,&s_SigStop,NULL);

     //wait for signals

     while(1) pause();

     exit(0);

}


onsight

ASKER

Now I've found the trap: without the sleep(1) it doesn't work. But why?

There are two other questions that I can't figure out.
When I use the sleep(1) call and everything works fine, the following lines are displayed:

> Try to stop server
> Try to stop server
> Try to join Server Thread
> Thread joined

This means that ServerStop runs twice. But why?

>void ServerStop(int p_iSig)
>{
>       printf("Try to stop server\n");
>      g_oApplServer.stop();
>      exit(0);
>}

When I start the application server, the ps -ax command lists three entries for it:
  912 root        556 S   ./server
  913 root        556 S   ./server
  914 root        556 S   ./server
  927 root        292 R   ps
Why are there three entries and not just two (parent and child thread)?


int c_ApplServer::stop(void)
{
  // int res;
  //a_bServerRun = false;
  sleep(1);   // <== I noticed that the sleep call is what makes it work!
 
  printf("Waiting for Server thread to finish\n");
  fflush(stdout);
  if(pthread_cancel(a_ThreadServer)!= 0)
    {
      fprintf(stderr,"Cancel ThreadServer failed\n");
    }
 
  if(pthread_join(a_ThreadServer, NULL)!=0)
    {
      fprintf(stderr,"Thread joining failed\n");
    }
  fflush(stdout);
 
  return 0;
}
The three listings are one for the parent process, one for the main thread, and one for the child thread. If you run plain ps you will see only the parent process, which is all you would see for a non-threaded program. Once the program is multithreaded, you will see one entry for the parent process, one for the main thread, and one for every child thread.

And I think ServerStop is called twice because SIGINT is delivered to both threads: there are two threads running, and if you don't mask SIGINT, both threads will try to call the signal handler. Blocking the signal in all but one thread (with sigprocmask, or pthread_sigmask in a threaded program) makes sure that only one thread calls the signal handler.

I am sorry to say that my description of the three entries was wrong, as I found out today while looking for something else.

The implementation of POSIX threads on Linux creates a new process whenever pthread_create is called, and that process runs the thread. It is a special kind of process, different from the one you get from fork, because it shares the same address space as the original process.

So in your case the first PID you saw is the main process, the second is the thread spawner that was created by the call to pthread_create, and the third is the child thread.

Sorry for misinforming you; I didn't know at the time.