Solved

pthread_join isn't successful

Posted on 2004-03-28
Last Modified: 2007-12-19
Hi experts,
I'm programming an application server that runs under Linux. The server is started using pthread_create(). Running the server is no problem.
The problem is how to stop the server thread.

I try to stop it using sigaction with a handler function. The handler calls a member function that calls pthread_cancel() followed by pthread_join() (see code). pthread_join waits for the server thread to finish but never returns. The program is blocked! :-(
When I call the stop method from within the c_ApplServer class, pthread_join works successfully.
Thanks in advance.
Martin

Now see the code snippets.

void ServerRestart(int p_iSig);
void ServerStop(int p_iSig);

c_ApplServer g_oApplServer(DATASERVERCONFIGFILE);

int main()
{

      struct sigaction s_SigRestart, s_SigStop;

      //start dataserver
      if (g_oApplServer.start() != 0) exit(0);

      s_SigStop.sa_handler = ServerStop;
      sigemptyset(&s_SigStop.sa_mask);
      s_SigStop.sa_flags = 0;
      sigaction(SIGINT,&s_SigStop,NULL);
      sigaction(SIGTERM,&s_SigStop,NULL);
      sigaction(SIGQUIT,&s_SigStop,NULL);
      sigaction(SIGTSTP,&s_SigStop,NULL);

      //wait for signals

      while(1) pause();

      exit(0);

}
void ServerStop(int p_iSig)
{
      printf("Try to stop server\n");
      fflush(stdout);
      g_oApplServer.stop();
      exit(0);
}

Now the ServerCode snippet:
--------------------------


c_ApplServer::c_ApplServer(char *p_pcConfigFile)
{
      a_bServerRun = true;
}
c_ApplServer::~c_ApplServer()
{
}

int c_ApplServer::start(void)
{

      if(pthread_create(&a_ThreadServer,NULL,ThreadServerWatchFunction,this) != 0)
        {
      }
      return 0;
}

int c_ApplServer::restart(void)
{
      return 0;
}

int c_ApplServer::stop(void)
{
      int res;
      //a_bServerRun = false;
      sleep(1);

      printf("Waiting for Server thread to finish\n");
      fflush(stdout);
      if(pthread_cancel(a_ThreadServer)!= 0)
      {
            fprintf(stderr,"Cancel ThreadServer failed\n");
      }

      if(pthread_join(a_ThreadServer, NULL)!=0)
      {
            fprintf(stderr,"Thread joining failed\n");
      }
      fflush(stdout);

      return 0;
}

void *c_ApplServer::ThreadServerWatchFunction(void* p_pvArg)
{
      c_ApplServer *l_poApplServer = (c_ApplServer*)p_pvArg;
      l_poApplServer -> ThreadServerWatch(p_pvArg);

      return NULL;
}

void c_ApplServer::ThreadServerWatch(void* p_pvArg)
{

    //c_ApplServer* appl = (c_ApplServer*)p_pvArg;
    int res = pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);

    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);

    while(1)
    {
        printf("I'm running... \n");
        pthread_testcancel();
        usleep(500);
        pthread_testcancel();
    }

}
Question by:onsight

Expert Comment

by:oumer
ID: 10699151
I tried to rewrite what you posted so that it would run, and it did: when I press Ctrl+C, the server stops. Could you try this program on your machine and see if it runs? If it does, then something could be wrong in the other parts of your code that you didn't show; maybe you have set the cancel state to disabled somewhere, or something like that. Also, I didn't understand what you meant by "when I call the stop method within the applserver class pthread_join works successfully".




Expert Comment

by:oumer
ID: 10699161
Oops, here is the code:

#include <iostream>
#include <pthread.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

using namespace std;


//void ServerRestart(int p_iSig);
void ServerStop(int p_iSig);

class c_ApplServer
{
 public:
  bool a_bServerRun;
  pthread_t a_ThreadServer;
  c_ApplServer(char *);
  ~c_ApplServer();
  int start(void);
  int restart(void);
  int stop(void);
  static void *ThreadServerWatchFunction(void*);
  void ThreadServerWatch(void*);
};

c_ApplServer::c_ApplServer(char *p_pcConfigFile)
{
  a_bServerRun = true;
}

c_ApplServer::~c_ApplServer()
{
}

int c_ApplServer::start(void)
{
  if(pthread_create(&a_ThreadServer,NULL,ThreadServerWatchFunction,this) != 0)
    {
    }
  return 0;
}

int c_ApplServer::restart(void)
{
     return 0;
}

int c_ApplServer::stop(void)
{
  // int res;
  //a_bServerRun = false;
  sleep(1);
 
  printf("Waiting for Server thread to finish\n");
  fflush(stdout);
  if(pthread_cancel(a_ThreadServer)!= 0)
    {
      fprintf(stderr,"Cancel ThreadServer failed\n");
    }
 
  if(pthread_join(a_ThreadServer, NULL)!=0)
    {
      fprintf(stderr,"Thread joining failed\n");
    }
  fflush(stdout);
 
  return 0;
}

void *c_ApplServer::ThreadServerWatchFunction(void* p_pvArg)
{
  c_ApplServer *l_poApplServer = (c_ApplServer*)p_pvArg;
  l_poApplServer -> ThreadServerWatch(p_pvArg);
 
  return NULL;
}

void c_ApplServer::ThreadServerWatch(void* p_pvArg)
{

  //c_ApplServer* appl = (c_ApplServer*)p_pvArg;
  //int res;
  pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);

  pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);

  while(1)
    {
     
      printf("I'm running... \n");
      usleep(500);
    }
 
  return ;
}

c_ApplServer g_oApplServer(NULL);


void ServerStop(int p_iSig)
{

  printf("Try to stop server\n");
  fflush(stdout);
  g_oApplServer.stop();
  exit(0);
}


int main()
{
     struct sigaction s_SigStop;

     //start dataserver
     if (g_oApplServer.start() != 0) exit(0);

     s_SigStop.sa_handler = ServerStop;
     sigemptyset(&s_SigStop.sa_mask);
     s_SigStop.sa_flags = 0;
     sigaction(SIGINT,&s_SigStop,NULL);
     sigaction(SIGTERM,&s_SigStop,NULL);
     sigaction(SIGQUIT,&s_SigStop,NULL);
     sigaction(SIGTSTP,&s_SigStop,NULL);

     //wait for signals

     while(1) pause();

     exit(0);

}



Accepted Solution

by:
oumer earned 1500 total points
ID: 10699316
I was playing with your code and, as you said, sometimes it fails. I think it is because when the interrupt signal is received, any thread that has not blocked it can run the signal handler, i.e. the handler may run in one of the other threads, including the server thread itself. If you want to control this, you should block the signal in the threads where you don't want to handle it.

Something like this, and the problem seems to disappear:

int main()
{
  struct sigaction s_SigStop;

  // mask the signals before we create the thread
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGINT);
  sigaddset(&mask, SIGTERM);
  sigaddset(&mask, SIGTSTP);  // SIGSTOP cannot be blocked or caught; SIGTSTP is the signal handled below
  sigaddset(&mask, SIGQUIT);

  pthread_sigmask(SIG_BLOCK, &mask, NULL);
 
  if (g_oApplServer.start() != 0) exit(0);

//now that we have created the thread, we unblock the signal
//and we install the signal handler, this will make sure that the
//only thread that will accept this signal is the main thread ..

  pthread_sigmask(SIG_UNBLOCK, &mask, NULL);
   

  s_SigStop.sa_handler = ServerStop;
  sigemptyset(&s_SigStop.sa_mask);
  s_SigStop.sa_flags = 0;
  sigaction(SIGINT,&s_SigStop,NULL);
  sigaction(SIGTERM,&s_SigStop,NULL);
  sigaction(SIGQUIT,&s_SigStop,NULL);
  sigaction(SIGTSTP,&s_SigStop,NULL);

     //wait for signals

  while(1) pause();

     exit(0);

}
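
A related point: printf, exit, pthread_cancel and pthread_join are not on the POSIX list of async-signal-safe functions, so calling stop() from inside the handler is fragile even with the mask in place. One alternative, shown here only as a minimal sketch, is to keep the signals blocked in every thread and let the main thread collect them synchronously with sigwait(), so that the cancel and join happen in ordinary thread context. The names worker and run_until_signal are hypothetical and are not part of the code posted above.

```cpp
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
#include <cassert>

// Hypothetical worker standing in for the server thread from the question.
static void *worker(void *)
{
    for (;;) {
        usleep(500);   // usleep() is a cancellation point
    }
    return NULL;
}

// Blocks the shutdown signals, starts the worker, waits synchronously for
// one of the signals, then cancels and joins the worker.
// Returns the signal number received, or -1 on error.
int run_until_signal(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGQUIT);

    // Block before pthread_create(): the new thread inherits this mask,
    // so no thread ever runs an asynchronous handler for these signals.
    if (pthread_sigmask(SIG_BLOCK, &mask, NULL) != 0) return -1;

    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, NULL) != 0) return -1;

    // Sleep until one of the blocked signals becomes pending.
    int sig = 0;
    if (sigwait(&mask, &sig) != 0) sig = -1;

    // We are in normal thread context here, not inside a signal handler.
    pthread_cancel(tid);
    void *result = NULL;
    pthread_join(tid, &result);
    assert(result == PTHREAD_CANCELED);
    return sig;
}
```

In this sketch main() would simply call run_until_signal() and then exit; no sigaction() handlers are installed at all, and the signals stay blocked in the caller after the function returns.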



Author Comment

by:onsight
ID: 10702121
Now I have found the trap. Without the sleep(1) it doesn't work ?-) But why???

There are two other questions that I can't resolve.
When I use the sleep(1) call and everything works fine, the following lines are displayed:

> Try to stop server
> Try to stop server
> Try to join Server Thread
> Thread joined

This means that ServerStop runs twice. But why????

>void ServerStop(int p_iSig)
>{
>       printf("Try to stop server\n");
>      g_oApplServer.stop();
>      exit(0);
>}

When I start the application server, the ps ax command lists three entries for that server:
  912 root        556 S   ./server
  913 root        556 S   ./server
  914 root        556 S   ./server
  927 root        292 R   ps
Why are there three entries and not only two (parent and child thread)?


int c_ApplServer::stop(void)
{
  // int res;
  //a_bServerRun = false;
  sleep(1);  // <== I noticed that the sleep call is what makes it work!
 
  printf("Waiting for Server thread to finish\n");
  fflush(stdout);
  if(pthread_cancel(a_ThreadServer)!= 0)
    {
      fprintf(stderr,"Cancel ThreadServer failed\n");
    }
 
  if(pthread_join(a_ThreadServer, NULL)!=0)
    {
      fprintf(stderr,"Thread joining failed\n");
    }
  fflush(stdout);
 
  return 0;
}

Expert Comment

by:oumer
ID: 10702597
The three listings are one for the parent process, one for the main thread, and one for the child thread. If you run just ps, you will see the parent process only. In a non-threaded program, that is all you will be able to see. But once you make the program multithreaded, you will see one entry for the parent process, one for the main thread, and one for every child thread.

And I think ServerStop gets called twice because the SIGINT is sent to both threads, as there are two threads running, and if you don't mask the SIGINT both threads will try to call the signal handler. So masking the signal (with pthread_sigmask, as in the code above) will make sure only one thread calls the signal handler.
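
The masking approach relies on the fact that a thread created with pthread_create() starts with a copy of its creator's signal mask. The following is a small sketch that checks this for SIGINT; the function names report_sigint_blocked and child_inherits_blocked_sigint are made up for illustration.

```cpp
#include <pthread.h>
#include <signal.h>
#include <cassert>

// Thread body: reports whether SIGINT is blocked in the calling thread.
static void *report_sigint_blocked(void *out)
{
    sigset_t current;
    sigemptyset(&current);
    // With a NULL new set, pthread_sigmask() only queries the current mask.
    pthread_sigmask(SIG_BLOCK, NULL, &current);
    *static_cast<bool *>(out) = (sigismember(&current, SIGINT) == 1);
    return NULL;
}

// Creates a thread while SIGINT is blocked in the caller and returns
// whether the new thread started with SIGINT blocked as well.
bool child_inherits_blocked_sigint(void)
{
    sigset_t mask, old;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    pthread_sigmask(SIG_BLOCK, &mask, &old);

    bool blocked = false;
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, report_sigint_blocked, &blocked);
    assert(rc == 0);
    pthread_join(tid, NULL);

    // Restore the caller's mask; this does not change other threads' masks.
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return blocked;
}
```

Unblocking the signal in the main thread after the child has been created leaves the child's mask untouched, which is why only the main thread then receives it.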


Expert Comment

by:oumer
ID: 10742983
I am sorry to say that my description of the three entries was wrong, as I found out today while looking for something else.

The implementation of POSIX threads on Linux creates a new process (a special process, different from the one you get from fork, as it shares the same address space as the original process) to run the thread whenever pthread_create is called.

So in your case the first PID you saw is for the main process, the second one is for the thread spawner that was created by the call to pthread_create, and the third one is the child thread.

Sorry for misinforming you; I didn't know this at the time.