Solved

Understanding the behaviour of waitpid(-1, NULL, WNOHANG)

Posted on 2004-04-29
Last Modified: 2010-08-05
As far as I understand from waitpid's man page, when calling waitpid with WNOHANG, the function should return 0 if there are no child processes to wait for.
When running the following program -

===
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define N 5

void error_exit(char *str)
{
      perror(str);
      exit(1);
}

void sigchld_handler(int sig)
{
      pid_t      pid;
      do {
            pid = waitpid(-1, NULL, WNOHANG);
            printf("child %d\n",pid);
      } while ( pid > 0 );
      
      if ( errno == ECHILD )
            puts("errno=ECHILD");
      if ( pid < 0 )
            error_exit("waitpid");

      if ( signal(SIGCHLD, sigchld_handler) == SIG_ERR )
            error_exit("signal");
}

int main()
{
      int       i;
      pid_t       pid;
      
      if ( signal(SIGCHLD, sigchld_handler) == SIG_ERR )
            error_exit("signal");
      
      for ( i = 0; i < N; i += 1 ) {
            pid = fork();
            if ( pid < 0 )
                  // fork error
                  error_exit("fork");
            else if ( pid == 0 ) {
                  // child process - sleep
                  sleep(N-i+1);
                  exit(0);
            }
      }
      while(1);
      
      return 0;
}
===

I get this output -

===
child 5398
child 0
child 5397
child 0
child 5396
child 0
child 5395
child 0
child 5394
child -1
errno=ECHILD
waitpid: No child processes
===

As you can see, after all child processes have been "collected", waitpid returns -1 (which means an error occurred) and sets errno to ECHILD. I'd like to know why I get this error, and what the problem with my code is.

TIA

Edit:
My kernel version is 2.6.4 (also tried 2.4.22), and my GCC version is 3.3.3 (also tried 3.2.3).
Question by:zagzag
3 Comments
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 10954723
Your interpretation of the man page is not quite right:

      WNOHANG
              which  means  to return immediately if no child has
              exited.

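If you start a bunch of child processes but none of them has exited yet, waitpid returns right away with 0 when you specify WNOHANG; that is the "child 0" line in your output. Once the last child has been collected there are no child processes left at all, and waitpid then fails with -1 and errno set to ECHILD. Your system is therefore doing the right thing by reporting "No child processes".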
LVL 45

Accepted Solution

by:sunnycoder (earned 125 total points)
ID: 10957142
In your signal handler you have

     do {
          pid = waitpid(-1, NULL, WNOHANG);
          printf("child %d\n",pid);
     } while ( pid > 0 );


As long as at least one child still exists, waitpid has something it could eventually collect (even if not at the current instant), so with WNOHANG it returns 0 and your loop ends. When the handler collects the last child, the loop makes it execute waitpid once more, and this time there are no children left to collect at all, so it returns -1 and sets errno to ECHILD.

Also, the way you have added error checking is inaccurate.

Instead of

     if ( errno == ECHILD )
          puts("errno=ECHILD");
     if ( pid < 0 )
          error_exit("waitpid");

It should have been

     if ( pid < 0 )
     {
          if ( errno == ECHILD )
               puts("errno=ECHILD");
          error_exit("waitpid");
     }

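You should not check errno on its own. It is only meaningful right after a call has reported failure; a successful call does not reset it, so it may still hold a value left over from an earlier failed call in your program, and the printf between waitpid and your check may change it too. It is necessary to check the return value first and to look at errno only when that value is -1.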

Author Comment

by:zagzag
ID: 10958618
Thanks :)