Solved

Problem with SIGCHLD

Posted on 2003-11-05
Last Modified: 2010-04-21
Hi,
  I have several child processes (about 10) forked by a parent. I have installed a SIGCHLD handler in the parent, and it calls wait() inside it. If I kill some of the child processes with kill -9 (say 5 of them), not every SIGCHLD reaches the parent, and the children whose SIGCHLD was missed become <defunct>. What can be done so that every child exit is reported properly? Will waitpid(-1,...) help? If so, with which options?

 Thanks in advance
Question by:sgupta001
5 Comments
 

Expert Comment

by:sunnycoder
ID: 9686636
While a signal handler is running, the same signal is blocked, and standard signals are not queued. So if several children exit at about the same time, your application may see only one SIGCHLD. This is the normal, documented behaviour of the signal mechanism.

If you expect situations like the one you describe, keep track of the pids of the children you have forked and call waitpid() for those that have exited. Pass WNOHANG in the options to waitpid() so that your program does not block.
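The coalescing described above can be observed with a small sketch. It assumes a POSIX system that provides waitid() with WNOWAIT; the names handler_calls, count_sigchld and sigchld_deliveries are made up for this illustration and are not from the thread. It blocks SIGCHLD, lets several children exit, then unblocks the signal and reports how often the handler actually ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_DEMO_CHILDREN 64

/* Hypothetical counter: how many times the SIGCHLD handler ran. */
static volatile sig_atomic_t handler_calls = 0;

static void count_sigchld(int signo)
{
    (void)signo;
    handler_calls++;
}

/*
 * Fork nchildren children that exit immediately while SIGCHLD is blocked,
 * then unblock SIGCHLD. Returns the number of handler invocations,
 * or -1 if the handler could not be installed.
 */
int sigchld_deliveries(int nchildren)
{
    struct sigaction sa, old_sa;
    sigset_t chld;
    pid_t pids[MAX_DEMO_CHILDREN];
    int i, started = 0;

    if (nchildren > MAX_DEMO_CHILDREN)
        nchildren = MAX_DEMO_CHILDREN;
    handler_calls = 0;

    sa.sa_handler = count_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Hold back SIGCHLD so that all exits happen before any delivery. */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, NULL);

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);               /* child: exit at once */
        if (pid > 0)
            pids[started++] = pid;
    }

    /* Wait until each child has exited, but leave it unreaped (WNOWAIT). */
    for (i = 0; i < started; i++) {
        siginfo_t info;
        waitid(P_PID, (id_t)pids[i], &info, WEXITED | WNOWAIT);
    }

    /* Unblock: whatever SIGCHLD is pending gets delivered here. */
    sigprocmask(SIG_UNBLOCK, &chld, NULL);

    /* Clean up the zombies and restore the previous handler. */
    for (i = 0; i < started; i++)
        waitpid(pids[i], NULL, 0);
    sigaction(SIGCHLD, &old_sa, NULL);

    return (int)handler_calls;
}
```

Calling sigchld_deliveries(5) from main() should report fewer handler runs than children; on Linux it is typically a single run, which is why a handler that calls wait() once leaves the other children defunct.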

you may find this interesting
http://oldlook.experts-exchange.com/Programming/Programming_Platforms/Unix_Programming/Q_20776323.html
 

Expert Comment

by:TriShakti
ID: 9844993
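#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NO_OF_CHILD_PROCESSES 10

int main(void)
{
    pid_t child_process[NO_OF_CHILD_PROCESSES] = {0};
    int remaining = 0;   /* children forked but not yet reaped */
    int i;

    /* Fork the children; each child just sleeps and then exits. */
    for (i = 0; i < NO_OF_CHILD_PROCESSES; i++) {
        child_process[i] = fork();
        if (child_process[i] < 0) {
            perror("fork");
            continue;            /* this slot stays unused */
        }
        if (child_process[i] == 0) {
            sleep(50);
            _exit(0);            /* _exit: do not flush the parent's stdio buffers again */
        }
        printf("Child no [%d] has pid = [%ld]\n", i, (long)child_process[i]);
        remaining++;
    }

    /* Parent: poll every child without blocking until all are reaped. */
    while (remaining > 0) {
        for (i = 0; i < NO_OF_CHILD_PROCESSES; i++) {
            pid_t dead_process;

            if (child_process[i] <= 0)
                continue;        /* fork failed or child already reaped */

            dead_process = waitpid(child_process[i], NULL, WNOHANG);
            if (dead_process > 0)
                printf("Process [%ld] is dead\n", (long)dead_process);
            if (dead_process != 0) {
                /* reaped, or waitpid() failed: stop polling this pid */
                child_process[i] = 0;
                remaining--;
            }
        }
        sleep(2);                /* poll interval */
    }
    return 0;
}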


Run this program; it prints a message for every child that you kill (kill -9 <pid1> <pid2> ...) or that exits on its own.


A signal handler that calls wait() only once cannot achieve this, because the kernel's data structures only remember which signals are pending. They cannot remember how many times a particular signal occurred.

This is the reason the processes become defunct when you run kill -9 on more than one process. The signal handler may be invoked only once for all the pids given in a single kill command, so the wait() or waitpid() in the handler is also called once and reports the death of only one child. If you move the waitpid() loop of this program into your signal handler, it will work.
However, this is not advisable in practice; just test whether the change works or not.


Post a comment here if you need more help or if this does not work, and I will get back to you.
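One common way to avoid doing the polling inside the handler is to let the handler only set a flag and to do the reaping from the program's normal loop. The following is a minimal sketch of that idea, not the method used by anyone in this thread; child_exited, note_sigchld, install_sigchld_flag and reap_finished_children are hypothetical names chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, cleared by the main loop. */
static volatile sig_atomic_t child_exited = 0;

/* The handler does the bare minimum: it records that SIGCHLD arrived. */
static void note_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;
}

/* Install the flag-setting handler; returns 0 on success, -1 on failure. */
int install_sigchld_flag(void)
{
    struct sigaction sa;

    sa.sa_handler = note_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;  /* ignore stop notifications */
    return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Call this from the main loop. If SIGCHLD arrived since the last call,
 * reap every child that has terminated and return how many were reaped.
 */
int reap_finished_children(void)
{
    int count = 0;
    pid_t pid;

    if (!child_exited)
        return 0;

    /* Clear the flag first, so a signal arriving during the loop is kept. */
    child_exited = 0;

    while ((pid = waitpid(-1, NULL, WNOHANG)) > 0) {
        printf("Process [%ld] is dead\n", (long)pid);
        count++;
    }
    return count;
}
```

The parent would call install_sigchld_flag() once before forking and then call reap_finished_children() on every pass of its main loop. Because one flag can stand for several exits, the function loops until waitpid() reports nothing more to collect.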

TRISHAKTI


 

 

Author Comment

by:sgupta001
ID: 9896675
Hi TriShakti,
  I have already found a solution that works. It is also based on waitpid().
 I inserted the following lines in the SIGCHLD handler:

       while((stat = waitpid(-1,(int*)0, WNOHANG|WUNTRACED)) > 0)
       {
            // one child changed state: it died (or was stopped, since WUNTRACED is set)
       }

Regards
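For completeness, the loop above could sit in a handler roughly as follows. This is a sketch under assumptions rather than the poster's actual code: the handler is installed with sigaction(), WUNTRACED is left out so that only terminated children are counted, and reaped_children, reap_children, install_sigchld_reaper and spawn_and_count are names invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counter: children reaped by the handler so far. */
static volatile sig_atomic_t reaped_children = 0;

/* One SIGCHLD may stand for several exits, so loop until nothing is left. */
static void reap_children(int signo)
{
    int saved_errno = errno;     /* waitpid() may overwrite errno */

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Install the reaping handler; returns 0 on success, -1 on failure. */
int install_sigchld_reaper(void)
{
    struct sigaction sa;

    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Fork n children that exit at once and wait until the handler has
 * reaped all of them. Returns the number of children reaped.
 */
int spawn_and_count(int n)
{
    sigset_t chld, old_mask, wait_mask;
    int i, started = 0;

    reaped_children = 0;

    /* Block SIGCHLD so it can only arrive inside sigsuspend() below. */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);            /* child: exit at once */
        if (pid > 0)
            started++;
    }

    /* Atomically unblock SIGCHLD and sleep until a handler has run. */
    while (reaped_children < started)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return (int)reaped_children;
}
```

After install_sigchld_reaper() succeeds, spawn_and_count(5) should return 5 even if the five exits are announced by fewer than five signals, because each handler run drains every child that is ready.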
 

Accepted Solution

by:
modulo earned 0 total points
ID: 11687220
PAQed, with points refunded (50)

modulo
Community Support Moderator