Parent script needs to kill all children on SIGTERM

I would like to implement a wrapper script that ensures all child processes it has spawned are killed when the wrapper receives a SIGTERM (or a SIGINT). Currently I have a solution which works on Solaris but not (for some reason) on Linux. The wrapper script uses a Perl script which parses the output of `ps -ef`, figures out the set of children for a given PID and `kill -9`s them all (I cannot use pstree as it does not exist under Solaris, but `ps -ef` gives similar output on both OSs). The Perl script itself works OK on both Solaris and Linux (I have tested this). The problem I am having is a little more subtle. First, though, let me give the wrapper that I have so far:


# specify a handler for SIGINT and SIGTERM
sigtermHandler() {
      # find and kill all the children of the GIVEN process.
      echo "Killing children of $PID"
      perl killAllChildren.pl "$PID"       # killAllChildren.pl is the Perl script
      echo "Killed children of $PID"
}

# this traps SIGINT (a Ctrl-C from a terminal) and SIGTERM
trap sigtermHandler 2 15                   # Solaris
trap sigtermHandler SIGINT SIGTERM  # Linux

# check if there is anything to execute
if test $# -eq 0
then
      echo "Usage: `basename $0` <path to executable>"
      exit 1;
fi

# fork off the required process
exec "$@" &

# remember its PID for the handler
PID=$!

# wait for it to finish, or a SIGTERM to come.
wait $!

The idea is as follows: the execution wrapper blindly executes all the arguments it is given and then waits for either the executed executable (whose PID is $PID) to return or a termination signal to be received. If a termination signal is received, the signal handler sigtermHandler() runs the Perl script, which figures out and kills all the children that the process with PID $PID might have spawned. Well, that is what I would like it to do.

The problem is that, under Linux, when the signal handler runs, the children of the process executed by the wrapper are already orphaned! I found this by running `ps` inside the signal handler. Because the children are already orphaned (they are not under Solaris!) by the time the handler starts executing, the perl script is unable to find the children and they remain running! It seems like the process $PID that is spawned by the execution wrapper gets killed before the execution wrapper enters the handler! Why?

So, I need to have the following two questions answered:
a) Why does Linux behave like this? I.e. what is the *exact* behaviour? Why does $PID get killed before the signal handler? and
b) How can I remedy this? I.e. obtain the output of `ps -ef` *before* the process with pid $PID is killed?

I enclose the code for the Perl script killAllChildren.pl below, to help testing.



killAllChildren.pl (feel free to remove printouts - there's too much)

use Data::Dumper;

print "killAllChildren.pl running\n";

# this is the pid whose children we are going to find.
my $pid = $ARGV[ 0 ];

die "No parent pid specified.\n" if !defined $pid || $pid eq "";

print "PID is $pid.\n";

# get information about all processes.
my $cmd = `ps -ef`;

# turn them into an array of lines.
my @lines = split( /\n/, $cmd );

# this is a map which will map a pid to an array of its children.
my $map = {};

# process the array of lines. we only need pids and parent pids
# from this we can construct the pid -> { children } map above.
foreach my $line ( @lines ) {
      # pattern match on the UID, PID and PPID columns; skip the
      # header and any other line that does not match
      next unless $line =~ /^\s*\S+\s+(\d+)\s+(\d+)/;

      my $cpid  = $1;      # child pid
      my $ppid  = $2;      # parent pid

      # create an empty child set if we have
      # not seen this parent pid before
      $map->{ $ppid } = [] if( !defined $map->{ $ppid } );

      # append the child to the child array for $ppid
      push @{ $map->{ $ppid } }, $cpid;
}

print Dumper( $map );

# create a new array and initialise it with $pid's children
# (empty if $pid has no children)
my @all_children = @{ $map->{ $pid } || [] };

print "Initial all_children:\n";
print Dumper( @all_children );

# iterate through the array, adding children of the
# processes already in the array, until we have no more
# children to append.

for ( my $i = 0; $i < @all_children; $i++ ) {
            my @child = @{ $map->{ $all_children[$i] } || [] };
            push @all_children, @child;
}

print "Processes to kill: \n";
print Dumper( @all_children );

# kill the children
for ( my $i = 0; $i < @all_children; $i++ ) {
            print $all_children[ $i ] . "\n";
            `kill -9 $all_children[ $i ]`;
}
print "killAllChildren.pl done.\n";

1 Solution
Hi tikiliainen,
>  `kill -9 $all_children[ $i ]`;
Perl has a built-in kill command.

But the real point here is that iterating over all children is probably not a good idea (and neither is using -9).

Check process group IDs. This works on Solaris:

ps -ef -o pid,ppid,pgid | grep $$ ; echo $$

You can kill the entire process group by providing a parent PID, like:

ps -ef -o ppid,pid | grep ^$$

This gets all direct descendants of the current shell. To kill them, do:

ps -ef -o ppid,pid | grep ^$$ | awk '{print $2}' | xargs kill

In case you want to replace the $$ by something else, make sure you're using a grep regex like

grep "^$pid "

Otherwise you could kill the wrong processes: without the trailing space, a pattern for PID 123 also matches a 4-digit PID such as 1234.
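
The prefix-matching pitfall can also be avoided by comparing the PPID column exactly instead of using a regex. This is a minimal sketch, assuming `ps` accepts the POSIX `-o ppid= -o pid=` form (the empty headers suppress the title line). `children_of` is a hypothetical helper name; it reads "PPID PID" pairs from standard input so that it can be checked against canned data:

```shell
# children_of PARENT: read "PPID PID" pairs on stdin and print the PIDs
# whose parent is exactly PARENT (numeric comparison, no prefix matching).
children_of() {
    awk -v parent="$1" '$1 == parent { print $2 }'
}

# Possible use against the live process table:
#   ps -e -o ppid= -o pid= | children_of "$$"
```

Like the grep pipeline above, this only lists direct children.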

tikiliainen (Author) commented:
Hello Stefan,

My problem is that I need to kill *all* the children -- not just the direct descendants. What you describe will kill only the direct descendants. I would not use a perl script otherwise. ;-)

The whole point is to gather the entire list of children from the process tree and I need to do this without using pstree (Linux) or ptree (Solaris).


ps -ef -o ppid,pid | grep ^$$

only provides the list of direct descendants.
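
For comparison, the whole subtree can be collected from plain `ps` output without pstree or ptree. This is a rough sketch of the same breadth-first walk that killAllChildren.pl performs, written with awk. `descendants_of` is a hypothetical name, and the function assumes its input is "PID PPID" pairs, one per line, such as `ps -e -o pid= -o ppid=` prints on systems that support the POSIX `-o` option:

```shell
# descendants_of ROOT: read "PID PPID" pairs on stdin and print every
# descendant of ROOT (children, grandchildren, ...), one PID per line.
descendants_of() {
    awk -v root="$1" '
        # parent PID -> space-separated list of child PIDs
        # (a process listed as its own parent is skipped to avoid a cycle)
        $1 != $2 { kids[$2] = kids[$2] " " $1 }
        END {
            queue[1] = root; head = 1; tail = 1
            while (head <= tail) {
                n = split(kids[queue[head++]], c, " ")
                for (i = 1; i <= n; i++) {
                    print c[i]
                    queue[++tail] = c[i]
                }
            }
        }'
}

# Possible use:
#   ps -e -o pid= -o ppid= | descendants_of "$PID"
```

The snapshot is only as good as the moment `ps` ran: processes forked afterwards, or already re-parented because their parent died, will not appear in the list.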

I don't know Perl, but in C and in shell programming you can make processes form a session using the setsid() call. So what you need to do is make the main process a session leader using the setsid command, and install some kind of signal handler or at-exit function (if Perl offers one) that is called when the main process exits.

If it were in C, I would do something like:

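#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* called when the main process exits */
void bye(void)
{
  struct sigaction sa;
  sa.sa_handler = SIG_IGN;          /* do not let our own SIGTERM kill us */
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction(SIGTERM, &sa, NULL);
  kill((pid_t) 0, SIGTERM);
  /* signals every process in our process group, whether direct children or not */
  printf("bye bye %d \n", (int) getpid());
}

int main()
{
  atexit(bye);  /* function to be called when the main process exits */

  /* do the other process creation here */

  return 0; /* the atexit handler is called here or any other place where the process exits normally */
}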


tikiliainen (Author) commented:
I am not sure about using setsid in shell scripts. As far as I understand, my wrapper script (and it has to be a script, not a C binary) should perform a


to become the session (or is it group? is there a difference between the two notions?) leader and then the SIGTERM handler should do a

kill -9 0

to kill all members of the session. Is this correct? I could not get setsid to work in the way that I would expect it to. Man page is useless. Grrr...
Don't use -9 (SIGKILL) unless you have to. Many programs have cleanup code which kills children when you stop them with SIGTERM (kill -15 or without arguments).
tikiliainen (Author) commented:

I realise the difference between SIGTERM and SIGKILL. However, in this context the assumption is that a SIGTERM is sent to the wrapper script when it is hanging, i.e. something has gone seriously wrong with one of the children, and SIGKILL is the intended signal to be sent to the children that are still running.