Solved

Parent script needs to kill all children on SIGTERM

Posted on 2004-04-05
Last Modified: 2012-05-04
I would like to implement a wrapper script that ensures all child processes it has spawned are killed when the wrapper receives a SIGTERM (or a SIGINT). Currently I have a solution which works on Solaris but not (for some reason) on Linux. The wrapper script uses a Perl script which parses the output of `ps -ef`, works out the set of children for a given PID and `kill -9`s them all (I cannot use pstree as it does not exist under Solaris, but `ps -ef` gives similar output on both OSs). The Perl script itself works OK on both Solaris and Linux (I have tested this). The problem I am having is a little more subtle. First though, let me give the wrapper that I have so far:

------------------------------------------------------------------------
executionWrapper.sh
------------------------------------------------------------------------
#!/bin/sh

# specify a handler for SIGTERM
sigtermHandler() {
      # find and kill all the children of the GIVEN process.
      echo "Killing children of $PID"
      perl killAllChildren.pl $PID         # killAllChildren.pl is the Perl script
      echo "Killed children of $PID"
      exit
}

# this traps SIGINT (a Ctrl-C from a terminal) and SIGTERM
trap sigtermHandler 2 15                   # Solaris
trap sigtermHandler SIGINT SIGTERM  # Linux

# check if there is anything to execute
if test $# -eq 0
then
      echo "Usage: `basename $0` <path to executable>"
      exit 1;
fi

# fork off the required process
exec $@ &

PID=$!

# wait for it to finish, or a SIGTERM to come.
wait $!
------------------------------------------------------------------------

The idea is as follows: the execution wrapper blindly executes all the arguments it is given and then waits for either the executed program (whose PID is $PID) to return or a termination signal to be received. If a termination signal is received, the signal handler sigtermHandler() runs the Perl script, which figures out and kills all the children that the process with PID $PID might have spawned. Well, that is what I would like it to do.

The problem is that, under Linux, when the signal handler runs, the children of the process executed by the wrapper are already orphaned! I found this by running `ps` inside the signal handler. Because the children are already orphaned (they are not under Solaris!) by the time the handler starts executing, the perl script is unable to find the children and they remain running! It seems like the process $PID that is spawned by the execution wrapper gets killed before the execution wrapper enters the handler! Why?

So, I need to have the following two questions answered:
a) Why does Linux behave like this? I.e. what is the *exact* behaviour? Why does $PID get killed before the signal handler? and
b) How can I remedy this? I.e. obtain the output of `ps -ef` *before* the process with pid $PID is killed?

I enclose the code for the Perl script killAllChildren.pl below, to help testing.

Thanks,

MT

------------------------------------------------------------------------
killAllChildren.pl (feel free to remove printouts - there's too much)
------------------------------------------------------------------------
#!/opt/bin/perl

use Data::Dumper;

print "killAllChildren.pl running\n";

# this is the pid whose children we are going to find.
my $pid = $ARGV[ 0 ];

# use a string test here: a numeric == "" comparison would also reject pid 0
die "No parent pid specified.\n" unless defined $pid && $pid ne "";

print "PID is $pid.\n";

# get information about all processes.
my $cmd = `ps -ef`;

# turn them into an array of lines.
my @lines = split( /\n/, $cmd );

# this is a map which will map a pid to an array of its children.
my $map = {};

# process the array of lines. we only need pids and parent pids
# from this we can construct the pid -> { children } map above.
foreach my $line ( @lines )
  {
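      # pattern match on "UID PID PPID ..."; skip lines that do not match
      # (such as the header) so that stale captures are never reused
      next unless $line =~ /^\s*\S+\s+(\d+)\s+(\d+)/;

      my $cpid  = $1;      # child pid
      my $ppid  = $2;      # parent pid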

      # create an empty child set if we have
      # not seen this parent pid before
      $map->{ $ppid } = [] if( !defined $map->{ $ppid } );

      # append the child to the child array for $ppid
      push @{ $map->{ $ppid } }, $cpid;
  }

print Dumper( $map );

# create a new array and initialise it with $pid's children
my @all_children = @{ $map->{ $pid } };

print "Initial all_children:\n";
print Dumper( @all_children );

# iterate through the array, adding children of the
# processes already in the array, until we have no more
# children to append.

for ( my $i = 0; $i < @all_children; $i++ )
      {
            @child = @{ $map->{ $all_children[$i] } };
            push @all_children, @child;
      }

print "Processes to kill: \n";
print Dumper( @all_children );

# kill the children
for ( my $i = 0; $i < @all_children; $i++ )
      {
            print $all_children[ $i ] . "\n";
            `kill -9 $all_children[ $i ]`;
      }
      
print "killAllChildren.pl done.\n"
------------------------------------------------------------------------
Question by:tikiliainen
9 Comments
 
LVL 12

Expert Comment

by:stefan73
ID: 10755973
Hi tikiliainen,
>  `kill -9 $all_children[ $i ]`;
Perl has a built-in kill command.

But the real point here is that iterating over all children is probably not a good idea (and neither is using -9).

Check process group IDs. This works on Solaris:

ps -ef -o pid,ppid,pgid | grep $$ ; echo $$

You can also find the processes that have a given parent PID, like:

ps -ef -o ppid,pid | grep ^$$

This gets all direct descendants of the current shell. To kill them, do:

ps -ef -o ppid,pid | grep ^$$ | awk '{print $2}' | xargs kill

In case you want to replace the $$ by something else, make sure you're using a grep regex like

grep "^$pid "

Otherwise a short PID such as 123 would also match longer PIDs such as 1234, and you could kill the wrong processes.

Cheers,
Stefan
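
The exact-match idea above can also be sketched with awk, which compares whole fields instead of text prefixes and is not thrown off by the leading spaces that ps uses to right-align its columns. This is only an illustration: the function name children_of is made up for the example, and the `-o ppid= -o pid=` form (which suppresses the header line) is assumed to be available in the local ps.

```shell
#!/bin/sh
# children_of PARENT: read "PPID PID" lines on stdin and print the PIDs whose
# PPID field equals PARENT exactly (so 123 never matches 1234).
# The name children_of is hypothetical, chosen for this sketch.
children_of() {
    awk -v parent="$1" '$1 == parent { print $2 }'
}

# Possible use against live data (assumes ps accepts "-o name=" to drop
# the header line):
#   ps -e -o ppid= -o pid= | children_of "$$"
```

Piping the result into `xargs kill` would then signal only the direct children of the given PID, as in the pipeline above.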
 

Author Comment

by:tikiliainen
ID: 10756073
Hello Stefan,

My problem is that I need to kill *all* the children -- not just the direct descendants. What you describe will kill only the direct descendants. I would not use a perl script otherwise. ;-)

The whole point is to gather the entire list of children from the process tree and I need to do this without using pstree (Linux) or ptree (Solaris).

Unfortunately,

ps -ef -o ppid,pid | grep ^$$

only provides the list of direct descendants.
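
For reference, here is a minimal sketch of gathering the whole subtree (children, grandchildren and so on) from a PID/PPID table without pstree or ptree, using only sh and awk. The function name descendants and the way the table is passed in are assumptions made for this example; it is not a drop-in replacement for killAllChildren.pl.

```shell
#!/bin/sh
# descendants ROOT TABLE: print every descendant of ROOT, one PID per line.
# TABLE is text with one "PID PPID" pair per line.
# The name descendants is hypothetical, chosen for this sketch.
descendants() {
    printf '%s\n' "$2" | awk -v root="$1" '
        { parent[$1] = $2; pids[n++] = $1 }   # remember each PID and its parent
        END {
            seen[root] = 1
            # keep sweeping the table until a full pass adds nothing new
            do {
                added = 0
                for (i = 0; i < n; i++) {
                    p = pids[i]
                    if (!(p in seen) && (parent[p] in seen)) {
                        seen[p] = 1
                        print p
                        added = 1
                    }
                }
            } while (added)
        }'
}

# Possible use against live data (assumes ps accepts "-o name=" to drop
# the header line):
#   descendants "$PID" "$(ps -e -o pid= -o ppid=)"
```

Because the table is a single snapshot, processes that fork between the `ps` call and the kill can still be missed, and the sketch does nothing about children that are already orphaned (reparented) by the time it runs.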
 
LVL 4

Expert Comment

by:oumer
ID: 10756104
I don't know Perl, but in C and in shell programming you can make processes form a session with the setsid() call. So what you need to do is make the main process a session leader using setsid, and install some kind of signal handler or atexit function (if Perl offers such an option) that is called when the main process exits.

If it were in C, I would do something like:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void bye(void)
{
  struct sigaction sa;

  /* ignore SIGTERM ourselves so the kill() below does not hit this process */
  sa.sa_handler = SIG_IGN;
  sa.sa_flags = 0;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGTERM, &sa, NULL);

  /* pid 0 signals every process in our process group,
     whether they are direct children or not */
  kill((pid_t) 0, SIGTERM);
  printf("bye bye %d \n", getpid());
}

int main(void)
{
  atexit(bye);  /* function to be called when the main process exits */

  setsid();     /* become leader of a new session and process group */

  /* do the other process creation here */

  return 0;     /* bye() is called here, or wherever else the process exits */
}

in
 

Author Comment

by:tikiliainen
ID: 10756239
I am not sure about using setsid in shell scripts. As far as I understand, my wrapper script (and it has to be a script, not a C binary) should perform a

setsid()

to become the session (or is it group? is there a difference between the two notions?) leader and then the SIGTERM handler should do a

kill -9 0

to kill all members of the session. Is this correct? I could not get setsid to work in the way that I would expect it to. Man page is useless. Grrr...
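
A rough sketch of how the session/group idea might look in a shell wrapper follows. It assumes the Linux setsid(1) utility is on the PATH (stock Solaris has no such command), that it is run from a non-interactive script so that `$!` is the PID of the new group leader, and that the children do not move themselves into other process groups. The names start_group and kill_group are invented for the example.

```shell
#!/bin/sh
# Sketch only; start_group and kill_group are hypothetical helper names.

# Start "$@" in the background as leader of a new session and process group.
# Leaves the leader's PID, which is also the process group ID, in $group.
# Assumes setsid(1) exists and that this runs in a non-interactive script.
start_group() {
    setsid "$@" &
    group=$!
}

# Send a signal (default TERM) to every process in process group $1.
# A negative PID argument to kill means "the whole process group".
kill_group() {
    kill -s "${2:-TERM}" -- "-$1"
}

# Possible use inside a wrapper script:
#   start_group "$@"
#   trap 'kill_group "$group" KILL; exit' INT TERM
#   wait "$group"
```

Running the command in its own group means the wrapper itself is not hit by the signal, which `kill -9 0` would not guarantee because it also targets the caller's own group. Note that the group only exists once the background process has actually called setsid, so a signal sent immediately after start_group may find nothing to kill.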
 
LVL 12

Expert Comment

by:stefan73
ID: 10756322
Don't use -9 (SIGKILL) unless you have to. Many programs have cleanup code which kills their children when you stop them with SIGTERM (kill -15, or kill without a signal argument).
 

Author Comment

by:tikiliainen
ID: 10756444
Stefan,

I realise the difference between SIGTERM and SIGKILL. However, in this context the assumption is that a SIGTERM is sent to the wrapper script when it is hanging, i.e. something has gone seriously wrong with one of the children, and SIGKILL is the intended signal to be sent to the children that are still running.
 
LVL 1

Accepted Solution

by:
Computer101 earned 0 total points
ID: 12185414
PAQed - no points refunded (of 400)

Computer101
E-E Admin