Solved

Running a background process from a Korn shell function

Posted on 2013-10-25
Last Modified: 2013-11-22
I have a Korn shell script containing the following code:

setAlarm() {
  set -x
  ( sleep $TIMEOUT; kill -USR1 $$ ) &
  echo=$!
  return
}

This function is called in the following statement:

[ "$TIMEOUT" ] && ALARM=`setAlarm $TIMEOUT`

When I run the script with the debug flag, I get the following:

+ [ 60 ]
+ + setAlarm 60
+ sleep 60
+ echo=11075826
+ kill -USR1 7733266
### SIXTY SECOND PAUSE ###
ALARM=
+ soundAlarm
+ /home/155477/NetBackup/pre-exec 120 0

The problem is that between the statement "kill -USR1 7733266" and "ALARM=," there's a sixty second pause. My expectation is that the ALARM variable assignment would happen immediately after the setAlarm function returns. Not only is that statement not executed when expected, but the assignment is null!? The setAlarm function returns the PID of the command group that it runs in the background. It is that value that should get assigned to the variable named ALARM. Not only does the code wait for a function that appears to return immediately, but the value that function returns is lost.
Question by:babyb00mer
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39601703
To populate the variable you should use

echo $!

instead of "echo=$!"

The delay should occur between "return" and "kill -USR1 ...", but not at all between "kill ..." and the start of your signal handler routine (or the debug display of the ALARM variable).
Strange that "return" does not appear in the debug log. Are you sure that you're posting the actual code?

By the way, inside the function "sleep $1" should suffice, since you're passing the timeout value to it as a parameter.

wmp
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39601771
<< a function that appears to return immediately >>

The function, if called on its own, would indeed return immediately, but the background job might not find a process to signal with USR1 (the parent process might well have ended in the meantime).

When used in a variable assignment via command substitution the assignment can only take place when everything which is needed for it has finished.

The shell has to wait for everything inside the function to finish in order to be able to decide what the full output of the function might be.
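
A note on the mechanics, as far as I understand them: the shell reads the function's output through a pipe, and the background group inherits the write end of that pipe, so the substitution cannot see end-of-file until the group exits. The following is a minimal sketch, not the original script. The helper name start_timer is made up, and it assumes a "date +%s" that prints epoch seconds. Redirecting the background group's output away from the pipe lets the substitution return right away.

```shell
# start_timer is a hypothetical name, not part of the original script.
trap 'echo "alarm received"' USR1

start_timer() {
  # Detach the background group from the substitution's output pipe.
  ( sleep "$1"; kill -USR1 $$ ) >/dev/null 2>&1 &
  echo $!
}

before=$(date +%s)
timer_pid=$(start_timer 5)
after=$(date +%s)
elapsed=$((after - before))
echo "got PID $timer_pid after ${elapsed}s"

# Cancel the demo timer so it never fires.
kill "$timer_pid" 2>/dev/null
```

If the redirection is removed, the assignment should only complete once the five seconds are over, which matches the pause described in the question.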
 

Author Comment

by:babyb00mer
ID: 39601820
Oops! I spoke too soon! The statement called by the function is still not behaving as a concurrent process. In the debug output, it pauses right after the return statement. I know I've seen this work, but I wrote that code quite a few years ago. I'm pretty sure I can access it, but I'll have to wait until I get home.

setAlarm() {
  set -x
  ( sleep $1; kill -USR1 $$ ) &
  echo $!
  return
}

+ echo 15:05:19 [shtest_25Oct13] StreamNumber1 has not finished
+ 1>> /home/155477/NetBackup/logs/Just.Testing.FULL/queue.102513
+ [[ -n 15 ]]
+ + setAlarm 15
+ sleep 15
+ echo 8585372
+ return
+ kill -USR1 7733408
ALARM=8585372
+ soundAlarm
+ /home/155477/NetBackup/pre-exec 30 0
+ 1> /dev/null 2>> /home/155477/NetBackup/logs/Just.Testing.FULL/log.102513

Hmm. I've got another idea!
 

Author Comment

by:babyb00mer
ID: 39601846
Okay. I had to replace this...

ALARM=`setAlarm $TIMEOUT`

with...

setAlarm $TIMEOUT
ALARM=$!

I think it's working now.
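
Why this works, sketched under the assumption that the function is called directly rather than inside backquotes: "$!" belongs to the current shell, and a directly called function runs in that same shell, so the value set inside the function is still there after it returns. The function name start_job is hypothetical.

```shell
# start_job is a hypothetical name used only for this illustration.
start_job() {
  sleep 5 >/dev/null 2>&1 &
}

start_job          # called directly, so no subshell is involved
job_pid=$!         # still holds the PID of the job started in the function

if kill -0 "$job_pid" 2>/dev/null; then
  job_state=running
else
  job_state=gone
fi
echo "job $job_pid is $job_state"

kill "$job_pid" 2>/dev/null   # clean up the demo job
```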
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39601855
This works when the function contains

return $!

because

echo $!

doesn't make sense anymore then.
 

Author Comment

by:babyb00mer
ID: 39605981
Hmm. Isn't the return statement updating the status register, so that its value must therefore be between zero and two hundred and fifty-five?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39606036
Yep, you're right, in two ways.

First, of course, you can only return values between 0 and 255. Seems I'm getting too old (or demented) ...

Second, I obviously misread your solution, by assuming "$?" instead of "$!" - or did you edit the comment?? Anyway, since "$!" doesn't get overwritten by any other subprocess you can well refer to this variable outside of the function, so your solution will indeed work. Good job!

But there's still the problem that the process to kill with USR1 (the function's parent process $$) might well have ended in the meantime, so the signal handler for USR1 would never fire.
The script must survive for more than TIMEOUT seconds (by doing other work?) after the function call to get the signal handled.
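
To illustrate the timing requirement, here is a small sketch in which the main script keeps working longer than the timeout, so the handler has a live process to run in. The names on_alarm and fired are made up, and the timeout is shortened to one second for the demo.

```shell
# Hypothetical names; the timeout is shortened to 1 second for the demo.
fired=no
on_alarm() { fired=yes; }
trap on_alarm USR1

TIMEOUT=1
( sleep "$TIMEOUT"; kill -USR1 $$ ) >/dev/null 2>&1 &

# Main work: three one-second steps, so the script outlives the timeout.
step=0
while [ "$step" -lt 3 ]; do
  sleep 1
  step=$((step + 1))
done
echo "handler fired: $fired"
```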

wmp
 

Author Comment

by:babyb00mer
ID: 39606810
If I'm following you correctly, I have made provisions for terminating the background job when the parent exits. There's a trap command that I did not include in the sample code.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39607086
A trap command for USR1 will have nothing to trap (the kill -USR1 $$ will have no target) if $$ (the PID of the shell calling the function) has exited during the TIMEOUT period.

I assume there is also a "trap ... EXIT" (or "trap ... 0"), or how did you make it work?
 

Author Comment

by:babyb00mer
ID: 39607220
I did find my 10-year-old code this weekend. It's residing on a Sun Ultra5 running Solaris 8!

Anyway, to answer your question, here's the statement:

trap "[ \"\$ALARM\" ] && ps -p \$ALARM >/dev/null 2>&1 && cancelAlarm" EXIT


The cancelAlarm function looks like this:

cancelAlarm() {
  kill -KILL $ALARM
}

 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39607332
OK, makes sense.

As I guessed, there is an EXIT trap whose action consists of first testing the variable $ALARM for not being NULL, then checking "ps" for a process whose PID is stored in $ALARM and, if both tests succeed, calling cancelAlarm().

cancelAlarm() kills $ALARM which is the PID of the background process started by the function setAlarm() as soon as the main script exits (because it's initiated by an EXIT trap).
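
The same idea can be tried out in isolation. In this sketch a child shell stands in for the main script so that its EXIT trap can be observed from outside. Note the swap: it checks the timer with "kill -0" instead of the "ps -p" test used in the thread, and the names on_exit and child_output are hypothetical.

```shell
# A child shell stands in for the main script so its EXIT trap can be observed.
child_output=$(sh -c '
  on_exit() {
    # Cancel the timer only if it was started and is still alive.
    if [ -n "$ALARM" ] && kill -0 "$ALARM" 2>/dev/null; then
      kill "$ALARM" && echo "alarm cancelled"
    fi
  }
  trap on_exit EXIT

  ( sleep 5; kill -USR1 $$ ) >/dev/null 2>&1 &
  ALARM=$!
  echo "alarm started"
  # The script ends here, well before the 5 seconds are up.
')
echo "$child_output"
```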

Nice work!
 

Author Comment

by:babyb00mer
ID: 39609961
Except...

The trap for the USR1 signal looks like this...

trap soundAlarm USR1


The evidence is that the parent sees the signal, but it doesn't execute the corresponding function until the parent process exits. That defeats the entire purpose because the alarm is intended to signal that the parent process is running long. The soundAlarm function looks like this...

soundAlarm() {
  echo $ALARM_MESSAGE | mailx -s "$ALARM_SUBJECT" $RECIPIENTS
}


I need the soundAlarm function invoked as soon as the parent gets the USR1 signal. Either the parent is ignoring all signals until it terminates, or it's postponing execution of the function until it terminates. I know it's receiving the signal, because the soundAlarm function is executed when the parent terminates.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39610135
I'm not sure if I can understand all implications.

Could you please have a look at the following test piece?

If the parent runs longer than the function (MAINDELAY>TIMEOUT) it runs the "USR1" handler just fine and timely (after TIMEOUT has passed), but the EXIT handler doesn't find anything to kill (as expected).

If the function runs longer than the parent (MAINDELAY<TIMEOUT) it runs the "EXIT" handler timely (kills what's running in the function), but never reaches the USR1 handler (also as expected).

#!/usr/bin/ksh
set +xv
trap usr1_handler USR1
trap exit_handler EXIT

usr1_handler() {
echo in USR1 signal handler
echo $ALARM
}

exit_handler() {
echo in EXIT signal handler
kill $ALARM
}

setAlarm() {
  set +xv
  echo in function
  ( for i in $(seq 1 $1); do echo "F$i \c"; sleep 1; done; kill -USR1 $$ ) &
  echo $!
  return
}

MAINDELAY=30
TIMEOUT=10

setAlarm $TIMEOUT

ALARM=$!
echo in main
echo $ALARM
for i in $(seq 1 $MAINDELAY); do echo "M$i \c"; sleep 1; done

 

Author Comment

by:babyb00mer
ID: 39628832
Okay. I took your example and modified it a bit, but I think we've proven that the concept is viable. The problem I'm having is that the solution isn't scalable. Specifically, when I add code to the program, processing of the signal seems to get postponed until the program terminates. Obviously there's something in the code that I'm adding that's interfering with interrupt processing. It's as though the signal is being queued. The difference between the example you devised and the production code is a really gnarly nested 'if' statement - which looks something like this:

if ...; then
    .
    .
elif ...; then
    .
    .
else
    if <call_to_an_external_script_goes_here>; then
        .
        .
    else
        .
        .
    fi
fi

When the logic falls through the if statement, the program is done; which appears to be when my signal finally gets processed. I wonder if it would make a difference if I changed the conditional as follows:

<call_to_an_external_script_goes_here>
if (( $? == 0 )); then
    .
    .
else
    .
    .
fi
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39628890
I think you're still not showing the full picture.

Let's say we create an external script called ext_script reading:
#!/bin/ksh
 for i in $(seq 1 $1); do echo "F$i \c"; sleep 1; done;

and we modify the original script like this (line 16):
#!/usr/bin/ksh
set +xv
trap usr1_handler USR1
trap exit_handler EXIT
usr1_handler() {
echo in USR1 signal handler
echo $ALARM
}
exit_handler() {
echo in EXIT signal handler
kill $ALARM
}
setAlarm() {
  set +xv
  echo in function
  ( ext_script $1; kill -USR1 $$ ) &
  echo $!
  return
}

MAINDELAY=30
TIMEOUT=10
setAlarm $TIMEOUT
ALARM=$!
echo in main
echo $ALARM
for i in $(seq 1 $MAINDELAY); do echo "M$i \c"; sleep 1; done

then the whole thing behaves just the same way as before. Even the gnarliest "if" construct shouldn't change anything in that respect.

By the way, if you had to modify my example because you don't have the "seq" utility I'd strongly suggest installing the GNU "coreutils" package from Michael Perzl's collection. You won't regret it!
 

Author Comment

by:babyb00mer
ID: 39628925
I don't think I've ever seen the seq operator. How does that work?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39628943
"seq" is not an operator, it's a tool.

It writes to stdout sequential numerical values whose boundaries are specified by the positional parameters (1) start and (2) end (ascending/descending).

If there are three parameters then the second value is interpreted as the increment which will otherwise default to "1".

"seq 1 10" will show

1
2
3
4
5
6
7
8
9
10

With GNU seq, counting down needs an explicit negative increment, so "seq 10 -1 1" will show

10
9
8
7
6
5
4
3
2
1

"seq 1 2 10" will show

1
3
5
7
9

and finally "seq 10 -2 1" will show

10
8
6
4
2

Thus

for i in $(seq 1 10)

is the same as

for i in 1 2 3 4 5 6 7 8 9 10

The nice thing is that you don't have to know the boundaries in advance - just specify a variable/variables instead of the parameter(s).
 

Author Comment

by:babyb00mer
ID: 39631022
Hmm. It would appear that seq is not among the tools I have at my disposal...

ksh: seq:  not found

 Is it a shell built-in or a transient command?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39631218
As I said above, it's in the "coreutils" RPM available at http://www.perzl.org or in the AIX toolbox.
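
Where installing the package is not an option, a plain shell loop can stand in for the simple ascending case. This is only a sketch: count_up is a hypothetical helper, it handles a step of 1 only, and it uses printf in place of the "\c" echo from the test script.

```shell
# count_up is a hypothetical stand-in for "seq FIRST LAST" (step of 1 only).
count_up() {
  n=$1
  while [ "$n" -le "$2" ]; do
    echo "$n"
    n=$((n + 1))
  done
}

# Used the same way as $(seq 1 5) in the test script.
for i in $(count_up 1 5); do
  printf 'M%s ' "$i"
done
echo
```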
 

Author Comment

by:babyb00mer
ID: 39634682
So, here is the piece of code I've been testing with:
COUNTDOWN=20
while (( COUNTDOWN > 0 )); do
  echo Still running...
  sleep 1
  (( COUNTDOWN -= 1 ))
done

If I run this code from the parent process, the signal is acknowledged when it is sent...
In main function...
in setAlarm function
Back in main function...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
in USR1 signal handler
Received USR1 at 11:25:15 ...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
+ echo in EXIT signal handler
+ 1>& 2
in EXIT signal handler
+ [[ -n 7995498 ]]
+ ps -p 7995498
+ 1> /dev/null 2>& 1

If, on the other hand, I move this code to an external file, the signal is not processed until the parent terminates...
In main function...
in setAlarm function
Back in main function...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
Still running...
in USR1 signal handler
Received USR1 at 11:28:56 ...
+ echo in EXIT signal handler
+ 1>& 2
in EXIT signal handler
+ [[ -n 7995526 ]]
+ ps -p 7995526
+ 1> /dev/null 2>& 1

I've turned the code every which way but loose. I even tried running the external file as an argument to the exec command. I'm going to have to concede defeat on this one.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39634717
Seems that you're calling the external script from "main" instead of from inside the function as I assumed.

That makes a big difference, since this call is not executed in the background. The main script has to wait for the external script to return before any signals can be handled.
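
The deferral can be reproduced with a few lines. In this sketch the signal is sent after one second, while the shell is still waiting for a four-second foreground "sleep" that stands in for the external script. The names on_alarm and handled_after are made up, and "date +%s" is assumed to print epoch seconds.

```shell
# Hypothetical demo: the signal is sent after 1 second, but the shell is
# busy waiting for a 4-second foreground command.
start=$(date +%s)
handled_after=""
on_alarm() { handled_after=$(( $(date +%s) - start )); }
trap on_alarm USR1

( sleep 1; kill -USR1 $$ ) >/dev/null 2>&1 &

sleep 4      # foreground child; the trap has to wait for it
echo "handler ran about ${handled_after}s after start"
```

The reported delay should be close to the length of the foreground command rather than to the one-second timer.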

"exec" cannot help because the shell running the script is then replaced by the external command, so all of its signal handling is lost.

I think you're right, we'll not get any further here without a bare metal redesign of the entire logic.
 

Author Comment

by:babyb00mer
ID: 39634725
Yeah, that exec command thing was pretty desperate, huh? LOL
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39634788
One idea though - if you run the external script in background and issue two (!) "wait" statements - what happens?

I have this:

#!/usr/bin/ksh
set +xv
trap usr1_handler USR1
trap exit_handler EXIT
usr1_handler() {
echo in USR1 signal handler
echo $ALARM
}
exit_handler() {
echo in EXIT signal handler
kill $ALARM
}
setAlarm() {
  set +xv
  echo in function
  ( ext_script $1 F; kill -USR1 $$ ) &
  echo $!
  return
}

MAINDELAY=30
TIMEOUT=10
setAlarm $TIMEOUT
ALARM=$!
echo in main
echo $ALARM
ext_script  $MAINDELAY M &
wait  ; wait


and get this:

in function
893148
in main
893148
F1 M1 F2 M2 F3 M3 F4 M4 F5 M5 F6 M6 F7 M7 F8 M8 F9 M9 F10 M10
in USR1 signal handler
893148
M11 M12 M13 M14 M15 M16 M17 M18 M19 M20 M21 M22 M23 M24 M25 M26 M27 M28 M29 M30
 in EXIT signal handler
kill: 893148: 0403-003 The specified process does not exist.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39634811
Forgot to mention - the external script now looks like this to make it more flexible in regard to the displayed prefix:

 for i in $(seq 1 $1); do echo "$2$i \c"; sleep 1; done;
 

Author Comment

by:babyb00mer
ID: 39634897
Hmm. Interesting. I had tried a single wait statement without success. It appears that two waits might be working. Please enlighten me.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39635025
I think I'll need a bit of enlightenment myself.

"wait" without parameter should wait for all subprocesses to terminate, but this doesn't seem to be quite true.

Rather it seems that when the first one of these subprocesses terminates the EXIT handler is triggered, unless there's another "wait".

Instead of two times "wait" without parameter you can also use

wait $ALARM; wait $!

which works as well.

The drawback is that the first "wait" must be the one for the shorter running subprocess, which is of course not known in all cases.

Hmm.

I don't have any plausible explanation for this behaviour. It might be just "normal", but if so, why?
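
A likely explanation, going by the POSIX description of "wait": when a signal with a trap arrives while the shell is sitting in "wait", the "wait" returns immediately with a status above 128 and the trap runs. The first "wait" would then be the one cut short by USR1, and the second does the real waiting. The sketch below, with hypothetical names, repeats the "wait" until the job is really gone instead of counting on exactly two calls.

```shell
alarms=0
on_alarm() { alarms=$((alarms + 1)); }
trap on_alarm USR1

# Timer that signals this shell after 1 second.
( sleep 1; kill -USR1 $$ ) >/dev/null 2>&1 &

# Stand-in for the long-running external script.
sleep 3 >/dev/null 2>&1 &
work_pid=$!

waits=0
# Keep waiting as long as the job still exists; a trapped signal may
# cut an individual wait short.
while kill -0 "$work_pid" 2>/dev/null; do
  wait "$work_pid"
  work_rc=$?
  waits=$((waits + 1))
done
echo "waits=$waits alarms=$alarms work_rc=$work_rc"
```

This avoids having to know in advance which of the two background jobs finishes first.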
 

Author Comment

by:babyb00mer
ID: 39666286
I guess you can't have it all. The "wait; wait" statement works, but I also need to be able to capture the exit status of the external command. Currently, I'm running the external command in the background like this:

( ext_command && cancelAlarm || ( cancelAlarm && false ) ) &

I've also tried...

( ext_command && cancelAlarm || ( cancelAlarm; STATUS=1 ) ) &

The STATUS variable is a global defined in the parent process, but the background job doesn't seem to be modifying it. Perhaps there's too much nesting.
 
LVL 68

Accepted Solution

by:woolmilkporc (earned 500 total points)
ID: 39666550
"wait", when issued with a PID operand will return the exit status of the process requested by this operand.

So the version

wait $ALARM; wait $!

could be augmented like this:

wait $ALARM; ALARMRC=$?; wait $!; MAINRC=$?
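
A self-contained sketch of that pattern, with two stand-in background jobs whose exit codes (0 and 7) are chosen arbitrarily for the demo, and no signal traps involved:

```shell
# Two stand-in background jobs with known exit codes (hypothetical values).
( sleep 1; exit 0 ) >/dev/null 2>&1 &
ALARM=$!
( sleep 2; exit 7 ) >/dev/null 2>&1 &
MAIN=$!

# Each wait returns the exit status of the job named by its PID operand.
wait "$ALARM"; ALARMRC=$?
wait "$MAIN";  MAINRC=$?
echo "ALARMRC=$ALARMRC MAINRC=$MAINRC"
```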

Now you have two values to check. But as I said above, the drawback here "is that the first "wait" must be the one for the shorter running subprocess, which is of course not known in all cases."

You could also modify "ext_command" to write its status to a log file, which you could examine once the command has finished.

AFAIK there is no tool other than "wait" that returns the exit code of a background process, and modifications made to variables in a subshell never become known to the calling process (the subshell is a separate process working on its own copy of the variables).

The most common approach here is using a logfile, as described above.
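
A minimal sketch of the file-based approach, which also shows the variable assignment getting lost. It assumes "mktemp" is available, uses "false" as a stand-in for a failing ext_command, and the names status_file and logged_rc are hypothetical.

```shell
STATUS=0
status_file=$(mktemp)    # assumes mktemp is available

# false stands in for a failing ext_command.
( false; rc=$?; STATUS=$rc; echo "$rc" > "$status_file" ) &
wait $!

# The assignment above happened in the background subshell only.
echo "STATUS in parent: $STATUS"

# The file, in contrast, is visible to the parent.
logged_rc=$(cat "$status_file")
echo "status from file: $logged_rc"
rm -f "$status_file"
```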