Solved

Prevent shell script running twice on Red Hat Linux 5

Posted on 2011-09-16
Last Modified: 2012-06-27
Hi Folks,

I have a ksh shell script named "appcrt" which takes arguments from the command line and runs managed servers. The script first checks whether the process is already running; if it is, the script exits. The script uses a "while true" loop to start the managed server, so if the managed server crashes, the script that started it will start it again.
The script takes 3 arguments: first the environment, then the managed server name, and then the action (start/stop).

I have one admin server and 3 managed servers. So, if I need to start the admin server I would type
$appcrt prd jasdom_a1 start

The script runs up to the RUNNING command (please see the script) and checks whether "appcrt prd jasdom_a1 start" is already running by counting the matches from ps -ef | grep -c "appcrt prd jasdom_a1 start". If the count is greater than 1 it exits; otherwise it starts the managed server.

This script has been running fine on Solaris 10, but I am trying to get it working on Red Hat Linux 5 and it is causing issues.
For some reason, on Linux, when I run the command to start the admin server using:
$appcrt prd jasdom_a1 start
bash-3.2$ ./appcrt prd jasdom_a1 start
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start
jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start

It returns 2 processes for the RUNNING check (for debugging I am echoing the RUNNING output), while on Solaris it returns 1, which is what we expect.

Could you please help find the issue here?

SCRIPT:
#! /bin/ksh
umask 022

PATH=/usr/local/bin:/bin:/usr/bin
RUNAS=jasadm
APP=jas
HOSTNAME=`/bin/hostname`;

while getopts :d arg1
do
  case $arg1 in
      d) DEBUG=1;;
  esac
done
shift OPTIND-1

if test -n "$1" && test -n "$2" && test -n "$3" ; then
    ENV_NAME=$1
    INSTANCE=$2
    ACTION=$3
else
    echo Usage:
    echo
    echo "      $0 [-d] [environment] [instance] [start|stop|restart] "
    echo
    echo Some examples:
    echo
    echo "      $0 prd jasdom_a1 start"
    echo "      $0 prd jasdom_m1 start"
    echo "      $0 prd jasdom_m2 start"
    exit 1
fi

DOMAIN=$(print $INSTANCE|awk -F\_ '{print $1}')
JAVA_HOME=/apps/jas/prd/wl9config/jdk150_12
BEA_HOME=/apps/jas/prd/wl9config/
WL_HOME=$BEA_HOME/weblogic92
CONFIG_HOME=/apps/$APP/$ENV_NAME/wl9config/$DOMAIN


JVM="java"
JVM_TYPE="-hotspot"
JVM_TYPE="-server"
JVM_MEM="-ms128m -mx128m -XX:MaxPermSize=32m -XX:NewSize=32m"



CP=$WL_HOME/server/lib/weblogic_sp.jar:$WL_HOME/server/lib/weblogic.jar:$WL_HOME/server/lib/webservices.jar
POST_CP=$JAVA_HOME/lib/tools.jar

CLASSPATH=$CP:$POST_CP

case "$ENV_NAME" in
qa)
;;
prd)
;;
esac


PATH=$WL_HOME/server/bin:$JAVA_HOME/jre/bin:$JAVA_HOME/bin:$PATH
STARTMODE=true
WLS_USER=weblogic
WLS_PW=weblogic1
export WLS_USER WLS_PW STARTMODE PATH CLASSPATH LD_LIBRARY_PATH
export ENV_NAME ENV_HOME INSTANCE
export BEA_HOME JAVA_HOME L


ulimit -n 1024


case "$INSTANCE" in

jasdom_a1)
    PORT=7210
    JVM_MEM="-ms1024m -mx1024m"
    SERVER_TYPE=admin
    HOST=prdcd1-jaswap01.svr.us.xcrom.net
;;
jasdom_m1)
    PORT=7211

    JVM_MEM="-ms2048m -mx2048m"
    ADMINURL=prdcd1-jaswap01.svr.us.xcrom.net:7210
    SERVER_TYPE=managed
    HOST=prdcd1-jaswap01.svr.us.xcrom.net

;;
jasdom_m2)
    PORT=7212
#JVM_MEM="-ms1024m -mx1024m
    JVM_MEM="-ms1024m -mx1024m"
    ADMINURL=prdcd1-jaswap01.svr.us.xcrom.net:7210
    SERVER_TYPE=managed
    HOST=prdcd1-jaswap01.svr.us.xcrom.net
;;
jasdom_m5)
    PORT=7213
    JVM_MEM="-ms2048m -mx2048m"
    ADMINURL=prdcd1-jaswap01.svr.us.xcrom.net:7210
    SERVER_TYPE=managed
    HOST=prdcd1-jaswap01.svr.us.xcrom.net

;;
*)
    echo $0: Error: Unknown environment/application combination.
    exit 1
esac

JAVA_STOP_COMMAND="$JVM weblogic.Admin -url $HOST:$PORT FORCESHUTDOWN -username $WLS_USER -password $WLS_PW"
JAVA_PING_COMMAND="$JVM weblogic.Admin -url $ADMINURL  -username $WLS_USER -password $WLS_PW ping "

case $ACTION in
start)

            cd $WL_HOME
     STRING2ADD=" "

      # Special checks for managed instances
      #
    if [ "$SERVER_TYPE" = "managed" ]
    then
        STRING2ADD=" -Dweblogic.management.server=${ADMINURL}"

        RS=$($JAVA_PING_COMMAND 2>&1)

        if [ $(print $RS | grep -c "RTT = ") -eq 0 ]
        then
                print "
                        ==================================
                        Error : Admin Server unavailable.
                        ==================================
                        Status          : Admin Server UNREACHABLE
                        Action          : Aborting this script...
                        ========================================================="
                exit
        fi

    else
        STRING2ADD=" "
    fi

    RUNNING=`/bin/ps -ef | /bin/egrep -c "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
    RUNNING1=`/bin/ps -ef | /bin/egrep  "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
    echo $RUNNING1
    if [ "$RUNNING" -gt 1 ]; then
        echo $0: The start script for this server is still running,
        echo $0: and will restart weblogic automatically if it exits.
    else
        echo WebLogic output redirected to $WL_OUT
        (
        while true
        do
               
                WL_ARGS="$JVM_TYPE -showversion $JVM_MEM -classpath $CLASSPATH \
                $DEBUG_ARGS \
                $WL_OPTION \
                -client
                -verbose:gc \
                -XX:+PrintGCTimeStamps \
                -XX:+PrintGCDetails \
                  -XX:SurvivorRatio=8 \
                -XX:CompileThreshold=8000 \
                -XX:PermSize=48m \
                -XX:MaxPermSize=128m \
                -Xverify:none \
                -da \
                -Dibportal.version=$ENV_NAME \
                -Dibportal.logDir="$LOG_HOME/" \
                -Djava.awt.headless=true \
                -Duser.home=$WEBAPP_HOME/WEB-INF/config \
                -Dweblogic.RootDirectory=$CONFIG_HOME\
                -Dweblogic.Name=$INSTANCE \
                -Dbea.home=$BEA_HOME \
                -Dweblogic.management.username=$WLS_USER \
                -Dweblogic.management.password=$WLS_PW  $STRING2ADD\
                -Dweblogic.ProductionModeEnabled=$STARTMODE \
                -Djava.security.policy=$WL_HOME/server/lib/weblogic.policy \
                -Dplatform.home=/apps/jas/prd/wl9config/weblogic92 \
                -Dplatform.home=/apps/jas/prd/wl9config/weblogic92 \
                -Dweblogic.management.discover=true
                 weblogic.Server"

            MESSAGE="`date +'<%b %d, %Y %l:%M:%S %p  %Z>'` <Alert> <startWebLogic.sh> <Starting webLogic $INSTANCE> <$HOSTNAME>"

            nohup  $JVM $WL_ARGS >> /dev/null 2>&1

            RETURN_CODE=$?
            MESSAGE="`date +'<%b %d, %Y %l:%M:%S %p  %Z>'` <Alert> <startWebLogic.sh> <Server exited with code: $RETURN_CODE>"
            sleep 3
        done
        )&
    fi
    cd $CONFIG_HOME
;;
stop)
    SCRIPT_PID=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?$ENV_NAME[ \t]+$INSTANCE[ \t]+start" | /usr/bin/perl -e 'print (( split /\s+/, <>)[1])'`
    if [ -n "$SCRIPT_PID" ]; then
        kill -TERM $SCRIPT_PID > /dev/null 2>&1
        RETURN_CODE=$?
    fi
    echo $PORT
    echo $HOST
    $JAVA_STOP_COMMAND
;;
*)
    echo $0: Error: Action "$ACTION" is not supported.
esac




Thank you,
Question by:jayatallen
21 Comments
 
LVL 76

Expert Comment

by:arnold
ID: 36549956
You should use PID files, e.g. /var/run/instance.pid.
Check for the absence of the file prior to starting the instance.

Is the instance part of the data in the ps -ef | grep jasdom?
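
A minimal sketch of the PID-file approach suggested here; it is not taken from the original script. The helper names (pid_file, is_running, write_pid, remove_pid), the PID_DIR variable, and the file naming are all assumptions for illustration. /var/run normally requires root, so the sketch defaults to /tmp.

```sh
# Hypothetical directory for PID files; override PID_DIR as needed.
PID_DIR=${PID_DIR:-/tmp}

# pid_file <env> <instance>: print the PID file path for one instance.
pid_file() {
    echo "$PID_DIR/appcrt.$1.$2.pid"
}

# is_running <env> <instance>: succeed only if the PID file exists,
# is non-empty, and names a process that is currently alive.
is_running() {
    f=$(pid_file "$1" "$2")
    [ -s "$f" ] || return 1
    pid=$(cat "$f")
    kill -0 "$pid" 2>/dev/null
}

# write_pid <env> <instance> <pid>: record a PID, e.g. $! of the
# backgrounded restart loop.
write_pid() {
    echo "$3" > "$(pid_file "$1" "$2")"
}

# remove_pid <env> <instance>: delete the PID file (for the stop action).
remove_pid() {
    rm -f "$(pid_file "$1" "$2")"
}
```

In the start branch, one could call is_running "$ENV_NAME" "$INSTANCE" before the loop and write_pid "$ENV_NAME" "$INSTANCE" $! right after the ")&". Note that kill -0 only proves that some process owns that PID, so a stale file left behind by a crash still needs separate handling.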
0
 

Author Comment

by:jayatallen
ID: 36550175
The instance part is jasdom_a1.
If I chose to start managed server 1, I would use
$appcrt prd jasdom_m1 start

then INSTANCE would be jasdom_m1.
I wonder why it's not working on Linux; it has been running on Solaris and is a reliable script.
Is it possible to find out why ps -ef | grep "appcrt prd $INSTANCE start" returns 2, or why the shell script is being run twice when I ran it only once?
0
 
LVL 9

Expert Comment

by:parparov
ID: 36550189
What shell are you running this under in Solaris?
0
 
LVL 19

Accepted Solution

by:
simon3270 earned 500 total points
ID: 36550216
It's not so much a problem, as a difference in the way Linux and Solaris run shell scripts.

For Linux, the shell you are using (ksh) is a separate process from the actual script (in Solaris they are the same process).  You can see this in the ps output:
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start
jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start
The 25042 process is the shell itself, while the 25794 process (which has a parent PID of 25042) is the script.

There are two fixes required:

- in your "start" processing, change:
    if [ "$RUNNING" -gt 1 ]; then
to
    if [ "$RUNNING" -gt 2 ]; then

and in your "stop" processing, change:
    SCRIPT_PID=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?$ENV_NAME[ \t]+$INSTANCE[ \t]+start" | /usr/bin/perl -e 'print (( split /\s+/, <>)[1])'`

to
    SCRIPT_PID=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?$ENV_NAME[ \t]+$INSTANCE[ \t]+start" | awk '{print $2}'`

This is because there are now 2 numbers reported by egrep, but your perl only prints the first one.
0
 

Author Comment

by:jayatallen
ID: 36550252
ksh. I tried that on Linux too.
I mean I changed my shell using
$/bin/ksh
then pressed Enter
and ran the script. The output still shows 2 processes. As you can see, I am echoing RUNNING above, and it shows 2 processes;
one of them is a child process.
CODE FROM SCRIPT:
RUNNING=`/bin/ps -ef | /bin/egrep -c "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
    RUNNING1=`/bin/ps -ef | /bin/egrep  "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
    echo $RUNNING1

result when i run the script:
bash-3.2$ ./appcrt prd jasdom_a1 start
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start

I don't understand why the shell executed the script twice.
0
 

Author Comment

by:jayatallen
ID: 36550426
Thank you, Simon.

As you can see, the script uses "while true" to start the managed server; if the managed server crashes for some reason, "appcrt prd INSTANCE start" will start the managed server again. If I want to stop the managed server, I stop it using "appcrt prd INSTANCE stop", which kills both the appcrt script and the managed server.

I have tried your suggestion, so that the script checks for
  if [ "$RUNNING" -gt 2 ]; then

The problem is that I have 2 appcrt processes running (after I run the appcrt script), and each process starts a separate managed server. So the end result is that I end up with 2 managed server processes.

One more thing: after a few seconds, one of the appcrt processes goes away, and with it one managed server.
But this is not reliable.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36554255
You can probably leave the $RUNNING check as "-gt 1", since that will catch any time when the start script is actually running (given that the script has two processes in the process list).

If you do end up with two managed processes, the second one probably dies because it is trying to use a resource (e.g. listen on an IP address) which the first one is already using.
0
 

Author Comment

by:jayatallen
ID: 36554433
Hi Simon,

Thanks for your reply.
If I leave
$RUNNING check as "-gt 1"
then the script will exit, as on Linux it creates 2 processes for itself.
Is there a way to stop the shell creating a child process of its own?
I mean, when I type "appcrt  prd  jasdom_a1 start" and then run ps -ef | grep  "appcrt  prd  jasdom_a1 start", it should show only 1 process in the output.
0
 
LVL 76

Expert Comment

by:arnold
ID: 36554609
You would need to add logic to your script, which can be simple: since you are already piping the data from the grep to perl, you can use the perl script to output only the line where the child process is.

Why not send the application into the background or, if you want it to be restarted on exit, enclose it in an infinite while loop?

while true; do
# Do logic and start the application.
# As long as the start process does not go into the background, the process
# will be running. As soon as the application exits or crashes, the loop
# moves along and restarts at the top.

done
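
The restart loop outlined above can be sketched as a small function. This is an illustration under assumptions, not the loop from appcrt: the name supervise and the max_runs cap are hypothetical (the loop in the script is unbounded), and the cap is there only so the sketch terminates.

```sh
# supervise <max_runs> <delay_seconds> <command> [args...]
# Runs the command in the foreground and reruns it each time it exits,
# at most <max_runs> times, sleeping <delay_seconds> between runs.
supervise() {
    max_runs=$1
    delay=$2
    shift 2
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        rc=0
        "$@" || rc=$?          # blocks here while the command is alive
        runs=$((runs + 1))
        echo "run $runs: exit code $rc"
        sleep "$delay"
    done
}
```

For example, supervise 3 1 false prints three lines, each reporting exit code 1. The important detail is that the command is not backgrounded inside the loop, so the loop only advances when the command exits.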
   
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36554668
Sorry, yes, I confused myself - you do need "-gt 2", since when the first command runs, $RUNNING will be 2, but when the second one runs it will be 4.
0
 

Author Comment

by:jayatallen
ID: 36554772
Thank you guys for your replies. I think I didn't state my question clearly.
Arnold:
I am using appcrt to start other Java processes (JVMs). appcrt takes arguments and works accordingly.
When I pass the arguments (basically the name of the JVM and the action), appcrt checks whether the JVM is already running; if not, it starts the given JVM and puts it in the background.
If you look at the start branch of the case statement:

            RETURN_CODE=$?
            MESSAGE="`date +'<%b %d, %Y %l:%M:%S %p  %Z>'` <Alert> <startWebLogic.sh> <Server exited with code: $RETURN_CODE>"
            sleep 3
        done
        )&

With the trailing &, appcrt starts the JVM in the background.
On Solaris, it works perfectly.
If I start a managed server (JVM) on Solaris, I end up with two processes:
1) appcrt prd <INSTANCE> start
2) the java process, which was started by the above process in the background
If for some reason the JVM crashes, process 1 will start it again because of the "while true" loop.

So, if I want to stop/kill the JVM, I use appcrt with the stop action to stop both.
This logic has been working fine.
The only problem is on Linux, because when I start the script, Linux creates 2 similar processes.

For instance, suppose nothing is running on the Linux box and I want to start a managed server. I would type
$appcrt prd jasdom_m1 start

Since nothing was running, the count from
ps -ef | grep "appcrt prd jasdom_m1 start" should be 1. As this is the only one, the script will proceed further and start the managed server (JVM) in the background.

ISSUE:
On Linux, the check ps -ef | grep "appcrt prd jasdom_m1 start" returns 2 (the main problem).
Logic in Script:
CODE FROM SCRIPT:
RUNNING=`/bin/ps -ef | /bin/egrep -c "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
    RUNNING1=`/bin/ps -ef | /bin/egrep  "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
    echo $RUNNING1

For my own understanding, I am echoing RUNNING. I don't understand why Linux shows the output below, saying 2 appcrt processes are running.

result when i run the script:
bash-3.2$ ./appcrt prd jasdom_a1 start
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start

So, is the problem that the Linux shell is running the script twice? I don't know why it's doing that.

Please suggest.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36554925
It is not running it twice.  One "appcrt prd jasdom_a1 start" entry is the call to run a shell script, the other is the script itself being processed.  You will see that the Parent Process ID (the third column of "ps" output) of one entry is the same number as the Process ID (the second column) of the other entry - one is the parent of the other.

That's just the way Linux does it.
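
Because the two entries are a parent/child pair, one way to count only other instances is to drop the line for the script itself and any line whose parent is the script. A minimal sketch, assuming the "ps -ef" column layout shown in this thread (PID in column 2, PPID in column 3); count_other_starts is a hypothetical helper, not part of appcrt.

```sh
# count_other_starts <pattern> <own_pid>
# Reads "ps -ef" style lines on stdin and prints how many lines match
# <pattern> while being neither the process <own_pid> itself nor a child
# forked by it (such as the subshell that runs the ps | egrep pipeline).
count_other_starts() {
    awk -v pat="$1" -v me="$2" '
        $2 != me && $3 != me && $0 ~ pat { n++ }
        END { print n + 0 }'
}
```

In the script this might be used as RUNNING=$(/bin/ps -ef | count_other_starts "appcrt +(-d +)?$ENV_NAME +$INSTANCE +start" $$), followed by a test for "$RUNNING" -gt 0, which would be the same threshold on both systems. One caveat: on Solaris this likely needs nawk or /usr/xpg4/bin/awk, since the legacy /usr/bin/awk does not accept -v.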
0
 

Author Comment

by:jayatallen
ID: 36554969
Is there a way to stop the shell from creating a child shell?
I tried to run the script in the current shell rather than forking a new child shell, like this:
$. ./appcrt jasdom_a1 start

(dot, white space, and then the script.)
This causes weird behavior: the shell starts and terminates the process, and keeps doing so until I kill the ksh process ID that was used to start the script.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36555660
Rather than try to work around this, just modify your script to accept the way Linux works.

If you need the same script to work on Linux and Solaris, set a variable to 2 if uname reports Linux, and 1 otherwise. Then compare $RUNNING against that.
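
A minimal sketch of that uname check. The function name self_count_for and the variable MAX_SELF are hypothetical; the values 2 (Linux) and 1 (everything else) are the ones proposed in this thread.

```sh
# self_count_for <os_name>
# Prints how many "appcrt ... start" lines the script expects to see for
# its own invocation on the given OS.
self_count_for() {
    case "$1" in
        Linux) echo 2 ;;
        *)     echo 1 ;;
    esac
}

MAX_SELF=$(self_count_for "$(uname -s)")
# In the start branch the check would then read:
#   if [ "$RUNNING" -gt "$MAX_SELF" ]; then ...
```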
0
 

Author Comment

by:jayatallen
ID: 36713168
I've requested that this question be deleted for the following reason:

No specific answer was provided. It will confuse others.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36713169
It's not confusing, just different.  If you try to assume that all systems are the same (e.g. Solaris and Linux), you will be bitten by this and other differences.

The solution is to accept that Linux creates two processes and code for that (by changing to "-gt 2", and the change in the "stop" section).
0
 

Author Comment

by:jayatallen
ID: 36713776
Hi Simon,

I didn't mean to offend you, but that was the issue. If I use "-gt 2", I end up with two processes. I didn't find any answer on how I can make this script execute one process only.
The only way to make it work is to take the "while true" section out, but then the script is no longer any good for restarting the process automatically.

0
 

Author Closing Comment

by:jayatallen
ID: 36713793
The suggestion provided helps but doesn't eliminate the original issue.
0
 
LVL 9

Expert Comment

by:parparov
ID: 36714199
You can only eliminate the issue by eliminating one of the environments.

The solution I employed in a similar situation is as follows: have two different configs, one for Solaris and one for Linux, in which you may have
MAX_PROCESSES=2
on Linux, and
MAX_PROCESSES=1
on Solaris. Source it via
. config.sh
and use
if [ "$RUNNING" -gt "$MAX_PROCESSES" ]; then

But the healthiest way remains to use PID files.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36716238
The script is running two processes because that's the way Linux organises it.  As you suggest, the while loop seems to be the trigger for this.

The two processes are not simply two versions of the same program running - they are a parent+child pair, so that one runs the script itself, and the other looks after the backgrounded while loop.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 36716519
PID files could, as parparov suggests, be a more reliable way of finding the original program, but they do suffer from the problem that the PID alone does not identify a process.  For example, a process creates its PID file, runs for a very long time, then crashes, leaving its PID file still present.   Since the PID is generated from a limited set of values (from 1 to 32768 by default on Linux), it is possible that the PID has wrapped round, and the same PID has been allocated to some new process, entirely unrelated to the original one.  This may seem like a theoretical problem, but I have been bitten by it in the past (on a system doing enormous numbers of compilations, where the PID wrapped round twice per day).
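
One common guard against that PID-reuse problem is to check the command line of the recorded PID as well as its existence. A minimal sketch, assuming a ps that supports the POSIX options -p and -o args=; pid_matches is a hypothetical helper name.

```sh
# pid_matches <pid> <pattern>
# Succeeds only if <pid> is alive AND its command line matches the
# extended regular expression <pattern>.
pid_matches() {
    args=$(ps -p "$1" -o args= 2>/dev/null) || return 1
    printf '%s\n' "$args" | grep -E -q -- "$2"
}
```

A start script could then treat a PID file as valid only when something like pid_matches "$pid" "appcrt.*$INSTANCE" succeeds, and treat it as stale otherwise. This narrows the window for a false match but does not remove it entirely, since an unrelated process could in principle have a matching command line.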
0
