jayatallen
asked on
Prevent a shell script from running twice on Red Hat Linux 5
Hi Folks,
I have a ksh shell script named "appcrt" which takes arguments from the command line and runs managed servers. The script first checks whether it is already running and, if so, exits. It uses a "while true" loop to start the managed server, so if the managed server crashes, the script that started it will start it again.
The script takes 3 arguments: first the environment, then the managed server name, and then the action (start/stop).
I have one admin server and 3 managed servers. So, if I need to start the admin server I would type
$appcrt prd jasdom_a1 start
The script runs up to the RUNNING command (please see the script) and checks whether "appcrt prd jasdom_a1 start" is already running by counting the matches from ps -ef | grep -c "appcrt prd jasdom_a1 start". If the count is greater than 1 it exits; otherwise it starts the managed server.
This script has been running fine on Solaris 10, but I am trying to get it working on Red Hat Linux 5 and it is causing issues.
For some reason, on Linux, when I run the command to start the admin server using:
$appcrt prd jasdom_a1 start
bash-3.2$ ./appcrt prd jasdom_a1 start
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1start
jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start
It returns 2 processes for the RUNNING check (for debugging I am echoing the RUNNING output), while on Solaris it returns 1, which is what we expect.
Could you please help find the issue here?
SCRIPT:
#! /bin/ksh
umask 022
PATH=/usr/local/bin:/bin:/usr/bin
RUNAS=jasadm
APP=jas
HOSTNAME=`/bin/hostname`;
while getopts :d arg1
do
case $arg1 in
d) DEBUG=1;;
esac
done
shift OPTIND-1
if test -n "$1" && test -n "$2" && test -n "$3" ; then
ENV_NAME=$1
INSTANCE=$2
ACTION=$3
else
echo Usage:
echo
echo " $0 [-d] [environment] [instance] [start|stop|restart] "
echo
echo Some examples:
echo
echo " $0 prd jasdom_a1 start"
echo " $0 prd jasdom_m1 start"
echo " $0 prd jasdom_m2 start"
exit 1
fi
DOMAIN=$(print $INSTANCE|awk -F\_ '{print $1}')
JAVA_HOME=/apps/jas/prd/wl9config/jdk150_12
BEA_HOME=/apps/jas/prd/wl9config/
WL_HOME=$BEA_HOME/weblogic92
CONFIG_HOME=/apps/$APP/$ENV_NAME/wl9config/$DOMAIN
JVM="java"
JVM_TYPE="-hotspot"
JVM_TYPE="-server"
JVM_MEM="-ms128m -mx128m -XX:MaxPermSize=32m -XX:NewSize=32m"
CP=$WL_HOME/server/lib/weblogic_sp.jar:$WL_HOME/server/lib/weblogic.jar:$WL_HOME/server/lib/webservices.jar
POST_CP=$JAVA_HOME/lib/tools.jar
CLASSPATH=$CP:$POST_CP
case "$ENV_NAME" in
qa)
;;
prd)
;;
esac
PATH=$WL_HOME/server/bin:$JAVA_HOME/jre/bin:$JAVA_HOME/bin:$PATH
STARTMODE=true
WLS_USER=weblogic
WLS_PW=weblogic1
export WLS_USER WLS_PW STARTMODE PATH CLASSPATH LD_LIBRARY_PATH
export ENV_NAME ENV_HOME INSTANCE
export BEA_HOME JAVA_HOME L
ulimit -n 1024
case "$INSTANCE" in
jasdom_a1)
PORT=7210
JVM_MEM="-ms1024m -mx1024m"
SERVER_TYPE=admin
HOST=prdcd1-jaswap01.svr.us.xcrom.net
;;
jasdom_m1)
PORT=7211
JVM_MEM="-ms2048m -mx2048m"
ADMINURL=prdcd1-jaswap01.svr.us.xcrom.net:7210
SERVER_TYPE=managed
HOST=prdcd1-jaswap01.svr.us.xcrom.net
;;
jasdom_m2)
PORT=7212
#JVM_MEM="-ms1024m -mx1024m
JVM_MEM="-ms1024m -mx1024m"
ADMINURL=prdcd1-jaswap01.svr.us.xcrom.net:7210
SERVER_TYPE=managed
HOST=prdcd1-jaswap01.svr.us.xcrom.net
;;
jasdom_m5)
PORT=7213
JVM_MEM="-ms2048m -mx2048m"
ADMINURL=prdcd1-jaswap01.svr.us.xcrom.net:7210
SERVER_TYPE=managed
HOST=prdcd1-jaswap01.svr.us.xcrom.net
;;
*)
echo $0: Error: Unknown environment/application combination.
exit 1
esac
JAVA_STOP_COMMAND="$JVM weblogic.Admin -url $HOST:$PORT FORCESHUTDOWN -username $WLS_USER -password $WLS_PW"
JAVA_PING_COMMAND="$JVM weblogic.Admin -url $ADMINURL -username $WLS_USER -password $WLS_PW ping "
case $ACTION in
start)
cd $WL_HOME
STRING2ADD=" "
# Special checks for managed instances
#
if [ "$SERVER_TYPE" = "managed" ]
then
STRING2ADD=" -Dweblogic.management.server=${ADMINURL}"
RS=$($JAVA_PING_COMMAND 2>&1)
if [ $(print $RS | grep -c "RTT = ") -eq 0 ]
then
print "
==================================
Error : Admin Server unavailable.
==================================
Status : Admin Server UNREACHABLE
Action : Aborting this script...
========================================================="
exit
fi
else
STRING2ADD=" "
fi
RUNNING=`/bin/ps -ef | /bin/egrep -c "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
RUNNING1=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
echo $RUNNING1
if [ "$RUNNING" -gt 1 ]; then
echo $0: The start script for this server is still running,
echo $0: and will restart weblogic automatically if it exits.
else
echo WebLogic output redirected to $WL_OUT
(
while true
do
WL_ARGS="$JVM_TYPE -showversion $JVM_MEM -classpath $CLASSPATH \
$DEBUG_ARGS \
$WL_OPTION \
-client
-verbose:gc \
-XX:+PrintGCTimeStamps \
-XX:+PrintGCDetails \
-XX:SurvivorRatio=8 \
-XX:CompileThreshold=8000 \
-XX:PermSize=48m \
-XX:MaxPermSize=128m \
-Xverify:none \
-da \
-Dibportal.version=$ENV_NAME \
-Dibportal.logDir="$LOG_HOME/" \
-Djava.awt.headless=true \
-Duser.home=$WEBAPP_HOME/WEB-INF/config \
-Dweblogic.RootDirectory=$CONFIG_HOME\
-Dweblogic.Name=$INSTANCE \
-Dbea.home=$BEA_HOME \
-Dweblogic.management.username=$WLS_USER \
-Dweblogic.management.password=$WLS_PW $STRING2ADD\
-Dweblogic.ProductionModeEnabled=$STARTMODE \
-Djava.security.policy=$WL_HOME/server/lib/weblogic.policy \
-Dplatform.home=/apps/jas/prd/wl9config/weblogic92 \
-Dplatform.home=/apps/jas/prd/wl9config/weblogic92 \
-Dweblogic.management.discover=true
weblogic.Server"
MESSAGE="`date +'<%b %d, %Y %l:%M:%S %p %Z>'` <Alert> <startWebLogic.sh> <Starting webLogic $INSTANCE> <$HOSTNAME>"
nohup $JVM $WL_ARGS >> /dev/null 2>&1
RETURN_CODE=$?
MESSAGE="`date +'<%b %d, %Y %l:%M:%S %p %Z>'` <Alert> <startWebLogic.sh> <Server exited with code: $RETURN_CODE>"
sleep 3
done
)&
fi
cd $CONFIG_HOME
;;
stop)
SCRIPT_PID=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?$ENV_NAME[ \t]+$INSTANCE[ \t]+start" | /usr/bin/perl -e 'print (( split /\s+/, <>)[1])'`
if [ -n "$SCRIPT_PID" ]; then
kill -TERM $SCRIPT_PID > /dev/null 2>&1
RETURN_CODE=$?
fi
echo $PORT
echo $HOST
$JAVA_STOP_COMMAND
;;
*)
echo $0: Error: Action "$ACTION" is not supported.
esac
Thank you,
ASKER
The instance part is jasdom_a1.
If I chose to start managed server 1, I would use
$appcrt prd jasdom_m1 start
and then INSTANCE would be jasdom_m1.
I wonder why it's not working on Linux; it has been running on Solaris and is a reliable script.
Is it possible to find out why ps -ef | grep "appcrt prd $INSTANCE start" returns 2, or why the shell script appears to run twice when I ran it only once?
What shell are you running this under in Solaris?
ASKER
ksh. I tried that on Linux too.
I mean I changed my shell using
$/bin/ksh
then pressed Enter
and ran the script. It still outputs 2 processes. I am echoing RUNNING above and it shows 2 processes;
one of them is a child process.
CODE FROM SCRIPT:
RUNNING=`/bin/ps -ef | /bin/egrep -c "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
RUNNING1=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
echo $RUNNING1
result when i run the script:
bash-3.2$ ./appcrt prd jasdom_a1 start
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1start jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start
I don't understand why the shell executed the script twice.
ASKER
Thank you, Simon.
As you can see, the script uses "while true" and starts the managed server; if the managed server crashes for some reason, "appcrt prd INSTANCE start" will start the managed server again. If I want to stop the managed server, I stop it using "appcrt prd INSTANCE stop", which kills both the appcrt script and the managed server.
I have tried your suggestion, so the script now checks for
if [ "$RUNNING" -gt 2 ]; then
The problem is that I have 2 appcrt processes running (after I run the appcrt script), and each process starts a separate managed server. So the end result is that I end up with 2 managed server processes.
One more thing: after a few seconds, one of the appcrt processes goes away, and with it one managed server.
But this is not reliable.
You can probably leave the $RUNNING check as "-gt 1", since that will catch any time when the start script is actually running (given that the script has two processes in the process list).
If you do end up with two managed processes, the second one probably dies because it is trying to use a resource (e.g. listen on an IP address) which the first one is already using.
ASKER
Hi Simon,
thanks for your reply.
If I leave the
$RUNNING check as "-gt 1"
then the script will exit, because on Linux it creates 2 processes for itself.
Is there a way to stop the shell from creating a child process of its own?
I mean, when I type "appcrt prd jasdom_a1 start" and then run ps -ef | grep "appcrt prd jasdom_a1 start", it should show only 1 process in the output.
You would need to add logic to your script, which can be simple: since you are already piping the data from grep to perl, you can use the perl script to output only the line where the child process is.
Why not send the application into the background or, if you want it to be restarted on exit, enclose it in an infinite while loop?
while (true) ; do
#Do logic and start the application
#As long as the start process does not go into the background, the process
#will keep running. As soon as the application exits or crashes, the loop
#moves along and restarts at the top.
done
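A rough sketch of the restart loop described above, in plain sh. The function name `supervise`, its restart cap, and the `RESTART_DELAY` variable are assumptions made for illustration (the loop in the thread runs forever); the command runs in the foreground, so the loop only advances when the command exits.

```shell
# supervise MAX CMD [ARGS...]
# Run CMD in the foreground and start it again each time it exits,
# up to MAX starts in total. MAX is a hypothetical safety cap; the
# loop discussed in the thread is unbounded (`while true`).
supervise() {
    max=$1
    shift
    starts=0
    while [ "$starts" -lt "$max" ]; do
        starts=$((starts + 1))
        "$@"                          # blocks until the command exits
        rc=$?
        echo "start $starts exited with code $rc"
        # RESTART_DELAY is a hypothetical knob; 3 seconds mirrors the
        # `sleep 3` in the original script.
        sleep "${RESTART_DELAY:-3}"
    done
}
```

Calling something like `supervise 1000 $JVM $WL_ARGS &` would put the whole loop, rather than the JVM itself, in the background, which is roughly what the `( while true ... )&` block in the script does.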
Sorry, yes, I confused myself - you do need "-gt 2", since when the first command runs, $RUNNING will be 2, but when the second one runs it will be 4.
ASKER
Thank you guys for your replies. I think I didn't state my question clearly.
Arnold:
I am using appcrt to start other Java processes (JVMs). appcrt takes arguments and works accordingly.
When I pass the arguments (basically the name of the JVM and the action), appcrt checks whether that JVM is already running; if not, it starts the given JVM and puts it in the background.
If you look at the start branch of the case statement:
RETURN_CODE=$?
MESSAGE="`date +'<%b %d, %Y %l:%M:%S %p %Z>'` <Alert> <startWebLogic.sh> <Server exited with code: $RETURN_CODE>"
sleep 3
done
)&
The trailing & is what makes appcrt start the JVM in the background.
On Solaris, it works perfectly.
If I start a managed server (JVM) on Solaris, I end up with two processes:
1) appcrt prd <INSTANCE> start
2) the java process, which was started in the background by the process above
If for some reason the JVM crashes, process 1 will start it again because of the while true loop.
So, if I want to stop/kill the JVM, I use appcrt with the stop action to stop both.
This logic has been working fine.
The only problem is on Linux, because when I start the script, Linux creates 2 similar processes.
For instance, suppose nothing is running on the Linux box and I want to start a managed server. I would type
$appcrt prd jasdom_m1 start
Since nothing was running, the output of
ps -ef | grep "appcrt prd jasdom_m1 start" should be 1. As this is the only instance, the script should proceed and start the managed server (JVM) in the background.
ISSUE:
On Linux, the check ps -ef | grep "appcrt prd jasdom_m1 start" returns 2 (the main problem).
Logic in Script:
CODE FROM SCRIPT:
RUNNING=`/bin/ps -ef | /bin/egrep -c "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
RUNNING1=`/bin/ps -ef | /bin/egrep "appcrt[ \t]+(-d[ \t]+)?+$ENV_NAME[ \t]+$INSTANCE[ \t]+start"`
echo $RUNNING1
For my own understanding, I am echoing RUNNING. I don't understand why Linux shows the output below, saying 2 appcrt processes are running.
Result when I run the script:
bash-3.2$ ./appcrt prd jasdom_a1 start
jasadm 25042 18332 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1start jasadm 25794 25042 0 09:38 pts/0 00:00:00 /bin/ksh ./appcrt prd jasdom_a1 start
So, is the problem that the Linux shell is running the script twice? I don't know why it is doing that.
Please suggest.
It is not running it twice. One "appcrt prd jasdom_a1start jasadm" is the call to run a shell script, the other is the script itself being processed. You will see that the Parent Process ID (the third column of "ps" output) of one entry is the same number as the Process ID (the second column) of the other entry: one is the parent of the other.
That's just the way Linux does it.
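If the extra entry really is a child that the script itself forked, as described above, one way to make the check independent of the platform is to ignore the script's own PID and anything whose parent is that PID. The sketch below is an assumption-laden illustration, not the original script: the helper name `count_other_instances` is hypothetical, and it expects `ps -ef` style columns (UID, PID, PPID, ...) on standard input.

```shell
# count_other_instances SELF_PID PATTERN
# Reads `ps -ef` output on stdin and prints how many lines match PATTERN
# (an awk extended regex), not counting the calling script (PID == SELF_PID)
# or its direct children (PPID == SELF_PID).
count_other_instances() {
    self_pid=$1
    pattern=$2
    awk -v self="$self_pid" -v pat="$pattern" '
        $2 != self && $3 != self && $0 ~ pat { n++ }
        END { print n + 0 }'
}
```

With this, the start branch could compare against zero on both Solaris and Linux, for example `RUNNING=$(/bin/ps -ef | count_other_instances $$ "appcrt[ \t]+(-d[ \t]+)?$ENV_NAME[ \t]+$INSTANCE[ \t]+start")` followed by `if [ "$RUNNING" -gt 0 ]`.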
ASKER
Is there a way to stop the shell from creating a child shell?
I tried to run the script in the current shell, rather than forking a new child shell, like this:
$. ./appcrt jasdom_a1 start
(dot, white space, and then the script).
This causes weird behavior: the shell starts and terminates the process and keeps doing it until I kill the ksh process ID that was used to start the script.
Rather than try to work around this, just modify your script to accept the way Linux works.
If you need the same script to work on Linux and Solaris, set a variable to 2 if uname reports Linux, and 1 otherwise. Then compare $RUNNING against that.
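A minimal sketch of that suggestion, assuming the only two platforms of interest are Linux and Solaris. The function name `max_processes_for` and the variable `MAX_PROCESSES` are illustrative names, not part of the original script.

```shell
# Print the highest process count that still means "no other instance is
# running" for a given `uname -s` value: 2 on Linux, 1 everywhere else
# (Solaris reports SunOS).
max_processes_for() {
    case "$1" in
        Linux) echo 2 ;;
        *)     echo 1 ;;
    esac
}

MAX_PROCESSES=$(max_processes_for "$(uname -s)")
```

The check in the start branch would then read `if [ "$RUNNING" -gt "$MAX_PROCESSES" ]; then`.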
ASKER
I've requested that this question be deleted for the following reason:
no specific answer was provided. it will confuse others
It's not confusing, just different. If you try to assume that all systems are the same (e.g. Solaris and Linux), you will be bitten by this and other differences.
The solution is to accept that Linux creates two processes and code for that (by changing to "-gt 2", and the change in the "stop" section).
ASKER
Hi Simon,
I didn't mean to offend you, but that was the issue. If I use -gt 2, I end up with two processes. I didn't find any answer on how to make this script run as one process only.
The only way to make it work is to take the while true section out, but then the script is no longer any good for restarting the process automatically.
ASKER
The suggestion provided helps but doesn't eliminate the original issue.
You can only eliminate the issue by eliminating one of the environments.
The solution I employed in a similar situation is as follows: have two different configs, one for Solaris and one for Linux, in which you may have
MAX_PROCESSES=2
on Linux and
MAX_PROCESSES=1
on Solaris. Source it via
. config.sh
and use if [ "$RUNNING" -gt "$MAX_PROCESSES" ]
But the healthiest way remains to use PID files.
The script is running two processes because that's the way Linux organises it. As you suggest, the while loop seems to be the trigger for this.
The two processes are not simply two versions of the same program running - they are a parent+child pair, so that one runs the script itself, and the other looks after the backgrounded while loop.
PID files could, as parparov suggests, be a more reliable way of finding the original program, but they do suffer from the problem that the PID alone does not identify a process. For example, a process creates its PID file, runs for a very long time, then crashes, leaving its PID file still present. Since the PID is generated from a limited set of values (from 1 to 32768 by default on Linux), it is possible that the PID has wrapped round and the same PID has been allocated to some new process, entirely unrelated to the original one. This may seem like a theoretical problem, but I have been bitten by it in the past (on a system doing enormous numbers of compilations, where the PID wrapped round twice per day).
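One common way to reduce the PID-reuse risk described above is to check not only that the recorded PID is alive but also that its command line still looks like the expected program. This is a sketch under stated assumptions: the helper name `pidfile_is_live` is hypothetical, and it relies on `kill -0` and `ps -p PID -o args=` behaving as POSIX describes.

```shell
# pidfile_is_live PIDFILE PATTERN
# Succeed only if PIDFILE names a running process whose command line
# matches PATTERN (a grep -E regex). A leftover file whose PID is dead,
# or has been reused by an unrelated command, is treated as stale.
pidfile_is_live() {
    pidfile=$1
    pattern=$2
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;              # empty or not a number
    esac
    # kill -0 sends no signal; it fails if the process does not exist
    # (or belongs to another user, which is also treated as "not ours").
    kill -0 "$pid" 2>/dev/null || return 1
    ps -p "$pid" -o args= 2>/dev/null | grep -E -- "$pattern" >/dev/null
}
```

A start branch could then use something like `pidfile_is_live /tmp/appcrt.$INSTANCE.pid "appcrt .*$INSTANCE +start"` (the path is a placeholder) and remove the file when the check fails. This narrows the window for a false match but does not close it entirely, since an unrelated process could in principle reuse both the PID and a similar command line.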
Check for the absence of the file prior to starting the instance.
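Checking for a file and then creating it leaves a small gap in which two starts can both see "no file". A frequently used alternative, swapped in here as an assumption rather than taken from the thread, is a lock directory: `mkdir` either creates the directory or fails because it exists, so the check and the creation happen in one step. The function names and the lock path are hypothetical.

```shell
# acquire_lock DIR: succeeds only for the first caller, because mkdir
# fails if DIR already exists.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock DIR: remove the lock so a later start can proceed.
release_lock() {
    rmdir "$1" 2>/dev/null
}
```

In the start branch this might look like `acquire_lock "/tmp/appcrt.$INSTANCE.lock" || exit 1`, with `release_lock` called from the stop branch. A crash still leaves the directory behind, so a stale-lock check is needed as well.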
Is the instance part of the data in the ps -ef | grep jasdom?