Create script to restart service

OS: CentOS 7

Hi,

I need some help creating a script to restart a service on CentOS 7. The service 'RADARCH2' runs a Java process (JBoss), which shows up in the process list as:
root     13897 10.7  6.5 26496152 8705024 ?    Sl   Dec18 130:36 java -Dprogram.name=run.sh -server -Xms512m -Xmx4g -XX:MaxPermSize=1g -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djboss.messaging.ServerPeerID=0 -Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl -Djava.awt.headless=true -Dapp.name=dcm4chee -Djava.net.preferIPv4Stack=true -Djava.library.path=/opt/dcm4chee-2.18.0-mysql/bin/native -Djava.endorsed.dirs=/opt/dcm4chee-2.18.0-mysql/lib/endorsed -classpath /opt/dcm4chee-2.18.0-mysql/bin/run.jar org.jboss.Main -b 0.0.0.0 -c default

Right now, on system startup, the service starts up on its own.  I can successfully stop/restart the service with:
'systemctl stop radarch2' or 'systemctl start radarch2'. The only three options available are 'stop|start|status', so there is no 'restart' option.

On occasion the service stops with no warning, and nothing in its log shows why it stopped. It's frustrating, so I think I need to go this route.

After this script is created, what would be the best way to run it?  Just create a service for it?

thank you
Asked by: doc_jay

woolmilkporc Commented:
Best run the script via cron at a suitable interval, maybe every 10 minutes:

*/10 * * * * <username> /path/to/radarch2_check_and_restart_if_needed.sh > /path/to/radarch2_restart.log 2>&1

Add the above to the system crontab by modifying /etc/crontab (as root) accordingly. Replace <username> with the required user name.

Alternatively you can put it into the user's crontab. To do this run as <username>:

crontab -e

then add the above line, but without <username> in column 6:

*/10 * * * * /path/to/radarch2_check_and_restart_if_needed.sh > /path/to/radarch2_restart.log 2>&1

doc_jay (Author) Commented:
Thanks for the crontab suggestion, I appreciate it. Would you be able to help with a script that checks whether the PID has died?

thank you

woolmilkporc Commented:
Please post sample outputs of

systemctl status radarch2

once when it's running and once when it's dead.

Assuming there is something like "Running" in the normal status output you can simply do this:

if ! systemctl status radarch2 2>&1 | grep -q "Running"
   then
       echo "radarch2 is dead! Restarting"
       systemctl stop radarch2
       systemctl start radarch2
fi

That should already do the trick. But please post the requested data anyway. "Running" is just a wild guess!

doc_jay (Author) Commented:
FYI - the service is currently named 'radarch4', but I will be renaming it to 'radarch2'. We decided to change the host name after it was configured, and we want the service named after the host because we have two of these set up for redundancy.

systemctl status radarch4 (this is the current status, while it is running):

[root@radarch2 bin]# systemctl status radarch4
radarch4.service - SYSV: Start the DCM4CHEE DICOM Image Manager
   Loaded: loaded (/etc/rc.d/init.d/radarch4)
   Active: active (exited) since Thu 2014-12-18 13:49:28 CST; 21h ago
  Process: 13780 ExecStop=/etc/rc.d/init.d/radarch4 stop (code=exited, status=1/FAILURE)
  Process: 13835 ExecStart=/etc/rc.d/init.d/radarch4 start (code=exited, status=0/SUCCESS)

Dec 18 13:49:28 radarch2 radarch4[13835]: JBOSS_CMD_START = cd /opt/dcm4chee-2.18.0-mysql/bin; authbind --deep /opt/dcm4chee-2.18.0-mysq... default
Dec 18 13:49:28 radarch2 su[13839]: (to root) root on none
Dec 18 13:49:28 radarch2 systemd[1]: Started SYSV: Start the DCM4CHEE DICOM Image Manager.
Hint: Some lines were ellipsized, use -l to show in full.

systemctl status radarch4 (this is what the status looks like after it crashes):

[root@radarch2 dcm4chee-ae]# systemctl status radarch4
radarch4.service - SYSV: Start the DCM4CHEE DICOM Image Manager
   Loaded: loaded (/etc/rc.d/init.d/radarch4)
   Active: active (exited) since Thu 2014-12-18 07:22:23 CST; 6h ago
  Process: 2902 ExecStart=/etc/rc.d/init.d/radarch4 start (code=exited, status=0/SUCCESS)

Dec 18 07:22:23 radarch2 radarch4[2902]: JBOSS_CMD_START = cd /opt/dcm4chee-2.18.0-mysql/bin; authbind --deep /opt/dcm4chee-2.18.0-mysql... default
Dec 18 07:22:23 radarch2 su[2915]: (to root) root on none
Dec 18 07:22:23 radarch2 systemd[1]: Started SYSV: Start the DCM4CHEE DICOM Image Manager.
Hint: Some lines were ellipsized, use -l to show in full.

systemctl stop radarch4 (this is the output when it is stopped):

[root@radarch2 dcm4chee-ae]# systemctl stop radarch4
[root@radarch2 dcm4chee-ae]# systemctl status radarch4
radarch4.service - SYSV: Start the DCM4CHEE DICOM Image Manager
   Loaded: loaded (/etc/rc.d/init.d/radarch4)
   Active: failed (Result: exit-code) since Thu 2014-12-18 13:49:21 CST; 2s ago
  Process: 13780 ExecStop=/etc/rc.d/init.d/radarch4 stop (code=exited, status=1/FAILURE)
  Process: 12898 ExecStart=/etc/rc.d/init.d/radarch4 start (code=exited, status=0/SUCCESS)

Dec 18 13:49:21 radarch2 radarch4[13780]: at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
Dec 18 13:49:21 radarch2 radarch4[13780]: at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
Dec 18 13:49:21 radarch2 radarch4[13780]: at java.net.Socket.connect(Socket.java:579)
Dec 18 13:49:21 radarch2 radarch4[13780]: at java.net.Socket.connect(Socket.java:528)
Dec 18 13:49:21 radarch2 radarch4[13780]: at java.net.Socket.<init>(Socket.java:425)
Dec 18 13:49:21 radarch2 radarch4[13780]: at java.net.Socket.<init>(Socket.java:319)
Dec 18 13:49:21 radarch2 radarch4[13780]: at org.jnp.interfaces.TimedSocketFactory.createSocket(TimedSocketFactory.java:84)
Dec 18 13:49:21 radarch2 radarch4[13780]: at org.jnp.interfaces.TimedSocketFactory.createSocket(TimedSocketFactory.java:77)
Dec 18 13:49:21 radarch2 radarch4[13780]: at org.jnp.interfaces.NamingContext.getServer(NamingContext.java:244)
Dec 18 13:49:21 radarch2 radarch4[13780]: ... 5 more
Dec 18 13:49:21 radarch2 systemd[1]: radarch4.service: control process exited, code=exited status=1
Dec 18 13:49:21 radarch2 systemd[1]: Stopped SYSV: Start the DCM4CHEE DICOM Image Manager.
Dec 18 13:49:21 radarch2 systemd[1]: Unit radarch4.service entered failed state.

woolmilkporc Commented:
This looks strange.
You wrote that "On occasion the service will stop with no warning,"
Doesn't that mean that the associated process will have died (and have vanished from the process list)?
And if so, how does systemctl manage to talk to a dead process?

The "crashed" output doesn't differ much from the normal output!

Or does "stop with no warning" mean that the service just ceases doing its normal work, but is still present in the process list and is able to talk to systemctl anyway?

doc_jay (Author) Commented:
When it does crash, I usually run:

ps aux | grep java

If one of the two Java processes is missing (the other is Mirth Connect), then I know it's no longer running. When it is running correctly, the process I look for shows up in the output of the above command like this:

root     13897 10.7  6.5 26496152 8705024 ?    Sl   Dec18 130:36 java -Dprogram.name=run.sh -server -Xms512m -Xmx4g -XX:MaxPermSize=1g -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djboss.messaging.ServerPeerID=0 -Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl -Djava.awt.headless=true -Dapp.name=dcm4chee -Djava.net.preferIPv4Stack=true -Djava.library.path=/opt/dcm4chee-2.18.0-mysql/bin/native -Djava.endorsed.dirs=/opt/dcm4chee-2.18.0-mysql/lib/endorsed -classpath /opt/dcm4chee-2.18.0-mysql/bin/run.jar org.jboss.Main -b 0.0.0.0 -c default

If I see this line:  

-Dprogram.name=run.sh -server -Xms512m -Xmx4g

then I know it is running.  The service 'radarch4' calls 'run.sh' to start up JBOSS.

After it crashes, my 'systemctl status radarch4' output lists only one process, whereas the status of the currently running service lists two. (The running status is the first of the three outputs I posted above.)

Hope this clears things up

woolmilkporc Commented:
OK,

then let's count the "Process" lines:

if [[ $(systemctl status radarch4 2>&1 | grep -c "Process:") -lt 2 ]]
   then
       echo "radarch4 is dead! Restarting"
       systemctl stop radarch4 2>&1
       systemctl start radarch4 2>&1
fi

If you feel that "stop" is not necessary before "start" simply omit that line.

Please note that the script writes to stdout, so redirecting the script's output to a logfile (or /dev/null) in crontab is mandatory (see my "crontab" suggestion above).

Alternatively, we can redirect the output inside the script, but then the logfile name has to be hard-coded there, and the script must be edited whenever that name changes (for whatever reason). With the crontab approach you can simply change the name in crontab without touching the script.
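
The in-script alternative could look roughly like the sketch below. It is only an illustration: the 'LOGFILE' default path and the 'run_check' function name are assumptions, not part of the original script, and the actual restart logic is left as a placeholder comment.

```shell
#!/bin/bash
# Sketch: send the script's own output to a logfile instead of relying on crontab.
# LOGFILE is a hypothetical default; it can be overridden from the environment,
# which softens the "hard-coded name" drawback.
LOGFILE="${LOGFILE:-/tmp/radarch4_restart.log}"

run_check() {
    # Timestamp each run so the log shows when the check happened.
    echo "$(date '+%Y-%m-%d %H:%M:%S') check started"
    # ... the "if ... systemctl ..." block from above would go here ...
}

# Append (>>) so earlier entries are kept; 2>&1 also captures error messages.
run_check >> "$LOGFILE" 2>&1
```

With this variant the crontab entry no longer needs its own '> ... 2>&1' redirection.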

doc_jay (Author) Commented:
okay, great.  I'll give it a shot, set it up in cron, kill the pid and report back

thank you!

P.S. One other thing: if I use mailx, how can the script send an email saying that the service was bounced?

woolmilkporc Commented:
Replace

 echo "radarch4 is dead! Restarting"

with

 echo "radarch4 is dead! Restarting" | mailx -s "radarch4 bounced" recipient@domain.tld

Instead of just echoing free text, we could also mail the output of the "systemctl" commands:

if [[ $(systemctl status radarch4 2>&1 | grep -c "Process:") -lt 2 ]]
   then
       # Pipe only the restart branch to mailx, so no mail goes out while the service is healthy
       {
           echo "radarch4 is dead! Restarting"
           systemctl stop radarch4 2>&1
           systemctl start radarch4 2>&1
       } | mailx -s "radarch4 bounced" recipient@domain.tld
fi


Please don't forget to make the script executable with "chmod +x scriptname".
It's also good practice to add "#!/bin/bash" (without the quotes) as the very first line, although it is not strictly mandatory.

doc_jay (Author) Commented:
I just tried this out and my test didn't work out as I had hoped.  I have everything in place.  I'm sure this might work when the process actually fails on its own, but when I killed the service with 'pkill 2497' and then did 'systemctl status radarch4', the status still had two processes.  The script didn't start up the service again because of it.  Is there a different way I can test?

my crontab:

0 7 25 * * /opt/scripts/monitor_system/monthly_report.sh
*/1 * * * * /opt/scripts/monitor_system/watch_D4C_job.sh > /opt/scripts/monitor_system/log/radarch4_restart.log 2>&1

A 'radarch4_restart.log' file is created in the appropriate folder.

Below is the output of 'systemctl status radarch4', followed by the output of 'ps aux | grep java':

[root@radarch2 bin]# systemctl status radarch4
radarch4.service - SYSV: Start the DCM4CHEE DICOM Image Manager
   Loaded: loaded (/etc/rc.d/init.d/radarch4)
   Active: active (exited) since Fri 2014-12-19 12:32:34 CST; 4min 26s ago
  Process: 2379 ExecStop=/etc/rc.d/init.d/radarch4 stop (code=exited, status=1/FAILURE)
  Process: 2435 ExecStart=/etc/rc.d/init.d/radarch4 start (code=exited, status=0/SUCCESS)

Dec 19 12:32:34 radarch2 systemd[1]: Starting SYSV: Start the DCM4CHEE DICOM Image Manager...
Dec 19 12:32:34 radarch2 radarch4[2435]: JBOSS_CMD_START = cd /opt/dcm4chee-2.18.0-mysql/bin; authbind --deep /opt/dcm4chee-2.18.0-mysql... default
Dec 19 12:32:34 radarch2 su[2439]: (to root) root on none
Dec 19 12:32:34 radarch2 systemd[1]: Started SYSV: Start the DCM4CHEE DICOM Image Manager.
Hint: Some lines were ellipsized, use -l to show in full.
[root@radarch2 bin]#
[root@radarch2 bin]#
[root@radarch2 bin]# ps aux | grep java
root      2226  1.3  0.3 7497460 404748 ?      Sl   Dec18  23:30 /usr/bin/java -Dinstall4j.jvmDir=/usr -Dexe4j.moduleName=/opt/mirthconnect/mcservice -Dinstall4j.launcherId=144 -Dinstall4j.swt=false -server -Xmx256m -Djava.awt.headless=true -Di4j.vmov=true -Di4j.vmov=true -Di4j.vpt=true -classpath /opt/mirthconnect/.install4j/i4jruntime.jar:/opt/mirthconnect/mirth-server-launcher.jar com.install4j.runtime.launcher.Launcher start com.mirth.connect.server.launcher.MirthLauncher false false   true true false  true true 0 0  20 20 Arial 0,0,0 8 500 version 3.1.0.7420.b1421 20 40 Arial 0,0,0 8 500 -1
root      2497 29.4  0.6 18381828 817284 ?     Sl   12:32   1:19 java -Dprogram.name=run.sh -server -Xms512m -Xmx4g -XX:MaxPermSize=1g -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djboss.messaging.ServerPeerID=0 -Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl -Djava.awt.headless=true -Dapp.name=dcm4chee -Djava.net.preferIPv4Stack=true -Djava.library.path=/opt/dcm4chee-2.18.0-mysql/bin/native -Djava.endorsed.dirs=/opt/dcm4chee-2.18.0-mysql/lib/endorsed -classpath /opt/dcm4chee-2.18.0-mysql/bin/run.jar org.jboss.Main -b 0.0.0.0 -c default
root      3020  0.0  0.0 112644   964 pts/0    S+   12:37   0:00 grep --color=auto java
root      3226  0.0  0.0 34939752 85020 ?      Sl   Dec18   1:17 ../jre/bin/java -classpath ../jre/lib/rt.jar:../jre/lib/jsse.jar:../jre/lib/jce.jar:mail.jar:Framework.jar -Djava.library.path=. Framework.FrameworkManager
root      4252  0.0  0.0 34784092 1052 ?       Ss   Dec18   0:00 ../jre/bin/java -classpath ../jre/lib/rt.jar:../jre/lib/jsse.jar:../jre/lib/jce.jar:mail.jar:Framework.jar -Djava.library.path=. Framework.FrameworkManager
[root@radarch2 bin]#

woolmilkporc Commented:
You wrote that "the service 'radarch4' calls 'run.sh' to start up JBOSS."
So we could check for the presence of "run.sh" in the process list:

if ! ps aux | grep "java" | grep -q "run.sh"
   then
       echo "radarch4 is dead! Restarting"
       systemctl stop radarch4 2>&1
       systemctl start radarch4 2>&1
fi
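
A slightly stricter variant of the same idea, offered only as a sketch: 'pgrep -f' matches against the full command line and never reports itself, so a single, more specific pattern can replace the two 'grep' calls. The function name 'check_and_restart' is hypothetical, and the pattern 'program.name=run.sh' is an assumption taken from the process listing in this thread; adjust both to your setup.

```shell
#!/bin/bash
# Sketch: restart a service when no process matches a command-line pattern.
# $1 = pattern to look for in the full command line (assumed: program.name=run.sh)
# $2 = service name (assumed: radarch4)
check_and_restart() {
    local pattern=$1 service=$2
    # pgrep -f searches the whole command line; it exits non-zero when nothing matches.
    if ! pgrep -f "$pattern" > /dev/null
    then
        echo "$service is dead! Restarting"
        systemctl stop "$service" 2>&1
        systemctl start "$service" 2>&1
    fi
}
```

Called as 'check_and_restart program.name=run.sh radarch4' at the end of the script, it should behave like the snippet above. When testing, keep in mind that 'pkill' expects a process name pattern rather than a PID, so a crash is better simulated with 'kill <PID>'.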

doc_jay (Author) Commented:
That worked out just as I had hoped!

On this line:
echo "radarch4 is dead! Restarting" | mailx -s "radarch4 bounced" myemail@domain.com

If I would like the date inserted, would I call it like this?

echo "radarch4 is dead @ date! Restarting" | mailx -s "radarch4 bounced" myemail@domain.com

or do I have to set a variable?

thanks!

doc_jay (Author) Commented:
I got the date working with $(date)
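
For reference, the '$(date)' form is command substitution: the shell runs 'date' and inserts its output into the string, whereas the bare word 'date' inside quotes is sent literally. A minimal sketch (the variable names are illustrative only):

```shell
#!/bin/bash
# Inside double quotes, the bare word "date" is just text.
literal="radarch4 is dead @ date! Restarting"
# $(date) runs the date command and substitutes its output.
expanded="radarch4 is dead @ $(date)! Restarting"

echo "$literal"
echo "$expanded"
```

Piping the second 'echo' into 'mailx -s "radarch4 bounced" myemail@domain.com' then sends the timestamped message.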

doc_jay (Author) Commented:
Excellent expert!