Solved

Tracking which Unix process pid just crashed

Posted on 2010-09-05
Last Modified: 2012-05-10

I got a monitoring alert saying that a Unix process had just crashed, but the alert did not
specify the pid or name of the process that crashed.

Is there any way to find out?

What about the directory /usr/ucb/... : does it hold any clue?

I recall that on Linux, /var/run contains *.pid files that hold the pids of processes.
If a process was abruptly terminated or manually killed, does its pid file
stay behind? I thought of going through the .pid files one by one, reading each pid,
and checking which ones no longer show up in "ps -ef".

/var/log/messages did not give any clue.

Any good shell script or command to check this easily would be most welcome
as well.
Question by:sunhux

Author Comment

by:sunhux
ID: 33608897


What does the date stamp of those /var/run/*.pid files mean?
LVL 6

Accepted Solution

by:apresence
apresence earned 470 total points
ID: 33608966
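Not all applications write pid files. But you are right that pid files are usually deleted when a process exits normally, so a pid file left behind with no matching process usually means the process crashed or was killed. The date stamp (mtime) on a /var/run/*.pid file is the time the file was last written, which is normally the last time the process was started.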
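
To see those date stamps at a glance, a minimal sketch like the following lists the pid files newest first. The function name list_pidfiles_by_mtime is made up for illustration, and the default directory /var/run is an assumption; some systems keep pid files elsewhere.

```sh
#!/bin/sh
# Hypothetical helper: list the *.pid files in a directory, newest mtime first.
# The directory defaults to /var/run (an assumption; pass another path if needed).
list_pidfiles_by_mtime() {
  dir=${1:-/var/run}
  # -l shows the mtime column, -t sorts by mtime with the newest file first
  ls -lt "$dir"/*.pid
}
```

Call it as "list_pidfiles_by_mtime /var/run". A pid file with a recent mtime that you did not expect can hint that its daemon was restarted around that time.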

If you want to check those pid files to see if any of their processes are missing, the script below (show_missing_pids.sh) will do it.

Sample output (for testing, I created a dummy testproc.pid file containing a pid that doesn't exist):
root@beta:~/exex/test9 $ echo 999 >/var/run/testproc.pid
root@beta:~/exex/test9 $ ./show_missing_pids.sh
PID 3146 (/var/run/atd.pid): RUNNING
PID 2334 (/var/run/auditd.pid): RUNNING
PID 3015 (/var/run/crond.pid): RUNNING
...
PID 999 (/var/run/testproc.pid): NOT RUNNING
...
root@beta:~/exex/test9 $
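
show_missing_pids.sh:
#!/bin/sh
# Report whether the pid recorded in each /var/run/*.pid file is still running.
for pidfile in /var/run/*.pid; do
  # Skip the unexpanded pattern if no pid files exist
  [ -f "$pidfile" ] || continue
  # Take the leading digits of the first line that starts with a number
  actual_pid=`perl -ne 'if (/^(\d+)/) { print "$1\n"; last }' < "$pidfile"`
  if [ -n "$actual_pid" ]; then
    # ps -p exits 0 only if a process with that pid exists
    if ps -p "$actual_pid" >/dev/null 2>&1; then
      echo "PID $actual_pid ($pidfile): RUNNING"
    else
      echo "PID $actual_pid ($pidfile): NOT RUNNING"
    fi
  fi
done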
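
One caveat with the check above: pids get reused, so a stale pid file can point at an unrelated process that happens to have the same pid. A rough sketch of a stricter check follows. The function name check_pidfile is hypothetical, and it assumes each pid file is named after its daemon (crond.pid for crond), which is common but not guaranteed. Some systems also truncate the command name that ps reports, so treat a mismatch as a hint rather than proof.

```sh
#!/bin/sh
# Hypothetical helper: classify one pid file, also flagging possible pid reuse.
# Assumption: the pid file is named after its daemon (crond.pid -> crond).
check_pidfile() {
  pidfile=$1
  expected=`basename "$pidfile" .pid`
  # Leading digits of the first line are taken as the pid
  pid=`sed -n '1s/^\([0-9][0-9]*\).*/\1/p' "$pidfile"`
  if [ -z "$pid" ]; then
    echo "$pidfile: no pid found"
    return 2
  fi
  # -o comm= prints only the command name, without a header line
  actual=`ps -p "$pid" -o comm= 2>/dev/null`
  if [ -z "$actual" ]; then
    echo "PID $pid ($pidfile): NOT RUNNING"
    return 1
  fi
  actual_name=`basename "$actual" 2>/dev/null`
  if [ "$actual_name" = "$expected" ]; then
    echo "PID $pid ($pidfile): RUNNING as $actual"
  else
    echo "PID $pid ($pidfile): RUNNING, but as $actual (pid reused?)"
  fi
  return 0
}
```

To use it over every pid file: for f in /var/run/*.pid; do check_pidfile "$f"; done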

LVL 62

Assisted Solution

by:gheist
gheist earned 30 total points
ID: 33610611
Do you have any log or boot message entry confirming a process crash?