Solved

Tracking which Unix process pid just crashed

Posted on 2010-09-05
Last Modified: 2012-05-10

I got a monitoring alert saying that a Unix process had just crashed, but the alert did not
specify the pid or name of the process.

Is there any way to find out which process it was?

What about the directory /usr/ucb/... : does it hold any clue?

I recall that on Linux, /var/run contains *.pid files that hold the pids of processes.
If a process is abruptly terminated or manually "killed", does the pid file
stay behind? I thought of going through the .pid files one by one to read each pid
and check which ones no longer appear in "ps -ef".

/var/log/messages did not give any clue.

A good shell script or command to check this easily would be most welcome
as well.
Question by:sunhux
3 Comments
 

Author Comment

by:sunhux
ID: 33608897


What does the date stamp of those /var/run/*.pid files mean?
 
LVL 6

Accepted Solution

by:
apresence earned 470 total points
ID: 33608966
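Not all applications write pid files. But you are right: a pid file is usually deleted when its process exits normally, so a pid file left behind with no matching process suggests an abnormal exit. The date stamp (mtime) on a /var/run/*.pid file is when the file was last written, which is normally the last time the process was started.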
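
To see those date stamps, the pid files can be listed newest first. This is a minimal sketch, not part of the original answer; the function name list_pid_files_by_mtime and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list_pid_files_by_mtime [DIR]
# Lists DIR/*.pid (default /var/run) with modification times, newest first.
list_pid_files_by_mtime() {
  dir=${1:-/var/run}
  # Expand the glob into the function's own positional parameters.
  set -- "$dir"/*.pid
  # If nothing matched, the pattern is left unexpanded and does not exist.
  [ -e "$1" ] || return 0
  # -l shows the mtime, -t sorts by mtime with the newest file first.
  ls -lt "$@"
}
```

Run list_pid_files_by_mtime with no argument to look at /var/run. A daemon that crashed and was then restarted would be expected near the top, since its pid file is rewritten at start; that is an inference from the mtime behaviour, not something the alert itself confirms.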

If you want to check those pid files to see whether any of their processes are missing, the attached script (show_missing_pids.sh, listed after the sample output) will do it.

Sample output (for testing, I created a dummy testproc.pid file containing a pid that doesn't exist):
root@beta:~/exex/test9 $ echo 999 >/var/run/testproc.pid
root@beta:~/exex/test9 $ ./show_missing_pids.sh
PID 3146 (/var/run/atd.pid): RUNNING
PID 2334 (/var/run/auditd.pid): RUNNING
PID 3015 (/var/run/crond.pid): RUNNING
...
PID 999 (/var/run/testproc.pid): NOT RUNNING
...
root@beta:~/exex/test9 $
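#!/bin/sh
# Report whether the pid recorded in each /var/run/*.pid file is still running.
for i in /var/run/*.pid; do
  # Skip the unexpanded pattern when no pid files exist.
  [ -f "$i" ] || continue
  # Use the leading digits of the first line that starts with a number.
  actual_pid=`perl -ne 'if (/^(\d+)/) { print "$1\n"; last }' "$i"`
  if [ -n "$actual_pid" ]; then
    # ps -p exits non-zero when no process has that pid.
    if ps -p "$actual_pid" >/dev/null 2>&1; then
      echo "PID $actual_pid ($i): RUNNING"
    else
      echo "PID $actual_pid ($i): NOT RUNNING"
    fi
  fi
done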
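
One caveat with a check like this: pids get reused, so a pid file left behind by a crashed daemon can point at an unrelated process that happens to have the same number. The variant below is a minimal sketch, not the accepted answer's code, that also prints the command name for each live pid so the match can be checked by eye. It assumes a ps that accepts -p and -o comm= (both are in POSIX); the function name report_pid_files and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report_pid_files [DIR]
# For each DIR/*.pid (default /var/run), print the pid and, when a process
# with that pid exists, the command name that ps reports for it.
report_pid_files() {
  dir=${1:-/var/run}
  for f in "$dir"/*.pid; do
    [ -f "$f" ] || continue          # glob matched nothing
    # Leading digits of the first line are taken to be the pid.
    pid=`sed -n '1s/^\([0-9][0-9]*\).*/\1/p' "$f"`
    [ -n "$pid" ] && [ "$pid" -gt 0 ] 2>/dev/null || continue
    # kill -0 sends no signal; it only tests that the pid can be signalled.
    # It fails for other users' processes unless run as root, so ps -p is
    # tried as a fallback before the pid is reported as missing.
    if kill -0 "$pid" 2>/dev/null || ps -p "$pid" >/dev/null 2>&1; then
      # "comm=" selects the command name and suppresses the header line.
      cmd=`ps -p "$pid" -o comm= 2>/dev/null`
      echo "PID $pid ($f): RUNNING as ${cmd:-unknown}"
    else
      echo "PID $pid ($f): NOT RUNNING"
    fi
  done
}
```

Call report_pid_files with no argument to scan /var/run. If the name after "RUNNING as" does not match the daemon that owns the pid file, the pid has probably been reused and the original process is gone.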

 
LVL 62

Assisted Solution

by:gheist
gheist earned 30 total points
ID: 33610611
Do you have any log or boot message entry confirming that a process crashed?

Question has a verified solution.
