Solved

killing processes when kill -9 does not work

Posted on 2000-03-07
14
598 Views
Last Modified: 2013-12-16
I'd like to know how processes can be killed when commands like kill, kill -9 and killall don't work. I have this problem with an 'rm' command executed by cron. I run 'kill -9 pid' but the process stays there. I have also changed the priority of the process, but what I actually want is to get rid of it.
 
Question by:foron
14 Comments
 
LVL 40

Expert Comment

by:jlevie
If "kill -9" can't get rid of the process, then it's truly hosed and your only recourse is to reboot.
 
LVL 14

Accepted Solution

by:
chris_calabrese earned 50 total points
I'd like to elaborate a bit here... when a process hangs like this it's because it's in an uninterruptible system call.  This type of thing is most common when the process is waiting for some kind of disk IO on a filesystem that's hosed in some way or is on a stale NFS mount.
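
A rough way to spot such a process on Linux, assuming the procps version of ps: a process blocked in an uninterruptible call shows a state beginning with D in the STAT column. The helper name filter_dstate is made up for illustration; it only filters ps-style lines fed to it on stdin.

```shell
# Hypothetical helper: keep only lines whose second column (STAT) starts
# with "D", the state ps reports for uninterruptible sleep.
filter_dstate() {
    awk '$2 ~ /^D/'
}

# Assumed usage on Linux with procps (WCHAN names the kernel function
# the process is sleeping in, when the kernel exposes it):
#   ps -eo pid,stat,wchan,comm | filter_dstate
```

If the 'rm' shows up in this output and stays there, signals will not be acted on until the blocked call returns.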
 
LVL 14

Expert Comment

by:chris_calabrese
Oh yeah, the way to find out exactly what's happening is to do a system call trace on the process.  I don't know what's out there in the Linux world to do this, but on the commercial Unixes the tool you'd use would be truss, trace, or tusc (depending on the flavor of Unix).
 
LVL 2

Expert Comment

by:bernardh
One alternative is the fuser command. Try fuser -k /dev/device_name (e.g. /dev/tty0).
 
LVL 14

Expert Comment

by:chris_calabrese
Since we already know which process is hung, fuser probably won't help too much here.

lsof (ftp://vic.cc.purdue.edu/pub/tools/unix/lsof) can be a little more helpful as it will tell you what files the hung process has open.  But even that will often not tell you what's really going on (if the open itself is hanging, for instance, it won't show up in lsof).

The only sure way is through a system call trace.
 
LVL 2

Expert Comment

by:bernardh
fuser -k /dev/tty0 (for example) will terminate all of the processes using a given file system or device. It works all the time. fuser only shows you what file is opened by a particular process or program. It doesn't help you kill a hung process. On other Unix flavors there are commands like /usr/lbin/tty/stty-cxma flush ttyX or /usr/sbin/strreset -M ## -m ##, but I don't think they exist on Linux.
 
LVL 2

Expert Comment

by:aaryal
You can trace system calls with strace. Look up the man page for the details.
On a side note:
its cousin is ltrace, which traces library calls (for dynamic libs).

I think you can do strace -p <PID>
to trace a running process.
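
For reference, strace attaches to an already-running process with -p. A minimal sketch: the wrapper name trace_pid is made up here, and all it does is check that the argument looks like a PID before handing over to strace.

```shell
# Hypothetical wrapper: attach strace to a running process (Linux).
# Usage: trace_pid PID
trace_pid() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: trace_pid PID" >&2
            return 2
            ;;
    esac
    # -p attaches to the given PID; -f also follows children it forks.
    strace -f -p "$1"
}
```

A process that is already blocked inside the kernel makes no new system calls, so the trace may stay silent until the blocked call returns.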
 
LVL 2

Expert Comment

by:bernardh
"fuser only shows you what file is opened by a particular process or program. it doesn't help you kill a hung process." lsof, I meant.
 
LVL 14

Expert Comment

by:chris_calabrese
lsof is more useful than fuser, but as I pointed out above it still isn't sufficient in many situations.  The system call trace is definitely the way to go.  As aaryal pointed out, you can do a system call trace under Linux with strace.

If you could split points in this system, I'd say give half to aaryal.  But since you can't and I'm greedy, I won't ;-)
 
LVL 2

Expert Comment

by:bernardh
What was being asked here is HOW TO REMOVE the pid of a process that has been terminated with the kill -9 command, not how to trace system calls or find out which file was opened by which process, blah-blah-blah... in short, an alternative to the kill -9 command...
 
LVL 14

Expert Comment

by:chris_calabrese
According to the fuser manual page at http://www.kashpureff.org/nic/linux/man.shtml/fuser(1) and also the same on my HP-UX 10.20 box, fuser -k calls kill -9; therefore we can conclude that fuser -k cannot kill any process that kill -9 cannot.

If something is in a state that kill -9 cannot deal with, it's because it's blocked at a very low level in the kernel, and the only thing you can do is figure out what's blocking it.

If it's a hung NFS mountpoint, you might be able to remount it.  If it's a tape that's shoeshining, you may be able to eject the tape.  If it's a filesystem that's plain hanging, you need kernel patches to fix the hang.  Etc.

However, you need to first find out what it's hanging on.  fuser may be helpful here.  lsof will be more useful.  There are circumstances where only a system call trace (and therefore strace) will do.  Of course, you need to trace it before it hangs, but if it's something running out of cron that hangs every time....
 

Author Comment

by:foron
What I want to do is not to trace the system call that hung the process. What I want to do is to kill the process. If you need more info, the process is using an NFS partition. Chris_calabrese answered that I should remount the NFS partition, but I have other processes running on it.
 
LVL 14

Expert Comment

by:chris_calabrese
If the process is hanging on a call to an NFS mounted file, my guess would be that the entire NFS mountpoint is dead and that other processes attempting to access it will hang too.  However, you're going to need the system call trace (strace) to figure out if it's really an NFS problem.
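
One way to test that guess without tying up your shell, sketched under the assumption that GNU coreutils (timeout, stat) is installed: probe the mountpoint with a time limit. The function name mount_responds and the path /mnt/nfs are placeholders, not anything from the original setup.

```shell
# Hypothetical probe: succeed only if stat on the path returns within
# the given number of seconds (default 5). timeout sends SIGKILL when
# the limit is reached.
mount_responds() {
    timeout -s KILL "${2:-5}" stat -- "$1" >/dev/null 2>&1
}

# Assumed usage; /mnt/nfs is a placeholder path:
#   mount_responds /mnt/nfs 5 || echo "mount looks hung"
```

Be aware that if the probe itself ends up in a wait that even SIGKILL cannot interrupt, it will hang just like the original process, so run it in the background or from a terminal you can afford to lose.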
 
LVL 2

Expert Comment

by:bernardh
Just get straight to the point: unmount the file system, then try to stop and restart the NFS daemons using killall -HUP rpc.nfsd rpc.mountd, or /etc/rc.d/init.d/nfs stop; /etc/rc.d/init.d/nfs start, then remount the filesystem.