Solved

killing processes when kill -9 does not work

Posted on 2000-03-07
Last Modified: 2013-12-16
I'd like to know how processes can be killed when commands like kill, kill -9 and killall don't work. I have this problem with an 'rm' command executed by cron. I execute 'kill -9 pid' but the process remains in the process list. I have also changed the priority of the process, but what I actually want is to remove it.
 
Question by:foron
14 Comments
 
LVL 40

Expert Comment

by:jlevie
ID: 2591281
If "kill -9" can't get rid of the process, then it's truly hosed and your only recourse is to reboot.
0
 
LVL 14

Accepted Solution

by:
chris_calabrese earned 50 total points
ID: 2592269
I'd like to elaborate a bit here... When a process hangs like this, it's because it's in an uninterruptible system call. This is most common when the process is waiting for some kind of disk I/O on a filesystem that's hosed in some way or is on a stale NFS mount.
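
To check whether a process really is stuck in this kind of uninterruptible wait, one can look at its state. The following is a minimal sketch, assuming a Linux /proc filesystem and the procps version of ps; the function names proc_state and list_d_state are made up for illustration. A state of D means uninterruptible sleep, where signals (SIGKILL included) stay pending until the blocking kernel call returns.

```shell
#!/bin/sh
# Sketch only: assumes Linux /proc and procps ps. Function names are hypothetical.

# Print the one-letter state of a process (R, S, D, Z, T, ...).
# The State: line of /proc/<pid>/status looks like "State:  S (sleeping)".
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# List every process currently in state D, keeping the ps header line.
# wchan shows the kernel function the process is sleeping in, when available.
list_d_state() {
    ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/'
}
```

For example, `proc_state 1234` (1234 being a placeholder PID) prints D if that process is blocked in the kernel. A process that has already been killed but shows Z instead is a zombie waiting for its parent to reap it, which is a different situation.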
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2592283
Oh yeah, the way to find out exactly what's happening is to do a system call trace on the process. I don't know what's out there in the Linux world to do this, but on the commercial Unixes the tool you'd use would be truss, trace, or tusc (depending on the flavor of Unix).
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2593215
One alternative is the fuser command. Try fuser -k /dev/device_name (e.g. /dev/tty0).
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2595951
Since we already know which process is hung, fuser probably won't help too much here.

lsof (ftp://vic.cc.purdue.edu/pub/tools/unix/lsof) can be a little more helpful as it will tell you what files the hung process has open.  But even that will often not tell you what's really going on (if the open itself is hanging, for instance, it won't show up in lsof).
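
As a rough illustration of listing what a hung process has open, here is a minimal sketch that reads the file descriptor links under /proc directly. It assumes Linux; the function name open_fds is hypothetical. It only reads the symlinks and never touches the files they point to, so it should not block on a dead mount itself.

```shell
#!/bin/sh
# Sketch only: assumes Linux /proc. open_fds is a made-up helper name.

# Print "fd -> target" for every open descriptor of the given PID.
# Only the symlink is read (readlink); the target is never stat'ed.
open_fds() {
    for fd in "/proc/$1/fd"/*; do
        [ -L "$fd" ] || continue
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# The lsof equivalent for a single process would be:
#   lsof -p 1234        # 1234 is a placeholder PID
```

Reading another user's process this way normally requires root.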

The only sure way is through a system call trace.
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2596721
fuser -k /dev/tty0 (for example) will terminate all of the processes using a given file system or device. it works all the time. fuser only shows you what file is opened by a particular process or program. it doesn't help you kill a hung process. on other unix flavors, there are commands like /usr/lbin/tty/stty-cxma flush ttyX or /usr/sbin/strreset -M ## -m ##, but i don't think they exist on linux.
0
 
LVL 2

Expert Comment

by:aaryal
ID: 2605550
you can trace system calls with strace. look up the man page for the details.
on a side note:
its cousin is ltrace, which does library call traces (for dynamic libs)

i think you can do strace -p <PID>
to attach to and trace a running process.
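
A minimal sketch of how strace might be invoked here, assuming strace is installed; the helper name trace_cmd and the log path are made up for illustration. The helper only prints the command line so it can be inspected before running. The flags used are -p (attach to a PID), -f (follow child processes), -tt (timestamps) and -o (write the trace to a file).

```shell
#!/bin/sh
# Sketch only: trace_cmd is a hypothetical helper; the log path is a placeholder.

# Build an strace command line for attaching to a running process.
# $1 = PID, $2 = optional log file (defaults to /tmp/strace.<pid>.log)
trace_cmd() {
    pid=$1
    log=${2:-/tmp/strace.$pid.log}
    printf 'strace -f -tt -o %s -p %s\n' "$log" "$pid"
}

# To run it (as root or as the owner of the target process):
#   eval "$(trace_cmd 1234)"
#
# For a job started by cron, the command can be launched under strace instead,
# so the trace begins before the hang (the path is a placeholder):
#   strace -f -tt -o /tmp/rm.trace rm -rf /some/path
```

If the process is already blocked inside the kernel, attaching may show nothing until the blocked call returns, so the last line of a trace started beforehand is usually the most informative.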
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2605594
"fuser only shows you what file is opened by a particular process              or program. it doesn't help you kill a hung process." lsof, i meant.
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2605753
lsof is more useful than fuser, but as I pointed out above still isn't sufficient in many situations.  The system call trace is definitely the way to go.  As aaryal pointed out, you can do a system call trace under Linux with strace.

If you could split points in this system, I'd say give half to aaryal.  But since you can't and I'm greedy, I won't ;-)
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2605992
what was being asked here is HOW TO REMOVE the pid of a process that has been terminated with the kill -9 command, not how to trace system calls or find out which file was opened by which process, blah-blah-blah...in short an alternative to the kill -9 command...
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2606064
According to the fuser manual page at http://www.kashpureff.org/nic/linux/man.shtml/fuser(1) and also the same on my HP-UX 10.20 box, fuser -k calls kill -9; therefore we can conclude that fuser -k can not kill any process that kill -9 cannot.

If something is in a state that kill -9 can not deal with, it's because it's blocked at a very low level in the kernel, and the only thing you can do is figure out what's blocking it.

If it's a hung NFS mountpoint, you might be able to remount it.  If it's a tape that's shoeshining, you may be able to eject the tape.  If it's a filesystem that's plain hanging, you need kernel patches to fix the hang.  Etc.
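
For the NFS case, a minimal sketch of finding the NFS mount points on a Linux box follows. It assumes /proc/mounts is available; the function name nfs_mounts and the /mnt/data path are placeholders, not anything from this thread. The umount options in the comments exist in the Linux umount command, but whether they free a given hung process depends on the kernel and on why the mount is stuck.

```shell
#!/bin/sh
# Sketch only: nfs_mounts is a hypothetical helper name.

# Print the mount point of every filesystem of type nfs or nfs4.
# Fields in /proc/mounts: device mountpoint fstype options dump pass
# $1 = optional file to read instead of /proc/mounts
nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Things one might then try by hand, as root (/mnt/data is a placeholder):
#   umount -f /mnt/data   # force unmount, intended for unreachable NFS servers
#   umount -l /mnt/data   # lazy unmount: detach now, clean up once no longer busy
```

Both unmount variants affect every process using that mount, not just the hung one.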

However, you first need to find out what it's hanging on.  fuser may be helpful here.  lsof will be more useful.  There are circumstances where only a system call trace (and therefore strace) will do.  Of course, you need to trace it before it hangs, but if it's something running out of cron that hangs every time....
0
 

Author Comment

by:foron
ID: 2623159
I don't want to trace the system call that hung the process; I want to kill the process. If you need more info, the process is using an NFS partition. Chris_calabrese answered that I should remount the NFS partition, but I have other processes running on it.
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2623758
If the process is hanging on a call to an NFS mounted file, my guess would be that the entire NFS mountpoint is dead and that other processes attempting to access it will hang too.  However, you're going to need the system call trace (strace) to figure out if it's really an NFS problem.
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2624652
just get straight to the point: unmount the file system, then try to stop and restart the nfs daemons using killall -HUP rpc.nfsd rpc.mountd or /etc/rc.d/init.d/nfs stop; /etc/rc.d/init.d/nfs start, then remount the filesystem
0


Question has a verified solution.
