Solved

killing processes when kill -9 does not work

Posted on 2000-03-07
Last Modified: 2013-12-16
I'd like to know how processes can be killed when commands like kill, kill -9 and killall don't work. I have this problem with an 'rm' command executed by cron. I execute 'kill -9 pid' but the process is still there. I have also changed the priority of the process, but what I actually want is to remove it.
 
Question by:foron
14 Comments
 
LVL 40

Expert Comment

by:jlevie
ID: 2591281
If "kill -9" can't get rid of the process, then it's truly hosed and your only recourse is to reboot.
0
 
LVL 14

Accepted Solution

by:
chris_calabrese earned 50 total points
ID: 2592269
I'd like to elaborate a bit here... When a process hangs like this, it's because it's in an uninterruptible system call. This is most common when the process is waiting for some kind of disk I/O on a filesystem that's hosed in some way or is on a stale NFS mount.
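On Linux, a process blocked this way normally shows up in state D (uninterruptible sleep). The following is a minimal sketch of how one might check that, assuming a Linux /proc filesystem. The helper names proc_state and is_uninterruptible are made up for illustration and are not standard commands.

```shell
#!/bin/sh
# Sketch only: assumes Linux, where /proc/<pid>/status has a line such as
# "State:  D (disk sleep)". Both function names are hypothetical helpers.

# Print the one-letter state of the given PID (R, S, D, T, Z, ...).
proc_state() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "State:" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Succeed if the PID is in state D. In that state signals, SIGKILL included,
# are generally not acted on until the blocking kernel call returns.
is_uninterruptible() {
    [ "$(proc_state "$1")" = "D" ]
}
```

With a hypothetical PID of 1234, `proc_state 1234` prints the state letter, and `is_uninterruptible 1234 && echo stuck` reports whether the process is blocked in the kernel. Where procps is installed, `ps -o stat= -p 1234` should give the same information.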
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2592283
Oh yeah, the way to find out exactly what's happening is to do a system call trace on the process. I don't know what's out there in the Linux world to do this, but on the commercial Unixes the tool you'd use would be truss, trace, or tusc (depending on the flavor of Unix).
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2593215
one alternative is the fuser command: try fuser -k /dev/device_name (e.g. /dev/tty0)
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2595951
Since we already know which process is hung, fuser probably won't help too much here.

lsof (ftp://vic.cc.purdue.edu/pub/tools/unix/lsof) can be a little more helpful as it will tell you what files the hung process has open.  But even that will often not tell you what's really going on (if the open itself is hanging, for instance, it won't show up in lsof).
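As a rough sketch of what that looks like in practice: lsof's -p option restricts the listing to one process. The wrapper name open_files is hypothetical, and the /proc fallback assumes Linux.

```shell
#!/bin/sh
# Sketch only: open_files is a made-up helper name.
# lsof -p <pid> lists the files one process has open.
open_files() {
    # Prefer lsof; fall back to the Linux /proc listing if lsof is missing or fails.
    if command -v lsof >/dev/null 2>&1 && lsof -p "$1" 2>/dev/null; then
        return 0
    fi
    ls -l "/proc/$1/fd"
}
```

Calling `open_files 1234` (1234 being a placeholder PID) lists the files that process currently holds open. A file whose open call is itself hanging would not appear in either listing.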

The only sure way is through a system call trace.
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2596721
fuser -k /dev/tty0 (for example) will terminate all of the processes using a given file system or device. it works all the time. fuser only shows you what file is opened by a particular process or program. it doesn't help you kill a hung process. on other unix flavors, there are commands like /usr/lbin/tty/stty-cxma flush ttyX or /usr/sbin/strreset -M ## -m ## but i don't think they exist on linux.
0
 
LVL 2

Expert Comment

by:aaryal
ID: 2605550
you can trace system calls with strace. look up the man page for the details.
on a side note:
its cousin is ltrace, which does library call traces (for dynamic libs)

i think you can do strace -p <PID>
to trace a running process.
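For reference, strace attaches to an already running process with its -p option (the -e trace= option instead selects which system calls are shown). A hedged sketch follows; the wrapper name trace_pid and the default output path are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch only: trace_pid and the /tmp output path are hypothetical.
# strace -p attaches to a running PID; -o writes the trace to a file.
trace_pid() {
    if [ -z "$1" ]; then
        echo "usage: trace_pid PID [outfile]" >&2
        return 2
    fi
    if ! command -v strace >/dev/null 2>&1; then
        echo "strace is not installed" >&2
        return 127
    fi
    strace -p "$1" -o "${2:-/tmp/strace.$1.out}"
}
```

Attaching usually requires running as the process owner or as root. If the process is already blocked inside the kernel, the trace may show nothing until that call returns, so it can be more informative to start the command under strace from the beginning, for example `strace -f -o /tmp/rm.trace rm ...` with the real arguments filled in.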
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2605594
"fuser only shows you what file is opened by a particular process or program. it doesn't help you kill a hung process." lsof, i meant.
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2605753
lsof is more useful than fuser, but as I pointed out above still isn't sufficient in many situations.  The system call trace is definitely the way to go.  As aaryal pointed out, you can do a system call trace under Linux with strace.

If you could split points in this system, I'd say give half to aaryal.  But since you can't and I'm greedy, I won't ;-)
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2605992
what was being asked here is HOW TO REMOVE the pid of a process that has been terminated with the kill -9 command, not how to trace system calls or find out which file was opened by which process, blah-blah-blah...in short an alternative to the kill -9 command...
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2606064
According to the fuser manual page at http://www.kashpureff.org/nic/linux/man.shtml/fuser(1) and also the same on my HP-UX 10.20 box, fuser -k calls kill -9; therefore we can conclude that fuser -k can not kill any process that kill -9 cannot.

If something is in a state that kill -9 can not deal with, it's because it's blocked at a very low level in the kernel, and the only thing you can do is figure out what's blocking it.

If it's a hung NFS mountpoint, you might be able to remount it.  If it's a tape that's shoeshining, you may be able to eject the tape.  If it's a filesystem that's plain hanging, you need kernel patches to fix the hang.  Etc.

However, you need to first find out what it's hanging on.  fuser may be helpful here.  lsof will be more useful.  There are circumstances where only a system call trace (and therefore strace) will do.  Of course, you need to trace it before it hangs, but if it's something running out of cron that hangs every time....
0
 

Author Comment

by:foron
ID: 2623159
What I want to do is not to trace the system call that hung the process. What I want to do is to kill the process. If you need more info, the process is using an NFS partition. Chris_calabrese answered that I should remount the NFS partition, but I have other processes running on it.
0
 
LVL 14

Expert Comment

by:chris_calabrese
ID: 2623758
If the process is hanging on a call to an NFS mounted file, my guess would be that the entire NFS mountpoint is dead and that other processes attempting to access it will hang too.  However, you're going to need the system call trace (strace) to figure out if it's really an NFS problem.
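One cautious way to test whether the mountpoint as a whole is dead, without hanging your own shell, is to probe it from a background process and give up after a few seconds. This is only a sketch: mount_responds is a hypothetical helper, and the 5-second default is an arbitrary assumption.

```shell
#!/bin/sh
# Sketch only: mount_responds is a made-up helper name.
# Returns 0 if stat on the path finishes within the limit, 1 if it does not.
mount_responds() {
    path=$1
    limit=${2:-5}
    flag=$(mktemp) || return 2
    # Probe in a background subshell so a dead mount cannot block the caller.
    # The subshell writes to the flag file only after stat has returned.
    ( stat "$path" >/dev/null 2>&1; echo "done" >"$flag" ) >/dev/null 2>&1 &
    waited=0
    while [ "$waited" -lt "$limit" ]; do
        if [ -s "$flag" ]; then
            rm -f "$flag"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    rm -f "$flag"
    return 1    # no answer within the limit: the mount is probably hung
}
```

For example, `mount_responds /mnt/nfs 10 || echo "mount looks hung"`, where /mnt/nfs stands in for the real mountpoint. If the mount really is dead, the background probe will itself stay stuck until the mount recovers, which is the price of not blocking the calling shell.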
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2624652
just get straight to the point: unmount the file system, then try to stop and restart the nfs daemons using killall -HUP rpc.nfsd rpc.mountd, or /etc/rc.d/init.d/nfs stop; /etc/rc.d/init.d/nfs start, then remount the filesystem
0
