attack of the killer zombies.

here's a nutkicker. in my five years with linux, i've never had this problem before.

this is a plain jane RedHat 6.1 (no upgrades except for PHP3 and mysql)

i come to work, the computer is 'frozen'. (i use kde and rarely log off. last night i had run several programs that interact with MySQL running on the same server. number of select statements == 100,000 if not more.)

i had to resort to telnetting in because tty7 was frozen and i couldn't switch consoles with ctrl-alt-Fn.

i tried kill -9 <zombiePID>. no kill. (i guess i needed a wooden stick!)

i had about 38 processes, 33 of em were zombies. including all forks of httpd, mysqld, smbd etc..

did sync and ran a script that goes:
sleep 60 #so that i get time to get out of telnet before the halt begins.

nohup <scriptname> &

nothing happened. i log back in thru telnet, turns out that script had been zombified too. what's up?
the only way i could turn that sucker off (to kill those zombies) was to flip the power switch at the back. now i'm waiting for fsck to finish checking some 6-8Gigs of harddisk.
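
for reference, a minimal sketch of the delayed-halt idea described above, written as a single detached command. `run_later` is a hypothetical helper name, and the `/sbin/halt` path in the example comment is an assumption about where halt lives. if process management itself is wedged, as it was here, this would probably not help either.

```shell
#!/bin/sh
# run_later DELAY CMD...: wait DELAY seconds, then run CMD.
# (hypothetical helper, not a standard tool)
# nohup plus the redirects detach it from the terminal, so it keeps
# going after the telnet session is closed.
run_later() {
    delay=$1
    shift
    # inside the inner sh: $1 is the delay, the rest is the command
    nohup sh -c 'sleep "$1"; shift; exec "$@"' run_later "$delay" "$@" \
        >/dev/null 2>&1 &
}

# example (as root, path is an assumption):
#   sync; run_later 60 /sbin/halt
```

the 60 seconds are only there to leave time to log out before the halt starts.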

anyone have any ideas of what to do in this kind of situation? why this happened in the first place?

bernardh Commented:
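a zombie is already dead, so kill -9 has nothing left to kill; it only disappears once its parent waits on it. in your case a process started by the kernel, like init, over which you have no control, must have died or stopped reaping. that's why nothing happens even if you kill the parent process of those zombies.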

if the process seems to have no parent, kill -s SIGHUP 1 will probably clean up the zombies. the command sends a hangup signal (SIGHUP) to init.
Another culprit might be the kernel daemon, kerneld. kerneld may have forked request-route and not bothered to wait for it to terminate. kerneld is still there, and the request-route process, which is marked as a zombie, will go away as soon as something waits for it. An easy way to force that is to just kill kerneld: the zombie is then inherited by init, which reaps it.
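
To see which parent is failing to reap its children, something like the sketch below might help. It assumes a procps-style ps that accepts -eo with the pid, ppid, stat and comm columns; list_zombies is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# list_zombies: read "PID PPID STAT COMMAND" rows on stdin (with a
# header line first) and print every row whose state starts with Z,
# together with its parent PID.
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ {
        printf "zombie pid=%s parent=%s cmd=%s\n", $1, $2, $4
    }'
}

# usage on a live system (assumes procps ps):
#   ps -eo pid,ppid,stat,comm | list_zombies
```

If the parent shown is 1, init itself is not reaping; otherwise the listed parent is the process to look at.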
aaryal (Author) Commented:
i'm taking your word for it. i have no way to test this theory until this thing happens again, and since it has only happened once in 5 years....

but then probability theory (the memoryless property of some distribution, i forget which) dictates that it could happen again soon :)

but seems like a logical thing to do.

thanks bro,
aaryal (Author) Commented:
well, whaddya know!! it happened again. and the kill -s SIGHUP 1 didn't work. nor did killing kerneld.

this time, someone put an sql statement in an infinite loop, and that zombified a mysqld process. then, although there weren't any other zombies, the system essentially 'froze'. not exactly, since we had control over everything except process management, i.e. kill.

running processes in the foreground was not a problem, but we couldn't run anything in the background.

that's a very weird problem.
