Solved

attack of the killer zombies.

Posted on 2000-03-10
4
542 Views
Last Modified: 2013-12-15
here's a nutkicker. in my five years with linux, i've never had this problem before.

this is a plain jane RedHat 6.1 (no upgrades except for PHP3 and mysql)

i come to work, the computer is 'frozen'. (i use kde and rarely log off. last night i had run several programs that interact with MySQL running on the same server. number of select statements == 100,000 if not more.)

i had to resort to telnetting in because tty7 was frozen and couldn't do ctrl-alt-Fn

i tried kill -9 <zombiePID>. no kill. (i guess i needed a wooden stick!)

i had about 38 processes, 33 of em were zombies. including all forks of httpd, mysqld, smbd etc..

did sync and ran a script that goes:
#!/bin/bash
sleep 60 #so that i get time to get out of telnet before the halt begins.
halt

typed:
nohup <scriptname> &
exit

nothing happened. i log back in thru telnet, turns out that script had been zombified too. what's up?
the only way i could turn that sucker off (to kill those zombies) was to flip the power switch at the back. now i'm waiting for fsck to finish checking some 6-8Gigs of harddisk.

anyone have any ideas of what to do in this kind of situation? why this happened in the first place?


thanks.
0
Comment
Question by:aaryal
  • 2
  • 2
4 Comments
 
LVL 2

Accepted Solution

by:
bernardh earned 100 total points
ID: 2606093
a process started by the kernel like init, which you have no control must have died. that's why even if you kill the parent process of those zombies nothing will happen.

if the process seems to have no parent, kill -s SIGHUP 1 will probably clean up zombies. the command will send a hungup signal to init.
0
 
LVL 2

Expert Comment

by:bernardh
ID: 2606136
Another culprit might be the kernel daemon. Kerneld has forked request-route and not bothered to wait for it to terminate. Kerneld is still there, and the request-route which is marked as zombie will go away as soon as kerneld do a wait for it. An easy way to force kerneld to do that is to just kill it.
0
 
LVL 2

Author Comment

by:aaryal
ID: 2606183
i'm taking your word for it. i have to way to test this theory until this things happens again and since it only happened once in 5 years....

but then probability theory (the no-memory property of some distribution, i forget.) dictates that it could happen again soon :)

but seems like a logical thing to do.

thanks bro,
anoop
0
 
LVL 2

Author Comment

by:aaryal
ID: 2612788
well, whaddya know!! it happened again. and the kill -s SIGHUP 1 didn't work. nor did killing kerneld.

this time, someone put an sql statement in an infinite loop. and that zombified a mysqld process. then, although there weren't any other zombies, the system essentially 'froze'. not exactly, since, we had control over everything except for process management. ie. kill

running processes was not a problem. couldn't run anything in the background.

that's a very weird problem.

0

Featured Post

U.S. Department of Agriculture and Acronis Access

With the new era of mobile computing, smartphones and tablets, wireless communications and cloud services, the USDA sought to take advantage of a mobilized workforce and the blurring lines between personal and corporate computing resources.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
nagios 1 33
CentOS create a user with predefined MD5 Hashed password 17 84
Why isn't object file created? 6 57
awk variable in printf 1 22
How many times have you wanted to quickly do the same thing to a list but found yourself typing it again and again? I first figured out a small time saver with the up arrow to recall the last command but that can only get you so far if you have a bi…
SSH (Secure Shell) - Tips and Tricks As you all know SSH(Secure Shell) is a network protocol, which we use to access/transfer files securely between two networked devices. SSH was actually designed as a replacement for insecure protocols that sen…
Get a first impression of how PRTG looks and learn how it works.   This video is a short introduction to PRTG, as an initial overview or as a quick start for new PRTG users.
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.

808 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question