
Troubleshooting Red Hat Enterprise Server

Clement P asked
Hi,
We have two Red Hat servers: one runs a Tomcat web server and the other is the MySQL database server.

For the past three weeks, our web server has crashed early every Tuesday morning. Our understanding is that it is probably trying to access something on the database for a very long time and, since it cannot get access, it gives up and eventually crashes.

To troubleshoot this issue, where do I start?

My goal with this question is just to determine what the problem is. To resolve it I can contact Red Hat support, but I should really know what the problem is first.

Many thanks for your help

You need to look at both the Apache and the system logs first, then move to Tomcat, and then finally MySQL on the DB server.

After a crash, look at the last few hundred lines of /var/log/httpd/error_log - Apache is usually pretty verbose about why it goes down.
Also, look at /var/log/messages and search case-insensitively for the pattern "killed" - when memory gets too low, the kernel on modern Linux systems (the OOM killer) will automatically kill processes, starting with the heaviest memory users.
Finally, look at your Tomcat logs, probably /usr/local/tomcat/logs, but that will depend on where the admin installed it.  I doubt you'll find much here, but if the other two come up dry, this may be an option.
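
The search of /var/log/messages described above could be sketched in Python as follows. This is a minimal sketch, not a definitive tool: the function names find_oom_lines and scan_log are made up for illustration, and the three substrings are fragments the kernel OOM killer commonly logs, so the exact wording on your kernel version may differ.

```python
import re

# Fragments commonly seen in kernel OOM-killer messages (an assumption:
# exact wording varies by kernel version). Matched case-insensitively.
OOM_PATTERN = re.compile(r"oom-killer|out of memory|killed process", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a syslog file; reading it usually requires root."""
    # errors="replace" tolerates stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling scan_log() as root on the web server and printing the result shows any matching lines along with their timestamps, which can then be compared with the time of the crash.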

If there is anything of interest on the MySQL server, it will probably be in the slow query log - this is NOT always turned on by default, so you may have to enable it and restart mysqld, and then the next time you get a crash, see if the timing coincides with a slow query.  Be sure that both systems' clocks match, as it will help tremendously with troubleshooting.
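
Checking whether a crash coincides with a slow query comes down to comparing timestamps, which is why matching clocks matter. A minimal sketch, assuming you have already pulled the timestamps out of the slow query log by hand or by script; the function name queries_near_crash and the 10-minute window are illustrative choices, not anything MySQL provides.

```python
from datetime import datetime, timedelta


def queries_near_crash(crash_time, query_times, window_minutes=10):
    """Return the slow-query timestamps within window_minutes of the crash."""
    window = timedelta(minutes=window_minutes)
    return [t for t in query_times if abs(t - crash_time) <= window]


# Hypothetical values for illustration only: a crash noted on the web
# server and two slow-query timestamps copied from the DB server.
crash = datetime(2024, 1, 2, 3, 0)
slow_queries = [datetime(2024, 1, 2, 2, 55), datetime(2024, 1, 2, 1, 0)]
suspects = queries_near_crash(crash, slow_queries)
```

If the two servers' clocks drift apart by more than the window, real matches will be missed, so the window should be at least as wide as any clock difference you cannot rule out.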

Feel free to paste any logs that you find consequential.  My personal bet is that it's the Linux kernel killing the java/tomcat/httpd processes for hogging too much memory.

Author

Commented:
Thanks for your help, folks. I have checked the different logs as suggested by xterm and couldn't find anything. Since this is a recurring issue, I think a more appropriate way to troubleshoot would be to enable crash dumps or some kind of recording of the activity on the servers during this time.

I have tried to follow the links from farzanj and they look promising. Please send any other instructions like these; I am sure they will help me first to understand the issue better, and then I can come back to you with the logs. However, that will be a different question, I promise. As said earlier, my goal with this question is to determine the issue.
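
One simple form of recording activity is to sample memory figures at a fixed interval and append them to a file, so there is a trail leading up to the crash. The sketch below is only one possible approach under stated assumptions: parse_meminfo and record_memory are hypothetical helper names, the output path and interval are up to you, and it reads the standard Linux /proc/meminfo file.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])  # most fields are in kB
    return values


def record_memory(out_path, interval_seconds=60, samples=60):
    """Append timestamped MemFree/SwapFree readings to out_path."""
    with open(out_path, "a") as out:
        for _ in range(samples):
            with open("/proc/meminfo") as handle:
                info = parse_meminfo(handle.read())
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out.write(f"{stamp} MemFree={info.get('MemFree')} "
                      f"SwapFree={info.get('SwapFree')}\n")
            out.flush()  # hand each sample to the OS promptly in case we are killed
            time.sleep(interval_seconds)
```

Started shortly before the usual Tuesday crash window, record_memory would leave a file showing whether free memory and swap were shrinking in the minutes before the web server went down.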

Thanks again

Author

Commented:
Thanks folks for the help