Tomcat crashes every day

Every day, Tomcat crashes without producing any crash report.

It started 20 days ago. On 20th September, Tomcat crashed and generated a crash report.

Please find it attached.

Since then, Tomcat has crashed every day, and I haven't found any crash report.

We are running CentOS 5 with 15 GB of RAM and Tomcat 6.0.20.

It had been working fine for years.

In that crash report, every thread's status shows as _thread_blocked.

One thing I observed: whenever Tomcat crashes, I can't SSH into the machine for a few minutes (about 5).

There are no changes to the Java code; we didn't update or modify anything.
hs-err-pid5737.log
sasidhar1229Asked:

CEHJCommented:
That's a native-code JVM crash. Since 1.5.0_22-b03 is long out of date and contains known vulnerabilities, the first thing to do is update to the latest version you can.
sasidhar1229Author Commented:
I already did that. A couple of days after the first crash, I updated the JDK to 1.7.

Even so, I am getting the same issue.

Tomcat is not generating a crash report.

I used jstack and the Thread Dump Analyzer (TDA).

TDA displayed the message '54% of all threads are sleeping on a monitor'.

The description reads: 'This might indicate they are waiting for some external resource (e.g. database) which is overloaded or not available, or are just waiting to get to do something (idle threads).'

We didn't modify any DB code, and the number of connections on the DB also looks fine.

How exactly can we find which resources these threads are waiting on?

The percentage is also increasing consistently.
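To see concretely how many threads are stuck, you can count thread states straight from a jstack dump and compare dumps taken a few minutes apart. A minimal sketch follows; the dump content below is an illustrative stand-in, and in practice you would generate the file with `jstack -l <tomcat-pid> > threaddump.txt`:

```shell
# Create an illustrative two-thread dump; a real one comes from jstack.
cat > threaddump.txt <<'EOF'
"http-8080-1" daemon prio=10 tid=0x1 nid=0x2 waiting on condition
   java.lang.Thread.State: WAITING (on object monitor)
"http-8080-2" daemon prio=10 tid=0x3 nid=0x4 runnable
   java.lang.Thread.State: RUNNABLE
EOF

# Count total threads and those blocked or waiting on a monitor.
total=$(grep -c '^"' threaddump.txt)
waiting=$(grep -cE 'Thread.State: (WAITING|TIMED_WAITING|BLOCKED)' threaddump.txt)
echo "$waiting of $total threads waiting/blocked"
```

If the waiting count climbs from one dump to the next while the total stays flat, that supports the "waiting on an external resource" reading rather than normal idle pool threads.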
CEHJCommented:
I already did that. A couple of days after the first crash, I updated the JDK to 1.7.
Why, in that case, is the report showing that a 1.5 JVM is being used?

sasidhar1229Author Commented:
That crash report was generated on 20th September 2013.

At that time, we were using the 1.5 JVM.

A few days after that crash, we changed the JVM to 1.7.
Sathish David Kumar NArchitectCommented:
Check your JAVA_HOME environment variable.
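A quick way to check, sketched under the assumption that Tomcat is started from the same shell environment (note that Tomcat also honors JRE_HOME and any override in bin/setenv.sh):

```shell
# Show what JAVA_HOME resolves to, and which java is on the PATH.
echo "JAVA_HOME=${JAVA_HOME:-<not set>}"
( command -v java && java -version 2>&1 | head -n 1 ) || echo "java not on PATH"

# The java binary the *running* Tomcat process actually uses:
ps -ef | grep '[c]atalina' | awk '{print $8}'
```

If the `ps` output shows a 1.5 binary while JAVA_HOME points at 1.7, the startup scripts are picking up a different JVM than you expect.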
CEHJCommented:
Please post the current report. It's pointless posting the report of a JVM you're no longer even running.
sasidhar1229Author Commented:
Please post the current report.

It's not generating any crash reports.

Do you mean a thread dump?

Please find the thread dump attached.
14thOct13-08-26-22-pm.txt
sasidhar1229Author Commented:
Check your JAVA_HOME environment variable.
It's pointing to the right JVM.
Radek BaranowskiFull-stack Java DeveloperCommented:
Did the number of users increase before the day the crashes started to occur?
Has the DB become more loaded, which could affect its performance?

It seems to me like a massive deadlock on your server, which leads to resource saturation and an eventual crash. If your system had previously run on the verge of clogging, a small increase in load or user sessions might have led to the crashes.
CEHJCommented:
Please also post the latest Tomcat log file.
Sharon SethCommented:
Most of the threads are waiting on org.apache.tomcat.util.net.JIoEndpoint's worker threads. A little googling points out that 'waiting on' is normal for this class, which waits for incoming requests from users. How many user requests is the server configured to handle?

A couple of questions:
-- What happens when you say the server crashes? Is the app non-responsive, does the app die, or what?
-- Is there a pattern to when you see this? Does it happen some time after you start the server, or when performing certain actions in the app?
sasidhar1229Author Commented:
How many user requests is the server configured to handle?
2000 connections.

What happens when you say the server crashes? Is the app non-responsive, does the app die, or what?
The application dies without generating any crash report, and the OS itself stops responding to SSH, although I can still ping the server. After a few minutes (about 5), I can SSH again.

Is there a pattern to when you see this? Does it happen some time after you start the server, or when performing certain actions in the app?
Yes, it happens about 24 hours after restarting the server.
CEHJCommented:
Sounds like CPU starvation, memory starvation, or both. Run top in batch mode and examine the log file:
http://www.inmotionhosting.com/support/website/server-usage/using-the-linux-top-command-in-batch-mode
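A minimal sketch of that, assuming GNU top; the log path here is an assumption, and the commented form is what you would leave running in the background before the crash window:

```shell
TOP_LOG=./top-batch.log

# One snapshot right now, appended to the log:
top -b -n 1 >> "$TOP_LOG"

# In production, sample every 30 seconds in the background instead, e.g.:
#   nohup top -b -d 30 >> "$TOP_LOG" 2>&1 &

wc -l < "$TOP_LOG"   # confirm the log is being written
```

Start the logging before Tomcat so the file captures the memory climb leading up to the crash, not just the aftermath.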
sasidhar1229Author Commented:
Please find attached the output of top in batch mode and a Thread Dump Analyzer screenshot.
topbatchmode.png
TDA.png
CEHJCommented:
PFA of top in batch mode
Not useful, I'm afraid. You did notice that Tomcat/java are not even listed? Just attach ${TOP_LOG} to the question. Of course, it will need to contain a snapshot taken at the appropriate time (when you're in the danger zone).
sasidhar1229Author Commented:
The OS is killing the Java process. We looked through the log messages and found that the OOM killer killed the Java process.

But we didn't add or modify any code in the app. How can we identify the problem?
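To confirm the OOM-killer diagnosis and see how much memory headroom is left, these checks may help; the log path assumes CentOS defaults:

```shell
# Kernel messages left by the OOM killer (path is the CentOS default):
grep -i 'oom-killer\|out of memory\|killed process' /var/log/messages 2>/dev/null || true

# The same information from the kernel ring buffer:
dmesg 2>/dev/null | grep -i 'killed process' || true

# Current RAM and swap usage:
free -m
```

The OOM-killer entries also record each process's memory footprint at kill time, which shows whether the JVM itself grew or some other process squeezed it out.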
Radek BaranowskiFull-stack Java DeveloperCommented:
Consider your performance measurements and system scaling.

If, as you say, nothing changed in the config, code, or machine, then the reason must lie somewhere nearby:
1. Other programs on the same machine are consuming more resources (ask why).
2. Your application started receiving more requests, hence bigger memory consumption (more user sessions), running into an out-of-memory condition, which eventually triggers the OOM killer.

No wonders here.
CEHJCommented:
there we found that OOM Killer killed the java process.
But that's no reason to have an empty top log file. You can start the top logging before you start Tomcat. Then you can observe:

a. how quickly the memory usage climbs
b. whether, in fact, other programs on the same machine were consuming more resources
sasidhar1229Author Commented:
Please find attached the TOP_LOG file from the time of the Tomcat crash.
TOP-LOG
CEHJCommented:
I think you ought to run that (and the whole question) by the Linux TAs, especially ones familiar with Nautilus and (probably) the GNOME desktop. Nautilus seems to be using a large number of threads and, at times, a lot of CPU. You need to check that out, I think.
sasidhar1229Author Commented:
We added additional swap space, and it has been working for the last 4 days without any crash.

sasidhar1229Author Commented:
The OOM killer was killing the Java process, so we added additional swap space to the OS. That solved the problem.
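For reference, the stopgap looks roughly like this; a sketch to be run as root, where /swapfile and the 4 GB size are assumptions to be sized for your workload. Note that extra swap buys time, but it is still worth finding what is consuming the additional memory:

```shell
# Create and enable a 4 GB swap file (run as root).
dd if=/dev/zero of=/swapfile bs=1M count=4096
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# Persist across reboots:
echo '/swapfile swap swap defaults 0 0' >> /etc/fstab

swapon -s   # verify the new swap is active
```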