Solved

Out of Memory Errors

Posted on 2004-03-29
Last Modified: 2007-12-19
Hello Experts,

We just launched a new server and we are having a lot of performance-related problems. Our environment is as follows.

uname -a
SunOS behemoth 5.9 Generic sun4u sparc SUNW,Sun-Fire-880

java -version
java version "1.4.2_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)

RAM: 4096M

 ./httpd -version
Server version: Apache/1.3.27 (Unix)

Tomcat version 4.1, XSLT, OSCache.

After starting the server, the load goes up very high (currently: load average: 1.25) and the webserver is very slow. It eventually dies if the site becomes very busy. I saw the following error in the Tomcat logs:

Unexpected Signal : 10 occurred at PC=0xFECD6270
Function=[Unknown. Nearest: JVM_ArrayCopy+0x5600]
Library=/usr/local/j2sdk1.4.2_03/jre/lib/sparc/server/libjvm.so


Dynamic libraries:
0x10000         /usr/local/j2sdk1.4.2_03/bin/java
0xff370000      /usr/lib/libthread.so.1
0xff3a0000      /usr/lib/libdl.so.1
0xff280000      /usr/lib/libc.so.1
0xff360000      /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1
0xfec00000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/server/libjvm.so
0xff240000      /usr/lib/libCrun.so.1
0xff210000      /usr/lib/libsocket.so.1
0xfeb00000      /usr/lib/libnsl.so.1
0xfebd0000      /usr/lib/libm.so.1
0xff1f0000      /usr/lib/libsched.so.1
0xff270000      /usr/lib/libw.so.1
0xfeae0000      /usr/lib/libmp.so.2
0xfeac0000      /usr/lib/librt.so.1
0xfeaa0000      /usr/lib/libaio.so.1
0xfea70000      /usr/lib/libmd5.so.1
0xfea50000      /usr/platform/SUNW,Sun-Fire-880/lib/libmd5_psr.so.1
0xfea10000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/native_threads/libhpi.so
0xfe9c0000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libverify.so
0xfe970000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libjava.so
0xfe950000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libzip.so
0xfbc20000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libnet.so
0xf9be0000      /usr/lib/nss_files.so.1

Heap at VM Abort:
Heap
 par new generation   total 8128K, used 8064K [0x75800000, 0x76000000, 0x76000000)
  eden space 8064K, 100% used [0x75800000, 0x75fe0000, 0x75fe0000)
  from space 64K,   0% used [0x75ff0000, 0x75ff0000, 0x76000000)
  to   space 64K,   0% used [0x75fe0000, 0x75fe0000, 0x75ff0000)
 concurrent mark-sweep generation total 2088960K, used 1929831K [0x76000000, 0xf5800000, 0xf5800000)
 concurrent-mark-sweep perm gen total 16384K, used 9840K [0xf5800000, 0xf6800000, 0xf9800000)

Local Time = Sun Mar 28 03:22:11 2004
Elapsed Time = 118113
#
# HotSpot Virtual Machine Error : 10
# Error ID : 4F530E43505002EF 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.2_03-b02 mixed mode)


Any help will be greatly appreciated. Thanks in advance.

Tomcat is started with the following options :

CATALINA_OPTS="-server -Xms2048m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC "



- Ian








Question by:kamarja
 
LVL 5

Expert Comment

by:twobitadder
ID: 10706552
Try

-XX:+UseParallelGC

for the throughput collector.
It uses several threads to perform minor collections. Your eden space is saturated, so this could help.
 
LVL 5

Accepted Solution

by:
twobitadder earned 400 total points
ID: 10706618
I say this because:

Function=[Unknown. Nearest: JVM_ArrayCopy+0x5600]

implies that there could be a problem copying from eden to the 'to' space.


Read this as well:
http://java.sun.com/docs/hotspot/gc1.4.2/

Also consider using:
-XX:+UseAdaptiveSizePolicy
so that the JVM tries to judge a new size for the young generation after each minor collection.

You could also adjust the ratio between the young and tenured generations with:
-XX:NewRatio=<n>
where <n> is the size of the tenured generation relative to the young generation (young:tenured = 1:<n>).
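To make the NewRatio arithmetic concrete, here is a minimal sketch. The class and method names are made up for illustration, and the 2048 MB figure is simply the -Xms/-Xmx value from the question. It relies only on the documented meaning of NewRatio: the young generation gets 1/(n+1) of the combined young and tenured space. For comparison, the heap dump in the question shows a young generation of only 8128K next to a 2088960K tenured generation.

```java
// Sketch: how -XX:NewRatio splits a fixed-size heap between young and tenured.
// NewRatio=n means tenured:young = n:1, so young = heap / (n + 1).
// Names here are illustrative only; perm gen is not part of this split.
final class NewRatioSketch {

    /** Young generation size in MB for a given total heap and NewRatio. */
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Tenured generation size in MB: whatever the young generation leaves. */
    static long tenuredMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    public static void main(String[] args) {
        long heapMb = 2048; // assumed from -Xms2048m -Xmx2048m in the question
        for (int ratio = 1; ratio <= 4; ratio++) {
            System.out.println("NewRatio=" + ratio
                    + " young=" + youngMb(heapMb, ratio) + "MB"
                    + " tenured=" + tenuredMb(heapMb, ratio) + "MB");
        }
    }
}
```

A smaller NewRatio gives short-lived request objects more room to die in the young generation before they are promoted, at the cost of a smaller tenured generation.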
 

Author Comment

by:kamarja
ID: 10706766
Thanks for the quick response. So you suggest that I start Tomcat like this:


"-server -Xms2048m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy"



 

Author Comment

by:kamarja
ID: 10706855
Just wondering whether -XX:+UseConcMarkSweepGC and -XX:+UseParallelGC play well together?

Thanks.
 
LVL 5

Expert Comment

by:twobitadder
ID: 10707137
Nope. Concurrent mark-sweep tries 'to reduce the time taken to collect the tenured generation'. That is the generation of objects that have passed from short term to longer term, which isn't the kind of object you'll primarily be dealing with in the case of your short-lived web connections.

Try it without -XX:+UseConcMarkSweepGC; it's a different garbage collector.

Use concurrent mark-sweep when objects reside longer in the system and a high percentage of the heap is tenured.
 

Author Comment

by:kamarja
ID: 10707184
Thanks. I will try it and let you know. It seems that we are having long pause times when the GC is working.
 
LVL 5

Expert Comment

by:twobitadder
ID: 10707412
By the way, signal 10 is SIGBUS, which flags a data bus error. I think this can occur from a young generation overflow, but I'm not sure.
 

Author Comment

by:kamarja
ID: 10707707
Hey,

It looks a lot happier right now and the site seems much faster. The load is stable and the memory usage is much better. We currently have 3 GB of free memory, whereas with the previous config we would have about 500 MB free. It usually crashes more in the morning, so I am going to watch it until tomorrow and let you know. Thanks so much.

I am starting Tomcat with:
CATALINA_OPTS="-server -Xmx2048m -Xmx2048m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails"
Ian
 
LVL 5

Expert Comment

by:twobitadder
ID: 10708074
luck :P
 

Author Comment

by:kamarja
ID: 10710810
Oops, not out of the woods yet. The webserver is again extremely slow, and memory usage and load are high. I found this in the catalina.out log.

Look how long each collection takes now.
Before, it was:

[Full GC 9648K->9560K(27712K), 0.2556779 secs]
[GC 19736K->12858K(27648K), 0.0046218 secs]
[GC 20396K->11729K(27968K), 0.0184282 secs]
[GC 21713K->15950K(29504K), 0.0217567 secs]
[GC 23246K->14443K(31424K), 0.0294165 secs]
[Full GC 14443K->14219K(31424K), 0.2671838 secs]

At approximately midnight it is:

[Full GC 1791224K->1791224K(1944192K), 11.8895081 secs]
[Full GC 1791224K->1791224K(1944192K), 11.8294296 secs]
[Full GC 1791224K->1791008K(1944192K), 12.1928545 secs]
[Full GC 1791224K->1791133K(1944192K), 11.9237863 secs]


 
LVL 5

Expert Comment

by:twobitadder
ID: 10712389
I think there's probably some kind of memory leak in one of the Java components. Maybe try something like JProfile to find the problem.
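While setting up a profiler, a crude way to watch for a leak is to log heap occupancy from inside the JVM at intervals. This is only a sketch: the class name and output format are made up, and it uses nothing beyond the standard Runtime methods. If the "used" figure keeps climbing even right after full collections, a leak is likely.

```java
// Sketch: periodic heap snapshots using only java.lang.Runtime.
// Illustrative names; in a webapp this would run in a background thread
// or be triggered from a diagnostic page.
final class HeapUsageSketch {

    /** Heap currently in use, in KB: committed heap minus its free part. */
    static long usedKb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / 1024;
    }

    /** One log line with used, committed and maximum heap in KB. */
    static String snapshot() {
        Runtime rt = Runtime.getRuntime();
        return "heap used=" + usedKb() + "K"
                + " committed=" + rt.totalMemory() / 1024 + "K"
                + " max=" + rt.maxMemory() / 1024 + "K";
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000); // sample once per second for the demo
        }
    }
}
```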
 

Author Comment

by:kamarja
ID: 10712920
Thanks. Will do.
 
LVL 5

Expert Comment

by:twobitadder
ID: 10712993
JProbe not JProfile sorry.

http://www.quest.com/jprobe/index.asp

They have a free trial download.
 

Author Comment

by:kamarja
ID: 10713587
Oh, OK, thanks. I also found this error in catalina.out; not sure if it's related though:

WebappClassLoader: Lifecycle error : CL stopped

and

SEVERE: Error in action code
java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at org.apache.jk.common.ChannelSocket.send(ChannelSocket.java:457)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:654)
    at org.apache.jk.server.JkCoyoteHandler.action(JkCoyoteHandler.java:435)
    at org.apache.coyote.Response.action(Response.java:222)
    at org.apache.coyote.Response.finish(Response.java:343)
    at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:314)
    at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:387)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:673)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:615)
    at org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:786)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:666)
 

Author Comment

by:kamarja
ID: 10714852
Hey,

Do you think I could decrease the initial heap size as a workaround while we try to find the memory leak? Do you think it would help?
FROM:
"-server -Xms2048m -Xmx2048m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails"

TO:
"-server -Xmx512m -Xmx512m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails"

Thanks.

Ian
 
LVL 5

Expert Comment

by:twobitadder
ID: 10716029
For the broken pipe, check:
http://archives.real-time.com/pipermail/tomcat-users/2003-January/091849.html
It could be the interrupted download described there.

I don't know the Tomcat classes, but WebappClassLoader has a stop() method and I don't know why it's being called. A printStackTrace would be needed to pinpoint the reason.

General docs:
http://jakarta.apache.org/tomcat/tomcat-5.0-doc/catalina/docs/api/org/apache/catalina/loader/WebappClassLoader.html
Lifecycle docs:
http://jakarta.apache.org/tomcat/tomcat-5.0-doc/catalina/docs/api/org/apache/catalina/Lifecycle.html

I'm sorry, but I don't know the reason for the problem. My best guess is a memory leak that leaves objects referenced so that they fill up the heap.

I think decreasing the initial heap size will just degrade performance more by using slower hard disk swap space. Perhaps increasing it would help a little if there is some swapping going on at the moment, but this wouldn't solve your problem.
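As an illustration of the kind of leak I mean: a cache that only ever grows keeps every entry reachable, so no collector can free it. I'm not claiming this is what OSCache or your application does. It is only a sketch of one common fix, a size-bounded LRU map built on the standard LinkedHashMap. The class name and the limit are made up, and the generics need a newer JDK than 1.4.2.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a size-bounded LRU cache. Once the limit is reached, the least
// recently used entry is dropped, so the cache cannot fill the heap.
// Illustrative only; not thread-safe without external synchronization.
final class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    BoundedCacheSketch(int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; true evicts the eldest entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, byte[]> cache = new BoundedCacheSketch<String, byte[]>(2);
        cache.put("page-a", new byte[1024]);
        cache.put("page-b", new byte[1024]);
        cache.get("page-a");                 // touch page-a so page-b becomes eldest
        cache.put("page-c", new byte[1024]); // evicts page-b
        System.out.println(cache.keySet());
    }
}
```

Bounding by entry count is the simplest policy; if individual entries can be several megabytes, bounding by total bytes would be the safer choice.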

 

Author Comment

by:kamarja
ID: 10716731
Thanks, you did point me in the right direction. I think we may have found some large responses from a few of our PHP files that use XML. They are over 5 MB each. We'll see what happens when we reduce the amount of data that they return.

Thanks for the pointers and your time.

- Ian
 
LVL 5

Expert Comment

by:twobitadder
ID: 10716753
I suppose it's a kind of memory leak :P