Solved

Out of Memory Errors

Posted on 2004-03-29
Last Modified: 2007-12-19
Hello Experts,

We just launched a new server and we are having a lot of performance-related problems. Our environment is as follows.

uname -a
SunOS behemoth 5.9 Generic sun4u sparc SUNW,Sun-Fire-880

java -version
java version "1.4.2_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)

RAM: 4096M

 ./httpd -version
Server version: Apache/1.3.27 (Unix)

Tomcat version 4.1, XSLT, OSCache.

After starting the server, the load climbs very high (currently: load average: 1.25), the web server becomes very slow, and it eventually dies when the site gets very busy. I saw the following error in the Tomcat logs:

Unexpected Signal : 10 occurred at PC=0xFECD6270
Function=[Unknown. Nearest: JVM_ArrayCopy+0x5600]
Library=/usr/local/j2sdk1.4.2_03/jre/lib/sparc/server/libjvm.so


Dynamic libraries:
0x10000         /usr/local/j2sdk1.4.2_03/bin/java
0xff370000      /usr/lib/libthread.so.1
0xff3a0000      /usr/lib/libdl.so.1
0xff280000      /usr/lib/libc.so.1
0xff360000      /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1
0xfec00000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/server/libjvm.so
0xff240000      /usr/lib/libCrun.so.1
0xff210000      /usr/lib/libsocket.so.1
0xfeb00000      /usr/lib/libnsl.so.1
0xfebd0000      /usr/lib/libm.so.1
0xff1f0000      /usr/lib/libsched.so.1
0xff270000      /usr/lib/libw.so.1
0xfeae0000      /usr/lib/libmp.so.2
0xfeac0000      /usr/lib/librt.so.1
0xfeaa0000      /usr/lib/libaio.so.1
0xfea70000      /usr/lib/libmd5.so.1
0xfea50000      /usr/platform/SUNW,Sun-Fire-880/lib/libmd5_psr.so.1
0xfea10000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/native_threads/libhpi.so
0xfe9c0000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libverify.so
0xfe970000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libjava.so
0xfe950000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libzip.so
0xfbc20000      /usr/local/j2sdk1.4.2_03/jre/lib/sparc/libnet.so
0xf9be0000      /usr/lib/nss_files.so.1

Heap at VM Abort:
Heap
 par new generation   total 8128K, used 8064K [0x75800000, 0x76000000, 0x76000000)
  eden space 8064K, 100% used [0x75800000, 0x75fe0000, 0x75fe0000)
  from space 64K,   0% used [0x75ff0000, 0x75ff0000, 0x76000000)
  to   space 64K,   0% used [0x75fe0000, 0x75fe0000, 0x75ff0000)
 concurrent mark-sweep generation total 2088960K, used 1929831K [0x76000000, 0xf5800000, 0xf5800000)
 concurrent-mark-sweep perm gen total 16384K, used 9840K [0xf5800000, 0xf6800000, 0xf9800000)

Local Time = Sun Mar 28 03:22:11 2004
Elapsed Time = 118113
#
# HotSpot Virtual Machine Error : 10
# Error ID : 4F530E43505002EF 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.2_03-b02 mixed mode)


Any help will be greatly appreciated. Thanks in advance.

Tomcat is started with the following options :

CATALINA_OPTS="-server -Xms2048m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC "



- Ian








Question by:kamarja
18 Comments
 
LVL 5

Expert Comment

by:twobitadder
ID: 10706552
Try

-XX:+UseParallelGC

for the throughput collector.
It uses several threads to perform minor collections, and your eden space is saturated (100% used in the heap dump), so this could help.
 
LVL 5

Accepted Solution

by:
twobitadder earned 1600 total points
ID: 10706618
I say this because:

Function=[Unknown. Nearest: JVM_ArrayCopy+0x5600]

implies that there could be a problem copying from eden to the 'to' region.


read this also:
http://java.sun.com/docs/hotspot/gc1.4.2/

and consider using:
-XX:+UseAdaptiveSizePolicy to let the JVM re-evaluate the young generation size after each minor collection.

You could also adjust the ratio between the young and tenured generations with:
-XX:NewRatio=<n> (the tenured generation is sized at n times the young generation)
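
To make the NewRatio arithmetic concrete, here is a minimal sketch, assuming the usual HotSpot rule that the young generation gets heap / (NewRatio + 1). The class and method names are hypothetical, and the calculation ignores the permanent generation and any adaptive resizing, so treat the numbers as rough estimates only.

```java
/** Rough young/tenured split implied by -XX:NewRatio (hypothetical helper, estimates only). */
final class NewRatioSketch {

    /** Young generation size in KB, assuming young = heap / (newRatio + 1). */
    static long youngGenKb(long heapKb, int newRatio) {
        if (newRatio < 1) {
            throw new IllegalArgumentException("newRatio must be >= 1");
        }
        return heapKb / (newRatio + 1);
    }

    /** Tenured generation size in KB: whatever the young generation does not take. */
    static long tenuredKb(long heapKb, int newRatio) {
        return heapKb - youngGenKb(heapKb, newRatio);
    }

    public static void main(String[] args) {
        long heapKb = 2048L * 1024L; // -Xmx2048m, as used in this thread
        for (int ratio = 1; ratio <= 4; ratio++) {
            System.out.println("NewRatio=" + ratio
                    + " young=" + youngGenKb(heapKb, ratio) + "K"
                    + " tenured=" + tenuredKb(heapKb, ratio) + "K");
        }
    }
}
```

For the 2048 MB heap in this thread, NewRatio=3 would give roughly 512 MB to the young generation, compared with the 8128K young generation shown in the heap dump above.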
 

Author Comment

by:kamarja
ID: 10706766
Thanks for the quick response. So you suggest that I start Tomcat like this:


"-server -Xms2048m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy"



 

Author Comment

by:kamarja
ID: 10706855
Just wondering if -XX:+UseConcMarkSweepGC and -XX:+UseParallelGC play well together?

Thanks.
 
LVL 5

Expert Comment

by:twobitadder
ID: 10707137
No. The concurrent mark-sweep collector tries 'to reduce the time taken to collect the tenured generation', that is, the generation of objects that have survived long enough to move from short term to longer term. Those are not the objects you will mainly be dealing with for short-lived web connections.

Try it without -XX:+UseConcMarkSweepGC; it's a different garbage collector, so the two flags should not be combined.

Use concurrent mark-sweep when objects live longer in the system and a high percentage of the heap is tenured.
 

Author Comment

by:kamarja
ID: 10707184
Thanks. I will try it and let you know. It seems that we are having long pause times while the GC is working.
 
LVL 5

Expert Comment

by:twobitadder
ID: 10707412
By the way, signal 10 is SIGBUS, which flags a data bus error. I think this can occur when the young generation overflows, but I'm not sure.
 

Author Comment

by:kamarja
ID: 10707707
Hey,

It looks a lot happier right now and the site seems much faster. The load is stable and memory usage is much better. We currently have 3 GB of free memory, whereas with the previous config we had about 500 MB free. The server usually crashes more in the morning, so I am going to watch it until tomorrow and let you know. Thanks so much.

I am starting Tomcat with:
CATALINA_OPTS="-server -Xmx2048m -Xmx2048m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails"
Ian
 
LVL 5

Expert Comment

by:twobitadder
ID: 10708074
Good luck :P
 

Author Comment

by:kamarja
ID: 10710810
Oops, not out of the woods yet. The web server is again extremely slow, and memory usage and load are high. I found this in the catalina.out log.

Look how long GC takes now.
Before, it was:

[Full GC 9648K->9560K(27712K), 0.2556779 secs]
[GC 19736K->12858K(27648K), 0.0046218 secs]
[GC 20396K->11729K(27968K), 0.0184282 secs]
[GC 21713K->15950K(29504K), 0.0217567 secs]
[GC 23246K->14443K(31424K), 0.0294165 secs]
[Full GC 14443K->14219K(31424K), 0.2671838 secs]

At approximately midnight, it is:

[Full GC 1791224K->1791224K(1944192K), 11.8895081 secs]
[Full GC 1791224K->1791224K(1944192K), 11.8294296 secs]
[Full GC 1791224K->1791008K(1944192K), 12.1928545 secs]
[Full GC 1791224K->1791133K(1944192K), 11.9237863 secs]
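
One way to read these numbers is to compute how much each collection actually frees. The following is only a rough sketch: the class name GcLogLine is hypothetical, and the pattern is an assumption that matches lines shaped exactly like the ones quoted here, not every format the JVM can print.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses simple GC log lines of the shape quoted in this thread (hypothetical helper). */
final class GcLogLine {
    // Example input: [Full GC 1791224K->1791224K(1944192K), 11.8895081 secs]
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    private final boolean full;
    private final long beforeKb;
    private final long afterKb;
    private final long capacityKb;
    private final double seconds;

    private GcLogLine(boolean full, long beforeKb, long afterKb, long capacityKb, double seconds) {
        this.full = full;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.capacityKb = capacityKb;
        this.seconds = seconds;
    }

    /** Returns the parsed line, or null if the text does not match the expected shape. */
    static GcLogLine parse(String text) {
        Matcher m = LINE.matcher(text.trim());
        if (!m.matches()) {
            return null;
        }
        return new GcLogLine("Full GC".equals(m.group(1)),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)), Double.parseDouble(m.group(5)));
    }

    boolean isFull() { return full; }

    /** Heap freed by this collection, in KB. */
    long reclaimedKb() { return beforeKb - afterKb; }

    /** Fraction of the heap still occupied after the collection. */
    double occupancyAfter() { return (double) afterKb / capacityKb; }

    double seconds() { return seconds; }

    public static void main(String[] args) {
        String[] samples = {
            "[GC 19736K->12858K(27648K), 0.0046218 secs]",
            "[Full GC 1791224K->1791224K(1944192K), 11.8895081 secs]"
        };
        for (String s : samples) {
            GcLogLine g = parse(s);
            System.out.println((g.isFull() ? "full " : "minor ")
                    + "reclaimed=" + g.reclaimedKb() + "K"
                    + " occupancyAfter=" + g.occupancyAfter()
                    + " secs=" + g.seconds());
        }
    }
}
```

For the midnight lines, each full collection takes about 12 seconds and frees almost nothing while the heap stays above 90% occupied, which suggests that nearly everything in the heap is still reachable.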


 
LVL 5

Expert Comment

by:twobitadder
ID: 10712389
I think there's probably some kind of memory leak in one of the Java components. Maybe try a profiler such as JProfile to find the problem.
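
Before reaching for a profiler, a crude first check is to log heap usage from inside the application and see whether it keeps climbing. This is a minimal sketch using only java.lang.Runtime; the class name HeapWatch and the five one-second samples are arbitrary assumptions, and it is no substitute for a real profiler.

```java
/** Logs coarse heap usage figures taken from java.lang.Runtime (hypothetical helper). */
final class HeapWatch {

    /** Bytes currently occupied in the heap, including garbage not yet collected. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper bound the heap may grow to, roughly the -Xmx setting. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    static String summary() {
        long mb = 1024L * 1024L;
        return "heap used=" + (usedBytes() / mb) + "M max=" + (maxBytes() / mb) + "M";
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; in a server this would run on a background timer instead.
        for (int i = 0; i < 5; i++) {
            System.out.println(summary());
            Thread.sleep(1000);
        }
    }
}
```

If the used figure only ever grows between full collections, that is consistent with a leak, but it does not say which objects are responsible.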
 

Author Comment

by:kamarja
ID: 10712920
Thanks. Will do.
 
LVL 5

Expert Comment

by:twobitadder
ID: 10712993
JProbe, not JProfile, sorry.

http://www.quest.com/jprobe/index.asp

They have a free trial download.
 

Author Comment

by:kamarja
ID: 10713587
Oh - ok, thanks. I also found this error in catalina.out, not sure if it's related though:

WebappClassLoader: Lifecycle error : CL stopped

and

SEVERE: Error in action code
java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at org.apache.jk.common.ChannelSocket.send(ChannelSocket.java:457)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:654)
    at org.apache.jk.server.JkCoyoteHandler.action(JkCoyoteHandler.java:435)
    at org.apache.coyote.Response.action(Response.java:222)
    at org.apache.coyote.Response.finish(Response.java:343)
    at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:314)
    at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:387)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:673)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:615)
    at org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:786)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:666)
 

Author Comment

by:kamarja
ID: 10714852
Hey,

Do you think I could decrease the initial heap size as a workaround while we try to find the memory leak? Would it help?
FROM:
"-server -Xms2048m -Xmx2048m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails"

TO:
"-server -Xmx512m -Xmx512m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails"

Thanks.

Ian
 
LVL 5

Expert Comment

by:twobitadder
ID: 10716029
For the broken pipe check:
http://archives.real-time.com/pipermail/tomcat-users/2003-January/091849.html
It could be the interrupted download described.

I don't know the Tomcat classes, but WebappClassLoader has a stop() method and I don't know why it's being called; a printStackTrace call would be needed to pinpoint the reason.

General docs:
http://jakarta.apache.org/tomcat/tomcat-5.0-doc/catalina/docs/api/org/apache/catalina/loader/WebappClassLoader.html
Lifecycle docs:
http://jakarta.apache.org/tomcat/tomcat-5.0-doc/catalina/docs/api/org/apache/catalina/Lifecycle.html

I'm sorry, but I don't know the reason for the problem. My best guess is a memory leak: objects that are still referenced keep filling up the heap.

I think decreasing the initial heap size will just degrade performance more by using slower hard disk swap space. Perhaps increasing it will help a little if there is some swapping going on at the moment, but this wouldn't solve your problem.
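
To illustrate the kind of leak guessed at above: a map used as a cache that is never trimmed keeps every entry reachable, so the collector can never free it. The sketch below shows one common remedy, a size-bounded LRU map built on java.util.LinkedHashMap. The class name BoundedCache and the limit are hypothetical, and nothing in this thread confirms that a cache was the actual culprit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical bounded cache: once maxEntries is exceeded, the least recently
 * used entry is evicted so it becomes eligible for garbage collection.
 * Not thread-safe; wrap with Collections.synchronizedMap for shared use.
 */
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true keeps entries in LRU order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so that "b" becomes the eldest entry
        cache.put("c", "3"); // evicts "b"
        System.out.println(cache.keySet());
    }
}
```

An unbounded HashMap in the same role would hold every entry for the life of the JVM, which would look much like the full collections that free nothing in the log above.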

 

Author Comment

by:kamarja
ID: 10716731
Thanks - you did point me in the right direction. I think we may have found some large responses from a few of our PHP files that use XML. They are over 5 MB. We'll see what happens when we reduce the amount of data that they return.
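
The thread does not say how the responses were reduced. As one possible safeguard on the Java side, the sketch below refuses to buffer a response larger than a chosen limit. This is an assumption of mine rather than what was actually done here, and the class name CappedReader and the limit value are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Reads a stream fully into memory, but gives up once a size limit is exceeded (hypothetical helper). */
final class CappedReader {

    static byte[] readCapped(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            // Stop before the buffered copy grows past the limit.
            if (out.size() + n > maxBytes) {
                throw new IOException("response exceeds " + maxBytes + " bytes");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

A caller could pass a limit such as 1024 * 1024 and log or reject anything larger, so that a few multi-megabyte XML documents cannot fill the heap under load.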

Thanks for the pointers and your time.

- Ian
 
LVL 5

Expert Comment

by:twobitadder
ID: 10716753
I suppose it's a kind of memory leak :P