ntzanos

asked:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Hello,

I get the exception in the title in an application I have written.
It seems to be caused by a large StringBuffer. The JVM settings I use are -Xms1024m -Xmx1512m and -XX:+AggressiveOpts -XX:+UseParallelGC.

The machine is running Kubuntu 9.04 64-bit and has 2 GB of physical memory.

Any ideas on how I could overcome the problem?

Regards,

Nick
CEHJ

>>Any ideas on how I could overcome the problem?

Probably by using the StringBuffer more carefully; it has some big gotchas. Please post your code.
Try running it in a 32-bit JVM.

Although there are issues with frequent StringBuffer creation and sizing, the problem will mostly lie with String copies created from the buffer, and with memory leaks. First, run a profiler and see whether there are leaks. Also check whether you really need that big a string in memory: consider reading the string in parts through a stream rather than loading the entire string at once. And consider increasing the heap!
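
For example, a minimal sketch of such per-line streaming (assuming an input file path sInputFile; processLine is a hypothetical stand-in for whatever work each line needs):

BufferedReader reader = new BufferedReader(new FileReader(sInputFile));
try {
    String line;
    // readLine() returns null at end of file, so only one line is held in memory at a time
    while ((line = reader.readLine()) != null) {
        processLine(line); // hypothetical per-line handler
    }
} finally {
    reader.close();
}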
ntzanos (ASKER)
The following bit of code is the problematic one.
The error is thrown on the line that reads sb.append(line).

Could you suggest a more elaborate solution or an alternative? I am a bit lost at the moment.

Many thanks

Nick
String sFile = utils.loadFileToStringWithNewlines(sInputFile);
ArrayList<String> alLines = new ArrayList<String>(Arrays.asList(sFile.split("\n")));

for (String sLine : alLines) {
    String[] saTokens = sLine.replaceAll(",,", ", ,").split("\"*,\"*");
    if (saTokens.length < 4)
        continue;
    String sSense = saTokens[0];

    StringBuffer sbDef = new StringBuffer();
    int iCnt = 4;
    while (iCnt < saTokens.length)
        sbDef.append(saTokens[iCnt++]);
    String sDefinition = sbDef.toString();

    String sPolarity = saTokens[2].trim();
    int iPolarity = POLARITY_NONPOLAR;
    if (sPolarity.length() > 0)
        iPolarity = POLARITY_POSITIVE;
    else if (saTokens[3].trim().length() > 0)
        iPolarity = POLARITY_NEGATIVE;

    hSenseToDefinition.put(sSense, sDefinition);
    hSenseToPolarity.put(sSense, iPolarity);
}

// Start of code that prepares for 10-fold cross validation

int giSize = alLines.size();
int tenth = giSize / 10;
LineNumberReader lnr = null;
try {
    lnr = new LineNumberReader(new FileReader(sInputFile));
} catch (FileNotFoundException e) {
    System.err.println("File not found..");
}

StringBuffer sb = new StringBuffer();
HashMap<Integer, String> giChunks = new HashMap<Integer, String>();
int counter = 1;
String line = "";
try {
    while (true) { // The reason this is done is so that I can get the final chunk (which will throw the EOFException and be caught below)
    //while ((String line = lnr.readLine()) != null) {
        line = lnr.readLine();
        if (((lnr.getLineNumber() + 1) % tenth) == 0) {
            giChunks.put(counter++, sb.toString()); // Also increment the counter.
            sb = new StringBuffer();
        }
        sb.append(line);
    }
} catch (IOException e) {
    // put the final tenth of the file into the HashMap
    giChunks.put(counter, sb.toString());
    try {
        lnr.close();
    } catch (IOException ex) {
        Logger.getLogger(GraphTenFold.class.getName()).log(Level.SEVERE, null, ex);
    }
}


> sb = new StringBuffer();

You don't need to create a new one each time; just clear it.

>> sb = new StringBuffer();

Replace with:

sb.delete(0, sb.length());
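
(Equivalently, sb.setLength(0); also clears the buffer. Note that neither call shrinks the buffer's internal character array, so the capacity it has already grown to is retained.)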
ntzanos (ASKER)
I changed it (sb.delete(0, sb.length())) and it still crashes, but now it crashes after 1 min 40 sec instead of the 40 sec it took initially.

That will help, but a much worse use of memory is the following:

>>
String sFile = utils.loadFileToStringWithNewlines(sInputFile);
ArrayList<String> alLines = new ArrayList<String>(Arrays.asList(sFile.split("\n")));
>>

You create a massive String and then split that String with a regex in order to put it into a List, when you could simply have read it into the List in the first place.
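
A minimal sketch of reading straight into the list (assuming the input is a plain text file; utils.loadFileToStringWithNewlines is then no longer needed):

ArrayList<String> alLines = new ArrayList<String>();
BufferedReader reader = new BufferedReader(new FileReader(sInputFile));
try {
    String line;
    while ((line = reader.readLine()) != null) { // readLine() returns null at end of file
        alLines.add(line);
    }
} finally {
    reader.close();
}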
Can you try this?

String sFile = utils.loadFileToStringWithNewlines(sInputFile);
ArrayList<String> alLines = new ArrayList<String>(Arrays.asList(sFile.split("\n")));
StringBuffer sb = new StringBuffer();
StringBuffer sbDef = new StringBuffer();

for (String sLine : alLines) {
    String[] saTokens = sLine.replaceAll(",,", ", ,").split("\"*,\"*");
    if (saTokens.length < 4)
        continue;
    String sSense = saTokens[0];

    if (sbDef.length() > 0) sbDef.delete(0, sbDef.length());

    int iCnt = 4;
    while (iCnt < saTokens.length)
        sbDef.append(saTokens[iCnt++]);
    String sDefinition = sbDef.toString();

    String sPolarity = saTokens[2].trim();
    int iPolarity = POLARITY_NONPOLAR;
    if (sPolarity.length() > 0)
        iPolarity = POLARITY_POSITIVE;
    else if (saTokens[3].trim().length() > 0)
        iPolarity = POLARITY_NEGATIVE;

    hSenseToDefinition.put(sSense, sDefinition);
    hSenseToPolarity.put(sSense, iPolarity);
}

// Start of code that prepares for 10-fold cross validation

int giSize = alLines.size();
int tenth = giSize / 10;
LineNumberReader lnr = null;
try {
    lnr = new LineNumberReader(new FileReader(sInputFile));
} catch (FileNotFoundException e) {
    System.err.println("File not found..");
}

if (sb.length() > 0) sb.delete(0, sb.length());
HashMap<Integer, String> giChunks = new HashMap<Integer, String>();
int counter = 1;
String line = "";
try {
    while (true) { // The reason this is done is so that I can get the final chunk (which will throw the EOFException and be caught below)
    //while ((String line = lnr.readLine()) != null) {
        line = lnr.readLine();
        if (((lnr.getLineNumber() + 1) % tenth) == 0) {
            giChunks.put(counter++, sb.toString()); // Also increment the counter.
            sb.delete(0, sb.length());
        }
        sb.append(line);
    }
} catch (IOException e) {
    // put the final tenth of the file into the HashMap
    giChunks.put(counter, sb.toString());
    try {
        lnr.close();
    } catch (IOException ex) {
        Logger.getLogger(GraphTenFold.class.getName()).log(Level.SEVERE, null, ex);
    }
}
... in fact the finer details are even worse, but the big picture is bad enough
Out of interest - what size is that input file?
Use the following to read the lines of the file into a list:

http://helpdesk.objects.com.au/java/how-do-i-read-a-text-file-line-by-line-into-a-list

You can adapt it to filter your lines as required.
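
For instance, a rough sketch of such a filtered read (an illustration reusing the token test from the code above, not the linked page's exact code):

ArrayList<String> alLines = new ArrayList<String>();
BufferedReader in = new BufferedReader(new FileReader(sInputFile));
try {
    String line;
    while ((line = in.readLine()) != null) {
        // keep only lines that will yield at least 4 tokens, as in the parsing loop above
        if (line.replaceAll(",,", ", ,").split("\"*,\"*").length >= 4) {
            alLines.add(line);
        }
    }
} finally {
    in.close();
}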

ntzanos (ASKER)
The file is about 1 MB.
I will try again and let you all know. Maybe I need to reimplement the code: it is a mixture of my own code and code from another person, so it might be better to start from scratch.

I will keep you informed.

Nick
ASKER CERTIFIED SOLUTION
Mick Barry (Australia)

(The accepted solution is available to Experts Exchange members only.)
ntzanos (ASKER)
Thanks for all the help. I finally decided it would be best to reimplement the code, and now it runs much faster (using advice from everyone here). Thanks a lot.
ntzanos, can you tell us why you accepted that answer, please? Future searchers need to know.