[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 5690
  • Last Modified:

OutOfMemoryError when using apache POI API to write Excel file

I am using apache POI API in a Servlet to write data into an Excel (.xls) file then send to browsers. It work fine when the data is not much. But it give an java.lang.OutOfMemoryError when the data volume is high.

Then I have try JExcel API, it is more lightweight compare to POI where it is can support higher volume of data. But there is still a limit (26k of rows with 25 columns).

The main reason I found is that both API use the same mechanism to write Excel file: create an object to represent workbook --> add Sheets --> add Cells --> write to OutputStream.

so the workbook object will encapsule all data before it was write to OutputStream. This size of object will grow very fast when there is lots of data until OutOfMemoryError occurs.

Any suggstion to solve this problem besides configure JVM heap size(Since it will still have a limit) ? Thanks in advance.
0
b612_forever
Asked:
b612_forever
  • 6
  • 3
1 Solution
 
bloodredsunCommented:
0
 
bloodredsunCommented:
0
 
bloodredsunCommented:
If you were doing this for a commercial project and had the budget, you could try ExtenXLs http://www.extentech.com/estore/product_detail.jsp?product_group_id=1 or Actuate http://www.actuate.com/services/eservices/offerings.asp#javaed. They may be a bit expensive but currently they appear to be better featured than POI/HSSF, but I have no idea whether they could deal with large datasets where POI doesn't.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
b612_foreverAuthor Commented:
I am still new in Apache POI, it that possible to write the workbook part by part to OutputStream?

which means the flow will be something like this:

1. insert 5000 rows into workbook
2. write workbook to OutputStream
3. repeat step 1 and 2 until all rows written.

I have try this with JExcel but it end up with client get an corrupted file (because it rewrite the whole workboook to OutputStream everytime) and the OutOfMemoryException will still happen (because the workbook object still do not release any resources all cells that already written to OutputStream).
0
 
bloodredsunCommented:
If you use " hssfworkbook.write(new FileOutputStream(filename));" then I don't think so, although I'll admit it might be possible as I've never tried it and I've never seen any examples do it this way either.

 I seem to remember that there is some issue with the OLE format that prevents you from doing this, which is a great pity as it would at least free up the memory.

I found that changing the memory settings with as much RAM as I could get away with was the best answer. The worst increase in memory usage seemed to be just at the writing stage where the memory usage would just double as the app tried to create the file.

They do know about this memoery problem as in the docs you'll find:
-------------
What's Next?

Further effort on HSSF is going to focus on the following major areas:

    * Performance: POI currently uses a lot of memory for large sheets.
...
--------------


0
 
b612_foreverAuthor Commented:
thanks for ur suggestion.

I had do some reading and is now considering 2 alternative:
1) send data to user as CSV
2) send data to user in HTML table format but the content-type of "application/vnd.ms-excel"

both work with huge amount of data since It can send data into OutputStream while it loop through the data to be send. However, both have the same limitation: MS Excel can only support 65,536 row per sheet.

Is there any way to write break data into different Sheets (not file) in a CSV file or the HTML table way?

Thank You.
0
 
bloodredsunCommented:
Are far as your options go:
1./ is a valid option as Excel can read CSV natively although you will lose any formatting and CSV doesn't support multiple sheets, but at least it's nowhere near as memory intensive as creating an xls file.
2./I know that excel can read XML files but I'm not sure about HTML tables. If that's a valid option then it sounds like a good idea.
But if you use XML rather than HTML you shoud be able to produce multiple sheets if you use the correct schema. Have a look here http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexcl2k2/html/odc_xlsmlinss.asp, apparently the only things that the XML file cannot support are:
# OLE objects
# Drawing shapes
# Charts
# Microsoft® Visual Basic® for Applications (VBA) code
# Groups and outlines
# Custom envelope information

Which means that you shoud be able to support multiple sheets (fingers crossed!)
0
 
b612_foreverAuthor Commented:
hi bloodredsun, Thank You for your suggestion on XML. After reading on the MSDN link u post, I am now finally able to export my data into Excel file that is more than 1 sheet.

Just to shared with u, I am finding this link useful, (all I do basically is just change the ASP into Java):
http://www.experts-exchange.com/Web/Web_Languages/XML/Q_20919841.html

how ever, this solution create a file with larger file size compare to the normal Excel file. But it is good enough for me. Thank You.
0
 
bloodredsunCommented:
Thanks dude, and cheers for the link
0
 
ittsiCommented:
Not sure if others are still looking at this post 5 years later but here is the real solution to this problem using POI:

http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/usermodel/examples/BigGridDemo.java
0

Featured Post

How to Use the Help Bell

Need to boost the visibility of your question for solutions? Use the Experts Exchange Help Bell to confirm priority levels and contact subject-matter experts for question attention.  Check out this how-to article for more information.

  • 6
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now