Can I speed up Apache POI when working with xlsx files?

FLEA2007
FLEA2007 used Ask the Experts™
on
I have been using POI for a few years to generate excel files for users.  I've recently had to move from xls files (HSSF) to xlsx files (XSSF) due to the number of rows in a particular export.  Unfortunately, this has resulted in a massive performance loss.  

My use case usually involves a template file that I open and create sheet(s) inside.  Generally there will be some pivot charts and/or other static data in these templates which is why my test uses an empty 'template' file, if I didn't say this someone would suggest using streaming, which would be nice, but I can't lose my template sheets (or the styles on them).

So, here's a fairly simple demonstration:

//import org.apache.poi.hssf.usermodel.HSSFWorkbook;
//import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;

import java.io.*;
import java.util.*;

public class ExcelTest {
	static public int NUM_ROWS = 10000;
	static public int NUM_COLS = 25;
	
	private Workbook workbook;
	private Sheet cSheet;
	private boolean useHSSF;
	
	public ExcelTest(boolean useHSSF) {
		this.useHSSF = useHSSF;
	}
	
	public boolean Test() {
		boolean tbr = true;
		try {
			//Open template xlsx file.  This is a single workbook with a single, empty sheet:
			this.workbook = WorkbookFactory.create(new FileInputStream((this.useHSSF ? "data/test.xls" : "data/test.xlsx")));
			
			//generate a new sheet:
			this.cSheet = this.workbook.createSheet();
			
			//write NUM_ROWS rows
			for (int i = 0; i < NUM_ROWS; i++) {
				Row r = this.cSheet.createRow(i);
				
				//write NUM_COLS columns
				for (int j = 0; j < NUM_COLS; j++) {
					Cell c = r.getCell(j);
					if (c == null) {
						c = r.createCell(j);
					}
					c.setCellValue("TEST " + i + ":" + j);
				}
			}
		} catch (Exception e) {
			tbr = false;
			e.printStackTrace(System.out);
		}
		
		return tbr;
	}
	
	public boolean Export(String filename) {
		boolean tbr = true;
		try {
			FileOutputStream out = new FileOutputStream(filename);
			this.workbook.write(out);
			out.close();
		} catch (Exception e) {
			tbr = false;
			e.printStackTrace(System.out);
		}
		
		return tbr;
	}
	
	static public void main(String[] args) {
		long start;
		boolean didTest;
		ExcelTest et;
		System.out.println("Starting test...");
		
		start = System.currentTimeMillis();
		et = new ExcelTest(true);
		didTest = et.Test();
		if (didTest) {
			et.Export("TEST.xls");
		}
		System.out.println("XLS (HSSF) version took " + (System.currentTimeMillis() - start) + " ms.");
		
		start = System.currentTimeMillis();
		et = new ExcelTest(false);
		didTest = et.Test();
		if (didTest) {
			et.Export("TEST.xlsx");
		}
		System.out.println("XLSX (XSSF) version took " + (System.currentTimeMillis() - start) + " ms.");
	}
}

Open in new window


You will notice that even for a relatively-small data set of 10k rows with 25 cols, the xls version takes ~2 seconds, while the xlsx version takes ~20 seconds.  Does anyone know how/if I can speed up POI's performance when working with large xlsx files?

Thanks!
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Top Expert 2015

Commented:
Does the same happen with antivirus off for directory where you export?
xlsx is a zip, it takes second extra to compress and another for antivirus to decompress..

Author

Commented:
I've requested that this question be deleted for the following reason:

Misunderstanding in issue
Top Expert 2015

Commented:
Can you detail a bit?
Commented:
We went with the streaming option as we found the correct documentation where this can be accomplished with a template.

https://poi.apache.org/apidocs/org/apache/poi/xssf/streaming/SXSSFWorkbook.html

Author

Commented:
Found solution in official documentation

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial