Avatar of FLEA2007
FLEA2007
 asked on

Can I speed up Apache POI when working with xlsx files?

I have been using POI for a few years to generate excel files for users.  I've recently had to move from xls files (HSSF) to xlsx files (XSSF) due to the number of rows in a particular export.  Unfortunately, this has resulted in a massive performance loss.  

My use case usually involves a template file that I open and create sheet(s) inside.  Generally there will be some pivot charts and/or other static data in these templates which is why my test uses an empty 'template' file, if I didn't say this someone would suggest using streaming, which would be nice, but I can't lose my template sheets (or the styles on them).

So, here's a fairly simple demonstration:

//import org.apache.poi.hssf.usermodel.HSSFWorkbook;
//import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;

import java.io.*;
import java.util.*;

public class ExcelTest {
	static public int NUM_ROWS = 10000;
	static public int NUM_COLS = 25;
	
	private Workbook workbook;
	private Sheet cSheet;
	private boolean useHSSF;
	
	public ExcelTest(boolean useHSSF) {
		this.useHSSF = useHSSF;
	}
	
	public boolean Test() {
		boolean tbr = true;
		try {
			//Open template xlsx file.  This is a single workbook with a single, empty sheet:
			this.workbook = WorkbookFactory.create(new FileInputStream((this.useHSSF ? "data/test.xls" : "data/test.xlsx")));
			
			//generate a new sheet:
			this.cSheet = this.workbook.createSheet();
			
			//write NUM_ROWS rows
			for (int i = 0; i < NUM_ROWS; i++) {
				Row r = this.cSheet.createRow(i);
				
				//write NUM_COLS columns
				for (int j = 0; j < NUM_COLS; j++) {
					Cell c = r.getCell(j);
					if (c == null) {
						c = r.createCell(j);
					}
					c.setCellValue("TEST " + i + ":" + j);
				}
			}
		} catch (Exception e) {
			tbr = false;
			e.printStackTrace(System.out);
		}
		
		return tbr;
	}
	
	public boolean Export(String filename) {
		boolean tbr = true;
		try {
			FileOutputStream out = new FileOutputStream(filename);
			this.workbook.write(out);
			out.close();
		} catch (Exception e) {
			tbr = false;
			e.printStackTrace(System.out);
		}
		
		return tbr;
	}
	
	static public void main(String[] args) {
		long start;
		boolean didTest;
		ExcelTest et;
		System.out.println("Starting test...");
		
		start = System.currentTimeMillis();
		et = new ExcelTest(true);
		didTest = et.Test();
		if (didTest) {
			et.Export("TEST.xls");
		}
		System.out.println("XLS (HSSF) version took " + (System.currentTimeMillis() - start) + " ms.");
		
		start = System.currentTimeMillis();
		et = new ExcelTest(false);
		didTest = et.Test();
		if (didTest) {
			et.Export("TEST.xlsx");
		}
		System.out.println("XLSX (XSSF) version took " + (System.currentTimeMillis() - start) + " ms.");
	}
}

Open in new window


You will notice that even for a relatively-small data set of 10k rows with 25 cols, the xls version takes ~2 seconds, while the xlsx version takes ~20 seconds.  Does anyone know how/if I can speed up POI's performance when working with large xlsx files?

Thanks!
JavaMicrosoft ExcelProgramming Languages-OtherProgramming

Avatar of undefined
Last Comment
FLEA2007

8/22/2022 - Mon
gheist

Does the same happen with antivirus off for directory where you export?
xlsx is a zip, it takes second extra to compress and another for antivirus to decompress..
FLEA2007

ASKER
I've requested that this question be deleted for the following reason:

Misunderstanding in issue
gheist

Can you detail a bit?
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
ASKER CERTIFIED SOLUTION
FLEA2007

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
FLEA2007

ASKER
Found solution in official documentation