troubleshooting Question

Excel VBA: How to load whole text (CSV) file in memory and parse strings

Avatar of Igor Spryzhkov
Igor SpryzhkovFlag for Russian Federation asked on
Productivity AppsMicrosoft ExcelDocument Management
4 Comments1 Solution6241 ViewsLast Modified:
Hello!

First at all ... please sorry for my English!

Data file

I have CSV file (text file with TAB delimiters). It's a XER file used by Primavera P6 Professional v8 for import/export data.

First task: How to load XER file into Workbook?

Mostly XER file size is not big: 0.5 – 3.3 MB (5200 – 17000 rows). Maximum filesize (seldom) is 20 MB (160 000 rows).

And I think it's have sense to load whole file in memory and then parse it.
How to do it?

Next task is Parsing

1. XER file is a text file with TAB delimited columns.

2. XER file consist from block of date.  Each block have 3 parts:
First line (started from "%T") of block is Table header. Second line ("%F") is Fields headers. And next rows contain data ("%R").

Example:
%T	RCATVAL
%F	rsrc_catg_id	rsrc_catg_type_id	seq_num	rsrc_catg_short_name	rsrc_catg_name	parent_rsrc_catg_id
%R	56	22	0	Yes	Yes	
%R	27968	22	0	Yes	Yes	
%R	28020	27776	10	Yes	Description
%T	NEXT_TABLE
%F new fields set
%R A B C
     
3. Each table have regular structure.

4. Fields sets in each table (text block) are different.

5. Values of data are variant (but stored in XER file as text)

6. XER file not contain blank rows or special data blocks separator.

7. Last row of XER file is "%E".

My plane is:
A. Parse XER file in memory row by row.
B. When new table ("%T") appears create new Sheet.
C. Then write fields headers ("%F") and rows ("%R")

Questions

1. What is more efficient method for split row (by TAB) to cells (when you don't know columns count, at least for first row "%F" of table)?

2. Maybe if we deal with large data files (160 000 rows) have sence to load whole file (20 MB) into memory, split to tables (block of data) via RegExp and then parse block to rows (or somehow paste them into sheets as array).

3. How to determine total rows count for showing form with progress bar?

4. How to determine Excel version for setup rows limit? --> Global Const EXCEL_ROW_LIMIT = 65536 (or 1048576)

P.S. Now I have some parser (attached) but this one hav¿ slow performance (because in load procedure used approach "symbol by symbol"). Source of XERparser is Oracle Support web-site (access only by suscription). Parser description is here: http://www.plannertuts.com/tutorials/using-primaveras-xer-file-parser-import-wbs-from-excel-without-the-sdk-840/
Difference 2003/2007 versions of Parser is only setup "Global Const EXCEL_ROW_LIMIT = 65536 (or 1048576)" (as far as I know)

Resume

There is base XER Parser (from Oracle) and my task is to optimize performance for load date.

Excel 2003 compatible code needed

Any help will be high appreciated!
xerFileParserBuilder-2003-and-20.zip
Join the community to see this answer!
Join our exclusive community to see this answer & millions of others.
Unlock 1 Answer and 4 Comments.
Join the Community
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 1 Answer and 4 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros