<

Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x

Understanding Excel File Internals

Published on
12,037 Points
5,137 Views
4 Endorsements
Last Modified:
Approved
You may be aware that Excel (and Office) 2007 has some new file formats using OOXML and this article is aimed at exploring those new formats.

So how does Excel store the file contents - cell data including formulas and formats, conditional formatting, VBA code, etc. etc. Lets investigate. Create a new Excel file MyTest.xlsm and enter some dummy test data in the first sheet. Add some formulas and conditional formatting (if you want to really understand the details).

Sample sheet
Let us now see how excel stores this data in the file. Open the file in notepad or a hex editor. Did you notice the first 2 characters? "PK". So Excel compresses its file contents. Now we know why there is not much difference if you compress an Office 2007 file.

Notepad view
Lets look into the compressed contents. Rename the file extension from .xlsm to .zip

Rename
Open the MyFile.zip file. Wow! its an extensive structure with xml files to store the workbook, worksheets, calculations, sharedstrings, etc.

Folder view
This is how the XML of the Sheet1 looks.

Sheet1 XML
Lets explore more. Lets go back to our original file and add some VBA code to it.

Adding code
Save the file and redo the same steps as earlier to open the xml file structure. We now have another XML file called vbaProject.bin. This is the part that stores the VBA code and any objects such as userforms.

Some more interesting bits. Check how Excel stores its file contents. Look into the file sharedStrings.xml. It stores the strings at a global level, not in the sheet itself. Also look into how it stores the formulas and formatting. Try out things like like dragging a cell content (like Area 1) into 5 cells (producing Area 1, Area 2, Area 3...) and see how it stores the values... using ranges.

That also says that, it may be better to access Excel data directly from these XML files rather than through the Excel application model. We could have much more control and simplified logic for storing (for instance in database), transporting (eg client machine to server), opening and displaying simple content. Many possibilities here...

Well, I was pretty pleased that today I learned something new.

http://www.baniks.com/blog/2010/08/understanding-excel-file-internals.html
4
Comment
Author:gbanik
1 Comment
 
LVL 13

Author Comment

by:gbanik
Yes. thats my blog.
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Join & Write a Comment

This Micro Tutorial will demonstrate how to use a scrolling table in Microsoft Excel using the INDEX function.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa‚Ķ
Other articles by this author

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month