Solved

Why the Bloat?

Posted on 2011-03-03
24
429 Views
Last Modified: 2012-05-11
About two days ago, we moved a  current project, to a Network Share (Server 2008)  where we could Beta test it.  The file was 4Mb.  After copying it from the Network, back to a local Sandbox, a couple of minor changes were made and re-saved within the Folder.  The file size went from 4Mb to 10Mb.  Fortunately, there was separate back-up of only a couple of hours old, so it wasn't catastrophic.

Since then, we have moved the file back and forth, made changes, etc.  with no bloating issues.  The only difference, may be that I had a couple of Utility Add-Ins, that seemed to have been stripped out or disabled.  But, that shouldn't cause any bloat like this to occur.

What would cause this sudden increase and/or where can one begin to look?  Only a couple of lines of code were changed, within a 15 minute time period,  so are there any specific locations or places to look into that could show such an increase, possibly storing something in a cache?

Thanks,
0
Comment
Question by:Cook09
  • 10
  • 10
  • 2
  • +1
24 Comments
 
LVL 22

Expert Comment

by:rspahitz
ID: 35028204
if you did something to the spreadsheet, it could affect the size.  For example, if you accidentally went to the last row and typed something in, even if you later deleted it, Excel might think that it should save every cell from that row up.

Ctrl+G, [Special...], (*) Last Cell

should get you there so see if the bottom of the sheet is what you think it is.
0
 

Author Comment

by:Cook09
ID: 35028266
But the spreadsheet only had six rows, and there is only 18 cells that had any computations in them, can't see that would would more than double the size of the entire file; affect the memory, maybe, not the entire file.
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35028559
as soon as you change any cell (sometimes even if you cancel) then Excel assumes you want everything up to that point.  So you you change cell A100000, Excel assumes that you want to work with 100000 rows and allocates space for them (all blank.)  If you later remove the contents of A100000, it doesn't always go back and remove the references to the blank rows, although sometimes it will if you close and reopen the spreadsheet.
0
 

Author Comment

by:Cook09
ID: 35029248
OK, for the sake of argument, let's say that is true.  Even after closing and re-opening Excel and the values have been previously deleted.  WHERE, do I go to find out, what it's referencing, and then correct it?
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35029372
That's my original statement

Use Ctrl+G to open up the Goto window
bottom left is a [Special...] button
on the next window is an option for (*) Last Cell.

Select that and click OK.  It will take you to what it thinks is the last cell of the current sheet.  You may need to do that on each sheet.

If it's not that, I suggest exporting each sheet to a new workbook, and for each sheet, check the size of the file where you export it to see if any of them cause the size to jump.  If not, your new workbook will be all better; otherwise, at least you know which sheet is the problem sheet.
0
 

Author Comment

by:Cook09
ID: 35036527
One issue that I found out yesterday, right before going home, is the use of the word "CLEAR."

I had a 3Mb .xlsx file and in the immediate window typed:

Range(Cells(20,1),Cells 100000,1000)).Clear  'I have Excel 2007

The .xlsx was then saved.  It's size went from 3Mb to 274Mb.  I need to now reopen it and find out why it does that.  Rumor has it that if:

Application.PageBreaks = False is used prior to using .Clear that this may not happen.  The question also remains, there should also be something that would reduce the file size based upon something as innocuous as the word .Clear.


0
 
LVL 45

Expert Comment

by:patrickab
ID: 35038932
Ron,

Where have you got to with this problem?

Patrick
0
 

Author Comment

by:Cook09
ID: 35039550
Patrick,
The path forward of the project is, at the moment, to be careful and save often.

 Clearly the .Clear CAN cause bloat to occur, but not every time. The .ClearFormats and .ClearContents never caused the file size to change.  It didn't seem that the Activesheet.DisplayPageBreaks had any effect.  However, the command Application.UsedRange command does work very well in resetting the range, which surprised me.  That one I will keep in my arsenal. Probably set it as a shorcut key.  However, it doesn't reduce the file size.  Even though, one range included all 1,048,576 rows, and it reset it back to where the active cell was located.

 I looked at the XML portions and did see some differences, but am not skilled enough to mess with them yet.  It is interesting that on one page, each row number had a =Hidden, while another just had it on one row and the rest of the rows were blank (in XML).  So Excel clearly has formatted them or documented the formatting differently.  I don't know if that is part of any issue or not.  Someday, I'll explore the XML portion or find someone who would know how Excel determines file size and what one can really do about it.

While the project is stable, I have developed a way to do a "mirrored back-up"  on a sheet-to-sheet basis, so it shouldn't be that difficult to do it on a workbook by workbook basis.  I think.  Fortunately, it is Formula based and not VBA, so if something in VBA causes a bloat, it shouldn't bother the other one.

The file size increase to 274Mb does provide a concern, so caution is the main ingredient at this point.
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35049158
I suspect that since Excel-2007 was the first version to use the XML format that maybe MS didn't quite work out all of the save issues, at least in terms of an optimal file size.  IT would be interesting to know if this same problem occurs in 2010 since that's the 2nd generation.
0
 

Author Comment

by:Cook09
ID: 35056743
I guess we'll see, as of now it is still a mystery.
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35318137
If issue never resolved then delete; otherwise an explanation and either accept 35029372 (if it helped solve the problem) or points refund probably would be good.
0
6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

 

Author Comment

by:Cook09
ID: 35323609
Sorry for the delay, but other concerns arose.  Attached is a fairly stripped down version of a workbook, that still shows over7MB in size.  I have tried almost all of the "Reset Used Cell" procedures that could be found.  None seemed to work.

I know a couple of pages show almost the entire page as a Used Range but cannot delete those rows or columns in order to reset it.  What code is necessary to reduce the file size and/or reset the used range?

Thanks
Histologyv-114r.xlsm
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35328643
First of all, I see several high-quality pictures embedded into the sheets.  That alone will cause the file size to rise.  If you can get away with it, lower the quality of the images so that it saves memory.

Second, on the LogFile and Costs tabs, when I use the Goto LastCell function that I described earlier, it proceeds to the last cell even though everything above it is blank.  To fix this, to the last cell and use Ctrl_Shift+UpArrow (or Ctrl-Home if there's no data on the page) and right-click-delete and shift up.  When you re-open the document, these cells might get be reset.

Further, possibly unrelated, in the Name Manager (Formulas | Name Manager) I see references to things outside the spreadsheet.  Depending on how Excel is interpreting those, it may be taking extra processing time to access those.

Anyway, removing the pictures resulted in major reductions:

Utilities tab (18 pictures): -1,300K
LoginLog tab: -100K
LogFile tab: -400K
Procedures: -100K

Still big at 5MB, but that's 30% less, and reducing the resolution of those could change it from using almost 2MB to maybe using 200KB.  You could also save the pictures as links rather than embedded.

Anyway, I don't really see any useful data in there.  Is this used as a report?  If there are no calculations, then maybe Word would be a better choice.
Also, if the data source is Access, you could use it for the reports.

So to answer the question, at least some of the bloat is caused by large embedded pictures.  The rest is hard to determine in a quick preview, but you can try copying all the pieces and moving to new sheets then deleting the old sheet to solve some of the problems with extra unused cells being saved.  (Of course, do this on a backup since you could have references there that get deleted when you move to a new sheet.
0
 

Author Comment

by:Cook09
ID: 35332834
Yes, I understand what you're saying about the pictures, and if need be, they can be removed.  As it is running on a decent machine, the pictures, in and of them selves don't cause a lot of slow down. The issue is really the Used Range in the LogFile and Costs Sheet.

You mentioned how to reset the Last Cell, and maybe I'm a little thick to fully understand the process, but anything I try, doesn't reset the Last Cell.

There is almost nothing in the workbook. Yet, it is still over 7 MB.  Had I left the code in there, there would have been discussions about how the code is written and I didn't want to go down that path.

What I had to do was totally delete the LogFile and Costs Worksheets and start over on two completely different worksheets.  Yes, it's still around 4.5MB, with thousands of lines of code, about a dozen UserForms, a few more graphic images, and over 20 Modules.  It's performance is also acceptable.  Here we have almost no code with a few graphic images and I can't get it below 7MB.

That's the issue, what is keeping the file size so large with almost nothing in it?  If I delete the LogFile and the Costs Sheet, the File Size reduces to 1.8MB.  Clearly there is something within those two sheets that is adding over 5MB to the overall file size.  If I remove the images from the LogFile and Login Worksheets, it reduces it to 5.5 MB.    To take the images out of the conversation, I've removed them all, and a basically blank workbook is still over 5.5MB.  The question still remains.   Why and How to Reset the Used Range that does work to reduce the workbook size?
Bloat-with-no-pictures.xlsm
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35335348
I don't see why it's so bloated.  As I indicated very early, Excel 2007/2010 handles file creation a different way from the old and apparently there's something holding onto a large block of information only on those sheets.

My recommendation is to rename those sheets, create new ones with the original names, copy / paste all of the parts to the new sheet, then delete the original.
that should leave your macros and images intact and hopefully the bloat will go away.
0
 

Author Comment

by:Cook09
ID: 35336505
Is there someone else within EE that may have had some experience with this?  Surely, I'm not the only one to experience this issue.

I understand the recommendation, but in certain situations that may not be feasible.  As you know, coping and pasting have issues of their own to contend with.
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35337410
I have requested additional assistance; not sure who has experienced this before.

FYI I am seeing the same issue in 2010 so either it carries over with the file or it was never fixed.

(Just a thought...did you try to save as a 2003 xls file?  since you don't care about # of rows or columns, it may fix it, then you can try to save back as 2007 format to see if it remains fixed.)
0
 
LVL 85

Accepted Solution

by:
Rory Archibald earned 500 total points
ID: 35346031
Select the log file sheet and select column A. Format-row height and set to 12.6
Repeat with Costs sheet columns A:B
Save.
File size should be about 50KB.
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35346756
Good job Rory...I missed that. :)
0
 

Author Comment

by:Cook09
ID: 35354983
Rory,
Could you explain why that makes a difference in the bloat?  Fortunately, I just had a finished product jump to 56MB, and through deleting the not needed formatted cells, that somehow appeared, and this method, it went down to 4.5MB.
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35355027
I think the problem is that Excel is tracking the height of every row that is not the default height.  Since there are 1 million rows, it is presumably adding 1 million values to tell it how high to make those rows.  With numbers taking up a few bytes each, times 1 million rows, times two or more sheets, you get a lot of unneeded bloat.
But let's see if Rory has any additional insight.
I suspect that in the future, Microsoft may find a more efficient way to store this information so it doesn't cause so much bloat, but as I indicated, this is a new file format and they're probably still working out the issues with it.
0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 35355289
That's pretty much it. If you have a look at the XML for one of those sheets you will see that it is storing the formatting information for each cell in col A because it is not the default row height (one sheet alone was 56MB unzipped). One of the drawbacks of the big grid.
0
 

Author Closing Comment

by:Cook09
ID: 35355302
Thanks Rory....
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Using Word 2013, I was experiencing some incredible lag when typing.  Here's what worked for me....
In this article we discuss how to recover the missing Outlook 2011 for Mac data like Emails and Contacts manually.
This Micro Tutorial demonstrate the bugs in Microsoft Excel for Mac with Pivot Charts.
This Micro Tutorial will demonstrate in Microsoft Excel how to add style and sexy appeal to horizontal bar charts.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now