Solved

Why is filling cells in excel so slow when using Strings

Posted on 2010-09-02
18
721 Views
Last Modified: 2012-05-10
I use the following small simplified VBA macro to fill alot of cells with data.

The following line is the interesting one:

mArray(j, i) = i & "a" & j

if I assign a unique string as above the runtime is 5 seconds for 10000 rows and 20 seconds for 20000 rows
if I assign a constant string such as "a"  the runtime is 0.2 seconds for 10000 rows and 0.4 seconds for 20000 rows
if I assign a number (i *j) the runtime is similar to the one for constant strings


So,

1) why does it take so much longer to insert unique Strings
2) why is there an exponetial growth when intering unique Strings
3) Is there any work around possible for this problem?

Thanks in advance,

Finaris



Source Code:

Public Sub test1()
Dim t1 As Currency
Dim t2 As Currency
Dim mArray As Variant
Const rows As Long = 20000

t1 = Timer

 Application.ScreenUpdating = False
 mArray = ActiveSheet.Range("A1:L" & rows)
 'Fill your array
 For i = 1 To 12
     For j = 1 To rows
        mArray(j, i) = i & "a" & j
     Next j
 Next i
 ActiveSheet.Range("A1:L" & rows).FormulaR1C1 = mArray
 t2 = Timer
 Debug.Print "Execution completed in " & (t2 - t1) & "sec"
 
 Application.ScreenUpdating = True
End Sub
0
Comment
Question by:finaris
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 9
  • 5
  • 3
  • +1
18 Comments
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33588212
Perhaps it's the concatenate that is causing the increase in time, although I don't see why.  Have you tired to concatenate static strings and test how long that takes?

myArray(j,i) = "A" & "B" & "C"
myArray(j,i) = 1 & 3
0
 

Author Comment

by:finaris
ID: 33588392
The original algorithm on which the example is based reads different strings without concatenation from a file.

So the concatenation is not the problem.

The time is consumed when filling the values into the cells, which can also be seen using debug outputs using Timer
0
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33588418
I will add there is no reason to use a loop here, which is the most inefficient part of your code.  A range object in Excel is essentially an array, so you can simply write to it directly using your formula.

I have edited your test code below.  Run as is it takes just over 1 second.  If you uncomment the .Copy/PasteSpecial, this jumps to between 50-60 seconds.  My guess is that the PasteSpecial is using a loop to perform the replacements.

WC
Public Sub test1()
Dim t1 As Currency
Dim t2 As Currency
Dim mArray As Variant
Dim mRng As Range
Const rows As Long = 20000

t1 = Timer

 Application.ScreenUpdating = False
 Set mRng = ActiveSheet.Range("A1:L" & rows)
 mRng.Formula = "=COLUMN(" & Cells(mRng.Row, mRng.Column).Address(False, False) & ") & ""a"" & ROW(" & Cells(mRng.Row, mRng.Column).Address(False, False) & ")"
 'mRng.Copy
 'mRng.PasteSpecial xlPasteValues
t2 = Timer
 Debug.Print "Execution completed in " & (t2 - t1) & "sec"
 
 Application.ScreenUpdating = True
End Sub

Open in new window

0
Online Training Solution

Drastically shorten your training time with WalkMe's advanced online training solution that Guides your trainees to action. Forget about retraining and skyrocket knowledge retention rates.

 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33588486
If you read your list of strings in as a 2-dimensional array, you should be able to write it out as well without using a loop.

Is your file a .csv?  Can you upload an example of the file being read?
0
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33588544
Here's a better question for you.

When you read in this file of strings, how are you storing it?
0
 

Author Comment

by:finaris
ID: 33588592
@ WarCrimes: My problem is not to write coordinates to cells. I have insert strings into cells, which i cannot calculate with formulas. Code above is a simplification to show the problem.

My file is not CSV, its a proprietary format:

[FIle Header]
[section header 1]
a0[TAB]b0[TAB]c0[TAB]d0
a1[TAB]b1[TAB]c1[TAB]d1
a2[TAB]b2[TAB]c2[TAB]d2
[section header 1]
a0[TAB]b0[TAB]c0[TAB]d0
a1[TAB]b1[TAB]c1[TAB]d1
a2[TAB]b2[TAB]c2[TAB]d2
...

0
 
LVL 18

Assisted Solution

by:Cory Vandenberg
Cory Vandenberg earned 400 total points
ID: 33588862
That is why I wasn't concerned with the filetype but how you are storing it once you read it in.  I am guessing you have read it in succesfully, right?  How are you storing it?  In a worksheet?  If so, it's already in a range object.  There should be no need to loop.

I noticed something I've never investigated before though.

Using your test code I tested writing the following values:
 mRng = "=""A"""    - This took 0.5 seconds
 mRng = "A"           ' This took 75 seconds

So it seems it doesn't have anything to do with whether the string is constant or not, but how you are writing it to the Range object.

According to this link, your test loop shouldn't really be causing large delays either, as this states it can do 100,000 records in about 0.1 seconds.

http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/


Using your exact code, I tested the following
        mArray(j, i) = "=""a"""   - 1 second
        mArray(j, i) = "a"           - 75 seconds
        mArray(j, i) = i & "a" & j - 58 seconds
0
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33589601
I restarted my machine thinking there was some memory leakage going on.  Good news is that

mRng = "A" is now performing faster than
mRng = "=""A"""

as I would have expected.  Both are taking less than 0.2 seconds.

Your code is still slow for any value I try, even the

mArray(j,i) = "=""a""" which was really fast before.

Since you are going to be reading in a dataset anyways, I think it's best to leave the benchmarking for another question.  You are really here to figure out how to read your data faster, right?

If you are reading in the data successfully, then you need to read it into VBA as a Variant array.  If it's stored in a worksheet, this is easy, simply copy the Range.  When you're writing, write to a Range object.  This should reduce the overhead as much as possible.  Here is a link I think might help.  I have used this site before when researching Excel's calculation times.

http://www.decisionmodels.com/VersionCompare.htm
0
 
LVL 7

Expert Comment

by:philip m o'brien
ID: 33589821
May also be worth looking at this page for ideas: http://stackoverflow.com/questions/1489533/process-large-textfiles-quickly-w-vba if you're importing the data in the first place.
0
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33589836
I've done some further testing and the loop is not the issue.  It's how you are writing the data.

In your code you have

 mArray = ActiveSheet.Range("A1:L" & rows)

This is setting the variable mArray as a Range object (which can be treated as a 2-D array).  But when you do

mArray(j,i) = <whatever value>

it actually is performing a write from VBA to Excel, which is a lot more overhead than if you were just writing to an array in VBA.

The code below is pretty efficient for constants.  It is still slow for cases like

mArray(j,i) = i & "a" & j

but I believe this is because of the overhead involved with reading the iterators.  That is all I can come up with.

To speed up your procedure, make sure you are storing the strings in a VBA array, not in the Range object.  Once you have your array populated, then write to the Range.

WC
Public Sub test2()
Dim t1 As Currency
Dim t2 As Currency
Dim t3 As Currency
Const rows As Long = 20000
Dim mArray As Variant: ReDim mArray(1 To rows, 1 To 12)

t1 = Timer

 Application.ScreenUpdating = False
 'Fill your array
 For i = 1 To 12
     For j = 1 To rows
        'mArray(j, i) = i & "A" & j
        mArray(j, i) = "A"
     Next j
 Next i
 t2 = Timer
 ActiveSheet.Range("A1:L" & rows) = mArray
 t3 = Timer
 Debug.Print "Loop completed in " & (t2 - t1) & "sec"
 Debug.Print "Write completed in " & (t3 - t2) & "sec"
 Debug.Print "Execution completed in " & (t3 - t1) & "sec"
 
 Application.ScreenUpdating = True
 'Set mArray = Nothing
End Sub

Open in new window

0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 33591546
mArray is not a Range, it's a Variant, which ends up as a 2D array. Writing to the array does not write to the sheet.

@Finaris,
Which version of Excel are you using?
0
 

Author Comment

by:finaris
ID: 33594352
@rorya I use Excel 2003 SP3

@ WarCrimes I think you still do not understand the problem.

-> Writing constant String values into the sheet is already FAST with my Sub
-> I read from a proprietary file format, written by another application. No workbook!
-> Reading the data is not the problem, here is NO exponetial growth, and the time spent is little in comparison to intertin the data into te sheet.


=> if I assign a unique string (i & "A" & j will give a unique string for each cell), then the runtime is 5 seconds for 10000 rows and 20 seconds for 20000 rows

The amount of cells increases by 2, the amount of required time increases by 4 ! => Exponential growth!

I need:

- A workaround for this, OR a fix of the problem
0
 
LVL 85

Accepted Solution

by:
Rory Archibald earned 100 total points
ID: 33595042
I don't believe there is a workaround or fix for this, as I think it is due to the way the BIFF (the Excel file format) works. If you put the same string into multiple cells, you effectively create a shared formula record, whereby each cell simply holds a pointer to the actual formula and the formula itself is only stored once. If you enter different strings then obviously you have to store each formula in the record for each cell, which is slower.
Regards,
Rory
0
 

Author Comment

by:finaris
ID: 33595566
I think i figured it out now.

The following 2 lines nearly do exactly the same, the only thing is that the formula notation is much faster, 20 times for my 20k rows example

        mArray(j, i) = i & "a" & j
        mArray(j, i) = "=" & Chr(34) & i & "a" & j & Chr(34)
0
 

Author Closing Comment

by:finaris
ID: 33595583
see my last comment, figured out myself with the most rewarded hint
0
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33596391
Actually Rory, yes it's a Variant, but as soon as it is set equal to a Range, it's a Range, and when you assign an element of that range a value, it's written to the sheet.  This can be seen by simply removing the Application.ScreenUpdating = False.   You're point about shared formulas is right on though.  Thanks for adding that.

It's pretty obvious from my test that writing to an array in VBA is much faster than writing to a Range object which VBA has to write back to Excel.  

Anyways, I'm not really going to waste any more time on this.

Glad you are satisfied with a workable solution.

Cheers,
WC
0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 33596497
But it's not 'Set' - that's the point. It's:
mArray = ActiveSheet.Range("A1:L" & rows)
and NOT:
Set mArray = ActiveSheet.Range("A1:L" & rows)


The former is equivalent to:
mArray = ActiveSheet.Range("A1:L" & rows).Value


and mArray is not a Range object.
0
 
LVL 18

Expert Comment

by:Cory Vandenberg
ID: 33597009
You are correct.  My apologies.

It's still about 3x or more slower using that line in the code, and not just the loop is slower either, but the write as well.

        mArray(j, i) = "A"  'with the line setting mArray = range
Loop completed in 0.957sec
Write completed in 0.75sec
Execution completed in 1.707sec

        mArray(j, i) = "A"  'without the line setting mArray = range
Loop completed in 0.2187sec
Write completed in 0.3086sec
Execution completed in 0.5273sec

        mArray(j, i) = i & "A" & j  'with line setting mArray = range
Loop completed in 2.9531sec
Write completed in 2.6719sec
Execution completed in 5.625sec

        mArray(j, i) = i & "A" & j 'without line setting mArray = range
Loop completed in 0.5938sec
Write completed in 0.8906sec
Execution completed in 1.4844sec

Granted, my sample size is only about 5 runs each, but all were similar.
0

Featured Post

Instantly Create Instructional Tutorials

Contextual Guidance at the moment of need helps your employees adopt to new software or processes instantly. Boost knowledge retention and employee engagement step-by-step with one easy solution.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When you see single cell contains number and text, and you have to get any date out of it seems like cracking our heads.
Do you use a spreadsheet like Microsoft's Excel?  Have you ever wanted to link out to a non excel file on your computer or network drive?  This is the way I found to do it!
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.
Many functions in Excel can make decisions. The most simple of these is the IF function: it returns a value depending on whether a condition you describe is true or false. Once you get the hang of using the IF function, you will find it easier to us…

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question