
Fast Data Push to Excel


Introduction

One of my clients needed to import, massage, and push some data into an Excel workbook.  He was comfortable with the VB classic environment and felt that he could accomplish the massaging of data once it was in an array, but he wasn't sure how to import the data.  Our main production application has a few data import routines that would do the trick, but I thought that this was simple enough to warrant creating a quickie application for him.  I would handle the I/O and create an event where he could write his data massage code.  Multiple files would need to be imported, in a single pass or multiple passes, before the next two steps would happen.  I suggested that we package the code inside an Excel environment, but he was more comfortable with the stand-alone VB environment.
This article walks through the code development process.  I also share the discoveries I made that greatly improved the performance of pushing the data into the workbook.

The Data
Seeing the data made me realize the importing process would be relatively straightforward.  The first line in each file was an integer with the number of actual data lines that followed.  The data lines were two tab-delimited number columns (single-precision data).  Example:
4501
1	19.7585
1.666	24.8874
2.332	27.468
2.998	25.9469


Importing the Data

This was almost trivially easy.  For the user interface (UI), I created a form with drive, directory, and file list controls.  I added the usual event code for the drive and directory controls' Change events to cascade each change to the next control down the logical hierarchy.  I packaged the data-related events into three command buttons and added a label to display the progress of the event code.
[Screenshot: Data Import form]
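The cascade wiring mentioned above is the usual pair of one-line Change events (assuming the default control names Drive1, Dir1, and File1):

```vb
Private Sub Drive1_Change()
  'cascade the selected drive down to the directory control
  Dir1.Path = Drive1.Drive
End Sub

Private Sub Dir1_Change()
  'cascade the selected directory down to the file list control
  File1.Path = Dir1.Path
End Sub
```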
Option Explicit

Dim sngData() As Single

Private Sub cmdImport_Click()
  Dim intFN As Integer
  Dim lngFiles As Long
  Dim lngCount As Long
  Dim lngStart As Long
  Dim lngLines As Long
  
  Me.MousePointer = vbHourglass
  Me.Enabled = False
  
  For lngFiles = 0 To File1.ListCount - 1
    If File1.Selected(lngFiles) Then
      intFN = FreeFile
      Open File1.Path & "\" & File1.List(lngFiles) For Input As intFN
      Input #intFN, lngLines
      lblStatus.Caption = "Reading " & File1.List(lngFiles)
      
      'sngData is a dynamic array, so UBound fails until the first
      'ReDim -- treat an unallocated array as empty
      On Error Resume Next
      lngStart = 0
      lngStart = UBound(sngData, 2)
      On Error GoTo 0
      
      ReDim Preserve sngData(1 To 2, 1 To (lngStart + lngLines))
      
      For lngCount = (lngStart + 1) To (lngStart + lngLines)
        Input #intFN, sngData(1, lngCount), sngData(2, lngCount)
      Next
      Close intFN
    End If
  Next
  
  Me.MousePointer = vbDefault
  Me.Enabled = True
  
  lblStatus.Caption = "Finished Data Import. " & UBound(sngData, 2) & " rows in memory."
End Sub


Notes:

  • The FileListbox control is set to allow multiple file selections.
  • This code is written for the expected two columns.  It can be easily changed to import a different number of columns in the same manner that the Export routine does.
  • Remember that you can only resize the last dimension of an array when using the Preserve keyword.
  • Since the array data structure is defined at the form level, multiple imports will cause the array to grow with each pass's data.
  • With the two sample files, this import process has sub-second performance.  I only see the completed message.
 

Exporting to Excel

The client originally planned to save the results of the multiple imports to a text file and then open that file with Excel.  This seemed like a big inconvenience for the user (him).  Besides, I had wanted to play with this process since I'd used it as a bonus example in one of my earlier articles.

A Note about Early Binding and Late Binding
When developing applications that use MS-Office automation objects that are not present in the development environment, you can define such variables using the type library object or as a general Object data type.  In this case, I'm developing an application in the VB (classic) IDE, not as an Excel macro.  Therefore, I needed to define Excel objects (workbooks, worksheets, ranges, etc.).  The easiest and fastest development happens with early binding, which required me to add a reference to the Microsoft Excel v.r Object Library (v.r are version and release numbers for the library).  Doing this enables the IDE Intellisense feature for these variables and adds a set of constants to the development environment -- really nice behavior for optimal programmer efficiency.

However, the code examples you will see in this article use late binding.  I prefer the late binding advantages for my production applications.  

  • Since the Excel automation object isn't instantiated until the export routine runs, the program starts more quickly. 
  • The CreateObject("Excel.Application") statement works with any version of Office.  If my end user has a different version of Excel, early binding would fail because the type library has a different name.
  • Distribution of the application is much simpler.  I don't have to add instructions to my user on how to add a reference to the Excel library on their PC.
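For comparison, the two binding styles look like this; the early-bound form is what I used during development, and the late-bound form is what ships:

```vb
'Early binding -- requires a project reference to the
'Microsoft Excel v.r Object Library; enables Intellisense
Dim xlApp As Excel.Application
Set xlApp = New Excel.Application

'Late binding -- no reference needed; works with any installed Excel
Dim xlObj As Object
Set xlObj = CreateObject("Excel.Application")
```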
Brute Force is Sloooooow
Despite my best efforts, I couldn't get very good performance out of cell-at-a-time value assignments, as shown in this (now removed) code:
 
'Note: the following is a much slower way to export the data to Excel
  For lngCount = 1 To UBound(sngData, 2)
    rngOut.Offset(lngCount - 1, 0).Value = sngData(1, lngCount)
    rngOut.Offset(lngCount - 1, 1).Value = sngData(2, lngCount)
    rngOut.Offset(lngCount - 1, 2).Formula = "=A" & lngCount & "/B" & lngCount
    If (lngCount Mod 100) = 0 Then
      lblStatus.Caption = "Now at Excel row " & lngCount & " of " & UBound(sngData, 2) & " rows"
      DoEvents
    End If
  Next


When the export code used this push method, it ran for about two minutes.  I certainly didn't want my customer waiting that long (or longer) for the export process to finish.

Decent Performance
A block data transfer was the way to go to get decent performance.  I used the
 
rngChunk.Value = parmxlObj.WorksheetFunction.Transpose(sngData)


statement.  During my tests, this statement raised trappable error 13 (Type mismatch).  When changing the variable's data type to Variant failed to correct the problem, I did a Google search for possible reasons.  It turns out that you can transfer at most 5461 cells with a single Transpose function call.
Reference: http://support.microsoft.com/kb/177991

My first version of the BulkLoad routine used the Transpose function (above), limiting the size of the array to meet the 5461 cell limit.  I populated an array with chunks of data from the imported array.  
 
    'following statement is in a nested loop, one for each dimension
    DataForRange(lngCol, lngLoop) = sngData(lngCol, lngLoop + (lngChunk - 1)) 

    'following statement transfers the chunk of data
    rngChunk.Value = parmxlObj.WorksheetFunction.Transpose(DataForRange)


This reduced the run time from two minutes to under five seconds, which was great.  But I wondered whether there was room for more performance tuning.  

A little investigation into the details of the Transpose function revealed that it did little more than swap the first and second dimensions of a two-dimension array.  In this case, my (1 To 2, 1 to 2500) chunk array data was copied into a (1 to 2500, 1 To 2) array.  Armed with that knowledge, I swapped the dimensions of my chunk array variable (DataForRange) and the order of the indexes used when populating it with my imported data.

By swapping the dimensions with my own code, I was able to eliminate the Transpose function.  This improved the performance again:
  • Removed an unnecessary function call.
  • Removed an Excel object reference required for the transpose method.
  • Removed the need to pass the Excel automation object (parmxlObj) to the BulkLoad routine.
 
Private Sub cmdExport_Click()
  Dim xlObj As Object    'New Excel.Application -- only used with Excel reference
  Dim wkbOut As Object   'Excel.Workbook
  Dim wksOut As Object   'Excel.Worksheet
  Dim rngOut As Object   'Excel.Range
  Dim sngStart As Single  'for performance measurement

  'output to Excel workbook
  lblStatus.Caption = "Begin Excel Data Export"
  Set xlObj = CreateObject("Excel.Application")
  Set wkbOut = xlObj.Workbooks.Add
  Set wksOut = wkbOut.Worksheets("Sheet1")  'can skip this step
  Set rngOut = wksOut.Range("A1")           'by replacing with wkbOut.Worksheets("Sheet1").Range("A1")
  
  Me.MousePointer = vbHourglass
  Me.Enabled = False
  
  xlObj.ScreenUpdating = False
  xlObj.Calculation = -4135     '=xlCalculationManual

  sngStart = Timer
  BulkLoad rngOut, sngData
  
  lblStatus.Caption = "Finished Excel Data Export. (" & Timer - sngStart & " seconds)"

  xlObj.Calculation = -4105     '=xlCalculationAutomatic
  xlObj.ScreenUpdating = True
  xlObj.Visible = True
  
  Set rngOut = Nothing
  Set wksOut = Nothing
  Set wkbOut = Nothing
  Set xlObj = Nothing

  Me.MousePointer = vbDefault
  Me.Enabled = True

End Sub

'================================
'================================
'Note: this version of BulkLoad does not use the Transpose function,
'	but it still loads the data in chunks
Private Sub BulkLoad(parmRange As Object, parmData() As Single)
  Dim DataForRange() As Single    'may need Variant in future
  Dim lngLoop As Long
  Dim lngChunk As Long
  Dim rngChunk As Object
  Dim lngCol As Long
  Dim lngChunkSize As Long
  
  lngChunkSize = 5000 \ UBound(parmData, 1)
  
  If UBound(parmData, 2) > lngChunkSize Then
    ReDim DataForRange(1 To lngChunkSize, LBound(parmData, 1) To UBound(parmData, 1)) 
    For lngChunk = 1 To (UBound(parmData, 2) - (UBound(parmData, 2) Mod lngChunkSize)) Step lngChunkSize
      For lngLoop = 1 To lngChunkSize
        For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
          DataForRange(lngLoop, lngCol) = parmData(lngCol, lngLoop + (lngChunk - 1)) 
        Next
      Next
      Set rngChunk = parmRange.Range(parmRange.Offset(lngChunk - 1, 0), parmRange.Offset(lngChunk - 2 + lngChunkSize, 1))  'exactly lngChunkSize rows
      rngChunk.Value = DataForRange   
      
      lblStatus.Caption = "Now at Excel row " & lngChunk & " of " & UBound(parmData, 2) & " rows"
    Next
    
    'last little chunk (assumes the row count is not an exact
    'multiple of lngChunkSize; a zero remainder would need a guard)
    ReDim DataForRange(1 To (UBound(parmData, 2) Mod lngChunkSize), LBound(parmData, 1) To UBound(parmData, 1))
    For lngLoop = 1 To (UBound(parmData, 2) Mod lngChunkSize)  
      For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
        DataForRange(lngLoop, lngCol) = parmData(lngCol, lngLoop + (lngChunk - 1)) 
      Next
    Next
    Set rngChunk = parmRange.Range(parmRange.Cells(lngChunk, 1), parmRange.Cells(lngChunk - 1 + (UBound(parmData, 2) Mod lngChunkSize), 2))
    rngChunk.Value = DataForRange 
  
  Else
    
    'only little chunk
    lngChunk = 1
    ReDim DataForRange(1 To UBound(parmData, 2), LBound(parmData, 1) To UBound(parmData, 1))
    For lngLoop = 1 To UBound(parmData, 2)  'Mod would drop every row when the count equals lngChunkSize
      For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
        DataForRange(lngLoop, lngCol) = parmData(lngCol, lngLoop + (lngChunk - 1))
      Next
    Next
    Set rngChunk = parmRange.Range(parmRange.Cells(lngChunk, 1), parmRange.Cells(lngChunk - 1 + UBound(parmData, 2), 2))
    rngChunk.Value = DataForRange  
  
  End If
End Sub


cmdExport_Click Notes:

  • For a better UI, prevent the user from doing anything with the form while code is running.
  • Present the user with progress feedback.
  • Some of the comments, such as " 'Excel.Workbook" and " '=xlCalculationManual" are remnants of the development process, which used early binding.
  • With late binding, you don't have access to the constants.
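If you want the readability of the named constants without the type library, you can declare them yourself with the values captured during early-bound development:

```vb
'values copied from the Excel type library constants
Private Const xlCalculationManual As Long = -4135
Private Const xlCalculationAutomatic As Long = -4105
```

Then the export code can read xlObj.Calculation = xlCalculationManual instead of a bare -4135.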
BulkLoad Notes:
  • Instead of 5461, I used 5000 as the maximum transfer size.
  • Data transfer (chunk size) is calculated based on the number of columns in the array.
  • Since the customer might add some columns, I wrote the export routine with a loop instead of a hard-coded number of columns like I did with the import routine.
  • The client can easily pass a different array to the BulkLoad routine.
 

Code Refinements

I had trimmed about a half second off the run time by eliminating the Transpose function.  Then it hit me: with the worksheet function eliminated, I was no longer bound by its 5461 cell limit.  I no longer needed to transfer the data in chunks.  The BulkLoadFast version of the routine performs considerably faster than both the original and the tuned versions of BulkLoad.

Private Sub BulkLoadFast(parmRange As Object, parmData() As Single)
  Dim DataForRange() As Single
  Dim lngLoop As Long
  Dim lngChunk As Long
  Dim rngChunk As Object
  Dim lngCol As Long
  Dim lngChunkSize As Long
  
  lngChunkSize = UBound(parmData, 2)
  ReDim DataForRange(1 To UBound(parmData, 2), LBound(parmData, 1) To UBound(parmData, 1))
  For lngLoop = 1 To lngChunkSize
    For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
      DataForRange(lngLoop, lngCol) = parmData(lngCol, lngLoop)
    Next
  Next
  Set rngChunk = parmRange.Range(parmRange.Offset(0, 0), parmRange.Offset(lngChunkSize - 1, 1))
  rngChunk.Value = DataForRange
End Sub


Note: The cmdExport_Click code is now changed to run this new version.

  sngStart = Timer
  BulkLoadFast rngOut, sngData
  
  lblStatus.Caption = "Finished Excel Data Export. (" & Timer - sngStart & " seconds)"


Conclusion

Sometimes such 'throw-away' projects can prove to be good learning experiences.  Don't shy away from them.  Although the Transpose worksheet function may be necessary in some cases, it would probably be worth your time to code around it.
Running several tests for each export method, the average performance times to push 9002 rows into an Excel workbook from the two 4501 record sample files are:
 
Cell-at-a-time    118.72 secs -- OMG! this is terribly sloooow
Transpose()         4.52 secs -- 5000 cell chunks
BulkLoad            4.07 secs -- direct array transfer chunks
BulkLoadFast        2.81 secs -- a single bulk data transfer


Given these figures, the performance times relative to BulkLoadFast are:
  • 30.96% elapsed time reduction from direct array chunks
  • 37.83% elapsed time reduction from Transpose function chunks
  • 97.63% elapsed time reduction from cell-at-a-time transfer (about 42 times faster)


Even if you have a one-dimensional array, it is worth your time to copy the data into a two-dimensional array and transfer the data directly instead of using the Transpose function.  By default, a one-dimensional array is treated like a single (Excel) row of data.  You need to add a dimension to align with the rows and columns in the target worksheet.
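A sketch of that conversion, reusing the wksOut worksheet variable from the export routine:

```vb
Dim sngVector(1 To 100) As Single   'the original one-dimensional data
Dim sngColumn() As Single
Dim lngLoop As Long

'add a second dimension so each element lands in its own worksheet row
ReDim sngColumn(1 To UBound(sngVector), 1 To 1)
For lngLoop = 1 To UBound(sngVector)
  sngColumn(lngLoop, 1) = sngVector(lngLoop)
Next

wksOut.Range("A1").Resize(UBound(sngVector), 1).Value = sngColumn
```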

When doing office automation, your application takes a performance 'hit' every time it does something with one of the automation object variables.  It would behoove you to minimize their use when performance is a concern.
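The same principle works in the read direction.  Rather than touching cells in a loop, pull the whole block back in one automation call and work on the local copy:

```vb
Dim varData As Variant
Dim lngRow As Long

'a single cross-process call returns the block as a 1-based 2-D array
varData = wksOut.Range("A1:B9002").Value

For lngRow = 1 To UBound(varData, 1)
  'varData(lngRow, 1) and varData(lngRow, 2) -- no automation overhead here
Next
```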

If your data source is accessed through a recordset object (database table or ISAM data driver) you should consider using the CopyFromRecordset range method to transfer the data to Excel.
Reference: http://support.microsoft.com/kb/246335
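A minimal sketch of that method, assuming an already-opened ADO recordset named rst and a late-bound worksheet object like the one in the export routine:

```vb
'pushes every row of the recordset into the worksheet starting at A1
'(field names are not included -- write a header row first if you need one)
wksOut.Range("A1").CopyFromRecordset rst
```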
Author: aikimark
