Community Pick: Many members of our community have endorsed this article.
Editor's Choice: This article has been selected by our editors as an exceptional contribution.

Fast Data Push to Excel

aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT
Published:
Updated:

Introduction

One of my clients needed to import, massage, and push some data into an Excel workbook.  He was comfortable with the VB classic environment and felt that he could accomplish the massaging of data once it was in an array, but he wasn't sure how to import data.  Our main production application has a few data import routines that would do the trick, but I thought that this was simple enough to warrant creating a quickie application for him.  I would handle the I/O and create an event where he could write his data massage code.  Multiple files would need be imported, in a single or multiple passes, before the next two steps would happen.  I suggested that we package the code inside an Excel environment, but he was more comfortable with the stand-alone VB environment.
This article walks through the code development process.  I also share the discoveries I made that greatly improved the performance of pushing the data into the workbook.

The Data
Seeing the data made me realize the importing process would be relatively straight forward.  The first line in each file was an integer with the number of actual data lines that followed.  The data lines were two tab-delimited number columns (single-precision data).  Example:
4501
                      1	19.7585
                      1.666	24.8874
                      2.332	27.468
                      2.998	25.9469

Open in new window

 

Importing the Data

This was almost trivially easy.  For the user interface (UI), I created a form with a drive, directory, and filelist controls.  I added the usual event code for the drive and directory controls' change events to cascade the change to the next control down the logical hierarchy.  I packaged the data-related events into three command buttons and added a label to display the progress of the event code.
Data Import Form screenshot 
Option Explicit
                      
                      Dim sngData() As Single
                      
                      Private Sub cmdImport_Click()
                        Dim intFN As Integer
                        Dim lngFiles As Long
                        Dim lngCount As Long
                        Dim lngStart As Long
                        Dim lngLines As Long
                        
                        Me.MousePointer = vbHourglass
                        Me.Enabled = False
                        
                        For lngFiles = 0 To File1.ListCount - 1
                          If File1.Selected(lngFiles) Then
                            intFN = FreeFile
                            Open File1.Path & "\" & File1.List(lngFiles) For Input As intFN
                            Input #intFN, lngLines
                            lblStatus.Caption = "Reading " & File1.List(lngFiles)
                            
                            lngStart = UBound(sngData, 2)
                            If lngStart = 1 Then lngStart = 0
                            
                            ReDim Preserve sngData(1 To 2, 1 To (lngStart + lngLines))
                            
                            For lngCount = (lngStart + 1) To (lngStart + lngLines)
                              Input #intFN, sngData(1, lngCount), sngData(2, lngCount)
                            Next
                            Close intFN
                          End If
                        Next
                        
                        Me.MousePointer = vbDefault
                        Me.Enabled = True
                        
                        lblStatus.Caption = "Finished Data Import. " & UBound(sngData, 2) & " rows in memory."
                      End Sub

Open in new window

Notes:

  • The FileListbox control is set to allow multiple file selections.
  • This code is written for the expected two columns.  It can be easily changed to import a different number of columns in the same manner that the Export routine does.
  • Remember that you can only change the last dimension of an array when using the Preserve keyword
  • Since the array data structure is defined at the form level, multiple imports will cause the array to grow with each pass's data.
  • With the two sample files, this import process has sub-second performance.  I only see the completed message.
 

Exporting to Excel

The client originally planned to save the results of the multiple imports to a text file and then open that file with Excel.  This seemed like a big inconvenience for the user (him).  Besides, I had wanted to play with this process since I'd used it as a bonus example in one of my earlier articles.

A Note about Early Binding and Late Binding
When developing applications that use MS-Office automation objects that are not present in the development environment, you can define such variables using the type library object or as a general Object data type.  In this case, I'm developing an application in the VB (classic) IDE, not as an Excel macro.  Therefore, I needed to define Excel objects (workbooks, worksheets, ranges, etc.).  The easiest and fastest development happens with early binding, which required me to add a reference to the Microsoft Excel v.r Object Library (v.r are version and release numbers for the library).  Doing this enables the IDE Intellisense feature for these variables and adds a set of constants to the development environment -- really nice behavior for optimal programmer efficiency.

However, the code examples you will see in this article use late binding.  I prefer the late binding advantages for my production applications.  

  • Since the Excel automation object doesn't get instantiated until the routine, the program starts quicker. 
  • The CreateObject("Excel.Automation") statement will work for any version of Office.  If my end user has a different version of Excel, early binding will fail since the type library has a different name.
  • Distribution of the application is much simpler.  I don't have to add instructions to my user on how to add a reference to the Excel library on their PC.
Brute Force is Sloooooow
Despite my best efforts, I couldn't get very good performance out of cell-at-a-time value assignments, as shown in this (now removed) code:
 
'Note: the following is a much slower way to export the data to Excel
                        For lngCount = 1 To UBound(sngData, 2)
                          rngOut.Offset(lngCount - 1, 0).Value = sngData(1, lngCount)
                          rngOut.Offset(lngCount - 1, 1).Value = sngData(2, lngCount)
                          rngOut.Offset(lngCount - 1, 2).Formula = "=A" & lngCount & "/B" & lngCount
                          If (lngCount Mod 100) = 0 Then
                            lblStatus.Caption = "Now at Excel row " & lngCount & " of " & UBound(sngData, 2) & " rows"
                            DoEvents
                          End If
                        Next

Open in new window

When the export code used this push method, it ran about two minutes.  I knew I certainly didn't want my customer waiting that long (or longer) for the export process to finish.

Decent Performance
A block data transfer was the way to go to get decent performance.  I used the
 
rngChunk.Value = parmxlObj.WorksheetFunction.Transpose(sngData)

Open in new window

statement.  During my tests, this statement caused a 13 (type mismatch) trappable error.  After a change of variable data type (to Variant) failed to correct the problem, I did a Google search for possible reasons. Turns out that you can only transfer 5461 cell items with a single Transpose function call.
Reference: http://support.microsoft.com/kb/177991

My first version of the BulkLoad routine used the Transpose function (above), limiting the size of the array to meet the 5461 cell limit.  I populated an array with chunks of data from the imported array.  
 
    'following statement is in a nested loop, one for each dimension
                          DataForRange(lngCol, lngLoop) = sngData(lngCol, lngLoop + (lngChunk - 1)) 
                      
                          'following statement transfers the chunk of data
                          rngChunk.Value = parmxlObj.WorksheetFunction.Transpose(DataForRange)

Open in new window

This reduced the run time from two minutes to under five seconds, which was great.  But I wondered if there might be some more performance tuning tweaks.  

A little investigation into the details of the Transpose function revealed that it did little more than swap the first and second dimensions of a two-dimension array.  In this case, my (1 To 2, 1 to 2500) chunk array data was copied into a (1 to 2500, 1 To 2) array.  Armed with that knowledge, I swapped the dimensions of my chunk array variable (DataForRange) and the order of the indexes used when populating it with my imported data.

By swapping the dimensions with my own code, I was able to eliminate the Transpose function.  This improved the performance again:
  • Removed an unnecessary function call.
  • Removed an Excel object reference required for the transpose method.
  • Removed the need to pass the Excel automation object (parmxlObj) to the BulkLoad routine.
 
Private Sub cmdExport_Click()
                        Dim xlObj As Object    'New Excel.Application -- only used with Excel reference
                        Dim wkbOut As Object   'Excel.Workbook
                        Dim wksOut As Object   'Excel.Worksheet
                        Dim rngOut As Object   'Excel.Range
                        Dim sngStart As Single	'forperformance measurement  
                      
                        'output to Excel workbook
                        lblStatus.Caption = "Begin Excel Data Export"
                        Set xlObj = CreateObject("Excel.Application")
                        Set wkbOut = xlObj.Workbooks.Add
                        Set wksOut = wkbOut.Worksheets("Sheet1")  'can skip this step
                        Set rngOut = wksOut.Range("A1")           'by replacing with wkbOut.Worksheets("Sheet1").Range("A1")
                        
                        Me.MousePointer = vbHourglass
                        Me.Enabled = False
                        
                        xlObj.ScreenUpdating = False
                        xlObj.Calculation = -4135     '=xlCalculationManual
                      
                        sngStart = Timer
                        BulkLoad rngOut, sngData
                        
                        lblStatus.Caption = "Finished Excel Data Export. (" & Timer - sngStart & " seconds)"
                      
                        xlObj.Calculation = -4105     '=xlCalculationAutomatic
                        xlObj.ScreenUpdating = True
                        xlObj.Visible = True
                        
                        Set rngOut = Nothing
                        Set wksOut = Nothing
                        Set wkbOut = Nothing
                        Set xlObj = Nothing
                      
                        Me.MousePointer = vbDefault
                        Me.Enabled = True
                      
                      End Sub
                      
                      '================================
                      '================================
                      'Note: this version of BulkLoad does not use the Transpose function,
                      '	but it still loads the data in chunks
                      Private Sub BulkLoad(parmRange As Object, parmData() As Single)
                        Dim DataForRange() As Single    'may need Variant in future
                        Dim lngLoop As Long
                        Dim lngChunk As Long
                        Dim rngChunk As Object
                        Dim lngCol As Long
                        Dim lngChunkSize As Long
                        
                        lngChunkSize = 5000 \ UBound(parmData, 1)
                        
                        If UBound(parmData, 2) > lngChunkSize Then
                          ReDim DataForRange(1 To lngChunkSize, LBound(parmData, 1) To UBound(parmData, 1)) 
                          For lngChunk = 1 To (UBound(parmData, 2) - (UBound(parmData, 2) Mod lngChunkSize)) Step lngChunkSize
                            For lngLoop = 1 To lngChunkSize
                              For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
                                DataForRange(lngLoop, lngCol) = sngData(lngCol, lngLoop + (lngChunk - 1)) 
                              Next
                            Next
                            Set rngChunk = parmRange.Range(parmRange.Offset(lngChunk - 1, 0), parmRange.Offset(lngChunk - 1 + lngChunkSize, 1))
                            rngChunk.Value = DataForRange   
                            
                            lblStatus.Caption = "Now at Excel row " & lngChunk & " of " & UBound(parmData, 2) & " rows"
                          Next
                          
                          'last little chunk
                          ReDim DataForRange(1 To (UBound(parmData, 2) Mod lngChunkSize), LBound(parmData, 1) To UBound(parmData, 1)) 'UBound(sngData, 2) - 1)
                          For lngLoop = 1 To (UBound(parmData, 2) Mod lngChunkSize)  
                            For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
                              DataForRange(lngLoop, lngCol) = sngData(lngCol, lngLoop + (lngChunk - 1)) 
                            Next
                          Next
                          Set rngChunk = parmRange.Range(parmRange.Cells(lngChunk, 1), parmRange.Cells(lngChunk - 1 + (UBound(parmData, 2) Mod lngChunkSize), 2))
                          rngChunk.Value = DataForRange 
                        
                        Else
                          
                          'only little chunk
                          lngChunk = 1
                          ReDim DataForRange(1 To UBound(parmData, 2), LBound(parmData, 1) To UBound(parmData, 1))
                          For lngLoop = 1 To (UBound(parmData, 2) Mod lngChunkSize)
                            For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
                              DataForRange(lngLoop, lngCol) = sngData(lngCol, lngLoop + (lngChunk - 1))
                            Next
                          Next
                          Set rngChunk = parmRange.Range(parmRange.Cells(lngChunk, 1), parmRange.Cells(lngChunk - 1 + UBound(parmData, 2), 2))
                          rngChunk.Value = DataForRange  
                        
                        End If
                      End Sub

Open in new window

cmdExport_Click Notes:

  • For a better UI, prevent the user from doing anything with the form while code is running.
  • Present the user with progress feedback.
  • Some of the comments, such as " 'Excel.Workbook" and " '=xlCalculationManual" are remnants of the development process, which used early binding.
  • With late binding, you don't have access to the constants.
BulkLoad Notes:
  • Instead of 5461, I used 5000 as the maximum transfer size.
  • Data transfer (chunk size) is calculated based on the number of columns in the array.
  • Since the customer might add some columns, I wrote the export routine with a loop instead of a hard-coded number of columns like I did with the import routine.
  • The client can easily pass a different array to the BulkLoad routine.
 

Code Refinements

I had trimmed about a half second off the run time by eliminating the Transpose function.  Then it hit me...with the worksheet function eliminated, I was no longer bound by its 5461 cell limit.  I no longer needed to transfer the data in chunks.  The BulkLoadFast version of the routine performs considerably faster than both the original and the tuned versions of BulkLoad.

Private Sub BulkLoadFast(parmRange As Object, parmData() As Single)
                        Dim DataForRange() As Single
                        Dim lngLoop As Long
                        Dim lngChunk As Long
                        Dim rngChunk As Object
                        Dim lngCol As Long
                        Dim lngChunkSize As Long
                        
                        lngChunkSize = UBound(parmData, 2)
                        ReDim DataForRange(1 To UBound(parmData, 2), LBound(parmData, 1) To UBound(parmData, 1))
                        For lngLoop = 1 To lngChunkSize
                          For lngCol = LBound(parmData, 1) To UBound(parmData, 1)
                            DataForRange(lngLoop, lngCol) = sngData(lngCol, lngLoop)
                          Next
                        Next
                        Set rngChunk = parmRange.Range(parmRange.Offset(0, 0), parmRange.Offset(lngChunkSize - 1, 1))
                        rngChunk.Value = DataForRange
                      End Sub

Open in new window

Note: The cmdExport_Click code is now changed to run this new version.

  sngStart = Timer
                        BulkLoadFast rngOut, sngData
                        
                        lblStatus.Caption = "Finished Excel Data Export. (" & Timer - sngStart & " seconds)"

Open in new window

 

Conclusion

Sometimes such 'throw-away' projects can prove to be good learning experiences.  Don't shy away from them.  Although the Transpose worksheet function may be necessary in some cases, it would probably be worth your time to code around it.
Running several tests for each export method, the average performance times to push 9002 rows into an Excel workbook from the two 4501 record sample files are:
 
Cell-at-a-time	       118.72 secs -- OMG! this is terribly sloooow
                      Transpose()   		4.52 secs -- 5000 cell chunks
                      BulkLoad  		4.07 secs -- direct array transfer chunks
                      BulkLoadFast		2.81 secs -- a single bulk data transfer

Open in new window

Given these figures, the performance times relative BulkLoadFast are:
30.96% elapsed time reduction from direct array chunks
37.83% elapsed time reduction from Transpose function chunks
97.63% elapsed time reduction from cell-at-a-time transfer  (almost 60 times faster)


Even if you have a one-dimensional array, it would be worth your time to copy the data into a two-dimension array and transfer the data directly instead of using the Transpose function.  By default, a one dimension array is treated like a single (Excel) row of data.  You need to add a dimension to align with the rows and columns in the target worksheet.

When doing office automation, your application takes a performance 'hit' every time it does something with one of the automation object variables.  It would behoove you to minimize their use when performance is a concern.

If your data source is accessed through a recordset object (database table or ISAM data driver) you should consider using the CopyFromRecordset range method to transfer the data to Excel.
Reference: http://support.microsoft.com/kb/246335
______________
If you found this article helpful, please click the Yes link to the right.
17
16,206 Views
aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT

Comments (3)

Thank you!
CERTIFIED EXPERT

Commented:
hi Aikimark,

I like the article :-)
It's well written and reminded me of a blog on another site which is where I initially learned some of the techniques for writing from memory to a range and also got some clarification relating to potential issues when using Transpose. I've decided to post the links here as it may be useful for someone when they are troubleshooting...


John Walkenbach's initial blog is http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/ and the comments following the blog include:
- Erik Eckhardt provides a link to the same KB article #177991 as you.
- Patrick O'Beirne goes into a little more detail about a limitation of the array method.
- and Fiaz Idris states that the limitation and excel's resulting behaviour have changed in excel 2010.

Thanks
Rob
aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT
Top Expert 2014

Author

Commented:
Thanks for the links and the kudos, Rob.  It's good to hear that MS has removed some of the worksheet function limitations in 2010

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.