Link to home
Create AccountLog in
Avatar of patrick_exe
patrick_exeFlag for United Kingdom of Great Britain and Northern Ireland

asked on

Reading in from Excel via VSTO is about 10-100 times slower than from VBA

I have a .net 3.5 application with an Excel VBA front end.  Some of the data (e.g. 10000 rows with 150 columns, spread across different worksheets) is stored in Excel, and whilst reading this in from Excel VBA takes about 2-3 seconds, it takes minutes to read in the same data from C# using VSTO (Visual Studio Tools for Office).  
Is this an inevitable consequence of the fact that VSTO has to treat each Excel Cell as of type Object, and has an overhead of talking to and from Excel via the Primary Interop Assemblies, whereas VBA is already within Excel?  
I am not reading in each individual Excel cell (which would slow things down even more), but am reading the whole of each Range containing a rectangular block of data into an array (see code snippet below).  
However, because the array has to be a two dimensional array of type Object, I then have to convert each cell into the appropriate type (DateTime, int, double, string etc.) which probably adds to the time.

Am I missing something?  If not, then Microsoft should be aware that the slowness of reading from Excel (as compared to Access which is very fast in .net) is going to be a significant barrier to converting VBA projects to VSTO/.net projects.

// c# code
int iNumberOfColumns = 251, iNumberOfRecords = 2000;
// should be more efficient to read in as much of the Excel data as possible from a range into a 2D Array of objects:
Object[,] array = new Object[iNumberOfRecords, iNumberOfColumns];
Excel.Range thisRange = XLWorksheet.get_Range(objRange.Cells.get_Item(2, 1), objRange.Cells.get_Item(2 + iNumberOfRecords - 1, iNumberOfColumns));
array = (Object[,])thisRange.get_Value(Excel.XlRangeValueDataType.xlRangeValueDefault);  // previously .Value2 which leads to dates and currency excel values being returned as doubles!
 
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
// IMPORTANT: for some reason, the array when copied from Excel is 1 based in both dimensions!
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
for (int iRow = 1; iRow <= iNumberOfRecords; iRow++)
{
    int iThisMemberID = pConvert.ToInt32(array[iRow, 1]); // pConvert is an object of a class that I created to handle the conversion of objects to ints, DateTime, etc
    // more code to read in each field (column) value and do something with it
} // for iRow

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of zorvek (Kevin Jones)
zorvek (Kevin Jones)
Flag of United States of America image

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Avatar of patrick_exe

ASKER

Kevin

Thanks for the comment.  I'm not sure how best to proceed in that I want to regard this as a partial solution (if there is no known resolution at present) and award some of the points, but I don't want the question to be closed because I want to keep it open in order to find out about any resolution if it happens in future.  What I should I do?

Patrick
I was at the Summit and I heard the complaints. There was no good answer.

Here are some discussions. It looks like formatting and how granular your range references are seems to be the main theme.

http://www.eblanco.com/Default.aspx?tabid=57&EntryID=100

http://www.aspose.com/documentation/file-format-components/aspose.cells-for-.net-and-java/comparing-vsto-with-aspose-cells-for-net.html

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=645249&SiteID=1

Leave the question open and see if any of the guys in the VSTO zone have any ideas. I'm mostly Excel/VBA.

Kevin
Kevin - thanks, I'll wait to see if anyone from the VSTO zone posts anything.  
Patrick
I was discussing VSTO and .net with a workmate who is using it to automate Excel. He confirmed it is dog slow and there is no workaround. Just showing a user form takes about 10 to 100 times longer then when doing the same thing with VBA.

Kevin
The answer is that "it is slower and there is no workaround" as stated above.

Kevin