Automation of Excel by VB.NET –Too Slow


Hello Everybody,

I’m trying to automatically, using VB.NET, place data which exist in individual columns in two separate files into a third file and then multiply the row elements of each one of these two columns and place the result in a third column. So far I’m doing it by using a loop:

              Dim I As Integer
        For I = 1 To 20            
            ObjectWorkSheet.Cells(I, 1) = ObjectCell1(I, 2).Value              
            ObjectWorkSheet.Cells(I, 2) = ObjectCell2(I, 2).Value  
           
            ObjectWorkSheet.Cells(I, 3) = ObjectCell(I, 2).Value * ObjectCell(I, 3).Value
           
            objectWorkBook.Save()
        Next I

This works fine when you have a few data points. However, it’s excruciatingly slow for a large array, say 10,000 points. I was wondering if there’s a way to make it faster by mimicking what one would actually do manually in Excel to accomplish this task. Can’t there be a way by using VB.NET to copy each one of the columns from the two initial files and then paste it into the third file? Then, once the data in the two new columns are in the newly formed third file, say in columns A and B, can’t VB.NET instruct Excel to have in column C the formula =(A1*B1), then highlight column C and copy the result into the rest of the rows? Thanks in advance for your help.
judicoAsked:
Who is Participating?
 
Rory ArchibaldConnect With a Mentor Commented:
Assuming your sheets have data from column A onwards and that it's column B in each you are interested in, then something like:


Dim ObjectExcel As New Excel.Application
   
        Dim objectWorkBook As Excel.Workbook
        Dim objectWorkBook1 As Excel.Workbook
        Dim objectWorkBook2 As Excel.Workbook
        ObjectExcel.Visible = True


        objectWorkbook = ObjectExcel.Workbooks.Open("D:\Jim\test.xls")
        Dim ObjectWorkSheet As Excel.Worksheet = objectWorkbook.Worksheets("Sheet1")
        Dim iNextRow As Integer = ObjectWorkSheet.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim ObjectCell = ObjectWorkSheet.UsedRange

        objectWorkbook1 = ObjectExcel.Workbooks.Open("D:\Jim\1.xls")
        Dim ObjectWorkSheet1 As Excel.Worksheet = objectWorkbook1.Worksheets("Sheet1")
        Dim iNextRow1 As Integer = ObjectWorkSheet1.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim ObjectCell1 = ObjectWorkSheet1.UsedRange

        objectWorkBook2 = ObjectExcel.Workbooks.Open("D:\Jim\2.xls")
        Dim ObjectWorkSheet2 As Excel.Worksheet = objectWorkBook2.Worksheets("Sheet1")
        Dim iNextRow2 As Integer = ObjectWorkSheet2.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim ObjectCell2 = ObjectWorkSheet2.UsedRange

        objectCell1.columns(2).copy objectworksheet.cells(iNextRow, 1)
        objectCell2.columns(2).copy objectworksheet.cells(iNextRow, 2)
        objectworksheet.cells(inextRow, 3).resize(objectcell1.rows.count).FormulaR1C1 = "=RC[-2]*RC[-1]"

Open in new window

0
 
NitharsanCommented:
I also had similar kind of problem. I had store more than 10,000 data in excel. first i tried with excel automation. it was very slow. later i used  carlos ag xml writer . It worked fine.It will write in .xls format.

But there is another tools which can write and read the data in .xlsx format. That is epplus .
That works fine too.
you can download that tool http://epplus.codeplex.com/
if you want to download carlos ag xml writer - http://www.carlosag.net/Tools/ExcelXmlWriter/

I hope this information will helpful for you.
0
 
NitharsanCommented:
Further more , Epplus tools has many features.
You can create chart, add shape and formula...... etc.

Just visit http://epplus.codeplex.com/ ...
0
Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

 
judicoAuthor Commented:
Thanks for the reply. Which one would you recommend? It appears CarlosAg is more comprehensible. So, what do you do with it? You place CarlosAg.ExcelXmlWriter.dll file into the bin directory of the application and then what? I placed -- using CarlosAg.ExcelXmlWriter -- in my Form1 but it doesn't accept it.
0
 
judicoAuthor Commented:
Our posts crossed obviously, so I now understand you recommend EPPlus. What do you actually do to install it? That's the first thing I'd like to understand.
0
 
NitharsanCommented:
Sorry .. I forgot tell u one thing. now i am using epplus for my application.  i feel that it is easier than carlos ag and has many features.

Just add epplus.dll file as reference to ur project.

Goto Project Menu > Add Reference then
browse> select the epplus.dll file to add to ur project.
It will be automatically added to ur bin dirctory.

then add following code in each class where u would use epplus

Imports OfficeOpenXml
Imports OfficeOpenXml.Drawing

there are some examples in their website . follow that.
and also  you can download sample file from their website.

Nitha


0
 
judicoAuthor Commented:
Also, I'm using Office2003 while EPPlus is for 2007 only, as far as I can see. I'm reading both links you cited but it's hard to understand how they should be implemented to accomplish the stated task.
0
 
judicoAuthor Commented:
Sorry, the posts crossed again. First, I'd like to know if APPlus would work with Office2003.
0
 
Rory ArchibaldCommented:
If you are going to automate Excel, I suggest you set the application's screenupdating property to False while you are manipulating the workbook. It is also faster to create the array with the values you want and then write that to the worksheet in one go, not cell by cell, which is slow.
0
 
judicoAuthor Commented:
Probably APPlus isn't working with Office2003 because it cannot be referenced. I could reference CarlosAg, however.
0
 
judicoAuthor Commented:
@rorya, how do you do that, however? The data from the cells have to be placed in arrays or something. I was thinking about that but how do you accomplish it with that Excel automation procedure provided by VB.NET?
0
 
NitharsanCommented:
Actually   you don't need office 2003 or 2007 installed in your pc, if u are going to use EPPlus .net library or carlos ag to write/read excel file. These .net library will write /read excel file without office.

My advise is , use epplus to write excel file. and still u can open xlsx file in office 2003.
(if you install office compatiblity pack  if not, install iit.)
http://office.microsoft.com/en-us/excel/HA100141071033.aspx


Nitha
0
 
Rory ArchibaldCommented:
You would need to create a third array and write that to the worksheet. To write it to the worksheet (let's assume an array of 20 x 3 elements as in your example):

ObjectWorkSheet.Cells(1, 1).Resize(20, 3).Value2 = yourNewArray

0
 
judicoAuthor Commented:
@Nitharsan, as I said, I'm having trouble referencing APPlus from my application. CarlosAg could be referenced easily but when trying to reference APPlus.dll an error message pops up reaing that that file isn't a dll file.
0
 
NitharsanCommented:
I also had same problem. Download EPPlus 2.6.0.1 Binary from recommended download .
anyway i have attached that dll file with this.

and for your start...

this code will create xlsx file in a min.(10,000 cells)



Imports OfficeOpenXml
Imports OfficeOpenXml.Drawing
Imports System.IO
Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
       

    End Sub

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim xlfile As FileInfo = New FileInfo("D:\xlsepp.xlsx")
        Dim xlrange As ExcelRange
        Dim package As ExcelPackage = New ExcelPackage(xlfile)
        Dim xlworksheet = package.Workbook.Worksheets.Add("Sheet 1")
        xlworksheet.View.ShowGridLines = False

        ' xlworksheet.Column(4).OutlineLevel = 1
        ' xlworksheet.Column(4).Collapsed = True
        ' xlworksheet.Column(5).OutlineLevel = 1
        'xlworksheet.Column(5).Collapsed = True
        ' xlworksheet.OutLineSummaryRight = True

        xlworksheet.Column(1).Width = 12
        xlworksheet.Row(1).Height = 15
        xlworksheet.Column(2).Width = 12
        xlworksheet.Row(2).Height = 20


        XLSheetProperties(xlworksheet)



        For i = 3 To 10
            xlworksheet.Column(i).Width = 12
            xlworksheet.Row(i).Height = 15
        Next


        xlrange = xlworksheet.Cells(1, 1)
        xlrange.Value = "X"
        XLCellsMTitleStyle(xlrange)

        xlrange = xlworksheet.Cells(1, 2)
        xlrange.Value = "Y"
        XLCellsMTitleStyle(xlrange)

        xlrange = xlworksheet.Cells(1, 3)
        xlrange.Value = "Z"
        XLCellsMTitleStyle(xlrange)


        For i = 2 To 10000
            xlrange = xlworksheet.Cells(i, 1)
            xlrange.Value = i * 2 + 1
            XLCellsNTextStyle(xlrange)

            xlrange = xlworksheet.Cells(i, 2)
            xlrange.Value = i + 1
            XLCellsNTextStyle(xlrange)

            xlrange = xlworksheet.Cells(i, 3)
            xlrange.Value = 10 * i * i
            XLCellsNTextStyle(xlrange)
        Next


        XLCellBorderThickOutline(xlworksheet, 1, 1, 10000, 3)

        package.Save()
        MsgBox("saved")
    End Sub


    Sub XLSheetProperties(ByVal worksheet As ExcelWorksheet)
        ' lets set the header text
        worksheet.HeaderFooter.oddHeader.CenteredText = "Results"
        ' add the page number to the footer plus the total number of pages
        worksheet.HeaderFooter.oddFooter.RightAlignedText = String.Format("Page {0} of {1}", ExcelHeaderFooter.PageNumber, ExcelHeaderFooter.NumberOfPages)
        ' add the sheet name to the footer
        worksheet.HeaderFooter.oddFooter.CenteredText = ExcelHeaderFooter.SheetName
        ' add the file path to the footer
        ' worksheet.HeaderFooter.oddFooter.LeftAlignedText = ExcelHeaderFooter.FilePath + ExcelHeaderFooter.FileName

        ' Change the sheet view to show it in page layout mode
        worksheet.View.PageLayoutView = True




    End Sub

   

    Sub XLCellBorderThickOutline(ByVal xlworksheet As ExcelWorksheet, ByVal isrow As Integer, ByVal iscol As Integer, ByVal ierow As Integer, ByVal iecol As Integer)
        Dim xl_range As ExcelRange

        '//Top Border _Thick
        xl_range = xlworksheet.Cells(isrow, iscol, isrow, iecol)
        xl_range.Style.Border.Top.Style = Style.ExcelBorderStyle.Thick

        '//Bottom Border _Thick
        xl_range = xlworksheet.Cells(ierow, iscol, ierow, iecol)
        xl_range.Style.Border.Bottom.Style = Style.ExcelBorderStyle.Thick

        '//Left Border _Thick
        xl_range = xlworksheet.Cells(isrow, iscol, ierow, iscol)
        xl_range.Style.Border.Left.Style = Style.ExcelBorderStyle.Thick

        '//Right Border _Thick
        xl_range = xlworksheet.Cells(isrow, iecol, ierow, iecol)
        xl_range.Style.Border.Right.Style = Style.ExcelBorderStyle.Thick

        xl_range = Nothing
    End Sub


    Sub XLCellsMTitleStyle(ByVal xlrange As ExcelRange)
        xlrange.Style.Font.SetFromFont(New Font("Calibri", 18, FontStyle.Underline, GraphicsUnit.Pixel, Nothing))
        xlrange.Style.Font.Bold = True
        xlrange.Style.Font.Color.SetColor(Color.Blue)
        xlrange.Style.HorizontalAlignment = OfficeOpenXml.Style.ExcelHorizontalAlignment.Left

        xlrange.Style.Border.Bottom.Style = Style.ExcelBorderStyle.Thin
        xlrange.Style.Border.Top.Style = Style.ExcelBorderStyle.Thin
        xlrange.Style.Border.Left.Style = Style.ExcelBorderStyle.Thin
        xlrange.Style.Border.Right.Style = Style.ExcelBorderStyle.Thin
        ' xlrange.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid
        '  xlrange.Style.Fill.BackgroundColor.SetColor(Color.FromArgb(23, 55, 93))
    End Sub


    Sub XLCellsNTextStyle(ByVal xlrange As ExcelRange)
        xlrange.Style.Font.SetFromFont(New Font("Calibri", 11, FontStyle.Regular, GraphicsUnit.Pixel, Nothing))
        xlrange.Style.Font.Bold = False
        xlrange.Style.Font.Color.SetColor(Color.Black)
        xlrange.Style.HorizontalAlignment = OfficeOpenXml.Style.ExcelHorizontalAlignment.Left

        xlrange.Style.Border.Bottom.Style = Style.ExcelBorderStyle.Thin
        xlrange.Style.Border.Top.Style = Style.ExcelBorderStyle.Thin
        xlrange.Style.Border.Left.Style = Style.ExcelBorderStyle.Thin
        xlrange.Style.Border.Right.Style = Style.ExcelBorderStyle.Thin
        ' xlrange.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid
        '  xlrange.Style.Fill.BackgroundColor.SetColor(Color.FromArgb(23, 55, 93))
    End Sub


End Class

EPPlus.dll
0
 
judicoAuthor Commented:
@rorya, I don't understand the need for the creation of a third array. Once I've removed objectWorkBook.Save() which allows to populate the cells at a higher rate because now you don't need to save after each iteration why does one need a third array?
0
 
judicoAuthor Commented:
@Nitharsan, I tried the binary EPPlus 2.6.0.1 and it won't allow referencing either.
0
 
Rory ArchibaldCommented:
Because it's faster to populate a whole range of cells all at once from an array than it is to write them out one-by-one. Even within Excel using VBA, I would never write to one cell at a time if I could possibly avoid it. Certainly not if I had 10,000 cells to do!
0
 
judicoAuthor Commented:
@rorya, OK, I understand that but how do you practically do that. How do you practically create a new array with the data you need. Can you show me how this can be done in my example?
0
 
Rory ArchibaldCommented:
What are ObjectCell1 and ObjectCell2 in your code and how are they initialised?
0
 
judicoAuthor Commented:
@rorya:        

        Dim ObjectExcel As New Excel.Application
        Dim ObjectExcel1 As New Excel.Application
        Dim ObjectExcel2 As New Excel.Application

        ObjectExcel.Visible = True
        ObjectExcel1.Visible = False
        ObjectExcel2.Visible = False

        ObjectExcel.Workbooks.Open("D:\Jim\test.xls")
        Dim ObjectWorkSheet As Excel.Worksheet = ObjectExcel.Worksheets("Sheet1")
        Dim iNextRow As Integer = ObjectWorkSheet.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim objectWorkBook As Excel.Workbook = ObjectExcel.ActiveWorkbook
        Dim ObjectCell = ObjectWorkSheet.UsedRange

        ObjectExcel1.Workbooks.Open("D:\Jim\1.xls")
        Dim ObjectWorkSheet1 As Excel.Worksheet = ObjectExcel1.Worksheets("Sheet1")
        Dim iNextRow1 As Integer = ObjectWorkSheet1.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim objectWorkBook1 As Excel.Workbook = ObjectExcel1.ActiveWorkbook
        Dim ObjectCell1 = ObjectWorkSheet1.UsedRange

        ObjectExcel2.Workbooks.Open("D:\Jim\2.xls")
        Dim ObjectWorkSheet2 As Excel.Worksheet = ObjectExcel2.Worksheets("Sheet1")
        Dim iNextRow2 As Integer = ObjectWorkSheet2.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim objectWorkBook2 As Excel.Workbook = ObjectExcel2.ActiveWorkbook
        Dim ObjectCell2 = ObjectWorkSheet2.UsedRange
0
 
judicoAuthor Commented:
@rorya, never mind that ObjectExcel.Visible = True. I'll turn it off later.
0
 
Rory ArchibaldCommented:
Is there a reason for using 3 separate instances of Excel? Seems like a waste of resource to me.
0
 
judicoAuthor Commented:
@rorya, the only reason is to have the three separate files opened.
0
 
Rory ArchibaldCommented:
But you can open them all in the same instance rather than using three separate instances. I should point out that VB.Net is not my forte, but something like:

Dim ObjectExcel As New Excel.Application
   
        Dim objectWorkBook As Excel.Workbook
        Dim objectWorkBook1 As Excel.Workbook
        Dim objectWorkBook2 As Excel.Workbook
        ObjectExcel.Visible = True


        objectWorkbook = ObjectExcel.Workbooks.Open("D:\Jim\test.xls")
        Dim ObjectWorkSheet As Excel.Worksheet = objectWorkbook.Worksheets("Sheet1")
        Dim iNextRow As Integer = ObjectWorkSheet.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim ObjectCell = ObjectWorkSheet.UsedRange

        objectWorkbook1 = ObjectExcel.Workbooks.Open("D:\Jim\1.xls")
        Dim ObjectWorkSheet1 As Excel.Worksheet = objectWorkbook1.Worksheets("Sheet1")
        Dim iNextRow1 As Integer = ObjectWorkSheet1.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim ObjectCell1 = ObjectWorkSheet1.UsedRange

        objectWorkBook2 = ObjectExcel.Workbooks.Open("D:\Jim\2.xls")
        Dim ObjectWorkSheet2 As Excel.Worksheet = objectWorkBook2.Worksheets("Sheet1")
        Dim iNextRow2 As Integer = ObjectWorkSheet2.UsedRange.End(Excel.XlDirection.xlDown).Row + 1
        Dim ObjectCell2 = ObjectWorkSheet2.UsedRange

Open in new window

0
 
judicoAuthor Commented:
@rorya, something messes it up when you do it that way and crashes. Maybe we should stay with the three instances of Excel and try to figure out the formation of the third array.
0
 
judicoAuthor Commented:
That crashes at ObjectCell1.columns(2).copy(ObjectWorkSheet.Cells(iNextRow, 1)) with the error message:

Exception from HRESULT: 0x800A03EC

I tried it also with the earlier verion -- opening three instances of Excel
0
 
judicoAuthor Commented:
earlier version, that is ...
0
 
Rory ArchibaldCommented:
What is the value of iNextRow at the time?
0
 
judicoAuthor Commented:
Also, the first columns with data in 1.xls and 2.xls are D but I'm interested in columns E. This is just for the record. I think the code above correctly seeks the second instance of a column with data.
0
 
judicoAuthor Commented:
The value of iNextRow before the crash is 65537.

Also, what stays open and is seen on the screen is 2.xls
0
 
Rory ArchibaldCommented:
That would explain it - there are only 65536 rows in the worksheet! :)
0
 
judicoAuthor Commented:
Now, I changed Row + 1 to Row - 1 in:

 Dim iNextRow As Integer = ObjectWorkSheet.UsedRange.End(Excel.XlDirection.xlDown).Row - 1

and the error message now reads:

The information cannot be pasted because the Copy area and the paste area are not the same size and shape. Try one of the following:

- Click a single cell and paste
- Select a rectangle that's the same size and shape and then paste.
0
 
Rory ArchibaldCommented:
That's because you have still run out of rows, assuming you are pasting more than two rows, because you start the paste at row 65535. Is that actually where your data ends on the target sheet?
0
 
Rory ArchibaldCommented:
You might want to use:

Dim iNextRow As Integer = ObjectWorkSheet.cells(objectworksheet.rows.count, 1).End(Excel.XlDirection.xlUp).Row + 1

instead.
0
 
judicoAuthor Commented:
That seems to be working just fine. Except that, because only one instance of Excel is opened, all three spreadsheets remain opened together at the end, 2.xls being the one seen on the screen while the result is in test.xls. If one tries to close 2.xls, of course, the other two close as well. Anyway, I can live with that.

The thing I can't understand is how this FormulaR1C1 works. Suppose that, instead of A1*B1, as it is now, I want A1*B1-(A1^2)*5. I thought it should be "=RC[-2]*RC[-1]-(RC[-1]^2)]*5" but its giving me weird results.

How about if I want, say, C1 + (A2 - A1)*(B2 - B1)/2  placed in C column?
0
 
Rory ArchibaldCommented:
The [-1] bit is relative to the cell containing the formula, so A1 is two to the left of C1, hence:
=RC[-2]*RC[-1]-(RC[-2]^2)]*5

Your last formula would be a circular reference unless you are placing it in C2 or lower?
0
 
judicoAuthor Commented:
Correct, it should be C2 + (A2 - A1)*(B1 + B2)/2. How is this done with FormulaR1C1.

Also, I'd like tho have the absolute value of, say, RC[-1], so I'm writing System.Math.Abs(RC[-1]) but it won't work. Where can one find a reference of all math functions and stuff applicable to a problem such as this one?
0
 
Rory ArchibaldCommented:
You can only use R1C1 in formulas. If you wanted to use Abs, you would have to pass it a value using code like:
System.Math.Abs(objectworksheet.cells(2, 1).Value2)

for example.

When using R1C1 style in formulas, absolute references are specified using just a row or column number - e.g. R1C1 is the equivalent of $A$1. Relative references use an offset number enclosed in square brackets:
R[1]C[-1]
means one row down and one column to the left of the cell containing the formula.

You could also write your formulas in A1 notation if that makes more sense to you! :)

So, to get the formula C2 + (A2 - A1)*(B1 + B2)/2 in C1, you need:
=R[1]C+(R[1]C[-2]-RC[-2])*(RC[-1]+R[1]C[-1])/2

0
 
judicoAuthor Commented:
Couldn't understand this:

If you wanted to use Abs, you would have to pass it a value using code like:
System.Math.Abs(objectworksheet.cells(2, 1).Value2)

How do you  turn all the values in a given column, say C, into absolute values using the above code?
0
 
judicoAuthor Commented:
Turning the values in column, say B, should occur prior to calculating =R[1]C+(R[1]C[-2]-RC[-2])*(RC[-1]+R[1]C[-1])/2 which places data values in column C. I thought I should add that.
0
 
judicoAuthor Commented:
... turning the values into absolute values, I should've written ...
0
 
Rory ArchibaldCommented:
If it's an option, you could just add ABS into the formula:

=ABS(R[1]C)+(ABS(R[1]C[-2])-ABS(RC[-2]))*(ABS(RC[-1])+ABS(R[1]C[-1]))/2
0
 
NitharsanCommented:
I don't understand why u could not reference it.
if you can send me your project as zip file ., i can try........

my email address. nitharsanr@yahoo.com

send ur email address. I can send sample problem.
0
 
judicoAuthor Commented:
Thanks a lot @Nitharsan for your offer to help. For now I think I'm going to stay with @rorya's solution which is just what I was looking for. It seems to use the standard provisions in Excel and VB.NET without the need to add new classes and tools. Wish you all the best.
0
 
judicoAuthor Commented:
Thank you very much, @rorya. You did a great job. I will now give you a 500 points and if something else pops up in connection with thus problem I'll open another question. All the best.
0
 
judicoAuthor Commented:
Trying to find how to change the points to 500? What should I do to change that?
0
 
judicoAuthor Commented:
OK, that did it automatically.
0
 
judicoAuthor Commented:
Great job. Thanks.
0
 
judicoAuthor Commented:
Now, why is the rating 7,7? How is that determined? I think I gave a grade which should read excellent.
0
 
Rory ArchibaldCommented:
Thanks for the grade! :)
Don't worry about the rating - I'm not entirely sure anyone knows how the initial scores are worked out, but I believe they alter as other people search for things and rate the solutions they find. Either way, it doesn't affect me in the least! ;)
Rory
0
 
NitharsanCommented:
Hai judico
There is another tool " NPOI" which can generate xls file.

just try this.....

http://npoi.codeplex.com/
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.