Solved

Slow process to read Excel

Posted on 2016-07-17
15
160 Views
Last Modified: 2016-07-26
Hi,

It does take nearly 30 minutes to finish scanning these two sheets
https://app.box.com/s/zkm75d29hef82zct7bkw784iro8kmn64

using such codes
                    foreach (Excel.Worksheet sheet in Book0.Worksheets)
                    {
                        Excel.Range usedRange = sheet.UsedRange;

                        RowID++; Worksheet0.Cells[RowID, 1] = "^^^^" + sheet.Name + "^^^^"; RowID++;

                        foreach (Excel.Range row in usedRange.Rows)
                        {
                            RowID++;
                            for (int i = 0; i < row.Columns.Count; i++)
                                /*if (!string.IsNullOrEmpty(rowData[i]))
                                    j = 0;
                                else
                                    j++;*/

                                if (!string.IsNullOrEmpty(Convert.ToString(row.Cells[1, i + 1].Value2)))
                                {
                                    Worksheet0.Cells[RowID, i + 1] = row.Cells[1, i + 1].Value2;
                                    Worksheet0.Cells[RowID, i + 1].Interior.Pattern = row.Cells[1, i + 1].Interior.Pattern;
                                    ...

Open in new window

Why? Any other way to efficiently/quickly read the attached two Excel Worksheets?
0
Comment
Question by:HuaMinChen
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 6
  • 2
15 Comments
 
LVL 70

Expert Comment

by:Qlemo
ID: 41716340
Using the Cell collation and indexing is slow via Automation. You should use a Range var for the current row of Worksheet0 to set properties, and best to just .Offset(1.0) it in each go. You could do the same down to a cell, but that is getting tedious.
0
 
LVL 10

Author Comment

by:HuaMinChen
ID: 41716353
Can I have more details to your advice? Thanks a lot
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 41717565
I'm not a csharpa, but I guess you get the idea:
                    Excel.Range ws0cell0 = Worksheet0.Cells(1,1);
                    Excel.Range ws0cell;

                    foreach (Excel.Worksheet sheet in Book0.Worksheets)
                    {
                        Excel.Range usedRange = sheet.UsedRange;

                        RowID++;
                        ws0cell0 = "^^^^" + sheet.Name + "^^^^";
                        RowID++;

                        foreach (Excel.Range row in usedRange.Rows)
                        {
                            RowID++;
                            ws0cell = ws0Cell0 = ws0Cell0.Offset(1,0);    // one row down, first cell
                            for (int i = 0; i < row.Columns.Count; i++)
                                /*if (!string.IsNullOrEmpty(rowData[i]))
                                    j = 0;
                                else
                                    j++;*/

                                if (!string.IsNullOrEmpty(Convert.ToString(row.Cells[1, i + 1].Value2)))
                                {
                                    ws0cell.Value2 = row.Cells[1, i + 1].Value2;
                                    ws0cell.Interior.Pattern = row.Cells[1, i + 1].Interior.Pattern;
                                    ...
                                    ws0cell = ws0cell.Offset(0,1);    // one cell to the rightm sane row.

Open in new window

Doing the same for the source cell should improve performance significantly.
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 10

Author Comment

by:HuaMinChen
ID: 41718218
I use these
                    Excel.Worksheet Worksheet0;
                    Worksheet0 = (Excel.Worksheet)Book0.Worksheets.Add();
                    Excel.Range ws0cell0 = Worksheet0.Cells[1, 1];
                    Excel.Range ws0cell;

                    foreach (Excel.Worksheet sheet in Book0.Worksheets)
                    {
                        Excel.Range usedRange = sheet.UsedRange;

                        RowID++; ws0cell0.Value2 = "^^^^" + sheet.Name + "^^^^"; RowID++;

                        foreach (Excel.Range row in usedRange.Rows)
                        {
                            RowID++;
                            ws0cell = ws0cell0 = ws0cell0.Offset[1, 0];    // one row down, first cell

                            for (int i = 0; i < row.Columns.Count; i++)
                                /*if (!string.IsNullOrEmpty(rowData[i]))
                                    j = 0;
                                else
                                    j++;*/

                                if (!string.IsNullOrEmpty(Convert.ToString(row.Cells[1, i + 1].Value2)))
                                {
                                    ws0cell.Value2 = row.Cells[1, i + 1].Value2;
                                    ws0cell.Interior.Pattern = row.Cells[1, i + 1].Interior.Pattern;
                                    ws0cell.ColumnWidth = row.Cells[1, i + 1].ColumnWidth;
                                    ws0cell.Font.Color = row.Cells[1, i + 1].Font.Color;
                                    ws0cell = ws0cell.Offset[0, 1]; 
                                    ...

Open in new window

but it does still take 20 minutes to process the same Excel file. It is still too long time to process only 2 worksheets, right?
0
 
LVL 70

Accepted Solution

by:
Qlemo earned 250 total points
ID: 41718233
You are using the array notation [] for Cells and Offset - are you certain it doesn't work with function notation instead?
0
 
LVL 10

Author Comment

by:HuaMinChen
ID: 41718262
Yes, I cannot use

( )

and I'm using C#.
0
 
LVL 9

Assisted Solution

by:Valliappan AN
Valliappan AN earned 250 total points
ID: 41719156
Replace row.Cells[1, i + 1] with a cell1 variable, and any other repeated array access with variable assignment, to avoid multiple times parsing, thus optimizing the code.

HTH.
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 41719207
I cannot use C# that easily, so have to rely on what I know from VB Script and PowerShell (which again uses COM). In PowerShell I have to use the function notation, but that is just an implementation detail.

There is a lot of conversion taking place with Automation. E.g. setting Value2 always requires to convert from (display) string to internal format. It is best to keep Automation out of doing much. With PowerShell, the fastest copy of individual values between cells happens at 18 ms per cell (measured over 10 000 iterations).

The usual VBA ways to increase speed are to set Excel.Application.ScreenUpdating and Excel.WorkSheet.EnableCalculation to false while doing bulk operations.

Any reason you are not using the (Special)Paste function of Excel VBA, copying over a range at once? If you need to apply some processing on the copies, it might help to copy over the range, and then iterate thru the resulting cells instead.
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 41719232
Valliappan AN is correct, I didn't spot that. row access should be thru another variable for the current cell, similar to the changes already applied.
0
 
LVL 10

Author Comment

by:HuaMinChen
ID: 41720176
Thanks a lot to all.

Valliappan AN,
Can I have more details to your suggestion?
0
 
LVL 9

Assisted Solution

by:Valliappan AN
Valliappan AN earned 250 total points
ID: 41720310
Hi, when you access a compound object repeatedly, assign it to a variable and use it, so that the repeated parsing does not occur when you call multiple times the same object.

For more information you may look at: http://www.cpearson.com/excel/optimize.htm

and this:

https://blogs.office.com/2009/03/12/excel-vba-performance-coding-best-practices/

Thnks.
0
 
LVL 10

Author Comment

by:HuaMinChen
ID: 41722104
By these
                OpenFileDialog openFileDialog1 = new OpenFileDialog();
                if (openFileDialog1.ShowDialog() == System.Windows.Forms.DialogResult.OK)
                {
                    var ExcelObj = new Excel.Application();
                    Excel.Workbook Book0 = ExcelObj.Workbooks.Open(openFileDialog1.FileName, 0, false, 5, "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false, 0, true);
                    ExcelObj.ScreenUpdating = false;

                    Excel.Worksheet Worksheet0;
                    Worksheet0 = (Excel.Worksheet)Book0.Worksheets.Add();
                    Excel.Range ws0cell0 = Worksheet0.Cells[1, 1];
                    Excel.Range ws0cell;

                    foreach (Excel.Worksheet sheet in Book0.Worksheets)
                    {
                        Excel.Range usedRange = sheet.UsedRange;

                        RowID++; ws0cell0.Value2 = "^^^^" + sheet.Name + "^^^^"; RowID++;

                        foreach (Excel.Range row in usedRange.Rows)
                        {
                            RowID++;
                            row.Copy();
                            Worksheet0.Cells[RowID, 1].PasteSpecial(-4104); 
                            ...

Open in new window

                           
1. It is not copying whatever cell property to each row
2. It takes about 9 minutes to process the file attached

How to improve it further?
F20---Copy.xlsx
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 41723180
Since your code snippet makes not much sense, we cannot help much more. For example you are going thru all worksheets to copy rows, including the newly created one where all copies go to.
If I use the following sequence instead of your innermost ForEach (line 19+):
RowID++;
sheet.UsedRange.Copy();
worksheet0.Cells[RowID, 1].PasteSpecial(-4104);

Open in new window

the copy is instantaneous, and cell formatting taken over.
1
 
LVL 10

Author Comment

by:HuaMinChen
ID: 41726914
Using PasteSpecial, it does take 9 minutes to copy the sheets within this
https://app.box.com/s/yrwohfmp713si364romjc29pb501y559
to one overall sheet, and such speed is still slow. Why?
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 41727387
Is this XLSM supposed to do anything? Anyway, a complete, complex Excel sheet is hard to check. You really should start with something more simple, and then add more as soon as speed is satisfying with test cases.
0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Freeze panes is an option within all variants of Excel to enable parts of a sheet to remain stationary when the cursor is in another part of the sheet. This is a very useful feature which is overlooked or under used.
Excel can be a tricky bit of software to get your head around. Whilst you’ll be able to eventually get to grips with the basic understanding of how to get by, there are a few Excel tips that not everybody will even know about let alone know how to d…
This Micro Tutorial will demonstrate in Microsoft Excel how to add style and sexy appeal to horizontal bar charts.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

738 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question