Solved

why I got null returned in c# code when I am trying to get values from excel sheet using linq and OpenXml?

Posted on 2011-03-15
11
2,979 Views
Last Modified: 2013-12-17
hi,

I have an excel sheet, I want to get all cells and data from that excel sheet using linq and openxml library,

but I got problem that null records returned from excel sheet, here is my code and the excel sheet:

 
private void btnAddProductType_Click(object sender, EventArgs e)
        {
            string path = @"E:\Leamra Projects\ProductManagment\code\UsingLINQToQueryTablesInExcel2007_CODE\LinqSample.xlsx";
            DataTable dtProductType = ReadDataFromExcel(path);
        }

        private DataTable ReadDataFromExcel(string path)
        {
            //Declare variables to hold refernces to Excel objects
            Workbook workBook;
            SharedStringTable sharedStrings;
            IEnumerable<Sheet> workSheets;
            WorksheetPart sheet;

            //Declare helper variables
            string sheetId;

            DataTable dtProductType = CreateProductTypeTableHeaders();
            using (SpreadsheetDocument document = SpreadsheetDocument.Open(path, true))
            {
                //References to the workbook and Shared String Table.
                workBook = document.WorkbookPart.Workbook;
                workSheets = workBook.Descendants<Sheet>();
                sharedStrings = document.WorkbookPart.SharedStringTablePart.SharedStringTable;

                //Reference to Excel Worksheet with Customer data.
                sheetId = workSheets.First(s => s.Name == @"Customer").Id;
                sheet = (WorksheetPart)document.WorkbookPart.GetPartById(sheetId);

                IEnumerable<Row> dataRows = from row in sheet.Worksheet.Descendants<Row>()
                                            where row.RowIndex > 1
                                            select row;
                foreach (Row row in dataRows)
                {
                    IEnumerable<String> textValues = from cell in row.Descendants<Cell>()
                                                     where cell.CellValue != null
                                                     select
                                                       (cell.DataType != null
                                                         && cell.DataType.HasValue
                                                         && cell.DataType == CellValues.SharedString ? sharedStrings.ChildElements[int.Parse(cell.CellValue.InnerText)].InnerText : cell.CellValue.InnerText);
                    //Check to verify the row contained data.
//here I got textValues is null !!!!
                    if (textValues.Count() > 0)
                    {
                        //Create a Customer and add it to the list.
                        var textArray = textValues.ToArray();
                        if (!AddNewRowTodtProductType(textArray[0], textArray[1], textArray[2], textArray[3], dtProductType))
                        {
                            return null;
                        }
                    }
                    else
                    {
                        //If no Cells assume the end of the table has been reached.
                        break;
                    }
                }


            }
            return dtProductType;
        }

        private bool AddNewRowTodtProductType(string EnProductType, string ArProductType, string EnDescription, string ArDescription, DataTable dtProductType)
        {
            try
            {

                DataRow row = null;
                row = dtProductType.NewRow();
                row["EnProductType"] = EnProductType;
                row["ArProductType"] = ArProductType;
                row["EnDescription"] = EnDescription;
                row["ArDescription"] = ArDescription;
                dtProductType.Rows.Add(row);
                return true;
            }
            catch
            {
                return false;
            }
        }

        private DataTable CreateProductTypeTableHeaders()
        {
            DataTable dtProductType = new DataTable("ProductTypes");
            DataColumn clProduct;
            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "EnProductType";
            dtProductType.Columns.Add(clProduct);

            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "ArProductType";
            dtProductType.Columns.Add(clProduct);

            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "EnDescription";
            dtProductType.Columns.Add(clProduct);

            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "ArDescription";
            clProduct.DefaultValue = "0";
            dtProductType.Columns.Add(clProduct);
            return dtProductType ;
        }

Open in new window


 ProductTypes.xlsx


the strange thing the code works with original excel sample of msdn (where I got openxml and linq code), I thought the problem might be because I am using office 2007 (.xlsx) and I discovered later that the sample excel was using also office 2007 and the file extension is .xlsx

then I thought it might be because its contain unicode characters, so I have deleted the column which has unicode characters, but that was same problem


then I thought the problem because there are some cells they are having null value, so I fill all cells with dumy data, but that also does not solve the problem

any help please?
 
here is the sample url: http://msdn.microsoft.com/en-us/library/dd920313.aspx
0
Comment
Question by:njgroup
  • 6
  • 4
11 Comments
 

Author Comment

by:njgroup
ID: 35144740
in function "ReadDataFromExcel", I got "textValues" variable is null and when I have breakpoint on it I can see the result is empty and it says: "Enumeration yielded no results"

please see the snapshot below

 Enumeration yielded no results
0
 

Author Comment

by:njgroup
ID: 35145072
I found the problems, I got two problems,

#1 there are some merged cells (that was in table headers)
#2 the linq select can not set empty string of null values (from the excel sheet there are two column do not have values, they are "EnProductTypeDescription", "ArProductTypeDescription")

so how to get value from merged cells (instead of returning null),
and how to force getting empty string from empty cells (because the array of cells returned contains only data about the cells that they are not empty, and I dont get the empty cell at all,
for example if I have 4 cells have values like this: "A", "B", "", "C", I got array of strings containing "A", "B", "C",
please see the following attachment)

 cells not comming in select statement
0
 
LVL 11

Expert Comment

by:Sudhakar Pulivarthi
ID: 35147780
Hi,
Remove this where condition and check.. This filtering might be the reason for reduced array values.
where cell.CellValue != null
0
 

Author Comment

by:njgroup
ID: 35154279
if I remove the where statement, I got error textValues variable is null
0
 
LVL 62

Expert Comment

by:Fernando Soto
ID: 35158933
Hi njgroup;

Can you post the LinqSample.xlsx The excel you posted was ProductTypes.xlsx

Fernando
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 

Author Comment

by:njgroup
ID: 35170289
they are same, I just copy data from ProductTypes.xlsx to LinqSample.xlsx to keep the path just for quick test
0
 
LVL 62

Expert Comment

by:Fernando Soto
ID: 35171700
Hi njgroup;

You say that you just copied the ProductTypes.xlsx to LinqSample.xlsx for a quick test, well that will not work unless ProductTypes.xlsx has the same names for sheets and column headers names as the original LinqSample.xlsx excel document has. For example the line of code from your post has this:

//Reference to Excel Worksheet with Customer data.
sheetId = workSheets.First( s => s.Name == @"Customer" ).Id;

Open in new window


but the excel workbook from ProductTypes.xlsx has the following three sheets, "CCTV", "Sheet2" and
"Sheet3" but the line of code is looking for a Sheet called Customer so you would get an exception. And if you look at the construction of your DataTable non of the column names match that of the Excel document. You need to re-write the program to match the Excel document that you are using.

Fernando
0
 

Author Comment

by:njgroup
ID: 35172067
hi Fernando,

sorry I have send you an old code, I have just changed sheet name and I did in my code, but this is not my problem,

the program pass from this line of code, but stuck in my previous posts.

the sheet name is not my problem but the previous posts,

can u please just change the sheet name (its only a string)
0
 
LVL 62

Accepted Solution

by:
Fernando Soto earned 500 total points
ID: 35173445
Hi njgroup;

I have found two issues with the code. The first is that this queries function is to get all none header rows from the Excel work sheet. But your Excel work sheet has defined two rows as the header and NOT 1. Therefore you need to change the "where" clause from this, where row.RowIndex > 1, to what is shown in the code snippet below.

IEnumerable<Row> dataRows = from row in sheet.Worksheet.Descendants<Row>( )
                            where row.RowIndex > 2
                            select row;

Open in new window


The second issue is with this query:

IEnumerable<String> textValues = from cell in row.Descendants<Cell>( )
                                 where cell.CellValue != null
                                 select
                                   ( cell.DataType != null && cell.DataType.HasValue && cell.DataType == CellValues.SharedString) 
                                        ? sharedStrings.ChildElements[ int.Parse( cell.CellValue.InnerText ) ].InnerText 
                                        : cell.CellValue.InnerText ;

Open in new window


The where clause in it filters out all cells that do not have a value only returning the cells with values in them. In your Excel document columns 3 and 4 are all empty therefore it only returns columns 1 and 2 and when your code executes this statement, var textArray = textValues.ToArray( );, the textArray only contains two elements and NOT four. This causes an issue with your function AddNewRowTodtProductType which is always expecting four elements. At which point your program will throw an exception of index out of range. I re-wrote the query to return an empty strings for columns that do not have a value in them, see code snippet below.

IEnumerable<String> textValues = from cell in row.Descendants<Cell>( )
                                 select
                                   ( cell.CellValue != null ) ?
                                     ((cell.DataType != null && cell.DataType.HasValue && cell.DataType == CellValues.SharedString )
                                        ? sharedStrings.ChildElements[ int.Parse( cell.CellValue.InnerText ) ].InnerText
                                        : cell.CellValue.InnerText) 
                                     : String.Empty;

Open in new window


Fernando
0
 

Author Closing Comment

by:njgroup
ID: 35178511
thats perfect :)

thanks very much
0
 
LVL 62

Expert Comment

by:Fernando Soto
ID: 35179722
Not a problem, glad I was able to help.
0

Featured Post

Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

Join & Write a Comment

Parsing a CSV file is a task that we are confronted with regularly, and although there are a vast number of means to do this, as a newbie, the field can be confusing and the tools can seem complex. A simple solution to parsing a customized CSV fi…
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
In this tutorial you'll learn about bandwidth monitoring with flows and packet sniffing with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're interested in additional methods for monitoring bandwidt…
This video explains how to create simple products associated to Magento configurable product and offers fast way of their generation with Store Manager for Magento tool.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now