Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

why I got null returned in c# code when I am trying to get values from excel sheet using linq and OpenXml?

Posted on 2011-03-15
11
Medium Priority
?
3,616 Views
Last Modified: 2013-12-17
hi,

I have an excel sheet, I want to get all cells and data from that excel sheet using linq and openxml library,

but I got problem that null records returned from excel sheet, here is my code and the excel sheet:

 
private void btnAddProductType_Click(object sender, EventArgs e)
        {
            string path = @"E:\Leamra Projects\ProductManagment\code\UsingLINQToQueryTablesInExcel2007_CODE\LinqSample.xlsx";
            DataTable dtProductType = ReadDataFromExcel(path);
        }

        private DataTable ReadDataFromExcel(string path)
        {
            //Declare variables to hold refernces to Excel objects
            Workbook workBook;
            SharedStringTable sharedStrings;
            IEnumerable<Sheet> workSheets;
            WorksheetPart sheet;

            //Declare helper variables
            string sheetId;

            DataTable dtProductType = CreateProductTypeTableHeaders();
            using (SpreadsheetDocument document = SpreadsheetDocument.Open(path, true))
            {
                //References to the workbook and Shared String Table.
                workBook = document.WorkbookPart.Workbook;
                workSheets = workBook.Descendants<Sheet>();
                sharedStrings = document.WorkbookPart.SharedStringTablePart.SharedStringTable;

                //Reference to Excel Worksheet with Customer data.
                sheetId = workSheets.First(s => s.Name == @"Customer").Id;
                sheet = (WorksheetPart)document.WorkbookPart.GetPartById(sheetId);

                IEnumerable<Row> dataRows = from row in sheet.Worksheet.Descendants<Row>()
                                            where row.RowIndex > 1
                                            select row;
                foreach (Row row in dataRows)
                {
                    IEnumerable<String> textValues = from cell in row.Descendants<Cell>()
                                                     where cell.CellValue != null
                                                     select
                                                       (cell.DataType != null
                                                         && cell.DataType.HasValue
                                                         && cell.DataType == CellValues.SharedString ? sharedStrings.ChildElements[int.Parse(cell.CellValue.InnerText)].InnerText : cell.CellValue.InnerText);
                    //Check to verify the row contained data.
//here I got textValues is null !!!!
                    if (textValues.Count() > 0)
                    {
                        //Create a Customer and add it to the list.
                        var textArray = textValues.ToArray();
                        if (!AddNewRowTodtProductType(textArray[0], textArray[1], textArray[2], textArray[3], dtProductType))
                        {
                            return null;
                        }
                    }
                    else
                    {
                        //If no Cells assume the end of the table has been reached.
                        break;
                    }
                }


            }
            return dtProductType;
        }

        private bool AddNewRowTodtProductType(string EnProductType, string ArProductType, string EnDescription, string ArDescription, DataTable dtProductType)
        {
            try
            {

                DataRow row = null;
                row = dtProductType.NewRow();
                row["EnProductType"] = EnProductType;
                row["ArProductType"] = ArProductType;
                row["EnDescription"] = EnDescription;
                row["ArDescription"] = ArDescription;
                dtProductType.Rows.Add(row);
                return true;
            }
            catch
            {
                return false;
            }
        }

        private DataTable CreateProductTypeTableHeaders()
        {
            DataTable dtProductType = new DataTable("ProductTypes");
            DataColumn clProduct;
            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "EnProductType";
            dtProductType.Columns.Add(clProduct);

            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "ArProductType";
            dtProductType.Columns.Add(clProduct);

            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "EnDescription";
            dtProductType.Columns.Add(clProduct);

            clProduct = new DataColumn();
            clProduct.DataType = Type.GetType("System.String");
            clProduct.ColumnName = "ArDescription";
            clProduct.DefaultValue = "0";
            dtProductType.Columns.Add(clProduct);
            return dtProductType ;
        }

Open in new window


 ProductTypes.xlsx


the strange thing the code works with original excel sample of msdn (where I got openxml and linq code), I thought the problem might be because I am using office 2007 (.xlsx) and I discovered later that the sample excel was using also office 2007 and the file extension is .xlsx

then I thought it might be because its contain unicode characters, so I have deleted the column which has unicode characters, but that was same problem


then I thought the problem because there are some cells they are having null value, so I fill all cells with dumy data, but that also does not solve the problem

any help please?
 
here is the sample url: http://msdn.microsoft.com/en-us/library/dd920313.aspx
0
Comment
Question by:njgroup
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 4
11 Comments
 

Author Comment

by:njgroup
ID: 35144740
in function "ReadDataFromExcel", I got "textValues" variable is null and when I have breakpoint on it I can see the result is empty and it says: "Enumeration yielded no results"

please see the snapshot below

 Enumeration yielded no results
0
 

Author Comment

by:njgroup
ID: 35145072
I found the problems, I got two problems,

#1 there are some merged cells (that was in table headers)
#2 the linq select can not set empty string of null values (from the excel sheet there are two column do not have values, they are "EnProductTypeDescription", "ArProductTypeDescription")

so how to get value from merged cells (instead of returning null),
and how to force getting empty string from empty cells (because the array of cells returned contains only data about the cells that they are not empty, and I dont get the empty cell at all,
for example if I have 4 cells have values like this: "A", "B", "", "C", I got array of strings containing "A", "B", "C",
please see the following attachment)

 cells not comming in select statement
0
 
LVL 11

Expert Comment

by:Sudhakar Pulivarthi
ID: 35147780
Hi,
Remove this where condition and check.. This filtering might be the reason for reduced array values.
where cell.CellValue != null
0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 

Author Comment

by:njgroup
ID: 35154279
if I remove the where statement, I got error textValues variable is null
0
 
LVL 64

Expert Comment

by:Fernando Soto
ID: 35158933
Hi njgroup;

Can you post the LinqSample.xlsx The excel you posted was ProductTypes.xlsx

Fernando
0
 

Author Comment

by:njgroup
ID: 35170289
they are same, I just copy data from ProductTypes.xlsx to LinqSample.xlsx to keep the path just for quick test
0
 
LVL 64

Expert Comment

by:Fernando Soto
ID: 35171700
Hi njgroup;

You say that you just copied the ProductTypes.xlsx to LinqSample.xlsx for a quick test, well that will not work unless ProductTypes.xlsx has the same names for sheets and column headers names as the original LinqSample.xlsx excel document has. For example the line of code from your post has this:

//Reference to Excel Worksheet with Customer data.
sheetId = workSheets.First( s => s.Name == @"Customer" ).Id;

Open in new window


but the excel workbook from ProductTypes.xlsx has the following three sheets, "CCTV", "Sheet2" and
"Sheet3" but the line of code is looking for a Sheet called Customer so you would get an exception. And if you look at the construction of your DataTable non of the column names match that of the Excel document. You need to re-write the program to match the Excel document that you are using.

Fernando
0
 

Author Comment

by:njgroup
ID: 35172067
hi Fernando,

sorry I have send you an old code, I have just changed sheet name and I did in my code, but this is not my problem,

the program pass from this line of code, but stuck in my previous posts.

the sheet name is not my problem but the previous posts,

can u please just change the sheet name (its only a string)
0
 
LVL 64

Accepted Solution

by:
Fernando Soto earned 2000 total points
ID: 35173445
Hi njgroup;

I have found two issues with the code. The first is that this queries function is to get all none header rows from the Excel work sheet. But your Excel work sheet has defined two rows as the header and NOT 1. Therefore you need to change the "where" clause from this, where row.RowIndex > 1, to what is shown in the code snippet below.

IEnumerable<Row> dataRows = from row in sheet.Worksheet.Descendants<Row>( )
                            where row.RowIndex > 2
                            select row;

Open in new window


The second issue is with this query:

IEnumerable<String> textValues = from cell in row.Descendants<Cell>( )
                                 where cell.CellValue != null
                                 select
                                   ( cell.DataType != null && cell.DataType.HasValue && cell.DataType == CellValues.SharedString) 
                                        ? sharedStrings.ChildElements[ int.Parse( cell.CellValue.InnerText ) ].InnerText 
                                        : cell.CellValue.InnerText ;

Open in new window


The where clause in it filters out all cells that do not have a value only returning the cells with values in them. In your Excel document columns 3 and 4 are all empty therefore it only returns columns 1 and 2 and when your code executes this statement, var textArray = textValues.ToArray( );, the textArray only contains two elements and NOT four. This causes an issue with your function AddNewRowTodtProductType which is always expecting four elements. At which point your program will throw an exception of index out of range. I re-wrote the query to return an empty strings for columns that do not have a value in them, see code snippet below.

IEnumerable<String> textValues = from cell in row.Descendants<Cell>( )
                                 select
                                   ( cell.CellValue != null ) ?
                                     ((cell.DataType != null && cell.DataType.HasValue && cell.DataType == CellValues.SharedString )
                                        ? sharedStrings.ChildElements[ int.Parse( cell.CellValue.InnerText ) ].InnerText
                                        : cell.CellValue.InnerText) 
                                     : String.Empty;

Open in new window


Fernando
0
 

Author Closing Comment

by:njgroup
ID: 35178511
thats perfect :)

thanks very much
0
 
LVL 64

Expert Comment

by:Fernando Soto
ID: 35179722
Not a problem, glad I was able to help.
0

Featured Post

Quiz: What Do These Organizations Have In Common?

Hint: Their teams ended up taking quizzes, too.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

The ECB site provides FX rates for major currencies since its inception in 1999 in the form of an XML feed. The files have the following format (reducted for brevity) (CODE) There are three files available HERE (http://www.ecb.europa.eu/stats/exch…
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …
Want to learn how to record your desktop screen without having to use an outside camera. Click on this video and learn how to use the cool google extension called "Screencastify"! Step 1: Open a new google tab Step 2: Go to the left hand upper corn…

715 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question