?
Solved

How To: LINQ to Excel

Posted on 2011-05-11
6
Medium Priority
?
1,929 Views
Last Modified: 2013-11-11
I have 2 excel files with File1.Col2 = File2.Col1.  I want to use LINQ to give me a result of all data from File2 based on File1.  I found this code online and it's using IEnumerable lists, but I know that it isn't efficient.  However, I don't know how to make it better.
 
class ExcelRow
    {
        List<object> columns;

        public ExcelRow()
        {
            columns = new List<object>();
        }

        internal void AddColumn(object value)
        {
            columns.Add(value);
        }

        public object this[int index]
        {
            get { return columns[index]; }
        }

        public string GetString(int index)
        {
            if (columns[index] is DBNull)
            {
                return null;
            }
            return columns[index].ToString();
        }

        public int Count
        {
            get { return this.columns.Count; }
        }
    }

class ExcelProvider : IEnumerable<ExcelRow>
    {
        private string sheet;
        private string filePath;
        private List<ExcelRow> rows;


        public ExcelProvider()
        {
            rows = new List<ExcelRow>();
        }

        public static ExcelProvider Create(string filePath, string sheet)
        {
            ExcelProvider provider = new ExcelProvider();
            provider.sheet = sheet;
            provider.filePath = filePath;
            return provider;
        }

        private void Load()
        {
            string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=    ""Excel 8.0;HDR=YES;""";
            connectionString = string.Format(connectionString, filePath);
            rows.Clear();
            using (OleDbConnection conn = new OleDbConnection(connectionString))
            {
                conn.Open();
                using (OleDbCommand cmd = conn.CreateCommand())
                {
                    cmd.CommandText = "SELECT * FROM [" + sheet + "$] WHERE Name IS NOT NULL ORDER BY Name";
                    using (OleDbDataReader reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            ExcelRow newRow = new ExcelRow();
                            for (int count = 0; count < reader.FieldCount; count++)
                            {
                                newRow.AddColumn(reader[count]);
                            }
                            rows.Add(newRow);
                        }
                    }
                }
            }
        }

        public IEnumerator<ExcelRow> GetEnumerator()
        {
            Load();
            return rows.GetEnumerator();
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            Load();
            return rows.GetEnumerator();
        }

    }

class Program {
  static void Main(string[] args) {
    ExcelProvider datapullList =
            ExcelProvider.Create(@"c:\temp\data.xls", "Sheet1");
    ExcelProvider rbintlList =
            ExcelProvider.Create(@"c:\temp\data2.xls", "Sheet1");

            var myList = from d in datapullList
                         from r in rbintlList
                        where d.GetString(0) == r.GetString(1)
                        select new
                        {
                            Name = d.GetString(0),           
                            Alias = r.GetString(0)
                        };
  }
}

Open in new window


I've considered using DataTables but I'm not sure if it's worth all the effort to set it up.  Besides, I'm not convinced that DataTables are better than IEnumerable lists.  I'd like to hear your feedback on DataTables vs IEnumerable Lists.  Also please help me with my code.

Thank you in advance.
0
Comment
Question by:gcastong
  • 3
  • 2
5 Comments
 
LVL 36

Expert Comment

by:Miguel Oz
ID: 35743755
I do not think DataTables are better than IEnumerable lists, you still need to fill the datatable and datatable is a more general class(DataTables have more overhead than the code you show here). The reader in your code is better than datatable.
For an example on the extra overhead, check:
http://aspdotnetcode.source-of-humor.com/CodeSnippets/ImportExport/ImportMsExcelWorkSheetsDataIntoDataSetDataTables.aspx
0
 

Author Comment

by:gcastong
ID: 35746487
Thanks for the response @mas_oz2003.

So, do you believe that my code is efficient?  I feel that it isn't, but I'm just a novice programmer.  If it isn't, how can I make it more efficient, without going through the DataTables route?
0
 

Accepted Solution

by:
gcastong earned 0 total points
ID: 35748418
I decided to stick with my reader and see where that takes me.  Unfortunately, I'm stumped at passing that "variable list" to another method.  I'm having trouble with the type casting.  Here's my code:
IEnumerable<ExcelProvider> tempList = setupList(myList, initials);

private static IEnumerable<ExcelProvider> setupList(ExcelProvider myList, string[] Inits)
        {
            var myList1 = from x in myList
                          where myList1.Alias == Inits[0]
                          select x;

            return myList1;
        }

Open in new window


I've been trying to figure this out for a long time now.  Please help!!!
0
 

Author Comment

by:gcastong
ID: 35905795
I've requested that this question be closed as follows:

Accepted answer: 0 points for gcastong's comment http:/Q_27033299.html#35748418

for the following reason:

The solution that I came up was accurate and pretty efficient.
0
 
LVL 36

Expert Comment

by:Miguel Oz
ID: 35905796
Can you post your final solution?
Apologies for delay, being a bit busy lately.
Glad to see that you follow my recommendation to stick to the reader. Regarding efficiency is hard to tell without knowing Excel file contents and size, and keep in mind you can always use Excel interop if you are using this code in non-multithreaded app
0

Featured Post

NFR key for Veeam Agent for Linux

Veeam is happy to provide a free NFR license for one year.  It allows for the non‑production use and valid for five workstations and two servers. Veeam Agent for Linux is a simple backup tool for your Linux installations, both on‑premises and in the public cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article describes how you can use Custom Document Properties to store settings and other information in your workbook so that they will be available the next time you open the workbook.
Windows Explorer let you handle zip folders nearly as any other folder: Copy, move, change, and delete, etc. In VBA you can also handle normal files and folders, but zip folders takes a little more - and that you'll find here.
This Micro Tutorial will demonstrate the scrolling table in Microsoft Excel using the INDEX function.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

864 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question