Link to home
Start Free TrialLog in
Avatar of Crystal Rouse
Crystal RouseFlag for United States of America

asked on

Import Excel File using C#

I need to write something allowing a user to upload an Excel File and then import that into a
table in our database using a Stored Procedure to insert the data.

I need help on how to get the ColumnNames and
verify all Required Fields are there, validate that there are not any NULLS in the Required Fields and then loop thru
the rows passing the data in parameters to the Stored Procedure.
Avatar of hilltop
hilltop
Flag of United States of America image

I would convert your excel to csv/comma delimited and use bulk insert and format files which will allow you to import just about any document format. Any prerequisite checks would be done before calling the bulk insert procedure.

Bulk insert stored procedure.
USE [yourdb]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[csv_import]
(
@FilePath nvarchar(4000)
)
AS
declare @SQL varchar(500)
SET @SQL='BULK INSERT Yourtable FROM ''' + @FilePath + ''' WITH (FIRSTROW = 1, FIELDTERMINATOR ='','',ROWTERMINATOR=''\n'')'
EXEC(@sql)

Open in new window


Use a Format File to Bulk Import Data (SQL Server)
https://docs.microsoft.com/en-us/sql/relational-databases/import-export/use-a-format-file-to-bulk-import-data-sql-server?view=sql-server-2017
Not really sure how to say this in a friendly way, but my expert recommendation here is to RUN LIKE HELL!

The big problem you're going to have is that Excel is great for users being able to edit the spreadsheet any way they want.  On the opposite side of the spectrum ETL jobs such as the proposed import from Excel to SQL Server will require a well-defined 'contract' between the schemas of source and target data, which means there can be zero deviation from that such as missing columns/fields.  This especially means that it will not accept a data set with an unknown number of columns/fields and be able to process that into a pre-defined, normalized table.

Which means you'll have to write custom code to handle every little deviation in the Excel spreadsheet.   It can be done, but it will take a ton of hours that is normally not easily justified.

The flip side that would make this a much quicker process is to enforce that the Excel file has a column format that never changes.

Moving from schema to data, validating data in the columns would not be a problem, as SQL Server can easily handle such things as NULL handing, default values, a range of values, etc.  Taking more work would be things like how to handle a date column where the data has a non-date value.

Also, Stored Procedures return data, and typically not consume it, so explain in detail how the Stored procedures and parameters come into play.

Good luck.

btw for an illustration of 'well-defined 'contract' between the schemas of source and target data' check out my article Requirements Document Template for an ETL Project.
If your file format never changes then a bulk insert is a fast easy option. Jim makes a pretty good point, one little mistake in the file and you will have users yelling for support. Mistakes in the file can be fixed "Well not fixed, but ignored." server side via format files. That said it would be just as easy to fix the file itself. You may well be busy, when end users break something they don't generally fix it..

You could Serialize the , delimited file and check each element. You could then insert the data one line at a time, skipping/noting those lines that failed.

CSV Serializer for .NET
https://www.codeproject.com/articles/566656/csv-serializer-for-net

Better yet if you converted your excel document to XML you could use XmlDictionaryReader.
https://docs.microsoft.com/en-us/dotnet/framework/wcf/feature-details/serialization-and-deserialization
https://docs.microsoft.com/en-us/dotnet/api/system.xml.xmldictionaryreader?view=netframework-4.8

Even minor formating issues with the file could wreck the whole thing.
My 2 cents on this.  First, I DO agree with the comments above.  Bulk insert is going to be faster, and Excel data is going to contain errors.
But if this is coming from users as you said was the case, you can't expect users to convert from Excel to CSV, or even know what CSV is.  A good chunk of them will, but a large percentage have no idea, in which case you'll be fielding support calls of "why my excel didn't work".  And if you already have a stored proc (presumably with a lot more logic than a simple insert), then go with that.

So assuming this is coming from users, and assuming it is Excel:
Have a look at EPPlus
https://www.nuget.org/packages/EPPlus

Fantastic wrapper around the openxml parsing of Excel files.  Very quick and easy to use and gives you access to all the data in the Excel sheet.

I have many different systems allowing upload of excel files, this is what I'm doing:
1) Open the excel file
2) Are we safe in your case to assume the top line is headers? IE are you providing them a template / guideline for how to parse?
3) Make sure the columns are there, and save a reference to the position (IE "StartDate" is in column 4)
4) Loop through each row in the excel, verify the data.  Try to convert to the proper data types, save into an object and save that object into a list
5) Once ALL rows have been parsed - if there are any errors, display the errors back to the user for them to fix the sheet and re-upload.
6) If there were not any errors, then loop through the list and call the proc
You can programmatically convert your file. In Excel or on the server,

Excel to XML Conversion
https://www.c-sharpcorner.com/UploadFile/097c29/excel-to-xml-conversation-in-visual-studio/

C# (CSharp) Microsoft.Office.Interop.Excel Workbook.SaveAs Examples
https://csharp.hotexamples.com/examples/Microsoft.Office.Interop.Excel/Workbook/SaveAs/php-workbook-saveas-method-examples.html


 .SaveAs() method in Excel object.
https://stackoverflow.com/questions/2536181/is-there-any-simple-way-to-convert-xls-file-to-csv-file-excel
wbWorkbook.SaveAs("c:\yourdesiredFilename.csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSV)

Open in new window

Or
public static void SaveAs()
{
    Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.ApplicationClass();
    Microsoft.Office.Interop.Excel.Workbook wbWorkbook = app.Workbooks.Add(Type.Missing);
    Microsoft.Office.Interop.Excel.Sheets wsSheet = wbWorkbook.Worksheets;
    Microsoft.Office.Interop.Excel.Worksheet CurSheet = (Microsoft.Office.Interop.Excel.Worksheet)wsSheet[1];

    Microsoft.Office.Interop.Excel.Range thisCell = (Microsoft.Office.Interop.Excel.Range)CurSheet.Cells[1, 1];

    thisCell.Value2 = "This is a test.";

    wbWorkbook.SaveAs(@"c:\one.xls", Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookNormal, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
    wbWorkbook.SaveAs(@"c:\two.csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSVWindows, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

    wbWorkbook.Close(false, "", true);
}

Open in new window

Just to note, I'm making the assumption here that this is on a web server where the processing is happening.
The saveAs and methods above are in the interop assemblies.  As a general rule you should avoid those at ALL costs on a web server.
You'll need an office license / applications installed on the server, and will inevitably get massive threading locks and problems and com disposal headaches.
Avatar of Crystal Rouse

ASKER

Thanks for all the help and suggestions!  I am going to spend some time on this and update with sample code.  My users are "elevated users" that have used our system for awhile.  They currently import an Excel file and know all about having the correct column names.  The current File Import is wrote using REACT/Redux and I"m converting to an MVC C# project using RAZOR.  I don't want to use LINQ and would rather use a Stored Procedure for the insert or maybe BULK INSERT.
BULK INSERT was made to perform this task.

Imports a data file into a database table or view in a user-specified format in SQL Server. I have used BULK INSERT extensively, and have a ton of examples using format files etc..
https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-2017

Personally if this excel file was passed around enough to warrant concern. I would build validation into the file itself, and maybe save as csv programmatically at each save or close. Just throw an error if the user try's to upload the .xlsx and remind them that the server expects the auto created .csv. If the users share the excel file, save the .csv locally etc..

I might consider skipping the automatic save as csv, and just educate the users using Excels Save As dialog, but it can be done automatically.

 Your users may have no issue with validation, saving as csv and uploading. If no validation is needed then its a 30 minute task to start importing your data. That said, cell validation for your excel file could be put into place pretty quickly as well. I would say put off today what you can do tomorrow. If you end up needing cell validation and auto saves, well you grew into it.

Apply data validation to cells
https://support.office.com/en-us/article/Apply-data-validation-to-cells-29FECBCC-D1B9-42C1-9D76-EFF3CE5F7249

Workbook.BeforeClose event (Excel)
https://answers.microsoft.com/en-us/msoffice/forum/msoffice_excel-mso_winother/run-macro-on-close-and-on-save-excel/d4c6b6e1-b16e-4b8f-8ec4-3a38d8e685fc
Thanks for the info!  The data is actually exported from another program so I can't add any validation to the actual Excel File.
That said, the users understand that it has to be in the correct format with the correct column names.  I'll read up on Bulk Insert with SQL.
Exported from another program? Is it an excel workbook? I would open the file in Wordpad and make sure you don't already have a delimited flat file. Excel reads delimited files like a champ, so I would check.
ASKER CERTIFIED SOLUTION
Avatar of Crystal Rouse
Crystal Rouse
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial