<

Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x

Retrieving Data From Excel Using OPENROWSET()

Published on
80,164 Points
72,164 Views
10 Endorsements
Last Modified:
ValentinoV
1998: C++ - SQL Server 6.5
2000-2007: C++, VB6, C#, java - SQL Server 7.0-2005
2008-...: SQL Server 2005-2016
2014-2016: MVP Data Platform
The purpose of this article is to demonstrate how to retrieve data from an Excel sheet and put it in a table in a SQL Server database.

Introduction

Anyone who's ever used a computer for a significant amount of time has probably come into contact with Excel, the spreadsheet application part of the Microsoft Office suite. Its main purposes are to perform calculations and create charts and pivot tables for analysis.

But people have great imagination and invent new uses for it every day.  I've even seen it used as a picture album.  (Sorry dad, but I know you won't be reading this anyway. :-) )  Ever since he had this specific YACI, or "Yet Another Computer Issue", because his PC wasn't powerful enough to open his 45 MB Excel file, uh, "picture collection", he took some evening classes.  He's now putting his Photoshopped pictures in PowerPoint...  Anyway, let's get back on track now.

Another use, and the one that's the subject of this article, is when Excel has been used as a database.  Come on, you know what I'm talking about, with the first row containing the column headers followed by possibly thousands of data rows.  The following screenshot contains an example, and is also the file that I will be using in this article.  I took all records from the Production.Product table in the AdventureWorks 2008R2 database and dumped them in Excel.

 An Excel sheet used as a data store
At some point people will realize, either because someone told them or because they lost some data due to inattentiveness, that it wasn't a really good idea to keep all that data in an Excel sheet.  And they'll ask you to put it in a real database such as SQL Server.

That's what I'm going to show you in the next paragraphs: how to import data from Excel into SQL Server.

Using OPENROWSET() To Query Excel Files

There are actually several different ways to achieve this.  In this article I will use the OPENROWSET() function.  This is a T-SQL function that can be used to access any OLE DB data source.  All you need is the right OLE DB driver.  The oldest version which I could confirm that contains this function is SQL Server 7.0, good enough to say that any version supports it.

My sample Excel files are located in C:\temp\.  This folder contains two files: Products.xls and Products.xlsx.  The first file is saved in the old format, Excel 97-2003, while the second file was saved from Excel 2010.  Both files contain the same data.  The sheet containing the list of products is called ProductList.

And here are the queries:

--Excel 2007-2010
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0 Xml;HDR=YES;Database=C:\temp\Products.xlsx',
    'SELECT * FROM [ProductList$]');

--Excel 97-2003
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
    'Excel 8.0;HDR=YES;Database=C:\temp\Products.xls',
    'select * from [ProductList$]');

Open in new window


These queries are just returning the data from the Excel file into the Results window, when executed using the Management Studio.  To insert the data into a table, uncomment the INTO clause.  When uncommented, the statement retrieves the data from the Excel sheet and puts it into a newly-created local temporary table called #productlist.

Furthermore, the query assumes that the first row contains the header.  If that's not the case, replace HDR=YES with HDR=NO.

Note: if you get an error message when running the query, look further down in this article.  I've covered a couple of them.

With the INTO clause uncommented and the query executed, the temporary table can now be queried just like any other table:

SELECT * FROM #productlist

Open in new window


What Type Is Your Data?

Let's have a look if this method of using a SELECT INTO in combination with OPENROWSET and a temporary table is smart enough to interpret the correct data types of the data coming in.  Use the following command to describe the metadata of the temporary table:

USE tempdb;
GO
sp_help '#productlist';

Open in new window


Because a temporary table is stored in the tempdb, the sp_help command should be issued against that database.

Here's the part of the output in which we're interested:

 The data types used when combining OPENROWSET with SELECT INTO
As you can see, anything that looks like text will be put in a field of type nvarchar(510) and anything that looks like a number (integers, floating-point numbers, datetime values, ...) is put into a float(53).  Not a lot of intelligence there.  This is the result when no formatting was put on the cells in Excel.

As an experiment I've changed the format of some fields in the Excel file and then retried the SELECT INTO statement.  What did I change?  I identified ProductID as being a number without any decimals, changed StandardCost and ListPrice to a currency with four decimal digits and I changed SellStartDate and SellEndDate to a custom date/time format showing both date and time.

The effect on the table creation was not completely as I would have expected:

 SELECT INTO with some field types changed
ProductID is still being stored into a float field, even though in Excel it's defined as having no decimals.  And the datetime values are not recognized either.  Okay, I used a custom format there, so maybe it's due to that.

It's up to you of course how you use this method of importing the data.  You can put your records into a temporary table to process further, or you can create a table with the expected data types upfront and import the data directly into that one.

Some Possible Issues

Let's cover some issues related to this method.

Enable 'AD Hoc Distributed Queries'

The OPENROWSET() function expects that the 'Ad Hoc Distributed Queries' option is enabled on the server.  When that's not the case you'll see the following message:


Msg 15281, Level 16, State 1, Line 1

SQL Server blocked access to STATEMENT 'OpenRowset/OpenDatasource' of component 'Ad Hoc Distributed Queries' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ad Hoc Distributed Queries' by using sp_configure. For more information about enabling 'Ad Hoc Distributed Queries', see "Surface Area Configuration" in SQL Server Books Online.

This is one of the advanced options.  To enable it you can use the following command:

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO

sp_configure 'Ad Hoc Distributed Queries', 1;
GO
RECONFIGURE;
GO

Open in new window


To get a good look at all the different settings, just run the sp_configure procedure without any parameters.

Note: if you're not the administrator of the server, you should talk to the DBA who's responsible before attempting this.

The File Needs To Be Closed

When the Excel file is not closed, you'll end up with the following error:


Msg 7399, Level 16, State 1, Line 1

The OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "(null)" reported an error. The provider did not give any information about the error.

Msg 7303, Level 16, State 1, Line 1

Cannot initialize the data source object of OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "(null)".

So close the file and try the query again.

OLE DB Driver Not Installed

The OPENROWSET() function uses OLE DB, so it needs a driver for your data source, in this case for Excel.  If the right driver is not installed, you'll see the following error (or similar, depends on the version used).


Msg 7302, Level 16, State 1, Line 1

Cannot create an instance of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)".

To solve the issue, install the right driver and try again.

How can you tell what drivers are installed?  Open up the ODBC Data Source Administrator window (Start > Run > type ODBCAD32.EXE and enter) and have a look in the Drivers tab.  The following screenshot (taken on a Dutch Windows XP) shows both the JET 4.0 driver for Excel 97-2003 and the fairly-new ACE driver for Excel 2007.

 odbcad32.exe - ODBC Data Source Administrator
The drivers can be downloaded from the following pages on the Microsoft site:

Excel 97-2003 Jet 4.0 driver
Excel 2007 ACE driver – 12.00.6423.1000
Excel 2010 ACE driver (beta) – 14.00.4732.1000

Sidenote: the Excel 2010 driver is not supported on Windows XP, but I was able to query the 2010 Excel sheet using the 2007 driver.  I guess that this is the result of the Office Open XML standard which was introduced in Office 2007.

Driver backward-compatibility

The ACE drivers are backwards-compatible.  So the following queries are working perfectly:

--old Excel with new ACE driver - working query 1
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 8.0;HDR=YES;Database=C:\temp\Products.xls',
    'SELECT * FROM [ProductList$]');

--old Excel with new ACE driver - working query 2
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;HDR=YES;Database=C:\temp\Products.xls',
    'SELECT * FROM [ProductList$]');

Open in new window


In other words, you won't be needing that first link for the Jet driver.  For the full story have a look at this blog post by Adam Saxton of the CSS SQL Server Escalation Services team.

The 64-bit Story

So, what if you're running a 64-bit OS?  I'll start by saying that I had quite some issues getting OPENROWSET to work, but finally I managed it.  Following is a list of my attempts, each time with the resulting message.  And finally I'll show you how I got it to work.  The problem was something really unexpected...

ACE 14 64-bit through SSMS

My main laptop is running Windows 7 64-bit, Office 2010 64-bit and SQL Server 2008 R2 64-bit.  So I installed the 64-bit version of the ACE 14 driver, which happens to be the first OLE DB driver for Excel that ships in 64-bit.  But when I execute my query I'm getting the following message:


Msg 7403, Level 16, State 1, Line 1

The OLE DB provider "Microsoft.ACE.OLEDB.14.0" has not been registered.

Is this because SSMS ships only in 32-bit?  Maybe, but I'm not able to install the 32-bit driver.  It doesn't allow me to because I've got Office in 64-bit installed.  The installer throws me the following error:

 Microsoft Access database engine 2010 (beta) - You cannot install the 32-bit version of Access Database engine for Microsoft Office 2010 because you currently have 64-bit Office products installed...
ACE 12 32-bit on a 64-bit machine

When I check the installed drivers using the 32-bit version of the ODBC Data Source Administrator (located in C:\Windows\SysWOW64), I notice that the ACE 12 driver is installed.  However, trying to use that one from the Management Studio gives me this:


Msg 7399, Level 16, State 1, Line 1

The OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)" reported an error. The provider did not give any information about the error.

Msg 7330, Level 16, State 2, Line 1

Cannot fetch a row from OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)".

The Results pane shows all the columns with the right column names, retrieved from Excel.  But the driver seems to have a problem retrieving the actual data.

This issue with error 7330 is mentioned in the following thread on the SQL Server MSDN forum, but unfortunately the proposed solution didn't solve the problem in my case.

64-bit SQLCMD using ACE 14 driver

I also tried using the 64-bit version of sqlcmd.exe, but strangely enough that throws the same error.

 Using sqlcmd 64-bit to query Excel
I actually expected this last method to work, after all, everything is now running in 64-bit.  But alas, it didn't...

One more go...

After some more trial and error, I have actually found a way to get the query to work.  I don't have a logical explanation on why it's behaving the way it is, but, well, it is working...

This query is running fine:

SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0 Xml;HDR=YES;Database=C:\temp\Products.xlsx',
    'SELECT * FROM [ProductList$]');

Open in new window


But this one isn't:

--Excel 2007-2010
SELECT * --INTO #productlist
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0 Xml;HDR=YES;Database=C:\temp\Products.xlsx',
    'SELECT * FROM [ProductList$]');

Open in new window


It's exactly the same query, only difference is the comment line at the start.  And even weirder, if I add a space after the double-dash, the query works fine as well!

Then I decided to remove the commented INTO clause.  This made the weird behavior disappear.  So for some reason SQL Server doesn't like the OPENROWSET function combined with comments inside the query.  The strange behavior also disappears when a space is added between the double-dash and the INTO keyword.

Uh, computers can be so much fun, right? :-)

If anyone has got an explanation on this strange behavior: please do post a comment!  For now my conclusion is: don't use comments when creating an OPENROWSET query.

 

IMPORTANT UPDATE (April 11, 2010): it seems that the current installer for the ACE 14 driver contains a bug and registers it as being "Microsoft.ACE.OLEDB.12.0" instead of "Microsoft.ACE.OLEDB.14.0" .  This explains some of the issues shown above.  Some evidence on the issue:

Microsoft Connect: Access Database Engine 2010 installation issue to use with ADO access technology to access data from Jet database (.mdb files)

The 'Microsoft.ACE.OLEDB.14.0' provider is not registered ..... (see last comment)

Excel Services, ODC and Microsoft.ACE.OLEDB.14.0

 

Conclusion

The above has shown that OPENROWSET() can be a useful function, given the right circumstances.  But in the wrong setting it can be quite cumbersome to get to work.

I would recommend this method only for one-off quick imports, such as when you as a developer are given a bunch of data in a spreadsheet and need to get it into the database, one way or another.  I would not use it for an automated import process.  For that we've got a more interesting alternative which I'll cover in an upcoming article.

Have fun!

PS: is it a Yes or is it a No??  Please click one before you go! :-)

Valentino.

Originally appeared at my own website: http://blog.hoegaerden.be/2010/03/29/retrieving-data-from-excel 

References
BOL 2008: Special Table Types (incl. temporary tables)
BOL 2008: OPENROWSET() function
BOL 2008: the INTO clause
CSS SQL Server Engineers: How to get a x64 version of Jet?
10
Comment
Author:ValentinoV
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
6 Comments
 
LVL 66

Expert Comment

by:Jim Horn
Excellent article.  Voted Yes.
0
 
LVL 48

Expert Comment

by:Dale Fye
Valentino,

Great article, helped me with a project I've been working on which was loading data into Access, then writing to SQL Server and taking forever.  I'm loading 600-750K rows from several tables every month into a much larger

One question, can I put the process in a stored procedure and pass the file name and worksheet name into the stored procedure as variables?  

I'm an Access guy, did some work in SQL Server 2000 but have not been in a position to use it since.  I just really getting started using SQL Server Express 2008.  So could use some advice here.  I'll post a question if you would like.
0
 
LVL 37

Author Comment

by:ValentinoV
Hi fyed,

Regarding your Q: that would be possible but it also means you'd have to use dynamic SQL.  But yes, technically this is an option.

However, if you're looking at a process which will need to happen on a scheduled basis, I would look into using SSIS.  And I happen to have an article on that as well, see that link in the Conclusion above :)

SSIS does have a fairly steep learning curve at the start but it will be worth the effort!  Search for a couple of getting started articles that focus on using the Data Flow transform, play around with it yourself, and you'll see how powerful that can be...

Ow, you mentioned Express...  If that's the only version you've got available then unfortunately SSIS is not an option.  I'll post the above info anyway, just in case you do have access to a broader version of SQL Server...

Best regards,
Valentino.
0
Migrating Your Company's PCs

To keep pace with competitors, businesses must keep employees productive, and that means providing them with the latest technology. This document provides the tips and tricks you need to help you migrate an outdated PC fleet to new desktops, laptops, and tablets.

 
LVL 48

Expert Comment

by:Dale Fye
Thanks.  I'll take a look.

Yes, this process is repeated once every month, but I think I can handle the "dynamic" nature of the file and sheet names by building the SQL in Access and passing it to SQL Server in a pass-through query.
0
 
LVL 37

Author Comment

by:ValentinoV
I'm not that familiar with Access so I'm not sure what you mean with "pass-through" query.  But dynamic SQL in the stored proc wouldn't really have to be complicated.  Basically you just build a long string which contains the SELECT statement.  Once the variable contains a valid SELECT statement, you execute it through sp_executesql, a system SP made for that purpose.

If you're up to it, this is a really interesting (though long) article: The Curse and Blessings of Dynamic SQL

For just the basics:
sp_executesql (Transact-SQL)
Execute Dynamic SQL commands in SQL Server
0
 

Expert Comment

by:Jones Bravo
Most complete and integrated solutions that enable you to leverage a wide range of information to operate more efficiently and make better decisions. Gain insight into every aspect of the business and discover new ways to strategize, plan, and optimize your business Mark Hurd Oracle …..
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

Join & Write a Comment

Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month