Solved

Importing Excel file problem

Posted on 2010-08-12
12
365 Views
Last Modified: 2012-05-10
I am importing Excel 2003 file using SSIS into SQLServer 2005.
Some of the cells in Excel file have little green triangles in the lefthand corners - with a warning that it's a number formatted as text or preceded by an apostrophy. Even though a value is a normal number and when I save it as csv I don't see any apostrophies.
My problem is that when I import this file using SSIS this cells are lost and become NULLs.
Is there any way to avoid it? Excel files are being picked up from ftp site and imported automatically, so fixing it in the file itself is not an option. Unless I create a script that does it.
In SQL Destination is a varchar(10)

Thank you!
0
Comment
Question by:larisa1970
  • 4
  • 4
  • 2
  • +2
12 Comments
 
LVL 12

Expert Comment

by:pastorchris
ID: 33428013
To avoid Excel from formatting your data, try working with CSVs as source files from the remote site.
0
 

Author Comment

by:larisa1970
ID: 33431944
Unfortunately I don't have an option of modifying the source file before importing.
Is it possible to change xls file (one of the sheets in it) to CSV file with a script?
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 33432831
I would try importing the data into a staging table that only has VarChar() columns.  You should be able to set up the import to use a query that tests the numeric values for being numeric and, either accepts them as they are or , if they are numeric, converts them to a varchar.
Once you have your data in the staging table, you can manipulate it much more easily.  You can set up an INSERT query to pull the data from the staging table and convert any numeric column's varchar value to a number as needed.  This will also allow you to handle issues with dates that may be saved as strings and even those "string dates" that are invalid (e.g. '2010-02-29').
0
 
LVL 30

Expert Comment

by:nmcdermaid
ID: 33450519
Its probably this issue
http://www.sqldts.com/254.aspx
This page is for DTS but it's really about the Excel driver. Basically you need to stick the
IMEX=1
part into the Excel connection string that's defined in the Excel connection manager
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33459945
Yep, normally add in the IMEX=1

But there is a small "gotcha" in so much as excel might ignore the first 8 rows.

It does try to "guess" the content of a column based on the first "n" rows. That is controlled by a registry setting :

 [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel] located registry REG_DWORD "TypeGuessRows".

If you set it to zero then you are say scan all rows. I have mine set to decimal 25 thought the books say between 0 and 16.

Have a look at the example spreadsheet below - the last three columns using the following :

select * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0','Excel 8.0;database=c:\order_worksheet.xls;hdr=yes;imex=1',' select * from [order_worksheet$]') as a
select * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0','Excel 8.0;database=c:\order_worksheet.xls;hdr=yes;imex=0',' select * from [order_worksheet$]') as a

order-worksheet.xls
0
 
LVL 12

Expert Comment

by:pastorchris
ID: 33473039
Hi Larisa,
Have you already tried 8080_Diver's solution above?

Let the SSIS package import your excel sheet into a temporaty table which you can drop later.
While the data is in a temporary table, you can analyse it and pick up what exactly you want from an SQL script.

I once had a similar issue but on an SQL 2000 instance so i designed a DTS package which would pick up the source files one by one, into a temporary table (kind of a trans table) and then i would fire a procedure to handle all inconsistencies.

With SSIS, most probably the issue is related to the drivers you have. It's unlikely that your issue is caused by the way the data is stored/appears in the source Excel file.

Regards,
Chris.
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 
LVL 51

Expert Comment

by:Mark Wills
ID: 33474353
Hey Chris,

Just importing you still leave it up to Excel to "auto" decide what column it is - regardless of destination - so alphebetic characters in a numeric column show as NULL and can sometimes do the same the other way around...

So, letting it simply import really wont achieve the results - you need to tell the driver how to override the ever present "help" from excel.

And, it doesnt necessarily have to be SSIS - it could just be a script similar to those openrowsets and essentially "roll your own" import and then simply schedule those scripts. Or a stored procedure or whatever... There are choices.

But always a good idea to first import into a "staging table" then do any validation / tidy up in there before committing to "live".



0
 

Author Comment

by:larisa1970
ID: 33476339
Regarding 8080_Diver's solution:
How do I "set up the import to use a query that tests the numeric values for being numeric and, either accepts them as they are or , if they are numeric, converts them to a varchar."
Jet SQL doesn't recognize CONVERT(VARCHAR,MyField). When I try to do that it complains that it is not a valid SQL statement.
For another suggestion of adding "IMEX=1" - where exactly do I add it?
P.S. It is a text column in excel and varchar(255) in a destination table.

Thank you
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33476478
Could you please try the openrowset with the imex=1 to see if it actually solves your problem... If it doesnt, then we have to do something different...

Is there any particular reason why it must be SSIS ? Could we entertain the idea of using T-SQL (even if an execute sql task)
0
 

Author Comment

by:larisa1970
ID: 33476658
I don't think I can use openrowset because we have 64 bit version
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 500 total points
ID: 33495213
Sure you can...

Or are you saying JET isnt available ? Then have you loaded the ACE (2007 dirver replaces jet)  ?  Can download from : http://www.microsoft.com/downloads/details.aspx?FamilyID=C06B8369-60DD-4B64-A44B-84B371EDE16D&displaylang=en&displaylang=en note there is an x64 version...

Select * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0', 'Excel 12.0;Database=c:\order_worksheet.xls;HDR=Yes;imex=1', 'SELECT * FROM [order_worksheet$]') as a
0
 

Author Closing Comment

by:larisa1970
ID: 33629494
Thank you, Mark.
That worked.
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

Suggested Solutions

If you having speed problem in loading SQL Server Management Studio, try to uncheck these options in your internet browser (IE -> Internet Options / Advanced / Security):    . Check for publisher's certificate revocation    . Check for server ce…
Data architecture is an important aspect in Software as a Service (SaaS) delivery model. This article is a study on the database of a single-tenant application that could be extended to support multiple tenants. The application is web-based develope…
It is a freely distributed piece of software for such tasks as photo retouching, image composition and image authoring. It works on many operating systems, in many languages.
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now