SSIS Data Flow - Missing Column

Hello Experts,

I have a basic SSIS job that imports a CSV using a data flow task.  The CSV and columns are defined in a flat file connection, then dumped into a temp table using a flat file source -> ole db destination.  This works fine when the CSVs are clean.  However, depending on the day, the CSV might be missing a column.  An example:

Clean CSV

PayID     Name     Date                Void
1             John       8-30-2013       V
2             Jane       8-30-2013

Bad CSV

PayID     Name     Date                
1             John       8-30-2013      
2             Jane       8-30-2013

So the flat file connection expects the Void column to be present.  It isn't always present.  In this case if there are no voided checks the column isn't in the CSV file causing a job failure during the data flow.  I have a lot of CSVs like this.  If the missing column was at the end of the file each time I could probably find a way around this, but it is often in the middle of the column order.

Is there a different way of importing a CSV into a temp table where I can define the column names myself without a connection manager?  I'm looking for a way of importing these CSVs even if one of the columns is missing.  How do I achieve this?
jay-areAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Anthony PerkinsCommented:
The only way you can do that is if you first fix the file with some code.  Alternatively you could have two transformations one with and one without that you could then set to use one or the other depending on the number of columns.
0
jay-areAuthor Commented:
Is there a way to define the table schema and compare it to the CSV and add the missing column if need be?
0
Jim P.Commented:
How about importing into a table that is just like row number and then a varchar(1000) DataText column.

Then have an SP that evaluates the data and says it has a void column or not. Then that chooses which further processing it needs to the final table.

The real problem is the data provided is not consistent. If there is a way to create a data specification and enforce it would be the best way. But that is sort of a management decision and we don't know your situation.
0
Big Business Goals? Which KPIs Will Help You

The most successful MSPs rely on metrics – known as key performance indicators (KPIs) – for making informed decisions that help their businesses thrive, rather than just survive. This eBook provides an overview of the most important KPIs used by top MSPs.

awking00Information Technology SpecialistCommented:
Don't know if this will work and have no means of testing, but you might try creating the table with a default value for that last column.
0
jay-areAuthor Commented:
Thanks for the responses, sorry it took me so long to respond.  

Unfortunately I have little control over the CSVs before they reach my sql server.  I wish there was a way to force a column header in these files even if there is no data for the rows.

I thought the best process would be to import he CSV as is into a temp table of some sort, then check the columns of the temp table and look for any missing columns.  If there is one missing add it.  I know what columns should be there and what order they should be in.  Is there a way to achieve this in SSIS?
0
Jim P.Commented:
It would be more a matter of dumping it to a temp table. Then have SSIS fire off the stored procedure to do the fix, and dump it to the end table.
0
jay-areAuthor Commented:
What is the proper way to import a CSV to a temp table without defining the column names first?  I think the best first option is to import the CSV as it is to a temp table, but I'm not sure how to do that.  Currently I just use a flat file source with the CSVs columns already defined.
0
Jim P.Commented:
Just have a table that is an Identity column and then a column called something like TextData that is varchar(2000) or whatever the expected size. The identity column will give the order of the text file that was imported.

Then the stored proc can say does line 1 contain headers, or how many columns, etc and process that way.
0
jay-areAuthor Commented:
jimpen,

I created a table with 2 columns (ColumnID, TextData).  How do I import the CSV no matter the column definitions?  Should the format be fixed width instead of delimited?  I'm not sure how to import the csv without assigning column mappings.  I assume all columns should be mapped to TextData but I'm not sure how to do that part.  I'm used to mapping a flat file source to an ole db destination.
0
Jim P.Commented:
Should the format be fixed width instead of delimited?

Yes. Just import however many characters that you think the widest line will ever be. If the line is shorter -- the SSIS should be set to just ignore the shortage.
0
jay-areAuthor Commented:
Someone mentioned linked servers to me which got me thinking about this problem.  I need to be able to import the csv as is, and if a column is missing add to it.  Why not do this by using select * into via a linked server?  Here's what I came up with:

drop table journaltemp2;
go

select * into journaltemp2
           from (
select * from openrowset('MSDASQL'
               ,'Driver={Microsoft Access Text Driver (*.txt, *.csv)}; 
                   DefaultDir=\\ftpserver\imports\clientdir\'
               ,'select * from journal.CSV')) as journaltemp2;
               go
               
               alter table journaltemp2 add VOID varchar(50) null;
               go

               select * from journaltemp2

Open in new window


I had to download and install that odbc driver for this to work.  It works perfectly.  Now my csv is in a table and I've added the void column.  I'll look into a way of checking to see if VOID exists first, but this does the trick.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
jay-areAuthor Commented:
Solves my issues with missing columns in a CSV file.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.