Solved

Columns in delimited file may vary when input to SSIS

Posted on 2007-11-16
4
934 Views
Last Modified: 2012-06-21
Hi everyone.

A delimited file is being sent to us from another company. The file is supposed to have 10 columns in each row. We are going to process the file using SSIS (2005).

Question - how do we handle the file if some of the rows are bad - are missing one or more columns?

If my package reads the file using a flat file source, when I run the package with a file where some of the rows have fewer than the expected 10 columns, my package abends on the flat file source task.

All we can think of doing is writing .net code to process the file as the first step of the package or even outside the package, to remove bad rows from the file before it hits SSIS.  

Thank you.
0
Comment
Question by:glen_esq
  • 2
4 Comments
 
LVL 12

Accepted Solution

by:
kselvia earned 500 total points
Comment Utility
not a great solution but here goes

Rather than writing your own, download gawk.exe (google for it) to search for rows with 10 commas in it and redirect the output to a good input file, then load that file.

Or bulk insert each row into a single varchar column table.

bulk insert mytable
from 'c:\path\file.csv'

create view v_my_good_table
select * from mytable where len(replace(dta,',')) = len(dta) - 10

exec master..xp_cmdshell "select * from v_my_good_table" queryout "goodfile.csv" -S server -U user -P pwd '

Then load goodfile.csv using your package

Or -- just parse the single column data using charindex and substring and load the columns yourself for rows that are correct



0
 
LVL 3

Expert Comment

by:yellowjetski
Comment Utility
You can try to use the Exception Row feature in SSIS to handle the bad rows.

Overview: SSIS will read each row assuming it qualifies to the pre-defined columns. When there's a bad row of any reason (incorrect number of columns, invalid data type, etc.) You can have SSIS write the bad row only to a exception file and the package continues to process the next line without failing the whole package.

Let me know if you have any questions regarding my solution.
0
 

Author Comment

by:glen_esq
Comment Utility
Thank you both very much for your responses.

yellowjetski - more info please on your solution. I have been testing with a very simple package. A flat file data source which reads my delimited file, and on its failure precedence constraint, I have attached a flat file destination.

If I remove the column delimiter (in our case a tilde '~') and the corresponding data, in other words, my input file has only 9 columns instead of 10, the flat file source bombs, the package proceeds no further - the tlat file source turns red, that's all she wrote.

Can you give me more info on the Exception Row feature? To which type of Control Flow or Data Flow task are you referring?
0
 

Author Comment

by:glen_esq
Comment Utility
Should have added more info - If I set both Error and Truncation error conditions to redirect row, my package will not fail (perhaps this is the Exception Row you are referring to), however, SSIS doesn't handle this as well as I'd like.

I have 8 rows of input, the 6th row I removed one column and its associated delimiter leaving it with 6 columns. I have data viewers on the success and failure constraints from my flat file source. Rows 1 to 5 and 8 are processed successfully, and rows 6 and 7 go to the flat file destination specified on the failure constraint of my flat file source. I'm losing the 6th row (which is missing a column) and the 7th row which contains good data, but SSIS grabs anyway). I don't want to lose the 7th row in these situations.
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

Suggested Solutions

Introduced in Microsoft SQL Server 2005, the Copy Database Wizard (http://msdn.microsoft.com/en-us/library/ms188664.aspx) is useful in copying databases and associated objects between SQL instances; therefore, it is a good migration and upgrade tool…
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Viewers will learn how the fundamental information of how to create a table.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now