SSIS: loading vertical-bar delimited files

Posted on 2013-10-19
Medium Priority
Last Modified: 2016-02-11

I am loading pipe (vertical bar) delimeted text files into SQL Server tables.
The actual data shouldn't have | character , however, it happens from time to time.  And the file cannot be loaded unless we manually correct the data.
I would appreciate any ideas on how to improve loading process.

Thank you in advance.
Question by:quasar_ee
LVL 38

Accepted Solution

Jim P. earned 1000 total points
ID: 39586010
It will always have to be a manual process to cleanup the data. And it is an RBAR (Row By Agonizing Row) process.

But a possibility would be to load the data into a staging table that is just a an identity column and a large text column. Then have a sproc go through each row and count the number of pipes. If the count is right move it to the final table. Any counts that are off, either hold in the staging table or move to another table for editing.
LVL 52

Assisted Solution

by:Mark Wills
Mark Wills earned 1000 total points
ID: 39586042
Yeah, nasty business being *mostly* correct.

Depends really on just how often it happens. If getting an error in your SSIS process and then having to fix manually is an occasional inconvenience, or (hopefully) a rarity, then manual inspection is the best method.

Programmatically it becomes a challenge to recognise a wayward pipe. It will be able to count, but the "fix" is harder to achieve programmatically.

Would have to start relying on things like the nth instance being a possible included pipe rather than a delimiter. And then the size of where it should be included and if contextually that would still make sense. And the number of possible occurrences.

If I was to tackle the problem, and assuming it was just in the one column, then would load into a staging table as jimpen suggests and load the preceding columns and then from the right hand side working back so that the only column left was the one containing the rogue pipe. E.g. process columns 1,2,3,4 then 9,8,7,6 leaving the last one 5.

That doesn't take care of multiple columns having the problem, but would take care of multiple pipes within the one column so long as that one column was always the only one of concern.

Does that make sense ?

It means custom T-SQL to break apart the pipe delimited rows.

There would be a bit of overhead attached, so time might be a concern as well as log activity (no longer a bulk load process) and possibly tempdb.
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 39587194
just some additional  questions:
- which SQL Server version are you targeting/using?
- Do you have the chance to change the generation of the flat file to implement text qualifiers?

Meaning from this
1|Just a text with an additional | separator inside|2013-10-21
to get that
1|"Just a text with an additional | separator inside"|2013-10-21
(using any other unused character to encapsulate entries which contain the delimiter character)

LVL 66

Expert Comment

by:Jim Horn
ID: 39587602
<Somewhat redundant with the above posts>

I had that happen on a project once where emails were stored in SQL columns, and many email signatures had pipes in them such as My Name | My Title | My Phone number.

We were able to go back to the client and convince them to do a tilde ( ~ ) delimeted file, and problem solved.  Hilarity ensued when they first balked and we asked them which Magic 8-ball we should use to interpret the pipes in signature blocks vs. delimeter pipes.

The additional five grand estimated by me to pull this off was icing on the cake.  Anytime you can pit your own sales executives against clients who treat everything as fixed-bid, it is well worth the effort.
LVL 66

Expert Comment

by:Jim Horn
ID: 39611734
So quasar_ee, how's it going?  Lots of good advice here.

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

In the first part of this tutorial we will cover the prerequisites for installing SQL Server vNext on Linux.
Microsoft Access has a limit of 255 columns in a single table; SQL Server allows tables with over 255 columns, but reading that data is not necessarily simple.  The final solution for this task involved creating a custom text parser and then reading…
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

627 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question