SSIS Converting Flat File with Header and Detail lines


I have a flat file that has two row types (OH = Order header, OL = Order line). One OH row can have multiple OL lines. I would like to combine the OH data rows and the OL data rows into one output row.

So for example the source data may look like this:

OH, Ted, #12345, M, United Kingdom
OL, 1, 2, 3, 4
OL, 2, 2, 4, 5
OH, Bob, #12346, M, United Kingdom
OL, 2, 4, 4, 6
OL, 2, 2, 4, 7

I would like the target file to be:
Ted, #12345, M, United Kingdom, 1, 2, 3, 4
Ted, #12345, M, United Kingdom, 2, 2, 4, 5
Bob, #12346, M, United Kingdom, 2, 4, 4, 6
Bob, #12346, M, United Kingdom, 2, 2, 4, 7

I know I could split the data rows into two outputs but I do not know how I would then link the data back together as there is no link in the source file between the OH and OL rows.

Thanks, Greg
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

zephyr_hex (Megan)DeveloperCommented:
I think you'll need to iterate over the source using a script task so that you're able to detect when you're moving to a new order.  I can't think of another way to ensure you don't lose the relationship between header and lines, unless the source has structural entities like XML.
spen_langAuthor Commented:
This is the route I have gone down. The CSV file is delimited using commas and the Order header and order detail rows have different amount of columns (something I forgot to mention in the post).

So in my script I am splitting the row in the file using the delimiter of a comma. However, my concern is that someone may put a comma in the text of the one of the columns, which would cause problems (e.g. in an address line). Is there any way to overcome this?
spen_langAuthor Commented:
Code being used in the script component:

Public Class ScriptMain
    Inherits UserComponent

    Private sDeliveryName As String = ""
    Private sDeliveryAddress1 As String = ""
    Private sDeliveryAddress2 As String = ""
    Private sDeliveryAddress3 As String = ""
    Private sDeliveryAddress4 As String = ""
    Private sDeliveryAddress5 As String = ""
    Private sCustomerOrderNumber As String = ""
    Private sDespatchDate As String = ""
    Private sDeliveryDate As String = ""
    Private sOrderNotes1 As String = ""
    Private sOrderNotes2 As String = ""
    Private sOrderNotes3 As String = ""
    Private sErrorLog As String = ""

    Private sRow As String()
    Const sDelimiter As String = ","

    Public Overrides Sub OrderRows_ProcessInputRow(ByVal Row As OrderRowsBuffer)
        ' Add your code here

        sRow = Row.OrderRow.Split(sDelimiter.ToCharArray())

        If sRow(0).ToString.ToUpper = "OH" Then
        ElseIf sRow(0).ToString.ToUpper = "OL" Then
            Row.Address1 = sDeliveryName
            Row.Address2 = sDeliveryAddress1
            Row.Address3 = sDeliveryAddress2
            Row.Address4 = sDeliveryAddress3
            Row.Address5 = sDeliveryAddress4
            Row.BarCode = ""
            Row.BatchNo = ""
            Row.BBE = ""
            Row.CustomerCode = "TEST"
            Row.CustomerGoodsRef = ""
            Row.CustOrderNo = sCustomerOrderNumber
            Row.CustRef = ""
            Row.DeliveryTime = "08:00"
            Row.DeliveryDateTime = sDeliveryDate
            Row.DeliveryInstructions = sOrderNotes1
            Row.Depot = ""
            Row.DespatchDate = sDespatchDate
            Row.DestLocation = ""
            Row.Location = ""
            Row.OrderPriority = ""
            Row.OrderDate = Date.Today.ToString("dd/MM/yyyy")
            Row.OrderField1 = ""
            Row.OrderField2 = ""
            Row.OrderField3 = ""
            Row.OrderField4 = ""
            Row.Postcode = sDeliveryAddress5
            Row.ProductCode = sRow(1).ToString
            Row.Quantity = sRow(7).ToString
            Row.TransportOut = ""
            Row.UnitsOfOrder = "2"
            Row.UserField3 = ""
            Row.UserField4 = ""
            Row.Warehouse = ""
            'Row.ProductDesc = sRow(2).ToString
        End If

    End Sub

    Private Sub processHeaderRow()

        sDeliveryName = sRow(13).ToString
        sDeliveryAddress1 = sRow(14).ToString
        sDeliveryAddress2 = sRow(15).ToString
        sDeliveryAddress3 = sRow(16).ToString
        sDeliveryAddress4 = sRow(17).ToString
        sDeliveryAddress5 = sRow(18).ToString
        sCustomerOrderNumber = sRow(23).ToString
        sDespatchDate = sRow(28).ToString
        sDeliveryDate = sRow(29).ToString
        sOrderNotes1 = sRow(25).ToString
        sOrderNotes2 = sRow(26).ToString
        sOrderNotes3 = sRow(27).ToString

    End Sub

Open in new window

Big Business Goals? Which KPIs Will Help You

The most successful MSPs rely on metrics – known as key performance indicators (KPIs) – for making informed decisions that help their businesses thrive, rather than just survive. This eBook provides an overview of the most important KPIs used by top MSPs.

zephyr_hex (Megan)DeveloperCommented:
Is it possible to delimit by a less common character, such as pipe ("|") ?  There isn't a simple way to detect the difference between a delimiter and a comma in an address.
spen_langAuthor Commented:
It doesn't look like the provider is able to change the delimiter...
zephyr_hex (Megan)DeveloperCommented:
I can't think of another way.  Best case, you throw an error if the number of elements you expect in a line doesn't match up with the delimited pieces.  You could try to guess at a non-delimiter comma, but in truth, that could just give you bad data when the guess is wrong.

Is the provider able to strip or replace commas that aren't delimiters?

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
spen_langAuthor Commented:
I think throwing an error if there is an incorrect amount of elements is the best solution for this as the provider will not make any changes. Thank you for your help.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.