Link to home
Start Free TrialLog in
Avatar of spen_lang
spen_lang

asked on

SSIS Converting Flat File with Header and Detail lines

Hi,

I have a flat file that has two row types (OH = Order header, OL = Order line). One OH row can have multiple OL lines. I would like to combine the OH data rows and the OL data rows into one output row.

So for example the source data may look like this:

OH, Ted, #12345, M, United Kingdom
OL, 1, 2, 3, 4
OL, 2, 2, 4, 5
OH, Bob, #12346, M, United Kingdom
OL, 2, 4, 4, 6
OL, 2, 2, 4, 7

I would like the target file to be:
Ted, #12345, M, United Kingdom, 1, 2, 3, 4
Ted, #12345, M, United Kingdom, 2, 2, 4, 5
Bob, #12346, M, United Kingdom, 2, 4, 4, 6
Bob, #12346, M, United Kingdom, 2, 2, 4, 7

I know I could split the data rows into two outputs but I do not know how I would then link the data back together as there is no link in the source file between the OH and OL rows.

Thanks, Greg
Avatar of zephyr_hex (Megan)
zephyr_hex (Megan)
Flag of United States of America image

I think you'll need to iterate over the source using a script task so that you're able to detect when you're moving to a new order.  I can't think of another way to ensure you don't lose the relationship between header and lines, unless the source has structural entities like XML.
Avatar of spen_lang
spen_lang

ASKER

This is the route I have gone down. The CSV file is delimited using commas and the Order header and order detail rows have different amount of columns (something I forgot to mention in the post).

So in my script I am splitting the row in the file using the delimiter of a comma. However, my concern is that someone may put a comma in the text of the one of the columns, which would cause problems (e.g. in an address line). Is there any way to overcome this?
Code being used in the script component:

Public Class ScriptMain
    Inherits UserComponent

    Private sDeliveryName As String = ""
    Private sDeliveryAddress1 As String = ""
    Private sDeliveryAddress2 As String = ""
    Private sDeliveryAddress3 As String = ""
    Private sDeliveryAddress4 As String = ""
    Private sDeliveryAddress5 As String = ""
    Private sCustomerOrderNumber As String = ""
    Private sDespatchDate As String = ""
    Private sDeliveryDate As String = ""
    Private sOrderNotes1 As String = ""
    Private sOrderNotes2 As String = ""
    Private sOrderNotes3 As String = ""
    Private sErrorLog As String = ""

    Private sRow As String()
    Const sDelimiter As String = ","

    Public Overrides Sub OrderRows_ProcessInputRow(ByVal Row As OrderRowsBuffer)
        '
        ' Add your code here
        ''

        sRow = Row.OrderRow.Split(sDelimiter.ToCharArray())

        If sRow(0).ToString.ToUpper = "OH" Then
            processHeaderRow()
            Row.DirectRowToHeaderRow()
        ElseIf sRow(0).ToString.ToUpper = "OL" Then
            Row.Address1 = sDeliveryName
            Row.Address2 = sDeliveryAddress1
            Row.Address3 = sDeliveryAddress2
            Row.Address4 = sDeliveryAddress3
            Row.Address5 = sDeliveryAddress4
            Row.BarCode = ""
            Row.BatchNo = ""
            Row.BBE = ""
            Row.CustomerCode = "TEST"
            Row.CustomerGoodsRef = ""
            Row.CustOrderNo = sCustomerOrderNumber
            Row.CustRef = ""
            Row.DeliveryTime = "08:00"
            Row.DeliveryDateTime = sDeliveryDate
            Row.DeliveryInstructions = sOrderNotes1
            Row.Depot = ""
            Row.DespatchDate = sDespatchDate
            Row.DestLocation = ""
            Row.Location = ""
            Row.OrderPriority = ""
            Row.OrderDate = Date.Today.ToString("dd/MM/yyyy")
            Row.OrderField1 = ""
            Row.OrderField2 = ""
            Row.OrderField3 = ""
            Row.OrderField4 = ""
            Row.Postcode = sDeliveryAddress5
            Row.ProductCode = sRow(1).ToString
            Row.Quantity = sRow(7).ToString
            Row.TransportOut = ""
            Row.UnitsOfOrder = "2"
            Row.UserField3 = ""
            Row.UserField4 = ""
            Row.Warehouse = ""
            'Row.ProductDesc = sRow(2).ToString
            Row.DirectRowToDetailRows()
        End If


    End Sub

    Private Sub processHeaderRow()

        sDeliveryName = sRow(13).ToString
        sDeliveryAddress1 = sRow(14).ToString
        sDeliveryAddress2 = sRow(15).ToString
        sDeliveryAddress3 = sRow(16).ToString
        sDeliveryAddress4 = sRow(17).ToString
        sDeliveryAddress5 = sRow(18).ToString
        sCustomerOrderNumber = sRow(23).ToString
        sDespatchDate = sRow(28).ToString
        sDeliveryDate = sRow(29).ToString
        sOrderNotes1 = sRow(25).ToString
        sOrderNotes2 = sRow(26).ToString
        sOrderNotes3 = sRow(27).ToString

    End Sub

Open in new window

Is it possible to delimit by a less common character, such as pipe ("|") ?  There isn't a simple way to detect the difference between a delimiter and a comma in an address.
It doesn't look like the provider is able to change the delimiter...
ASKER CERTIFIED SOLUTION
Avatar of zephyr_hex (Megan)
zephyr_hex (Megan)
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I think throwing an error if there is an incorrect amount of elements is the best solution for this as the provider will not make any changes. Thank you for your help.