SSIS 2008 import text and XML

Posted on 2011-04-21
Last Modified: 2013-11-10
Hello Experts,

I have log files in text file format that are formatted such that I don't know how to import them.  It is created this way by an application and I need ideas on how to import them into SQL 2008, and parse out all the data into columns.

The text file does not have any row or column delimiters - it is one long string with no line feeds / carriage returns.   Each record starts with a date time stamp (outside the XML) which I need to import, and then an XML string that I also need to import into the same record.

Attached is a sample of 2 records.

My question is:   what tools in SSIS would I need to import this?  If I have to write a VB program to parse the date/timestamp from the XML string for each record, how would I do this?  Would I need to create a new file that has row and column delimiters?  

Any ideas would be greatly appreciated!
20110131:140630.123:<?xml version="1.0" encoding="UTF-8"?><Message xmlns:xsi="" xsi:noNamespaceSchemaLocation=""><EventIdentification EventActionCode="U" EventDateTime="20110131140521" EventID="100_ChangePatientContext" EventOutcomeIndicator="0"></EventIdentification><UserIdentification RoleID="404" UserID="m1636"></UserIdentification><SourceIdentification SourceID="WP Portal" SourceType="3"></SourceIdentification><NetworkAccessPointIdentification NetworkAccessPointID="" NetworkAccessPointTypeCode="1"></NetworkAccessPointIdentification><NetworkAccessPointIdentification NetworkAccessPointID=" : null" NetworkAccessPointTypeCode="2"></NetworkAccessPointIdentification><ParticipantObjectIdentification ParticipantObjectDataLifeCycle="1" ParticipantObjectID="0000827576" ParticipantObjectITypeCode="1" ParticipantObjectName="DOE, JOHN" ParticipantObjectTypeCode="1"></ParticipantObjectIdentification><ParticipantObjectIdentification ParticipantObjectDataLifeCycle="1" ParticipantObjectID="1563890" ParticipantObjectITypeCode="2" ParticipantObjectName="DOE, JOHN" ParticipantObjectTypeCode="1"></ParticipantObjectIdentification><ParticipantObjectIdentification ParticipantObjectDataLifeCycle="1" ParticipantObjectID="1101600033" ParticipantObjectITypeCode="5" ParticipantObjectName="DOE, JOHN" ParticipantObjectTypeCode="1"></ParticipantObjectIdentification></Message>

Open in new window

Question by:richxyz
    LVL 22

    Expert Comment

    For openers, you will probably need to remove the datetime stamp (20110131:140630.123: in your example) because the <?XML version ... information needs to be the first thing in the XML file for it to be valid.  Then, if you don't have access to the xmlns:xsi="" link, you will probably want to remove that, too.

    I was able to get the information to present as an XML file after taking those two corrective actions.

    Now, on to the SSIS process, Are you using SSIS 2005 or SSIS 2008?  (It makes a bit of difference.)

    Author Comment

    I'm using SSIS 2008 and SQL 2008 R2.  

    Unfortunately these files are generated by another application and this is how they will always be created - and I have to deal with it!

    So are you saying I will need to write a VB program to parse it out?    I'm curious if I could do this with a Script Task - and would I parse the date time stamp and the XML directly into SQL or would I have to put it into another temporary file before importing?
    LVL 22

    Expert Comment

    I would first try the Script Task.  I think you can provide an ConnectionManager that will let you access the flat file and then manipulate it using VB.Net or (my personal preference ;-) C#.  

    The other option would be to create a or C# aplication that can be called from the SSIS package and have the application perform the manipulations.  The app could archive the original file and then create the preprocessed file either with a known name or in a known directory.  Once the app completes its processing, control would return to the SSIS psckage where the package would either process the single file or, if several files were processed and put in a specific folder, all of the files in the folder.  (If you go this route, I would suggest the use of a folder and embedding the datetime stamp in the filename so that you can retrieve it if needed.)
    LVL 22

    Accepted Solution

    In fact, if you go the route of the Script Task, I would use a System File Task to archive the original file and another to move it to a separate processing folder.  Then I would use a For Each Loop to process the files.  (This is, of course, assuming that you will or may be processing multiple files.)

    Author Comment

    I just realized that importing it as a text file using the ragged right option will work.  (I didn't see the line feeds when viewing the text files before.)   I can put the date time stamp into one column and the rest into an XML data type.  Thanks for your help, I'll give you the points now :)

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    Do You Know the 4 Main Threat Actor Types?

    Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

    Introduction In my previous article ( I showed you how the XML Source component can be used to load XML files into a SQL Server database, us…
    Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
    Via a live example, show how to shrink a transaction log file down to a reasonable size.
    Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

    737 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    17 Experts available now in Live!

    Get 1:1 Help Now