[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

SSIS 2008 import text and XML

Posted on 2011-04-21
5
Medium Priority
?
889 Views
Last Modified: 2013-11-10
Hello Experts,

I have log files in text file format that are formatted such that I don't know how to import them.  It is created this way by an application and I need ideas on how to import them into SQL 2008, and parse out all the data into columns.

The text file does not have any row or column delimiters - it is one long string with no line feeds / carriage returns.   Each record starts with a date time stamp (outside the XML) which I need to import, and then an XML string that I also need to import into the same record.

Attached is a sample of 2 records.

My question is:   what tools in SSIS would I need to import this?  If I have to write a VB program to parse the date/timestamp from the XML string for each record, how would I do this?  Would I need to create a new file that has row and column delimiters?  

Any ideas would be greatly appreciated!
20110131:140630.123:<?xml version="1.0" encoding="UTF-8"?><Message xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://info.lalala.com/messagepayload.xsd"><EventIdentification EventActionCode="U" EventDateTime="20110131140521" EventID="100_ChangePatientContext" EventOutcomeIndicator="0"></EventIdentification><UserIdentification RoleID="404" UserID="m1636"></UserIdentification><SourceIdentification SourceID="WP Portal" SourceType="3"></SourceIdentification><NetworkAccessPointIdentification NetworkAccessPointID="167.1.1.1" NetworkAccessPointTypeCode="1"></NetworkAccessPointIdentification><NetworkAccessPointIdentification NetworkAccessPointID="167.1.1.1 : null" NetworkAccessPointTypeCode="2"></NetworkAccessPointIdentification><ParticipantObjectIdentification ParticipantObjectDataLifeCycle="1" ParticipantObjectID="0000827576" ParticipantObjectITypeCode="1" ParticipantObjectName="DOE, JOHN" ParticipantObjectTypeCode="1"></ParticipantObjectIdentification><ParticipantObjectIdentification ParticipantObjectDataLifeCycle="1" ParticipantObjectID="1563890" ParticipantObjectITypeCode="2" ParticipantObjectName="DOE, JOHN" ParticipantObjectTypeCode="1"></ParticipantObjectIdentification><ParticipantObjectIdentification ParticipantObjectDataLifeCycle="1" ParticipantObjectID="1101600033" ParticipantObjectITypeCode="5" ParticipantObjectName="DOE, JOHN" ParticipantObjectTypeCode="1"></ParticipantObjectIdentification></Message>

Open in new window

0
Comment
Question by:richxyz
  • 3
  • 2
5 Comments
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35441581
For openers, you will probably need to remove the datetime stamp (20110131:140630.123: in your example) because the <?XML version ... information needs to be the first thing in the XML file for it to be valid.  Then, if you don't have access to the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" link, you will probably want to remove that, too.

I was able to get the information to present as an XML file after taking those two corrective actions.

Now, on to the SSIS process, Are you using SSIS 2005 or SSIS 2008?  (It makes a bit of difference.)
0
 

Author Comment

by:richxyz
ID: 35441619
I'm using SSIS 2008 and SQL 2008 R2.  

Unfortunately these files are generated by another application and this is how they will always be created - and I have to deal with it!

So are you saying I will need to write a VB program to parse it out?    I'm curious if I could do this with a Script Task - and would I parse the date time stamp and the XML directly into SQL or would I have to put it into another temporary file before importing?
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35442067
I would first try the Script Task.  I think you can provide an ConnectionManager that will let you access the flat file and then manipulate it using VB.Net or (my personal preference ;-) C#.  

The other option would be to create a VB.net or C# aplication that can be called from the SSIS package and have the application perform the manipulations.  The app could archive the original file and then create the preprocessed file either with a known name or in a known directory.  Once the app completes its processing, control would return to the SSIS psckage where the package would either process the single file or, if several files were processed and put in a specific folder, all of the files in the folder.  (If you go this route, I would suggest the use of a folder and embedding the datetime stamp in the filename so that you can retrieve it if needed.)
0
 
LVL 22

Accepted Solution

by:
8080_Diver earned 2000 total points
ID: 35442101
In fact, if you go the route of the Script Task, I would use a System File Task to archive the original file and another to move it to a separate processing folder.  Then I would use a For Each Loop to process the files.  (This is, of course, assuming that you will or may be processing multiple files.)
0
 

Author Comment

by:richxyz
ID: 35442123
I just realized that importing it as a text file using the ragged right option will work.  (I didn't see the line feeds when viewing the text files before.)   I can put the date time stamp into one column and the rest into an XML data type.  Thanks for your help, I'll give you the points now :)
0

Featured Post

NEW Veeam Backup for Microsoft Office 365 1.5

With Office 365, it’s your data and your responsibility to protect it. NEW Veeam Backup for Microsoft Office 365 eliminates the risk of losing access to your Office 365 data.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Recently we ran in to an issue while running some SQL jobs where we were trying to process the cubes.  We got an error saying failure stating 'NT SERVICE\SQLSERVERAGENT does not have access to Analysis Services. So this is a way to automate that wit…
One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Via a live example, show how to shrink a transaction log file down to a reasonable size.
Suggested Courses

872 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question