
Combining XML Data into one record in SSIS

Posted on 2014-02-09
Last Modified: 2016-02-10
Hi. I am new to VS and SSIS, and I am trying to load the data in the attached XML file through an XML Source and Data Conversion transformations into a single table in SQL. The data appears to be organized under headers defined in the attached XSD. The desired index field is DISPATCH_RECORD_ID or CADNUMBER, which both contain the same data. I can set up the XML source without a problem, and I can convert the fields under each header selected. Where I am lost is getting the data together: I can't join the outputs from the XML after conversion because there is no common index.

I am using VS 2005.

 Thanks
ems.xml
ems.xsd
Question by:wkrasner
 
LVL 75

Expert Comment

by:Anthony Perkins
Here is what I would suggest.
1. Always import data (whatever the format) first into a staging table.  This allows you to validate the data prior to importing into your Production tables.
2. Once your Xml data is imported into your staging table, post the schema for the table, some sample data and the desired output for the target table.
 
LVL 5

Author Comment

by:wkrasner
Thank you, Anthony. Something still isn't clicking for me. I have the XML source and the schema associated with it, but when I try to move the data along the data flow, the fields are allocated to separate headers. I want to bring the fields from all of the headers into one recordset. The following shows the result of importing the XML file into Excel; I am looking to do the same in a .dtsx package.

Sample of Data imported to excel from xml

So far I have this. What do I do next to get all of the unions into one SQL table, as one record for the above data points?

Current Project state
 
LVL 59

Expert Comment

by:Kevin Cross
I agree with Anthony, but I see the challenge here. The XML has no wrapping root node above DISPATCH_RECORD.

In other words, instead of:
<DISPATCH_RECORD>
  <HEADER>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
  </HEADER>
  <!-- ... et cetera ... -->
</DISPATCH_RECORD>


I would expect something like this:
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040271</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
</DISPATCH_TABLE>



Then your XML source can pull one DISPATCH_RECORD.  However, as I said, I will take a look with your current schema.

 
LVL 59

Accepted Solution

by:
Kevin Cross earned 500 total points
Try using an XML Task to transform the current source XML into a flatter document. For example, you can use xsl:for-each on whatever element you want records to repeat for (e.g., AGENCY). Then you can use XSLT template matches to handle each of the nodes, so you end up with a document similar to the one below.
<?xml version="1.0" encoding="utf-8"?>
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    <CADNUMBER>10040270</CADNUMBER>
    <INCIDENT_TYPE>STRUCTURE FIRE</INCIDENT_TYPE>
    <PRIORITY>1</PRIORITY>
    <STREETNUM>2565x</STREETNUM>
    <STREETNAME>ELM ST</STREETNAME>
    <APTNUM>1x</APTNUM>
    <CROSSSTREET>OAK ST</CROSSSTREET>
    <BUSINESSNAME>xx</BUSINESSNAME>
    <CITY>ANYTOWN</CITY>
    <NAME>ROBERT SMITH</NAME>
    <ADDRESS>161 FAIRMOUNT AV</ADDRESS>
    <TELEPHONE>8885551212xx</TELEPHONE>
    <DISPATCH_RECEIVED>2010-04-27T08:28:29</DISPATCH_RECEIVED>
    <DISPATCH>2010-04-27T08:29:04</DISPATCH>
    <RESPONDING>2010-04-27T08:29:04</RESPONDING>
    <ONLOCATION>2010-04-27T08:42:55</ONLOCATION>
    <UNDERCONTROL>2010-04-27T08:52:35</UNDERCONTROL>
    <COMPLETED>2010-04-27T09:05:05</COMPLETED>
  </DISPATCH_RECORD>
</DISPATCH_TABLE>
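
As a rough illustration of the kind of stylesheet that could produce a flat document like the one above, here is a minimal XSLT 1.0 sketch. It is not taken from the attached ems.xml or ems.xsd. It assumes that each record is a DISPATCH_RECORD element, that the grouping elements under it (HEADER and the others) only exist to nest the data, that leaf element names are unique within a record, and that the document uses no XML namespaces. If your file differs on any of these points, the select expressions would need adjusting.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <!-- Wrap every record in a single DISPATCH_TABLE root element. -->
  <xsl:template match="/">
    <DISPATCH_TABLE>
      <!-- Assumption: DISPATCH_RECORD is the element that repeats per record. -->
      <xsl:for-each select="//DISPATCH_RECORD">
        <DISPATCH_RECORD>
          <!-- Select every descendant that has no child elements (a leaf),
               which drops grouping levels such as HEADER. -->
          <xsl:for-each select=".//*[not(*)]">
            <!-- Re-create the leaf under the same name, directly below
                 DISPATCH_RECORD, keeping only its text value. -->
            <xsl:element name="{name()}">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:for-each>
        </DISPATCH_RECORD>
      </xsl:for-each>
    </DISPATCH_TABLE>
  </xsl:template>
</xsl:stylesheet>
```

This generic version copies every leaf it finds, so a section that repeats within a record (several units on one call, for instance) would produce duplicate element names. In that case, put the outer xsl:for-each on the repeating element instead and pull the shared fields in by explicit path.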



You can save the output of the XSLT to a file or to a variable, and then build an XSD for the new XML. In your XML Source, you can read the data from the new file, or from the variable, using the new XSD. If it helps, you can store the new XSD inline with the generated XML. That way you need to store neither the temporary XML document nor its schema on disk; you can just write to and read from an SSIS variable.

I hope that helps!
 
LVL 5

Author Comment

by:wkrasner
This sounds like a good approach. Just a dumb question (I am new to SSIS): I have created the XML data source in the data flow, and it looks like I tie in the XML Task in the control flow. What should I use as the destination in the data flow for the XML source?
 
LVL 59

Expert Comment

by:Kevin Cross
You are correct.  In the control flow, you have the XML Task first with the Data Flow Task next.  

Valentino does a nice job of describing the XML Task with XSLT in his article "Loading Complex XML Using SSIS."  His example shows output to a file; however, you can set the output destination type to variable (e.g., transformedEMS).  

In the data flow, you can set the XML Source data access mode to "XML data from variable" and select "User::transformedEMS" as the variable name. The output of the XML Source can then go to your OLE DB destination (the staging table).

Does that make sense?

The other approach would be to load the separate parts of the XML into staging tables with some common ID in each, then merge the data back together via JOINs, which is how I read Anthony's suggestion. If you follow the link in Valentino's article to his previous article, he shows an example there.
