
Combining XML Data into one record in SSIS

Posted on 2014-02-09
Last Modified: 2016-02-10
Hi. I am new to VS and SSIS, and I am trying to merge the data in the attached XML file into a single SQL table, going from an XML Source through Data Conversion transformations. The data appears to be organized under headers defined in the attached XSD. The desired index field is DISPATCH_RECORD_ID or CADNUMBER, which both contain the same data. I can set up the XML source without a problem, and I can data-convert the fields under each header I select. Where I am lost is getting the data back together: I can't join the outputs from the XML source after conversion because they have no common index.

I am using VS 2005.

Thanks
ems.xml
ems.xsd
Question by:wkrasner
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39845561
Here is what I would suggest.
1. Always import data (whatever the format) first into a staging table.  This allows you to validate the data prior to importing into your Production tables.
2. Once your Xml data is imported into your staging table, post the schema for the table, some sample data and the desired output for the target table.
 
LVL 5

Author Comment

by:wkrasner
ID: 39845721
Thank you, Anthony. Something still isn't clicking for me. I have the XML source and the schema associated with it, but when I try to move the data along the data flow, the fields are split up under separate headers, and I want to bring them all into one recordset. The following shows the outcome in Excel using the XML file. I am looking to do the same in a .dtsx package.

Sample of Data imported to excel from xml

So far I have this. What do I do next to get all of the unions into one SQL table, as one record for the above data points?

Current Project state
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39847150
I agree with Anthony, but I see the challenge here.  The XML does not have a root node.

In other words, instead of:
<DISPATCH_RECORD>
  <HEADER>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
  </HEADER>
  <!-- ... et cetera ... -->
</DISPATCH_RECORD>


I would expect something like this:
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040271</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
</DISPATCH_TABLE>



Then your XML source can pull one DISPATCH_RECORD.  However, as I said, I will take a look with your current schema.

 
LVL 59

Accepted Solution

by:
Kevin Cross earned 500 total points
ID: 39847265
Try using an XML Task to transform the current source XML into one that is flatter. For example, you can use xsl:for-each on whatever element you want records to repeat for (e.g., AGENCY). Then you can use XSLT matches to handle each of the nodes, so you end up with a document similar to the one below.
<?xml version="1.0" encoding="utf-8"?>
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    <CADNUMBER>10040270</CADNUMBER>
    <INCIDENT_TYPE>STRUCTURE FIRE</INCIDENT_TYPE>
    <PRIORITY>1</PRIORITY>
    <STREETNUM>2565x</STREETNUM>
    <STREETNAME>ELM ST</STREETNAME>
    <APTNUM>1x</APTNUM>
    <CROSSSTREET>OAK ST</CROSSSTREET>
    <BUSINESSNAME>xx</BUSINESSNAME>
    <CITY>ANYTOWN</CITY>
    <NAME>ROBERT SMITH</NAME>
    <ADDRESS>161 FAIRMOUNT AV</ADDRESS>
    <TELEPHONE>8885551212xx</TELEPHONE>
    <DISPATCH_RECEIVED>2010-04-27T08:28:29</DISPATCH_RECEIVED>
    <DISPATCH>2010-04-27T08:29:04</DISPATCH>
    <RESPONDING>2010-04-27T08:29:04</RESPONDING>
    <ONLOCATION>2010-04-27T08:42:55</ONLOCATION>
    <UNDERCONTROL>2010-04-27T08:52:35</UNDERCONTROL>
    <COMPLETED>2010-04-27T09:05:05</COMPLETED>
  </DISPATCH_RECORD>
</DISPATCH_TABLE>
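
A stylesheet that produces a document of the shape shown above might look like the following. This is only a minimal XSLT 1.0 sketch, not the exact transform for ems.xml: it assumes the source has no default namespace, that records repeat on DISPATCH_RECORD (swap in AGENCY or another element if that is what should repeat), and that leaf element names are unique within a record. The thread only shows the HEADER group, so every other group is handled generically by copying leaf elements instead of naming them.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <xsl:template match="/">
    <!-- Single root element that holds one flat row per record -->
    <DISPATCH_TABLE>
      <!-- Assumption: DISPATCH_RECORD is the element that repeats -->
      <xsl:for-each select="//DISPATCH_RECORD">
        <DISPATCH_RECORD>
          <!-- Copy every descendant that has no child elements,
               which drops grouping levels such as HEADER -->
          <xsl:for-each select=".//*[not(*)]">
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </DISPATCH_RECORD>
      </xsl:for-each>
    </DISPATCH_TABLE>
  </xsl:template>
</xsl:stylesheet>
```

In the XML Task, this would presumably be wired up by setting OperationType to XSLT, pointing Source at the original XML and SecondOperand at the stylesheet. If a leaf name occurs more than once inside a record, the output will contain repeated elements and the XML Source will no longer see a single flat row, so those cases would need explicit templates.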



You can save the output of the XSLT to a file or to a variable, and you can build an XSD for the new XML data. In your XML Source, you can then read the data from the new file, or from the variable, using the new XSD. If it helps, you can store the new XSD inline with the generated XML. That way you need to store neither the temporary XML document nor its schema on disk; you can just write to and read from an SSIS variable.
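
An XSD for the flattened document could be sketched as below. The element names come from the sample output above; the types and the minOccurs settings are assumptions. Everything is typed as xs:string so the staging table receives the raw text, although the six timestamp elements could be declared as xs:dateTime instead.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="DISPATCH_TABLE">
    <xs:complexType>
      <xs:sequence>
        <!-- One DISPATCH_RECORD per output row -->
        <xs:element name="DISPATCH_RECORD" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Assumption: every field is optional text -->
              <xs:element name="API_VERSION" type="xs:string" minOccurs="0"/>
              <xs:element name="SOFTWARE_VENDOR" type="xs:string" minOccurs="0"/>
              <xs:element name="SOFTWARE_VERSION" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH_RECORD_ID" type="xs:string" minOccurs="0"/>
              <xs:element name="CADNUMBER" type="xs:string" minOccurs="0"/>
              <xs:element name="INCIDENT_TYPE" type="xs:string" minOccurs="0"/>
              <xs:element name="PRIORITY" type="xs:string" minOccurs="0"/>
              <xs:element name="STREETNUM" type="xs:string" minOccurs="0"/>
              <xs:element name="STREETNAME" type="xs:string" minOccurs="0"/>
              <xs:element name="APTNUM" type="xs:string" minOccurs="0"/>
              <xs:element name="CROSSSTREET" type="xs:string" minOccurs="0"/>
              <xs:element name="BUSINESSNAME" type="xs:string" minOccurs="0"/>
              <xs:element name="CITY" type="xs:string" minOccurs="0"/>
              <xs:element name="NAME" type="xs:string" minOccurs="0"/>
              <xs:element name="ADDRESS" type="xs:string" minOccurs="0"/>
              <xs:element name="TELEPHONE" type="xs:string" minOccurs="0"/>
              <!-- Timestamps: xs:dateTime would also fit the sample values -->
              <xs:element name="DISPATCH_RECEIVED" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH" type="xs:string" minOccurs="0"/>
              <xs:element name="RESPONDING" type="xs:string" minOccurs="0"/>
              <xs:element name="ONLOCATION" type="xs:string" minOccurs="0"/>
              <xs:element name="UNDERCONTROL" type="xs:string" minOccurs="0"/>
              <xs:element name="COMPLETED" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because xs:sequence fixes the element order, the order here has to match what the transform emits. The XML Source editor can also generate an XSD from a sample file, which may be easier than writing one by hand.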

I hope that helps!
 
LVL 5

Author Comment

by:wkrasner
ID: 39847311
This sounds like a good approach. Just a dumb question (I am new to SSIS): I have created the XML data source under the data flow, and it looks like I tie the XML Task in under the control flow. What should I use as the destination in the data flow for the XML source?
 
LVL 59

Expert Comment

by:Kevin Cross
ID: 39847378
You are correct.  In the control flow, you have the XML Task first with the Data Flow Task next.  

Valentino does a nice job of describing the XML Task with XSLT in his article "Loading Complex XML Using SSIS."  His example shows output to a file; however, you can set the output destination type to variable (e.g., transformedEMS).  

In the data flow, you can set the XML Source's data access mode to "XML data from variable" and select "User::transformedEMS" as the variable name. The destination of the XML source could be your OLE DB destination (staging table).

Does that make sense?

The other approach would be to load the separate parts of the XML into their own staging tables, get some common ID into each of those tables, and then merge the data back together via JOINs, which is how I read Anthony's suggestion. If you click on Valentino's link to his previous article, he shows an example.