
Combining XML Data into one record in SSIS

Posted on 2014-02-09
Last Modified: 2016-02-10
Hi. I am new to VS and SSIS, and I am trying to load the data in the attached XML file into a single SQL table, using an XML Source followed by Data Conversion transformations. The data appears to be organized under headers in the attached XSD. The desired index field is DISPATCH_RECORD_ID or CADNUMBER, which both contain the same data. I can set up the XML source without a problem, and I can data-convert the fields under each header selected. Where I am lost is getting the data together, since I can't join the outputs from the XML after conversion because there is no common index.

I am using VS 2005.

 Thanks
ems.xml
ems.xsd
Question by:wkrasner
6 Comments
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39845561
Here is what I would suggest.
1. Always import data (whatever the format) first into a staging table.  This allows you to validate the data prior to importing into your Production tables.
2. Once your XML data is imported into your staging table, post the schema for the table, some sample data, and the desired output for the target table.
 
LVL 5

Author Comment

by:wkrasner
ID: 39845721
Thank you Anthony. Something still isn't clicking for me. I have the XML source and the schema associated with it, but when I try to move the data along the data flow, the fields are allocated to separate headers, and I want to bring all of the fields into one recordset. The following depicts the outcome in Excel using the XML file. I am looking to do the same in a dtsx package.

Sample of Data imported to excel from xml

So far I have this. What do I do next to get all of the unions into one SQL data table, as one record for the above data points?

Current Project state
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39847150
I agree with Anthony, but I see the challenge here. The XML does not have a wrapper root node above DISPATCH_RECORD; the record itself is the root.

In other words, instead of:
<DISPATCH_RECORD>
  <HEADER>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
  </HEADER>
  <!-- ... et cetera ... -->
</DISPATCH_RECORD>


I would expect something like this:
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040271</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
</DISPATCH_TABLE>



Then your XML source can pull one DISPATCH_RECORD.  However, as I said, I will take a look with your current schema.

 
LVL 60

Accepted Solution

by:Kevin Cross (earned 2000 total points)
ID: 39847265
Try using an XML Task to transform the current source XML into one that is flatter. For example, you can use xsl:for-each on whatever element you want records to repeat for (e.g., AGENCY). Then you can use XSLT template matches to handle each of the nodes, so you end up with a document similar to the one below.
<?xml version="1.0" encoding="utf-8"?>
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    <CADNUMBER>10040270</CADNUMBER>
    <INCIDENT_TYPE>STRUCTURE FIRE</INCIDENT_TYPE>
    <PRIORITY>1</PRIORITY>
    <STREETNUM>2565x</STREETNUM>
    <STREETNAME>ELM ST</STREETNAME>
    <APTNUM>1x</APTNUM>
    <CROSSSTREET>OAK ST</CROSSSTREET>
    <BUSINESSNAME>xx</BUSINESSNAME>
    <CITY>ANYTOWN</CITY>
    <NAME>ROBERT SMITH</NAME>
    <ADDRESS>161 FAIRMOUNT AV</ADDRESS>
    <TELEPHONE>8885551212xx</TELEPHONE>
    <DISPATCH_RECEIVED>2010-04-27T08:28:29</DISPATCH_RECEIVED>
    <DISPATCH>2010-04-27T08:29:04</DISPATCH>
    <RESPONDING>2010-04-27T08:29:04</RESPONDING>
    <ONLOCATION>2010-04-27T08:42:55</ONLOCATION>
    <UNDERCONTROL>2010-04-27T08:52:35</UNDERCONTROL>
    <COMPLETED>2010-04-27T09:05:05</COMPLETED>
  </DISPATCH_RECORD>
</DISPATCH_TABLE>
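
A stylesheet along the following lines could produce output of that shape. This is only a minimal XSLT 1.0 sketch, not a tested solution for the attached file: it assumes the source elements are in no namespace, that each DISPATCH_RECORD should become one row (rather than repeating per AGENCY), and that no leaf element name occurs more than once within a record. Apart from HEADER, the section names in ems.xml are not shown in this thread, so the sketch avoids naming them and simply copies every leaf element.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <xsl:template match="/">
    <!-- Wrapper root so the XML Source sees a repeating DISPATCH_RECORD -->
    <DISPATCH_TABLE>
      <!-- Assumption: one output row per DISPATCH_RECORD in the source -->
      <xsl:for-each select="//DISPATCH_RECORD">
        <DISPATCH_RECORD>
          <!-- Copy every leaf element (an element with no child elements),
               dropping the intermediate grouping elements such as HEADER -->
          <xsl:for-each select=".//*[not(*)]">
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </DISPATCH_RECORD>
      </xsl:for-each>
    </DISPATCH_TABLE>
  </xsl:template>
</xsl:stylesheet>
```

If a section can repeat inside one record (several units or agencies, for example), the leaf copy would emit duplicate element names. In that case the outer xsl:for-each would need to iterate over the repeating element instead, pulling the shared fields in from its ancestor DISPATCH_RECORD.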



You can save the output of the XSLT to a file or to a variable. You can then build an XSD for the new XML data. In your XML Source, you can pull the data from the new file, or from the variable, using the new XSD. If it helps, you can store the new XSD inline with the generated XML. That way, you need to store neither the temporary XML document nor the associated schema on disk; you can just write to and read from an SSIS variable.
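
For illustration, an XSD for the flattened document might look roughly like the sketch below. The element names are taken from the sample output above; everything else is an assumption. In particular, every field is typed as xs:string and marked optional so that type conversion and validation can happen later in the staging table, and the xs:sequence assumes the fields always arrive in the order shown.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="DISPATCH_TABLE">
    <xs:complexType>
      <xs:sequence>
        <!-- One DISPATCH_RECORD per output row -->
        <xs:element name="DISPATCH_RECORD" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Assumption: all fields optional strings; convert in staging -->
              <xs:element name="API_VERSION" type="xs:string" minOccurs="0"/>
              <xs:element name="SOFTWARE_VENDOR" type="xs:string" minOccurs="0"/>
              <xs:element name="SOFTWARE_VERSION" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH_RECORD_ID" type="xs:string" minOccurs="0"/>
              <xs:element name="CADNUMBER" type="xs:string" minOccurs="0"/>
              <xs:element name="INCIDENT_TYPE" type="xs:string" minOccurs="0"/>
              <xs:element name="PRIORITY" type="xs:string" minOccurs="0"/>
              <xs:element name="STREETNUM" type="xs:string" minOccurs="0"/>
              <xs:element name="STREETNAME" type="xs:string" minOccurs="0"/>
              <xs:element name="APTNUM" type="xs:string" minOccurs="0"/>
              <xs:element name="CROSSSTREET" type="xs:string" minOccurs="0"/>
              <xs:element name="BUSINESSNAME" type="xs:string" minOccurs="0"/>
              <xs:element name="CITY" type="xs:string" minOccurs="0"/>
              <xs:element name="NAME" type="xs:string" minOccurs="0"/>
              <xs:element name="ADDRESS" type="xs:string" minOccurs="0"/>
              <xs:element name="TELEPHONE" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH_RECEIVED" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH" type="xs:string" minOccurs="0"/>
              <xs:element name="RESPONDING" type="xs:string" minOccurs="0"/>
              <xs:element name="ONLOCATION" type="xs:string" minOccurs="0"/>
              <xs:element name="UNDERCONTROL" type="xs:string" minOccurs="0"/>
              <xs:element name="COMPLETED" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With a schema of this shape, the XML Source should expose a single DISPATCH_RECORD output carrying all of the columns, so no join or union is needed downstream.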

I hope that helps!
 
LVL 5

Author Comment

by:wkrasner
ID: 39847311
This sounds like a good approach. Just a dumb question (I am new to SSIS): I have created the XML data source under the data flow, and it looks like I tie the XML Task in under the control flow. What should I use as the destination in the data flow for the XML source?
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39847378
You are correct.  In the control flow, you have the XML Task first with the Data Flow Task next.  

Valentino does a nice job of describing the XML Task with XSLT in his article "Loading Complex XML Using SSIS."  His example shows output to a file; however, you can set the output destination type to variable (e.g., transformedEMS).  

In the data flow, you can set the XML Source's data access mode to "XML data from variable" and select "User::transformedEMS" as the variable name. The output of the XML source can then go to your OLE DB destination (the staging table).

Does that make sense?

The other approach would be to get some ID into each of the staging tables that hold the separate parts of the XML, then merge the data back together via JOINs, which is how I read Anthony's suggestion. If you follow the link in Valentino's article to his previous article, he shows an example.