Combining XML Data into one record in SSIS

Posted on 2014-02-09
Last Modified: 2016-02-10
Hi.  I am new to VS and SSIS, and I am trying to merge the data in the attached XML file into a single SQL table, using an XML Source followed by Data Conversions.  The data appears to be organized under headers defined in the attached XSD.  The desired index field is DISPATCH_RECORD_ID or CADNUMBER, which both contain the same data.  I can set up the XML Source without a problem, and I can data-convert the fields under each header I select.  Where I am lost is getting the data together: I can't join the outputs from the XML after conversion because they have no common index.

I am using VS 2005.

 Thanks
ems.xml
ems.xsd
Question by:wkrasner
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39845561
Here is what I would suggest.
1. Always import data (whatever the format) first into a staging table.  This allows you to validate the data prior to importing into your Production tables.
2. Once your Xml data is imported into your staging table, post the schema for the table, some sample data and the desired output for the target table.
 
LVL 5

Author Comment

by:wkrasner
ID: 39845721
Thank you, Anthony.  Something still isn't clicking for me.  I have the XML source and the schema associated with it, but when I try to move the data along the data flow, the fields are allocated to separate headers, and I want to bring all of the fields into one recordset.  The following shows the outcome in Excel using the XML file.  I am looking to do the same in a .dtsx package.

Sample of Data imported to excel from xml

So far I have this.  What do I do next to get all of the unions into one SQL table, as one record for the above data points?

Current Project state
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39847150
I agree with Anthony, but I see the challenge here.  The XML does not have a root node wrapping the records; DISPATCH_RECORD itself is the top-level element.

In other words, instead of:
<DISPATCH_RECORD>
  <HEADER>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
  </HEADER>
  <!-- ... et cetera ... -->
</DISPATCH_RECORD>

I would expect something like this:
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
  <DISPATCH_RECORD>
    <HEADER>
      <API_VERSION>7.00.0001</API_VERSION>
      <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
      <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
      <DISPATCH_RECORD_ID>10040271</DISPATCH_RECORD_ID>
    </HEADER>
    <!-- ... et cetera ... -->
  </DISPATCH_RECORD>
</DISPATCH_TABLE>

Then your XML source can pull one DISPATCH_RECORD.  However, as I said, I will take a look with your current schema.
 
LVL 60

Accepted Solution

by:
Kevin Cross earned 500 total points
ID: 39847265
Try using an XML Task to transform the current source XML into one that is flatter.  For example, you can use xsl:for-each on whatever element you want records to repeat for (e.g., AGENCY).  Then you can use XSLT matches to handle each of the nodes, so you end up with a document similar to the one below.
<?xml version="1.0" encoding="utf-8"?>
<DISPATCH_TABLE>
  <DISPATCH_RECORD>
    <API_VERSION>7.00.0001</API_VERSION>
    <SOFTWARE_VENDOR>QED</SOFTWARE_VENDOR>
    <SOFTWARE_VERSION>7.00.0001</SOFTWARE_VERSION>
    <DISPATCH_RECORD_ID>10040270</DISPATCH_RECORD_ID>
    <CADNUMBER>10040270</CADNUMBER>
    <INCIDENT_TYPE>STRUCTURE FIRE</INCIDENT_TYPE>
    <PRIORITY>1</PRIORITY>
    <STREETNUM>2565x</STREETNUM>
    <STREETNAME>ELM ST</STREETNAME>
    <APTNUM>1x</APTNUM>
    <CROSSSTREET>OAK ST</CROSSSTREET>
    <BUSINESSNAME>xx</BUSINESSNAME>
    <CITY>ANYTOWN</CITY>
    <NAME>ROBERT SMITH</NAME>
    <ADDRESS>161 FAIRMOUNT AV</ADDRESS>
    <TELEPHONE>8885551212xx</TELEPHONE>
    <DISPATCH_RECEIVED>2010-04-27T08:28:29</DISPATCH_RECEIVED>
    <DISPATCH>2010-04-27T08:29:04</DISPATCH>
    <RESPONDING>2010-04-27T08:29:04</RESPONDING>
    <ONLOCATION>2010-04-27T08:42:55</ONLOCATION>
    <UNDERCONTROL>2010-04-27T08:52:35</UNDERCONTROL>
    <COMPLETED>2010-04-27T09:05:05</COMPLETED>
  </DISPATCH_RECORD>
</DISPATCH_TABLE>
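
The flattening described above could be sketched as an XSLT 1.0 stylesheet along these lines.  This is only a minimal sketch under stated assumptions, not the stylesheet behind the sample output: it assumes the source elements are in no namespace, that DISPATCH_RECORD is the record element, and that every leaf element name is unique within a record.  The real group names in ems.xml are not shown in this thread, so rather than naming them the sketch copies every leaf element, whatever its parent.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Wrap all records in a single table-level root element. -->
  <xsl:template match="/">
    <DISPATCH_TABLE>
      <xsl:apply-templates select="//DISPATCH_RECORD"/>
    </DISPATCH_TABLE>
  </xsl:template>

  <!-- Emit one flat DISPATCH_RECORD per source record. -->
  <xsl:template match="DISPATCH_RECORD">
    <DISPATCH_RECORD>
      <!-- Copy every descendant that has no child elements (the data
           fields), which drops grouping elements such as HEADER.
           Assumption: leaf names do not collide across groups. -->
      <xsl:copy-of select=".//*[not(*)]"/>
    </DISPATCH_RECORD>
  </xsl:template>
</xsl:stylesheet>
```

If a group such as AGENCY repeats within one record, the copy above would produce repeated columns.  In that case, wrapping the record template's body in an xsl:for-each over the repeating element, as suggested, would give one output record per repetition instead.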

You can save the output of the XSLT to file, or variable.  You can build an XSD for the new XML data.  In your XML Source, you can pull the data from new file, or XML data from variable, with the new XSD.  If it helps, you can store the new XSD inline with the generated XML.  This way, you have to store neither the temporary XML document nor the associated schema.  You can just write/read from SSIS variable.
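
As a rough sketch, an XSD for the flattened document might look like the following.  The element names and their order are taken from the sample output above; typing every field as xs:string and marking each one optional are assumptions made here so that type conversion and validation can happen later in the staging table.  Adjust the order and types to match what your transform actually emits.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="DISPATCH_TABLE">
    <xs:complexType>
      <xs:sequence>
        <!-- One row per DISPATCH_RECORD element. -->
        <xs:element name="DISPATCH_RECORD" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Assumption: all fields are optional strings. -->
              <xs:element name="API_VERSION" type="xs:string" minOccurs="0"/>
              <xs:element name="SOFTWARE_VENDOR" type="xs:string" minOccurs="0"/>
              <xs:element name="SOFTWARE_VERSION" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH_RECORD_ID" type="xs:string" minOccurs="0"/>
              <xs:element name="CADNUMBER" type="xs:string" minOccurs="0"/>
              <xs:element name="INCIDENT_TYPE" type="xs:string" minOccurs="0"/>
              <xs:element name="PRIORITY" type="xs:string" minOccurs="0"/>
              <xs:element name="STREETNUM" type="xs:string" minOccurs="0"/>
              <xs:element name="STREETNAME" type="xs:string" minOccurs="0"/>
              <xs:element name="APTNUM" type="xs:string" minOccurs="0"/>
              <xs:element name="CROSSSTREET" type="xs:string" minOccurs="0"/>
              <xs:element name="BUSINESSNAME" type="xs:string" minOccurs="0"/>
              <xs:element name="CITY" type="xs:string" minOccurs="0"/>
              <xs:element name="NAME" type="xs:string" minOccurs="0"/>
              <xs:element name="ADDRESS" type="xs:string" minOccurs="0"/>
              <xs:element name="TELEPHONE" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH_RECEIVED" type="xs:string" minOccurs="0"/>
              <xs:element name="DISPATCH" type="xs:string" minOccurs="0"/>
              <xs:element name="RESPONDING" type="xs:string" minOccurs="0"/>
              <xs:element name="ONLOCATION" type="xs:string" minOccurs="0"/>
              <xs:element name="UNDERCONTROL" type="xs:string" minOccurs="0"/>
              <xs:element name="COMPLETED" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With a schema of this shape, the XML Source should expose a single DISPATCH_RECORD output whose columns can be mapped straight to the staging table.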

I hope that helps!
 
LVL 5

Author Comment

by:wkrasner
ID: 39847311
This sounds like a good approach.  Just a dumb question (I am new to SSIS): I have created the XML data source under the data flow, and it looks like I add the XML Task under the control flow.  What should I use as the destination in the data flow for the XML source?
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39847378
You are correct.  In the control flow, you have the XML Task first with the Data Flow Task next.  

Valentino does a nice job of describing the XML Task with XSLT in his article "Loading Complex XML Using SSIS."  His example shows output to a file; however, you can set the output destination type to variable (e.g., transformedEMS).  

In the data flow, you can set the XML Source's data access mode to "XML data from variable" and select "User::transformedEMS" as the variable name.  The destination of the XML source could be your OLE DB destination (staging table).

Does that make sense?

The other approach would be to carry some ID into each of the staging tables that hold the separate parts of the XML, then merge the data back together via JOINs, which is how I read Anthony's suggestion.  If you click on Valentino's link to his previous article, he shows an example.
Question has a verified solution.
