Solved

Help with combining data in xml file based on identical country using VB.NET

Posted on 2015-01-24
15
43 Views
Last Modified: 2015-09-12
Hi,

How do combine all records with identical <COUNTRY> in create one record combining data from other data elements, for example, If I have xml file in Part A, how do I create a new xml file as shown in art B? The example is only for one record, the xml file in part A will contain multiple records with similar situation.

Part A.
<NewDataSet>
  <AOP5>
    <SCOUNTRY>ITA 10411</SCOUNTRY>
    <SN></SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4></AGD4>
  </AOP5
</NewDataSet>
  <AOP5>
    <SCOUNTRY></SCOUNTRY>
    <SN>10411</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4></AGD4>
  </AOP5
  <AOP5>
    <SCOUNTRY></SCOUNTRY>
    <SN></SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4>M777</AGD4>
  </AOP5>
</NewDataSet>

Part B:
<NewDataSet>
  <AOP5>
    <SCOUNTRY>ITA 10411</SCOUNTRY>
    <SN>10411</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4>M777</AGD4>
  </AOP5>
</NewDataSet>

Thanks,

Victor
0
Comment
Question by:vcharles
  • 9
  • 4
  • 2
15 Comments
 
LVL 62

Expert Comment

by:Fernando Soto
Comment Utility
Hi Victor;

Your original document had a single value over all like records, for example one AGD4 having a value and all other records having an empty values in the node. Is it possible to have multiple values in the same node type across all records like shown below? If yes how do you want to handle these multiple values?

<NewDataSet>
  <AOP5>
    <SCOUNTRY>ITA 10411</SCOUNTRY>
    <SN>2012</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4></AGD4>
  </AOP5
</NewDataSet>
  <AOP5>
    <SCOUNTRY>WA 1020</SCOUNTRY>
    <SN>10411</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4>F999</AGD4>
  </AOP5
  <AOP5>
    <SCOUNTRY>FL 2031</SCOUNTRY>
    <SN>1141</SN>
    <COUNTRY>ABC</COUNTRY>
    <AGD4>M777</AGD4>
  </AOP5>
</NewDataSet>

Open in new window

0
 

Author Comment

by:vcharles
Comment Utility
Hi,

If this occurs the first record containing a data element  with data shoud overwrite the other records, for example if I have:\ the following in Part A:

Part A.
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4>M800</AGD4>
   </AOP5
 </NewDataSet>
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10412</SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4>M777</AGD4>
   </AOP5>
 </NewDataSet>

Part B should contain:

Part B:
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10412</SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4>M800</AGD4>
   </AOP5>
 </NewDataSet>
0
 
LVL 62

Assisted Solution

by:Fernando Soto
Fernando Soto earned 250 total points
Comment Utility
Hi Victory;

This should do what you need.

Dim xdoc As XDocument = XDocument.Load("Path to the XML document with input data ")
'' New document being created
Dim newXdoc As New XDocument(New XElement("Root"))
'' Get all the parent nodes
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                   Group By aop5.Name Into Group _
                   Select Group).ToList()

'' Select all the needed nodes and place them in a new document 
For Each p As XElement() In parentNodes
    Dim nodes = (From n As XElement In p.Descendants().ToList() _
                 Where Not String.IsNullOrEmpty(n.Value)
                 Group By n.Name Into Group _
                 Select Group.First()).ToList()

     '' Create the new elements of the document
    Dim addNodes = New XElement("NewDataSet", nodes)
    newXdoc.Root.Add(addNodes)
Next
 '' Save the new document
newXdoc.Save("Path and file name to store on disk")

Open in new window

0
 

Author Comment

by:vcharles
Comment Utility
Thank you, I will get back to you.
0
 

Author Comment

by:vcharles
Comment Utility
Hi again,

I don't see where <Country> is mentioned to create one record from multiple identical values. Can you please explain the code to me.

Thanks,

Victor
0
 

Author Comment

by:vcharles
Comment Utility
Hi,

unfortunately it does not work, only one record was returned. I need to merge data from all data elements where country is the same and eliminate duplicate data in Country.

Victor
0
 
LVL 62

Expert Comment

by:Fernando Soto
Comment Utility
Hi Victor;

Please post you code as you implemented it and the file that you used as the input data.
0
Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

 
LVL 11

Expert Comment

by:louisfr
Comment Utility
I don't see where <Country> is mentioned to create one record from multiple identical values.
In the sample, replace
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                    Group By aop5.Name Into Group _
                    Select Group).ToList()

Open in new window

with
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                    Group By aop5.Element("COUNTRY").Value Into Group _
                    Select Group).ToList()

Open in new window

0
 
LVL 62

Expert Comment

by:Fernando Soto
Comment Utility
Lets start from the beginning. In Part A in your question you have only AOP5 node names and no other. Does this mean that each file only handles one of these node type in each document you may have? Or can one document have different node types at this level?

You state that you want, "identical <COUNTRY>", but in your example you only show nodes of this, <COUNTRY>ITA</COUNTRY>, all having the same value. so it is difficult to help with something where all the information is not stated.

When dealing with XML we need accurate information on the structure that is to be operated on and what the output is to be.  

Please post an accurate representation of the input XML with all possible types and what you want the output to look like. Something like a mini specification your boss would give you to work on a project.
0
 

Author Comment

by:vcharles
Comment Utility
Hi,
 I will post a better example later today.
Thanks,
Victor
0
 

Author Comment

by:vcharles
Comment Utility
Hi,

Below is a better example with three different NSC values (Part A) with duplicate values in three records, which need to be combined in three records in Part B.


Part A.
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <NSC>100</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <NSC>100</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN></SN>
     <NSC>100</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <NSC>101</NSC>
     <AGD4></AGD4>
   </AOP5
 </NewDataSet>
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <NSC>101</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN></SN>
     <NSC>101</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <NSC>10111</NSC>
     <AGD4></AGD4>
   </AOP5
 </NewDataSet>
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <NSC>10111</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN></SN>
     <NSC>10111</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 </NewDataSet>

Desired Results:

Part B:
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10411</SN>
     <NSC>100</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
<AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10411</SN>
     <NSC>101</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
<AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10411</SN>
     <NSC>10111</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 </NewDataSet>

Thanks,

Victor
0
 
LVL 11

Accepted Solution

by:
louisfr earned 250 total points
Comment Utility
With the correction I made earlier, Fernando's code would have worked as expected. Here is the corrected code, including the change from COUNTRY to NSC:
Dim xdoc As XDocument = XDocument.Load("Path to the XML document with input data ")
'' New document being created
Dim newXdoc As New XDocument(New XElement("Root"))
'' Get all the parent nodes
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                   Group By aop5.Element("NSC").Value Into Group _
                   Select Group).ToList()

'' Select all the needed nodes and place them in a new document 
For Each p As XElement() In parentNodes
    Dim nodes = (From n As XElement In p.Descendants().ToList() _
                 Where Not String.IsNullOrEmpty(n.Value)
                 Group By n.Name Into Group _
                 Select Group.First()).ToList()

     '' Create the new elements of the document
    Dim addNodes = New XElement("NewDataSet", nodes)
    newXdoc.Root.Add(addNodes)
Next
 '' Save the new document
newXdoc.Save("Path and file name to store on disk")

Open in new window

0
 

Author Comment

by:vcharles
Comment Utility
Thank you, will try it and get back to you.
Victor
0
 

Author Comment

by:vcharles
Comment Utility
Hi,

I'm get only one record for each NSC but they do not include values from the other data elements of the other rows which contained the same NSC.

Victor
0
 

Author Comment

by:vcharles
Comment Utility
Hi,

The code works but I see the problem. The data elements with no data include "N/A", therefore  N/A is included in the results eventhough in another record with same NSC there is an existing value in that data element.  When I remove N/A in all data elements the results only returns data elements with data.
Is there a way to include the empty data elements with my results for all the records?

Thanks,

Victor
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Suggested Solutions

For those of you who don't follow the news, or just happen to live under rocks, Microsoft Research released a beta SDK (http://www.microsoft.com/en-us/download/details.aspx?id=27876) for the Xbox 360 Kinect. If you don't know what a Kinect is (http:…
The ECB site provides FX rates for major currencies since its inception in 1999 in the form of an XML feed. The files have the following format (reducted for brevity) (CODE) There are three files available HERE (http://www.ecb.europa.eu/stats/exch…
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, Just open a new email message.  In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…
This video demonstrates how to create an example email signature rule for a department in a company using CodeTwo Exchange Rules. The signature will be inserted beneath users' latest emails in conversations and will be displayed in users' Sent Items…

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now