Solved

Help with combining data in xml file based on identical country using VB.NET

Posted on 2015-01-24
15
54 Views
Last Modified: 2015-09-12
Hi,

How do combine all records with identical <COUNTRY> in create one record combining data from other data elements, for example, If I have xml file in Part A, how do I create a new xml file as shown in art B? The example is only for one record, the xml file in part A will contain multiple records with similar situation.

Part A.
<NewDataSet>
  <AOP5>
    <SCOUNTRY>ITA 10411</SCOUNTRY>
    <SN></SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4></AGD4>
  </AOP5
</NewDataSet>
  <AOP5>
    <SCOUNTRY></SCOUNTRY>
    <SN>10411</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4></AGD4>
  </AOP5
  <AOP5>
    <SCOUNTRY></SCOUNTRY>
    <SN></SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4>M777</AGD4>
  </AOP5>
</NewDataSet>

Part B:
<NewDataSet>
  <AOP5>
    <SCOUNTRY>ITA 10411</SCOUNTRY>
    <SN>10411</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4>M777</AGD4>
  </AOP5>
</NewDataSet>

Thanks,

Victor
0
Comment
Question by:vcharles
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 9
  • 4
  • 2
15 Comments
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40568204
Hi Victor;

Your original document had a single value over all like records, for example one AGD4 having a value and all other records having an empty values in the node. Is it possible to have multiple values in the same node type across all records like shown below? If yes how do you want to handle these multiple values?

<NewDataSet>
  <AOP5>
    <SCOUNTRY>ITA 10411</SCOUNTRY>
    <SN>2012</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4></AGD4>
  </AOP5
</NewDataSet>
  <AOP5>
    <SCOUNTRY>WA 1020</SCOUNTRY>
    <SN>10411</SN>
    <COUNTRY>ITA</COUNTRY>
    <AGD4>F999</AGD4>
  </AOP5
  <AOP5>
    <SCOUNTRY>FL 2031</SCOUNTRY>
    <SN>1141</SN>
    <COUNTRY>ABC</COUNTRY>
    <AGD4>M777</AGD4>
  </AOP5>
</NewDataSet>

Open in new window

0
 

Author Comment

by:vcharles
ID: 40568223
Hi,

If this occurs the first record containing a data element  with data shoud overwrite the other records, for example if I have:\ the following in Part A:

Part A.
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4>M800</AGD4>
   </AOP5
 </NewDataSet>
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10412</SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4>M777</AGD4>
   </AOP5>
 </NewDataSet>

Part B should contain:

Part B:
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10412</SN>
     <COUNTRY>ITA</COUNTRY>
     <AGD4>M800</AGD4>
   </AOP5>
 </NewDataSet>
0
 
LVL 63

Assisted Solution

by:Fernando Soto
Fernando Soto earned 250 total points
ID: 40568503
Hi Victory;

This should do what you need.

Dim xdoc As XDocument = XDocument.Load("Path to the XML document with input data ")
'' New document being created
Dim newXdoc As New XDocument(New XElement("Root"))
'' Get all the parent nodes
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                   Group By aop5.Name Into Group _
                   Select Group).ToList()

'' Select all the needed nodes and place them in a new document 
For Each p As XElement() In parentNodes
    Dim nodes = (From n As XElement In p.Descendants().ToList() _
                 Where Not String.IsNullOrEmpty(n.Value)
                 Group By n.Name Into Group _
                 Select Group.First()).ToList()

     '' Create the new elements of the document
    Dim addNodes = New XElement("NewDataSet", nodes)
    newXdoc.Root.Add(addNodes)
Next
 '' Save the new document
newXdoc.Save("Path and file name to store on disk")

Open in new window

0
Instantly Create Instructional Tutorials

Contextual Guidance at the moment of need helps your employees adopt to new software or processes instantly. Boost knowledge retention and employee engagement step-by-step with one easy solution.

 

Author Comment

by:vcharles
ID: 40568561
Thank you, I will get back to you.
0
 

Author Comment

by:vcharles
ID: 40568568
Hi again,

I don't see where <Country> is mentioned to create one record from multiple identical values. Can you please explain the code to me.

Thanks,

Victor
0
 

Author Comment

by:vcharles
ID: 40568821
Hi,

unfortunately it does not work, only one record was returned. I need to merge data from all data elements where country is the same and eliminate duplicate data in Country.

Victor
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40568831
Hi Victor;

Please post you code as you implemented it and the file that you used as the input data.
0
 
LVL 11

Expert Comment

by:louisfr
ID: 40570079
I don't see where <Country> is mentioned to create one record from multiple identical values.
In the sample, replace
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                    Group By aop5.Name Into Group _
                    Select Group).ToList()

Open in new window

with
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                    Group By aop5.Element("COUNTRY").Value Into Group _
                    Select Group).ToList()

Open in new window

0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40570541
Lets start from the beginning. In Part A in your question you have only AOP5 node names and no other. Does this mean that each file only handles one of these node type in each document you may have? Or can one document have different node types at this level?

You state that you want, "identical <COUNTRY>", but in your example you only show nodes of this, <COUNTRY>ITA</COUNTRY>, all having the same value. so it is difficult to help with something where all the information is not stated.

When dealing with XML we need accurate information on the structure that is to be operated on and what the output is to be.  

Please post an accurate representation of the input XML with all possible types and what you want the output to look like. Something like a mini specification your boss would give you to work on a project.
0
 

Author Comment

by:vcharles
ID: 40570610
Hi,
 I will post a better example later today.
Thanks,
Victor
0
 

Author Comment

by:vcharles
ID: 40571874
Hi,

Below is a better example with three different NSC values (Part A) with duplicate values in three records, which need to be combined in three records in Part B.


Part A.
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <NSC>100</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <NSC>100</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN></SN>
     <NSC>100</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <NSC>101</NSC>
     <AGD4></AGD4>
   </AOP5
 </NewDataSet>
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <NSC>101</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN></SN>
     <NSC>101</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN></SN>
     <NSC>10111</NSC>
     <AGD4></AGD4>
   </AOP5
 </NewDataSet>
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN>10411</SN>
     <NSC>10111</NSC>
     <AGD4></AGD4>
   </AOP5
   <AOP5>
     <SCOUNTRY></SCOUNTRY>
     <SN></SN>
     <NSC>10111</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 </NewDataSet>

Desired Results:

Part B:
 <NewDataSet>
   <AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10411</SN>
     <NSC>100</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
<AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10411</SN>
     <NSC>101</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
<AOP5>
     <SCOUNTRY>ITA 10411</SCOUNTRY>
     <SN>10411</SN>
     <NSC>10111</NSC>
     <AGD4>M777</AGD4>
   </AOP5>
 </NewDataSet>

Thanks,

Victor
0
 
LVL 11

Accepted Solution

by:
louisfr earned 250 total points
ID: 40572168
With the correction I made earlier, Fernando's code would have worked as expected. Here is the corrected code, including the change from COUNTRY to NSC:
Dim xdoc As XDocument = XDocument.Load("Path to the XML document with input data ")
'' New document being created
Dim newXdoc As New XDocument(New XElement("Root"))
'' Get all the parent nodes
Dim parentNodes = (From aop5 As XElement In xdoc.Descendants("AOP5") _
                   Group By aop5.Element("NSC").Value Into Group _
                   Select Group).ToList()

'' Select all the needed nodes and place them in a new document 
For Each p As XElement() In parentNodes
    Dim nodes = (From n As XElement In p.Descendants().ToList() _
                 Where Not String.IsNullOrEmpty(n.Value)
                 Group By n.Name Into Group _
                 Select Group.First()).ToList()

     '' Create the new elements of the document
    Dim addNodes = New XElement("NewDataSet", nodes)
    newXdoc.Root.Add(addNodes)
Next
 '' Save the new document
newXdoc.Save("Path and file name to store on disk")

Open in new window

0
 

Author Comment

by:vcharles
ID: 40572820
Thank you, will try it and get back to you.
Victor
0
 

Author Comment

by:vcharles
ID: 40572980
Hi,

I'm get only one record for each NSC but they do not include values from the other data elements of the other rows which contained the same NSC.

Victor
0
 

Author Comment

by:vcharles
ID: 40573265
Hi,

The code works but I see the problem. The data elements with no data include "N/A", therefore  N/A is included in the results eventhough in another record with same NSC there is an existing value in that data element.  When I remove N/A in all data elements the results only returns data elements with data.
Is there a way to include the empty data elements with my results for all the records?

Thanks,

Victor
0

Featured Post

Enroll in June's Course of the Month

June's Course of the Month is now available! Every 10 seconds, a consumer gets hit with ransomware. Refresh your knowledge of ransomware best practices by enrolling in this month's complimentary course for Premium Members, Team Accounts, and Qualified Experts.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

It’s quite interesting for me as I worked with Excel using vb.net for some time. Here are some topics which I know want to share with others whom this might help. First of all if you are working with Excel then you need to Download the Following …
Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …
Do you want to know how to make a graph with Microsoft Access? First, create a query with the data for the chart. Then make a blank form and add a chart control. This video also shows how to change what data is displayed on the graph as well as form…

696 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question