Solved

Merge XML Files

Posted on 2012-03-09
Last Modified: 2012-03-12
I'm looking for a way to merge a folder of about 50 different XML files into one XML file. I will schedule it as a daily task that merges all the XML files. Does anyone know if this can be done with PowerShell, or any other type of script? I found this link that talks about merging CSV files; would that approach be close if we changed it to XML? Please help!

http://www.youdidwhatwithtsql.com/merging-csv-files-with-powershell/330
Question by:LeviDaily
22 Comments
 
LVL 53

Expert Comment

by:Bill Prew
ID: 37705072
Do all the XML files have a common schema?

I'd want to see some samples of the files to be merged.

Will there be any "duplicate" root nodes among the files, or is this more of an "append each file to the rest" type deal?

~bp
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37705076
Can I email you the files? I would rather not post them. There would not be duplicates.
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 37705094
It's actually against EE policy for us to exchange offline info while working a question (not my rule, but I get it).

If they contain sensitive data then typically posters will edit the file before posting to remove the sensitive data and replace it with meaningless info.

If it's a size issue then strip it down to just a representative sample of the file(s).

Understand, I'm not trying to be difficult, but I do want to abide by the EE rules on this so that all experts get to see the same info and messages.

~bp
0

 
LVL 51

Accepted Solution

by:
ahoffmann earned 200 total points
ID: 37705103
Do you mean to just concatenate the files? In DOS or similar, that's:
type file1 file2 file3 > new-file

Or do you mean to merge the content of the files according to their XML structure?
Then you need a proper XML parser and have to walk through the XML tree programmatically.
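
To make the distinction above concrete, here is a minimal Python sketch. Python is not one of the tools discussed in this thread, so the language choice, the sample `<sale>` documents, and the `merged` wrapper element are all assumptions for illustration. It shows that plain concatenation gives text that no longer parses as one XML document, while parsing each document and appending its root under a single new root does.

```python
import xml.etree.ElementTree as ET


def concatenate(docs):
    """Plain concatenation, the equivalent of `type file1 file2 > new-file`."""
    return "".join(docs)


def merge_under_root(docs, root_name="merged"):
    """Parse each document and append its root element under one new root.

    `root_name` is a hypothetical wrapper element, not something from the
    original files.
    """
    root = ET.Element(root_name)
    for text in docs:
        root.append(ET.fromstring(text))
    return ET.tostring(root, encoding="unicode")


def is_well_formed(text):
    """Return True if `text` parses as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Made-up sample documents standing in for two of the files.
docs = ["<sale><id>1</id></sale>", "<sale><id>2</id></sale>"]
print(is_well_formed(concatenate(docs)))
print(is_well_formed(merge_under_root(docs)))
```

Note that wrapping everything in one root only makes the result well-formed; it says nothing about whether the result is valid against a DTD or XSD.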
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37705120
I totally understand. I will remove the sensitive data shortly and post. I'm not sure I know the difference between concatenate and merge. Say I have two XML files: I would go into file 1, select all and copy, then open file 2 and paste the data.
0
 
LVL 26

Assisted Solution

by:wilcoxon
wilcoxon earned 150 total points
ID: 37705126
You could certainly do this with perl.  The script would basically look like this (you'll likely need to tweak the XMLin and XMLout and merge calls):

#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
use Data::Merger qw(merger);
my $dir = shift or die "Usage: $0 <dir to merge XML files in>\n";
# get the list of files (readdir returns bare names, so prepend the dir)
opendir DIR, $dir or die "could not open dir $dir: $!";
my @files = map { "$dir/$_" } grep /\.xml$/, readdir DIR;
closedir DIR;
die "no XML files found in $dir\n" unless @files;
# get the first XML
# see perldoc XML::Simple for options
my %in_opts = ( ForceArray => 1 );
my $xml = XMLin(shift(@files), %in_opts);
# loop and merge the others
foreach my $fil (@files) {
    my $tmp = XMLin($fil, %in_opts);
    $xml = merger($xml, $tmp);
}
# output the XML
open OUT, '>', 'output.xml' or die "could not write output.xml: $!";
# see perldoc XML::Simple for options (ForceArray is an XMLin-only option)
print OUT XMLout($xml);
close OUT;



If Data::Merger doesn't give you enough control, you can use Data::Nested instead:

#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
use Data::Nested;
my $dir = shift or die "Usage: $0 <dir to merge XML files in>\n";
# get the list of files (readdir returns bare names, so prepend the dir)
opendir DIR, $dir or die "could not open dir $dir: $!";
my @files = map { "$dir/$_" } grep /\.xml$/, readdir DIR;
closedir DIR;
die "no XML files found in $dir\n" unless @files;
# get the first XML
# see perldoc XML::Simple for options
my %in_opts = ( ForceArray => 1 );
my $xml = XMLin(shift(@files), %in_opts);
# create nested data object - see perldoc Data::Nested
my $nds = Data::Nested->new;
$nds->set_merge('merge_ul', 'merge');
# you could validate structure by calls to $nds->structure(...)
# loop and merge the others
foreach my $fil (@files) {
    my $tmp = XMLin($fil, %in_opts);
    $nds->merge($xml, $tmp, undef, 1);
}
# output the XML
open OUT, '>', 'output.xml' or die "could not write output.xml: $!";
# see perldoc XML::Simple for options (ForceArray is an XMLin-only option)
print OUT XMLout($xml);
close OUT;


0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37705170
> i would go into file 1 and select all and copy, then open file 2 and paste the data.
I assume that's concatenation, hence see my simple suggestion :)
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 37705234
I wouldn't expect just concatenating all the single files together to yield a new single schema XML data file, but maybe I don't understand how you use the new merged file?

~bp
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 37705255
Concatenating will definitely not produce a valid XML file.  You need to merge them.  If there are no duplicates between files and only a single structure under the root element then this script will work (modified from earlier to get rid of heavy-duty merge modules):

#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
my $dir = shift or die "Usage: $0 <dir to merge XML files in>\n";
# get the list of files (readdir returns bare names, so prepend the dir)
opendir DIR, $dir or die "could not open dir $dir: $!";
my @files = map { "$dir/$_" } grep /\.xml$/, readdir DIR;
closedir DIR;
die "no XML files found in $dir\n" unless @files;
# get the first XML
# see perldoc XML::Simple for options
# KeyAttr => [] turns off array folding so repeated elements stay as arrays
my %in_opts = ( ForceArray => 1, KeyAttr => [] );
my $xml = XMLin(shift(@files), %in_opts);
# loop and merge the others
foreach my $fil (@files) {
    my $tmp = XMLin($fil, %in_opts);
    # XMLin returns a hash ref keyed by the element names just below the root;
    # with ForceArray each of those values is an array ref, so append them
    # name by name (still untested against real files; root attributes,
    # which are plain scalars, are skipped)
    foreach my $key (keys %$tmp) {
        next unless ref($tmp->{$key}) eq 'ARRAY';
        push @{ $xml->{$key} }, @{ $tmp->{$key} };
    }
}
# output the XML
open OUT, '>', 'output.xml' or die "could not write output.xml: $!";
# see perldoc XML::Simple for options (ForceArray is an XMLin-only option)
print OUT XMLout($xml, KeyAttr => []);
close OUT;


0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37706348
Do you insist on a PowerShell solution?
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37706356
No, I am up for anything. I'm using Windows XP and don't care what solution I use as long as it works.
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 37706987
@LeviDaily

I was still waiting for example files before I posted a solution...

~bp
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37708077
OK, sorry it took a while. Here is an update: this is for a custom POS system. For every transaction, three XML files are created, and it looks like they have three different schemas. I have attached 1, 2, and 3.
1.xml
2.xml
3.xml
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37711658
Would it be easier to convert all the .xml files to a .csv file? I guess it doesn't have to be CSV.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 37711736
If the XML files have different schemas, why are you trying to merge them into a single XML file?  It will never be valid (at least according to any DTD or XSD).

To go back to the very basics, what is it you are trying to do with the data?  You now mention csv when the original question was asking for merged XML (two very different things).
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37711753
Sorry, you are right. I just need all the data in one readable file. Since this is a Point of Sale machine, I want to be able to "merge" all those XML files into one file that will then get transferred over FTP to a server. The accounting team will have access to it.

If there was a "transaction" in question, the accounting team would look at that file (.xml or .csv?) to verify the transaction took place.
0
 
LVL 53

Expert Comment

by:Bill Prew
ID: 37711763
Why not just ZIP the files together first, and then transfer that over for possible future reference?
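
For reference, a rough sketch of the ZIP idea in Python (Python is an assumption here, it is not a tool used in this thread; the function name and the archive name passed to it are hypothetical). It adds every .xml file in a folder to one compressed archive that can then be transferred as a single file.

```python
import zipfile
from pathlib import Path


def zip_xml_files(folder, archive_path):
    """Add every .xml file in `folder` to one archive; return the file count."""
    files = sorted(Path(folder).glob("*.xml"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store only the file name, not the full folder path.
            archive.write(path, arcname=path.name)
    return len(files)
```

A daily job could call it as `zip_xml_files("C:/pos/xml", "C:/pos/2012-03-12.zip")`, where both paths are placeholders.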

~bp
0
 
LVL 2

Author Comment

by:LeviDaily
ID: 37711780
BP, you are right, except the Accounting Team wants it all in one file so the data is searchable. For instance, if we merge all the files into one every night, then the accounting team can go to the day in question, open the one file, and search for data.

There could be hundreds of XML files in one day, and right now they have to open each file individually to find what they are looking for. The file names are just the timestamp, with no data referencing the customer.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 37711796
Can you teach the Accounting Team about grep (or whatever the Windows equivalent is (findstr?))?  There's no reason to open all of the files to search for something.

If they really want everything in one file and it does not have to be a valid XML file, then I would just go with the simple "type *.xml > combined.xml" approach.  It's simple, can be done in a bat file, and will produce something as good as any other "random" merge of the files.

If it does have to be valid XML, then I'd still do something in perl.
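
As an illustration of searching without opening each file, here is a small Python sketch of a findstr-style search (Python is an assumption, it is not a tool used in this thread; the function name is hypothetical). It scans every .xml file in a folder and reports the file name, line number, and line for each case-insensitive match.

```python
from pathlib import Path


def find_in_xml_files(folder, needle):
    """Return (file name, line number, line) for each line containing `needle`."""
    hits = []
    wanted = needle.lower()
    for path in sorted(Path(folder).glob("*.xml")):
        # errors="replace" keeps the search going if a file has odd bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if wanted in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

Because the POS file names are timestamps, the file name in each hit also tells you when the matching transaction was written.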
0
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37712298
> .. the Accounting Team wants it all in one file ..
What's the problem with my very first solution (ID: 37705103)?
0
 
LVL 53

Assisted Solution

by:Bill Prew
Bill Prew earned 150 total points
ID: 37712395
Simplest way is to use the COPY command to do the concatenation then.  You can either do:

copy file1.xml+file2.xml+file3.xml all.xml

or, if you want to merge all the files in the current folder then you can do:

copy *.xml all.txt

and then rename the all.txt to all.xml
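
If a scripted version is ever wanted, the same concatenation can be sketched in Python (an assumption, since Python is not used in this thread; the function name and the `=====` separator line are made up for illustration). Unlike a plain copy, it writes the source file name before each file's content and skips the output file if it already sits in the same folder.

```python
from pathlib import Path


def combine_xml_files(folder, output_name="all.txt"):
    """Concatenate every .xml file in `folder` into one text file.

    Returns the number of source files written.
    """
    folder = Path(folder)
    output = folder / output_name
    # Build the list first so the output file is never read as a source.
    sources = sorted(p for p in folder.glob("*.xml") if p != output)
    with output.open("w", encoding="utf-8") as out:
        for path in sources:
            out.write(f"===== {path.name} =====\n")  # hypothetical separator
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n")
    return len(sources)
```

The separator lines mean the combined file is plain text rather than XML, which matches the "searchable, not necessarily valid" requirement described earlier.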

~bp
0
 
LVL 2

Author Closing Comment

by:LeviDaily
ID: 37713007
Thanks for all the help, guys! I am fine with the "type file > output" solution since it doesn't require any other installations and is native to Windows. Sorry for being all over the place on this, but it looks like they can search a text file. Much appreciated!
0
