Solved

merging XML in bash

Posted on 2013-05-14
7
334 Views
Last Modified: 2013-05-17
Greetings,
I have two xml documents:
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
    </tag1>
</document>

Open in new window


<images>
     <image>
          <name></name>
          <size></size>
     </image>
     ....(more images)
</images>

Open in new window


I need to get <images> into <document> like this:
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
    </tag1>
     <images>
          <image>
               <name></name>
               <size></size>
          </image>
          ....(more images)
     </images>
</document>

Open in new window


Is there a way to do it in a bash script? or something like that?  xmllint?

Thanks
0
Comment
Question by:Evan Cutler
  • 2
  • 2
  • 2
  • +1
7 Comments
 
LVL 23

Expert Comment

by:nemws1
ID: 39165392
Not that I know of that is XML aware.

I would just:

cat document.xml images.xml > newdocument.xml

And then edit 'newdocument.xml' and move the "</document>" line.

Do you have like 1000 (or more) files that you need to do this with?  Is there other stuff *after* the "</document>" line?
0
 
LVL 9

Author Comment

by:Evan Cutler
ID: 39165420
yeah there is.  unfortunately the document.xml is a HUGE XML document...and the only thing I have in my arsonal is my XPATH.
0
 
LVL 23

Assisted Solution

by:nemws1
nemws1 earned 150 total points
ID: 39165447
The next thing that comes to mind is using Perl and one of the several XML modules (but that's pretty much just xpath again).

Have you tried xmlstarlet?

http://xmlstar.sourceforge.net/overview.php

I'm thinking the '--xinclude' argument can do what you want.  Check out the examples:

http://xmlstar.sourceforge.net/doc/xmlstarlet.txt
0
Master Your Team's Linux and Cloud Stack!

The average business loses $13.5M per year to ineffective training (per 1,000 employees). Keep ahead of the competition and combine in-person quality with online cost and flexibility by training with Linux Academy.

 
LVL 62

Expert Comment

by:gheist
ID: 39167252
You can try programming xmllint, namely xmllint --shell which can traverse xml tree and emit converted structure(s) and validate against DTD after if needed.
0
 
LVL 19

Expert Comment

by:simon3270
ID: 39174904
If the layout is as you described, and the </tag1> tag only occurs once in the file, a simple awk would do it:
awk '/<\/tag1>/{print;system("cat image.xml");next}{print}' doc.xml > output.xml

Open in new window

It wouldn't be indented in the way you show, but that shouldn't affect the XML itself.  If you really wanted it indented, that would be just a bit more complex and messier.
0
 
LVL 9

Author Comment

by:Evan Cutler
ID: 39174919
that's pretty genius simon,
instead of tag then print, can you do print (cat...) before </document>
to guarantee placement?
0
 
LVL 19

Accepted Solution

by:
simon3270 earned 350 total points
ID: 39175388
Yes, even easier in fact!
awk '/<\/document>/{system("cat image.xml")}{print}' doc.xml > output.xml

Open in new window

0

Featured Post

Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Attention: This article will no longer be maintained. If you have any questions, please feel free to mail me. jgh@FreeBSD.org Please see http://www.freebsd.org/doc/en_US.ISO8859-1/articles/freebsd-update-server/ for the updated article. It is avail…
Utilizing an array to gracefully append to a list of EmailAddresses
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
Learn how to find files with the shell using the find and locate commands. Use locate to find a needle in a haystack.: With locate, check if the file still exists.: Use find to get the actual location of the file.:

813 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now