• Status: Solved

merging XML in bash

Greetings,
I have two XML documents:
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
    </tag1>
</document>


<images>
     <image>
          <name></name>
          <size></size>
     </image>
     ....(more images)
</images>


I need to get <images> into <document> like this:
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
    </tag1>
     <images>
          <image>
               <name></name>
               <size></size>
          </image>
          ....(more images)
     </images>
</document>


Is there a way to do this in a bash script, or with something like xmllint?

Thanks
 
nemws1 commented:
Not that I know of, at least nothing that is XML-aware.

I would just:

cat document.xml images.xml > newdocument.xml

Then edit 'newdocument.xml' and move the "</document>" line to the end of the file.

Do you have something like 1000 (or more) files that you need to do this with?  Is there other stuff *after* the "</document>" line?
 
Evan Cutler (author) commented:
Yeah, there is. Unfortunately, document.xml is a HUGE XML document, and the only thing I have in my arsenal is XPath.
 
nemws1 commented:
The next thing that comes to mind is using Perl and one of the several XML modules (but that's pretty much just xpath again).

Have you tried xmlstarlet?

http://xmlstar.sourceforge.net/overview.php

I'm thinking the '--xinclude' argument can do what you want.  Check out the examples:

http://xmlstar.sourceforge.net/doc/xmlstarlet.txt
 
gheist commented:
You can try scripting xmllint, namely xmllint --shell, which can traverse the XML tree and emit converted structure(s). You can validate the result against a DTD afterwards if needed.
 
simon3270 commented:
If the layout is as you described, and the </tag1> tag only occurs once in the file, a simple awk would do it:
awk '/<\/tag1>/{print;system("cat image.xml");next}{print}' doc.xml > output.xml

It wouldn't be indented in the way you show, but that shouldn't affect the XML itself.  If you really wanted it indented, that would be just a bit more complex and messier.
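
For anyone who does want the indentation, here is a rough sketch of one way to do it. It is not the command from this thread: it swaps system("cat ...") for an awk getline loop so that each inserted line can be prefixed with the leading whitespace of the </tag1> line. The file names follow the one-liner above, and the sample data is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: insert image.xml after the </tag1> line, indenting every inserted
# line by the leading whitespace of the </tag1> line itself.
# Assumption: </tag1> appears once, on a line of its own.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Made-up sample input, shaped like the documents in the question.
cat > doc.xml <<'EOF'
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
     </tag1>
</document>
EOF

cat > image.xml <<'EOF'
<images>
     <image>
          <name></name>
          <size></size>
     </image>
</images>
EOF

awk -v insert="image.xml" '
/<\/tag1>/ {
    print
    # Capture the leading whitespace of the </tag1> line.
    match($0, /^[ \t]*/)
    indent = substr($0, 1, RLENGTH)
    # Read the file to insert line by line and prefix the indent.
    while ((getline line < insert) > 0)
        print indent line
    close(insert)
    next
}
{ print }
' doc.xml > output.xml

cat output.xml
```

This only shifts the inserted block as a whole, so it relies on image.xml already being indented consistently inside itself.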
 
Evan Cutler (author) commented:
That's pretty genius, simon.
Instead of matching the tag and then printing, can you print (cat...) before </document>
to guarantee placement?
 
simon3270 commented:
Yes, even easier in fact!
awk '/<\/document>/{system("cat image.xml")}{print}' doc.xml > output.xml
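
Since this approach is line-based rather than XML-aware, it depends on </document> appearing on exactly one line. A minimal sketch of a guard around the same idea follows. The function name merge_before_close is hypothetical, the sample files are made up, and a getline loop stands in for system("cat ...") so the file name never passes through a shell.

```shell
#!/usr/bin/env bash
# Sketch: refuse to merge unless exactly one line contains </document>.
set -euo pipefail

# merge_before_close DOC INSERT OUT (hypothetical helper, not from the thread)
merge_before_close() {
    local doc=$1 insert=$2 out=$3 count
    if [ ! -r "$doc" ] || [ ! -r "$insert" ]; then
        echo "cannot read $doc or $insert" >&2
        return 1
    fi
    # grep -c counts matching lines; "|| true" tolerates a count of zero.
    count=$(grep -c '</document>' "$doc" || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one </document> line in $doc, found $count" >&2
        return 1
    fi
    # Print the contents of INSERT just before the closing line, then the line.
    awk -v insert="$insert" '
    /<\/document>/ {
        while ((getline line < insert) > 0)
            print line
        close(insert)
    }
    { print }
    ' "$doc" > "$out"
}

# Made-up sample data to exercise the helper.
workdir=$(mktemp -d)
printf '%s\n' '<document>' '     <header></header>' '</document>' > "$workdir/doc.xml"
printf '%s\n' '<images>' '</images>' > "$workdir/image.xml"

merge_before_close "$workdir/doc.xml" "$workdir/image.xml" "$workdir/output.xml"
cat "$workdir/output.xml"
```

If </document> shares a line with other markup, the inserted block lands before that whole line, so it is worth checking the result with an XML parser afterwards.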

