Solved

merging XML in bash

Posted on 2013-05-14
Last Modified: 2013-05-17
Greetings,
I have two xml documents:
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
     </tag1>
</document>


<images>
     <image>
          <name></name>
          <size></size>
     </image>
     ....(more images)
</images>


I need to get <images> into <document> like this:
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
     </tag1>
     <images>
          <image>
               <name></name>
               <size></size>
          </image>
          ....(more images)
     </images>
</document>


Is there a way to do this in a bash script, or with something like xmllint?

Thanks
Question by:Evan Cutler

Expert Comment

by:nemws1
ID: 39165392
Not with any XML-aware tool that I know of.

I would just:

cat document.xml images.xml > newdocument.xml

Then edit 'newdocument.xml' and move the "</document>" line to the end of the file.

Do you have 1000 (or more) files that you need to do this with?  Is there anything else *after* the "</document>" line?

Author Comment

by:Evan Cutler
ID: 39165420
Yeah, there is.  Unfortunately document.xml is a HUGE XML document, and the only thing I have in my arsenal is XPath.

Assisted Solution

by:nemws1
nemws1 earned 150 total points
ID: 39165447
The next thing that comes to mind is using Perl and one of the several XML modules (but that's pretty much just xpath again).

Have you tried xmlstarlet?

http://xmlstar.sourceforge.net/overview.php

I'm thinking the '--xinclude' argument can do what you want.  Check out the examples:

http://xmlstar.sourceforge.net/doc/xmlstarlet.txt

Expert Comment

by:gheist
ID: 39167252
You can try scripting xmllint, namely xmllint --shell, which can traverse the XML tree and emit the converted structure(s). You can validate the result against a DTD afterwards if needed.

Expert Comment

by:simon3270
ID: 39174904
If the layout is as you described, and the </tag1> tag only occurs once in the file, a simple awk would do it:
awk '/<\/tag1>/{print;system("cat image.xml");next}{print}' doc.xml > output.xml

It wouldn't be indented in the way you show, but that shouldn't affect the XML itself.  If you really wanted it indented, that would be just a bit more complex and messier.
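
If the indentation does matter, one possible approach is to let awk read the second file itself and prefix every inserted line. This is only a sketch, not simon3270's code: the function name merge_indented and the five-space pad are assumptions chosen to match the layout in the question, and it still relies on </tag1> sitting on its own line.

```shell
# Hypothetical helper: print $1 with the contents of $2 inserted after the
# first line containing </tag1>, each inserted line indented by five spaces.
merge_indented() {
    awk -v inc="$2" -v pad="     " '
        { print }                      # always emit the current line of the main file
        /<\/tag1>/ && !done {
            # Read the fragment line by line and indent it.
            while ((getline line < inc) > 0) print pad line
            close(inc)
            done = 1                   # insert only once
        }
    ' "$1"
}
```

It would be called as merge_indented doc.xml image.xml > output.xml. Reading the file with getline instead of calling cat keeps all output inside awk, which is what makes the per-line prefix possible.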

Author Comment

by:Evan Cutler
ID: 39174919
That's pretty genius, Simon.
Instead of matching the tag and then printing, can you print (cat...) before </document>
to guarantee placement?

Accepted Solution

by:simon3270
simon3270 earned 350 total points
ID: 39175388
Yes, even easier in fact!
awk '/<\/document>/{system("cat image.xml")}{print}' doc.xml > output.xml
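
For anyone who wants to try this before running it on a large file, here is a small end-to-end run in a scratch directory. The sample contents are taken from the question and the file names from the answer; the scratch directory itself is just an assumption for the demo. As with the earlier version, it assumes </document> sits on its own line.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample main document, as in the question.
cat > doc.xml <<'EOF'
<document>
     <header></header>
     <tag1>
          <tag1a></tag1a>
     </tag1>
</document>
EOF

# Sample fragment to be merged in.
cat > image.xml <<'EOF'
<images>
     <image>
          <name></name>
          <size></size>
     </image>
</images>
EOF

# The one-liner from the answer: emit image.xml just before the </document> line.
awk '/<\/document>/{system("cat image.xml")}{print}' doc.xml > output.xml

cat output.xml
```

The <images> block should show up directly above the closing </document> line, unindented, as noted earlier in the thread.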
