How do I insert an XML fragment into another XML tree using Python and lxml?

Posted on 2014-07-29
Last Modified: 2014-07-29
I have a fragment of XML (the output from an XSL transform using lxml.etree), which I want to add to another XML tree.  I have
  <newsection>
    <newitem name="fred" attr1="7"/> 
    <newitem name="george" attr1="6"/>
  </newsection>

and
<oldparent>
  <oldsection>
    <oldchild>
      <oldgc name="sally"/>
    </oldchild>
  </oldsection>
  <othersect>
    <otherch name="alice"/>
  </othersect>
</oldparent>

Essentially, I need to insert the first fragment into the second so that "<newsection>" is at the same level as "<oldsection>" and "<othersect>".  The position among those siblings doesn't matter.

So far, I have tried inserting the first fragment entry by entry, but it ends up in the wrong order.  For example, using
#!/usr/bin/env python

import lxml.etree as ET

sd = ET.parse('file_containing_first_fragment.xml')
ft = ET.parse('extract_first_fragment.xsl')
transform = ET.XSLT(ft)
rx = transform(sd)
# "rx" now contains just the first fragment

st = ET.parse('second_fragment.xml')
root = st.getroot()
for e in rx.getiterator():
	root.append(e)
print(ET.tostring(st))

This does insert the first fragment, but the ordering is wrong - I get the "<newsection>" open and close before any of the "<newitem>" entries:
<oldparent>
  <oldsection>
    <oldchild>
      <oldgc name="sally"/>
    </oldchild>
  </oldsection>
  <othersect>
    <otherch name="alice"/>
  </othersect>
<newsection>
       </newsection>
<newitem name="fred" attr1="7"/> 
    <newitem name="george" attr1="6"/>
</oldparent>

How can I do this and have that "<newsection>" fragment in the right order?
Question by:simon3270
LVL 29

Accepted Solution

by:
pepr earned 2000 total points
ID: 40226402
The following code uses the standard xml.etree.ElementTree, but it should work the same way with lxml:
#!python3

import xml.etree.ElementTree as ET

rx = ET.fromstring('''
<newsection>
  <newitem name="fred" attr1="7"/> 
  <newitem name="george" attr1="6"/>
</newsection>''')
# "rx" now contains just the first fragment

root = ET.fromstring('''
<oldparent>
  <oldsection>
    <oldchild>
      <oldgc name="sally"/>
    </oldchild>
  </oldsection>
  <othersect>
    <otherch name="alice"/>
  </othersect>
</oldparent>''')

root.append(rx)
ET.dump(root)

It prints
<oldparent>
  <oldsection>
    <oldchild>
      <oldgc name="sally" />
    </oldchild>
  </oldsection>
  <othersect>
    <otherch name="alice" />
  </othersect>
<newsection>
  <newitem attr1="7" name="fred" />
  <newitem attr1="6" name="george" />
</newsection></oldparent>

The root element 'oldparent' holds its children as a list. You want to append the rx element to that list as a whole, as one more child, rather than appending its parts one by one.
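
To illustrate the "list of children" idea, here is a minimal sketch using the standard xml.etree.ElementTree (the thread reports the same calls working with lxml.etree). The XML strings are shortened stand-ins for the documents in the question, and the choice of index 1 is only an assumption made for the demonstration, since the asker said the position does not matter.

```python
import xml.etree.ElementTree as ET

# Stand-ins for the two documents in the question.
fragment = ET.fromstring(
    '<newsection>'
    '<newitem name="fred" attr1="7"/>'
    '<newitem name="george" attr1="6"/>'
    '</newsection>'
)
root = ET.fromstring(
    '<oldparent>'
    '<oldsection><oldchild><oldgc name="sally"/></oldchild></oldsection>'
    '<othersect><otherch name="alice"/></othersect>'
    '</oldparent>'
)

# An element can be iterated like a sequence of its direct children.
child_tags_before = [child.tag for child in root]

# insert() places the whole fragment at a given index;
# append() would put it after the last child instead.
root.insert(1, fragment)

child_tags_after = [child.tag for child in root]

# The fragment keeps its own children, in their original order.
item_names = [item.get("name") for item in root.find("newsection")]

print(child_tags_after)
print(item_names)
```

Because the fragment is added as a single element, the "newitem" entries stay nested inside "newsection" instead of being scattered under "oldparent".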
 
LVL 19

Author Comment

by:simon3270
ID: 40226537
Thanks, @pepr, you pointed me in the right direction.

Your example works fine with lxml.etree, but my code didn't - I got:
  File "./tstsd.py", line 13, in <module>
    root.append(rx)
AttributeError: 'lxml.etree._ElementTree' object has no attribute 'append'

My problem turned out to be that ET.parse (which I was using to read the XML file) returns an ElementTree, which doesn't have an "append" attribute. ET.fromstring (which you used) returns an Element, which does have one.

The fix was to use the string functions to turn the ElementTree into an Element, so I did
    root = ET.parse('second_fragment.xml')
    root = ET.fromstring(ET.tostring(root))
Not pretty, but it worked!
 
LVL 29

Expert Comment

by:pepr
ID: 40226626
No, no! ET.parse() returns a tree object, not an element. You get the root element object from the tree object by calling its .getroot() method -- as you already did in your own code, for example here:
...
st = ET.parse('second_fragment.xml')
root = st.getroot()
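
The difference between the tree object and its root element can be sketched as follows. This is a minimal example under stated assumptions: it uses the standard xml.etree.ElementTree rather than lxml, and io.StringIO with a shortened document stands in for 'second_fragment.xml' so that no files are needed.

```python
import io
import xml.etree.ElementTree as ET

# io.StringIO stands in for 'second_fragment.xml' (an assumption for the demo).
source = io.StringIO('<oldparent><oldsection/><othersect/></oldparent>')

tree = ET.parse(source)   # an ElementTree: a wrapper around the whole document
root = tree.getroot()     # an Element: the <oldparent> node itself

# Only the Element offers list-style methods such as append().
tree_has_append = hasattr(tree, "append")
root_has_append = hasattr(root, "append")

fragment = ET.fromstring('<newsection><newitem name="fred"/></newsection>')
root.append(fragment)     # the fragment goes in whole, children included

# root still belongs to the tree, so serialising from the tree shows the change.
result = ET.tostring(tree.getroot(), encoding="unicode")

print(tree_has_append, root_has_append)
print(result)
```

No round trip through a string is needed: the element returned by .getroot() is the same object the tree holds, so changes made through it show up when the tree is written out. In the asker's script the XSLT result "rx" is also a tree-like object, so the same .getroot() step would presumably apply to it before appending.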

 
LVL 19

Author Comment

by:simon3270
ID: 40226960
Aha, even easier!  And certainly prettier. Many thanks.