Solved

[Python] Extracting elements with ElementTree

Posted on 2013-11-10
Last Modified: 2013-11-23
How do I extract the horizontal and vertical elements (the children of frameSize) from the XML? The code:

import xml.etree.ElementTree as ET

class InvalidTagError(Exception):
  def __init__(self, element, expected):
    self.element = element
    self.expected = expected
  def __str__(self):
    return "'{0}' is not an instance of '{1}'".format(self.element, self.expected)

class QualifiedElement(object):

  @classmethod
  def qualify(cls, tag, ns):
    # Build ElementTree's "{namespace}tag" form; leave the tag bare when ns is None or empty
    if ns:
      tag =  "{{{ns}}}{tag}".format(ns=ns, tag=tag)
    return tag

  def fqn(self, tag, ns=None):
    if ns is None:
      ns = self.ns
    return QualifiedElement.qualify(tag, ns)

  def get_child_text(self, tag, ns=None):
    '''returns the text for the child tag or None if the tag isn't found'''      
    child = self.element.find(self.fqn(tag, ns))
    if child is None:
      return None
    else:
      return child.text

  def __init__(self, el, ns=None):        
    if not isinstance(el, ET.Element):
      raise TypeError()
    self.ns = ns
    self.element = el


class Source( QualifiedElement ):

  def __init__(self, el, ns=None):
    QualifiedElement.__init__(self, el, ns)  
    if el.tag != self.fqn('source'):
      raise InvalidTagError(el, self.fqn('source'))

    self.num = el.get('num')  # example of attribute access
    self.name = self.get_child_text('name')
    self.street = self.get_child_text('street')
    self.city = self.get_child_text('city')
    self.zip = self.get_child_text('zip')
    #self.frameSize.horizontal = self.get_child_text('zip')
       
def main():

    xml='''<?xml version="1.0"?>
<!--
    Document   test.xml
    Created on :
    Author     : Jane Doe
    Description: XML Definition for address 
-->
<st:address xmlns:st="http://this/prefix/needs/to/be/defined" >
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:street>123 Happy Go Lucky Ln.</st:street>
    <st:city>Seattle</st:city>
    <st:state>WA</st:state>
    <st:zip>98056</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:street>456 Happy Go Lucky Ln.</st:street>
    <st:city>Orlando</st:city>
    <st:state>FL</st:state>
    <st:zip>43336</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

    ns = 'http://this/prefix/needs/to/be/defined'
    root = ET.fromstring(xml)
    for child_of_root in root :
      print child_of_root.tag, child_of_root.attrib
      print child_of_root [ 0 ].tag, child_of_root [ 0 ].text

    sources = []
    for el in root.findall(QualifiedElement.qualify('source', ns)):
      source = Source(el, ns)
      sources.append(source)

    for source in sources:
      print source.name
      print source.street
       # print source.frameSize.horizontal

if __name__=='__main__':
    main() 
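
For context on the qualify helper above: ElementTree resolves prefixes at parse time and stores each namespaced tag as "{uri}localname", which is why a bare name such as 'frameSize' matches nothing. A minimal sketch, using a made-up urn:example namespace that is not part of the question's document:

```python
import xml.etree.ElementTree as ET

# 'urn:example' is a made-up namespace used only for this illustration.
doc = '<st:a xmlns:st="urn:example"><st:b>1</st:b></st:a>'
root = ET.fromstring(doc)

# Prefixes are resolved at parse time; tags come back as "{uri}localname".
child_tags = [child.tag for child in root]
print(child_tags)

# The bare local name matches nothing; the qualified name does.
bare = root.find('b')
qualified = root.find('{urn:example}b')
```

Here bare ends up as None while qualified is the st:b element, so any lookup of frameSize, horizontal or vertical needs the qualified form as well.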

Question by:forums_mp
3 Comments
 
LVL 25

Accepted Solution

by: clockwatcher (earned 80 total points)
ID: 39640536
Here you go:

import xml.etree.ElementTree as ET

class InvalidTagError(Exception):
  def __init__(self, element, expected):
    self.element = element
    self.expected = expected
  def __str__(self):
    return "'{0}' is not an instance of '{1}'".format(self.element, self.expected)

class QualifiedElement(object):

  @classmethod
  def qualify(cls, tag, ns):
    # Build ElementTree's "{namespace}tag" form; leave the tag bare when ns is None or empty
    if ns:
      tag =  "{{{ns}}}{tag}".format(ns=ns, tag=tag)
    return tag

  def fqn(self, tag, ns=None):
    if ns is None:
      ns = self.ns
    return QualifiedElement.qualify(tag, ns)

  def get_child_element(self, tag, ns=None):
    '''returns the child element or None if the element isn't found'''
    # find() already returns None when nothing matches
    return self.element.find(self.fqn(tag, ns))

  def get_child_text(self, tag, ns=None):
    '''returns the text for the child element or None'''      
    child = self.get_child_element(tag, ns)
    if child is None:
      return None
    else:
      return child.text

  def __init__(self, el, ns=None):        
    if not isinstance(el, ET.Element):
      raise TypeError()
    self.ns = ns
    self.element = el


class Source( QualifiedElement ):

  def __init__(self, el, ns=None):
    super(Source, self).__init__(el, ns)  
    if el.tag != self.fqn('source'):
      raise InvalidTagError(el, self.fqn('source'))

    self.num = el.get('num')  # example of attribute access
    self.name = self.get_child_text('name')
    self.street = self.get_child_text('street')
    self.city = self.get_child_text('city')
    self.zip = self.get_child_text('zip')
    self.frameSize = FrameSize(self.get_child_element('frameSize'), ns)

class FrameSize( QualifiedElement ):
    def __init__(self, el, ns=None):
        super(FrameSize, self).__init__(el, ns) 
        if el.tag != self.fqn('frameSize'):
            raise InvalidTagError(el, self.fqn('frameSize'))
        self.horizontal = self.get_child_text('horizontal')
        self.vertical = self.get_child_text('vertical')

       
def main():

    xml='''<?xml version="1.0"?>
<!--
    Document   test.xml
    Created on :
    Author     : Jane Doe
    Description: XML Definition for address 
-->
<st:address xmlns:st="http://this/prefix/needs/to/be/defined" >
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:street>123 Happy Go Lucky Ln.</st:street>
    <st:city>Seattle</st:city>
    <st:state>WA</st:state>
    <st:zip>98056</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:street>456 Happy Go Lucky Ln.</st:street>
    <st:city>Orlando</st:city>
    <st:state>FL</st:state>
    <st:zip>43336</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

    ns = 'http://this/prefix/needs/to/be/defined'
    root = ET.fromstring(xml)
    for child_of_root in root :
      print child_of_root.tag, child_of_root.attrib
      print child_of_root [ 0 ].tag, child_of_root [ 0 ].text

    sources = []
    for el in root.findall(QualifiedElement.qualify('source', ns)):
      source = Source(el, ns)
      sources.append(source)

    for source in sources:
      print source.name
      print source.street
      print source.frameSize.horizontal

if __name__=='__main__':
    main()
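
As a complement to the wrapper classes above, the same nested values can also be read straight from ElementTree with a path expression and a prefix-to-URI map. This is only a sketch, not the method used in the accepted answer: the helper name frame_sizes and the trimmed XML (with a second source that deliberately lacks frameSize) are made up for illustration, and the code is written so that it should also run on Python 3.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map; the prefix only has to match the one used in the paths below.
NS = {'st': 'http://this/prefix/needs/to/be/defined'}

# Trimmed copy of the sample document from the question (illustration only).
XML = '''<st:address xmlns:st="http://this/prefix/needs/to/be/defined">
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
  </st:source>
</st:address>'''


def frame_sizes(root):
    """Hypothetical helper: return (num, horizontal, vertical) for each source."""
    rows = []
    for source in root.findall('st:source', NS):
        # findtext returns None when the path matches nothing,
        # so a source without frameSize does not raise.
        horizontal = source.findtext('st:frameSize/st:horizontal', namespaces=NS)
        vertical = source.findtext('st:frameSize/st:vertical', namespaces=NS)
        rows.append((source.get('num'), horizontal, vertical))
    return rows


if __name__ == '__main__':
    for row in frame_sizes(ET.fromstring(XML)):
        print(row)
```

The values come back as strings, so wrap them in int() if numbers are needed. Note also that in the class-based version above, a source without a frameSize child would make get_child_element return None, and the FrameSize constructor would then raise TypeError; a guard before constructing FrameSize would avoid that.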


Author Comment

by:forums_mp
ID: 39640636
clockwatcher, thanks as always. What IDE, if any, do you use for Python? I want something with IntelliSense since I'm still learning, and it would be nice to have insight into the API.
LVL 25

Expert Comment

by:clockwatcher
ID: 39640786
I'm a vim fan, so that's usually what I code with. But I've also used Komodo, Eclipse with the PyDev plugin, and Visual Studio with Python Tools for Visual Studio (PTVS), and any of the three is pretty good. If you're on Windows and are used to Visual Studio, give PTVS a try. If you're on Linux (or if it's a quick and dirty throwaway script), try Komodo. For larger projects (or if you're familiar with Eclipse), the PyDev plugin is pretty good.