Solved

[Python] Extracting elements with ElementTree

Posted on 2013-11-10
Last Modified: 2013-11-23
How do I extract the horizontal and vertical elements (nested under frameSize) from the XML below? The code:

import xml.etree.ElementTree as ET

class InvalidTagError(Exception):
  def __init__(self, element, expected):
    self.element = element
    self.expected = expected
  def __str__(self):
    return "'{0}' is not an instance of '{1}'".format(self.element, self.expected)

class QualifiedElement(object):

  @classmethod
  def qualify(cls, tag, ns):
    # Build the "{namespace}tag" form that ElementTree uses; skip it when ns is None or ''
    if ns:
      tag = "{{{ns}}}{tag}".format(ns=ns, tag=tag)
    return tag

  def fqn(self, tag, ns=None):
    if ns is None:
      ns = self.ns
    return QualifiedElement.qualify(tag, ns)

  def get_child_text(self, tag, ns=None):
    '''returns the text for the child tag or None if the tag isn't found'''
    child = self.element.find(self.fqn(tag, ns))
    if child is None:
      return None
    return child.text

  def __init__(self, el, ns=None):        
    if not isinstance(el, ET.Element):
      raise TypeError()
    self.ns = ns
    self.element = el


class Source( QualifiedElement ):

  def __init__(self, el, ns=None):
    QualifiedElement.__init__(self, el, ns)  
    if el.tag != self.fqn('source'):
      raise InvalidTagError(el, self.fqn('source'))

    self.num = el.get('num')  # example of attribute access
    self.name = self.get_child_text('name')
    self.street = self.get_child_text('street')
    self.city = self.get_child_text('city')
    self.zip = self.get_child_text('zip')
    #self.frameSize.horizontal = self.get_child_text('zip')
       
def main():

    xml='''<?xml version="1.0"?>
<!--
    Document   test.xml
    Created on :
    Author     : Jane Doe
    Description: XML Definition for address 
-->
<st:address xmlns:st="http://this/prefix/needs/to/be/defined" >
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:street>123 Happy Go Lucky Ln.</st:street>
    <st:city>Seattle</st:city>
    <st:state>WA</st:state>
    <st:zip>98056</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:street>456 Happy Go Lucky Ln.</st:street>
    <st:city>Orlando</st:city>
    <st:state>FL</st:state>
    <st:zip>43336</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

    ns = 'http://this/prefix/needs/to/be/defined'
    root = ET.fromstring(xml)
    for child_of_root in root :
      print child_of_root.tag, child_of_root.attrib
      print child_of_root [ 0 ].tag, child_of_root [ 0 ].text

    sources = []
    for el in root.findall(QualifiedElement.qualify('source', ns)):
      source = Source(el, ns)
      sources.append(source)

    for source in sources:
      print source.name
      print source.street
       # print source.frameSize.horizontal

if __name__=='__main__':
    main() 

Question by:forums_mp

Accepted Solution

by:
clockwatcher earned 80 total points
ID: 39640536
Here you go:

import xml.etree.ElementTree as ET

class InvalidTagError(Exception):
  def __init__(self, element, expected):
    self.element = element
    self.expected = expected
  def __str__(self):
    return "'{0}' is not an instance of '{1}'".format(self.element, self.expected)

class QualifiedElement(object):

  @classmethod
  def qualify(cls, tag, ns):
    # Build the "{namespace}tag" form that ElementTree uses; skip it when ns is None or ''
    if ns:
      tag = "{{{ns}}}{tag}".format(ns=ns, tag=tag)
    return tag

  def fqn(self, tag, ns=None):
    if ns is None:
      ns = self.ns
    return QualifiedElement.qualify(tag, ns)

  def get_child_element(self, tag, ns=None):
    '''returns the child element or None if the element isn't found'''
    # find() already returns None when there is no matching child
    return self.element.find(self.fqn(tag, ns))

  def get_child_text(self, tag, ns=None):
    '''returns the text for the child element or None'''
    child = self.get_child_element(tag, ns)
    if child is None:
      return None
    return child.text

  def __init__(self, el, ns=None):        
    if not isinstance(el, ET.Element):
      raise TypeError()
    self.ns = ns
    self.element = el


class Source( QualifiedElement ):

  def __init__(self, el, ns=None):
    super(Source, self).__init__(el, ns)  
    if el.tag != self.fqn('source'):
      raise InvalidTagError(el, self.fqn('source'))

    self.num = el.get('num')  # example of attribute access
    self.name = self.get_child_text('name')
    self.street = self.get_child_text('street')
    self.city = self.get_child_text('city')
    self.zip = self.get_child_text('zip')
    self.frameSize = FrameSize(self.get_child_element('frameSize'), ns)

class FrameSize( QualifiedElement ):
    def __init__(self, el, ns=None):
        super(FrameSize, self).__init__(el, ns) 
        if el.tag != self.fqn('frameSize'):
            raise InvalidTagError(el, self.fqn('frameSize'))
        self.horizontal = self.get_child_text('horizontal')
        self.vertical = self.get_child_text('vertical')

       
def main():

    xml='''<?xml version="1.0"?>
<!--
    Document   test.xml
    Created on :
    Author     : Jane Doe
    Description: XML Definition for address 
-->
<st:address xmlns:st="http://this/prefix/needs/to/be/defined" >
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:street>123 Happy Go Lucky Ln.</st:street>
    <st:city>Seattle</st:city>
    <st:state>WA</st:state>
    <st:zip>98056</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:street>456 Happy Go Lucky Ln.</st:street>
    <st:city>Orlando</st:city>
    <st:state>FL</st:state>
    <st:zip>43336</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

    ns = 'http://this/prefix/needs/to/be/defined'
    root = ET.fromstring(xml)
    for child_of_root in root :
      print child_of_root.tag, child_of_root.attrib
      print child_of_root [ 0 ].tag, child_of_root [ 0 ].text

    sources = []
    for el in root.findall(QualifiedElement.qualify('source', ns)):
      source = Source(el, ns)
      sources.append(source)

    for source in sources:
      print source.name
      print source.street
      print source.frameSize.horizontal

if __name__=='__main__':
    main()
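
For comparison, here is a shorter sketch of the same lookup that skips the wrapper classes and passes a prefix map to ElementTree's own find methods. It is not part of the accepted answer. It assumes Python 3 (where the namespaces argument of findall and findtext is documented), the helper name frame_size is made up for illustration, and the XML is a trimmed copy of the sample in which the second source has no frameSize, to show what a missing element gives you.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical variant of the sample document: the second
# source deliberately has no frameSize element.
XML = """<st:address xmlns:st="http://this/prefix/needs/to/be/defined">
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
  </st:source>
</st:address>"""

# Prefix-to-URI map. The prefix only has to match the paths used below;
# it does not have to be the prefix used inside the document.
NS = {'st': 'http://this/prefix/needs/to/be/defined'}


def frame_size(source):
    """Return (horizontal, vertical) as ints, with None for a missing value."""
    def as_int(path):
        # findtext returns None when the path matches nothing
        text = source.findtext(path, namespaces=NS)
        return int(text) if text else None
    return (as_int('st:frameSize/st:horizontal'),
            as_int('st:frameSize/st:vertical'))


root = ET.fromstring(XML)
sizes = {}
for source in root.findall('st:source', NS):
    sizes[source.get('num')] = frame_size(source)
    print(source.get('num'), sizes[source.get('num')])
```

The class-based version above is still the better fit if you want attribute-style access such as source.frameSize.horizontal; the path-based lookup is mainly handy for one-off extraction.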


Author Comment

by:forums_mp
ID: 39640636
clockwatcher, thanks as always. What IDE, if any, do you use for Python? I'm still learning, so I want something with IntelliSense-style completion that gives me insight into the API.

Expert Comment

by:clockwatcher
ID: 39640786
I'm a vim fan, so that's usually what I'm coding with. But I've also used Komodo, Eclipse with the PyDev plugin, and Visual Studio with Python Tools for Visual Studio (PTVS), and any of the three is pretty good. If you're on Windows and are used to Visual Studio, give PTVS a try. If you're on Linux (or it's a quick and dirty throwaway script), try Komodo. For larger projects (or if you're familiar with Eclipse), the PyDev plugin is a good choice.