Solved

[Python] Extracting elements with ElementTree

Posted on 2013-11-10
Last Modified: 2013-11-23
How do I extract the horizontal and vertical elements (the children of frameSize) from the XML below? The code so far:

import xml.etree.ElementTree as ET

class InvalidTagError(Exception):
  def __init__(self, element, expected):
    self.element = element
    self.expected = expected
  def __str__(self):
    return "'{0}' is not an instance of '{1}'".format(self.element, self.expected)

class QualifiedElement(object):

  @classmethod
  def qualify(cls, tag, ns):
    # Prefix the tag with its namespace in ElementTree's {uri}tag form
    if ns:
      tag = "{{{ns}}}{tag}".format(ns=ns, tag=tag)
    return tag

  def fqn(self, tag, ns=None):
    if ns is None:
      ns = self.ns
    return QualifiedElement.qualify(tag, ns)

  def get_child_text(self, tag, ns=None):
    '''returns the text for the child tag or None if the tag isn't found'''      
    child = self.element.find(self.fqn(tag, ns))
    if child is None:
      return None
    else:
      return child.text

  def __init__(self, el, ns=None):        
    if not isinstance(el, ET.Element):
      raise TypeError()
    self.ns = ns
    self.element = el


class Source( QualifiedElement ):

  def __init__(self, el, ns=None):
    QualifiedElement.__init__(self, el, ns)  
    if el.tag != self.fqn('source'):
      raise InvalidTagError(el, self.fqn('source'))

    self.num = el.get('num')  # example of attribute access
    self.name = self.get_child_text('name')
    self.street = self.get_child_text('street')
    self.city = self.get_child_text('city')
    self.zip = self.get_child_text('zip')
    #self.frameSize.horizontal = self.get_child_text('zip')
       
def main():

    xml='''<?xml version="1.0"?>
<!--
    Document   test.xml
    Created on :
    Author     : Jane Doe
    Description: XML Definition for address 
-->
<st:address xmlns:st="http://this/prefix/needs/to/be/defined" >
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:street>123 Happy Go Lucky Ln.</st:street>
    <st:city>Seattle</st:city>
    <st:state>WA</st:state>
    <st:zip>98056</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:street>456 Happy Go Lucky Ln.</st:street>
    <st:city>Orlando</st:city>
    <st:state>FL</st:state>
    <st:zip>43336</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

    ns = 'http://this/prefix/needs/to/be/defined'
    root = ET.fromstring(xml)
    for child_of_root in root :
      print child_of_root.tag, child_of_root.attrib
      print child_of_root [ 0 ].tag, child_of_root [ 0 ].text

    sources = []
    for el in root.findall(QualifiedElement.qualify('source', ns)):
      source = Source(el, ns)
      sources.append(source)

    for source in sources:
      print source.name
      print source.street
       # print source.frameSize.horizontal

if __name__=='__main__':
    main() 

Question by:forums_mp

Accepted Solution

by: clockwatcher (earned 80 total points)
Here you go:

import xml.etree.ElementTree as ET

class InvalidTagError(Exception):
  def __init__(self, element, expected):
    self.element = element
    self.expected = expected
  def __str__(self):
    return "'{0}' is not an instance of '{1}'".format(self.element, self.expected)

class QualifiedElement(object):

  @classmethod
  def qualify(cls, tag, ns):
    # Prefix the tag with its namespace in ElementTree's {uri}tag form
    if ns:
      tag = "{{{ns}}}{tag}".format(ns=ns, tag=tag)
    return tag

  def fqn(self, tag, ns=None):
    if ns is None:
      ns = self.ns
    return QualifiedElement.qualify(tag, ns)

  def get_child_element(self, tag, ns=None):
    '''returns the child element or None if the element isn't found'''
    return self.element.find(self.fqn(tag, ns))

  def get_child_text(self, tag, ns=None):
    '''returns the text for the child element or None'''
    child = self.get_child_element(tag, ns)
    if child is None:
      return None
    else:
      return child.text

  def __init__(self, el, ns=None):        
    if not isinstance(el, ET.Element):
      raise TypeError()
    self.ns = ns
    self.element = el


class Source( QualifiedElement ):

  def __init__(self, el, ns=None):
    super(Source, self).__init__(el, ns)  
    if el.tag != self.fqn('source'):
      raise InvalidTagError(el, self.fqn('source'))

    self.num = el.get('num')  # example of attribute access
    self.name = self.get_child_text('name')
    self.street = self.get_child_text('street')
    self.city = self.get_child_text('city')
    self.zip = self.get_child_text('zip')
    self.frameSize = FrameSize(self.get_child_element('frameSize'), ns)

class FrameSize( QualifiedElement ):

  def __init__(self, el, ns=None):
    super(FrameSize, self).__init__(el, ns)
    if el.tag != self.fqn('frameSize'):
      raise InvalidTagError(el, self.fqn('frameSize'))
    self.horizontal = self.get_child_text('horizontal')
    self.vertical = self.get_child_text('vertical')

       
def main():

    xml='''<?xml version="1.0"?>
<!--
    Document   test.xml
    Created on :
    Author     : Jane Doe
    Description: XML Definition for address 
-->
<st:address xmlns:st="http://this/prefix/needs/to/be/defined" >
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:street>123 Happy Go Lucky Ln.</st:street>
    <st:city>Seattle</st:city>
    <st:state>WA</st:state>
    <st:zip>98056</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:street>456 Happy Go Lucky Ln.</st:street>
    <st:city>Orlando</st:city>
    <st:state>FL</st:state>
    <st:zip>43336</st:zip>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

    ns = 'http://this/prefix/needs/to/be/defined'
    root = ET.fromstring(xml)
    for child_of_root in root :
      print child_of_root.tag, child_of_root.attrib
      print child_of_root [ 0 ].tag, child_of_root [ 0 ].text

    sources = []
    for el in root.findall(QualifiedElement.qualify('source', ns)):
      source = Source(el, ns)
      sources.append(source)

    for source in sources:
      print source.name
      print source.street
      print source.frameSize.horizontal

if __name__=='__main__':
    main()
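
If the wrapper classes are more than you need, the same values can probably be read directly with a namespace prefix map. This is only a minimal sketch, not part of the solution above: the names XML, NS and frame_sizes are made up for illustration, the XML is a trimmed copy of the sample, and it assumes a Python version whose ElementTree accepts the namespaces argument on findall and findtext (documented for Python 3).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document (only the parts needed here).
XML = '''<st:address xmlns:st="http://this/prefix/needs/to/be/defined">
  <st:source num="1">
    <st:name>Bubba McBubba</st:name>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
  <st:source num="2">
    <st:name>McBubba</st:name>
    <st:frameSize>
      <st:horizontal>640</st:horizontal>
      <st:vertical>480</st:vertical>
    </st:frameSize>
  </st:source>
</st:address>'''

# Maps the 'st' prefix used in the search paths to the namespace URI.
NS = {'st': 'http://this/prefix/needs/to/be/defined'}

def frame_sizes(xml_text):
    '''returns a list of (num, horizontal, vertical) tuples, one per source'''
    root = ET.fromstring(xml_text)
    sizes = []
    for source in root.findall('st:source', NS):
        # findtext follows the path and returns the element text,
        # or None when the element is not found
        horizontal = source.findtext('st:frameSize/st:horizontal', namespaces=NS)
        vertical = source.findtext('st:frameSize/st:vertical', namespaces=NS)
        sizes.append((source.get('num'), horizontal, vertical))
    return sizes
```

Calling frame_sizes(XML) gives one tuple per source element. The values come back as strings, so wrap them in int() if you need numbers.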

Author Comment

by:forums_mp
clockwatcher, thanks as always. What IDE, if any, do you use for Python? I want something with IntelliSense, since I'm still learning and it would be nice to have insight into the API.

Expert Comment

by:clockwatcher
I'm a vim fan, so that's usually what I'm coding with. But I've used Komodo, Eclipse with the PyDev plugin, and Visual Studio with Python Tools (PTVS), and any of the three is pretty good. If you're on Windows and are used to Visual Studio, give Python Tools for Visual Studio a try. If you're on Linux (or if it's a quick and dirty throwaway script), try Komodo. For larger projects (or if you're familiar with Eclipse), the PyDev plugin is pretty good.