Python ElementTree xml output to csv

I have the following XML file ('registerreads_EE.xml'):

<?xml version="1.0" encoding="us-ascii" standalone="yes"?>
<ReadingDocument xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ReadingStatusRefTable>
    <ReadingStatusRef Ref="1">
      <UnencodedStatus SourceValidation="SIMPLE">
        <StatusCodes>
          <Signal>XX</Signal>
        </StatusCodes>
      </UnencodedStatus>
    </ReadingStatusRef>
  </ReadingStatusRefTable>
  <Header>
    <IEE_System Id="XXXXXXXXXXXXXXX" />
    <Creation_Datetime Datetime="2015-10-22T09:05:32Z" />
    <Timezone Id="UTC" />
    <Path FilePath="X:\XXXXXXXXXXXX.xml" />
    <Export_Template Id="XXXXX" />
    <CorrelationID Id="" />
  </Header>
  <ImportExportParameters ResubmitFile="false" CreateGroup="true">
    <DataFormat TimestampType="XXXXXX" Type="XXXX" />
  </ImportExportParameters>
  <Channels>
    <Channel StartDate="2015-10-21T00:00:00-05:00" EndDate="2015-10-22T00:00:00-05:00">
      <ChannelID ServicePointChannelID="73825603:301" />
      <Readings>
        <Reading Value="3577.0" ReadingTime="2015-10-21T00:00:00-05:00" StatusRef="1" />
        <Reading Value="3601.3" ReadingTime="2015-10-22T00:00:00-05:00" StatusRef="1" />
      </Readings>
      <ExportRequest RequestID="152" EntityType="ServicePoint" EntityID="73825603" RequestSource="Scheduled" />
    </Channel>
    <Channel StartDate="2015-10-21T00:00:00-05:00" EndDate="2015-10-22T00:00:00-05:00">
      <ChannelID ServicePointChannelID="73825604:301" />
      <Readings>
        <Reading Value="3462.5" ReadingTime="2015-10-21T00:00:00-05:00" StatusRef="1" />
        <Reading Value="3501.5" ReadingTime="2015-10-22T00:00:00-05:00" StatusRef="1" />
      </Readings>
      <ExportRequest RequestID="152" EntityType="ServicePoint" EntityID="73825604" RequestSource="Scheduled" />
    </Channel>
  </Channels>
</ReadingDocument>

I want to parse the XML of the channel data into a csv file.

He is what I have written in Python 2.7.10:

import xml.etree.ElementTree as ET

tree = ET.parse('registerreads_EE.xml')

root = tree.getroot()[3]

for channel in tree.iter('Channel'):
    for exportrequest in channel.iter('ExportRequest'):
        entityid = exportrequest.attrib.get('EntityID')
        for meterread in channel.iter('Reading'):
            read = meterread.attrib.get('Value')
            date = meterread.attrib.get('ReadingTime')
            print read[:-2],",",date[:10],",",entityid
           
tree.write(open('registerreads_EE.csv','w'))

Here is the screen output when the above is run:

3577 , 2015-10-21 , 73825603
3601 , 2015-10-22 , 73825603
3462 , 2015-10-21 , 73825604
3501 , 2015-10-22 , 73825604

The 'registerreads.csv' output file is like the original XML file, minus the first line.

I would like the printed output above outputted to a csv file with headers of read, date, entityid.

I am having difficulty with this.  This is my first python program.  Any help is appreciated.
Scott AbrahamTechnical Systems ManagerAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

peprCommented:
Here is your modified code:
#!python2

import xml.etree.ElementTree as ET
import csv

tree = ET.parse('registerreads_EE.xml')
root = tree.getroot()[3]

with open('registerreads_EE.csv', 'wb') as fout:
    writer = csv.writer(fout)
    row = ['read', 'date', 'entityid']
    writer.writerow(row)
    for channel in tree.iter('Channel'):
        for exportrequest in channel.iter('ExportRequest'):
            entityid = exportrequest.attrib.get('EntityID')
            for meterread in channel.iter('Reading'):
                read = meterread.attrib.get('Value')
                date = meterread.attrib.get('ReadingTime')
                row = [read[:-2], date[:10], entityid]
                print ', '.join(row)
                writer.writerow(row)

Open in new window

Notice that the file must be opened in binary mode (that holds for Python 2 but not for Python 3). The csv.writer() wraps the writing process. The row is a list of items that will be written as a row of the csv file. Notice the row filled in the inner loop. Notice how the values can be joined for printing. The same data is then output to the csv file. The with construct ensures the automatic closing of the file. The very first row with the header texts is written the same way before the first loop.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Scott AbrahamTechnical Systems ManagerAuthor Commented:
Thank you, worked awesome!  This will come in handy for other items as well.
peprCommented:
I suggest to modify further the extraction of the data using the simple XPath expressions:
#!python2

import xml.etree.ElementTree as ET
import csv

tree = ET.parse('registerreads_EE.xml')
root = tree.getroot()

with open('registerreads_EE.csv', 'wb') as fout:
    writer = csv.writer(fout)
    row = ['read', 'date', 'entityid']
    writer.writerow(row)
    for channel in root.findall('.//Channel'):
        exportrequest = channel.find('ExportRequest')
        entityid = exportrequest.attrib.get('EntityID')
        for reading in channel.findall('Readings/Reading'):
            value = reading.attrib.get('Value')
            reading_time = reading.attrib.get('ReadingTime')
            row = [value[:-2], reading_time[:10], entityid]
            print ', '.join(row)
            writer.writerow(row)

Open in new window

If you name it root it should not be the fourth element from the XML structure. The expression './/Channel' means: from this element ('.'), wherever in the structure ('//'), the Channel element. You can also be more explicit and write 'Channels/Channel' instead. You can also write './Channels/Channel' (with a single slash after the dot). It means that the Channel elements are searched only as direct children of the Channels element, and that one has to be the direct descendant of the root element.

So, find all Channel elements. Inside of each, extract the info from the ExportRequest element (no loop needed here). And inside the Channel, find all 'Readings/Reading' elements (explicit XPath again), get their attributes, form the row, and write it to the screen and to the csv file.

reading.attrib['Value'] can also be used instead of reading.attrib.get('Value'), but the later is better if the attribute may be missing (None is returned by .get() in the case).
Scott AbrahamTechnical Systems ManagerAuthor Commented:
Out of curiosity, if I wanted to open the xml file from c:\export and save it to i:\import, how would I do that?  Right now I just have the script and the files residing under the scripts directory c:\python27\scripts
peprCommented:
You just pass the file name with the path. The bare file name is a relative path to the file at the current working directory. (That is, the 'registerreads_EE.xml' is equal to './registerreads_EE.xml'.) The recommended approach is to store the file name to the variable, and use the variable when opening the file or for reading or for writing.

Just be careful with backslashes. Or they have to be doubled (error prone), or you can use the normal slash (as in Unix -- recommended), or use a raw string literal with backslashes (error prone), or use the os.path.join() function to put the parts of the path together (also recommended). However, you have to use at least one slash after the disk name (if I recall correctly):
    ...
    fname_in = 'i:/import/registerreads_EE.xml'
    fname_out = os.path.join('i:/', 'export', 'anothersubdir', 'registerreads_EE.csv')
    fname = r'x:\path\as\raw\string\literal.txt'

Open in new window

It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Python

From novice to tech pro — start learning today.