Solved

Parse data from webservice data feed to csv

Posted on 2014-12-23
Last Modified: 2014-12-29
Hi,

Below is an example that parses the hourly observation data from the Met Office DataPoint API and turns it into a CSV file. When I run the Python script, it gives the following error:

Traceback (most recent call last):
  File "C:\Temp\MetOfficehourlyjson.py", line 19, in <module>
    import requests
ImportError: No module named 'requests'

Can I please get some help with the error and syntax?

Thank you

#!/usr/bin/python
 
#-------------------------------------------------------------------------------------------
#Assemble a CSV file from the Hourly Observation Service from the Met Office 
#DataPoint API.
 
#This is a bit of a mess – it was hacked together quickly to give an example of how
#you could go about having a CSV file of observation data. It does an API request,
#parses the data as JSON, then turns that into a set of data model objects, which
#are then flattened and written out as a CSV file.

#Known Issues:
 
#* Some of the data isn't filled in (data date, data type, etc).
#* Error handling just sets the field to blank.
#* There's only real error handling on the observation data.
#-------------------------------------------------------------------------------------------
 
import requests
import json
import csv
 
API_KEY = "???????????????????"
URL = "http://datapoint.metoffice.gov.uk/public/data/val/wxobs/all/json/all?res=hourly&key=%s" % (API_KEY)
OUTPUT_FILE = "C:\Temp\example_data.csv"
 
def remove_key(d, key):
#-------------------------------------------------------------------------------------------
    #Remove a key from a dictionary.
    #Return a copy of the dict.
#-------------------------------------------------------------------------------------------
    r = dict(d)
    del r[key]
    return r
 
def flatten_location(loc):
#-------------------------------------------------------------------------------------------
    #Flatten a Location instance in preparation for writing to a CSV.
#-------------------------------------------------------------------------------------------
    location = loc.__dict__
    location_info = remove_key(location, 'data')
    location_data = []
 
    for data in location['data']:
        flat_data = data.__dict__
        flat = dict(location_info.items() + flat_data.items())
        location_data.append(flat)
 
    return location_data
 
class Location():
#-------------------------------------------------------------------------------------------
    #Data Model for locations returned from DataPoint.
#-------------------------------------------------------------------------------------------
    def __init__(self):
        self.data_date = ""
        self.data_type = ""
        self.location_id = ""
        self.lat = ""
        self.lon = ""
        self.name = ""
        self.country = ""
        self.continent = ""
        self.data = []
 
    def __repr__(self):
        return "<Location: %s>" % (self.name)
 
class Data():
#-------------------------------------------------------------------------------------------
    #Data Model for collected data returned from DataPoint
#-------------------------------------------------------------------------------------------
    def __init__(self):
        self.collection_time = ""
        self.wind_direction = ""
        self.pressure = ""
        self.wind_speed = ""
        self.temperature = ""
        self.visibility = ""
        self.weather_type = ""
        
    def __repr__(self):
        return "<Data: %s>" % (self.collection_time)
 
# fetch data, parse json
r = requests.get(URL)
j = r.json()
 
# get a collection of locations
locations = j['SiteRep']['DV']['Location']
parsed_locations = []
 
for location in locations:
    # turn the locations into Location objects
    loc = Location()
    loc.location_id = location['i']
    loc.lat = location['lat']
    loc.lon = location['lon']
    loc.name = location['name']
    loc.country = location['country']
    loc.continent = location['continent']
 
    # turn the location's data into Data objects
    for period in location['Period']:
        for rep in period['Rep']:
            data = Data()
            data.collection_time = rep['$']
            try:
                data.wind_direction = rep['D']
            except KeyError:
                data.wind_direction = ""
            try:
                data.pressure = rep['P']
            except KeyError:
                data.pressure = ""
            try:
                data.wind_speed = rep['S']
            except KeyError:
                data.wind_speed = ""
            try:
                data.temperature = rep['T']
            except KeyError:
                data.temperature = ""
            try:
                data.visibility = rep['V']
            except KeyError:
                data.visibility = ""
            try:
                data.weather_type = rep['W']
            except KeyError:
                data.weather_type = ""
 
            loc.data.append(data)
 
    # hold onto a flattened version of the location
    parsed_locations.extend(flatten_location(loc))
 
# write the csv
with open(OUTPUT_FILE, "wb") as f:
    writer = csv.DictWriter(f, parsed_locations[0].keys())
    writer.writeheader()
    for d in parsed_locations:
        writer.writerow(d)

Question by:crompnk
4 Comments
 

Accepted Solution

by:
clockwatcher earned 500 total points
ID: 40514908
You don't appear to have requests installed.  You'll need to install it: http://docs.python-requests.org/en/latest/user/install/#install
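Before reinstalling, it can help to confirm that the interpreter running the script really cannot see the package. This is a minimal sketch, assuming Python 3.4 or later; `is_installed` is a hypothetical helper name, not part of requests or the standard library, and the install command in the message may need adjusting for your setup:

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Show which interpreter is running; the package must be installed for this one.
    print("Interpreter:", sys.executable)
    if is_installed("requests"):
        print("requests is available")
    else:
        print("requests is missing; try: python -m pip install requests")
```

If several Python versions are installed, run the check with the same interpreter that runs the script, since each one has its own set of installed packages.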
 

Author Comment

by:crompnk
ID: 40516393
Hi,

Thanks, I installed requests and the script seemed to run ok but I now have a new error:

Traceback (most recent call last):
  File "C:\Temp\MetOfficehourlyjson.py", line 136, in <module>
    parsed_locations.extend(flatten_location(loc))
  File "C:\Temp\MetOfficehourlyjson.py", line 46, in flatten_location
    flat = dict(location_info.items() + flat_data.items())
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'

I am using Python 3.4.

Assisted Solution

by:clockwatcher
clockwatcher earned 500 total points
ID: 40516595
Your code is written for Python 2.  Under Python 3, dict.items() doesn't return a list but instead returns a view object (of type dict_items), and you can't add two views together with +.

If you want list functionality, you need to create a list.  In other words, change
flat = dict(location_info.items() + flat_data.items())


To:
flat = dict(list(location_info.items()) + list(flat_data.items()))
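
For reference, the same merge can be written without building intermediate lists. The sketch below uses made-up sample dictionaries (the values are placeholders, not real DataPoint output), and `merge_dicts` is a hypothetical helper. The copy-and-update form should also run unchanged on Python 2:

```python
def merge_dicts(first, second):
    """Return a new dict with the keys of both; second wins on duplicate keys."""
    merged = dict(first)   # shallow copy so the inputs are left untouched
    merged.update(second)
    return merged


# Placeholder values for illustration only.
location_info = {"name": "Example Site", "lat": "50.0"}
flat_data = {"temperature": "7.1", "pressure": "1012"}

# The fix from the answer: turn each view into a list before concatenating.
flat_a = dict(list(location_info.items()) + list(flat_data.items()))

# Same result via copy-and-update.
flat_b = merge_dicts(location_info, flat_data)

print(flat_a == flat_b)
```

Both forms give the later dictionary priority when a key appears in both, which matches the behaviour of the original Python 2 line.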

 

Author Closing Comment

by:crompnk
ID: 40521470
I also converted the Python script from 2 to 3 using 2to3.py.
I then got the error "TypeError: 'str' does not support the buffer interface", so I changed the file mode from "wb" to "wt". This makes Python open the file as text rather than binary.
The script then worked fine.
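
A sketch of the write step under Python 3, assuming the rows are already flat dictionaries as in the script above. The csv module documentation recommends opening the file in text mode with newline="" so that the writer controls line endings (otherwise extra blank rows can appear on Windows). `write_rows`, the sample rows, and the output file name are hypothetical:

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write a list of flat dicts to path as CSV, using the first row's keys as the header."""
    # Text mode with newline="" lets the csv module manage line endings itself.
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Placeholder rows for illustration only.
sample_rows = [
    {"name": "Example Site", "temperature": "7.1"},
    {"name": "Another Site", "temperature": ""},
]
sample_path = os.path.join(tempfile.gettempdir(), "example_data_sketch.csv")
write_rows(sample_path, sample_rows)
```

Like the original script, this takes the header from the first row, so it assumes every row has the same keys and that the list is not empty.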