Solved

Parse data from webservice data feed to csv

Posted on 2014-12-23
Last Modified: 2014-12-29
Hi,

Below is an example to parse the hourly observation data from the Met Office DataPoint API and turn it into a CSV file. When I run the python module it gives the following error:

Traceback (most recent call last):
  File "C:\Temp\MetOfficehourlyjson.py", line 19, in <module>
    import requests
ImportError: No module named 'requests'

Can I please get some help with the error and syntax?

Thank you

#!/usr/bin/python
 
#-------------------------------------------------------------------------------------------
#Assemble a CSV file from the Hourly Observation Service from the Met Office 
#DataPoint API.
 
#This is a bit of a mess – it was hacked together quickly to give an example of how
#you could go about having a CSV file of observation data. It does an API request,
#parses the data as JSON, then turns that into a set of data model objects, those
#are then flattened and written down as a CSV file.

#Known Issues:
 
#* Some of the data isn't filled in (data date, data type, etc).
#* Error handling just sets the field to blank.
#* There's only real error handling on the observation data.
#-------------------------------------------------------------------------------------------
 
import requests
import json
import csv
 
API_KEY = "???????????????????"
URL = "http://datapoint.metoffice.gov.uk/public/data/val/wxobs/all/json/all?res=hourly&key=%s" % (API_KEY)
OUTPUT_FILE = "C:\Temp\example_data.csv"
 
def remove_key(d, key):
#-------------------------------------------------------------------------------------------
    #Remove a key from a dictionary.
    #Return a copy of the dict.
#-------------------------------------------------------------------------------------------
    r = dict(d)
    del r[key]
    return r
 
def flatten_location(loc):
#-------------------------------------------------------------------------------------------
    #Flatten a Location instance in preparation for writing to a CSV.
#-------------------------------------------------------------------------------------------
    location = loc.__dict__
    location_info = remove_key(location, 'data')
    location_data = []
 
    for data in location['data']:
        flat_data = data.__dict__
        flat = dict(location_info.items() + flat_data.items())
        location_data.append(flat)
 
    return location_data
 
class Location():
#-------------------------------------------------------------------------------------------
    #Data Model for locations returned from DataPoint.
#-------------------------------------------------------------------------------------------
    def __init__(self):
        self.data_date = ""
        self.data_type = ""
        self.location_id = ""
        self.lat = ""
        self.lon = ""
        self.name = ""
        self.country = ""
        self.continent = ""
        self.data = []
 
    def __repr__(self):
        return "<Location: %s>" % (self.name)
 
class Data():
#-------------------------------------------------------------------------------------------
    #Data Model for collected data returned from DataPoint
#-------------------------------------------------------------------------------------------
    def __init__(self):
        self.collection_time = ""
        self.wind_direction = ""
        self.pressure = ""
        self.wind_speed = ""
        self.temperature = ""
        self.visibility = ""
        self.weather_type = ""
        
    def __repr__(self):
        return "<Data: %s>" % (self.collection_time)
 
# fetch data, parse json
r = requests.get(URL)
j = r.json()
 
# get a collection of locations
locations = j['SiteRep']['DV']['Location']
parsed_locations = []
 
for location in locations:
    # turn the locations into Location objects
    loc = Location()
    loc.location_id = location['i']
    loc.lat = location['lat']
    loc.lon = location['lon']
    loc.name = location['name']
    loc.country = location['country']
    loc.continent = location['continent']
 
    # turn the location's data into Data objects
    for period in location['Period']:
        for rep in period['Rep']:
            data = Data()
            data.collection_time = rep['$']
            try:
                data.wind_direction = rep['D']
            except KeyError:
                data.wind_direction = ""
            try:
                data.pressure = rep['P']
            except KeyError:
                data.pressure = ""
            try:
                data.wind_speed = rep['S']
            except KeyError:
                data.wind_speed = ""
            try:
                data.temperature = rep['T']
            except KeyError:
                data.temperature = ""
            try:
                data.visibility = rep['V']
            except KeyError:
                data.visibility = ""
            try:
                data.weather_type = rep['W']
            except KeyError:
                data.weather_type = ""
 
            loc.data.append(data)
 
    # hold onto a flattened version of the location
    parsed_locations.extend(flatten_location(loc))
 
# write the csv
with open(OUTPUT_FILE, "wb") as f:
    writer = csv.DictWriter(f, parsed_locations[0].keys())
    writer.writeheader()
    for d in parsed_locations:
        writer.writerow(d)

Question by:crompnk
4 Comments
 
LVL 25

Accepted Solution

by:clockwatcher
clockwatcher earned 2000 total points
ID: 40514908
You don't appear to have requests installed.  You'll need to install it: http://docs.python-requests.org/en/latest/user/install/#install
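If you have more than one Python installed, it can help to confirm which interpreter is running the script and whether it can actually see the module. Here is a minimal sketch of that check. The helper name `is_installed` is made up for illustration, and the check only handles top-level module names. Packages are typically installed with pip for that same interpreter.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can find the top-level module."""
    # find_spec returns None when no top-level module of that name is found
    return importlib.util.find_spec(module_name) is not None


# Show which Python is running, then whether it can see requests
print("Interpreter:", sys.executable)
print("requests available:", is_installed("requests"))
```

If the second line prints False, requests is not installed for the interpreter shown on the first line.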
 

Author Comment

by:crompnk
ID: 40516393
Hi,

Thanks, I installed requests and the script seemed to run ok but I now have a new error:

Traceback (most recent call last):
  File "C:\Temp\MetOfficehourlyjson.py", line 136, in <module>
    parsed_locations.extend(flatten_location(loc))
  File "C:\Temp\MetOfficehourlyjson.py", line 46, in flatten_location
    flat = dict(location_info.items() + flat_data.items())
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'

I am using Python34.
 
LVL 25

Assisted Solution

by:clockwatcher
clockwatcher earned 2000 total points
ID: 40516595
Your code is written for Python 2.  Under Python 3, dict.items() doesn't return a list but instead returns a view (a dict_items object), and you can't add two views together with +.

If you want list functionality, you need to create a list.  In other words, change
flat = dict(location_info.items() + flat_data.items())

To:
flat = dict(list(location_info.items()) + list(flat_data.items()))
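
For comparison, here is a small self-contained sketch of the failure and a few ways to merge two dicts under Python 3. The sample values are made up and are not real DataPoint output. Note that the `{**a, **b}` form needs Python 3.5 or later, so it would not work on Python 3.4.

```python
# Made-up sample data standing in for the two dicts in flatten_location
location_info = {"name": "Site A", "lat": "51.5"}
flat_data = {"temperature": "7.2", "wind_speed": "9"}

# The Python 2 style fails on Python 3 because items() returns views
try:
    dict(location_info.items() + flat_data.items())
except TypeError as exc:
    print("TypeError:", exc)

# Option 1: convert the views to lists first (works on Python 2 and 3)
flat_a = dict(list(location_info.items()) + list(flat_data.items()))

# Option 2: copy one dict and update it with the other (works everywhere)
flat_b = dict(location_info)
flat_b.update(flat_data)

# Option 3: dict unpacking (Python 3.5+ only)
flat_c = {**location_info, **flat_data}

print(flat_a == flat_b == flat_c)
```

All three options build the same merged dict. If both dicts share a key, the value from the second dict wins in each case.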

 

Author Closing Comment

by:crompnk
ID: 40521470
I also converted the python script from 2 to 3 using 2to3.py
I then got the error "TypeError: 'str' does not support the buffer interface", so I just needed to change the file mode from "wb" to "wt". This causes Python to open the file as a text file and not binary.
The script then worked fine.
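
For anyone finding this later, here is a rough sketch of the CSV-writing step for Python 3. The helper name `write_csv`, the sample rows, and the temp-file path are all just for illustration. The csv module documentation recommends opening the file with newline="" so the writer controls line endings. Without it you can get a blank line after every row on Windows.

```python
import csv
import os
import tempfile


def write_csv(path, rows):
    """Write a list of dicts to path, using the first row's keys as the header."""
    fieldnames = list(rows[0].keys())
    # Python 3: open in text mode ("w" or "wt"), not "wb";
    # newline="" leaves line-ending handling to the csv module
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Made-up rows in the same shape as the flattened location dicts
rows = [
    {"name": "Site A", "temperature": "7.2"},
    {"name": "Site B", "temperature": ""},
]
out_path = os.path.join(tempfile.gettempdir(), "example_data.csv")
write_csv(out_path, rows)
```

In the original script this would correspond to calling write_csv(OUTPUT_FILE, parsed_locations).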