Solved

How do I scrape data from exchange rate sites and store it in a DB?

Posted on 2010-11-08
Last Modified: 2012-05-10
Hi,
I currently have some broken bones and am struggling to type, so sorry if my spelling and grammar make no sense.

I have about a week for this, along with some other work, which won't be easy given my current difficulty typing. I am hoping someone can guide me in the right direction so this is as painless as possible and I can learn as I go.

Basically, I need help creating OO PHP script(s) that can be used to retrieve current exchange rates for at least 160 different currencies, from different sources as required. I need to either scrape these sources for rates and store them in a MySQL DB, or use a PHP script to call a conversion CGI (e.g. http://www.xe.com/ucc/convert.cgi?) and scrape the data returned.

I also need to store related ISO 4217 data for each currency (http://www.iso.org/iso/support/currency_codes_list-1.htm)

Using the stored data I can implement my own conversion tool for exchange rates.
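
A minimal sketch of that conversion step, assuming the stored rates are all quoted against one common base currency. The table contents, the choice of USD as base and the function name convert_amount are hypothetical, and the figures are invented for illustration:

```php
<?php
// Hypothetical rate table: units of each currency per 1 unit of the base (USD here).
// The figures are made up for illustration; they are not real market rates.
$ratesPerBase = array(
    'USD' => 1.0,
    'GBP' => 0.5,
    'JPY' => 100.0,
);

/**
 * Convert an amount between two currencies via the common base.
 * Returns array('rate' => ..., 'amount' => ...) or FALSE if a code is unknown.
 */
function convert_amount(array $rates, $amount, $from, $to)
{
    if (!isset($rates[$from]) || !isset($rates[$to])) {
        return FALSE;
    }
    // Cross rate: how many units of $to one unit of $from buys.
    $rate = $rates[$to] / $rates[$from];
    return array('rate' => $rate, 'amount' => round($amount * $rate, 2));
}

$result = convert_amount($ratesPerBase, 10, 'GBP', 'JPY');
```

Storing every rate against a single base keeps the table at one row per currency, and any pair can still be derived by dividing two stored rates.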


The URL for the conversion site I am building is specified as http://thisite.com/conv?amnt=17.65&from=GBP&to=JPY, which will then return XML structured like so:
<conv>
    <at>8 November 2010 13:44</at>
    <rate>180.64215</rate>
    <from>
        <code>GBP</code>
        <curr>Pound Sterling</curr>
        <loc>United Kingdom, Crown Dependencies (the Isle of Man and the Channel Islands), certain British Overseas Territories (South Georgia and the South Sandwich Islands, British Antarctic Territory and British Indian Ocean Territory)</loc>
        <amnt>17.65</amnt>
    </from>
    <to>
        <code>JPY</code>
        <curr>Japanese Yen</curr>
        <loc>Japan</loc>
        <amnt>1599.47</amnt>
    </to>
</conv>

or, for error handling, like so:
<?xml version="1.0" encoding="UTF-8"?>
<conv>
    <error code='nnnn'>error message</error>
</conv>
Some error codes would be:

1000      URL not recognized
1100      Required parameter is missing
1200      Parameter not recognized
2000      Currency type not recognized
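
As a rough sketch, the error document could be produced with PHP's DOMDocument class. The function name build_error_xml and the inline $errorMessages array are hypothetical; in the real application the messages would presumably come from the config file:

```php
<?php
// Error codes and messages as listed above. In the real application these
// would be loaded from the config file (assumption for this sketch).
$errorMessages = array(
    1000 => 'URL not recognized',
    1100 => 'Required parameter is missing',
    1200 => 'Parameter not recognized',
    2000 => 'Currency type not recognized',
);

/** Build the <conv><error code="nnnn">message</error></conv> document as a string. */
function build_error_xml(array $messages, $code)
{
    $doc  = new DOMDocument('1.0', 'UTF-8');
    $conv = $doc->appendChild($doc->createElement('conv'));

    // Fall back to a generic message if the code is not in the table.
    $message = isset($messages[$code]) ? $messages[$code] : 'Unknown error';

    $error = $conv->appendChild($doc->createElement('error'));
    $error->appendChild($doc->createTextNode($message)); // text node escapes &, < and >
    $error->setAttribute('code', (string) $code);

    return $doc->saveXML();
}

$xml = build_error_xml($errorMessages, 1100);
```

The successful response could be built the same way, with one createElement call per tag.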

I also need to be able to re-scrape the data every 24 hours to update the exchange rates. I am not sure if there is some sort of caching method to do this.

Base paths, DB information and error message data need to be placed in a config file.

Some possible resources I may use if I can get some idea of how to do this:
http://www.bloomberg.com/js/calculators/currdata.js
http://www.xe.com/ucc/convert.cgi?Amount=100&From=USD&To=ALL
themoneyconverter.com/GBP/rss.xml

I think that is it. To summarize, if someone is kind enough to guide me in the right direction to understanding this, it would be brilliant.

I am unsure how to retrieve the data to scrape and how to store what I need in the DB.

The URL http://thisite.com/conv? has no file extension after conv, which has confused me. Would this be done with base paths in the config file, or another way?
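
An extensionless URL like this is commonly handled by web server URL rewriting (for example Apache's mod_rewrite) that maps /conv to a PHP script. Once the request reaches the script, the query parameters can be checked. A minimal sketch, assuming the three parameter names from the URL above; the function name validate_conv_params is hypothetical:

```php
<?php
/**
 * Check the query parameters of a /conv request.
 * Returns 0 when the request looks valid, otherwise one of the error codes
 * from the question: 1100 (required parameter missing) or 1200 (unknown parameter).
 */
function validate_conv_params(array $query)
{
    $required = array('amnt', 'from', 'to');

    // Every required parameter must be present and non-empty.
    foreach ($required as $name) {
        if (!isset($query[$name]) || $query[$name] === '') {
            return 1100;
        }
    }

    // Anything beyond the known parameters is rejected.
    foreach (array_keys($query) as $name) {
        if (!in_array($name, $required, true)) {
            return 1200;
        }
    }

    return 0;
}

// parse_str() turns a raw query string into an array, much like $_GET.
parse_str('amnt=17.65&from=GBP&to=JPY', $good);
parse_str('amnt=17.65&from=GBP', $missing);
parse_str('amnt=17.65&from=GBP&to=JPY&foo=1', $extra);
```

In a live script the array passed in would simply be $_GET.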

For errors, I am unsure how to handle them. Would a mistaken query return an error code, which I would then look up among the codes and messages stored in the config file to output the error and explanation on screen?

How would I automate a caching method to check/update the DB with current rates every 24 hours?
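
One possible approach, sketched under the assumption that a last-updated Unix timestamp is stored alongside the rates: compare it with the current time and re-scrape only when it is older than 24 hours. The function name rates_are_stale is hypothetical. A scheduled job (cron) that runs the scraper once a day is the usual alternative.

```php
<?php
/**
 * Decide whether stored rates are older than the allowed age.
 * $lastUpdated and $now are Unix timestamps; $maxAge is in seconds (24 hours by default).
 */
function rates_are_stale($lastUpdated, $now, $maxAge = 86400)
{
    return ($now - $lastUpdated) >= $maxAge;
}

$now   = time();
$fresh = rates_are_stale($now - 3600, $now);   // updated an hour ago
$stale = rates_are_stale($now - 90000, $now);  // updated 25 hours ago
```

Passing $now in as a parameter, rather than calling time() inside the function, keeps the check easy to test.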

Hopefully someone can help, as I am clueless about how to begin or what to look into to learn what to do.

Thanks :)
Question by:dchid
Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 34088877
This is not a question -- it's a requirement for application design and build, and for that you need to hire a professional programmer.  If you want to learn PHP, this might not be the best task to start with -- too many moving parts and unknowns.  But this is a good book, with great examples and a downloadable code library you can copy and modify to suit your needs.
http://www.sitepoint.com/books/phpmysql4/

If you can isolate the questions enough that we can answer them, I'll be glad to try to help.  For example, if you wanted to know how to use CURL to read from a foreign web site, you might be able to adapt this teaching example script.

Good luck with the project, and as you come to specific questions, please post them here at EE. ~Ray
<?php // RAY_temp_curl_example.php
error_reporting(E_ALL);

function my_curl($url, $timeout=2, $error_report=FALSE)
{
    $curl = curl_init();

    // HEADERS FROM FIREFOX - APPEARS TO BE A BROWSER REFERRED BY GOOGLE
    $header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: "; // BROWSERS USUALLY LEAVE BLANK

    // SET THE CURL OPTIONS - SEE http://php.net/manual/en/function.curl-setopt.php
    curl_setopt($curl, CURLOPT_URL,            $url);
    curl_setopt($curl, CURLOPT_USERAGENT,      'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6');
    curl_setopt($curl, CURLOPT_HTTPHEADER,     $header);
    curl_setopt($curl, CURLOPT_REFERER,        'http://www.google.com');
    curl_setopt($curl, CURLOPT_ENCODING,       'gzip,deflate');
    curl_setopt($curl, CURLOPT_AUTOREFERER,    TRUE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($curl, CURLOPT_TIMEOUT,        $timeout);

    // RUN THE CURL REQUEST AND GET THE RESULTS
    $htm = curl_exec($curl);

    // ON FAILURE HANDLE ERROR MESSAGE
    if ($htm === FALSE)
    {
        if ($error_report)
        {
            $err = curl_errno($curl);
            $inf = curl_getinfo($curl);
            echo "CURL FAIL: $url TIMEOUT=$timeout, CURL_ERRNO=$err";
            var_dump($inf);
        }
        curl_close($curl);
        return FALSE;
    }

    // ON SUCCESS RETURN XML / HTML STRING
    curl_close($curl);
    return $htm;
}

// USAGE EXAMPLE - PUT YOUR FAVORITE URL HERE
$url = "http://finance.yahoo.com/d/quotes.csv?s=lulu&f=snl1c1ohgvt1";
$htm = my_curl($url);
if (!$htm) die("NO $url");

// SHOW WHAT WE GOT
echo "<pre>";
echo htmlentities($htm);

Author Comment

by:dchid
ID: 34092192
Ray, thank you for the advice. I will look into this more and post back about any areas that confuse me and that I may need help with.

Thanks again