Solved

How do I scrape data from exchange rate sites and store it in a DB?

Posted on 2010-11-08
Last Modified: 2012-05-10
Hi,
I currently have some broken bones and am struggling to type, so sorry if my spelling and grammar make no sense.

I have about a week for this, alongside some other work, which won't be easy given my current difficulty typing. I am hoping someone can guide me in the right direction so that this is as painless as possible and I can learn as I go.

Basically, I need help creating object-oriented PHP script(s) that can be used to retrieve current exchange rates for at least 160 different currencies, from different sources as required. I need to either scrape these sources for rates and store them in a MySQL DB, or use a PHP script to call a conversion CGI (e.g. http://www.xe.com/ucc/convert.cgi?) and scrape the data returned.

I also need to store related ISO 4217 data for each currency (http://www.iso.org/iso/support/currency_codes_list-1.htm)

Using the stored data I can implement my own conversion tool for exchange rates.
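
A minimal sketch of how such a conversion tool might compute a result from stored rates. It assumes every stored rate is quoted against one base currency (USD here, which is an assumption, as are the function name and the sample numbers) and derives the cross rate between the two requested currencies:

```php
<?php
// Hypothetical helper: $rates maps an ISO 4217 code to the number of units
// of that currency per 1 unit of the base currency (assumed to be USD).
function convert_amount($amount, $from, $to, array $rates)
{
    if (!isset($rates[$from]) || !isset($rates[$to])) {
        return null; // unknown currency; the caller could map this to error 2000
    }
    // Cross rate: units of $to per one unit of $from.
    $rate = $rates[$to] / $rates[$from];
    return array('rate' => $rate, 'amount' => round($amount * $rate, 2));
}

// Made-up sample rates, for illustration only.
$sampleRates = array('USD' => 1.0, 'GBP' => 0.5, 'JPY' => 100.0);
$result      = convert_amount(10, 'GBP', 'JPY', $sampleRates);
// $result['rate'] is 200.0 and $result['amount'] is 2000.0
```

Storing every rate against a single base keeps the table at one row per currency rather than one row per currency pair.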


The URL on the conversion site I am building is specified as http://thisite.com/conv?amnt=17.65&from=GBP&to=JPY, which will then return XML structured like so:
            <conv>
                  <at>8 November 2010 13:44</at>
                  <rate>180.64215</rate>
                  <from>
                        <code>GBP</code>
                        <curr>Pound Sterling</curr>
                        <loc>United Kingdom, Crown Dependencies (the Isle of Man and the Channel Islands), certain British Overseas Territories (South Georgia and the South Sandwich Islands, British Antarctic Territory and British Indian Ocean Territory)</loc>
                        <amnt>17.65</amnt>
                  </from>
                  <to>
                        <code>JPY</code>
                        <curr>Japanese Yen</curr>
                        <loc>Japan</loc>
                        <amnt>1599.47</amnt>
                  </to>
            </conv>

or, on error, a response like so:
<?xml version="1.0" encoding="UTF-8"?>
            <conv>
                  <error code='nnnn'>error message</error>
            </conv>
Some error codes would be:

1000      URL not recognized
1100      Required parameter is missing
1200      Parameter not recognized
2000      Currency type not recognized
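
One possible way to produce both response shapes is PHP's DOM extension, which escapes text content automatically. This is only a sketch: the function names are hypothetical, and the element names simply mirror the structure described above.

```php
<?php
// Append <$name>$text</$name> to $parent. createTextNode() escapes &, < and >.
function append_text_element(DOMDocument $doc, DOMNode $parent, $name, $text)
{
    $el = $doc->createElement($name);
    $el->appendChild($doc->createTextNode((string) $text));
    $parent->appendChild($el);
    return $el;
}

// Build the success document. $from and $to are arrays with the keys
// code, curr, loc and amnt.
function build_conv_xml($at, $rate, array $from, array $to)
{
    $doc  = new DOMDocument('1.0', 'UTF-8');
    $conv = $doc->appendChild($doc->createElement('conv'));
    append_text_element($doc, $conv, 'at', $at);
    append_text_element($doc, $conv, 'rate', $rate);
    foreach (array('from' => $from, 'to' => $to) as $tag => $side) {
        $node = $conv->appendChild($doc->createElement($tag));
        foreach (array('code', 'curr', 'loc', 'amnt') as $field) {
            append_text_element($doc, $node, $field, $side[$field]);
        }
    }
    return $doc->saveXML();
}

// Build the error document: <conv><error code="nnnn">message</error></conv>
function build_error_xml($code, $message)
{
    $doc  = new DOMDocument('1.0', 'UTF-8');
    $conv = $doc->appendChild($doc->createElement('conv'));
    $err  = append_text_element($doc, $conv, 'error', $message);
    $err->setAttribute('code', (string) $code);
    return $doc->saveXML();
}

$errorXml = build_error_xml(1100, 'Required parameter is missing');
```

When serving the result, sending a `Content-Type: text/xml` header before the output would let clients treat it as XML.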

I also need to be able to re-scrape the data every 24 hours to update the exchange rates. I am not sure if there is some sort of caching method to do this.

Base paths, DB information and error message data need to be placed in a config file.
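
A config file along these lines could be as simple as one PHP array that the other scripts include. The sketch below is an assumption about layout, not a prescribed format: the file name, the keys, the DSN and the credentials are all placeholders, and only the error codes and messages come from the list above.

```php
<?php
// Hypothetical config file (for example config.php); every value is a placeholder.
$config = array(
    'base_path' => '/conv',
    'db' => array(
        'dsn'  => 'mysql:host=localhost;dbname=exchange',
        'user' => 'db_user',
        'pass' => 'db_password',
    ),
    'errors' => array(
        1000 => 'URL not recognized',
        1100 => 'Required parameter is missing',
        1200 => 'Parameter not recognized',
        2000 => 'Currency type not recognized',
    ),
);

// Look up the message for an error code; the fallback text is an assumption.
function config_error_message(array $config, $code)
{
    return isset($config['errors'][$code]) ? $config['errors'][$code] : 'Unknown error';
}
```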

Some possible resources I may use if I can get some idea of how to do this:
http://www.bloomberg.com/js/calculators/currdata.js
http://www.xe.com/ucc/convert.cgi?Amount=100&From=USD&To=ALL
themoneyconverter.com/GBP/rss.xml

I think that is it. So, to summarize, if someone is kind enough to guide me in the right direction to understanding this, it would be brilliant.

I am unsure how to retrieve the data to scrape and how to store what I need in the DB.
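
For the storage half, one common approach is PDO with a prepared statement. The following is a sketch under assumptions: the table and column names are hypothetical, and the rates are taken to have already been scraped into an array of code => rate. `REPLACE INTO` overwrites the existing row for a currency, so running the script repeatedly keeps one row per code.

```php
<?php
// Hypothetical table layout; adjust names and types to your own schema.
function create_rates_table(PDO $db)
{
    $db->exec('CREATE TABLE IF NOT EXISTS rates (
        code       CHAR(3)       NOT NULL PRIMARY KEY,
        rate       DECIMAL(18,8) NOT NULL,
        updated_at INTEGER       NOT NULL
    )');
}

// Store an array of ISO code => rate, stamped with a Unix timestamp.
function store_rates(PDO $db, array $rates, $timestamp)
{
    $stmt = $db->prepare('REPLACE INTO rates (code, rate, updated_at) VALUES (?, ?, ?)');
    $db->beginTransaction();
    foreach ($rates as $code => $rate) {
        $stmt->execute(array($code, $rate, $timestamp));
    }
    $db->commit();
}

// Usage (placeholder connection details, so left commented out):
// $db = new PDO('mysql:host=localhost;dbname=exchange', 'db_user', 'db_password');
// $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// create_rates_table($db);
// store_rates($db, array('GBP' => 0.5, 'JPY' => 100.0), time());
```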

The URL: http://thisite.com/conv? The lack of a file extension after conv has confused me. Would this be done with base paths in the config file, or another way?
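
Extensionless URLs like this are usually handled by the web server rather than by PHP itself. As a sketch, assuming Apache with mod_rewrite (an assumption about the hosting setup), a rule can hand `/conv` to a single PHP script, which then inspects the requested path. The script and function names below are hypothetical.

```php
<?php
// Assumes the server forwards the request to this script, for example with
// an .htaccess rule such as:  RewriteRule ^conv$ index.php [L]
function parse_conv_request($requestUri)
{
    $path  = parse_url($requestUri, PHP_URL_PATH);
    $query = parse_url($requestUri, PHP_URL_QUERY);
    if ($path !== '/conv') {
        return array('error' => 1000, 'params' => array()); // URL not recognized
    }
    $params = array();
    if (is_string($query)) {
        parse_str($query, $params); // split "a=1&b=2" into an array
    }
    return array('error' => 0, 'params' => $params);
}

// In a live script the argument would be $_SERVER['REQUEST_URI'].
$request = parse_conv_request('/conv?amnt=17.65&from=GBP&to=JPY');
// $request['params'] is array('amnt' => '17.65', 'from' => 'GBP', 'to' => 'JPY')
```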

For errors, I am unsure how to handle them. Would a mistaken query return an error code, which I would then look up in the codes and messages stored in the config file to output the error and explanation on screen?
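
One way to approach this is to validate the query parameters first and return the matching error code, leaving the message lookup to the config data. The sketch below assumes the three parameter names from the example URL; the function name and the order of the checks are assumptions.

```php
<?php
// Returns 0 when the parameters are valid, otherwise one of the error codes
// listed in the question. $knownCodes is the list of stored ISO 4217 codes.
function validate_conv_params(array $params, array $knownCodes)
{
    $required = array('amnt', 'from', 'to');
    foreach ($required as $name) {
        if (!isset($params[$name]) || !is_string($params[$name]) || $params[$name] === '') {
            return 1100; // Required parameter is missing
        }
    }
    foreach (array_keys($params) as $name) {
        if (!in_array($name, $required, true)) {
            return 1200; // Parameter not recognized
        }
    }
    foreach (array('from', 'to') as $name) {
        if (!in_array(strtoupper($params[$name]), $knownCodes, true)) {
            return 2000; // Currency type not recognized
        }
    }
    return 0; // no error
}

$known = array('GBP', 'JPY');
$code  = validate_conv_params(array('amnt' => '17.65', 'from' => 'GBP', 'to' => 'XYZ'), $known);
// $code is 2000 because XYZ is not in $known
```

A non-zero result can then be used to look up the message text and build the error response.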

How would I automate a caching method to check/update the DB with current rates every 24 hours?
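
Two approaches are common here. A scheduler such as cron can run an update script once a day, or the conversion script can check the age of the stored rates on each request and re-scrape when they are too old. A sketch of the second option follows; the function name and the cron path in the comment are hypothetical.

```php
<?php
// Cron alternative (hypothetical path), run daily at 03:00:
//   0 3 * * * php /path/to/update_rates.php

// True when the stored rates are at least $maxAge seconds old.
// Both timestamps are Unix timestamps; 86400 seconds is 24 hours.
function rates_are_stale($lastUpdated, $now, $maxAge = 86400)
{
    return ($now - $lastUpdated) >= $maxAge;
}

// Made-up timestamps for illustration: rates stored 25 hours before "now".
$now         = 1289223840;
$lastUpdated = $now - 25 * 3600;
if (rates_are_stale($lastUpdated, $now)) {
    $action = 'rescrape'; // here the script would fetch and store fresh rates
} else {
    $action = 'use_stored';
}
// $action is 'rescrape'
```

In a live script `$lastUpdated` would come from the stored rates and `$now` from `time()`.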

Hopefully someone can help, as I am clueless about how to begin or what to look into to learn what to do.

Thanks :)
Question by:dchid
2 Comments
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 34088877
This is not a question -- it's a requirement for application design and build, and for that you need to hire a professional programmer.  If you want to learn PHP, this might not be the best task to start with -- too many moving parts and unknowns.  But this is a good book, with great examples and a downloadable code library you can copy and modify to suit your needs.
http://www.sitepoint.com/books/phpmysql4/

If you can isolate the questions enough that we can answer them, I'll be glad to try to help.  For example, if you wanted to know how to use CURL to read from a foreign web site, you might be able to adapt this teaching example script.

Good luck with the project, and as you come to specific questions, please post them here at EE. ~Ray
<?php // RAY_temp_curl_example.php
error_reporting(E_ALL);

function my_curl($url, $timeout=2, $error_report=FALSE)
{
    $curl = curl_init();

    // HEADERS FROM FIREFOX - APPEARS TO BE A BROWSER REFERRED BY GOOGLE
    $header = array(); // INITIALIZE BEFORE APPENDING
    $header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: "; // BROWSERS USUALLY LEAVE BLANK

    // SET THE CURL OPTIONS - SEE http://php.net/manual/en/function.curl-setopt.php
    curl_setopt($curl, CURLOPT_URL,            $url);
    curl_setopt($curl, CURLOPT_USERAGENT,      'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6');
    curl_setopt($curl, CURLOPT_HTTPHEADER,     $header);
    curl_setopt($curl, CURLOPT_REFERER,        'http://www.google.com');
    curl_setopt($curl, CURLOPT_ENCODING,       'gzip,deflate');
    curl_setopt($curl, CURLOPT_AUTOREFERER,    TRUE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($curl, CURLOPT_TIMEOUT,        $timeout);

    // RUN THE CURL REQUEST AND GET THE RESULTS
    $htm = curl_exec($curl);

    // ON FAILURE HANDLE ERROR MESSAGE
    if ($htm === FALSE)
    {
        if ($error_report)
        {
            $err = curl_errno($curl);
            $inf = curl_getinfo($curl);
            echo "CURL FAIL: $url TIMEOUT=$timeout, CURL_ERRNO=$err";
            var_dump($inf);
        }
        curl_close($curl);
        return FALSE;
    }

    // ON SUCCESS RETURN XML / HTML STRING
    curl_close($curl);
    return $htm;
}




// USAGE EXAMPLE - PUT YOUR FAVORITE URL HERE
$url = "http://finance.yahoo.com/d/quotes.csv?s=lulu&f=snl1c1ohgvt1";
$htm = my_curl($url);
if (!$htm) die("NO $url");


// SHOW WHAT WE GOT
echo "<pre>";
echo htmlentities($htm);
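
Once a response like this has been fetched, it still has to be parsed before anything can be stored. The following sketch assumes the body is a single CSV line whose fields follow the order of the `f=` tags in the URL; the field names and the sample line are made up for illustration and are not real data.

```php
<?php
// Split one CSV line into named fields. Returns null if the field count differs.
function parse_quote_line($line, array $fieldNames)
{
    // All four arguments are passed explicitly: delimiter, enclosure, escape.
    $values = str_getcsv(trim($line), ',', '"', '\\');
    if (count($values) !== count($fieldNames)) {
        return null;
    }
    return array_combine($fieldNames, $values);
}

// Assumed field order for f=snl1c1ohgvt1; the labels are hypothetical.
$names = array('symbol', 'name', 'last', 'change', 'open', 'high', 'low', 'volume', 'time');

// Made-up sample line standing in for the string returned by my_curl().
$sample = '"LULU","Example Name",10.00,+0.50,9.60,10.20,9.50,12345,"4:00pm"';
$quote  = parse_quote_line($sample, $names);
// $quote['symbol'] is 'LULU' and $quote['last'] is '10.00'
```

The same pattern (fetch, parse into an array, then store) applies to an exchange rate source, with the parsing step adapted to whatever format that source returns.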


Author Comment

by:dchid
ID: 34092192
Ray, thank you for the advice. I will look into this more and post back about any areas that confuse me and that I may need help with.

Thanks again