1ns4nity

PHP file_get_contents() not reading entire page


Hi. I'm trying to write a web page parser that extracts weather data from a web site. The URL is: "http://www.accuweather.com/adcbin/public/inthbh_local.asp? partner=accuweather&metric=1&whend=1&whent=8"

I have run into a problem where file_get_contents(URL) does not read the entire web page. I have even tried fopen(), and it still only gets half of the page. Does anyone know why this is happening?
jkna_gunn

Taken from the PHP manual, so try this way instead:

<?php
// Get a file into an array.  In this example we'll go through HTTP to get
// the HTML source of a URL.
$lines = file('http://www.example.com/');

// Loop through our array, show HTML source as HTML source; and line numbers too.
foreach ($lines as $line_num => $line) {
   echo "Line #<b>{$line_num}</b> : " . htmlspecialchars($line) . "<br />\n";
}

// Another example, let's get a web page into a string.  See also file_get_contents().
$html = implode('', file('http://www.example.com/'));
?>
Hi

Are you sure you are not reading the whole page? The URL you posted gives me a permission denied error when trying to load the forecast data. I can load the page, but the forecast does not display...

If you have to fetch weather that way, why not use MSN? That way you only fetch the raw data feed and build a template to display the results....


Download this example.....

The download contains a simple example script that displays a weather forecast, plus the latest complete database of all weather locations from around the world....

http://zip.ya-right.net/wdb1.rar

Example of what the example looks like....

http://zip.ya-right.net/weather.php


example of how I use it in my mail system, so users can look up the weather while reading their mail...

http://mail.ya-right.net/example.html


Fataqui



ASKER

How did you get a permission denied error? Was that when PHP tried to load the page, or when you viewed it in your browser? When viewed in a browser, the weather data comes out fine; PHP, however, somehow gets cut off. I already set the user-agent string so it acts like IE, but that didn't work.

The MSN weather which you use seems to be a good source, but I've already successfully parsed AccuWeather's 5-day forecast. It's just that I want to be able to get the hourly forecast too, and I'm wondering why PHP can't retrieve the data.
hi all,

$result = file_get_contents("http://www.accuweather.com/adcbin/public/inthbh_local.asp?%20partner=accuweather&metric=1&whend=1&whent=8");
echo $result;

works perfectly for me.

What is your PHP version ?

Valoodev
I'm using the latest version of PHP. Are you able to retrieve even the hourly weather data?
skullnobrains
the only ways you could read only a part of the file are that either there is a \0 character somewhere in the file itself (maybe on purpose: i'm not sure the destination site wants you to 'pump' them out), or there is a maximum buffer size for the function you are using.

remember that extracting the whole page is not the way a parser would work:
you should use fopen and fgets to read the lines one by one.
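
A minimal sketch of the line-by-line approach suggested above, assuming allow_url_fopen is enabled when the source is a URL. The function name read_lines and the callback signature are made up for illustration; only fopen, fgets and fclose come from PHP itself.

```php
<?php
// Read a file or URL one line at a time and pass each line to a callback.
// read_lines() is a hypothetical helper, not part of PHP.
function read_lines(string $source, callable $onLine): int
{
    $handle = fopen($source, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $source");
    }

    $count = 0;
    // fgets() returns false at end of file (or on error), which ends the loop.
    while (($line = fgets($handle)) !== false) {
        $onLine(rtrim($line, "\r\n"), $count);
        $count++;
    }

    fclose($handle);
    return $count; // number of lines read
}
```

Counting the lines this way also gives a quick check of how much of the page actually arrived, which can be compared against the line count of the page source seen in a browser.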

another (dumb) possibility is that the page displays the forecasts in a frame, iframe, div, object... basically any tag with a src or location property. in that case that address is what you need to pump out, not the original url.
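
One way to check for this, sketched under the assumption that PHP's DOM extension is available: parse the fetched HTML and list the src of every frame and iframe. The helper name frame_sources is hypothetical, and it only looks at those two tags, not at div or object.

```php
<?php
// Return the src attribute of every <frame> and <iframe> in an HTML string.
// frame_sources() is a hypothetical helper written for this example.
function frame_sources(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach (['frame', 'iframe'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $src = $node->getAttribute('src');
            if ($src !== '') {
                $sources[] = $src;
            }
        }
    }
    return $sources;
}
```

If the list is not empty, the forecast probably lives at one of those addresses, and that is the URL to fetch instead.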

btw, the site does not display properly in ie6, nn7 or firefox.
this may be because i block ads at firewall level and some javascripts on the page require them, which is nonsense.

the url you gave does not specify a city, so the site will not display any forecast anyway.
it might work in your browser using the location information stored there, but definitely not using php.

the city, even when you browse and specify one, is not stored in the url.
there is a very high chance that they store it on purpose either directly on the server, or using cookies, or in a session.

COOKIES
---loading---
adc1="|||||"
adc2="||||"
adc3="|||"
partner="accuweather"
adc6="4|"
sesstime="1082627973437"
adc8="4|300730"
adc9="34|1|300730"
---choose city : show form---
ASPSESSIONIDASABSQCQ="PANBFMKDLOKLBLBMKAGCOLFB"
adc9="34|1|300730|38|1|300730"
---validation---
adc5="LFPB|EU|FR|PARIS|48.97|2.45| 1"
adc9="34|1|300730|38|1|300730|0"

look at the adc5 cookie...
if u need more help, i need evidence that they agree.
in that case, they probably will let you either hotlink their site, or use their db.

note on cookies. OPERA is the browser that gives the best real-time information on cookies, and the cookie manager contains VERY cool features such as "accept but delete when closing opera.", "accept but discard changes", "accept for this server only"...
skullnobrains: I think you found the problem. I never noticed that the city data was not stored in the URL.

I guess how the site works is that a cookie is stored on your browser when you reach the page which gives you the weather forecast for the next 7 days. On that page, the city/country is present in the URL. When you then click to view the hourly forecast, it remembers what city/country you are viewing using a cookie. I guess there is no way I can get PHP to overcome this right?
you can if you set the cookie yourself, setting the server to their server.

as said before
<< look at the adc5 cookie...
if u need more help, i need evidence that they agree.
in that case, they probably will let you either hotlink their site, or use their db.
>>
Oh, so I can spoof a cookie? Interesting... I doubt they would even reply to such a request. I'm just doing this for personal convenience.
Set-Cookie: <name>=<value>[; <name>=<value>]...
[; expires=<date>][; domain=<domain_name>]
[; path=<some_path>][; secure][; httponly]

rfc syntax to be pasted in header, the server can be set to any server including a different one.
you don't spoof or steal, you just set the cookie in a regular way. (i'm no black-hat ;)
for personal use, the simplest is to set the cookie manually in your browser (i.e. go to their site once).
for clients, you must set the cookie before you call file_get_contents or an equiv syntax.
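
A rough sketch of what "set the cookie before you call file_get_contents" could look like. Set-Cookie is the header a server sends to a client; a client sends its cookies back in a "Cookie:" request header, which PHP's HTTP stream context lets you add. The helper name cookie_context and the default user-agent string are made up for illustration, the adc5 value is the one quoted earlier in this thread, and whether the site accepts the value unencoded is an assumption.

```php
<?php
// Build a stream context that sends the given cookies in a "Cookie:" request header.
// cookie_context() is a hypothetical helper, not part of PHP.
function cookie_context(array $cookies, string $userAgent = 'Mozilla/5.0')
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        // Assumption: values are sent as-is, the way a browser would send them back.
        $pairs[] = $name . '=' . $value;
    }

    $headers = 'Cookie: ' . implode('; ', $pairs) . "\r\n"
             . 'User-Agent: ' . $userAgent . "\r\n";

    return stream_context_create([
        'http' => ['method' => 'GET', 'header' => $headers],
    ]);
}

// Value copied from the adc5 cookie listed earlier in the thread.
$context = cookie_context(['adc5' => 'LFPB|EU|FR|PARIS|48.97|2.45| 1']);

// Pass the context as the third argument; $url would be the hourly forecast page.
// $html = file_get_contents($url, false, $context);
```

The request itself is left commented out so the snippet does not hit their server when pasted; the same permission caveats apply.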

the source of the page where you choose the cities on their site probably contains the exhaustive list of the supported cities.

again, you MUST
- have their agreement as hotlinking is costly in bandwidth (and may lead to prosecutions in some countries)
- let banners and names of their site visible on yours.

btw, i'd be eager to see the working result if you can afford to paste a link sometime.

ps if you have a hard time, use opera for debugging (real-time information on cookies, rejection and limitations possible...)
Actually it doesn't really make sense. If the cookie is still only stored by the browser, how is the weather server supposed to retrieve it when it is PHP which is requesting the page and not the browser?
either you don't request the page using php but simply include it in any html element
in this case you can workout some javascript to remove the unnecessary code.

or you may try a few options using php to retrieve the page
while feeding either the name of the variable itself or $_COOKIE['adc5']

<< Oh so I can spoof a cookie? Interesting...I doubt they would even reply to such a request. Im just doing this for personal convenience. >>
if they agree, they'll let you have a look at their code and it will be much easier to work it out.
i really believe that you are doing it for personal convenience, actually.
i'm beginning to be ashamed to explain such things.
this is my last post on the thread, unless i know more of the whereabouts, and you provide a link.
ASKER CERTIFIED SOLUTION
skullnobrains

Thanks for the help skullnobrains. I'll try it out when I have time. Nice to learn something new about PHP.