
PHP file_get_contents() not reading entire page

1ns4nity asked
Last Modified: 2013-12-13

Hi. I'm trying to write a web page parser that extracts weather data from a web site. The URL is: "http://www.accuweather.com/adcbin/public/inthbh_local.asp? partner=accuweather&metric=1&whend=1&whent=8"

I have encountered a problem where the function file_get_contents(URL) does not read the entire web page. I have even tried using fopen() and it still always gets only half of the page. Does anyone know why this is happening?

Taken from the PHP manual, so try it this way instead:

<?php
// Get a file into an array.  In this example we'll go through HTTP to get
// the HTML source of a URL.
$lines = file('http://www.example.com/');

// Loop through our array, show HTML source as HTML source; and line numbers too.
foreach ($lines as $line_num => $line) {
   echo "Line #<b>{$line_num}</b> : " . htmlspecialchars($line) . "<br />\n";
}

// Another example, let's get a web page into a string.  See also file_get_contents().
$html = implode('', file('http://www.example.com/'));
?>

Commented:
Hi

Are you sure you are not reading the whole page? The URL you posted gives me a permission denied error when it tries to load the forecast data. I can load the page, but the forecast does not display...

If you have to fetch weather that way, why not use MSN? That way you only fetch the raw data feed and build a template to display the results....


Download this example.....

The download contains a simple example script that displays a weather forecast, plus you get the latest complete database of all weather locations from around the world....

http://zip.ya-right.net/wdb1.rar

Example of what the example looks like....

http://zip.ya-right.net/weather.php


example of how I use it in my mail system, so users can look up the weather while reading their mail...

http://mail.ya-right.net/example.html


Fataqui


Author

Commented:
How did you get a permission denied error? Was that when PHP tried to load the page or when you viewed it in your browser? When viewed in a browser, the weather data comes out fine; PHP, however, somehow gets cut off. I already set the user-agent string so it acts like IE, but that didn't work.

The MSN weather which you use seems to be a good source, but I've already successfully parsed AccuWeather's 5-day forecast. It's just that I want to be able to get the hourly forecast too, and I am wondering why PHP can't retrieve the data.

Commented:
hi all,

$result = file_get_contents("http://www.accuweather.com/adcbin/public/inthbh_local.asp?%20partner=accuweather&metric=1&whend=1&whent=8");
echo $result;

works perfectly for me.

What is your PHP version ?

Valoodev

Author

Commented:
I'm using the latest version of PHP. Are you able to retrieve even the hourly weather data?
CERTIFIED EXPERT

Commented:
the only ways you could read only a part of the file are: either there is a \0 character somewhere in the file itself (maybe on purpose: i'm not sure the destination site wants you to 'pump' their data out), or there is a maximum buffer size for the function you are using.
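
Both possibilities can be checked on the fetched string. The following is a minimal sketch, not a definitive diagnosis: `inspectResponse()` is a hypothetical helper and the sample body and headers are made up. It reports how many bytes were read, where the first NUL byte sits (if any), and what length the server announced.

```php
<?php
// Hypothetical helper: reports the size of a fetched body, the offset of the
// first NUL byte (null if none), and the Content-Length header value (null if absent).
function inspectResponse(string $body, array $headers): array
{
    $declared = null;
    foreach ($headers as $header) {
        // Header names are case-insensitive, hence stripos()
        if (stripos($header, 'Content-Length:') === 0) {
            $declared = (int) trim(substr($header, strlen('Content-Length:')));
        }
    }

    $nul = strpos($body, "\0");

    return [
        'bytes_read' => strlen($body),
        'first_nul'  => $nul === false ? null : $nul,
        'declared'   => $declared,
    ];
}

// Made-up sample data standing in for a real response
$report = inspectResponse("abc\0def", ['HTTP/1.1 200 OK', 'Content-Length: 7']);
print_r($report);
```

After a real `file_get_contents($url)` call over HTTP, PHP fills `$http_response_header` in the calling scope, so that array can be passed as the second argument. If `bytes_read` matches `declared`, the server really sent a short page and the truncation is not happening in PHP.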

remember that extracting the whole page is not the way a parser would work:
you should use fopen and fgets to read the lines one by one.
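
A minimal sketch of that line-by-line approach, assuming the data of interest can be recognised by a plain substring. `findLines()` is a hypothetical helper, and the in-memory stream with made-up HTML only stands in for a handle returned by `fopen($url, 'r')`.

```php
<?php
// Hypothetical helper: reads an already opened stream line by line and
// returns the lines that contain $needle, without trailing newlines.
function findLines($handle, string $needle): array
{
    $matches = [];
    while (($line = fgets($handle)) !== false) {
        if (strpos($line, $needle) !== false) {
            $matches[] = rtrim($line, "\r\n");
        }
    }
    return $matches;
}

// Demo on an in-memory stream; with a real page this would be fopen($url, 'r')
$handle = fopen('php://memory', 'r+');
fwrite($handle, "<html>\n<td>Temp: 12C</td>\n<td>Wind: 5 km/h</td>\n</html>\n");
rewind($handle);

$found = findLines($handle, 'Temp');
fclose($handle);

print_r($found);
```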

another issue (a dumb one): maybe the page displays the forecasts in a frame, iframe, div, object... basically any tag with a src or location property. in that case that source is what you need to fetch, not the original url.
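
One way to check for this is to list the `src` attributes in the HTML that PHP actually received. This is only a sketch: `listEmbeddedSources()` is a hypothetical helper, the sample markup and paths are invented, and it looks at `frame`, `iframe`, and `script` elements only.

```php
<?php
// Hypothetical helper: lists the src attribute of every frame, iframe, and
// script element found in an HTML string.
function listEmbeddedSources(string $html): array
{
    // Real-world HTML is rarely valid, so collect parser warnings silently
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach (['frame', 'iframe', 'script'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $element) {
            $src = $element->getAttribute('src');
            if ($src !== '') {
                $sources[] = $tag . ': ' . $src;
            }
        }
    }
    return $sources;
}

// Made-up markup standing in for the fetched page
$sample = '<html><body><iframe src="/forecast/hourly.asp"></iframe>'
        . '<script src="/js/ads.js"></script><p>No data here</p></body></html>';

$sources = listEmbeddedSources($sample);
print_r($sources);
```

Any URL that shows up here is a separate request the browser makes and `file_get_contents()` does not, which would explain a page that looks "cut off" from PHP.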

btw, the site does not display properly in ie6, nn7 or firefox.
this may be because i block ads at firewall level and some javascripts on the page require them, which is nonsense.

the url you gave does not specify a city, so the site will not display any forecast anyway.
it might work when you try it in your browser, using the location information stored there, but definitely not using php.

the city, even when you browse and specify one, is not stored in the url.
chances are very high that they store it on purpose either directly on the server, or in cookies, or in a session.

COOKIES
---loading---
adc1="|||||"
adc2="||||"
adc3="|||"
partner="accuweather"
adc6="4|"
sesstime="1082627973437"
adc8="4|300730"
adc9="34|1|300730"
---choose city : show form---
ASPSESSIONIDASABSQCQ="PANBFMKDLOKLBLBMKAGCOLFB"
adc9="34|1|300730|38|1|300730"
---validation---
adc5="LFPB|EU|FR|PARIS|48.97|2.45| 1"
adc9="34|1|300730|38|1|300730|0"

look at the adc5 cookie...
if you need more help, i need evidence that they agree.
in that case, they will probably let you either hotlink their site or use their db.
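
To make the content of that cookie easier to read, here is a small sketch that splits the value on the pipe character. `parseAdc5()` is a hypothetical helper, and the field names are guesses based only on the sample value above (PARIS, FR, 48.97, 2.45); the site does not document them here.

```php
<?php
// Hypothetical helper: splits an adc5-style value into named fields.
// The labels are assumptions inferred from the sample value, not a documented format.
function parseAdc5(string $value): array
{
    $parts  = array_map('trim', explode('|', $value));
    $labels = ['station', 'region', 'country', 'city', 'latitude', 'longitude', 'flag'];

    // array_combine() needs equal lengths, so bail out on unexpected input
    if (count($parts) !== count($labels)) {
        return [];
    }
    return array_combine($labels, $parts);
}

$location = parseAdc5('LFPB|EU|FR|PARIS|48.97|2.45| 1');
print_r($location);
```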

note on cookies. OPERA is the browser that gives the best real-time information on cookies, and the cookie manager contains VERY cool features such as "accept but delete when closing opera.", "accept but discard changes", "accept for this server only"...

Author

Commented:
skullnobrains: I think you found the problem. I never noticed that the city data was not stored in the URL.

I guess the way the site works is that a cookie is stored in your browser when you reach the page which gives you the weather forecast for the next 7 days. On that page, the city/country is present in the URL. When you then click to view the hourly forecast, it remembers what city/country you are viewing using a cookie. I guess there is no way I can get PHP to overcome this, right?
CERTIFIED EXPERT

Commented:
you can, if you set the cookie yourself with the domain set to their server.

as said before
<< look at the adc5 cookie...
if you need more help, i need evidence that they agree.
in that case, they will probably let you either hotlink their site or use their db.
>>

Author

Commented:
Oh, so I can spoof a cookie? Interesting... I doubt they would even reply to such a request. I'm just doing this for personal convenience.
CERTIFIED EXPERT

Commented:
Set-Cookie: <name>=<value>[; <name>=<value>]...
[; expires=<date>][; domain=<domain_name>]
[; path=<some_path>][; secure][; httponly]

rfc syntax to be pasted in the header; the domain can be set to any server, including a different one.
you don't spoof or steal, you just set the cookie in a regular way. (i'm no black-hat ;)
for personal use, the simplest is to set the cookie manually in your browser (i.e. go to their site once).
for clients, you must set the cookie before you call file_get_contents or an equivalent.

the source of the page where you choose the cities on their site probably contains the exhaustive list of the supported cities.

again, you MUST
- have their agreement, as hotlinking is costly in bandwidth (and may lead to prosecution in some countries)
- keep the banners and the name of their site visible on yours.

btw, i'd be eager to see the working result if you can afford to paste a link sometime.

ps if you have a hard time, use opera for debugging (real-time information on cookies, rejection and limitations possible...)

Author

Commented:
Actually it doesn't really make sense. If the cookie is still only stored by the browser, how is the weather server supposed to retrieve it when it is PHP which is requesting the page and not the browser?
CERTIFIED EXPERT

Commented:
either you don't request the page using php but simply include it in any html element;
in this case you can work out some javascript to remove the unnecessary code.

or you may try a few options using php to retrieve the page
while feeding either the name of the variable itself or $_COOKIE['adc5']
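
When PHP itself makes the request, the cookie travels in a `Cookie:` request header, which `file_get_contents()` can send through a stream context. The sketch below is one possible way to do that, not the site's documented interface: `buildCookieContext()` is a hypothetical helper, the default user-agent string is a placeholder, and URL-encoding the value with `rawurlencode()` is an assumption about what the server expects.

```php
<?php
// Hypothetical helper: turns name => value pairs into a Cookie request header
// and wraps it in a stream context usable with file_get_contents().
function buildCookieContext(array $cookies, string $userAgent = 'Mozilla/5.0')
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        // Assumption: the server accepts URL-encoded cookie values
        $pairs[] = $name . '=' . rawurlencode($value);
    }

    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'header'     => 'Cookie: ' . implode('; ', $pairs) . "\r\n",
        ],
    ]);
}

// Cookie values copied from the dump earlier in this thread
$context = buildCookieContext([
    'partner' => 'accuweather',
    'adc5'    => 'LFPB|EU|FR|PARIS|48.97|2.45| 1',
]);

// The actual request is left commented out so the sketch runs offline:
// $html = file_get_contents($url, false, $context);
$options = stream_context_get_options($context);
echo $options['http']['header'];
```

The `$_COOKIE` superglobal only holds cookies that a visitor's browser sent to your own script for your own domain, so the value for the weather site has to be supplied explicitly as shown.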

<< Oh, so I can spoof a cookie? Interesting... I doubt they would even reply to such a request. I'm just doing this for personal convenience. >>
if they agree, they'll let you have a look at their code and it will be much easier to work it out.
i really believe that you are doing it for personal convenience, actually.
i'm beginning to be ashamed to explain such things.
this is my last post on the thread, unless i know more of the whereabouts, and you provide a link.

Author

Commented:
Thanks for the help, skullnobrains. I'll try it out when I have time. Nice to learn something new about PHP.