Solved

PHP file_get_contents() not reading entire page

Posted on 2004-04-12
Last Modified: 2013-12-13

Hi. I'm trying to write a web page parser which will extract weather data from a web site. The URL is: "http://www.accuweather.com/adcbin/public/inthbh_local.asp? partner=accuweather&metric=1&whend=1&whent=8"

I have encountered a problem where the function file_get_contents(URL) does not read the entire web page. I have even tried using fopen() and it still always gets only half of the page. Does anyone know why this is happening?
Question by:1ns4nity
14 Comments
 
LVL 6

Expert Comment

by:jkna_gunn
ID: 10811421
Taken from the PHP manual, so try it this way instead:

<?php
// Get a file into an array.  In this example we'll go through HTTP to get
// the HTML source of a URL.
$lines = file('http://www.example.com/');

// Loop through our array, show HTML source as HTML source; and line numbers too.
foreach ($lines as $line_num => $line) {
   echo "Line #<b>{$line_num}</b> : " . htmlspecialchars($line) . "<br />\n";
}

// Another example, let's get a web page into a string.  See also file_get_contents().
$html = implode('', file('http://www.example.com/'));
?>
 
LVL 2

Expert Comment

by:Fataqui
ID: 10818034
Hi

Are you sure you are not reading the whole page? The URL you posted gives me a permission denied error when trying to load the forecast data. I can load the page, but the forecast does not display...

If you have to fetch weather that way, why not use MSN? That way you only fetch the raw data feed and build a template to display the results....


Download this example.....

The download contains a simple example script that displays a weather forecast, plus the latest complete database of all weather locations from around the world....

http://zip.ya-right.net/wdb1.rar

Example of what the example looks like....

http://zip.ya-right.net/weather.php


example of how I use it in my mail system, so users can look up the weather while reading their mail...

http://mail.ya-right.net/example.html


Fataqui


 

Author Comment

by:1ns4nity
ID: 10818831
How did you get a permission denied error? Was that when PHP tried to load the page or when you viewed it in your browser? When viewed in a browser, the weather data comes out fine; however, the PHP output somehow gets cut off. I already set the user-agent string so it acts like IE, but that didn't work.

The MSN weather feed which you use seems to be a good source, but I've already successfully parsed AccuWeather's 5-day forecast. It's just that I want to be able to get the hourly forecast too, and I am wondering why PHP can't retrieve the data.
 

Expert Comment

by:valoodev
ID: 10881442
hi all,

$result = file_get_contents("http://www.accuweather.com/adcbin/public/inthbh_local.asp?%20partner=accuweather&metric=1&whend=1&whent=8");
echo $result;

works perfectly for me.

What is your PHP version ?

Valoodev
 

Author Comment

by:1ns4nity
ID: 10886735
I'm using the latest version of PHP. Are you able to retrieve even the hourly weather data?
 
LVL 26

Expert Comment

by:skullnobrains
ID: 10887217
the only ways you could read only part of the file are either that there is a \0 character somewhere in the file itself (maybe on purpose: i'm not sure the destination site wants you to 'pump' them out), or that there is a maximum buffer size for the function you are using.
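
A minimal sketch of how the two suspicions above could be checked, assuming the page was fetched with file_get_contents() so that PHP has filled $http_response_header with the raw response header lines. The helper names declared_length() and truncation_report() are hypothetical, and a server that compresses or chunks its output may not send a Content-Length at all, in which case 'declared' stays null.

```php
<?php
// Hypothetical helper: find the Content-Length value in a list of raw
// response header lines (the shape PHP's $http_response_header has).
function declared_length(array $headers): ?int
{
    $length = null;
    foreach ($headers as $header) {
        if (preg_match('/^Content-Length:\s*(\d+)\s*$/i', $header, $m)) {
            $length = (int) $m[1]; // keep the last one seen (e.g. after a redirect)
        }
    }
    return $length;
}

// Hypothetical helper: collect the numbers needed to judge truncation.
function truncation_report(string $body, array $headers): array
{
    $nul = strpos($body, "\0");
    return [
        'received'  => strlen($body),              // bytes PHP actually got
        'declared'  => declared_length($headers),  // bytes the server announced, or null
        'first_nul' => $nul === false ? null : $nul, // offset of first \0, or null
    ];
}

// Intended use (not executed here, it needs network access):
// $body = file_get_contents($url);
// print_r(truncation_report($body, $http_response_header));
```

If 'received' equals 'declared' and 'first_nul' is null, the download itself is probably complete and the missing data is more likely absent from the response than cut off.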

remember that extracting the whole page is not the way a parser would work:
you should use fopen and fgets to read the lines one by one.
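
A rough sketch of that line-by-line approach with fopen() and fgets(). The helper names read_lines() and lines_containing() are made up for illustration, and the demonstration reads from an in-memory stream with invented HTML rather than from the weather site.

```php
<?php
// Hypothetical helper: yield lines from an already opened stream, one at a time.
function read_lines($handle): Generator
{
    while (($line = fgets($handle)) !== false) {
        yield rtrim($line, "\r\n"); // drop the trailing line break
    }
}

// Hypothetical helper: return only the lines that contain $needle.
function lines_containing($handle, string $needle): array
{
    $hits = [];
    foreach (read_lines($handle) as $line) {
        if (strpos($line, $needle) !== false) {
            $hits[] = $line;
        }
    }
    return $hits;
}

// Demonstration on an in-memory stream, so no network access is needed.
// The HTML below is invented sample data.
$demo = fopen('php://memory', 'w+');
fwrite($demo, "<html>\n<td>Temp: 12</td>\n<td>Wind: 5</td>\n</html>\n");
rewind($demo);
$matches = lines_containing($demo, 'Temp');
fclose($demo);
```

For a remote page the handle would come from fopen($url, 'r') instead, which requires allow_url_fopen to be enabled.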

another issue (a dumb one) is that maybe the page displays the forecasts in a frame, iframe, div, object... well, basically any tag with a src or location property. in that case that inner address is what you need to pump out, not the original url.
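
One way to look for such embedded addresses in the HTML you did receive, as a sketch only: the function name embedded_sources() is hypothetical, the tag list (frame, iframe, object, embed) is an assumption about what is worth checking, and a regular expression is a shortcut that will miss unquoted or unusually written attributes.

```php
<?php
// Hypothetical helper: list the src/data addresses of frame-like tags.
// Assumes quoted attribute values; not a full HTML parser.
function embedded_sources(string $html): array
{
    $pattern = '/<(?:i?frame|object|embed)\b[^>]*?\b(?:src|data)\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $html, $m);
    // $m[2] holds the attribute values; drop duplicates, keep order.
    return array_values(array_unique($m[2]));
}

// Invented sample markup, only to show the shape of the result.
$sample  = '<frameset><frame src="top.html">'
         . '<iframe width="1" src=\'/hourly.asp?x=1\'></iframe>'
         . '<div data-x="no"></div></frameset>';
$sources = embedded_sources($sample);
```

Each address found this way would then be fetched on its own, resolved against the original URL if it is relative.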

btw, the site does not display properly for me in ie6, nn7 or firefox.
this may be because i block ads at firewall level, and some javascripts on the page require them, which is nonsense.

the url you gave does not specify a city, so the site will not display any forecast anyway.
it might work using the location information from your browser when you try, but definitely not using php.

the city, even when you browse and specify one, is not stored in the url.
there is a very high chance that they store it on purpose either directly on the server, or using cookies, or in a session.

COOKIES
---loading---
adc1="|||||"
adc2="||||"
adc3="|||"
partner="accuweather"
adc6="4|"
sesstime="1082627973437"
adc8="4|300730"
adc9="34|1|300730"
---choose city : show form---
ASPSESSIONIDASABSQCQ="PANBFMKDLOKLBLBMKAGCOLFB"
adc9="34|1|300730|38|1|300730"
---validation---
adc5="LFPB|EU|FR|PARIS|48.97|2.45| 1"
adc9="34|1|300730|38|1|300730|0"

look at the adc5 cookie...
if you need more help, i need evidence that they agree.
in that case, they will probably let you either hotlink their site or use their db.

note on cookies: OPERA is the browser that gives the best real-time information on cookies, and its cookie manager contains VERY cool features such as "accept but delete when closing opera", "accept but discard changes", "accept for this server only"...
 

Author Comment

by:1ns4nity
ID: 10910002
skullnobrains: I think you found the problem. I never noticed that the city data was not stored in the URL.

I guess the way the site works is that a cookie is stored in your browser when you reach the page which gives you the weather forecast for the next 7 days. On that page, the city/country is present in the URL. When you then click to view the hourly forecast, it remembers what city/country you are viewing using a cookie. I guess there is no way I can get PHP to overcome this, right?

 
LVL 26

Expert Comment

by:skullnobrains
ID: 10911943
you can, if you set the cookie yourself, with the server set to their server.

as said before
<< look at the adc5 cookie...
if u need more help, i need evidence that they agree.
in that case, they probably will let you either hotlink their site, or use their db.
>>
 

Author Comment

by:1ns4nity
ID: 10912610
Oh, so I can spoof a cookie? Interesting... I doubt they would even reply to such a request. I'm just doing this for personal convenience.
 
LVL 26

Expert Comment

by:skullnobrains
ID: 10916506
Set-Cookie: <name>=<value>[; <name>=<value>]...
[; expires=<date>][; domain=<domain_name>]
[; path=<some_path>][; secure][; httponly]

rfc syntax to be pasted in the header; the server can be set to any server, including a different one.
you don't spoof or steal, you just set the cookie in a regular way. (i'm no black-hat ;)
for personal use, the simplest is to set the cookie manually in your browser (i.e. go to their site once).
for clients, you must set the cookie before you call file_get_contents or an equivalent function.
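
A hedged sketch of what "set the cookie before you call file_get_contents" can look like when PHP is the client. Set-Cookie is the header a server sends; a client hands cookies back in a "Cookie:" request header, which a stream context can supply. The helper name cookie_context() and the default user-agent string are hypothetical, the cookie values are simply the ones from the browser dump quoted in this thread, and whether the site accepts them (or permits this use) is not something this sketch can confirm.

```php
<?php
// Hypothetical helper: build a stream context that sends the given cookies
// in a "Cookie:" request header. Values are sent as given, unencoded.
function cookie_context(array $cookies, string $userAgent = 'Mozilla/4.0 (compatible)')
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'header'     => 'Cookie: ' . implode('; ', $pairs) . "\r\n",
        ],
    ]);
}

// Cookie values copied from the browser dump earlier in the thread.
$context = cookie_context([
    'partner' => 'accuweather',
    'adc5'    => 'LFPB|EU|FR|PARIS|48.97|2.45| 1',
]);

// Intended use (not executed here, it needs network access):
// $html = file_get_contents($url, false, $context);
```

The same context can be passed to fopen() as its fourth argument if the page is to be read line by line.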

the source of the page where you choose the cities on their site probably contains the exhaustive list of the supported cities.

again, you MUST
- have their agreement as hotlinking is costly in bandwidth (and may lead to prosecutions in some countries)
- let banners and names of their site visible on yours.

btw, i'd be eager to see the working result if you can afford to paste a link sometime.

ps if you have a hard time, use opera for debugging (real-time information on cookies, rejection and limitations possible...)
 

Author Comment

by:1ns4nity
ID: 10923058
Actually, it doesn't really make sense. If the cookie is still only stored by the browser, how is the weather server supposed to retrieve it when it is PHP which is requesting the page and not the browser?
 
LVL 26

Expert Comment

by:skullnobrains
ID: 10923285
either you don't request the page using php but simply include it in an html element;
in this case you can work out some javascript to remove the unnecessary code.

or you may try a few options using php to retrieve the page
while feeding it either the cookie value itself or $_COOKIE['adc5']

<< Oh so I can spoof a cookie? Interesting...I doubt they would even reply to such a request. Im just doing this for personal convenience. >>
if they agree, they'll let you have a look at their code and it will be much easier to work it out.
i really believe that you are doing it for personal convenience, actually.
i'm beginning to be ashamed to explain such things.
this is my last post on the thread, unless i know more of the whereabouts, and you provide a link.
 
LVL 26

Accepted Solution

by:
skullnobrains earned 50 total points
ID: 10923299
this is your answer as well, but i'm not helping for such forgeries.
http://wp.netscape.com/newsref/std/cookie_spec.html
just find a way to send the proper header.
 

Author Comment

by:1ns4nity
ID: 10948317
Thanks for the help, skullnobrains. I'll try it out when I have time. Nice to learn something new about PHP.

Question has a verified solution.
