Schuyler Kuhl

simplexml_load_file returns 404 page not found error

Greetings,

I have a page that uses simplexml_load_file to get a list of rental properties and then enters them into a MySQL database.

Today the page started returning a 404 page not found error.

If I remove this line then the page loads without that error.  But of course it doesn't have the data it needs:

$xml = simplexml_load_file("http://service_domain.com/service.asmx/getproperty?");  (Just a sample url)

However if I enter that url in the address bar of a browser I get all the data I am looking for.

How do I start troubleshooting this and work out what changed between yesterday and today?

Thanks very much for any help in advance.
ASKER CERTIFIED SOLUTION
Ray Paseur
Schuyler Kuhl

ASKER

Thank you.  No, there isn't a question mark at the end of the URL. I was just putting in a sample one.  It actually looks more like this:

http://exampledomain.com/service/servicepage.asmx/GetProperty?username=1234&password=12345&Account=123456


And it works in a browser to show the XML data.  I agree that the service API might have changed or something could be going on with it.  I have requested help from them, but I think the cause might be something else.

I think you are saying to use a different method to handle the data.  This has been working well for a while so I think I should try to figure out what is going on with it first.

Thank you.
SOLUTION
Guy Hengel [angelIII / a3]
Thank you. I will check these things.
I'm sorry to be an idiot but where is the php.ini file?

Also Ray, thank you. I am trying the script you posted now.  Thank you.
On some shared hosts php.ini is in the WWW root directory, but the location varies by installation.  YMMV; you can find it by running phpinfo() and looking for the "Loaded Configuration File" entry in the output.
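
As a quicker alternative to reading the full phpinfo() output, a short script can report which php.ini was loaded and two settings that matter here. This is only a sketch: it assumes the script is run through the web server, since the command-line PHP may load a different php.ini. allow_url_fopen is the setting that governs PHP's URL-aware file functions.

```php
<?php
// Path of the php.ini that this PHP process actually loaded, or false if none.
$iniPath = php_ini_loaded_file();
echo 'Loaded php.ini: ', ($iniPath === false ? '(none)' : $iniPath), PHP_EOL;

// Whether PHP's file functions are allowed to open URLs ("1" means enabled).
echo 'allow_url_fopen: ', ini_get('allow_url_fopen'), PHP_EOL;

// Whether the SimpleXML extension is available in this PHP build.
$hasSimpleXml = extension_loaded('simplexml');
echo 'SimpleXML loaded: ', ($hasSimpleXml ? 'yes' : 'no'), PHP_EOL;
```

If SimpleXML shows "no" or allow_url_fopen is off, simplexml_load_file() on a remote URL cannot work regardless of what the remote service does.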
Ray,

When I put the script you posted into a page on the server and browse to it in IE, it initially shows a bunch of the data, but then IE stops responding, the page turns white, and the IE window crashes.

Does that tell me that there is a problem with the data?

Thanks very much.
Actually, I take that back.  I stepped away for a few minutes and when I returned the page had fully loaded.  I tried it in Chrome and it loaded right away.

So is this telling me that I need to modify the method I use to get the data?  

I'm not sure what this test tells me.  But I guess one thing it tells me is that SimpleXML is enabled.  Is this true?
"initially shows a bunch of the data. but then ie stops responding"
The script I posted should not do that unless your internet pipe is very, very busy.  It will read the XML document on line 10 or fail within 3 seconds, and it sends the data in a burst on line 12.  So the hesitation may be a server-traffic related issue.

You're probably on firm ground with SimpleXML if the script gives you the var_dump() output on line 12.

There could be a lot of reasons why simplexml_load_xxx() has a hard time using the remote resource.  Perhaps they decided that they want to limit automated access to their data.  Or, even if they didn't decide that, they may have made a change that checks for a browser.  PHP's remote file access does not provide a browser signature by default, but cURL can do that.  So whenever we run into a problem with, for example, file_get_contents(), I just recommend switching over to cURL.  It usually provides a quick and enduring fix.
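
For readers following along, here is a minimal sketch of the cURL approach described above. It is not the script posted earlier in this thread. The helper names fetch_with_curl() and parse_xml(), the User-Agent string, and the 3-second timeout are all assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: fetch a URL with cURL, sending a browser-like User-Agent.
// Returns the response body on HTTP 200, or null on any failure.
function fetch_with_curl(string $url, int $timeoutSeconds = 3): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        // Assumed value; any browser-like signature could be used here.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($body === false) {
        error_log('cURL error: ' . curl_error($ch));
    }
    return ($body !== false && $status === 200) ? $body : null;
}

// Hypothetical helper: parse an XML string, logging libxml errors
// instead of emitting PHP warnings. Returns null if the XML is malformed.
function parse_xml(string $xml): ?SimpleXMLElement
{
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        foreach (libxml_get_errors() as $error) {
            error_log('XML error: ' . trim($error->message));
        }
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc === false ? null : $doc;
}

// Usage, with the sample URL from the question:
// $body = fetch_with_curl('http://service_domain.com/service.asmx/getproperty');
// $xml  = ($body === null) ? null : parse_xml($body);
```

Splitting the fetch from the parse makes it easier to tell a transport problem (null body, HTTP status other than 200) from a data problem (body received but not valid XML).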
Ray and Guy,

Thank you both very much for your help.  I haven't totally resolved it yet but at least I have the proper data in my database that is live and people have stopped freaking out.

Ray, I used your script.  Thank you very much.  I don't really understand it but that is OK for now.  Guy, I believe that what you wrote is also correct.  I believe that the problem is the second or third possibility you mentioned.

What I learned is this.  My original page worked in this way: it would check for the new data on a regular basis, then truncate the existing table and add the new data to the table.  What I realized after a while was that every time that page ran, there would end up being 55 rows in the table out of the 423 rows that were received from the source.

I realized that after each row was inserted I had this:

if (!$result) {
    $message  = 'Invalid query: ' . mysql_error() . "\n";
    $message .= 'Whole query: ' . $sql;
    die($message);
}
So what was happening was that on the 56th row something was causing the insert to fail and the page was showing the page not found error.

So anyway, I guess there are 19 rows with a problem because I can see that 423 rows were received and only 404 ended up in the database on the website.
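
For anyone who hits the same thing, here is a hedged sketch of an insert loop that records a failed row and carries on instead of calling die() on the first error. It uses PDO with a prepared statement rather than the mysql_* functions, which were removed in PHP 7. The function name insert_properties() and the table and column names (properties, id, address) are hypothetical. An unescaped quote in the data is one common cause of a mid-batch failure, but that is an assumption, not something confirmed in this thread.

```php
<?php
// Insert rows one at a time; record failures and keep going instead of die().
// Table and column names (properties, id, address) are hypothetical.
function insert_properties(PDO $pdo, iterable $rows): array
{
    // Make failed statements throw, so each row can be handled individually.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Bound parameters let the driver handle quoting of values.
    $stmt = $pdo->prepare(
        'INSERT INTO properties (id, address) VALUES (:id, :address)'
    );

    $inserted = 0;
    $failed   = [];
    foreach ($rows as $index => $row) {
        try {
            $stmt->execute([
                ':id'      => $row['id'],
                ':address' => $row['address'],
            ]);
            $inserted++;
        } catch (PDOException $e) {
            // Log the row index and driver message, then move on.
            error_log("Row $index failed: " . $e->getMessage());
            $failed[$index] = $e->getMessage();
        }
    }

    return ['inserted' => $inserted, 'failed' => $failed];
}
```

The returned 'failed' array lists exactly which rows were rejected and why, which is the information needed to track down the rows that never reached the table.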

Very stressful.  Again thank you very much for your help.

Sky
Thank you very much for your invaluable help!
Glad to help.  Thanks for the points and thanks for using EE, ~Ray