Solved

simplexml_load_file returns 404 page not found error

Posted on 2014-01-15
Last Modified: 2014-01-15
Greetings,

I have a page that uses simplexml_load_file to get a list of rental properties and then enters them into a MySQL database.

Today the page started returning a 404 page not found error.

If I remove this line then the page loads without that error.  But of course it doesn't have the data it needs:

$xml = simplexml_load_file("http://service_domain.com/service.asmx/getproperty?");  (Just a sample url)

However if I enter that url in the address bar of a browser I get all the data I am looking for.

How do I start troubleshooting this and figuring out what changed between yesterday and today?

I'm having a problem figuring out how to start figuring this out.

Thanks very much for any help in advance.
Question by:Schuyler Kuhl
12 Comments
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 300 total points
ID: 39783858
If the question mark really belongs at the end of the URL, maybe the service has changed its API.  Consider reading the XML document with cURL, and then using SimpleXML_Load_String() to create the object.

<?php // RAY_temp_skykuhl.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28339708.html


// USAGE EXAMPLE (USE THE CORRECT URL)
$url = 'http://service_domain.com/service.asmx/getproperty';
$xml = my_curl($url);
$obj = SimpleXML_Load_String($xml);
var_dump($obj);


// A FUNCTION TO RUN A CURL-GET CLIENT CALL TO A FOREIGN SERVER
function my_curl
( $url
, $get_array=array()
, $timeout=3
, $error_report=TRUE
)
{
    // PREPARE THE ARGUMENT STRING IF NEEDED
    $get_string = '';
    foreach ($get_array as $key => $val)
    {
        $get_string
        = $get_string
        . $key
        . '='
        . urlencode($val)
        . '&';
    }
    $get_string = rtrim($get_string, '&');
    if (!empty($get_string)) $url .= '?' . $get_string;

    // START CURL
    $curl = curl_init();

    // HEADERS AND OPTIONS APPEAR TO BE A FIREFOX BROWSER REFERRED BY GOOGLE
    $header = array();
    $header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: "; // BROWSERS USUALLY LEAVE THIS BLANK

    // SET THE CURL OPTIONS - SEE http://php.net/manual/en/function.curl-setopt.php
    curl_setopt( $curl, CURLOPT_URL,            $url  );
    curl_setopt( $curl, CURLOPT_USERAGENT,      'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'  ); // HISTORY
    curl_setopt( $curl, CURLOPT_USERAGENT,      'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0'  );
    curl_setopt( $curl, CURLOPT_HTTPHEADER,     $header  );
    curl_setopt( $curl, CURLOPT_REFERER,        'http://www.google.com'  );
    curl_setopt( $curl, CURLOPT_ENCODING,       'gzip,deflate'  );
    curl_setopt( $curl, CURLOPT_AUTOREFERER,    TRUE  );
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, TRUE  );
    curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, TRUE  );
    curl_setopt( $curl, CURLOPT_TIMEOUT,        $timeout  );

    // THIS SEEMS TO LET IT WORK WITH HTTPS SITES (NOTE: IT DISABLES CERTIFICATE VERIFICATION)
    curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, FALSE );

    // RUN THE CURL REQUEST AND GET THE RESULTS
    $htm = curl_exec($curl);

    // ON FAILURE HANDLE ERROR MESSAGE
    if ($htm === FALSE)
    {
        if ($error_report)
        {
            $err = curl_errno($curl);
            $inf = curl_getinfo($curl);
            echo "CURL FAIL: $url TIMEOUT=$timeout, CURL_ERRNO=$err";
            var_dump($inf);
        }
        curl_close($curl);
        return FALSE;
    }

    // ON SUCCESS RETURN XML / HTML STRING
    curl_close($curl);
    return $htm;
}
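
The argument-string step in my_curl() above could also be handled by PHP's built-in http_build_query(). This is a minimal sketch, not part of the posted script: the helper name build_url() is hypothetical, and the parameter names and values are placeholders rather than the real service's arguments.

```php
<?php
// Hypothetical helper (not part of the script above): append a query
// string to a base URL using PHP's built-in http_build_query().
function build_url($base, array $params = array())
{
    if (empty($params)) {
        return $base;
    }
    // Explicit '&' separator so the result does not depend on php.ini
    return $base . '?' . http_build_query($params, '', '&');
}

// Placeholder parameter names and values, for illustration only
echo build_url(
    'http://service_domain.com/service.asmx/getproperty',
    array('username' => 'demo', 'note' => 'a b&c')
), PHP_EOL;
```

The resulting string could be passed to my_curl() as $url with an empty $get_array, since the values are already URL-encoded.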

 

Author Comment

by:Schuyler Kuhl
ID: 39783875
Thank you.  No, there isn't a question mark at the end of the URL. I was just putting in a sample one.  It actually looks more like this:

http://exampledomain.com/service/servicepage.asmx/GetProperty?username=1234&password=12345&Account=123456


And it works in a browser to show the XML data.  I agree that the service API might have changed or something could be going on with it.  I have requested help from them, but I am thinking it might not be that either.

I think you are saying to use a different method to handle the data.  This has been working well for a while so I think I should try to figure out what is going on with it first.

Thank you.
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 200 total points
ID: 39783885
The only 3 reasons I have found so far for this error were:
* the SimpleXML extension was no longer enabled in php.ini
* the XML returned contained some special/accented characters, and the data was not coming in the expected character set:
http://www.w3schools.com/xml/xml_encoding.asp
* the XML data contained some special "XML" characters, which need either to be encoded or to be put into a CDATA section:
http://www.w3schools.com/xml/xml_cdata.asp

hope this helps
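
The three causes listed above can be checked from PHP itself. The sketch below is only an illustration: the function name diagnose_xml() and the sample XML string are made up for this example. It reports whether the SimpleXML extension is loaded and collects libxml's parse errors, which is where encoding problems and unescaped special characters tend to show up.

```php
<?php
// Hypothetical diagnostic helper: report whether SimpleXML is loaded and
// collect libxml parse errors for an XML string instead of emitting warnings.
function diagnose_xml($xml)
{
    $report = array(
        'simplexml_loaded' => extension_loaded('simplexml'),
        'parsed'           => false,
        'errors'           => array(),
    );
    if (!$report['simplexml_loaded']) {
        return $report; // nothing more can be tested without the extension
    }

    // Capture parse errors internally so they can be inspected afterwards
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();
    $doc = simplexml_load_string($xml);
    foreach (libxml_get_errors() as $error) {
        $report['errors'][] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $report['parsed'] = ($doc !== false);
    return $report;
}

// Made-up sample: an unescaped ampersand is one "special XML character" case
print_r(diagnose_xml('<property><name>Bed & Breakfast</name></property>'));
```

Feeding the raw response body from the service into a function like this would show whether the document itself is the problem, independent of how it was fetched.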

 

Author Comment

by:Schuyler Kuhl
ID: 39783892
Thank you. I will check these things.
 

Author Comment

by:Schuyler Kuhl
ID: 39783894
I'm sorry to be an idiot but where is the php.ini file?

Also Ray, thank you. I am trying the script you posted now.  Thank you.
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39783906
php.ini is sometimes in the WWW root directory, but the location varies by host.  YMMV; you can find it by running phpinfo() and looking for the "Loaded Configuration File" entry in the output.
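
As an alternative to reading through the phpinfo() page, PHP can report the same information directly. This is a small sketch using only built-in functions; the output labels are arbitrary.

```php
<?php
// Path of the php.ini this PHP process actually loaded (false if none)
$ini = php_ini_loaded_file();
echo 'Loaded php.ini: ', ($ini === false ? '(none)' : $ini), PHP_EOL;

// Comma-separated list of additional .ini files that were scanned (or false)
$scanned = php_ini_scanned_files();
echo 'Additional .ini files: ', ($scanned === false ? '(none)' : trim($scanned)), PHP_EOL;

// Whether the SimpleXML extension is available to this process
$hasSimpleXml = extension_loaded('simplexml');
echo 'SimpleXML enabled: ', ($hasSimpleXml ? 'yes' : 'no'), PHP_EOL;
```

Run it through the web server rather than from the command line, since the two can load different php.ini files.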
 

Author Comment

by:Schuyler Kuhl
ID: 39783915
Ray,

When I put the script you posted into a page on the server and browse to it in IE, it initially shows a bunch of the data, but then IE stops responding, the page turns white, and the IE window crashes.

Does that tell me that there is a problem with the data?

Thanks very much.
 

Author Comment

by:Schuyler Kuhl
ID: 39784006
Actually, I take that back.  I stepped away for a few minutes and when I returned the page had fully loaded.  I tried it in Chrome and it loaded right away.

So is this telling me that I need to modify the method I use to get the data?

I'm not sure what this test tells me.  But I guess one thing it tells me is that SimpleXML is enabled.  Is this true?
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39784112
"initially shows a bunch of the data. but then ie stops responding"
The script I posted should not do that unless your internet pipe is very, very busy.  It will read the XML document on line 10 (the my_curl() call) or fail within 3 seconds, and it sends the data in a burst on line 12 (the var_dump() call).  So the hesitation may be a server-traffic related issue.

You're probably on firm ground with SimpleXML if the script gives you the var_dump() output on line 12.

There could be a lot of reasons why simplexml_load_xxx() has a hard time using the remote resource.  Perhaps they decided that they want to limit automated access to their data.   Or even if they didn't decide that, they made a change that checks for a browser.  PHP's remote file access does not provide a browser signature, but cURL can do that.  So whenever we run into a problem with, for example, file_get_contents(), I just recommend switching over to cURL.  It usually provides a quick and enduring fix.
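
If switching to cURL is not an option, the browser-signature idea can also be tried with simplexml_load_file() itself by giving libxml a stream context. This is a sketch under assumptions: the helper name browser_like_context() and the default User-Agent string are made up, and nothing in this thread confirms that the remote service actually checks the User-Agent.

```php
<?php
// Hypothetical helper: build an HTTP stream context that sends a
// User-Agent header. The default agent string is only an illustration.
function browser_like_context($userAgent = 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)', $timeout = 3)
{
    return stream_context_create(array(
        'http' => array(
            'user_agent' => $userAgent, // sent as the User-Agent request header
            'timeout'    => $timeout,   // read timeout in seconds
        ),
    ));
}

// Make libxml-based loaders, including simplexml_load_file(), use the context
libxml_set_streams_context(browser_like_context());

// The sample URL from the question; uncomment to try it against a real service
// $xml = simplexml_load_file('http://service_domain.com/service.asmx/getproperty');
```

cURL remains the more flexible choice when extra headers, redirects, or detailed error information are needed.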
 

Author Comment

by:Schuyler Kuhl
ID: 39784366
Ray and Guy,

Thank you both very much for your help.  I haven't totally resolved it yet but at least I have the proper data in my database that is live and people have stopped freaking out.

Ray, I used your script.  Thank you very much.  I don't really understand it but that is ok for now.  Guy, I believe that what you wrote is also correct.  I believe that the problem is the second or third possibility you mentioned.

What I learned is this.  My original page worked in this way: it would check for the new data on a regular basis, then truncate the existing table and add the new data to the table.  What I realized after a while was that every time that page ran there would end up being 55 rows in the table out of 423 rows that were received from the source.

I realized that after each row was inserted I had this:

if (!$result) {
    $message  = 'Invalid query: ' . mysql_error() . "\n";
    $message .= 'Whole query: ' . $sql;
    die($message);
}
So what was happening was that on the 56th row something was causing the insert to fail and the page was showing the page not found error.

So anyway, I guess there are 19 rows with a problem because I can see that 423 rows were received and only 404 ended up in the database on the website.
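
One way to keep a single bad row from stopping the whole import is to record the failure and carry on instead of calling die(). The sketch below is hypothetical and not the code used on the site: the function name import_rows(), the row layout, and the table and column names in the commented PDO example are all assumptions.

```php
<?php
// Hypothetical helper: try to insert every row, collecting failures
// instead of stopping at the first one.
function import_rows(array $rows, callable $insert)
{
    $inserted = 0;
    $failed   = array();
    foreach ($rows as $index => $row) {
        try {
            $insert($row);
            $inserted++;
        } catch (Exception $e) {
            // Remember which row failed and why, then continue with the next
            $failed[$index] = $e->getMessage();
        }
    }
    return array('inserted' => $inserted, 'failed' => $failed);
}

// Possible use with PDO prepared statements (assumed table and column names),
// with PDO::ATTR_ERRMODE set to PDO::ERRMODE_EXCEPTION so failures throw:
// $stmt   = $pdo->prepare('INSERT INTO properties (name) VALUES (:name)');
// $result = import_rows($rows, function ($row) use ($stmt) {
//     $stmt->execute(array(':name' => $row['name']));
// });
```

Prepared statements also handle quoting of values such as apostrophes, which is a common reason for an individual INSERT to fail when the query is built by string concatenation. The 'failed' list shows exactly which rows need attention.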

Very stressful.  Again thank you very much for your help.

Sky
 

Author Closing Comment

by:Schuyler Kuhl
ID: 39784369
Thank you very much for your invaluable help!
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39784409
Glad to help.  Thanks for the points and thanks for using EE, ~Ray