J N asked:

Detect if image exists

Hi,

I am working with some external data from Twitter, and occasionally it includes a link to an image in one of its array values.

If I try to display the image with an img tag, I sometimes get an empty field.

I am curious whether it is possible to test if the link points to an image file. I have tried using file_get_contents(); it works, but the page loads very slowly.

I am wondering if I could use is_file() or file_exists() with a URL such as

https://pbs.twimg.com/profile_banners/334036628/1395969403

to determine whether the file exists, since that is not always the case.

Thanks in advance.
Julian Hansen:

There are some comments on the PHP manual page for file_exists() that might be of use:

http://www.php.net/manual/en/function.file-exists.php#75064
$file = 'http://www.domain.com/somefile.jpg';
$file_headers = @get_headers($file);   // @ silences the warning on unreachable hosts
if (!$file_headers || strpos($file_headers[0], '404') !== false) {
    $exists = false;   // no response, or the status line reports 404 Not Found
}
else {
    $exists = true;
}


And another comment at http://www.php.net/manual/en/function.file-exists.php#74469:
<?php
function url_exists($url) {
    // Note: the manual comment only tested curl_init(), which succeeds for
    // any well-formed URL; an actual request is needed to check existence.
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);   // headers only, no body
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return $code >= 200 && $code < 400;       // treat 2xx/3xx as "exists"
}
?>
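
A quick usage sketch for the helper above, using the banner URL from the question:

<?php
// Assumes url_exists() from the snippet above is defined.
$url = 'https://pbs.twimg.com/profile_banners/334036628/1395969403';
var_dump(url_exists($url));   // bool(true) if Twitter serves the banner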


J N (Asker):

Thanks

Which one would be faster?
Ray Paseur:

PHP's is_file() and file_exists() are designed to work on your own server, and even though the manual says "some protocols are supported," you are dependent on a foreign server for that support. file_get_contents() may be unreliable because (1) some servers will block it and (2) if it gets no response, it hangs your script until the timeout, and then you get a fatal error.

Your best bet is probably cURL.

I'll see if I can give you a script that will try to detect a file on a foreign server and will time the attempt.  Then you can experiment with it to choose a way that is (hopefully) both fast and reliable.
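
Ray's actual timing script is in the members-only solution below. As a rough sketch of the approach he describes, one might time get_headers() against a cURL HEAD-style request like this (the URL comes from the question; the timeout value and the 200-status checks are assumptions):

<?php
// Sketch: time two ways of checking whether a remote file exists.
$url = 'https://pbs.twimg.com/profile_banners/334036628/1395969403';

// Method 1: get_headers()
$t0 = microtime(true);
$headers = @get_headers($url);
$viaHeaders = $headers && strpos($headers[0], '200') !== false;
$t1 = microtime(true);

// Method 2: cURL HEAD-style request
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);   // headers only, no body
curl_setopt($ch, CURLOPT_TIMEOUT, 3);     // don't hang on a slow server
curl_exec($ch);
$viaCurl = curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200;
curl_close($ch);
$t2 = microtime(true);

printf("get_headers: %s in %.1f ms\n", $viaHeaders ? 'found' : 'missing', ($t1 - $t0) * 1000);
printf("cURL:        %s in %.1f ms\n", $viaCurl ? 'found' : 'missing', ($t2 - $t1) * 1000);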
SOLUTION by Ray Paseur (United States)

[This solution is only available to Experts Exchange members.]

ASKER CERTIFIED SOLUTION

[This solution is only available to Experts Exchange members.]
Ray Paseur:

Key to the question is probably "discernible difference." The time required to check the URLs is interesting, but it is not under your control: it depends on the response from a remote server. For all we know, that server could respond quickly and consistently, be erratic and sometimes very slow, or cache its responses, in which case every iteration after the first dilutes the results you would see in a real-world scenario. That appears to be the case when I tested your script: the first access with get_headers() took almost five times longer than subsequent accesses. The actual elapsed times are measured in milliseconds, not fractions of milliseconds.

The Stopwatch class is just my easy way of getting a measurement. It is not really an integral part of the solution, just a sidebar to answer the question of "which is faster." Given the test case, cURL seems faster, but again, that is not really under our author's control.
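
The Stopwatch class itself is part of the members-only post above; a hypothetical minimal version of the same idea, built on microtime(), might look like:

<?php
// Hypothetical minimal Stopwatch (not the class from the hidden solution):
// measures elapsed wall-clock time in milliseconds.
class Stopwatch
{
    private $start;

    public function __construct() {
        $this->start = microtime(true);   // start timing on construction
    }

    public function reset() {
        $this->start = microtime(true);
    }

    // Milliseconds elapsed since construction or the last reset()
    public function elapsedMs() {
        return (microtime(true) - $this->start) * 1000.0;
    }
}

// Usage: time a single get_headers() call
$sw = new Stopwatch();
@get_headers('https://pbs.twimg.com/profile_banners/334036628/1395969403');
printf("get_headers took %.1f ms\n", $sw->elapsedMs());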
J N (Asker):

WOW!! Thanks for the input!

I'm not sure where I should take this. I am using it to access photos from Twitter via the Abraham library. Unfortunately, the information obtained from the library classes sometimes does not contain the full data, and I need to attach info myself; that is the reason I want to test whether there is something there. I believe Twitter has pretty stable servers.

Knowing that information, which method should I select?

Thanks, guys.
J N (Asker):

Additionally, I am using the function to test about 200 photos.

Ray Paseur:

"I believe twitter has pretty stable servers."
Ha! Twitter crashes and hangs all the time. Let's do the arithmetic:

200 photos * 95 milliseconds per check (for not-found; less for found) = 19,000 ms = 19 seconds. If you use cURL to read the image files, you can cache them on your own server and be done with it in less than a minute.
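
A rough sketch of the cache-on-your-own-server idea Ray mentions (the helper name, cache directory, and timeout are all hypothetical):

<?php
// Hypothetical sketch: fetch each image once with cURL and cache it locally,
// so later page loads never have to wait on Twitter at all.
function fetch_and_cache($url, $cacheDir = 'cache') {
    if (!is_dir($cacheDir)) mkdir($cacheDir, 0755, true);
    $path = $cacheDir . '/' . md5($url);   // stable local filename per URL

    if (is_file($path)) return $path;      // already cached: no network hit

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);  // don't hang on a slow server
    $data = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($code != 200 || $data === false) return false;   // image not there

    file_put_contents($path, $data);
    return $path;
}

// Over ~200 photos, only images not yet cached cost a network round trip.
$local = fetch_and_cache('https://pbs.twimg.com/profile_banners/334036628/1395969403');
echo $local ? "cached at $local\n" : "image not found\n";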
J N (Asker):

Hi,

I already did the math to get 19 seconds, which is FAR better than what I have right now. However, I'm curious how that compares to the other methods mentioned above.
Ray Paseur:

Try them and see!
J N (Asker):

THANKS GUYS BIG HELP!!!
Julian Hansen:

In my opinion it makes no difference. I would go with the get_headers() code because it is fewer lines of code, and I prefer less code, but there is nothing wrong with the cURL solution.

If you can't decide, toss a coin.