  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 407

php validate URL

Hi E's,
I need to validate a URL in PHP.
I searched the Internet for a solution and could not find a script that works perfectly.
The best one I found was this code:
<?php
$url = "http://www.example.com/index2.php";
if (!preg_match("#^http://www\.[a-z0-9-_.]+\.[a-z]{2,4}$#i", $url)) {
    echo "wrong url";
} else {
    echo "ok";
}
?>


The code above works only partially: domain names such as http://www.example.com validate, but http://example.com (without www) does not!
It also fails to validate URLs like these:
http://www.example.com/index.php
http://www.example.com/friendlyurl/

Any ideas for improving the regular expression, or another way to validate the URL?

Best regards, JC
Asked by: Pedro Chagas
 
Ray Paseur Commented:
Why not just try to read from the URL?  If you don't get a 200 OK response, it's not a valid URL.

If you want to see how professionals would approach this problem, please read this article.  It shows how to deconstruct the problem, think about the solutions and create a test plan that will allow the greatest chance of rapid and dependable success.
http://www.experts-exchange.com/Programming/Languages/Scripting/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html
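As a rough illustration of the test-plan idea (not taken from the linked article), the URLs from the question can be written down as a table of inputs and expected results before any pattern is tuned. The helper name `failingCases` and the stand-in validator are hypothetical; swap in whichever validator is under test.

```php
<?php
// Hypothetical helper: run a validator over a table of
// url => expected result and return the URLs it gets wrong.
function failingCases(callable $validator, array $cases)
{
    $failures = [];
    foreach ($cases as $url => $expected) {
        if ((bool) $validator($url) !== $expected) {
            $failures[] = $url;
        }
    }
    return $failures;
}

// Expectations taken from the question; the last entry is an
// obviously invalid string added for illustration.
$cases = [
    'http://www.example.com'              => true,
    'http://example.com'                  => true,
    'http://www.example.com/index.php'    => true,
    'http://www.example.com/friendlyurl/' => true,
    'not a url'                           => false,
];

// Stand-in validator (an assumption for this sketch) that insists on
// the "http://www." prefix, mimicking the weakness described in the question.
$naive = function ($url) {
    return strpos($url, 'http://www.') === 0;
};

foreach (failingCases($naive, $cases) as $url) {
    echo "unexpected result for: $url\n";
}
```

With the stand-in validator, the only URL reported is http://example.com, which is the "without www" problem from the question. Each new pattern can be dropped into the same table to see at a glance what it still gets wrong.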
 
Gary Commented:
This seems to work, though I didn't do much testing:

<?php

function checkurl($url){

return(preg_match("#^(https?://)?[^/.]+(\.[^/.]+)+/?$#i",$url));
}

echo "http://www.example.com: " . checkurl("http://www.example.com")."<br>";
echo "http://www.example.com/index.php: " . checkurl("http://www.example.com/index.php")."<br>";
echo "http://www.example.com/friendlyurl/: " . checkurl("http://www.example.com/friendlyurl/")."<br>";
echo "http://example.com: " . checkurl("http://example.com")."<br>";

 
Pedro Chagas (Webmaster, Author) Commented:
Hi @Gary,
I added this line to your code:
echo "http://example: " . checkurl("http://example.com")."<br>";
and the output is:
http://www.example.com: 1
http://www.example.com/index.php: 0
http://www.example.com/friendlyurl/: 0
http://example.com: 1
http://example: 1
Result 5 should be 0, and results 2 and 3 should be 1.
Can you improve the regular expression?

Hi @Ray: I will read the article!

~JC

 
Gary Commented:
Yes, but the URL you are passing has .com; if you remove that, it doesn't pass.
 
Gary Commented:
Maybe I misunderstood; I thought you only wanted the domain and nothing else.
 
Gary Commented:
I changed the pattern; the first attempt allowed bad characters. Who'd have thought a URL would be so hard to validate?

return(preg_match("#^(https?://)?([\da-zA-Z\.-]+)\.([a-z\.]{2,6})([\da-zA-Z/\.-]*)*/?$#i",$url));
 
Pedro Chagas (Webmaster, Author) Commented:
Hi @Gary,
Could you improve your solution so that it also accepts GET variables in the URL, like this one:
http://example.com/some.php?hhh=10&dddd=20: 0
For the line above the result is 0, so it does not validate.

Thanks.

~JC
 
Gary Commented:
return(preg_match("#^(https?://)?([\da-zA-Z_\.-]+)\.([a-z\.])([\da-zA-Z/\.-\\?]*)*/?$#i",$url));


There are a couple of provisos after double-checking what is and isn't allowed:
Underscores are allowed in the host name - this isn't accounted for and isn't likely to be a problem anyway - I've yet to see someone use one in a hostname.
The domain extension is a b*tch to validate, as there are so many variations that it would make the regex pretty complex to make sure it is correct (if that is even possible).

Also, where I have {2,6} - it is wrong. I completely forgot about all the new extensions like .photography until just now - I think it's probably better to remove this.
 
Gary Commented:
Check this:

return(preg_match("#^(https?://)?([\da-zA-Z_\.-]+)\.([a-z\.])([\d\w/\.=\\?]*)*/?$#i",$url));
 
Terry Woods (IT Guru) Commented:
Some minor changes to Gary's latest pattern. There's no need for the /? at the end, and the . characters between the [] brackets don't need escaping. The * after the last group is also redundant.
 return(preg_match("#^(https?://)?([\da-zA-Z_.-]+)\.([a-z.])([\d\w/.=\\?]*)$#i",$url)); 

No points, thanks...
 
Pedro Chagas (Webmaster, Author) Commented:
I forgot that those kinds of domains exist. Domains can now be lots of things.
For example, I tested this URL, based on the new or future domains:
http://example.games/jjj.php?uu=kjkjk
and it validated fine.
But for example:
http://example.c/jjj.php?uu=kjkjk

Here the domain is ".c" and the script validates it; I do not know whether that is good or bad. Is it possible for a domain to have only one character?
If not, can you please improve the regular expression so that it does not accept one-character domains?

~JC
 
Gary Commented:
Add it back in, but use {2}.
There are no extensions shorter than 2 characters.
 
Gary Commented:
So, using Terry's corrections:

 return(preg_match("#^(https?://)?([\da-zA-Z_.-]+)\.([a-z.]{2})([\d\w/.=\\?]*)$#i",$url));
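A quick way to see what this pattern accepts is to run it over the URLs mentioned in the thread. In this sketch `checkFinal` wraps the pattern above unchanged, while `checkWithAmp` is a hypothetical variant (not posted by anyone here) that only adds `&` to the last character class. By my reading, the pattern as posted has no character class that can match `&`, so the two-parameter URL from earlier still returns 0.

```php
<?php
// The pattern from the post above, unchanged.
function checkFinal($url)
{
    return preg_match("#^(https?://)?([\da-zA-Z_.-]+)\.([a-z.]{2})([\d\w/.=\\?]*)$#i", $url);
}

// Hypothetical variant: identical except that "&" is allowed in the
// path/query part, so several GET variables can be chained.
function checkWithAmp($url)
{
    return preg_match("#^(https?://)?([\da-zA-Z_.-]+)\.([a-z.]{2})([\d\w/.=&\\?]*)$#i", $url);
}

// URLs collected from the posts in this thread.
$urls = [
    'http://www.example.com',
    'http://www.example.com/index.php',
    'http://www.example.com/friendlyurl/',
    'http://example.com',
    'http://example',
    'http://example.com/some.php?hhh=10&dddd=20',
    'http://example.games/jjj.php?uu=kjkjk',
    'http://example.c/jjj.php?uu=kjkjk',
];

// Print "url: <posted pattern> / <variant with &>" for each URL.
foreach ($urls as $url) {
    echo $url . ': ' . checkFinal($url) . ' / ' . checkWithAmp($url) . "\n";
}
```

The two functions should differ only on the URL that contains `&`. Note that `{2}` matches exactly two characters; longer extensions such as .games still pass only because the remaining letters are picked up by the last group.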
 
Gary Commented:
I think with anything more complicated than this you are just blowing in the wind - there are too many variables in play to be certain of 100% validation, plus I'm heading to the bar!
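Given how many cases a single pattern has to cover, one alternative that nobody in this thread posted is to let PHP's built-in `filter_var()` with `FILTER_VALIDATE_URL` do the syntax check and then inspect the pieces returned by `parse_url()`. The function name `looksLikeWebUrl` and the rule "the last host label must be at least two letters" are assumptions made for this sketch; that rule deliberately rejects IP addresses, single-label hosts and punycode extensions, which may or may not be what you want.

```php
<?php
// Sketch only: syntax check via the filter extension, then a few
// extra rules on scheme and host. The extra rules are assumptions.
function looksLikeWebUrl($url)
{
    // FILTER_VALIDATE_URL requires at least a scheme and a host.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    $scheme = parse_url($url, PHP_URL_SCHEME);
    $host   = parse_url($url, PHP_URL_HOST);
    if (!is_string($scheme) || !is_string($host)) {
        return false;
    }

    // Only accept web URLs.
    if (!in_array(strtolower($scheme), ['http', 'https'], true)) {
        return false;
    }

    // Host must end in a dot followed by at least two letters.
    return preg_match('/\.[a-z]{2,}$/i', $host) === 1;
}

var_dump(looksLikeWebUrl('http://example.com/some.php?hhh=10&dddd=20'));
var_dump(looksLikeWebUrl('http://example.c/jjj.php?uu=kjkjk'));
```

Unlike the patterns above, this sketch requires the scheme, so a bare "example.com" is rejected.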
 
Ray Paseur Commented:
... plus I'm heading to the bar!
Yes.  Whenever the problem can only be solved with REGEX, the truth is to be found here: http://xkcd.com/1171/
 
Pedro Chagas (Webmaster, Author) Commented:
Based on @Ray's idea, here is another way to check the URL:
<?php
$file = 'http://stackoverflow.com/questions/2280394/how-can-i-check-if-a-url-exists-via-php';
// get_headers() returns false on failure, so test for that before indexing.
$file_headers = @get_headers($file);
if ($file_headers === false || strpos($file_headers[0], ' 404') !== false) {
    echo "not found";
}
else {
    echo "found";
}
?>
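The check above looks only at the first status line and only for a 404. A slightly more defensive sketch pulls the numeric status code out of the header lines and treats only 2xx as found. The function names `statusCodeFromLine` and `urlResponds` are made up for this example, and note that this tests whether a URL is reachable right now, which is a different question from whether it is well-formed.

```php
<?php
// Extract the numeric code from a status line such as
// "HTTP/1.1 404 Not Found"; returns null for any other header line.
function statusCodeFromLine($line)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Returns true when the last status line received is a 2xx.
// get_headers() follows redirects by default, so there may be
// several status lines in the returned array.
function urlResponds($url)
{
    $headers = @get_headers($url);   // false on failure
    if ($headers === false) {
        return false;
    }

    $code = null;
    foreach ($headers as $line) {
        $parsed = statusCodeFromLine($line);
        if ($parsed !== null) {
            $code = $parsed;         // keep the last status line seen
        }
    }
    return $code !== null && $code >= 200 && $code < 300;
}
```

Calling `urlResponds()` makes a real network request, so it is slower than a pattern check and depends on the remote server being up.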
