PHP Validate URL Field. Make sure it contains a certain website URL

I have this validation setup:

if (!preg_match("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i",$website))
  {
  $websiteErr = "Invalid URL";
  }



How do I make sure that the URL entered contains, let's say, "http://facebook.com/"?
jporter80 Asked:
 
Ray Paseur Commented:
Regular Expressions always throw everybody for a loop!  I find them a little easier to read and understand if I deconstruct them into the components.  See how $regex is built up piece by piece in the script below.

But even with that: http://xkcd.com/1171/

<?php // RAY_temp_jporter80.php
error_reporting(E_ALL);
echo "<pre>";


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28348079.html
// REF http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html


/* DESCRIPTION OF THE PROBLEM
Okay Inputs:
http://exactdomain.com/someother stuff
https://exactdomain.com/somother stuff
https://anysubdomain.exactdomain.com/somother stuff
http://anysubdomain.exactdomain.com/somother stuff

Not okay
exactdomain.com/someother stuff
anysubdomain.exactdomain.com/someotherstuff
*/


// TEST DATA IS AN ARRAY OF INDIVIDUAL TEST ARRAYS
// KEY = EXPECTED, VALUE = TEST-DATA
$targets
= array

// KNOWN BAD SHOULD NOT MATCH
(  array( "" => "exactdomain.com/someother stuff"
), array( "" => "anysubdomain.exactdomain.com/someotherstuff"

// KNOWN GOOD SHOULD MATCH
), array( "http://exactdomain.com/someother"              => "http://exactdomain.com/someother stuff"
), array( "https://exactdomain.com/somother"              => "https://exactdomain.com/somother stuff"
), array( "https://anysubdomain.exactdomain.com/somother" => "https://anysubdomain.exactdomain.com/somother stuff"
), array( "http://anysubdomain.exactdomain.com/somother"  => "http://anysubdomain.exactdomain.com/somother stuff"
)
)
;


// A SIGNAL STRING PREPARED FOR USE IN A REGULAR EXPRESSION
$signal = preg_quote('exactdomain.com');


// A REGEX THAT FINDS URLS AND DOMAIN SUBSTRINGS
$regex
= '#'         // REGEX DELIMITER

. '\b'        // ON WORD BOUNDARY

. '('         // START GROUP
. 'https?'    // HTTP OR HTTPS
. ')'         // END GROUP
. '{1}'       // EXACTLY ONE OF THIS GROUP

. '('         // START GROUP
. '://'       // COLON, SLASH, SLASH
. ')'         // END GROUP
. '{1}'       // EXACTLY ONE OF THIS GROUP

. '('         // START GROUP
. '[A-Z0-9]'  // A SUBDOMAIN
. '+?'        // INDETERMINATE LENGTH
. '\.'        // A DOT (ESCAPED)
. ')'         // END GROUP
. '??'        // ZERO OR ONE OF THIS GROUP, UNGREEDY

. '('         // START GROUP
. $signal     // THE DOMAIN WE WANT TO FIND
. ')'         // END GROUP

. '('         // GROUP
. '/{1}'      // ONE SLASH
. '.*?'       // ANY CHARACTERS, AS FEW AS POSSIBLE (UNGREEDY)
. ')'         // END GROUP

. '('         // GROUP
. '\S'        // NOT WHITESPACE
. '{0,}'      // URL PATH OR NOTHING
. ')'         // END GROUP

. '#'         // REGEX DELIMITER
. 'i'         // CASE-INSENSITIVE
;

// TEST THE DATA STRINGS IN THE SUB-ARRAYS
foreach ($targets as $arr)
{
    foreach ($arr as $expected => $target)
    {
        preg_match_all($regex, $target, $match);

        // FIRST MATCH, OR AN EMPTY STRING IF NOTHING MATCHED
        $matched = isset($match[0][0]) ? $match[0][0] : "";

        // NO OUTPUT IF THE TEST WORKED AS EXPECTED
        if ($matched === $expected) continue;

        // EXPOSITION IF THE TEST DID NOT WORK AS EXPECTED
        echo PHP_EOL;
        echo "<b>EXPECT:</b> $expected";
        echo PHP_EOL;
        echo "<b>INPUTS:</b> $target";
        echo PHP_EOL;
        echo "<b>REGEXP:</b> $regex";
        echo PHP_EOL;
        echo "<b>OUTPUT:</b> ";
        print_r($match[0]);
        echo PHP_EOL;
    }
}
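
For everyday use, the same rule can be condensed into a single pattern. This is only a minimal sketch, not a replacement for the deconstructed version above. The function name isAllowedUrl is hypothetical, and a few choices are assumptions that go beyond the original: the pattern is anchored to the start of the string, it allows hyphens and several levels in the subdomain, and it accepts the bare domain with no trailing slash.

```php
<?php
// Hypothetical helper: TRUE when $url starts with http:// or https://
// followed by $domain itself or by any subdomain of $domain.
function isAllowedUrl($url, $domain)
{
    // Escape the dots in the domain (and the # delimiter, just in case)
    $signal = preg_quote($domain, '#');

    // Scheme, optional subdomain labels, the domain, then a slash or end of string
    $regex = '#^https?://(?:[a-z0-9-]+\.)*' . $signal . '(?:/|$)#i';

    return preg_match($regex, $url) === 1;
}

var_dump(isAllowedUrl('http://exactdomain.com/someother stuff', 'exactdomain.com'));        // bool(true)
var_dump(isAllowedUrl('https://anysubdomain.exactdomain.com/somother', 'exactdomain.com')); // bool(true)
var_dump(isAllowedUrl('exactdomain.com/someother stuff', 'exactdomain.com'));               // bool(false)
var_dump(isAllowedUrl('http://notexactdomain.com/', 'exactdomain.com'));                    // bool(false)
```

Requiring a slash or the end of the string right after the domain is what keeps look-alike hosts such as exactdomain.com.example.org from passing.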


 
gr8gonzo (Consultant) Commented:
Use strips after you validate the URL format:

if (strpos (strtolower ($url), "facebook.com") !== false)
{
// URL contains facebook.com
}
If you want to make sure it's in the domain part of the URL, then use parse_url first.
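
A minimal sketch of that parse_url idea, assuming the goal is to accept the domain itself or any subdomain of it. The function name hostMatchesDomain is hypothetical and not from this thread. Isolating the host first matters because a plain strpos on the whole URL would also accept something like http://example.com/?ref=facebook.com.

```php
<?php
// Hypothetical helper: TRUE when the host part of $url is $domain
// or a subdomain of $domain.
function hostMatchesDomain($url, $domain)
{
    // NULL when the URL has no host, FALSE when it is seriously malformed
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) return false;

    $host   = strtolower($host);
    $domain = strtolower($domain);
    $suffix = '.' . $domain;

    // Exact match, or the host ends with ".domain"
    return $host === $domain
        || substr($host, -strlen($suffix)) === $suffix;
}

var_dump(hostMatchesDomain('http://facebook.com/somepage', 'facebook.com'));         // bool(true)
var_dump(hostMatchesDomain('https://www.facebook.com/', 'facebook.com'));            // bool(true)
var_dump(hostMatchesDomain('http://example.com/?ref=facebook.com', 'facebook.com')); // bool(false)
var_dump(hostMatchesDomain('facebook.com/somepage', 'facebook.com'));                // bool(false), no scheme so no host
```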
 
gr8gonzo (Consultant) Commented:
Sorry, auto correct on my phone changed strpos to strips.
 
Ray Paseur Commented:
http://www.damnyouautocorrect.com/

You can do several things depending on what you want to accomplish.  The meaning of "validate" could be pretty broad.  Do you care if there is a valid URL that does not have a corresponding resource on HTTP or FTP?  Do you care if the URL has the wrong protocol? For example, if you have http://facebook.com, it does not point to https://facebook.com and will get rewritten to https://www.facebook.com

You might find this article helpful:
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html
 
jporter80 (Author) Commented:
It should require http:// or https:// only.  I don't care whether it has a subdomain or not, but it must require that a certain domain is entered.

Help?
 
Ray Paseur Commented:
Do you understand the regular expression you posted?  If not, just tell us in plain language exactly what you want to find in the acceptable URLs and we can probably help from that.
 
jporter80 (Author) Commented:
No, not completely. Regular expressions always throw me for a loop.  Basically:

Okay Inputs:
http://exactdomain.com/someother stuff
https://exactdomain.com/somother stuff
https://anysubdomain.exactdomain.com/somother stuff
http://anysubdomain.exactdomain.com/somother stuff

Not okay
exactdomain.com/someother stuff
anysubdomain.exactdomain.com/someotherstuff
 
jporter80 (Author) Commented:
Thanks for the excellent education on regex. I definitely need to practice that; it can be so helpful.  This worked like a charm.
 
Ray Paseur Commented:
Thanks for the points and thanks for using EE, ~Ray
 
gr8gonzo (Consultant) Commented:
While I understand that regex might handle validation for you, you should really consider using parse_url if you want to examine different parts of a URL:

http://us3.php.net/parse_url

It -is- meant for that kind of thing.
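
As a rough sketch of what that could look like for the rules in the question (http or https required, subdomain optional, fixed domain), here is one way to combine the scheme and host pieces that parse_url returns. The function name isHttpUrlOnDomain is hypothetical, and this is only one possible reading of the requirements, not a definitive implementation.

```php
<?php
// Hypothetical helper: TRUE when $url uses http or https and its host is
// $domain or a subdomain of $domain.
function isHttpUrlOnDomain($url, $domain)
{
    // Array of components, or FALSE on a seriously malformed URL
    $parts = parse_url($url);

    // Inputs such as "exactdomain.com/page" have no scheme, so parse_url
    // reports only a path and the input is rejected here
    if (!is_array($parts) || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }

    // Only http and https are acceptable (not ftp, mailto and so on)
    if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
        return false;
    }

    // The host must be the domain itself or end with ".domain"
    $host   = strtolower($parts['host']);
    $domain = strtolower($domain);
    $suffix = '.' . $domain;
    return $host === $domain || substr($host, -strlen($suffix)) === $suffix;
}

// The six inputs from the question: the first four are accepted,
// the last two are rejected because they have no scheme
$inputs = array(
    'http://exactdomain.com/someother stuff',
    'https://exactdomain.com/somother stuff',
    'https://anysubdomain.exactdomain.com/somother stuff',
    'http://anysubdomain.exactdomain.com/somother stuff',
    'exactdomain.com/someother stuff',
    'anysubdomain.exactdomain.com/someotherstuff',
);
foreach ($inputs as $input) {
    echo (isHttpUrlOnDomain($input, 'exactdomain.com') ? 'OK      ' : 'NOT OK  ') . $input . PHP_EOL;
}
```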
Question has a verified solution.
