PHP Regular Expression

christamcc asked
Last Modified: 2012-08-15
Hello,
Can someone code the regular expression for these field restrictions:

Must be between 4 and 20 characters long and should contain only letters, numbers, and the underscore symbol.

Thanks!

preg_match('/^[\w\d_]{4,20}$/',$field);


Most Valuable Expert 2011
Author of the Year 2014

Commented:
Case-insensitive match.  Remove the "i" if you want only uppercase.
preg_match('/^[A-Z0-9_]{4,20}$/i',$field);


HTH, ~Ray
Ray's is the one you'll want to use.  Used \w in my haste which will match spaces, tabs, etc.
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2015

Commented:
Quoting the previous comment: Used \w in my haste which will match spaces, tabs, etc.
No, "\w" matches [a-zA-Z0-9_]. "\s" matches spaces, tabs, and other whitespace  = )
This is what I get for answering a regex question when I haven't used them in a while :(

preg_match('/^[\w]{4,20}$/',$field);



Does appear to work as expected when testing.
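
The \w versus \s distinction discussed above can be checked with a small test. This is only a sketch: the sample characters and the variable names are made up for illustration, and it assumes plain ASCII input with no "u" modifier.

```php
<?php
// Illustrative sample characters (not from the original question).
$samples = array('a', 'Z', '7', '_', ' ', "\t", '-', '.');

$wordMatches  = array();   // characters accepted by \w
$spaceMatches = array();   // characters accepted by \s

foreach ($samples as $ch) {
    // Anchor the pattern so exactly one character is tested at a time.
    if (preg_match('/^\w$/', $ch)) {
        $wordMatches[] = $ch;
    }
    if (preg_match('/^\s$/', $ch)) {
        $spaceMatches[] = $ch;
    }
}

echo 'Matched by \w: ', json_encode($wordMatches), PHP_EOL;
echo 'Matched by \s: ', json_encode($spaceMatches), PHP_EOL;
```

With these samples, \w should accept only the letters, the digit, and the underscore, while the space and the tab should be accepted only by \s.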
Most Valuable Expert 2011
Author of the Year 2014
Commented:
According to my notes, \w matches any "word" which would be an unbroken string of characters that are not punctuation or whitespace or other word boundaries matched by \b.  As a practical matter I like to write regular expressions (and test them) using the techniques described in this article.  Writing them one-line-at-a-time with comments helps me keep my thinking clear.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_7830-A-Quick-Tour-of-Test-Driven-Development.html

Following my own advice: http://www.laprbass.com/RAY_temp_christamcc.php
<?php // RAY_temp_christamcc.php
error_reporting(E_ALL);
echo '<pre>';


// THE REGULAR EXPRESSION
$rgx
= '/'           // REGEX DELIMITER
. '^'           // AT START OF STRING
. '['           // DEFINE A CHARACTER CLASS
. 'A-Z0-9_'     // CLASS INCLUDES LETTERS, NUMBERS, UNDERSCORE
. ']'           // END CHARACTER CLASS
. '{4,20}'      // MUST MATCH LENGTH FROM FOUR TO TWENTY
. '$'           // AT END OF STRING
. '/'           // REGEX DELIMITER
. 'i'           // CASE-INSENSITIVE
;

// THE TEST CASES (ADD YOUR OWN HERE)
$dats
= array
( 'HelloWorld'
, 'Four.Twenty'
, '1___5___10___15___20'
, 'Too short'
, 'a'
, 'Very_very_much_too_long'
, ' NeedsATrim'
)
;

// MAKE THE AUTOMATED TESTS AND REPORTS
foreach ($dats as $dat)
{
    echo "TESTING: $dat";

    if (preg_match($rgx, $dat))
    {
        echo ' MATCHED';
    }
    else
    {
        echo ' DID NOT MATCH';
    }
    echo PHP_EOL;
}


HTH, ~Ray
Terry Woods, Web Developer, specialising in WordPress
CERTIFIED EXPERT
Most Valuable Expert 2011

Commented:
My understanding is that in most regex flavours, \w is identical to [a-zA-Z0-9_]

My understanding of \b is that it essentially matches the (zero-length) gap between \w characters and non-\w characters (\W if you prefer), including the start or end of the string when the character there is a \w character, which is exactly equivalent to this:
(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))


It's unfortunate in a lot of situations that the ' character isn't included, because \b\w+\b fails to treat cases like "don't" and "O'Connor" as  single words.
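
One rough way to check the claimed equivalence is to collect the offsets of every zero-length match for both patterns and compare them. The helper name boundaryOffsets() and the sample sentence are hypothetical, chosen only for this sketch, and the comparison covers one ASCII subject rather than proving equivalence in general.

```php
<?php
// Hypothetical helper: return the byte offsets of every match of $pattern.
function boundaryOffsets($pattern, $subject)
{
    preg_match_all($pattern, $subject, $m, PREG_OFFSET_CAPTURE);
    $offsets = array();
    foreach ($m[0] as $hit) {
        $offsets[] = $hit[1];   // [0] is the matched text, [1] is its offset
    }
    return $offsets;
}

$subject = "I don't know O'Connor.";

// Offsets where \b matches, and where the lookaround version matches.
$short = boundaryOffsets('/\b/', $subject);
$long  = boundaryOffsets('/(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))/', $subject);

echo ($short === $long) ? 'Same boundaries' : 'Different boundaries', PHP_EOL;

// The apostrophe is not a \w character, so it splits these "words".
preg_match_all('/\b\w+\b/', "don't O'Connor", $m);
$words = $m[0];
echo json_encode($words), PHP_EOL;
```

The second part shows the apostrophe problem mentioned above: "don't" and "O'Connor" each come back as two separate pieces.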

Anyway, the most concise way to meet the author's requirements is:
preg_match('/^\w{4,20}$/',$field);


Author

Commented:
Thanks for your due diligence!
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2015

Commented:
Quoting Ray: According to my notes, \w matches any "word"
Ray, I agree with Terry: "\w" matches any single word character, not a whole word. You could get a "word" by using "\w+". See "Shorthand Character Classes" for more information.

The reality is that all three approaches (DustinKikuchi's, Ray's, and Terry's) are exactly equivalent. While I acknowledge that Ray took the extra time to document the regex, I argue that if one takes the time to understand exactly what "\w" means, then that extra documentation is unnecessary. (Admittedly, it is helpful for people unfamiliar with regex who come along after the original author.)
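
The equivalence can be spot-checked by running the three patterns from this thread over Ray's test cases. This is a sketch under assumptions: the array keys are just labels, and the agreement is only demonstrated for ASCII input without the "u" modifier (under a non-default locale or with "u", \w may accept more characters than A-Z0-9_).

```php
<?php
// The three patterns posted in this thread, keyed by an illustrative label.
$patterns = array(
    'DustinKikuchi' => '/^[\w\d_]{4,20}$/',
    'Ray'           => '/^[A-Z0-9_]{4,20}$/i',
    'Terry'         => '/^\w{4,20}$/',
);

// Ray's test cases from the script earlier in the thread.
$inputs = array(
    'HelloWorld',
    'Four.Twenty',
    '1___5___10___15___20',
    'Too short',
    'a',
    'Very_very_much_too_long',
    ' NeedsATrim',
);

$verdicts = array();   // input => did it match?
$allAgree = true;      // stays true while all three patterns agree

foreach ($inputs as $input) {
    $results = array();
    foreach ($patterns as $pattern) {
        $results[] = (preg_match($pattern, $input) === 1);
    }
    if (count(array_unique($results)) !== 1) {
        $allAgree = false;
    }
    $verdicts[$input] = $results[0];
    echo str_pad("'$input'", 28), $results[0] ? 'MATCHED' : 'DID NOT MATCH', PHP_EOL;
}

echo $allAgree ? 'All patterns agree' : 'Patterns disagree', PHP_EOL;
```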
Most Valuable Expert 2011
Author of the Year 2014

Commented:
@Kaufmed: That is a good link for the Shorthand Character Classes.  I forget about this site sometimes.  It deserves to be remembered!
http://www.regular-expressions.info/charclass.html

And you're right about \w versus \w+ with the latter matching more than one alpha-numeric.  But in a character class expression that also includes explicit lengths I did not see any difference in my tests.
