Solved

how do I write this regular expression?

Posted on 2008-10-10
Last Modified: 2010-04-21
I want to write a regular expression that is satisfied by zero or more instances of:

a string beginning with one alphanumeric character, followed by between 2 and 29 additional characters that can be alphanumeric characters, underscores, periods, or dashes, followed by one or more whitespace characters

so far I have

'^(^[[:alnum:]][a-z0-9_\.\-]{2,30}[[:space:]^ ]{0,}){0,255}$'

but even submitting

user2
user3

does not satisfy this. also, I cobbled this together from another regular expression and I'm not sure what the "^ " after [:space:] does. I assume this is not relevant to my current application?
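On the "^ " question, here is a minimal sketch, assuming PCRE (`preg_match`) rather than the `ereg` functions used in this thread. Inside a bracket expression, a `^` that is not the first character is a literal caret, so `[[:space:]^ ]` matches whitespace, a caret, or a space. The `^` inside the group is a different thing: it is an anchor, and without multi-line mode it can only match at the very start of the string, which would explain why a second username never matches. The variable names are made up for illustration.

```php
<?php
// Inside [...] a caret that is not the first character is a literal caret.
$class = '/^[[:space:]^ ]+$/';
echo preg_match($class, " ^\n"), "\n"; // 1: space, caret, newline are all in the class
echo preg_match($class, "a"), "\n";    // 0: a letter is not in the class

// The question's pattern, with and without the inner ^ anchor.
$anchored           = '/^(^[[:alnum:]][a-z0-9_.\-]{2,30}[[:space:]^ ]{0,}){0,255}$/';
$innerAnchorRemoved = '/^([[:alnum:]][a-z0-9_.\-]{2,30}[[:space:]^ ]{0,}){0,255}$/';

$input = "user2\nuser3";
echo preg_match($anchored, $input), "\n";           // 0: the group cannot repeat past offset 0
echo preg_match($innerAnchorRemoved, $input), "\n"; // 1: the group repeats for each username
```

Note that the second pattern still treats the separator as optional, so it is only meant to show the effect of the anchor, not to be a finished validator.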

Question by:bitt3n
16 Comments
 
LVL 13

Expert Comment

by:Xyptilon2
ID: 22693006
Try

^[A-Za-z0-9][a-zA-Z0-9_\.\-]{2,29} *$
 

Author Comment

by:bitt3n
ID: 22694033
hm, I am still having a few problems.. the string

user

doesn't parse with that expression, although it seems like it should. the string

user1

parses. also

user1
user2

doesn't parse. I tried changing that expression to

^[A-Za-z0-9][a-zA-Z0-9_\.\-]{2,29}[[:space:]]*$

since the elements are separated by new lines and not by spaces, but that didn't fix it.

the above strings all parse with expression

^([[:alnum:][:punct:]\.\'\-]{3,30}[[:space:]]{0,}){0,255}$

although that does not satisfy the requirement for the expression that it begin with an alphanumeric character and not punctuation.
 

Author Comment

by:bitt3n
ID: 22694057
actually fiddling around with it more it appears that

user

does parse when I use

eregi ('^[A-Za-z0-9][a-zA-Z0-9_\.\-]{2,29} *$', 'user')

and this also parses

if (eregi ('^[A-Za-z0-9][a-zA-Z0-9_\.\-]{2,29} *$', 'user1\r\nuser2'))

but if I submit

user

or

user1
user2

through a form, the form input does not parse for some reason. so sending it through the form is altering the string in some way that is preventing the parse. how do I change the expression to account for this?
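One thing worth checking here is the quoting of the test string. In PHP, a single-quoted `'user1\r\nuser2'` contains the literal characters backslash, r, backslash, n, not a line break, so the hard-coded test is not the same input the form sends. It may have matched only because POSIX bracket expressions treat a backslash as a literal character (an assumption about `eregi`, not verified here). The sketch below shows the difference and one way to normalise line endings before validating; the variable names are hypothetical.

```php
<?php
$single = 'user1\r\nuser2'; // backslash-r and backslash-n stay as literal characters
$double = "user1\r\nuser2"; // contains a real carriage return and line feed

echo strlen($single), "\n"; // 14
echo strlen($double), "\n"; // 12

// A textarea submits line breaks as CR LF, so normalising them to a single
// line feed before validation keeps the pattern simpler.
$normalised = str_replace(["\r\n", "\r"], "\n", $double);
echo strlen($normalised), "\n"; // 11
```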
 
LVL 13

Expert Comment

by:Xyptilon2
ID: 22694069
perhaps quotes are added, look at the magic_quotes settings in your php.ini file and compensate for it :)


 

Author Comment

by:bitt3n
ID: 22694087
magic quotes is active, however, I am not using any quotes. just to experiment I tried stripping slashes, and it still did not parse. also it seems like magic quotes would not be responsible for

user

parsing but not

user1

and not

user1
user2

unless I am mistaken
 

Author Comment

by:bitt3n
ID: 22694115
since

'^([[:alnum:][:punct:]\.\-]{3,30}[[:space:]]{0,}){0,255}$'

seems to almost give me what I want, I tried

'^(^[[:alnum:]][[:alnum:][:punct:]\.\-]{3,30}[[:space:]]{0,}){0,255}$'

which does seem to work for one entry, because it allows

user1

to parse but not

.user1

but for some reason it also prevents

user1
user2

from parsing and I have no idea why
 

Author Comment

by:bitt3n
ID: 22694130
ok I added parentheses around everything like this

'^(([[:alnum:]])([[:alnum:][:punct:]\.\-]){2,29}([[:space:]]){0,}){0,255}$'

and it seems to work now. why would this work and not

'^([[:alnum:]][[:alnum:][:punct:]\.\-]{3,29}[[:space:]]{0,}){0,255}$'
 

Author Comment

by:bitt3n
ID: 22694156
well

user1 user2

doesn't parse using my last attempt, so it still doesn't seem to work exactly as I would expect. seems like it should, since a space is one of the members of the [[:space:]] group, just like a line break...

Author Comment

by:bitt3n
ID: 22758440
any additional advice on why the regular expression listed in 22694130 isn't working as per 22694156 would be greatly appreciated, I have not been able to figure this out.
 
LVL 13

Expert Comment

by:Xyptilon2
ID: 22758695
This works for me. I just tested it:

^[:alnum:][a-zA-Z0-9\.\-_]{2,29}\s*$

Use RegexBuddy to help you out, useful program:
http://www.regexbuddy.com/
 
LVL 13

Accepted Solution

by:Xyptilon2 (earned 500 total points)
ID: 22758709
For 0 or more spaces in the end:
^[:alnum:][a-zA-Z0-9\.\-_]{2,29}\s*$

For 1 or more spaces in the end:
^[:alnum:][a-zA-Z0-9\.\-_]{2,29}\s+$
 

Author Comment

by:bitt3n
ID: 22759236
ok cool.. I had to put an extra set of brackets around [:alnum:] to get it to validate

^[[:alnum:]][a-zA-Z0-9\.\-_]{2,29}\s*$

and now it works for each username. so now all I have to do is make this validate for multiple usernames, ie not just for

username

but for

username
username

or

username username

I imagine for between 0 and 255 usernames, it's something like

[^[[:alnum:]][a-zA-Z0-9\.\-_]{2,29}\s*$]{0,255}

but that's not the right syntax. how can I make it work for multiple usernames?

thanks for the suggestion about regexbuddy -- I'll check it out
 

Author Comment

by:bitt3n
ID: 22759340
hm I think I figured it out, if I use round brackets

(^[[:alnum:]][a-zA-Z0-9\.\-_]{2,29}[[:space:]]*$){0,255}

it appears to work..
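For comparison, here is a sketch of a whole-list pattern that anchors only once at each end of the input, assuming PCRE (`preg_match`) instead of the `ereg` family. The function name `isValidUsernameList` is hypothetical. Each username must be followed by whitespace or by the end of the string, and the group may repeat 0 to 255 times.

```php
<?php
// Hypothetical helper: true if $input holds 0 to 255 whitespace-separated usernames.
function isValidUsernameList(string $input): bool
{
    // \A and \z anchor the whole input once; the group itself has no anchors.
    // Each username: one alphanumeric, then 2 to 29 of [A-Za-z0-9_.-],
    // then either a run of whitespace or the end of the input.
    $pattern = '/\A\s*(?:[[:alnum:]][A-Za-z0-9_.\-]{2,29}(?:\s+|\z)){0,255}\z/';
    return preg_match($pattern, $input) === 1;
}

var_dump(isValidUsernameList("user1\r\nuser2")); // bool(true)
var_dump(isValidUsernameList("user1 user2"));    // bool(true)
var_dump(isValidUsernameList(".user1"));         // bool(false)
```

Because the separator is mandatory between names, an over-long name cannot be silently split into two shorter ones.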
 

Author Closing Comment

by:bitt3n
ID: 31505237
thanks!!
 
LVL 13

Expert Comment

by:Xyptilon2
ID: 22759501
Perhaps that may work, but it's not really the way to do it.

The caret sign ^ means beginning of the line and the $ at the end means end of the line, though you can use round parentheses to group parts of an expression. Why don't you use PHP's preg_match_all function, so that all the usernames are stored in an array?

Or, you can explode a string by space like

$aArray = explode(" ", $sUsernameString);

echo "number of usernames: " . count($aArray);

and then loop over the array and perform a regex comparison
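A rough sketch of the array idea, assuming `preg_match_all` (the variant of `preg_match` that collects every match). The sample input and the lookaround-based pattern are illustrative assumptions, not something tested in this thread. The lookarounds require each username to be bounded by whitespace or by the ends of the string, so malformed entries are skipped rather than partially matched.

```php
<?php
$sUsernameString = "user1\r\nuser2 .bad ok_name";

// (?<!\S) : not preceded by a non-whitespace character
// (?!\S)  : not followed by a non-whitespace character
$pattern = '/(?<!\S)[[:alnum:]][A-Za-z0-9_.\-]{2,29}(?!\S)/';

preg_match_all($pattern, $sUsernameString, $matches);
$aUsernames = $matches[0]; // full matches only

echo "number of usernames: " . count($aUsernames) . "\n"; // 3
print_r($aUsernames); // user1, user2, ok_name
```

This extracts the valid names but does not by itself report that `.bad` was rejected; comparing the count against the number of whitespace-separated tokens would be one way to detect that.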
 

Author Comment

by:bitt3n
ID: 22759774
I can't explode that way because I don't know how many or what kind of whitespace characters will separate the usernames. That is, some may be separated by spaces, others by line breaks, others by spaces and line breaks.

I could use split() instead of explode() but it seems like trying to parse a list of usernames should be well within the capabilities of the normal regex syntax without resorting to an array. I'm kind of surprised that using parentheses like that isn't kosher. It is necessary to specify the beginning and the end of each part of the expression (ie, each username must begin with an alphanumeric character, and end with zero or more whitespaces), so I am confused how else it is possible to do this without using ^ and $.

Presumably it would be possible to use ^ and $ for other parts of the expression (if I wanted to include another part that had beginning and ending requirements), and also for the expression as a whole?
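If the array route is acceptable, a minimal sketch with `preg_split` handles any mix of spaces and line breaks, since it splits on runs of whitespace rather than on one fixed delimiter. The helper name `findInvalidUsernames` is hypothetical. Here `\A` and `\z` are applied to one username at a time, which is where start and end anchors fit naturally.

```php
<?php
// Hypothetical helper: returns the entries that are not valid usernames.
function findInvalidUsernames(string $input): array
{
    // Split on any run of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces
    // produced by leading or trailing whitespace.
    $names = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);

    // One alphanumeric character, then 2 to 29 of [A-Za-z0-9_.-].
    $single = '/\A[[:alnum:]][A-Za-z0-9_.\-]{2,29}\z/';

    $invalid = [];
    foreach ($names as $name) {
        if (preg_match($single, $name) !== 1) {
            $invalid[] = $name;
        }
    }
    return $invalid;
}

print_r(findInvalidUsernames("user1 \r\n user2\tuser3")); // empty array
print_r(findInvalidUsernames("user1 .bad ab"));           // .bad, ab
```

Compared with a single whole-list pattern, this version can tell the user which entries failed, at the cost of a loop.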