webressurs

Regular expression: Find long words

I am looking for a regex that checks for long words in a text string. The text string can be long, but no word in it should be longer than 20 characters. I have tried a lot of different expressions, but all of them give me the validation error even when no word is longer than 20 characters (the total text is longer than 20 characters). I am using ASP.NET with C#. Please see the attached code.
I have tried these expressions without luck:
 
ValidationExpression="^\S{20}$"
ValidationExpression="^\w{20}$"
ValidationExpression="^\w{20,}$"
ValidationExpression="^\b\w{20,}$"
ValidationExpression="^\b\p{L}{20,}$"
ValidationExpression="^\w{0,20}$"
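
Editor's sketch of why the anchored attempts above reject ordinary multi-word text: with `^` and `$` the pattern has to cover the whole string, and neither `\w` nor `\S` can step over a space. Python's `re` module is used only as a convenient stand-in for the .NET engine (an assumption; the two should agree on these constructs), the sample text is made up, and the `\p{L}` variant is left out because Python's `re` does not support it.

```python
import re

# Patterns copied from the list above; the sample text is illustrative only.
patterns = [
    r"^\S{20}$",
    r"^\w{20}$",
    r"^\w{20,}$",
    r"^\b\w{20,}$",
    r"^\w{0,20}$",
]
text = "hello world hello world"  # longer than 20 characters, no long word

# Each pattern is anchored at both ends and cannot match a space,
# so none of them can match text that contains several words.
results = {p: re.search(p, text) is not None for p in patterns}
for pattern, matched in results.items():
    print(pattern, "->", matched)

# A single short word, by contrast, satisfies the last pattern.
single_word_ok = re.search(r"^\w{0,20}$", "hello") is not None
```

So these patterns test whether the whole input is one word of a given length, not whether every word in the input is short enough.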

evilrix

What about this?

\b\w{0,20}\b

ASKER

Nope, this also returns the validation error even if the text is "hello world hello world hello world hello world". It takes the entire string, not each word in the string. :)
>> It takes the entire string, not each word in the string
It should stop matching at the space, because \w matches word characters (letters, digits and _) and does not match a space. That being the case, it should match only the first word of the string, not the whole string. I tested it with RegexBuddy and it behaves as I expected.

\b      Matches at the position between a word character (anything matched by \w) and a non-word character (anything matched by [^\w] or \W) as well as at the start and/or end of the string if the first and/or last characters in the string are word characters.

http://www.regular-expressions.info/reference.html

Given this string...

"hello world hello world hello world hello world"

RegexBuddy finds all these as separate matches for that regex

hello
world
hello
world
hello
world
hello
world
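
Editor's sketch reproducing the per-word matches listed above, using Python's `re` as a stand-in for RegexBuddy (an assumption; it is not the tool used in the thread). The quantifier is changed from `{0,20}` to `{1,20}` here, because the original form can also produce zero-length matches at word boundaries in some engines, which would clutter the output.

```python
import re

text = "hello world hello world hello world hello world"

# {1,20} instead of {0,20}: an adjustment made for this sketch so that
# zero-length matches at word boundaries are not reported.
words = re.findall(r"\b\w{1,20}\b", text)
print(words)
```

Each word comes back as its own match, which is the behaviour described above for a tool that searches for every occurrence of the pattern.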

Maybe you can show how you're doing this?
This is how I do it:

<asp:RegularExpressionValidator runat="server" ID="regDescription" ControlToValidate="txtDescription" Display="Dynamic" Text="Too long word" ValidationExpression="^\b\w{0,20}\b$" />

If txtDescription = "Hello", no validation error is shown.
If txtDescription = "Hello world", I get the validation error: "Too long word".

So it seems to have something to do with the spaces. Is there another way to write this when using a regex in ASP.NET as above?
You've modified the regex I gave you by adding a start and end anchor.

You have this...

^\b\w{0,20}\b$

the regex I gave you was this...

\b\w{0,20}\b

It fails because the $ anchor expects the string to end right after the word boundary that closes the first word.
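
Editor's sketch comparing the two patterns on the inputs from this thread. Python's `re.search` stands in for a plain "find a match anywhere" search (an assumption; this is not how the ASP.NET validator itself is invoked).

```python
import re

anchored = r"^\b\w{0,20}\b$"
unanchored = r"\b\w{0,20}\b"

# For each input, record whether each pattern finds a match anywhere.
results = {}
for text in ("Hello", "Hello world"):
    results[text] = (
        re.search(anchored, text) is not None,
        re.search(unanchored, text) is not None,
    )
    print(repr(text), "anchored:", results[text][0], "unanchored:", results[text][1])
```

The anchored pattern only succeeds when the whole input is a single word, while the unanchored one finds a match in both inputs.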
Yes, I modified it, but I have tried both. This gives the same result (it doesn't work):

<asp:RegularExpressionValidator runat="server" ID="regDescription" ControlToValidate="txtDescription" Display="Dynamic" Text="Too long word" ValidationExpression="\b\w{0,20}\b" />

Strange...?
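
Editor's note, offered as a possible explanation rather than anything confirmed in this thread: as far as I understand it, the RegularExpressionValidator only passes when the match covers the entire input, so an unanchored pattern behaves as if it were anchored. The sketch below emulates that rule in Python. The helper name `passes_validator` and the pattern `NO_LONG_WORD` are hypothetical and are not the accepted solution, which is not shown here.

```python
import re

# Assumed pattern (not from the thread): fail if 21 or more word characters
# occur in a row anywhere in the input, otherwise match the whole input.
NO_LONG_WORD = r"(?![\s\S]*\w{21})[\s\S]*"

def passes_validator(pattern: str, text: str) -> bool:
    """Emulate the assumed validator rule: the first match must span all of text."""
    m = re.search(pattern, text)
    return m is not None and m.start() == 0 and m.end() == len(text)

# The unanchored pattern from the thread only spans single-word input.
print(passes_validator(r"\b\w{0,20}\b", "Hello"))
print(passes_validator(r"\b\w{0,20}\b", "Hello world"))

# The assumed pattern accepts multi-word text and rejects a 21-character word.
print(passes_validator(NO_LONG_WORD, "hello world hello world"))
print(passes_validator(NO_LONG_WORD, "hello " + "a" * 21))
```

Under that assumption, the pattern has to describe the whole text, so a negative lookahead for an over-long run of word characters is one way to phrase "no word longer than 20 characters".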
>> Strange...?
A little, yes :)

I'll take another look when I get home from work.
ASKER CERTIFIED SOLUTION
webressurs