Solved

RegEx - How to match two words in a string?

Posted on 2011-09-11
Last Modified: 2012-05-12
Hi All,

I need to write a regex that matches two words, separated by an arbitrary number of other characters, but I can't seem to get it to work.

Example:

BigString = "uhfkjbvfkj+ant+ohgrkjhgr+bird+riouhrh"

Search terms = "ant" "bird"

I thought the following regex would work:

.*ant.*bird.*

but according to this site:

http://sqa.fyicenter.com/Online_Test_Tools/Test_Regular_Expression_Match_Pattern.php

it doesn't match.

My interpretation of the regex I used is:

{AnyString} followed by "ant" followed by {AnyString} followed by "bird" followed by {AnyString}

which should work, but obviously I am not getting it!

Please can you advise how to write the regex, and also explain where I am misunderstanding?

Thanks,

Alan.
Question by:Alan3285
 
LVL 9

Expert Comment

by:sshah254
ID: 36520915
Try (.*)ant(.*)bird

SS
 
LVL 39

Assisted Solution

by:Pratima Pharande
Pratima Pharande earned 200 total points
ID: 36520941
try this

/.*ant.*bird.*/
 
LVL 84

Expert Comment

by:ozo
ID: 36521116
 
LVL 9

Expert Comment

by:user_n
ID: 36521174
Try the same but with /.*ant.*bird.*/
 
LVL 12

Author Comment

by:Alan3285
ID: 36521997
Hi All,

SS:  That doesn't seem to work?  Without the trailing wildcard, won't it require that the string ends in "bird"?  I tested it on the site I linked to, and it seems to fail the example I gave.

Pratima / UserN:

Those work exactly as required.  Why do they work, and what is wrong with mine?

Ozo:  I used that site again, and it doesn't work for me - it says, "No match found".  Are you getting a different result there?


Thanks,

Alan.

 
LVL 74

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 100 total points
ID: 36522046
You shouldn't really need the leading and trailing .* at all. "ant.*bird" should suffice, because an unanchored pattern search can match anywhere within the string.

Alan3285 wrote: "Those work exactly as required.  Why do they work, and what is wrong with mine?"

The site requires pattern delimiters. These are the two slashes that you have seen others mention. Some languages require pattern delimiters: Perl and PHP, for example. If you were entering simply:

    .*ant.*bird.*

Then you wouldn't match. But if you entered:

    /.*ant.*bird.*/

then you should have matched, depending on whether or not the case of both your pattern and target string matched. To turn on case-insensitivity, you could have done:

    /.*ant.*bird.*/i

The trailing "i" enables case insensitivity.

Incorporating this information with my earlier suggestion, you can simply use:

    /ant.*bird/i
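
For comparison, here is a minimal sketch of the same points in Python. Python's re module is an assumption on my part (the test site in the question is PHP-based); it takes the pattern without slash delimiters and uses a flag argument in place of the trailing "i". The sample string is the one from the question.

```python
import re

big_string = "uhfkjbvfkj+ant+ohgrkjhgr+bird+riouhrh"

# re.search scans the whole string, so no leading or trailing .* is needed.
short = re.search(r"ant.*bird", big_string)

# The original pattern also matches; the extra .* only widen the match
# so that it spans the entire string.
full = re.search(r".*ant.*bird.*", big_string)

# re.IGNORECASE plays the role of the trailing "i" modifier.
mixed = re.search(r"ant.*bird", "ANT then Bird", re.IGNORECASE)
no_flag = re.search(r"ant.*bird", "ANT then Bird")

print(short.group(0))               # text from "ant" through "bird"
print(full.group(0) == big_string)  # the padded pattern covers everything
print(mixed is not None, no_flag is None)
```

Both patterns detect the same strings. The shorter one simply reports a tighter match.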
 
LVL 9

Assisted Solution

by:user_n
user_n earned 50 total points
ID: 36522426
http://www.regular-expressions.info/php.html

All of the preg functions require you to specify the regular expression as a string using Perl syntax. In Perl, /regex/ defines a regular expression. In PHP, this becomes preg_match('/regex/', $subject). When forward slashes are used as the regex delimiter, any forward slashes in the regular expression have to be escaped with a backslash. So http://www\.jgsoft\.com/ becomes '/http:\/\/www\.jgsoft\.com\//'. Just like Perl, the preg functions allow any non-alphanumeric character as regex delimiters. The URL regex would be more readable as '%http://www\.jgsoft\.com/%' using percentage signs as the regex delimiters, since then you don't need to escape the forward slashes. You would have to escape percentage signs if the regex contained any.
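
To make the delimiter idea concrete, here is a rough Python sketch of how a tool might split a delimited expression into pattern and modifiers. It is not how PHP does it internally: compile_delimited and the _FLAGS table are hypothetical names of mine, only "/" is accepted as the delimiter, and the modifier mapping is approximate.

```python
import re

# Hypothetical, approximate mapping from Perl-style modifiers to Python flags.
_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}

def compile_delimited(expr: str) -> re.Pattern:
    """Compile a '/pattern/modifiers' string; only '/' works as delimiter here."""
    end = expr.rfind("/")  # closing delimiter = last slash in the string
    if not expr.startswith("/") or end == 0:
        raise ValueError("expected /pattern/modifiers, got: " + expr)
    flags = 0
    for letter in expr[end + 1:]:
        if letter not in _FLAGS:
            raise ValueError("unknown modifier: " + letter)
        flags |= _FLAGS[letter]
    # Escaped slashes (\/) can stay as they are: Python reads \/ as a literal "/".
    return re.compile(expr[1:end], flags)

print(bool(compile_delimited("/ant.*bird/i").search("ANT ... BIRD")))
try:
    compile_delimited(".*ant.*bird.*")  # no delimiters, as in the question
except ValueError as err:
    print("rejected:", err)
```

The sketch assumes the closing delimiter is present, so an expression that ends in an escaped slash with no delimiter after it would be split in the wrong place.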
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 36522491
One thing to keep in mind when using any of the above suggestions is that you won't be limited to just capturing "words". You will actually find any string that contains the string "ant" followed at some point by the string "bird". Any of the above will do this. If you had the following:

    Put the canteen on top of the birdhouse.

You would still match, because "ant" occurs inside "canteen" and "bird" occurs inside "birdhouse":

    Put the canteen on top of the birdhouse.

If you want to match actual words, then I would suggest using some bounding conditions, such as "word boundary": \b

/\bant\b.*\bbird\b/i

Now the example sentence above would not match, because "ant" and "bird" do not occur on word boundaries. However, if you had the sentence:

    The ant was eaten by the bird.

You would find a match.
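
A small Python sketch of the difference, assuming Python's re module (where \b means the same thing) and reusing the sentences above plus the string from the question. The names loose and words are only illustrative.

```python
import re

loose = re.compile(r"ant.*bird", re.IGNORECASE)
words = re.compile(r"\bant\b.*\bbird\b", re.IGNORECASE)

canteen = "Put the canteen on top of the birdhouse."
eaten = "The ant was eaten by the bird."
question = "uhfkjbvfkj+ant+ohgrkjhgr+bird+riouhrh"

for text in (canteen, eaten, question):
    # Columns: substring match, whole-word match, text tested.
    print(bool(loose.search(text)), bool(words.search(text)), text)
```

The string from the question still matches with \b because "+" is not a word character. Note that digits and underscores do count as word characters, so "ant_1" would not contain the word "ant" by this test.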
 
LVL 35

Accepted Solution

by:Terry Woods
Terry Woods earned 150 total points
ID: 36526080
If you don't want the order of the two words to matter, you can use this to detect a match, provided your regex tool of choice supports lookahead:

/(?=.*\bant\b)(?=.*\bbird\b)/i
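
As a hedged illustration, the same idea in Python, whose re module supports lookahead. The helper name has_both is hypothetical, and the leading ^ is my addition so the engine only tests from the start of the string instead of retrying at every position.

```python
import re

# Each lookahead scans ahead from the start without consuming any text,
# so the two words may appear in either order.
both = re.compile(r"^(?=.*\bant\b)(?=.*\bbird\b)", re.IGNORECASE)

def has_both(text: str) -> bool:
    """Return True if text contains both whole words, in any order."""
    return both.search(text) is not None

print(has_both("The bird ate the ant."))
print(has_both("The ant was eaten by the bird."))
print(has_both("Put the canteen on top of the birdhouse."))
```

The match itself is zero-length, so this only detects the words and does not extract them. Because "." does not match a newline by default, words on different lines would need the re.DOTALL flag as well.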
 
LVL 12

Author Closing Comment

by:Alan3285
ID: 36714119
Thanks All - Very interesting stuff!

Alan.