Regular expression to locate words ending in a particular format.

Hello Experts,
I need to locate all the words ending with the letters "an" in a text file using Notepad++. Can you please suggest a regular expression for this?

Thanks.
sukhoi35Asked:
 
Rgonzo1971Commented:
Hi,

pls try

\w*an\b


regards
0
 
sukhoi35Author Commented:
Hey, thanks for the response.
But this didn't work.

Screenshot
0
 
Marco GasiFreelancerCommented:
I used

(an)$
0
 
sukhoi35Author Commented:
Hi marqusG,
Your expression only locates a word ending with the letters "an" when it is the last word on a line.

For example:

Swan Duck Right Man
Ocean Sea Fish Fan

If I use the expression (an)$ on the above lines, only the words "Man" and "Fan" are located. Observe that "Swan" and "Ocean" also end with "an", but they are not located.
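
For anyone following along, the difference can be reproduced outside Notepad++. This is only a minimal sketch in Python, whose `re` module is not the engine Notepad++ uses; it assumes the two engines agree on `$` and `\b`, and the MULTILINE flag is used here to approximate per-line matching.

```python
import re

# The two sample lines from the post above.
text = "Swan Duck Right Man\nOcean Sea Fish Fan"

# (an)$ only succeeds where "an" sits at the end of a line.
end_of_line = [m.group(0) for m in re.finditer(r"(an)$", text, flags=re.MULTILINE)]

# \w*an\b succeeds for any run of word characters ending in "an".
whole_words = re.findall(r"\w*an\b", text)

print(end_of_line)  # ['an', 'an']
print(whole_words)  # ['Swan', 'Man', 'Ocean', 'Fan']
```

The `$` anchor ties the match to the end of the line, which is why "Swan" and "Ocean" are skipped.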
0
 
Marco GasiFreelancerCommented:
You're right:

(an)\b
0
 
sukhoi35Author Commented:
Even (an)\b isn't working.

Screen capture
0
 
Marco GasiFreelancerCommented:
I don't know. Here it works.
0
 
mark_harris231Commented:
Try .*?an[\s|\r|\n]
0
 
Terry WoodsIT GuruCommented:
@mark_harris231, your pattern will fail to match words ending in "an" that don't have a whitespace character after them. A full stop after the word would cause it to fail.

This worked for me, with the Match case option off:
[a-z]*an(?![a-z])

The (?!...) part of the pattern is called a negative lookahead, which in this case means don't match if a letter is found immediately after "an"

Change the [a-z] to \w if you want to treat the underscore character as a word character. Note that the word "can't" will be found; if that's a problem, I'll adjust the pattern for you.
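
A rough illustration of the lookahead, sketched in Python rather than Notepad++ (so the engine differs; the assumption is that negative lookahead behaves the same in both). `re.IGNORECASE` stands in for having the Match case option off, and the sample sentence is made up.

```python
import re

# Letters-only word ending in "an", not followed by another letter.
pattern = re.compile(r"[a-z]*an(?![a-z])", re.IGNORECASE)

sample = "Swan swam to the ocean. An answer? I can't."
found = pattern.findall(sample)

print(found)  # ['Swan', 'ocean', 'An', 'can']
```

"answer" is skipped because a letter follows its "an", while the "can" of "can't" is reported because an apostrophe is not a letter.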
0
 
Derek JensenCommented:
Looking for words ending in 'an':

(\w+?an)[^a-z]

0
 
Terry WoodsIT GuruCommented:
@bigdogdman, that pattern requires a character after the word. For real data it would probably be fine, as I think the only case it would miss is a word at the very end of the file. However, the negative lookahead works fine in Notepad++ and covers all cases.
0
 
Derek JensenCommented:
@Terry,

Actually, it doesn't. Granted, I haven't tried it in NotePad++, but unless it has a really sucky Regex engine, [^a-z] should match $ or \n or \r...

Again, untested, so...could still be wrong. :-)
0
 
Marco GasiFreelancerCommented:
Sorry, I'm a novice at regex: can you explain why what I showed in post ID 39637188 works on my PC?

I don't want points, I only want to understand why (an)\b works in my Notepad++ on Win7 but not for sukhoi35: I really don't understand this...
0
 
Terry WoodsIT GuruCommented:
I tested the negative lookahead in notepad++ and it was fine...

I've never encountered any regex engine that matches an end-of-line or end-of-file (which aren't a character of any sort) with a character set such as [^a-z], but you're correct it will match a \n or \r character, or a literal dollar sign character. Notepad++ works as expected. I'll keep an open mind for other languages though, as some regex engines have some pretty unexpected behaviour!
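
To make the end-of-input point concrete, here is a small sketch using Python's `re` module. It is not Notepad++'s engine, so treat it as an illustration of the general behaviour described above rather than proof of what Notepad++ does; the sample strings are invented.

```python
import re

consuming = r"(\w+?an)[^a-z]"      # needs one more character after the word
lookahead = r"[a-z]*an(?![a-z])"   # only checks that no letter follows

no_newline = "ocean swan"          # last word sits at the very end of the input
with_newline = "ocean swan\n"      # last word is followed by a line break

print(re.findall(consuming, no_newline))    # ['ocean']
print(re.findall(consuming, with_newline))  # ['ocean', 'swan']
print(re.findall(lookahead, no_newline))    # ['ocean', 'swan']
```

A character class always has to consume a character, so it cannot succeed at the end of the input, whereas the lookahead consumes nothing.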

Note that Notepad++ is a free download, for anyone who hasn't used it before.
0
 
Terry WoodsIT GuruCommented:
@marqusG, doing some more testing in Notepad++ (I have version 6.4.2), your original pattern worked fine for me:
\w*an\b

as did the pattern:
(an)\b

though the round brackets are unnecessary.

The only possible faults with it are unlikely to be a problem in real data:
\w will also match an underscore character or number
\b matches a boundary between a word character (\w) and a non-word character. Again, because \w matches an underscore or a number, there may be odd cases missed, although it may not be a problem
e.g.
ocean1 would not be picked up
ocean_ would not be picked up
With pattern \w*an\b a value of 2345_2345an can be matched.
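
The edge cases listed above can be checked with a short Python sketch. Python's `re` is assumed to treat `\w` and `\b` the same way as Notepad++ for these ASCII inputs; the letters-only pattern is included purely for comparison.

```python
import re

samples = ["ocean", "ocean1", "ocean_", "2345_2345an"]

# \w counts digits and underscores as word characters, so \b sees no boundary
# between "n" and "1" or between "n" and "_".
word_boundary = {s: re.findall(r"\w*an\b", s) for s in samples}

# The letters-only lookahead pattern, for comparison.
letters_only = {s: re.findall(r"[a-z]*an(?![a-z])", s) for s in samples}

print(word_boundary)
# {'ocean': ['ocean'], 'ocean1': [], 'ocean_': [], '2345_2345an': ['2345_2345an']}
print(letters_only)
# {'ocean': ['ocean'], 'ocean1': ['ocean'], 'ocean_': ['ocean'], '2345_2345an': ['an']}
```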

@sukhoi35, perhaps you'd like to retry using pattern \w*an\b ? It's simpler than the negative lookahead that I suggested.
0
 
Marco GasiFreelancerCommented:
Thanks for your answer, Terry: unfortunately, I can't give you points ;)
0
 
sukhoi35Author Commented:
Thanks!
0
 
sukhoi35Author Commented:
Hi Guys,
By the way, upgrading to Notepad++ v6.5.1 from my existing v5.9.8 let me use some of the solutions posted earlier. It looks like there were problems in the earlier versions that have since been fixed. Thanks all for your time!
0