jay_waugh


Regex: last variable in string (identify all characters following the final space in a string, unless the last item is a single character)

I am trying to gather all of the characters following the final space in a string but am struggling with the syntax. Can anyone help please? The last set of characters will always be alphanumeric, but its content and length will vary.

 In the below example I would only want to highlight cat.  

 dog xxx-klsfkd-sdf-sdf cat


 And in the next one aaaaaaaaaaaaaddddddddddddddd

 heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd


However, if the string ends with a single character on its own, I wish to ignore it, e.g.

 heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd 0

would still highlight aaaaaaaaaaaaaddddddddddddddd


 Many Thanks
Bill Prew

I think one approach would be the following, and then get the first submatch.  This should discard any cases of a trailing space followed by a single character.

([a-zA-Z0-9]{2,})( [a-zA-Z0-9]){0,1}$
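
As a rough illustration of "get the first submatch", here is a minimal Python sketch run against the three examples from the question. Python is only an assumption here (the environment has not been stated), and `last_word` and `LONG` are hypothetical names used for the demonstration.

```python
import re

# Pattern from the post above: group 1 is a run of 2+ alphanumerics,
# group 2 is an optional trailing space plus one alphanumeric character.
PATTERN = re.compile(r"([a-zA-Z0-9]{2,})( [a-zA-Z0-9]){0,1}$")

LONG = "aaaaaaaaaaaaaddddddddddddddd"

def last_word(text):
    """Return group 1 (the first submatch), or None when nothing matches."""
    m = PATTERN.search(text)
    return m.group(1) if m else None

print(last_word("dog xxx-klsfkd-sdf-sdf cat"))              # cat
print(last_word("heht ooijl-nanhjhsh " + LONG) == LONG)     # True
print(last_word("heht ooijl-nanhjhsh " + LONG + " 0") == LONG)  # True
```

The whole match still includes the trailing " 0" in the last case; only group 1 leaves it out.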



»bp
What about this case? What would you want here?

heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd 0 1 2 3


»bp
jay_waugh

ASKER

Hi Bill,

What you have suggested isn't currently in play, but if it was it would still be the

aaaaaaaaaaaaaddddddddddddddd

Thanks
Okay, if it were I think this would handle that as well.

([a-zA-Z0-9]{2,})( [a-zA-Z0-9]){0,}$
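
To see the difference between `{0,1}` and `{0,}` on the "0 1 2 3" case, a small Python sketch may help. Python is an assumption, and `first_group`, `ONE_TRAILING` and `ANY_TRAILING` are hypothetical names used only for this comparison.

```python
import re

# {0,1} allows at most one trailing " x"; {0,} allows any number of them.
ONE_TRAILING = re.compile(r"([a-zA-Z0-9]{2,})( [a-zA-Z0-9]){0,1}$")
ANY_TRAILING = re.compile(r"([a-zA-Z0-9]{2,})( [a-zA-Z0-9]){0,}$")

LONG = "aaaaaaaaaaaaaddddddddddddddd"
TEXT = "heht ooijl-nanhjhsh " + LONG + " 0 1 2 3"

def first_group(pattern, text):
    """Return group 1 of the first match, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(first_group(ONE_TRAILING, TEXT))          # None
print(first_group(ANY_TRAILING, TEXT) == LONG)  # True
```

With `{0,1}` there is no run of two or more alphanumerics close enough to the end of the string, so nothing matches at all; `{0,}` lets the pattern skip over every trailing single character.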



»bp
Thanks Bill but sadly that doesn't seem to work as I'd hoped.

I also failed to mention that some of the strings that I wish to highlight can contain underscores, if that makes any difference.

For example, when I use that pattern on the string below, FRANK 0 is highlighted

NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK 0

equally for

NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK_Dave 0

Dave 0 is highlighted
What language or environment are you working in? Regex varies a bit by context.

And are you taking the first submatch from the results, not the full results?


»bp
Take a look at the test cases here, it seems to work if you take just subgroup 1 of the match results...

https://regex101.com/r/qlAarF/6

(\S{2,})( \S){0,}$
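
A short Python sketch of why subgroup 1 matters, and why `\S` copes with underscores where `[a-zA-Z0-9]` does not. Python is an assumption about the environment, and `whole_and_group1` is a hypothetical helper written for this comparison.

```python
import re

ALNUM_ONLY = re.compile(r"([a-zA-Z0-9]{2,})( [a-zA-Z0-9]){0,}$")
NON_SPACE = re.compile(r"(\S{2,})( \S){0,}$")

TEXT = "NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK_Dave 0"

def whole_and_group1(pattern, text):
    """Return (whole match, group 1), or (None, None) if there is no match."""
    m = pattern.search(text)
    return (m.group(0), m.group(1)) if m else (None, None)

# The underscore is not in [a-zA-Z0-9], so the match starts after it.
print(whole_and_group1(ALNUM_ONLY, TEXT))  # ('Dave 0', 'Dave')
# \S accepts any non-whitespace character, including "_" and "-".
print(whole_and_group1(NON_SPACE, TEXT))   # ('FRANK_Dave 0', 'FRANK_Dave')
```

A tester that highlights the whole match will show "FRANK_Dave 0"; the wanted text is only in group 1.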



»bp
OK thanks, I have been using http://pythex.org/ to check the regex and it is giving different results.

I'll try in the application instead and get back to you.
Seems to yield the same results, just remember you don't want the whole match, just the first group.


»bp
OK thanks, but how do I return only match group 1 and ignore group 2, again using just the single line of code?
What language are you working in, and what would the single line of code look like?


»bp
If python, then this is the idea.

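import re
# Raw string so the backslashes reach the regex engine unchanged
m = re.search(r'(\S{2,})( \S){0,}$', 'NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK_Dave 0')
print(m.group(1))  # first submatch only, not the whole match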



»bp
Sadly I have to use the Regular expression in an application which doesn't give me the option to use the Python code that you have suggested. I can only type "(\S{2,})( \S){0,}$"
Okay, as long as the application supports lookaround in the regex patterns then I think this should work.

\S{2,}(?=( \S{1})*$)



https://regex101.com/r/qlAarF/7
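
For anyone who can test outside the application, this minimal Python sketch shows the effect of the lookahead: the trailing single characters are checked but not consumed, so the whole match is already the wanted text. Python is an assumption, and `highlighted` is a hypothetical helper name.

```python
import re

# Same pattern as above: the lookahead (?=...) asserts that only
# " x" pairs remain before the end, without adding them to the match.
LOOKAHEAD = re.compile(r"\S{2,}(?=( \S{1})*$)")

def highlighted(text):
    """Return the whole match (group 0), or None if nothing matches."""
    m = LOOKAHEAD.search(text)
    return m.group(0) if m else None

print(highlighted("NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK_Dave 0"))  # FRANK_Dave
print(highlighted("NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK 0"))       # FRANK
print(highlighted("dog xxx-klsfkd-sdf-sdf cat"))                                # cat
```

Note that the pattern still contains one capturing group inside the lookahead, so an engine that reports groups will return that as a second value.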


»bp
:( Curses, it doesn't seem to work as expected. In this instance I get two values returned: one blank and one with the correct string.


Many thanks for your continued engagement, do you have any more ideas?
Not unless the application calls out the "flavor" of regex engine it uses.  Lookaround is pretty common these days but not ubiquitous.

In the application, you use the regex for extracting data from a field, not just validating, right?

Regex can be a pretty deep dive with additional complex features.  I'm not aware of another way to skin this cat, but that doesn't guarantee that there isn't one.  Doesn't feel probable, but there is a chance I suppose.

There is a solid "tutorial" on some of the more advanced regex features at the link below, if you are feeling courageous...

http://www.regular-expressions.info/tutorial.html


»bp
:) I've bought a book too!!!

So that I can finish what I'm doing, though, can you please just confirm what I'd need to do to select the string while ignoring the single character at the end?

e.g.
NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK_Dave 0

highlights

FRANK_Dave

The same code would also need to work for

heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd 0

And just highlight

aaaaaaaaaaaaaddddddddddddddd

etc etc

Can I do that without using multiple subgroups?

Thanks again
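
One possibility worth testing, offered only as a sketch: if the blank value comes from the capturing group inside the lookahead (an assumption, since the application and its regex flavor are not named in this thread), making that group non-capturing with `(?:...)` leaves no subgroups at all. The Python below is used purely to illustrate the difference; `CAPTURING` and `NON_CAPTURING` are hypothetical names.

```python
import re

CAPTURING = r"\S{2,}(?=( \S{1})*$)"    # lookahead pattern with one subgroup
NON_CAPTURING = r"\S{2,}(?=(?: \S)*$)"  # same logic, no subgroups

WITH_TRAILING = "NewYork abcdefg fghkmj-asdasd-efeefe-efeffe FRANK_Dave 0"
NO_TRAILING = "dog xxx-klsfkd-sdf-sdf cat"

# re.findall returns the group contents when the pattern has a group,
# and the whole match when it has none.
print(re.findall(CAPTURING, WITH_TRAILING))      # [' 0']
print(re.findall(CAPTURING, NO_TRAILING))        # ['']
print(re.findall(NON_CAPTURING, WITH_TRAILING))  # ['FRANK_Dave']
print(re.findall(NON_CAPTURING, NO_TRAILING))    # ['cat']
```

This only helps if the application's engine supports both lookahead and non-capturing groups.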
ASKER CERTIFIED SOLUTION
Bill Prew
