snix123

asked on

Strip leading or following numbers from string

Hi Experts:

I have a string which may be composed of alphanumeric characters.  If the first one or two characters are digits, they must be stripped off.  If the last one or two characters are digits, they must be stripped off too.
So if the input string is: 23abcde56, then I want to extract abcde.
If the input string is 3abcde56, then I want to extract abcde.
If the input string is 45abcde6, then I want to extract abcde.
If the input string is 45abcde7756, then I want to extract abcde77.
If the input string is 2367abcde3, then I want to extract 67abcde.
I have the following code, which seems to work (I'm still testing), but I was wondering if there is a better way to do this.  I think there must be.  I am developing in ColdFusion, and here's what I have so far:



Any ideas how to improve this?
Thanks in advance!
<cfset mystring="56testing89">  <!--- test input string here --->
<cfset WordToTest = mystring>  <!--- result is here; defaults to the input when there are no digits to strip --->
<!---- see if there are digits on left --->
<cfset leftresult = SpanIncluding(mystring, "0123456789")>  <!--- "0" added so a leading zero counts as a digit --->

<!---- see if there are digits on right --->
 <cfset rightresult = Reverse(SpanIncluding(Reverse(mystring), "0123456789"))>  <!--- "0" added here too --->

<cfif leftresult is not "" and rightresult is not "">
      <cfset WordToTest = RemoveChars(mystring,1,min(len(leftresult),2))>
      <cfset newword = reverse(WordToTest)>
      <cfset WordToTest = RemoveChars(newword,1,min(len(rightresult),2))>
      <cfset WordToTest = reverse(WordToTest)>
<cfelseif leftresult is not "" and rightresult is "">
       <cfset WordToTest = RemoveChars(mystring,1,min(len(leftresult),2))>
<cfelseif leftresult is "" and rightresult is not "">
       <cfset newword = reverse(mystring)>
       <cfset WordToTest = RemoveChars(newword,1,min(len(rightresult),2))>
       <cfset WordToTest = reverse(WordToTest)>
</cfif>	
<cfoutput>#WordToTest#</cfoutput>  <!--- This is the extracted part I want --->
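
For comparison, the branches above could probably be collapsed by counting the digits at each end first and then taking the middle in one step. This is only a sketch: the names digits, leftCount, rightCount and keep are made up for illustration, and it assumes, like the code above, that at most two digits come off each end.

```cfml
<cfset mystring = "56testing89">  <!--- test input string here --->
<cfset digits = "0123456789">     <!--- hypothetical helper variable --->

<!--- count the digits at each end, capped at two --->
<cfset leftCount = min(len(SpanIncluding(mystring, digits)), 2)>
<cfset rightCount = min(len(SpanIncluding(Reverse(mystring), digits)), 2)>

<!--- number of characters left once both ends are stripped --->
<cfset keep = len(mystring) - leftCount - rightCount>

<cfif keep gt 0>
      <cfset WordToTest = Mid(mystring, leftCount + 1, keep)>
<cfelse>
      <!--- nothing left, e.g. a short all-digit string --->
      <cfset WordToTest = "">
</cfif>
<cfoutput>#WordToTest#</cfoutput>
```

The keep gt 0 guard is there because the two counts can overlap on short all-digit inputs such as 123, where both ends claim two characters.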


ASKER CERTIFIED SOLUTION
gdemaria
This solution is only available to members.
i am no regex guru, so could only come up with a somewhat clumsy one-liner:

rereplace(reverse(rereplace(reverse(myString), "^\d{1,2}(.+)$", "\1")), "^\d{1,2}(.+)$", "\1")

here's some code for you to test it:

<cfset list = "23abcde56,3abcde56,45abcde6,45abcde7756,2367abcde3">
<cfoutput>
<cfloop list="#list#" index="myString">
<p>#myString# : #rereplace(reverse(rereplace(reverse(myString), "^\d{1,2}(.+)$", "\1")), "^\d{1,2}(.+)$", "\1")#</p>
</cfloop>
</cfoutput>

Azadi
@Azadi - I'm not a guru either, but I think you could simplify it a bit more to this:

#reReplaceNoCase(myString, "^\d{1,2}(.+)\d{1,2}$", "\1")#
tried that - since (.+) is greedy, it only removes one digit from the end of the string, no matter how many you have there...

cf's implementation of regex, unfortunately, does not seem to support possessive quantifiers/atomic groups/(?ifthen|else) constructs, which is what i'd reach for to build a regex that matches "two digits at the end of a string if there are 2 or more digits there, otherwise 1 digit".

thus reversing the string and then reversing it back seems to be the only way to make it a one-liner in cf...

Azadi
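
One more option that may be worth testing before settling on the double reverse: a lazy quantifier on the middle group. The sketch below assumes reReplace() accepts Perl-style lazy quantifiers such as .*? (please verify on your CF version). The extra test value abcde is added here for illustration. Using \d{0,2} instead of \d{1,2} lets strings with digits at only one end, or at neither end, still match.

```cfml
<!--- Sketch only: assumes reReplace() supports the lazy quantifier .*? --->
<cfset list = "23abcde56,3abcde56,45abcde6,45abcde7756,2367abcde3,abcde">
<cfoutput>
<cfloop list="#list#" index="myString">
    <!--- ^\d{0,2} takes up to two leading digits;
          the lazy (.*?) grows only as far as it must, so the
          trailing \d{0,2} gets first claim on the last two digits --->
    <p>#myString# : #reReplace(myString, "^\d{0,2}(.*?)\d{0,2}$", "\1")#</p>
</cfloop>
</cfoutput>
```

Traced by hand under ordinary backtracking rules, 45abcde7756 should give abcde77 and 2367abcde3 should give 67abcde, but treat that as something to confirm by running the loop.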
Ugh.. you're right :/  Hate to say it but I'd almost take string functions here, because damn that's a lot of hoops...  Hopefully they'll improve it someday. Maybe CF 10? ;-)
i *think* cf uses the java regex engine - so unless that engine starts implementing all this advanced regex functionality, even cf10 may not have it...

however, with a few built-in cf functions - like reverse() - you can emulate pretty much anything, and still have a one-line solution, which is nice :) you probably do lose a couple milliseconds this way compared to using native regex engine options, but who really cares?!

Azadi
I'm not sure what it uses underneath.  There's probably a library out there that does it.  But it's a shame it's not built in. I don't really care about slight time differentials .. just readability and being able to figure out what the code's doing 6 months from now.  That's hard enough with simple regexes ;-)  {shrug} But I guess it is what it is ...

snix123

ASKER

I agree with all.  I've used regexes in the past myself and usually have to rethink them if there's an issue with one.  They always seem hard for me to explain to someone.  I did experiment with regexes before deciding to use the string functions only.  They seem easier to understand.

Thanks for all the replies.  It is very much appreciated.  Everyone have a great weekend!