dsimcox
asked on
Regular Expression to extract decimal number from text string
I'm having trouble getting the right regular expression to use in an .Execute or .Replace RegExp Method to extract a decimal number.
Here's what I've come up with that does not work:
I want to extract the decimal number from a string like this: "$13.4M" or "74.5 FTE"
My pattern is: "[^\d]+[^\.]?[^\d]+"
[^\d]+ - Not a digit (multiple instances)
[^\.]? - Not a decimal (only one instance)
[^\d]+ - Not a digit (multiple instances)
This function is returning the entire string when I test it. What am I doing wrong?
Here's what I've come up with that does not work:
Public Function MyNumber(txt)
Dim RegEx As Object
Set RegEx = CreateObject("vbscript.regexp")
With RegEx
.Global = True
.Pattern = "[^\d]+[^\.]?[^\d]+"
End With
MyNumber = RegEx.Replace(txt, "")
Set RegEx = Nothing
End Function
I want to extract the decimal number from a string like this: "$13.4M" or "74.5 FTE"
My pattern is: "[^\d]+[^\.]?[^\d]+"
[^\d]+ - Not a digit (multiple instances)
[^\.]? - Not a decimal (only one instance)
[^\d]+ - Not a digit (multiple instances)
This function is returning the entire string when I test it. What am I doing wrong?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Or a little differently..."\d(\.\d+)?"
Forgot to mention, the patterns above are meant to use with RegExp.Execute. If you simply want to extract a match, Execute is usually better than Replace.
For your pattern to match, you are currently requiring:
[^\d]+ One or more non-digit characters
[^\.]? (an optional non-decimal-point character)
[^\d]+ And another one (or more) non-digit characters
These must be all next to each other for the pattern to match, so it would only work where you have at least 2 non-digit characters in a row. To keep using RegEx.Replace you'll need to use something like:
.Pattern = "^\D*(\d+(?:\.\d+)?)\D*$"
MyNumber = RegEx.Replace(txt, "$1")
[^\d]+ One or more non-digit characters
[^\.]? (an optional non-decimal-point character)
[^\d]+ And another one (or more) non-digit characters
These must be all next to each other for the pattern to match, so it would only work where you have at least 2 non-digit characters in a row. To keep using RegEx.Replace you'll need to use something like:
.Pattern = "^\D*(\d+(?:\.\d+)?)\D*$"
MyNumber = RegEx.Replace(txt, "$1")
ASKER
matthewspatric, thanks for your suggestions.
I used this code successfully:
========================== =======
Public Function MyNumber(txt)
Dim RegEx As Object, myMatch As Variant
Set RegEx = CreateObject("vbscript.reg exp")
If RegEx.Test(txt) = True Then
With RegEx
.IgnoreCase = True
.MultiLine = False
.Global = True
.Pattern = "\d+\.?\d*"
End With
Set myMatch = RegEx.Execute(txt)
MyNumber = myMatch(0)
End If
========================== ========
terryatopus, I was unable to get your suggestion to work.
.Pattern = "^\D*(\d+(?:\.\d+)?)\D*$"
MyNumber = RegEx.Replace(txt, "$1")
Using the Replace method, it seems like this pattern would exclude the numeric characters and decimal: (\d+(?:\.\d+). I must be missing something here, because I don't doubt your logic is right-on!
I used this code successfully:
==========================
Public Function MyNumber(txt)
Dim RegEx As Object, myMatch As Variant
Set RegEx = CreateObject("vbscript.reg
If RegEx.Test(txt) = True Then
With RegEx
.IgnoreCase = True
.MultiLine = False
.Global = True
.Pattern = "\d+\.?\d*"
End With
Set myMatch = RegEx.Execute(txt)
MyNumber = myMatch(0)
End If
==========================
terryatopus, I was unable to get your suggestion to work.
.Pattern = "^\D*(\d+(?:\.\d+)?)\D*$"
MyNumber = RegEx.Replace(txt, "$1")
Using the Replace method, it seems like this pattern would exclude the numeric characters and decimal: (\d+(?:\.\d+). I must be missing something here, because I don't doubt your logic is right-on!