boise2004

regex for float/double?

Need a regex for a float.

must test that a number is a float in 7.2 format
(7 digits point 2 digits, for example 9000000.75)

must account for scientific notation

must be a Perl Compatible Regular Expression (PCRE)

thanks very much.
ysre

if ($testnumber =~ /^\s*\d{1,7}\.\d{2}$/) { print "got 7.2f number!\n"; }

Note: the pattern starts with \s* (optional leading whitespace, e.g. space or tab), followed by a number with 1 to 7 digits before the dot and exactly 2 digits after it.
If the number were formatted with %07.2f (zeroes instead of spaces for the unused positions), you'd write /^0*\d{1,7}\.\d{2}$/ instead :)

Hope this helps,
Ys
On a side note:
Your problem implies non-scientific-notation output, which is why I don't check for scientific notation :)

Ys
ozo
perldoc -q float
boise2004

ASKER

oh I see I worded that poorly, sorry.
*bump 50 points*

I have a mysql field that holds a 7.2 float.

I want a regex that will check if an incoming # is acceptable to be stored in there.

so the incoming # could be

1
.5
.03
9000000.75
.2e2

etc

note I am not using Perl, I just need a Perl-compatible regex

thank you.
Just to make ozo's suggestion even more obvious

$ perldoc -q float

Found in /usr/lib/perl5/5.8.0/pod/perlfaq4.pod
       How do I determine whether a scalar is a number/whole/integer/float?

               Assuming that you don't care about IEEE notations like "NaN" or
               "Infinity", you probably just want to use a regular expression.

                  if (/\D/)            { print "has nondigits\n" }
                  if (/^\d+$/)         { print "is a whole number\n" }
                  if (/^-?\d+$/)       { print "is an integer\n" }
                  if (/^[+-]?\d+$/)    { print "is a +/- integer\n" }
                  if (/^-?\d+\.?\d*$/) { print "is a real number\n" }
                  if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number\n" }
                  if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/)
                                       { print "a C float\n" }

               You can also use the Data::Types module on the CPAN, which
               exports functions that validate data types using these and
               other regular expressions, or you can use the "Regexp::Common"
               module from CPAN which has regular expressions to match various
               types of numbers.

               If you're on a POSIX system, Perl supports the "POSIX::strtod"
               function.  Its semantics are somewhat cumbersome, so
               here's a "getnum" wrapper function for more convenient access.
               This function takes a string and returns the number it found,
               or "undef" for input that isn't a C float.  The "is_numeric"
               function is a front end to "getnum" if you just want to say,
               "Is this a float?"

                   sub getnum {
                       use POSIX qw(strtod);
                       my $str = shift;
                       $str =~ s/^\s+//;
                       $str =~ s/\s+$//;
                       $! = 0;
                       my($num, $unparsed) = strtod($str);
                       if (($str eq '') || ($unparsed != 0) || $!) {
                           return undef;
                       } else {
                           return $num;
                       }
                   }

                   sub is_numeric { defined getnum($_[0]) }

               Or you could check out the String::Scanf module on the CPAN
               instead. The POSIX module (part of the standard Perl
               distribution) provides the "strtod" and "strtol" for converting
               strings to double and longs, respectively.
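As a rough illustration outside Perl, here is the "C float" pattern from the FAQ excerpt above applied to the asker's sample inputs in Python. The function name `is_c_float` and the `re.ASCII` flag (which keeps `\d` to the digits 0-9) are choices made for this sketch, not part of the FAQ. The pattern only checks syntax; it says nothing about whether the value fits a 7.2 field.

```python
import re

# The "C float" pattern quoted from perlfaq4, unchanged.
C_FLOAT = re.compile(
    r"^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$",
    re.ASCII,  # assumption for this sketch: restrict \d to 0-9
)

def is_c_float(text):
    """Return True if text is syntactically a C-style float literal."""
    return C_FLOAT.match(text) is not None

# The asker's sample inputs all pass the syntax check.
for sample in ["1", ".5", ".03", "9000000.75", ".2e2"]:
    print(sample, is_c_float(sample))

# Syntax only: 1000e4 is accepted although it is too large for 7.2.
print("1000e4", is_c_float("1000e4"))
```

So this pattern is a reasonable first filter, but the range and precision check has to happen separately.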
I think the problem is more challenging than the answers suggest.
The task is to take an arbitrary float in arbitrary notation and check whether this number fits into a 7.2 float db-field.

As far as I see it, other experts please correct me, this is *not* achievable by a regular expression.
  100e4
  1000e-4
  1001e-2
are valid, while
  1000e4
  1001e-4
  1000.1e-2
are not.

The number of leading non-zero digits, the number of digits before and after the decimal point, and the exponent - they all have to be related to each other by non-trivial rules. There is no regex functionality (I know of) which is capable of calculating the actual length of a captured pattern and comparing it to a captured number.
I'm curious whether the other experts know of a solution to this.

I suggest first converting the given number to a string in non-scientific float format. This is easily done in Perl or C with sprintf. If the programming language you are actually using does not provide an equivalent function, you should consider changing the language anyhow :)
Second, match that string against a very straightforward regex like /^\d{0,7}(?:\.\d{0,2})?$/.
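A minimal sketch of the same idea in Python, with one substitution: instead of sprintf plus a regex, it parses the input with the standard `decimal` module and checks the value arithmetically. The function name `fits_7_2`, its parameters, and the decision to accept a leading sign are assumptions of this sketch; "7.2" is taken as the asker defined it (up to 7 digits before the point, up to 2 after).

```python
from decimal import Decimal, InvalidOperation

def fits_7_2(text, int_digits=7, frac_digits=2):
    """Return True if text is a finite number with at most int_digits
    digits before the decimal point and frac_digits after it."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return False              # not a number at all
    if not value.is_finite():
        return False              # rejects NaN and Infinity
    if abs(value) >= 10 ** int_digits:
        return False              # too many digits before the point
    # Rounding to frac_digits places must not change the value.
    step = Decimal(1).scaleb(-frac_digits)   # Decimal("0.01") for 2 places
    return value == value.quantize(step)

for text in ["100e4", "1000e-4", "1001e-2", "1000e4", "1001e-4", "1000.1e-2"]:
    print(text, fits_7_2(text))
```

The first three inputs are the ones listed above as valid and the last three are the ones listed as not valid; the magnitude test runs before `quantize` so that very large inputs are rejected without overflowing the decimal context.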

regards,
woolf
Thinking a little bit more about it :) , there is a solution, *if* you can rely on the scientific notation being canonical. Then
  1000e4
would not be allowed and would have to be written as
  .1e8
instead.

Even then, you would have to cover all possible values of the exponent by lots of OR-ed regular expressions like:
  /^(?:\d{0,7}(?:\.\d{0,2})?|\.\d{1,9}e7|\.\d{1,8}e6|\.\d{1,7}e5| ... |\.\d{1,2}e0|\.\de-1)$/
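To make the shape of that alternation concrete, here is a hedged Python sketch that builds it programmatically. It assumes the elided branches follow the same rule as the ones shown (a mantissa of the form `.ddd` with exponent k may carry up to k + 2 digits), and that the canonical form uses a lowercase `e` with no `+` sign; the function name `canonical_7_2_pattern` is hypothetical. The plain-number branch is copied as posted, so like the original it also matches an empty string.

```python
import re

def canonical_7_2_pattern(int_digits=7, frac_digits=2):
    """Build the OR-ed pattern for plain numbers plus canonical .ddd e k forms."""
    # Plain (non-scientific) branch, as posted.
    plain = r"\d{0,%d}(?:\.\d{0,%d})?" % (int_digits, frac_digits)
    # One branch per exponent k = 7, 6, ..., 0, -1 (assumed rule: k + 2 digits).
    branches = [
        r"\.\d{1,%d}e%d" % (k + frac_digits, k)
        for k in range(int_digits, -frac_digits, -1)
    ]
    return re.compile("^(?:" + "|".join([plain] + branches) + ")$")

pattern = canonical_7_2_pattern()
for text in [".123456789e7", ".1234567891e7", ".1e8", ".5e-1", ".55e-1"]:
    print(text, bool(pattern.match(text)))
```

Even under the canonical-notation assumption, the pattern grows with every allowed exponent, which is the cost described above.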
This is a regexp that catches your cases for normal numbers and all cases for the exponent.

/^(\d{1,7}|(\d{1,7})?\.\d{0,2}|(\d{0,7})?(\.\d{0,7})?[Ee]((\+0*[1-7])|-0*\d{1,2}))$/

I think woolf is right. You cannot skip checking whether the exponent pushes the value out of your range, and that check cannot be done with a regexp.
Or this:

/^(\d{1,7}|(\d{1,7})?\.\d{0,2}|(\d{0,7})?(\.\d{0,7})?[Ee]((\+?0*[1-7])|-0*\d{1,2}))$/

if you want the exponent sign to be optional
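A short Python check of why the exponent still needs a separate range test. It runs the optional-sign pattern above against two of woolf's "not valid" inputs and compares the result with an arithmetic check; the helper names `check` and `value_fits` are made up for this sketch, and the arithmetic check uses the standard `decimal` module.

```python
import re
from decimal import Decimal

# The optional-sign pattern from the post above, unchanged.
POSTED = re.compile(
    r"^(\d{1,7}|(\d{1,7})?\.\d{0,2}"
    r"|(\d{0,7})?(\.\d{0,7})?[Ee]((\+?0*[1-7])|-0*\d{1,2}))$"
)

def value_fits(text):
    """True if the numeric value has <= 7 integer and <= 2 fractional digits."""
    value = Decimal(text)
    return abs(value) < 10 ** 7 and value == value.quantize(Decimal("0.01"))

def check(text):
    """Return (regex accepts, value actually fits)."""
    return bool(POSTED.match(text)), value_fits(text)

for text in ["1000e4", "1001e-4"]:
    print(text, check(text))
# 1000e4 (True, False)
# 1001e-4 (True, False)
```

Both strings pass the pattern, yet 1000e4 is 10000000 (eight digits) and 1001e-4 is 0.1001 (four decimals), so neither fits a 7.2 field.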
ASKER CERTIFIED SOLUTION
CetusMOD