• Status: Solved
  • Priority: Medium
  • Security: Public

help needed with regular expression

Hi

I have written a regular expression for a complicated input string and I would like help to optimise or improve it, please.

Here are some examples of the input string

556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509,T,140521,La,52.97084,N,Lo,2.18178,W,V,0.3,*
556:$ID,354879020144852,ALARM,0x00000010,IOP,I,0x00,GPSEX,A,D,060509,T,140626,La,52.97082,N,Lo,2.18182,W,V,0.8

The format is explained below. I am assuming you can see that the parts of the string are delimited by commas.
a) 3 or 4 digits followed by a colon and $ID
b) variable 15 digits
c - not always present)  ALARM,0x00000010
c) IOP followed by an I or an O
d) 0x00 or 0x10
e) GPSEX
f) an A or a V
g) D
h) 6 digits
i) T
j) 6 digits
k) La
l) 0.00000 or 2 digits, decimal point, 5 digits e.g. 52.12345
m) N, Lo
o) 0.00000 or 1 digit, decimal point, 5 digits e.g. 2.12345
p) W or E
q) V
r) 0.0 or 1 to 3 digits, decimal point, 2 digits e.g. 0.3, 45.3, 101.3
s - not always present) ,*

Here is what I have so far

\d{3,4}$ID,(\d{15}),IOP,[I|O],([0x00|0x10]),GPSEX,([A|V]),D,(\d{6}),T,(\d{6}),La,(.{7,8}),N,Lo,.{7},([W|E]),V,

I am particularly stuck on the last bit. The number that comes after the V is sometimes followed by a comma and a *

I need to save the number that follows the V but I don't know how to. How do I write a character set that says "character can be a digit from 0 to 9 or a decimal point only"?

Many thanks
andrea

ToddBeaulieu Commented:
[0-9]*\.[0-9]*

will match a numeric sequence, a decimal point, and another numeric sequence. (The dot must be escaped; a bare . matches any character.)

Do you need to specify the length maximums?
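
To answer the character-set part of the question directly, here is a minimal sketch in Python (chosen only for illustration; the same character-class syntax works in Perl and .NET). The variable names and the `tail` string are made up for the example. It contrasts a class of "digit or dot", a stricter digits-dot-digits form, and what happens when the dot is left unescaped.

```python
import re

# Character class: one or more characters, each a digit or a literal dot.
# Inside [...] the dot is already literal, so no backslash is needed.
loose = re.compile(r"[0-9.]+")

# Stricter form: digits, an escaped (literal) dot, then digits.
strict = re.compile(r"[0-9]+\.[0-9]+")

# Outside a class, an unescaped dot matches any character,
# so this pattern also accepts strings such as "12x5".
unescaped = re.compile(r"[0-9]*.[0-9]*")

# Hypothetical tail of an input line; capture the number after ",V,".
tail = "W,V,0.3,*"
speed = re.search(r",V,([0-9.]+)", tail).group(1)
print(speed)
```

The loose class is the literal answer to the question, but it also accepts oddities like "1.2.3"; the stricter form is usually the safer choice when the number always contains exactly one decimal point.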

Todd Mummert Commented:

for most of your decimals you may want to use:

(\d+(?:\.\d+)?)       # this will match simple number (1, 12, etc), as well as ones that include a decimal followed by more digits

You can use a bounded count such as \d{min,max} rather than \d+ if you want to limit the number of digits

for the not always present ,*, you can use

(,\*)?

the same for the not always present ALARM

(,ALARM,0x00000010)?
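
As a rough illustration of these three fragments, the sketch below runs them in Python against the two sample lines from the question (the constructs are the same in Perl and .NET). The names `speed_re`, `head_re`, `with_star` and `with_alarm` are invented for the example, and the surrounding literals (`,V,`, `\$ID,`, `,IOP`) are only there to give the fragments some context.

```python
import re

# Number after ",V," followed by an optional ",*" at the end of the line.
speed_re = re.compile(r",V,(\d+(?:\.\d+)?)(,\*)?$")

# Optional ALARM field between the 15-digit identifier and IOP.
head_re = re.compile(r"\$ID,(\d{15})(,ALARM,0x00000010)?,IOP")

with_star = ("556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509,"
             "T,140521,La,52.97084,N,Lo,2.18178,W,V,0.3,*")
with_alarm = ("556:$ID,354879020144852,ALARM,0x00000010,IOP,I,0x00,GPSEX,A,"
              "D,060509,T,140626,La,52.97082,N,Lo,2.18182,W,V,0.8")

for line in (with_star, with_alarm):
    s = speed_re.search(line)
    h = head_re.search(line)
    # An optional group that did not take part in the match is None.
    print(s.group(1), s.group(2), h.group(2))
```

An optional group that does not participate simply comes back empty (None in Python), so the same pattern handles lines with and without the ALARM field or the trailing ,*.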


andieje (Author) Commented:
climbgunks, please could you explain how this matches a decimal?

(\d+(?:\.\d+)?)  

Why are there 2 lots of brackets? I also don't know what the ?:\ bit means

However, that is more detailed than I need because the numbers are always expressed as a decimal (I think the above accounts for an optional decimal point followed by digits).

andieje (Author) Commented:
regarding this:

(\d+(?:\.\d+)?)

I see this means

\d+  one or more digits

followed by an optional decimal point \. and one or more digits again \d+

This would give (\d+(\.\d+)?) to me. I don't know what the ?: is for

Todd Mummert Commented:
the ?: after the opening ( tells Perl not to capture that group as a field... it just makes things neater

usually every pair of ()'s counts as a field: $1, $2, etc.

so we don't want a separate field holding just the decimal point and the digits after it... the entire number is already captured by the enclosing group

sorry for the confusion
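
The difference is easiest to see by listing the captured fields. This is a small sketch in Python rather than Perl (group numbering works the same way; Python's `groups()` corresponds to Perl's $1, $2, ...). The variable names are made up for the example.

```python
import re

capturing = re.compile(r"(\d+(\.\d+)?)")        # inner group is captured
non_capturing = re.compile(r"(\d+(?:\.\d+)?)")  # inner group only groups

m1 = capturing.search("52.97084")
m2 = non_capturing.search("52.97084")

# With (...) the fractional part becomes its own field.
print(m1.groups())  # ('52.97084', '.97084')

# With (?:...) only the whole number is captured.
print(m2.groups())  # ('52.97084',)
```

Both patterns match exactly the same text; (?:...) only changes which fields are reported afterwards.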

andieje (Author) Commented:
I'm trying to do it bit by bit and I can't even get this first bit to work

"$ID,(\d{15}).*"

can you see anything wrong with that?

thanks

andieje (Author) Commented:
I think $ might mean end of line?

Todd Mummert Commented:
escape the $
\$ID,(\d{15})
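
A quick way to confirm the diagnosis is to run both versions side by side. The sketch below uses Python for illustration; `$` is an end-of-input anchor there just as in Perl and .NET. The `line` value is a shortened copy of the sample input, and the variable names are invented for the example.

```python
import re

line = "556:$ID,354879020144852,IOP,I,0x10"

# Unescaped: $ asserts end of input, so "ID," can never follow it.
unescaped = re.search(r"$ID,(\d{15})", line)

# Escaped: \$ matches a literal dollar sign.
escaped = re.search(r"\$ID,(\d{15})", line)

# re.escape builds the escaped literal for you, which helps when the
# fixed text contains several special characters.
via_helper = re.search(re.escape("$ID,") + r"(\d{15})", line)

print(unescaped)
print(escaped.group(1))
```

In .NET the equivalent helper is Regex.Escape; either way, the point is that $ only means "dollar sign" once it is escaped.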

andieje (Author) Commented:
More problems

This works

           Dim gpsRegExString As String = "\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]),.*"

but when I add GPSEX to the end it stops matching

  Dim gpsRegExString As String = "\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]),GPSEX.*"

ANY IDEAS?

thanks

andieje (Author) Commented:
My mistake, it stops working when I add the comma

so this works
"\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]).*"
but this does not
"\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]),.*"

Todd Mummert Commented:

your use of [] is incorrect...

it's not used to delimit choices, but to create matching character sets...

so you want

(I|O)   and (0x00|0x10)    

this [0x00|0x10] matches a single character from the set [0x|1] (that is 0, x, 1, or the character '|' itself), which isn't what you want at all.
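
This also explains why the earlier pattern matched until the comma was added. The sketch below reproduces that in Python (character classes and alternation behave the same in .NET); `line`, `wrong` and `right` are names made up for the example, and `line` is a shortened copy of the sample input.

```python
import re

line = "$ID,354879020144852,IOP,I,0x10,GPSEX,A"

wrong = r"\$ID,(\d{15}),IOP,[I|O],([0x00|0x10])"  # character classes
right = r"\$ID,(\d{15}),IOP,(I|O),(0x00|0x10)"    # alternation

# The class consumes only the "0" of "0x10"; ".*" swallows the rest,
# which hides the mistake.
m_loose = re.search(wrong + r".*", line)
print(m_loose.group(2))  # 0

# Requiring a comma next fails, because "x" follows that single "0".
m_comma = re.search(wrong + r",.*", line)
print(m_comma)  # None

# With alternation the whole token is matched and captured.
m_right = re.search(right + r",GPSEX", line)
print(m_right.groups())
```

For a single-character choice such as I or O, the class [IO] (without the |) is also fine; alternation is only required when the choices are longer than one character.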


andieje (Author) Commented:
thanks for your help, all working
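
For reference, here is one way the fixes from this thread might be combined into a single pattern. It is a sketch, not the pattern the author ended up with (that was not posted). It is written in Python with verbose mode and named groups for readability; .NET spells named groups (?<name>...) instead of (?P<name>...). The group names (device_id, date, lat, speed and so on) are guesses at what the fields mean, and the final number allows one or two decimal digits because the format description says two while the examples show one.

```python
import re

GPS_RE = re.compile(r"""
    ^(?P<prefix>\d{3,4}):\$ID,          # a) 3 or 4 digits, colon, literal $ID
    (?P<device_id>\d{15}),              # b) 15 digits
    (?P<alarm>ALARM,0x00000010,)?       #    optional ALARM field
    IOP,(?P<io>I|O),                    # c) IOP followed by I or O
    (?P<flags>0x00|0x10),               # d) 0x00 or 0x10
    GPSEX,(?P<fix>A|V),                 # e, f) GPSEX, then A or V
    D,(?P<date>\d{6}),                  # g, h) D and 6 digits
    T,(?P<time>\d{6}),                  # i, j) T and 6 digits
    La,(?P<lat>\d{1,2}\.\d{5}),N,       # k, l, m)
    Lo,(?P<lon>\d\.\d{5}),(?P<ew>W|E),  # o, p)
    V,(?P<speed>\d{1,3}\.\d{1,2})       # q, r) number after V
    (?:,\*)?$                           # s) optional trailing ,*
""", re.VERBOSE)

samples = [
    "556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509,"
    "T,140521,La,52.97084,N,Lo,2.18178,W,V,0.3,*",
    "556:$ID,354879020144852,ALARM,0x00000010,IOP,I,0x00,GPSEX,A,"
    "D,060509,T,140626,La,52.97082,N,Lo,2.18182,W,V,0.8",
]

matches = [GPS_RE.match(s) for s in samples]
for m in matches:
    print(m.group("device_id"), m.group("lat"), m.group("lon"), m.group("speed"))
```

Anchoring with ^ and $ makes the pattern reject lines with unexpected extra fields; drop the anchors if the input may carry leading or trailing text.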