Solved

Help needed with regular expression

Posted on 2009-05-18
Last Modified: 2012-05-07
Hi

I have written a regular expression for a complicated input string and I would like help to optimise or improve it, please.

Here are some examples of the input string

556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509,T,140521,La,52.97084,N,Lo,2.18178,W,V,0.3,*
556:$ID,354879020144852,ALARM,0x00000010,IOP,I,0x00,GPSEX,A,D,060509,T,140626,La,52.97082,N,Lo,2.18182,W,V,0.8

The format is explained below. I am assuming you can see that the parts of the string are delimited by commas
a) 3 or 4 digits followed by :$ID
b) variable 15 digits
c - not always present)  ALARM,0x00000010
c) IOP followed by an I or an O
d) 0x00 or 0x10
e) GPSEX
f) an A or a V
g) D
h) 6 digits
i) T
j) 6 digits
k)La
l) 0.00000 or 2 digits, decimal point, 5 digits e.g. 52.12345
m) N, Lo
o) 0.00000 or 1 digit, decimal point, 5 digits e.g. 2.12345
p) W or E
q) V
r) 0.0 or 1 to 3 digits, decimal point, 1 digit, e.g. 0.3, 45.3, 101.3
s - not always present) ,*

Here is what I have so far

\d{3,4}$ID,(\d{15}),IOP,[I|O],([0x00|0x10]),GPSEX,([A|V]),D,(\d{6}),T,(\d{6}),La,(.{7,8}),N,Lo,.{7},([W|E]),V,

I am particularly stuck on the last bit. The number that comes after the V is sometimes followed by a comma and a *.

I need to save the number that follows the V but I don't know how to. How do I write a character set that says "character can be a number from 0 to 9 or a decimal point only"?

Many thanks
andrea
Question by:andieje
LVL 16

Expert Comment

by:ToddBeaulieu
ID: 24413711
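[0-9]*\.[0-9]*

will match a numeric sequence, a decimal point, and another numeric sequence. The dot is escaped so that it matches only a literal decimal point; an unescaped . matches any character.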

Do you need to specify the length maximums?
 
LVL 11

Accepted Solution

by:
climbgunks earned 500 total points
ID: 24413956

for most of your decimals you may want to use:

(\d+(?:\.\d+)?)       # this will match simple number (1, 12, etc), as well as ones that include a decimal followed by more digits

You can add the additional specifications \d{min,max}   rather than using \d if you want

for the not always present ,* you can use

(,\*)?

the same for the not always present ALARM

(,ALARM,0x00000010)?
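
For anyone wanting to try the two pieces above together, here is a minimal sketch that pulls the number after V out of both sample strings from the question. It uses Python's re module purely as a convenient way to run the pattern (the question is about .NET, where these constructs are written the same way). The names TAIL, samples, matches, speeds and stars are made up for illustration.

```python
import re

# Tail of the record: ",V,<number>" optionally followed by ",*".
# Group 1 is the whole number; group 2 is the optional ",*".
TAIL = re.compile(r",V,(\d+(?:\.\d+)?)(,\*)?$")

samples = [
    "556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509,T,140521,"
    "La,52.97084,N,Lo,2.18178,W,V,0.3,*",
    "556:$ID,354879020144852,ALARM,0x00000010,IOP,I,0x00,GPSEX,A,D,060509,"
    "T,140626,La,52.97082,N,Lo,2.18182,W,V,0.8",
]

matches = [TAIL.search(s) for s in samples]
speeds = [m.group(1) for m in matches]
stars = [m.group(2) for m in matches]

print(speeds)  # ['0.3', '0.8']
print(stars)   # [',*', None]
```

The optional group simply comes back empty (None in Python) when the trailing ,* is missing, so the same pattern covers both forms.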


 

Author Comment

by:andieje
ID: 24415493
climbgunks, please could you explain how this matches a decimal?

(\d+(?:\.\d+)?)  

Why are there 2 lots of brackets? I also don't know what the ?: bit means

However, that is more detailed than I need because the numbers are always expressed as a decimal (I think the above accounts for optional digits after an optional decimal point)

Author Comment

by:andieje
ID: 24415544
regarding this:

(\d+(?:\.\d+)?)

I see this means

\d+  one or more digits

followed by an optional decimal point \. and one or more digits again \d+

This would give (\d+(\.\d+)?) to me. I don't know what the ?: is for
 
LVL 11

Expert Comment

by:climbgunks
ID: 24415614
the ?: after the opening ( tells the regex engine not to capture that group as a field (the syntax comes from perl and works the same in .NET)... just makes things neater

usually every pair of ()'s would count as a field   $1, $2, etc

so we don't want a field with just the stuff after the decimal point (and the decimal point)... the entire number is already captured by the outer pair of ()'s

sorry for the confusion
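
A quick way to see the difference is to run both versions against the same number and look at the fields that come back. This is only a sketch in Python's re module (the grouping syntax is the same in .NET), and the variable names are invented for the example.

```python
import re

# Same pattern twice: once with a plain inner group, once with (?: ... ).
capturing = re.compile(r"(\d+(\.\d+)?)")
non_capturing = re.compile(r"(\d+(?:\.\d+)?)")

text = "52.97084"

groups_with = capturing.match(text).groups()
groups_without = non_capturing.match(text).groups()

print(groups_with)     # ('52.97084', '.97084')
print(groups_without)  # ('52.97084',)
```

Both patterns match exactly the same text; the only change is that the (?: ... ) version does not produce the extra '.97084' field.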
 

Author Comment

by:andieje
ID: 24415896
I'm trying to do it bit by bit and I can't even get this first bit to work

"$ID,(\d{15}).*"

can you see anything wrong with that?

thanks
 

Author Comment

by:andieje
ID: 24415912
I think $ might mean end of line?
 
LVL 11

Expert Comment

by:climbgunks
ID: 24415913
escape the $
\$ID,(\d{15})
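
To illustrate why the escape matters: an unescaped $ is an end-of-input anchor, so "$ID" asks for the letters ID to appear after the end of the string, which can never happen. A small sketch, again using Python's re module just to run the patterns (the line and variable names are for illustration only):

```python
import re

line = "556:$ID,354879020144852,IOP,I,0x10,GPSEX"

# Unescaped: $ anchors at the end of the input, so nothing can follow it.
unescaped = re.search(r"$ID,(\d{15})", line)

# Escaped: \$ matches a literal dollar sign.
escaped = re.search(r"\$ID,(\d{15})", line)

print(unescaped)         # None
print(escaped.group(1))  # 354879020144852
```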
 

Author Comment

by:andieje
ID: 24416027
More problems

This works

Dim gpsRegExString As String = "\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]),.*"

but when I add GPSEX to the end it stops matching

Dim gpsRegExString As String = "\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]),GPSEX.*"

Any ideas?

thanks
 

Author Comment

by:andieje
ID: 24416047
My mistake, it stops working when I add the comma

so this works
"\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]).*"
but this does not
"\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10]),.*"
 
LVL 11

Expert Comment

by:climbgunks
ID: 24416060

your use of [] is incorrect...

it's not used to delimit choices, but to create matching character sets...

so you want

(I|O)   and (0x00|0x10)    

this [0x00|0x10]   matches a single character from  [0x|1]     including the character '|'  which isn't what you want at all.
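
This also explains why the pattern only broke once the comma was added. The class consumes a single character (the first "0" of "0x10"), so a following .* happily swallows the rest, but a following comma has to line up with the "x" and fails. A sketch of all three cases, using Python's re module to run the patterns and a shortened sample line (variable names are illustrative):

```python
import re

line = "556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509"

# Character class: exactly ONE character out of 0, x, | and 1.
wrong = r"\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,[I|O],([0x00|0x10])"
# Alternation: one of the complete alternatives.
right = r"\$ID,(\d{15}),(ALARM,0x00000010,)?IOP,(I|O),(0x00|0x10)"

wrong_open = re.search(wrong + r".*", line)        # matches, but captures only "0"
wrong_comma = re.search(wrong + r",.*", line)      # fails: "0" is followed by "x"
right_comma = re.search(right + r",GPSEX.*", line)

print(wrong_open.group(3))   # 0
print(wrong_comma)           # None
print(right_comma.group(4))  # 0x10
```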

 

Author Closing Comment

by:andieje
ID: 31582653
thanks for your help, all working
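
The final pattern was not posted, so the following is only a guess at how the corrections in this thread might be combined, checked against the two sample strings from the question. Assumptions: the leading digits are followed by a colon (as in the samples), the latitude hemisphere is always N (as in the original attempt), and the ALARM block is not captured. It is run with Python's re module for convenience; GPS_RECORD and the other names are hypothetical.

```python
import re

# Assumed combination of the fixes discussed above, not the asker's actual code.
GPS_RECORD = re.compile(
    r"^\d{3,4}:\$ID,(\d{15}),"                # leading digits, literal $ID, 15-digit id
    r"(?:ALARM,0x00000010,)?"                 # optional ALARM block, not captured
    r"IOP,(I|O),(0x00|0x10),"                 # alternation instead of [..] classes
    r"GPSEX,(A|V),D,(\d{6}),T,(\d{6}),"       # fix flag, date, time
    r"La,(\d+\.\d+),N,Lo,(\d+\.\d+),(W|E),"   # latitude, longitude, W or E
    r"V,(\d+\.\d+)(?:,\*)?$"                  # number after V, optional trailing ,*
)

samples = [
    "556:$ID,354879020144852,IOP,I,0x10,GPSEX,A,D,060509,T,140521,"
    "La,52.97084,N,Lo,2.18178,W,V,0.3,*",
    "556:$ID,354879020144852,ALARM,0x00000010,IOP,I,0x00,GPSEX,A,D,060509,"
    "T,140626,La,52.97082,N,Lo,2.18182,W,V,0.8",
]

records = [GPS_RECORD.match(s).groups() for s in samples]
for fields in records:
    print(fields)
```

Each tuple holds the ten captured fields in order: id, I/O, status byte, A/V, date, time, latitude, longitude, W/E, and the number after V.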