HLRosenberger

Parsing question

How can I best search for this data in a larger string, where the number of spaces between the words can vary? Would a regex work?


MOTHER             RACE       GENDER    AGE (DOB)      SSN      PHONE
hielo

A regex is what I would use. I would also add word boundaries to make sure it doesn't match in the middle of a word, e.g. RACE inside TRACE:
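' \(DOB\) is kept outside the \b...\b group: "(" and ")" are not word characters, so \b cannot match between them and a space
myRegExp.Pattern = "\s*(\b(MOTHER|RACE|GENDER|AGE|SSN|PHONE)\b|\(DOB\))\s*"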

HLRosenberger (ASKER)

OK, thanks. I know regular expressions are powerful, but I have not used them much. In your example of TRACE, I would not want that to match, even though it contains RACE. There would only ever be spaces between MOTHER and RACE, so I would want "MOTHER     TRACE" not to match.
>> In your example of TRACE, I would not want that to match, even though it contains RACE
That's exactly why I included the "\b" (word boundary) delimiters.  The expression above would match RACE, but not if it is part of a larger word.
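As a quick illustration of the word-boundary point, here is a minimal sketch in Python (chosen only for illustration; the thread's pattern is assigned to a VBScript-style RegExp object, and `\b` behaves the same way there). The sample string and the variable names are made up for this example.

```python
import re

# Hypothetical sample: TRACE appears before the standalone word RACE.
text = "MOTHER" + " " * 5 + "TRACE" + " " * 3 + "RACE"

# Without \b, the first hit is the RACE inside TRACE.
loose = re.search(r"RACE", text)

# With \b on both sides, only the standalone word RACE can match.
strict = re.search(r"\bRACE\b", text)

print(loose.start())   # 12, inside TRACE
print(strict.start())  # 19, the standalone RACE
```

The position just before the R in TRACE is not a word boundary, because T and R are both word characters, so the bounded pattern skips it.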

Are you looking for the entire substring "MOTHER             RACE       GENDER    AGE (DOB)      SSN      PHONE" in that specific order within a larger string (with varying spaces between the words, of course), or are you looking for any of those words? The regex I posted looks for each of those words individually: the "|" means "OR". In other words, it says: match "MOTHER OR RACE OR ..."
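To see what an "OR" pattern finds in the header line, here is a small Python sketch (Python is used only for illustration; the names are invented for this example). It also shows one detail worth checking when `\b` is combined with an alternative that starts or ends with punctuation: `\b` needs a word character on one side, so it can never match between a space and a parenthesis.

```python
import re

header = "MOTHER             RACE       GENDER    AGE (DOB)      SSN      PHONE"

# \b wrapped around every alternative: \(DOB\) cannot match in this header,
# because there is no word boundary between a space and "(".
all_bounded = re.compile(r"\b(MOTHER|RACE|GENDER|AGE|\(DOB\)|SSN|PHONE)\b")

# \b only around the plain words; \(DOB\) is matched without boundaries.
mixed = re.compile(r"\b(?:MOTHER|RACE|GENDER|AGE|SSN|PHONE)\b|\(DOB\)")

words_bounded = [m.group(0) for m in all_bounded.finditer(header)]
words_mixed = [m.group(0) for m in mixed.finditer(header)]

print(words_bounded)  # ['MOTHER', 'RACE', 'GENDER', 'AGE', 'SSN', 'PHONE']
print(words_mixed)    # ['MOTHER', 'RACE', 'GENDER', 'AGE', '(DOB)', 'SSN', 'PHONE']
```

Each alternative is matched on its own, in whatever order it occurs, which is why this form cannot confirm that the words appear together as one header.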
Also, I need the index/offset of that substring within the larger string.
Ah, sorry.  I'm looking for a string with ALL those words, with varying spaces between the words.
OK, then change the "|" to "\s+":
myRegExp.Pattern = "\s*\b(MOTHER\s+RACE\s+GENDER\s+AGE\s+\(DOB\)\s+SSN\s+PHONE)\b\s*"
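Since the offset of the header within the larger string was also asked for, here is a hedged Python sketch of the same pattern (Python is used only for illustration; the surrounding report text and all variable names are hypothetical). Because the pattern starts with `\s*`, the overall match begins at any whitespace in front of MOTHER, so the offset of the header itself is the start of the parenthesized group.

```python
import re

# Hypothetical larger string; only the header wording comes from the thread.
larger = (
    "REPORT 42\n"
    + "  "
    + "MOTHER   RACE  GENDER AGE (DOB)    SSN  PHONE"
    + "\nJANE DOE"
)

# Same pattern as above, written as a Python raw string.
header_re = re.compile(
    r"\s*\b(MOTHER\s+RACE\s+GENDER\s+AGE\s+\(DOB\)\s+SSN\s+PHONE)\b\s*"
)

m = header_re.search(larger)
whole_start = m.start()    # includes the whitespace matched by the leading \s*
header_start = m.start(1)  # where the word MOTHER actually begins

print(whole_start, header_start)  # 9 12
print(repr(m.group(1)))
```

If the engine in use only reports where the whole match starts, dropping the leading `\s*` from the pattern makes that position coincide with the M of MOTHER.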


Ah, great! That works. So in English, what does this search string do?

Search for a string that begins and ends with spaces, and

Inside the parens is the string to search for, and

\s+ means there can be any number of spaces.

What does the \b mean?
ASKER CERTIFIED SOLUTION
hielo
This solution is only available to members.
Thanks so much!