Solved

Vexing regex in VB, Pt. 2

Posted on 2014-02-27
Last Modified: 2014-02-27
I need to find table names that are located between two markers, "Begin InputTables" and "Begin OutputColumns". The table names are in turn located between double quotes. In the following example, I am looking for "AGCOUNTY", "ELIGIBILITY" and "X":

Begin InputTables
    Name ="AGCOUNTY"
    Name ="ELIGIBILITY"
    Name ="X"
End
Begin OutputColumns
Since I'm looping through the text, I don't need or want to find these strings using a regex MatchCollection.

My question: why doesn't the following regex...

(?<=^Begin InputTables\r*.*)X
... find the X in singleline mode?

This regex...

(?<=^Begin InputTables\r\n.*)AGCOUNTY
... will find AGCOUNTY in multiline mode, and this regex...

(?<=^Begin InputTables\r\n.*\r\n.*)ELIGIBILITY
... will find ELIGIBILITY in multiline mode.

Put another way, what regex can I use to allow for an unknown number of carriage returns in a lookbehind in .NET?
Question by:aanuncio
13 Comments
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39893736
It worked for me on myregextester.com with the .NET mode turned on. In .NET the wildcard "." always matches \r, and in singleline mode it matches \n as well, so the pattern can be simplified to:
(?<=^Begin InputTables.*)X

But it won't work if the word Begin is not at the very start of the input text, which is the only position that ^ matches unless multiline mode is on.
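
As an illustration of the point about ".", here is a minimal sketch that runs the simplified pattern against the sample text from the question, once without and once with singleline mode. The module name and the CR+LF line endings are assumptions made for the example; the thread does not say which options the original test used.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SinglelineDemo
    Sub Main()
        ' Sample text from the question, joined with CR+LF line endings (assumed).
        Dim sample As String = String.Join(vbCrLf, {
            "Begin InputTables",
            "    Name =""AGCOUNTY""",
            "    Name =""ELIGIBILITY""",
            "    Name =""X""",
            "End",
            "Begin OutputColumns"})

        Dim pattern As String = "(?<=^Begin InputTables.*)X"

        ' Without Singleline, "." stops at every line feed, so the lookbehind
        ' cannot reach back from the X line to the first line.
        Console.WriteLine(Regex.IsMatch(sample, pattern, RegexOptions.None))       ' False

        ' With Singleline, "." also matches the line feed and the X is found.
        Console.WriteLine(Regex.IsMatch(sample, pattern, RegexOptions.Singleline)) ' True
    End Sub
End Module
```

If the original pattern fails in a real program, one thing worth checking is whether RegexOptions.Singleline (or an inline (?s)) is actually in effect where the match is made.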
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39893740
Are you wanting to find those 3 tables by their exact names, or would you prefer to grab the values matching something like:
Name ="[^"]*"

 
LVL 35

Expert Comment

by:Terry Woods
ID: 39893742
Multiline mode will make ^ match the beginning of any line, rather than just the beginning of the input string.
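
A small sketch of that difference, assuming the marker is preceded by some other line. The "Header line" text below is a made-up placeholder, not part of the asker's data.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MultilineDemo
    Sub Main()
        ' The first line is a hypothetical placeholder, so "Begin InputTables"
        ' no longer sits at the very start of the input.
        Dim sample As String = String.Join(vbCrLf, {
            "Header line",
            "Begin InputTables",
            "    Name =""X""",
            "End",
            "Begin OutputColumns"})

        Dim pattern As String = "(?<=^Begin InputTables.*)X"

        ' Singleline only: ^ matches just at the start of the whole string.
        Console.WriteLine(Regex.IsMatch(sample, pattern, RegexOptions.Singleline)) ' False

        ' Singleline plus Multiline: ^ also matches right after each line feed.
        Dim both As RegexOptions = RegexOptions.Singleline Or RegexOptions.Multiline
        Console.WriteLine(Regex.IsMatch(sample, pattern, both))                    ' True
    End Sub
End Module
```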
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39893745
\s matches a whitespace character, which can be a space, tab, carriage return or line feed. That might be what you're looking for?
 
LVL 84

Expert Comment

by:ozo
ID: 39893747
.
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39893754
This pattern might do the trick? It will capture the table names; you just need to extract them from capture group 1 of each match:
(?s)(?<=^Begin InputTables(?:(?!Begin OutputColumns).)*)Name ="([^"]*)"

The (?s) at the start indicates singleline mode (it can be activated that way).
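
One way to read the captured names without a MatchCollection is to step through the matches with Match.NextMatch. The following is a sketch under the assumption that the input starts with "Begin InputTables" and uses CR+LF line endings; the module and variable names are illustrative only.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module CaptureDemo
    Sub Main()
        Dim sample As String = String.Join(vbCrLf, {
            "Begin InputTables",
            "    Name =""AGCOUNTY""",
            "    Name =""ELIGIBILITY""",
            "    Name =""X""",
            "End",
            "Begin OutputColumns"})

        ' The pattern from the comment above; doubled quotes are VB string escapes.
        Dim pattern As String =
            "(?s)(?<=^Begin InputTables(?:(?!Begin OutputColumns).)*)Name =""([^""]*)"""

        ' Walk the matches one at a time; group 1 holds the text between the quotes.
        Dim m As Match = Regex.Match(sample, pattern)
        While m.Success
            Console.WriteLine(m.Groups(1).Value)
            m = m.NextMatch()
        End While
        ' Prints AGCOUNTY, ELIGIBILITY and X, one per line.
    End Sub
End Module
```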
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39893756
The (?:(?!Begin OutputColumns).)* part of the pattern says don't go past "Begin OutputColumns" trying to find a match.
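
To see what that guard changes, the sketch below counts matches with and without it. The Name ="EXTRA" line after "Begin OutputColumns" is a hypothetical addition for the demonstration and does not come from the question.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module GuardDemo
    Sub Main()
        Dim sample As String = String.Join(vbCrLf, {
            "Begin InputTables",
            "    Name =""AGCOUNTY""",
            "    Name =""ELIGIBILITY""",
            "    Name =""X""",
            "End",
            "Begin OutputColumns",
            "    Name =""EXTRA"""}) ' hypothetical line past the end marker

        ' Plain ".*" in the lookbehind: anything after the start marker qualifies.
        Dim unguarded As String =
            "(?s)(?<=^Begin InputTables.*)Name =""([^""]*)"""

        ' Guarded form: no character consumed by the lookbehind may be the
        ' start of "Begin OutputColumns".
        Dim guarded As String =
            "(?s)(?<=^Begin InputTables(?:(?!Begin OutputColumns).)*)Name =""([^""]*)"""

        Console.WriteLine(Regex.Matches(sample, unguarded).Count) ' 4
        Console.WriteLine(Regex.Matches(sample, guarded).Count)   ' 3
    End Sub
End Module
```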
 

Author Comment

by:aanuncio
ID: 39893767
TerryAtOpus: eventually I'll need to match anything between double quotes that are in turn between "Begin InputTables" and "Begin OutputColumns", but for now I want to search for the tables by their exact names.
 
LVL 35

Accepted Solution

by:
Terry Woods earned 500 total points
ID: 39893773
Then this should do the trick:
(?ms)(?<=^Begin InputTables(?:(?!Begin OutputColumns).)*)Name ="AGCOUNTY"

 

Author Comment

by:aanuncio
ID: 39893801
Should, but doesn't. It only finds the first match. On subsequent iterations, it misses "ELIGIBILITY" and "X".

Here's the code:

Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Multiline
Dim pattern As String = "(?ms)(?<=^Begin InputTables(?:(?!Begin OutputColumns).)*)Name =""" & findText & """"
Dim m As Match = Regex.Match(inputLines, pattern, options)

I can make it match the first, second or third lines between "Begin InputTables" and "Begin OutputColumns", but I can't make it find all of them using a single pattern. That wouldn't be a problem if I had a finite number of lines, but I don't.
 

Author Comment

by:aanuncio
ID: 39893803
By the way, what's the "(?ms)"?
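
For readers with the same question: in .NET, (?ms) is an inline option group that turns on multiline (m) and singleline (s) mode from inside the pattern, which should behave the same as passing RegexOptions.Multiline Or RegexOptions.Singleline. The sketch below checks both forms for each table name. The header line, the module name, and the use of Regex.Escape on the search text are assumptions added for the example, not part of the code posted in this thread.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module InlineOptionsDemo
    Sub Main()
        ' "Header line" is a hypothetical placeholder so that ^ needs multiline mode.
        Dim sample As String = String.Join(vbCrLf, {
            "Header line",
            "Begin InputTables",
            "    Name =""AGCOUNTY""",
            "    Name =""ELIGIBILITY""",
            "    Name =""X""",
            "End",
            "Begin OutputColumns"})

        For Each findText As String In {"AGCOUNTY", "ELIGIBILITY", "X"}
            ' Regex.Escape keeps any regex metacharacters in the name literal.
            Dim body As String =
                "(?<=^Begin InputTables(?:(?!Begin OutputColumns).)*)Name =""" &
                Regex.Escape(findText) & """"

            ' Same pattern, options given inline versus through RegexOptions.
            Dim inlineForm As Boolean = Regex.IsMatch(sample, "(?ms)" & body)
            Dim optionsForm As Boolean = Regex.IsMatch(
                sample, body, RegexOptions.Multiline Or RegexOptions.Singleline)

            ' Each name prints "True / True".
            Console.WriteLine("{0}: {1} / {2}", findText, inlineForm, optionsForm)
        Next
    End Sub
End Module
```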
 

Author Comment

by:aanuncio
ID: 39893806
TerryAtOpus: I got it. You were right. I am unworthy.
 

Author Closing Comment

by:aanuncio
ID: 39893808
How do people get so smart?