Tom Knowlton asked:

RegEx how to

Can someone provide me with RegEx code that will turn

THIS

Dim LenderPhone_Before As Variant


INTO THIS:

Write #1, "LenderPhone, " & LenderPhone_Before


public string ConvertVarIntoPhrase(string oldString)
{

}


==============

For example:

string oldString = "Dim LenderPhone_Before As Variant";
string newString="";

newString = ConvertVarIntoPhrase(oldString);



newString now contains:

Write #1, "LenderPhone, " & LenderPhone_Before


========================================


I will be working on this as well......as I am trying to learn RegEx.....

amit_g:

Try this ...

    Dim reg_exp As New RegExp
    reg_exp.Pattern = "Dim (.*?)_(.*?) As Variant"
    reg_exp.IgnoreCase = True
    reg_exp.Global = True
    NewString = reg_exp.Replace(InString, "Write #1, ""$1,"" & $1_$2")

Oh, that is VB. Converted to C#, the pattern should be

Dim (.*?)_(.*?) As Variant

and the replace string should be

Write #1, \"$1,\" & $1_$2

with each " escaped as \"

Tom Knowlton (ASKER):

What would it be in C#?

Did I post in the wrong TA....or did you assume I wanted VB?   :)

Anyways....thanks!!!!   man you guys are fast!!!!  wow!
Nevermind....you got it!
So the C# method internal code would look like......????

public string ConvertVarIntoPhrase(string oldString)
{
???????
????????
???????
}
So it is

using System.Text.RegularExpressions

that gives me access to the RegEx class library?
using System.Text.RegularExpressions;

public string ConvertVarIntoPhrase(string oldString)
{
    // The .NET Regex class takes the pattern and options in its constructor;
    // it has no Pattern, IgnoreCase, or Global properties.
    Regex reg_exp = new Regex("Dim (.*?)_(.*?) As Variant", RegexOptions.IgnoreCase);

    // Replace substitutes every match by default, so no Global flag is needed.
    string newString = reg_exp.Replace(oldString, "Write #1, \"$1,\" & $1_$2");

    return newString;
}
Super!

Let me try it out......


Tom
using System;
using System.Text.RegularExpressions;

namespace ConsoleApplicationTestOOPConcepts
{
    /// <summary>
    /// Summary description for Class1.
    /// </summary>
    class ClassTestOOOP
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            string exp = "Dim LenderPhone_Before As Variant";
            string newStr;
            newStr = ConvertVarIntoPhrase(exp);
            Console.WriteLine(newStr);
            Console.ReadLine();
        }

        static public string ConvertVarIntoPhrase(string oldString)
        {
            string newString;
            Regex reg_exp = new Regex("abc");

            //reg_exp.Pattern = "Dim (.*?)_(.*?) As Variant";
            //reg_exp.IgnoreCase = true;
            //reg_exp.Global = true;

            newString = reg_exp.Replace(oldString, "Write #1, \"$1,\" & $1_$2");

            return newString;
        }
    }
}



When I return newString...........   newString equals oldString........   hmmmm........


The replace does not seem to be working.
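
A likely reason, sketched here rather than taken from the thread: the Regex in that test is built from the placeholder pattern "abc", which never occurs in the input, and Regex.Replace hands back the input unchanged when nothing matches. The class and method names below (NoMatchDemo, Convert) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class NoMatchDemo
{
    // Hypothetical helper: apply the thread's replacement string with a given pattern.
    public static string Convert(string pattern, string input)
    {
        Regex reg_exp = new Regex(pattern);
        return reg_exp.Replace(input, "Write #1, \"$1,\" & $1_$2");
    }

    static void Main()
    {
        string oldString = "Dim LenderPhone_Before As Variant";

        // "abc" is not in the input, so there is nothing to replace.
        Console.WriteLine(Regex.IsMatch(oldString, "abc"));
        Console.WriteLine(Convert("abc", oldString) == oldString);

        // The real pattern does match, so the replacement is applied.
        Console.WriteLine(Regex.IsMatch(oldString, "Dim (.*?)_(.*?) As Variant"));
        Console.WriteLine(Convert("Dim (.*?)_(.*?) As Variant", oldString));
    }
}
```

Checking IsMatch first is a quick way to tell "the pattern did not match" apart from "the replacement string is wrong".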
A working example ...

        private void TestRegEx()
        {
            string oldString, newString;
            Regex reg_exp = new Regex("Dim (.*?)_(.*?) As Variant");

            oldString = "Dim LenderPhone_Before As Variant";

            newString = reg_exp.Replace(oldString, "Write #1, \"$1,\" & $1_$2");

            Trace.Warn(oldString);
            Trace.Warn(newString);
        }
ASKER CERTIFIED SOLUTION
amit_g
FOR MY REFERENCE:

static public string ConvertVarIntoPhrase(string oldString)
{
      string newString;
      Regex reg_exp = new Regex("Dim (.*?)_(.*?) As Variant");
      newString = reg_exp.Replace(oldString, "Write #1, \"$1,\" & $1_$2");
      return newString;
}
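
A minimal sketch of how the method above could be run over several declarations at once. It assumes two things that are not in the thread: a second variable, LenderPhone_After, invented for the example, and a space added after the comma inside the quotes so the output matches the line asked for in the question. Replace rewrites every match in the string, which is why no Global setting is needed.

```csharp
using System;
using System.Text.RegularExpressions;

class DimConverter
{
    // Same pattern as in the thread.
    static readonly Regex DimLine = new Regex("Dim (.*?)_(.*?) As Variant");

    public static string ConvertVarIntoPhrase(string oldString)
    {
        // "$1, " (with a space) is an assumption made to match the question's sample output.
        return DimLine.Replace(oldString, "Write #1, \"$1, \" & $1_$2");
    }

    static void Main()
    {
        // Hypothetical input: two declarations in one string.
        string source =
            "Dim LenderPhone_Before As Variant\n" +
            "Dim LenderPhone_After As Variant";

        // Each Dim line is rewritten; the newline between them is left alone
        // because . does not match \n by default.
        Console.WriteLine(ConvertVarIntoPhrase(source));
    }
}
```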
I am trying to dissect the following.

I have a book from O'Reilly on "Mastering Regular Expressions"

(.*?)


(    )   is an alternation sequence    ????


.      means  "any one character"

*     means   "any number, including none, of the item"      item being   "any one character"

?      means  preceding character is allowed to appear at this point in the expression, but whose existence is not required  to still be considered a successful match.


================


So taken as a whole.......this

Regex("Dim (.*?)_(.*?) As Variant");

seems to say

"The string we are creating variables for MUST start with the following four characters:  Dim<space>, then there can exist ANY number of characters or no characters, then a   "_"  ,  then there can exist ANY number of characters or no characters, then the following sequence of characters:  <space>As<space>Variant"

Each  (.*?)      means    one    "variable"     ?????

the first   (.*?)  =  $1


the second   (.*?)  =  $2


?????

You got most of it. Anything inside () can be referenced later in the replacement string using $1, $2 and so on.

.* is any character, any number of times.
.*? is any character, any number of times, but consuming as few characters as possible. So it stops as soon as the next pattern item can match.

Without a ?, .* first runs all the way to the end and only gives characters back when the rest of the pattern needs them. With ?, .*? matches only up to the next item (_).
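
To see what $1 and $2 hold, one option is to look at the match's groups directly instead of doing a replace. This is only an illustrative sketch; the class and method names (GroupsDemo, Capture) are made up.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupsDemo
{
    // Hypothetical helper: return the text captured by group number n.
    public static string Capture(string input, int n)
    {
        Match m = Regex.Match(input, "Dim (.*?)_(.*?) As Variant");
        return m.Groups[n].Value;
    }

    static void Main()
    {
        string input = "Dim LenderPhone_Before As Variant";

        Console.WriteLine(Capture(input, 0)); // group 0 is the whole match
        Console.WriteLine(Capture(input, 1)); // what $1 refers to
        Console.WriteLine(Capture(input, 2)); // what $2 refers to
    }
}
```

For this input the three lines printed should be the full declaration, LenderPhone, and Before.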

>>>>Without a ?, .* first runs all the way to the end and only gives characters back when the rest of the pattern needs them. With ?, .*? matches only up to the next item (_).


can you give me a few more examples?
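.* matches any character any number of times, so it grabs as much as it can. So in

Regex("Dim (.*)_(.*) As Variant");

the first .* would at first have swallowed

LenderPhone_Before As Variant

and then backed off only as far as needed for the rest of the pattern to match, which leaves it at the last _ in the line. We wanted it to stop at the first _. That is accomplished using ?.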
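
One more example, as a sketch: with only one underscore in the name, the greedy and lazy patterns end up capturing the same text, so the difference only shows once there are two. The variable name Lender_Phone_Before and the helper Captures are invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class GreedyVsLazy
{
    // Hypothetical helper: report what $1 and $2 capture for a given pattern.
    public static string Captures(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return "$1 = " + m.Groups[1].Value + ", $2 = " + m.Groups[2].Value;
    }

    static void Main()
    {
        // Hypothetical declaration with two underscores.
        string input = "Dim Lender_Phone_Before As Variant";

        // Greedy: the first group runs up to the LAST underscore.
        Console.WriteLine(Captures("Dim (.*)_(.*) As Variant", input));
        // prints: $1 = Lender_Phone, $2 = Before

        // Lazy: the first group stops at the FIRST underscore.
        Console.WriteLine(Captures("Dim (.*?)_(.*?) As Variant", input));
        // prints: $1 = Lender, $2 = Phone_Before
    }
}
```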
In effect:

?    means   "keep going any number of times until you reach the first character outside the (...)  alternation sequence"

 ???????
( ) has nothing to do with it. () is used to capture the matched text in positional variables ($1, $2...). If we had this

Regex("Dim .*_.* As Variant");

the first .* would still have to stop at a _. In this case it doesn't really matter where, as we are not capturing the matches.

.* is hungry. It eats up all available characters to the end, then gives back only as many as the rest of the pattern needs. ? reverses that behavior and lets it consume as little as possible, so that the remaining pattern ("_.* As Variant" in this case) can be applied as early as possible.
?      tells     .*     "eat as little as you can until you reach the _ "

??
So    the    (    )            is there simply to say    "set up a new variable for each sequence of letters that matches   .*?"


In our case.....two variables got created.
>> ?      tells     .*     "eat as little as you can until you reach the _ "

Yes.

>> So    the    (    )            is there simply to say    "set up a new variable for each sequence of letters that matches   .*?"

Yes.
LOL......Tom finally got it.


RegEx is hard........but seems worth the effort.......
Yes, it is hard, but it can work miracles in string manipulation.
Yes....the O'Reilly book I bought promises me that it is very powerful as far as text manipulation.