Regular Expression Delegate

Praveen Venu, Technical Project Manager
Finally, a really useful addition to the .NET Framework is that the Regex.Replace() method allows the use of a delegate as the "replacement" argument. To understand what I'm talking about, consider the following snippet:

Dim myString As String = Regex.Replace("a true taste of the temperature", "t.*?e\b", "a")


After the replace operation, the value of myString will be "a a a of a a", and it's fairly obvious what happened: every time the regular expression parser found a match within the string, it replaced that match with the letter "a". That's all nice and easy if all you need is a straight replace, but what if you need to apply some sort of business logic to each match, or you need to "touch" the sub-matches in some way and rebuild the replaced string?

A good example is converting all words within a body of text to proper case (i.e. first letter capitalized). To do this, your first instinct might be to create a pattern like so: \b(\w)(\w+)?\b. You could then enumerate the matches, convert the first sub-match to its uppercase version, join the sub-matches and append them to a StringBuilder instance, like so:

Dim mc As MatchCollection = re.Matches(bodyOfText)
For Each m As Match In mc
    ' Upper-case the first letter, then append the rest of the word unchanged
    sb.AppendFormat("{0}{1}", m.Groups(1).Value.ToUpper(), m.Groups(2).Value)
Next


That would work fine if your string contained only word characters, but what if it looked like this: "~~~ This %%% is ### a chunk of text."? After the operation you would end up with the string "ThisIsAChunkOfText", because every character that didn't participate in a match (spaces and punctuation included) is dropped. There are ways around it, mostly by building bigger, more complex patterns and doing more string building inside the match collection iteration.
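To see the dropped characters for yourself, here is a minimal console sketch of the loop above. The module name and the ProperCaseFromMatches helper are hypothetical names chosen for illustration, and the sketch assumes a console application rather than a web page.

```vbnet
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module DroppedCharactersDemo
    ' Hypothetical helper (not from the article): rebuilds the text
    ' from the matches only, exactly as the loop above does.
    Public Function ProperCaseFromMatches(ByVal bodyOfText As String) As String
        Dim re As New Regex("\b(\w)(\w+)?\b")
        Dim sb As New StringBuilder()
        For Each m As Match In re.Matches(bodyOfText)
            ' Only matched words are appended; anything between matches is lost
            sb.AppendFormat("{0}{1}", m.Groups(1).Value.ToUpper(), m.Groups(2).Value)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(ProperCaseFromMatches("~~~ This %%% is ### a chunk of text."))
    End Sub
End Module
```

Running it should print ThisIsAChunkOfText: the words are capitalized, but the tildes, percent signs, hashes, spaces and the final period never reach the StringBuilder.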

A more elegant solution is to wire up a MatchEvaluator delegate. You can think of a MatchEvaluator as an event handler that fires when an "OnMatch event" occurs. You provide the MatchEvaluator with a pointer (reference) to a handler function, and that function is called each time a match is encountered. The function must take a Match as its single argument and must return a String to the regular expression Replace method that invoked it; the returned string is substituted for the match. This approach gives you the flexibility to do all sorts of operations transparently to the Replace method itself, and because everything is handled within the Replace call, the unmatched text is left in place and you don't have to rebuild the string as in the previous example.

A demonstration is in order. Let's rewrite our previous failed attempt at converting a string to proper case, this time using a delegate:

' Requires: Imports System.Text.RegularExpressions
Sub Page_Load(sender As Object, e As EventArgs)
    ' Wrap the handler function in a MatchEvaluator delegate
    Dim myDelegate As New MatchEvaluator(AddressOf MatchHandler)
    Dim bodyOfText As String = _
        "~~~ This %%% is ### a chunk of text."
    Dim pattern As String = "\b(\w)(\w+)?\b"
    Dim re As New Regex( _
        pattern, RegexOptions.Multiline Or _
        RegexOptions.IgnoreCase)
    ' Replace calls MatchHandler once per match and substitutes its return value
    Dim newString As String = re.Replace(bodyOfText, myDelegate)
    Response.Write(bodyOfText & "<hr>" & newString)
End Sub

' Called for each match: upper-case the first letter, keep the rest as-is
Private Function MatchHandler(ByVal m As Match) As String
    Return m.Groups(1).Value.ToUpper() & m.Groups(2).Value
End Function
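If you want to try this outside of a web page, the same idea can be sketched as a console program. This is only an illustration under a few assumptions: the module name and the ToProperCase helper are hypothetical, and the inline lambda requires Visual Basic 2008 or later (on older compilers, keep the AddressOf form shown above). It uses the shared Regex.Replace overload that accepts a MatchEvaluator.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ProperCaseDemo
    ' Hypothetical helper (not from the article): capitalizes the first
    ' letter of every word and leaves all other characters untouched.
    Public Function ToProperCase(ByVal text As String) As String
        ' The lambda is converted to a MatchEvaluator delegate by the compiler
        Return Regex.Replace(text, "\b(\w)(\w+)?\b", _
            Function(m As Match) m.Groups(1).Value.ToUpper() & m.Groups(2).Value)
    End Function

    Sub Main()
        Console.WriteLine(ToProperCase("~~~ This %%% is ### a chunk of text."))
    End Sub
End Module
```

This should print "~~~ This %%% Is ### A Chunk Of Text." (without the quotes). Unlike the StringBuilder loop, the non-word characters survive because Replace only swaps out the matched words and copies everything else through unchanged.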
