  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 719

vb.net regular expression with $ at beginning of string sought

I'm using escaped characters to allow for special characters.

For example, this works:

Regex.IsMatch("d+ollar", "^\bd\+ollar\b") = true
Regex.IsMatch("d$ollar", "^\bd\$ollar\b") = true
Regex.IsMatch("d{ollar", "^\bd\{ollar\b") = true

But when the special character is at the start or end of the string it does not work. For example:

Regex.IsMatch("+dollar", "^\b\+dollar\b") = false
Regex.IsMatch("$dollar", "^\b\$dollar\b") = false
Regex.IsMatch("{dollar", "^\b\{dollar\b") = false

Regex.IsMatch("dollar+", "^\bdollar\+\b") = false
Regex.IsMatch("dollar$", "^\bdollar\$\b") = false
Regex.IsMatch("dollar{", "^\bdollar\{\b") = false

How can I use special characters at the start and end of the string?

Thanks,
Glenn
Asked by: glenn_r
1 Solution
 
Fernando Soto commented:
Hi glenn_r;

Regex.IsMatch("+dollar", "^\b\+dollar\b") = false

\b matches at a word boundary: the position between a word character, [a-zA-Z_0-9], and a non-word character, [^a-zA-Z_0-9], in either order. The start and the end of the string count as the non-word side. In "+dollar", the \b at the start of the pattern sits between the start of the string and the +, and neither of those is a word character, so the position is not a word boundary and the match fails. Removing the \b at the beginning of the regex pattern corrects the issue.

The same holds true for the other two items in that group.

The last three cases fail for the same reason as the first three, but this time it is the final \b that cannot match, so removing the last \b corrects the issue.
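
As a minimal sketch of that fix, assuming a plain console project (the module name BoundaryDemo is only illustrative), each of the six failing calls prints True once the \b next to the special character is dropped:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BoundaryDemo
    Sub Main()
        ' Special character first: drop the leading \b, keep the trailing one.
        Console.WriteLine(Regex.IsMatch("+dollar", "^\+dollar\b"))
        Console.WriteLine(Regex.IsMatch("$dollar", "^\$dollar\b"))
        Console.WriteLine(Regex.IsMatch("{dollar", "^\{dollar\b"))

        ' Special character last: keep the leading \b, drop the trailing one.
        Console.WriteLine(Regex.IsMatch("dollar+", "^\bdollar\+"))
        Console.WriteLine(Regex.IsMatch("dollar$", "^\bdollar\$"))
        Console.WriteLine(Regex.IsMatch("dollar{", "^\bdollar\{"))
    End Sub
End Module
```

Note that these patterns only check the side that still has a \b or ^; they say nothing about what may follow or precede the special character.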

Fernando
 
glenn_r (author) commented:
Fernando

I tried your solution and you are correct. Before I award and accept the solution, I need you to clarify the reasoning behind this logic.

From my understanding, wrapping the search string "dollar" in \b, as in "\bdollar\b", means the following: \b is a special code that means "match the position at the beginning or end of any word", so the expression will only match the complete word "dollar" (in any combination of lower case or capital letters when case is ignored).

I have some instances where I need to find the word dollar with +, $, [, (, etc., AKA special characters. From what I read, to use the literal value you prefix the special character with a backslash. So if I wanted to find the word "+dollar" the regex should be "\b\+dollar\b". Why is this so? Please explain.

Thanks
Glenn


 
glenn_r (author) commented:
Fernando,

I did more testing. The following does not work

Match the word "dollar+"
?Regex.IsMatch("dollar+", "^\bdollar\+") = TRUE
The issue is that it will also return true for
dollar++
dollar+anycharatersaftertheplus

I want to match the exact literal words +dollar, dollar+, $dollar and dollar$, and nothing longer. The user might type in a word containing a special character, so I prefix that character with a backslash to remove its special meaning and use its literal value.

Thanks
Glenn
Fernando Soto commented:
Hi Glenn;

In a regular expression, \b requires the match to occur at a word boundary, that is, at a position between a \w character and a \W character.

\w is any alphanumeric character or the underscore: [a-zA-Z_0-9]
\W is any other character: [^a-zA-Z_0-9]

In your post ID: 24025370 you state, "So if I wanted to find the word "+dollar" the regex should be "\b\+dollar\b". Why is this so? Please explain." In fact that is not the case. For \b to match, there must be a word character (alphanumeric or underscore) on one side of it and a non-word character, or the start or end of the string, on the other side; otherwise it fails. For example, with the input string " +dollar" and the regex pattern "\b\+dollar\b", the first \b has to match between the space character and the + sign. Neither of those is a word character, so the position is not a word boundary and the match fails. If the input string were "X+dollar", where X is any alphanumeric or underscore character, the pattern would match.
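
The two inputs can be compared directly. This is only a sketch, assuming a console project; the module name WordBoundaryDemo and the extra "_+dollar" input are illustrative additions:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WordBoundaryDemo
    ' Pattern quoted in the post: a literal "+dollar" wrapped in \b.
    Const Pattern As String = "\b\+dollar\b"

    Sub Main()
        ' Space then "+": no word character on either side of the first \b.
        Console.WriteLine(Regex.IsMatch(" +dollar", Pattern))   ' False
        ' "X" then "+": word character on the left, so the first \b matches.
        Console.WriteLine(Regex.IsMatch("X+dollar", Pattern))   ' True
        ' The underscore counts as a word character as well.
        Console.WriteLine(Regex.IsMatch("_+dollar", Pattern))   ' True
    End Sub
End Module
```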

In your post ID: 24025404 you state, "Regex.IsMatch("dollar+", "^\bdollar\+") = TRUE The issue is that it will also return true for dollar++". That is correct, and it is exactly what the pattern asks for. Matching starts at the beginning of the string, and the first pattern element after ^ is \b. There is no character to the left of that position, so for \b to match, the next character must be a word character. Here it is the letter d, so \b matches. The pattern then requires the literal characters dollar+, which are present. At that point the pattern is exhausted, so the regex engine has found one complete match and returns true, as it should. Nothing in the pattern says that the input must end after the +.
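
One way to see this is to print the text the pattern actually consumed instead of only the Boolean result. The following is a sketch under the assumption of a console project; MatchedText and the module name are hypothetical helpers, not part of the original code:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PrefixMatchDemo
    ' Pattern from the post: anchored at the start, open at the end.
    Const Pattern As String = "^\bdollar\+"

    ' Hypothetical helper: returns the part of the input that the pattern
    ' consumed, or Nothing when there is no match.
    Function MatchedText(ByVal candidate As String) As String
        Dim m As Match = Regex.Match(candidate, Pattern)
        If m.Success Then Return m.Value
        Return Nothing
    End Function

    Sub Main()
        Dim inputs As String() = New String() {"dollar+", "dollar++", "dollar+anycharatersaftertheplus"}
        For Each candidate As String In inputs
            ' Every input reports the same seven matched characters: dollar+
            Console.WriteLine("{0} -> matched ""{1}""", candidate, MatchedText(candidate))
        Next
    End Sub
End Module
```

In all three cases the match is the leading "dollar+"; whatever follows it is simply never examined.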

Two questions: 1. Will there be other words around "+dollar", "dollar+", "$dollar" or "dollar$", or will it be the only word in the string? 2. What version of Visual Studio are you using? I ask because there may be an easy way to have the regex engine escape the user input string.

Fernando
 
glenn_r (author) commented:
I'm using Visual Studio 2005 with VB.NET.

My application:

I have a list of file names.

list of files
------------------------
myfile.doc
xfile.doc
test1.xls
dollar$.txt
+dollar.xxx
dollar
dollar2

I have a textbox where the user can specify text used to filter the list. For example, if the user wants to see all the files that start with the string dollar, they type in dollar*. The * wildcard means anything may follow at that position, so the filter would return:

dollar$.txt
dollar
dollar2

Note that the user does not type in regex strings. I convert the search string to a regex for matching.

All my logic works as long as the search text contains no regex special characters. Since some of those special characters can be used in file names, I have to allow for them.
 
Fernando Soto commented:
Hi Glenn;

The function Regex.Escape will escape every character in the user input that the Regex engine treats as a meta-character. The following statement escapes the Regex meta-characters in the textbox text and places the result in the variable pattern.

Dim pattern As String = Regex.Escape(TextBox1.Text)

When you build the final Regex pattern from the variable pattern, put ^ in front of it to anchor the start of the string and $ after it to anchor the end. Also, I would not use the \b meta-character in this scenario.

As for your note that some special characters can be used in file names and have to be allowed, Regex.Escape will take care of this.
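
Putting those pieces together for the file list in your post might look like the sketch below. It rests on a few assumptions: the helper names WildcardToPattern and FilterNames are hypothetical, * is the only wildcard, and matching is case-sensitive (pass RegexOptions.IgnoreCase if it should not be). Because Regex.Escape also escapes the * wildcard to \*, the sketch converts that sequence to .* afterwards.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FileFilterDemo
    ' Hypothetical helper: turns the user's filter text into an anchored pattern.
    Function WildcardToPattern(ByVal filterText As String) As String
        ' Escape every regex meta-character; "*" becomes "\*" here as well.
        Dim escaped As String = Regex.Escape(filterText)
        ' Turn the escaped wildcard into "any run of characters",
        ' then anchor both ends so the whole name has to match.
        Return "^" & escaped.Replace("\*", ".*") & "$"
    End Function

    ' Hypothetical helper: returns the names that match the filter text.
    Function FilterNames(ByVal names As IEnumerable(Of String), ByVal filterText As String) As List(Of String)
        Dim rx As New Regex(WildcardToPattern(filterText))
        Dim result As New List(Of String)
        For Each fileName As String In names
            If rx.IsMatch(fileName) Then result.Add(fileName)
        Next
        Return result
    End Function

    Sub Main()
        Dim files As String() = New String() {"myfile.doc", "xfile.doc", "test1.xls", "dollar$.txt", "+dollar.xxx", "dollar", "dollar2"}

        ' dollar*   -> dollar$.txt, dollar, dollar2
        Console.WriteLine(String.Join(", ", FilterNames(files, "dollar*").ToArray()))
        ' +dollar*  -> +dollar.xxx
        Console.WriteLine(String.Join(", ", FilterNames(files, "+dollar*").ToArray()))
        ' dollar    -> dollar (exact name only, because of ^ and $)
        Console.WriteLine(String.Join(", ", FilterNames(files, "dollar").ToArray()))
    End Sub
End Module
```

With ^ and $ in place, a filter without a wildcard matches only the exact name, so "dollar+" no longer matches "dollar++".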

Fernando
 
glenn_r (author) commented:
Thanks for the help.
 
Fernando Soto commented:
Not a problem, glad I was able to help.  ;=)