Regex to Match and Replace smart quotes

Hi folks, I need to replace smart quotes with their HTML equivalents, but I have not been able to even match them in a regex, let alone replace them. So far, with help from FernandoSoto (excellent help!), I was able to put together this bit of code, which deals with standard quotes and other characters. I tried to include smart quotes like this (the hex is really for "angle" quotes; I can't find a hex value specific to "smart" quotes):

 'Get input string from textbox.
        Dim input As String = TextBox1.Text
        'Declare new regex
        Dim re As New Regex("(\x22|'|&|<|>|\x93|\x94)")
        Dim output As String = re.Replace(input, AddressOf ReplaceChars)

        'MessageBox.Show(output)
        TextBox2.Text = output

    End Sub

    Private Function ReplaceChars(ByVal m As Match) As String

        Dim retEntity As String = String.Empty

        Select Case (m.Groups(1).Value)
            Case """"
                retEntity = "&quot;"
            Case "'"
                retEntity = "&apos;"
            Case "&"
                retEntity = "&amp;"
            Case "<"
                retEntity = "&lt;"
            Case ">"
                retEntity = "&gt;"
            Case ""   ' <-- I couldn't get my designer (VS 2005) to accept the right-hand angle quote here - I'm not sure why.
                retEntity = "&#8221;"
            Case ""
                retEntity = "&#8220;"

        End Select

        Return retEntity

    End Function

Interestingly, this won't match angle or smart quotes. I tried several ways, but can't seem to get a match for them. VS 2005 also seems to recognize only one side (the left angle quote) and not the right side, and neither one matches in the regex anyway.

Any ideas?

Thanks!
 
ozoCommented:
can you use \x3e ?
0
 
Fernando SotoCommented:
Hi Gary;

I think the problem you are having with the &#8220; character and the &#8221; character is that they are not on the same character code page. The following is what I needed to do to make them work:

    Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click

        ' Chr(147) = left double quote and Chr(148) = right double quote (in the Windows-1252 code page)
        Dim input As String = "I'm ""tring to configure"" <a regex> that will " & Chr(147) & "match several different items" & Chr(148) & " in a string, & then replace the matches with corrosponding strings."

        ' The unicode value \u201C is the &ldquo; and the \u201D is &rdquo;
        Dim re As New Regex("(\x22|'|&|<|>|\u201C|\u201D)")
        Dim output As String = re.Replace(input, AddressOf ReplaceChars)
        MessageBox.Show(output)

    End Sub

    Private Function ReplaceChars(ByVal m As Match) As String

        Dim retEntity As String = String.Empty

        ' Needed to change this to get the character numeric value
        Select Case Asc((m.Groups(1).Value))
            Case 34
                retEntity = "&quot;"
            Case 39
                retEntity = "&apos;"
            Case 38
                retEntity = "&amp;"
            Case 60
                retEntity = "&lt;"
            Case 62
                retEntity = "&gt;"
            Case 147
                retEntity = "&#8220;"   '&ldquo;
            Case 148
                retEntity = "&#8221;"   '&rdquo;
        End Select

        Return retEntity

    End Function

Fernando
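
A note on why the \x93 and \x94 escapes in the first pattern never matched: in a .NET regex, \x93 names the character whose code is hex 93, while curly quotes pasted from Word arrive in a .NET string as U+201C and U+201D. The Visual Basic language also accepts the curly double quotes as string delimiters, which may be why VS 2005 would not keep them inside a Case literal. The following is only a minimal console sketch of the difference, not code from this thread; the module name is made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SmartQuoteEscapes
    Sub Main()
        ' Build the left curly quote from its Unicode code point,
        ' so no curly quote has to be typed into the source file.
        Dim leftQuote As String = ChrW(&H201C)

        ' \x93 is the character with code hex 93, not the curly quote.
        Console.WriteLine(Regex.IsMatch(leftQuote, "\x93"))     ' False

        ' \u201C is the left double quotation mark itself.
        Console.WriteLine(Regex.IsMatch(leftQuote, "\u201C"))   ' True

        ' AscW returns the Unicode code point, independent of the code page.
        Console.WriteLine(AscW(leftQuote(0)))                   ' 8220
    End Sub
End Module
```

Asc, by contrast, goes through the system's ANSI code page, so a value such as 147 is only meaningful on machines where that code page maps it to the curly quote.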
0
 
garyLynn7Author Commented:
Hi Fernando, I made these changes and ran a block of text which I pasted over from a Word doc (2003).

This text:

 See that? Nothin to it, she said. But stuff a whole mess of these in a sack and whatcha got? A pillow fit for angels! Just like yall-- 

Returns this conversion:

 See that? Nothin to it,&#8221; she said. But stuff a whole mess of these in a sack and whatcha got? A pillow fit for angels! Just like yall--&#8221;

So it is missing the apostrophe and the left-hand quote.

I see you included the numeric value in the "input string", which is unlikely in the environment I'm working in. It will likely be a copy and paste, or a file import to a custom editor for conversion, so I expect to see the actual characters in the input string.

And the case selector using the numeric value is truly a cool idea!  I was messing around with using the hex value - but not much success!

Thanks for all your effort!


0
 
garyLynn7Author Commented:
My apologies,  somehow the text in the previous post included a "|" where an apostrophe should have been.
Please substitute with the following:

“See that? Nothin’ to it,” she said. “But stuff a whole mess of these in a sack and whatcha got? A pillow fit for angels! Just like y’all--”

“See that? Nothin’ to it,&#8221; she said. “But stuff a whole mess of these in a sack and whatcha got? A pillow fit for angels! Just like y’all--&#8221;

Thanks!
0
 
garyLynn7Author Commented:
Hi ozo,

 &#x3e;  appears to represent ">" in unicode.

So far, I've been working with these variations:

         º   &#x2ba;
         î   &#x2ee;
        Ý    &#x2dd;
         õ    &#x2f5;
         ö    &#x2f6;

The problem is in matching the left and right versions, or sides; and replacing them as indicated in the first post.

Thanks for having a look!
0
 
garyLynn7Author Commented:
Apparently I entered that last set incorrectly!! :>)

Allow me to correct this mistake: the characters shown are how this particular viewer renders them. That is, I cut and pasted the examples from a web page into this site's editor, and they are being rendered as you see;
however, the actual hex above is correct for these symbols:

straight quote: #x22;
left curly(smart) quote: #x93;
right curly(smart) quote : #x94;

These are the basic quotation codes I'm trying to use in my app: (from: http://www.robinlionheart.com/stds/html4/entities.xhtml)

Name                                        Decimal   Hex       Named
left single quotation mark                  #8216     #x2018    lsquo
right single quotation mark                 #8217     #x2019    rsquo
single low-9 quotation mark                 #8218     #x201a    sbquo
left double quotation mark                  #8220     #x201c    ldquo
right double quotation mark                 #8221     #x201d    rdquo
double low-9 quotation mark                 #8222     #x201e    bdquo

and will use these also:

Name                                        Decimal   Hex       Named
prime                                       #8242     #x2032    prime
double prime                                #8243     #x2033    Prime
single left-pointing angle quotation mark   #8249     #x2039    lsaquo
single right-pointing angle quotation mark  #8250     #x203a    rsaquo

 
So there are a bunch of possibilities- but it's getting the regex correct that is the challenge!

Thanks!



0
 
Fernando SotoCommented:
Hi Gary;

I think something is happening when you copy and paste the info to this site. Can you create a text file with the information and upload it to the EE-Stuff web site? The address is http://www.ee-stuff.com/login.php and the user name and password are the same ones you use on this EE web site. Once logged into the system, click on "Expert Area", then click on "Upload a new file" and follow the instructions for uploading a file. I think this will be better.

Fernando
0
 
garyLynn7Author Commented:
Thanks Fernando, I will!  I noticed the issue after it was too late - it appears that the actual symbols are being displayed as something else!

At any rate, the left hand dblquote isn't being picked up in the last expression we were working on.

I'll post the text file as you indicated!

Thanks!

Gary


0
 
Fernando SotoCommented:
Hi Gary;

The last thing we were talking about was

  - This is the character for left double quotation mark      -  ldquo
  - This is the character for right double quotation mark    -  rdquo

In the file that you will be posting, please include the characters and their descriptions so I know which character is which.

Fernando

0
 
Fernando SotoCommented:
As you can see, my characters were not translated correctly in the last post. :=(
0
 
garyLynn7Author Commented:
Hi Fernando, for whatever reason, I cannot log in to the file upload section with my current info.  I even requested a new password, but that didn't work either.  Not sure what to do at this point.

I'll try again....
0
 
garyLynn7Author Commented:
OK, got the file uploaded to the question ID for this thread.  Hope it's helpful.

Please let me know if you need further info, but the submitted file is for the most part what I am trying to convert.

Thanks!
0
 
Fernando SotoCommented:
That is strange, I just tested it and it is working. Do you have a web site where you can place the file so that any of the experts here can download it? If not, let me know.
0
 
Fernando SotoCommented:
I got it. ;=)
0
 
garyLynn7Author Commented:
YAY!
0
 
Fernando SotoCommented:
Hi Gary;

Here is the modified code which converts all the characters in the file you posted. I will post the results of the code up to EE- Stuff so you can download and compare.

    Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click

        ' Stream reader is reading the file you posted, file is in same directory as exe
        Dim sr As New StreamReader("Entitylist-UTF-8.txt")
        ' Read all info from file into variable
        Dim input As String = sr.ReadToEnd()
        sr.Close()
        ' New Regex pattern for all characters
        Dim re As New Regex("(\x22|'|&|<|>|\u2013|\u2014|\u2018|\u2019|\u201A|\u201C|\u201D|\u201E)")
        Dim sw As New StreamWriter("Entitylist-UTF-8-Converted.txt")
        Dim output As String = re.Replace(input, AddressOf ReplaceChars)
        sw.Write(output)
        sw.Flush()
        sw.Close()

    End Sub

    Private Function ReplaceChars(ByVal m As Match) As String

        Dim retEntity As String = String.Empty

        ' Convert single character string to char array
        Dim chr() As Char = m.Groups(1).Value.ToCharArray()
        ' Get the integer value of the unicode character
        Dim charValue As Integer = CInt(AscW(chr(0)))

        Select Case charValue
            Case 34
                retEntity = "&quot;"
            Case 39
                retEntity = "&apos;"
            Case 38
                retEntity = "&amp;"
            Case 60
                retEntity = "&lt;"
            Case 62
                retEntity = "&gt;"
            Case 8211
                retEntity = "&#8211;"   'ndash
            Case 8212
                retEntity = "&#8212;"   'mdash
            Case 8216
                retEntity = "&#8216;"   'lsquo
            Case 8217
                retEntity = "&#8217;"   'rsquo
            Case 8218
                retEntity = "&#8218;"   'sbquo
            Case 8220
                retEntity = "&#8220;"   '&ldquo;
            Case 8221
                retEntity = "&#8221;"   '&rdquo;
            Case 8222
                retEntity = "&#8222;"   'bdquo
        End Select

        Return retEntity

    End Function

Fernando
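
The Select Case above pairs each code point with its entity by hand. As an alternative, the same pairing can be kept in a lookup table, with a character class in place of the long alternation. This is a minimal sketch under that assumption, not the code used in this thread; the names EntityTableSketch, BuildTable, ToEntities and Lookup are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module EntityTableSketch
    ' Hypothetical table: Unicode code point -> HTML entity.
    Private Function BuildTable() As Dictionary(Of Integer, String)
        Dim t As New Dictionary(Of Integer, String)
        t.Add(34, "&quot;")
        t.Add(39, "&apos;")
        t.Add(38, "&amp;")
        t.Add(60, "&lt;")
        t.Add(62, "&gt;")
        t.Add(8211, "&#8211;")   ' ndash
        t.Add(8212, "&#8212;")   ' mdash
        t.Add(8216, "&#8216;")   ' lsquo
        t.Add(8217, "&#8217;")   ' rsquo
        t.Add(8218, "&#8218;")   ' sbquo
        t.Add(8220, "&#8220;")   ' ldquo
        t.Add(8221, "&#8221;")   ' rdquo
        t.Add(8222, "&#8222;")   ' bdquo
        Return t
    End Function

    Private ReadOnly Table As Dictionary(Of Integer, String) = BuildTable()

    ' Every character listed in the class has an entry in the table.
    Function ToEntities(ByVal input As String) As String
        Dim re As New Regex("[\x22'&<>\u2013\u2014\u2018\u2019\u201A\u201C\u201D\u201E]")
        Return re.Replace(input, AddressOf Lookup)
    End Function

    ' Each match is exactly one character; look its code point up.
    Private Function Lookup(ByVal m As Match) As String
        Return Table(AscW(m.Value(0)))
    End Function

    Sub Main()
        Dim sample As String = ChrW(&H201C) & "Hi" & ChrW(&H201D) & " & <b>"
        Console.WriteLine(ToEntities(sample))   ' &#8220;Hi&#8221; &amp; &lt;b&gt;
    End Sub
End Module
```

With this layout, adding a character means adding one table entry and one item in the character class, so the two lists still have to be kept in step.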
0
 
Fernando SotoCommented:
Here is the direct link to the results of the conversion compare with original to see that all characters were converted.

    https://filedb.experts-exchange.com/incoming/ee-stuff/4306-Entitylist-UTF-8-Converted.txt
0
 
garyLynn7Author Commented:
Thanks Fernando!  I'll do some work and let you know what happens!

Gary
0
 
garyLynn7Author Commented:
Hi Fernando, again many thanks!  This works great!

It really appears the key here is in the replace function and the use of numeric character codes with their matches as return entities.

Including the match-group in a dimension for the select was also a nice thing to see; it would have taken me a good deal of time to come up with the logic, as I'm still "green" with a lot of structure and code.

And the correlation between the regex match entities and the case select entities makes sense, and has an elegant look as well; it also explains something about why only one side of a quote was showing up at first.

Great stuff! Thanks!!

Gary

P.S.: I spent at least 30 hours researching this stuff on-line, and I need to say that given the breadth of this question, not much was out there that would lead one to a solution such as you've provided.
It will certainly add to the edification pool!
0
 
Fernando SotoCommented:
Well I am glad that I was able to help and thank you for the kind words. ;=)
0
 
garyLynn7Author Commented:
Hi Fernando, just thought you may be interested in the code for this after refactoring.

Imports System.Text.RegularExpressions

Public Class Form1

#Region " Regex Patterns "
    Private Sub btnConvert_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnConvert.Click
        'Get input string from textbox.
        Dim input As String = tbInput.Text
        'Declare new regex, unicode values for match.
        Dim re As New Regex("(\x22|'|&|<|>|\u2013|\u2014|\u2018|\u2019|\u201A|\u201C|\u201D|\u201E|\x40)")
        Dim output As String = re.Replace(input, AddressOf ReplaceChars)

        tbOutPut.Text = output

    End Sub

    Private Function ReplaceChars(ByVal m As Match) As String
        Dim result As String = String.Empty
        'Convert single character string to char array.
        Dim chr As Char() = m.Groups(1).Value.ToCharArray()
        'Get the integer value of the unicode character.
        Dim charValue As Integer = CInt(AscW(chr(0)))
        'Select from hex values a replacement value.
        Select Case charValue
            Case 34
                result = "&quot;"
            Case 39
                result = "&apos;"
            Case 38
                result = "&amp;"
            Case 60
                result = "&lt;"
            Case 62
                result = "&gt;"
            Case 64
                result = "&#64;" '@
            Case 8211
                result = "&#8211;" 'ndash
            Case 8212
                result = "&#8212;" 'mdash
            Case 8216
                result = "&#8216;" 'lsquo
            Case 8217
                result = "&#8217;" 'rsquo
            Case 8218
                result = "&#8218;" 'sbquo
            Case 8220
                result = "&#8220;" '&ldquo;
            Case 8221
                result = "&#8221;" '&rdquo;
            Case 8222
                result = "&#8222;" 'bdquo
        End Select
        Return result
    End Function
#End Region

End Class
0
 
Fernando SotoCommented:
Excellent looks great. Glad it worked out for you.
0
 
garyLynn7Author Commented:
Hi Fernando, and anyone reading;

now that the select case is working great, how could I incorporate a checkListBox to assign the individual select-case items to the current function?

Any ideas?

Thanks!

Gary
0
 
Fernando SotoCommented:
Hi Gary;

Not sure what you want to assign to the CheckedListBox? Can you explain a little more?

Fernando
0
 
garyLynn7Author Commented:
Sure, sorry for not being clear enough.

We now have a list of case items which tells us to use a "result = xn".
So how could one assign the case to a check box; for example:

(this could be individual check boxes, or an array of check boxes - check list box may not be exactly appropriate)

CheckBox 1 = Case 34
                result = "&quot;"

CheckBox 2 = Case 39
                result = "&apos;"

and so on through the Select Case list.  The idea is to give the user the opportunity to choose one or more of the Select Case items to use, as opposed to having to use all of them in the function together.

In other words, one may want to only match dblQuotes, and nothing else, or perhaps any other combination from the list.

Make sense?

Thanks!
Gary
0
 
Fernando SotoCommented:
Hi Gary;

Since the Regex object does all the work here, you will need to build the Regex pattern dynamically so that it only replaces what the user wants to match. Doing this in the ReplaceChars function will not make the process any faster, and the code needed to do that would be a little more complex, at the expense of more processor time.

This is how I would do it. I created a Panel control and placed all the check box controls in it. I set the Text property to the name of the entity and set its Tag to the value of the entity.

This next code is here to show how I created the panel and check boxes on the form at design time.

    Private Sub InitializeComponent()
        Me.Panel1 = New System.Windows.Forms.Panel
        Me.CheckBox1 = New System.Windows.Forms.CheckBox
        Me.CheckBox2 = New System.Windows.Forms.CheckBox
        Me.CheckBox3 = New System.Windows.Forms.CheckBox
        Me.CheckBox4 = New System.Windows.Forms.CheckBox

        ...

        '
        'Panel1
        '
        Me.Panel1.Controls.Add(Me.CheckBox4)
        Me.Panel1.Controls.Add(Me.CheckBox1)
        Me.Panel1.Controls.Add(Me.CheckBox2)
        Me.Panel1.Controls.Add(Me.CheckBox3)
        Me.Panel1.Location = New System.Drawing.Point(32, 31)
        Me.Panel1.Name = "Panel1"
        Me.Panel1.Size = New System.Drawing.Size(200, 100)
        Me.Panel1.TabIndex = 7
        '
        'CheckBox1
        '
        Me.CheckBox1.AutoSize = True
        Me.CheckBox1.Location = New System.Drawing.Point(15, 19)
        Me.CheckBox1.Name = "CheckBox1"
        Me.CheckBox1.Size = New System.Drawing.Size(47, 17)
        Me.CheckBox1.TabIndex = 8
        Me.CheckBox1.Tag = "\x22|"
        Me.CheckBox1.Text = "quot"
        Me.CheckBox1.UseVisualStyleBackColor = True
        '
        'CheckBox2
        '
        Me.CheckBox2.AutoSize = True
        Me.CheckBox2.Location = New System.Drawing.Point(104, 19)
        Me.CheckBox2.Name = "CheckBox2"
        Me.CheckBox2.Size = New System.Drawing.Size(55, 17)
        Me.CheckBox2.TabIndex = 9
        Me.CheckBox2.Tag = "\u2013|"
        Me.CheckBox2.Text = "ndash"
        Me.CheckBox2.UseVisualStyleBackColor = True
        '
        'CheckBox3
        '
        Me.CheckBox3.AutoSize = True
        Me.CheckBox3.Location = New System.Drawing.Point(15, 70)
        Me.CheckBox3.Name = "CheckBox3"
        Me.CheckBox3.Size = New System.Drawing.Size(49, 17)
        Me.CheckBox3.TabIndex = 10
        Me.CheckBox3.Tag = "'|"
        Me.CheckBox3.Text = "apos"
        Me.CheckBox3.UseVisualStyleBackColor = True
        '
        'CheckBox4
        '
        Me.CheckBox4.AutoSize = True
        Me.CheckBox4.Location = New System.Drawing.Point(104, 70)
        Me.CheckBox4.Name = "CheckBox4"
        Me.CheckBox4.Size = New System.Drawing.Size(57, 17)
        Me.CheckBox4.TabIndex = 11
        Me.CheckBox4.Tag = "\u2014|"
        Me.CheckBox4.Text = "mdash"
        Me.CheckBox4.UseVisualStyleBackColor = True

Note the Tag properties of the check boxes above.
--------------------------------------------------------------------

The following code is what needs to be modified. Note no changes to the function ReplaceChars.

    Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click

        Dim sr As New StreamReader("Entitylist-UTF-8.txt")
        Dim input As String = sr.ReadToEnd()
        sr.Close()
        ' Build Regex pattern depending on which check boxes are checked
        Dim pattern As String = BuildPattern()
        ' Check to see if we have a string pattern if not no check boxes were checked.
        If Not pattern = String.Empty Then
            ' We have a pattern so do the matching
            Dim re As New Regex(pattern)
            Dim sw As New StreamWriter("Entitylist-UTF-8-Converted.txt")
            Dim output As String = re.Replace(input, AddressOf ReplaceChars)
            sw.Write(output)
            sw.Flush()
            sw.Close()
        End If

    End Sub

    ' New function to build a Regex pattern
    Private Function BuildPattern() As String

        Dim retPattern As String = String.Empty
        For Each ctl As Control In Panel1.Controls
            Dim cb As CheckBox
            If ctl.GetType.Name = "CheckBox" Then
                cb = CType(ctl, CheckBox)
                If cb.Checked Then
                    retPattern &= cb.Tag.ToString()
                End If
            End If
        Next
        If retPattern.Length > 0 Then
            retPattern = "(" & retPattern.Substring(0, retPattern.Length - 1) & ")"
        End If

        Return retPattern

    End Function

Fernando
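
BuildPattern above stores a trailing "|" in every Tag and trims the last one off. A variation, sketched here without any form controls, is to collect the selected fragments without the "|" and let String.Join place the separators. The module name and the List(Of String) parameter are assumptions for illustration; in the form, the list would be filled from the checked boxes.

```vbnet
Imports System
Imports System.Collections.Generic

Module PatternBuilderSketch
    ' fragments: regex alternatives such as "\x22" or "\u2014", without a trailing "|".
    Function BuildPattern(ByVal fragments As List(Of String)) As String
        ' Nothing selected: return an empty string so the caller can skip the replace.
        If fragments.Count = 0 Then
            Return String.Empty
        End If
        ' Join puts "|" between the items only, so nothing has to be trimmed.
        Return "(" & String.Join("|", fragments.ToArray()) & ")"
    End Function

    Sub Main()
        Dim selected As New List(Of String)
        selected.Add("\x22")
        selected.Add("\u2014")
        Console.WriteLine(BuildPattern(selected))                      ' (\x22|\u2014)
        Console.WriteLine(BuildPattern(New List(Of String)).Length)    ' 0
    End Sub
End Module
```

The caller still has to check for the empty result before constructing the Regex.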
0
 
garyLynn7Author Commented:
Hi Fernando, I'm still working on this - I was delayed by another project issue for one of my clients.  
I'll let you know soon how this went.  My apologies for keeping you waiting!

Gary
0
 
garyLynn7Author Commented:
Hi Fernando,

Well, after some work I've ended up with an error in the ReplaceChars function, "Index out of range", at this statement:

        'Get the integer value of the unicode character.
        Dim charValue As Integer = CInt(AscW(chr(0)))

I don't know what this issue could be, but I see you are also reading a file as the input source, and writing back with the stream reader to file.

I'm using 2 rich textboxes where the user types or pastes text, the converter processes it, and writes the result back to the 2nd RTB.

Here's how the code looks after I've attempted to adopt it:

   Private Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
        'Get input string from textbox.
        Dim input As String = rtbInput.Text
        'Declare new regex, unicode values for match.
        Dim pattern As String = BuildPattern()
        Dim re As New Regex(pattern)
        Dim output As String = re.Replace(input, AddressOf ReplaceChars)

        rtbOutPut.Text = output
    End Sub

    Private Function ReplaceChars(ByVal m As Match) As String
        Dim result As String = String.Empty
        'Convert single character string to char array.
        Dim chr As Char() = m.Groups(1).Value.ToCharArray()
        'Get the integer value of the unicode character.
        Dim charValue As Integer = CInt(AscW(chr(0)))
        'Select from hex values a replacement value.
        Select Case charValue
            Case 34
                result = "&quot;"
            Case 39
                result = "&apos;"
            Case 38
                result = "&amp;"
            Case 60
                result = "&lt;"
            Case 62
                result = "&gt;"
            Case 64
                result = "&#64;" '@
            Case 8211
                result = "&#8211;" 'ndash
            Case 8212
                result = "&#8212;" 'mdash
            Case 8216
                result = "&#8216;" 'lsquo
            Case 8217
                result = "&#8217;" 'rsquo
            Case 8218
                result = "&#8218;" 'sbquo
            Case 8220
                result = "&#8220;" '&ldquo;
            Case 8221
                result = "&#8221;" '&rdquo;
            Case 8222
                result = "&#8222;" 'bdquo
        End Select
        Return result
    End Function
#End Region

    ' New function to build a Regex pattern
    Private Function BuildPattern() As String

        Dim retPattern As String = String.Empty
        For Each ctl As Control In SplitContainerControl1.Panel2.Controls
            Dim cb As DevExpress.XtraEditors.CheckEdit
            If ctl.GetType.Name = "CheckBox" Then
                cb = CType(ctl, DevExpress.XtraEditors.CheckEdit)
                If cb.Checked Then
                    retPattern &= cb.Tag.ToString()
                End If
            End If
        Next
        If retPattern.Length > 0 Then
            retPattern = "(" & retPattern.Substring(0, retPattern.Length - 1) & ")"
        End If

        Return retPattern

    End Function

I didn't have any problem setting up the check boxes and tags, and otherwise I think all else is as expected for the most part.

What do you think?

need more info?
0
 
Fernando SotoCommented:
Hi Gary;

Well, since the ReplaceChars function was working before the modification, I do not think the problem is there; more likely it is in setting up the regex pattern. Can you copy and paste the section of the form designer code that shows all the properties of all the DevExpress.XtraEditors.CheckEdit controls, as I did in my most recent code?

Fernando
0
 
garyLynn7Author Commented:
Certainly - you probably know they are extended check box controls -- the Tag property is there as well as the Text property, and there are extra properties available as well. I've already tried with MS controls with the same result.
Also note that I'm using a control panel, but as a custom split control panel; I'll include it as well.

Since this is taking on some complexity, let me say something about the application so you understand it a little better.

I'm trying to set this up as a user control to display on a main form, and use a left nav bar to link it to the form.
That's going well and I have a couple other user controls doing similar tasks already.
Anyway, it has all been working fine other than the check boxes, and I'll research the error to see if I can understand the problem as well.
Here's the designer code:

        Me.CheckEdit4.Location = New System.Drawing.Point(6, 180)
        Me.CheckEdit4.Name = "CheckEdit4"
        Me.CheckEdit4.Properties.Caption = "mdash"
        Me.CheckEdit4.Size = New System.Drawing.Size(75, 19)
        Me.CheckEdit4.TabIndex = 3
        Me.CheckEdit4.Tag = "\u2014|"
        '
        'CheckEdit3
        '
        Me.CheckEdit3.Location = New System.Drawing.Point(6, 131)
        Me.CheckEdit3.Name = "CheckEdit3"
        Me.CheckEdit3.Properties.Caption = "apos"
        Me.CheckEdit3.Size = New System.Drawing.Size(75, 19)
        Me.CheckEdit3.TabIndex = 2
        Me.CheckEdit3.Tag = "'|"
        '
        'CheckEdit2
        '
        Me.CheckEdit2.Location = New System.Drawing.Point(6, 82)
        Me.CheckEdit2.Name = "CheckEdit2"
        Me.CheckEdit2.Properties.Caption = "ndash"
        Me.CheckEdit2.Size = New System.Drawing.Size(75, 19)
        Me.CheckEdit2.TabIndex = 1
        Me.CheckEdit2.Tag = "\u2013|"
        '
        'CheckEdit1
        '
        Me.CheckEdit1.Location = New System.Drawing.Point(6, 33)
        Me.CheckEdit1.Name = "CheckEdit1"
        Me.CheckEdit1.Properties.Caption = "quot"
        Me.CheckEdit1.Size = New System.Drawing.Size(75, 19)
        Me.CheckEdit1.TabIndex = 0
        Me.CheckEdit1.Tag = "\x22|"

And the panel control designer code:

 Me.SplitContainerControl1.Location = New System.Drawing.Point(0, 0)
        Me.SplitContainerControl1.Name = "SplitContainerControl1"
        Me.SplitContainerControl1.Panel1.Controls.Add(Me.Button1)
        Me.SplitContainerControl1.Panel1.Controls.Add(Me.rtbOutput)
        Me.SplitContainerControl1.Panel1.Controls.Add(Me.rtbInput)
        Me.SplitContainerControl1.Panel1.Text = "Panel1"
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit11)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit10)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit9)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit8)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit7)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit6)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit5)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit4)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit3)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit2)
        Me.SplitContainerControl1.Panel2.Controls.Add(Me.CheckEdit1)
        Me.SplitContainerControl1.Panel2.Text = "Panel2"
        Me.SplitContainerControl1.Size = New System.Drawing.Size(556, 650)
        Me.SplitContainerControl1.SplitterPosition = 278
        Me.SplitContainerControl1.TabIndex = 0
        Me.SplitContainerControl1.Text = "SplitContainerControl1"

Gary
0
 
Fernando SotoCommented:
Hi Gary;

This may have been my fault. In my post I only provided these 4 values to be placed into the Tag property of the check boxes. I did not indicate in my post that I left the others for you to do, sorry was in a rush to leave work and get home.

These four are already in the Tag property of the check box in your project.

Me.CheckEdit4.Tag = "\u2014|"
Me.CheckEdit3.Tag = "'|"
Me.CheckEdit2.Tag = "\u2013|"
Me.CheckEdit1.Tag = "\x22|"

These need to be added. Note that I have a ? in the control name change it to the correct character for the control it goes to.

Me.CheckEdit?.Tag = "&|"              '"&amp;"
Me.CheckEdit?.Tag = "<|"              '"&lt;"
Me.CheckEdit?.Tag = ">|"              '"&gt;"
Me.CheckEdit?.Tag = "\u2018|"     '"lsquo"
Me.CheckEdit?.Tag = "\u2019|"     '"rsquo"
Me.CheckEdit?.Tag = "\u201A|"     '"sbquo"
Me.CheckEdit?.Tag = "\u201C|"     '"ldquo"
Me.CheckEdit?.Tag = "\u201D|"     '"rdquo"
Me.CheckEdit?.Tag = "\u201E|"     '"bdquo"

Each of these Tag properties is used to build the Regex pattern, so they all should have a value.

Fernando
0
 
garyLynn7Author Commented:
Hi Fernando,

So what things look like now is that one can select any number of the check boxes as  needed for matching, and as long as the case select for matching has the correct data, we can include more selections by adding more check box controls, and then setting the tag correctly.

Also, I changed the name of the control from "CheckBox" to "CheckEdit" in the pattern function, which I suspect would make a difference considering the condition statement it's in.

Now what I have found is if all checkboxes are unchecked, then it returns the error: "Index was outside the bounds of the array".

Otherwise it's doing a great job!

Gary

0
 
Fernando SotoCommented:
Hi Gary;

To your statement, "So what things look like now is that one can select any number of the check boxes as  needed for matching, and as long as the case select for matching has the correct data, we can include more selections by adding more check box controls, and then setting the tag correctly."

Yes, if you add more check boxes to match more entities, you need to set the Tag property of the check box to the correct Unicode character and you must add a new Case statement to catch that new entity in the ReplaceChars function.

For example, let's say I was adding the following entity to the application (this entity has already been added to your application; this is just an example):

Name                          Decimal    Hex         Named
right double quotation mark   #8221     #x201d     rdquo 

The Tag property would be equal to the Hex value formatted like "\u201D|", without the double quote marks, and the following statements need to be added to the ReplaceChars function.

    Case 8221
        result = "&#8221;" '&rdquo;

Note that the Case has the decimal value of the entity.

To your next statement, "Also, I changed the name of the control from "CheckBox" to "CheckEdit" in the pattern function, which I suspect would make a difference considering the condition statement it's in."

If you are talking about this line of code, If ctl.GetType.Name = "CheckBox" Then, "CheckBox" must be the type of control. In your case it is most likely "CheckEdit" seeming you are using a third party control.
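
To illustrate why the name comparison matters: a type name held in a string compiles whatever it says and simply never matches when it is wrong, while TypeOf ... Is is checked by the compiler and also accepts derived types. The two classes below are stand-ins invented for this sketch, not the real CheckBox or CheckEdit; whether TypeOf suits the third-party control depends on its class hierarchy, which is not shown in this thread.

```vbnet
Imports System

Module TypeCheckSketch
    ' Stand-in for a standard check box control.
    Class PlainCheckBox
        Public Checked As Boolean
    End Class

    ' Stand-in for an extended control derived from it.
    Class FancyCheckBox
        Inherits PlainCheckBox
    End Class

    Sub Main()
        Dim ctl As Object = New FancyCheckBox()

        ' The runtime type name is "FancyCheckBox", so this comparison is False
        ' and the control would be skipped without any error.
        Console.WriteLine(ctl.GetType().Name = "PlainCheckBox")   ' False

        ' TypeOf checks the actual type, including base classes.
        Console.WriteLine(TypeOf ctl Is PlainCheckBox)            ' True
    End Sub
End Module
```

When every control is skipped this way, the pattern comes back empty, which matters for the next point.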

To your next statement, "Now what I have found is if all checkboxes are unchecked, then it returns the error: "Index was outside the bounds of the array"."

To correct that problem change the code here from this:

   Private Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
        'Get input string from textbox.
        Dim input As String = rtbInput.Text
        'Declare new regex, unicode values for match.
        Dim pattern As String = BuildPattern()
        Dim re As New Regex(pattern)
        Dim output As String = re.Replace(input, AddressOf ReplaceChars)

        rtbOutPut.Text = output
    End Sub

To this:

   Private Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
        'Get input string from textbox.
        Dim input As String = rtbInput.Text
        'Declare new regex, unicode values for match.
        Dim pattern As String = BuildPattern()
        ' If the pattern is empty do not check it at all
        If Not pattern = String.Empty Then
            Dim re As New Regex(pattern)
            Dim output As String = re.Replace(input, AddressOf ReplaceChars)

            rtbOutPut.Text = output
        End If
    End Sub

Fernando
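
For anyone wondering where "Index was outside the bounds of the array" comes from: an empty pattern is still a valid regex and matches the empty string, and since it has no group 1, m.Groups(1).Value is an empty string, so chr(0) in ReplaceChars has nothing to index. A minimal console sketch of that behaviour (the module name is made up for illustration):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EmptyPatternSketch
    Sub Main()
        ' An empty pattern is valid and matches the empty string.
        Dim m As Match = Regex.Match("abc", "")
        Console.WriteLine(m.Success)                                ' True

        ' Group 1 does not exist in this pattern, so its value is empty.
        Console.WriteLine(m.Groups(1).Value.Length)                 ' 0

        ' ToCharArray() on an empty string gives a zero-length array,
        ' so reading element 0 of it would be out of range.
        Console.WriteLine(m.Groups(1).Value.ToCharArray().Length)   ' 0
    End Sub
End Module
```

The guard on the empty pattern avoids calling ReplaceChars in that situation at all.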
0
 
garyLynn7Author Commented:
Fernando, you are a genius! Thank you so much for your help!  

Interestingly, I was on the way to the same fix you posted, but hadn't quite figured out how to halt the pattern match.
It really is a great bit of work, and as I said a while back, I spent hours researching how to do a match and replace such as this, and realizing there is very little in the way of examples, actually.

My wife thanks you too, since this was her request for such a tool!

Award yourself a thousand bonus points, I say!

Take care,

Gary
0
 
Fernando SotoCommented:
Well, you're very welcome, and have a great weekend, what is left of it. ;=)
0
