Gary Samuels

Counting Characters less whitespace, with Regular Expression

VB.NET 2003

I'm trying to count the words and characters in a text box. I picked up the CountWords function from a post by TheLearnedOne, and it works great. I tried to build on it with a function that counts all characters except whitespace, but the function I wrote, CountCharacters, is not working. Any help counting the characters without counting spaces would be appreciated.

    Public Shared Function CountWords(ByVal inputText As String) As Integer
        Dim patternWords As String = "[\w]+"
        Dim regCountWords As New System.Text.RegularExpressions.Regex(patternWords)
        Return regCountWords.Matches(inputText).Count()
    End Function 'CountWords'


    Public Shared Function CountCharacters(ByVal inputText As String) As Integer
        Dim patternCharacters As String = "[^:Wh]"
        Dim regCountCharacters As New System.Text.RegularExpressions.Regex(patternCharacters)
        Return regCountCharacters.Matches(inputText).Count()
    End Function 'CountCharacters'
neilprice

Hi,

You can try removing the spaces first and then counting the individual characters, like this:

    Public Function CountCharacters(ByVal inputText As String) As Integer
        Dim tempText As String = inputText.Replace(" ", "")
        Dim patternCharacters As String = "[\w]"
        Dim regCountCharacters As New System.Text.RegularExpressions.Regex(patternCharacters)
        Return regCountCharacters.Matches(tempText).Count()
    End Function 'CountCharacters'

Hope this helps,
Neil
Gary Samuels

ASKER

That works, but I still have a problem. The CountCharacters function no longer counts spaces, but it also does not count any punctuation.
Looking up Regular Expressions in the help, I see that:

[^...]   =   Matches any character not in the set of characters following the ^.
:Wh     =   Matches all types of whitespace, including publishing and ideographic spaces.

This is why I thought that
Dim patternCharacters As String = "[^:Wh]"
would count any character except whitespace.

Does anyone know the correct Regular Expressions syntax that will count all characters except whitespace?
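
One possible reading of the problem: the `:Wh` shorthand quoted above appears to come from the Visual Studio Find and Replace syntax rather than from the System.Text.RegularExpressions syntax. Inside brackets, the Regex class reads `[^:Wh]` as "any character except `:`, `W`, or `h`", so spaces are still counted. A minimal sketch, assuming the goal is to count every non-whitespace character, uses the `\S` class instead. The class and function names here are hypothetical.

```vb
Imports System.Text.RegularExpressions

Public Class TextCounts
    ' Hypothetical helper: counts every character that is not whitespace.
    ' \S matches any single non-whitespace character, so spaces, tabs and
    ' line breaks are skipped while letters, digits and punctuation count.
    Public Shared Function CountNonWhitespace(ByVal inputText As String) As Integer
        Return Regex.Matches(inputText, "\S").Count
    End Function
End Class
```

For example, `TextCounts.CountNonWhitespace("Hi, Bob!")` counts the comma and the exclamation mark but not the space, giving 7.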

ASKER CERTIFIED SOLUTION
neilprice

This solution is only available to members.
I think we have it. I had to make a few changes. I realized that "[^:Wh ]" was excluding the letter "h" as well as the whitespace. Then I realized I needed to exclude newlines and tabs too. I know there are other ways to do this, but I am using this in real time: the counts are continuously updated as the operator types. That's why I wanted to go with regular expressions, because of the speed. Thanks for the help.


    Public Shared Function CountCharacters(ByVal inputText As String) As Integer
        Dim patternCharacters As String = "[^:W \n \t ]"   'whiteSpace, newline, and tabs removed from count
        Dim regCountCharacters As New System.Text.RegularExpressions.Regex(patternCharacters)
        Return regCountCharacters.Matches(inputText).Count()
    End Function 'CountCharacters'
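
One caveat with the pattern above: inside brackets, `:` and `W` are still literal characters, so colons and capital W's are left out of the count along with the whitespace. For the string "W: a", a tab, then "b", that pattern matches only `a` and `b`. Since the counts are refreshed on every keystroke, one of the "other ways" may be worth measuring as well. The following is a minimal sketch of a plain loop built on `Char.IsWhiteSpace`; the class and function names are hypothetical, and whether it is actually faster than the Regex version is an assumption to test, not a given.

```vb
Public Class CharacterCounter
    ' Hypothetical helper: counts characters that are not whitespace
    ' by testing each one with Char.IsWhiteSpace (no Regex involved).
    Public Shared Function CountNonWhitespaceChars(ByVal inputText As String) As Integer
        ' Treat a missing string as empty instead of throwing.
        If inputText Is Nothing Then
            Return 0
        End If

        Dim total As Integer = 0
        For Each c As Char In inputText
            If Not Char.IsWhiteSpace(c) Then
                total += 1
            End If
        Next
        Return total
    End Function
End Class
```

With the same sample string, this counts `W`, `:`, `a`, and `b`, giving 4.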