Solved

How to count the number of words in an Excel 2010 cell or range using VBA?

Posted on 2014-04-01
11
1,361 Views
Last Modified: 2014-04-15
Hi everybody,

As a translator I frequently need to know how many words a cell or a range contains. I was just wondering how to get this info using a VBA macro.

Word delimiters may include the flwg:

blank space after
any punctuation mark
line feed
paragraph mark
end of cell

Any help will be appreciated.

Many thanks in advance.

Cheers.

PG
0
Comment
Question by:pg111
  • 5
  • 4
  • 2
11 Comments
 
LVL 39

Accepted Solution

by:
nutsch earned 500 total points
ID: 39969755
Go to Visual Basic Editor (alt + F11), insert a module and copy the below code, courtesy of Patrick Matthews's excellent Regular Expressions article, then call the function on your worksheet as

=CountWords(A1:A6)

Option Compare Text
 

Function CountWords(rg As Range) As Long
Dim rgLoop As Range, lLoop As Long, lCount As Long

For Each rgLoop In rg.Cells
    lLoop = 1
    Do While RegExpFind(rgLoop.Text, "\b[a-zA-Z]+\b", lLoop) <> ""
        lCount = lCount + 1
        lLoop = lLoop + 1
    Loop
Next

CountWords = lCount

End Function
 
 
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True, Optional ReturnType As Long = 0, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
    ' pattern (PatternStr).  Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' Pos = 0                   : the last match
    ' Pos = -1                  : the last match
    ' Pos = -2                  : the 2nd to last match
    ' Pos = <negative integer>  : the Nth to last match
    ' If Pos is non-numeric, or if the absolute value of Pos is greater than the number of
    ' matches, the function returns an empty string.  If no match is found, the function returns
    ' an empty string.  (Earlier versions of this code used zero for the last match; this is
    ' retained for backward compatibility)
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' ReturnType indicates what information you want to return:
    ' ReturnType = 0            : the matched values
    ' ReturnType = 1            : the starting character positions for the matched values
    ' ReturnType = 2            : the lengths of the matched values
    
    ' If MultiLine = False, the ^ and $ match the beginning and end of input, respectively.  If
    ' MultiLine = True, then ^ and $ match the beginning and end of each line (as demarcated by
    ' new line characters) in the input string
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Note: RegExp counts the character positions for the Match.FirstIndex property as starting
    ' at zero.  Since VB6 and VBA has strings starting at position 1, I have added one to make
    ' the character positions conform to VBA/VB6 expectations
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer()
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Evaluate ReturnType
    
    If ReturnType < 0 Or ReturnType > 2 Then
        RegExpFind = ""
        Exit Function
    End If
    
    ' Create instance of RegExp object if needed, and set properties
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        ' Test to see if Pos is negative, which indicates the user wants the Nth to last
        ' match.  If it is, then based on the number of matches convert Pos to a positive
        ' number, or zero for the last match
        
        If Not IsMissing(Pos) Then
            If Pos < 0 Then
                If Pos = -1 Then
                    Pos = 0
                Else
                    
                    ' If Abs(Pos) > number of matches, then the Nth to last match does not
                    ' exist.  Return a zero-length string
                    
                    If Abs(Pos) <= TheMatches.Count Then
                        Pos = TheMatches.Count + Pos + 1
                    Else
                        RegExpFind = ""
                        GoTo Cleanup
                    End If
                End If
            End If
        End If
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1)
            For Counter = 0 To UBound(Answer)
                Select Case ReturnType
                    Case 0: Answer(Counter) = TheMatches(Counter)
                    Case 1: Answer(Counter) = TheMatches(Counter).FirstIndex + 1
                    Case 2: Answer(Counter) = TheMatches(Counter).Length
                End Select
            Next
            RegExpFind = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        
        Else
            Select Case Pos
                Case 0                          ' Last match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(TheMatches.Count - 1)
                        Case 1: RegExpFind = TheMatches(TheMatches.Count - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(TheMatches.Count - 1).Length
                    End Select
                Case 1 To TheMatches.Count      ' Nth match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(Pos - 1)
                        Case 1: RegExpFind = TheMatches(Pos - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(Pos - 1).Length
                    End Select
                Case Else                       ' Invalid item number
                    RegExpFind = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFind = ""
    End If
    
Cleanup:
    ' Release object variables
    
    Set TheMatches = Nothing
    
End Function

Open in new window

0
 
LVL 31

Expert Comment

by:Rob Henson
ID: 39971984
Thinking logically, the number of spaces in a sentence (plus 1) would determine the number of words. Therefore, for a single cell:

=LEN(A1)-LEN(SUBSTITUTE(A1," ",""))+1

Would give the number of words.

For punctuation this assumes that a space follows the punctuation, such as the ", " (comma and space) I have just used in this sentence. This would not allow for line or carriage feed.

Thanks
Rob H
0
 
LVL 39

Expert Comment

by:nutsch
ID: 39972387
Rob, in a translation business, this would not handle some aspects of French punctuation for example, where columns have both a trailing and a leading space. But it makes it darn simpler to get a close approximation.

Thomas
0
 
LVL 31

Expert Comment

by:Rob Henson
ID: 39973613
If you include the TRIM function:

=TRIM(A1)

This will remove surplus leading, trailing and double spaces.

Thanks
Rob
0
 

Author Comment

by:pg111
ID: 39980584
Thanks to nutsch. I will try to implement and ajust the suggested solution which looks the best one.

Thanks to Rob and Thomas as well. Thomas, you’re perfectly right with your remarks about French punctuation requirement. However in my case I’m translating from English to French.

See you later as soon as I have the time to implement and try nutsch’s solution.
0
Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

 

Author Comment

by:pg111
ID: 39993172
Hi Nutsch,

Actually I intend to include call to function into a macro which I might invoke from a shortcut menu.

My macro would read the addresses of selected cells as a range, read their contents, and perform the calculation.

The final output (the word count) would then be sent to Clipboard for reuse in other applications supporting text paste.
0
 

Author Comment

by:pg111
ID: 39996206
Hi Nutsch,

I tried the flwg code but I get a Required Object 424 error. What’s wrong with it?

Option Compare Text

Sub WordCount()

Dim s As Range

Set s = Range(Selection.Address)

CountWords (s)

End Sub

Function CountWords(rg As Range) As Long
Dim rgLoop As Range, lLoop As Long, lCount As Long

For Each rgLoop In rg.Cells
    lLoop = 1
    Do While RegExpFind(rgLoop.Text, "\b[a-zA-Z]+\b", lLoop) <> ""
        lCount = lCount + 1
        lLoop = lLoop + 1
    Loop
Next

CountWords = lCount

End Function
 
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True, Optional ReturnType As Long = 0, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
    ' pattern (PatternStr).  Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' Pos = 0                   : the last match
    ' Pos = -1                  : the last match
    ' Pos = -2                  : the 2nd to last match
    ' Pos = <negative integer>  : the Nth to last match
    ' If Pos is non-numeric, or if the absolute value of Pos is greater than the number of
    ' matches, the function returns an empty string.  If no match is found, the function returns
    ' an empty string.  (Earlier versions of this code used zero for the last match; this is
    ' retained for backward compatibility)
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' ReturnType indicates what information you want to return:
    ' ReturnType = 0            : the matched values
    ' ReturnType = 1            : the starting character positions for the matched values
    ' ReturnType = 2            : the lengths of the matched values
    
    ' If MultiLine = False, the ^ and $ match the beginning and end of input, respectively.  If
    ' MultiLine = True, then ^ and $ match the beginning and end of each line (as demarcated by
    ' new line characters) in the input string
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Note: RegExp counts the character positions for the Match.FirstIndex property as starting
    ' at zero.  Since VB6 and VBA has strings starting at position 1, I have added one to make
    ' the character positions conform to VBA/VB6 expectations
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer()
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Evaluate ReturnType
    
    If ReturnType < 0 Or ReturnType > 2 Then
        RegExpFind = ""
        Exit Function
    End If
    
    ' Create instance of RegExp object if needed, and set properties
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        ' Test to see if Pos is negative, which indicates the user wants the Nth to last
        ' match.  If it is, then based on the number of matches convert Pos to a positive
        ' number, or zero for the last match
        
        If Not IsMissing(Pos) Then
            If Pos < 0 Then
                If Pos = -1 Then
                    Pos = 0
                Else
                    
                    ' If Abs(Pos) > number of matches, then the Nth to last match does not
                    ' exist.  Return a zero-length string
                    
                    If Abs(Pos) <= TheMatches.Count Then
                        Pos = TheMatches.Count + Pos + 1
                    Else
                        RegExpFind = ""
                        GoTo Cleanup
                    End If
                End If
            End If
        End If
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1)
            For Counter = 0 To UBound(Answer)
                Select Case ReturnType
                    Case 0: Answer(Counter) = TheMatches(Counter)
                    Case 1: Answer(Counter) = TheMatches(Counter).FirstIndex + 1
                    Case 2: Answer(Counter) = TheMatches(Counter).Length
                End Select
            Next
            RegExpFind = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        
        Else
            Select Case Pos
                Case 0                          ' Last match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(TheMatches.Count - 1)
                        Case 1: RegExpFind = TheMatches(TheMatches.Count - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(TheMatches.Count - 1).Length
                    End Select
                Case 1 To TheMatches.Count      ' Nth match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(Pos - 1)
                        Case 1: RegExpFind = TheMatches(Pos - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(Pos - 1).Length
                    End Select
                Case Else                       ' Invalid item number
                    RegExpFind = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFind = ""
    End If
    
Cleanup:
    ' Release object variables
    
    Set TheMatches = Nothing
    
End Function
                

Open in new window


Many thanks in advance for any help.

Cheers.

PG
0
 
LVL 39

Expert Comment

by:nutsch
ID: 39996218
Try replacing
Countword(s)

By

Msgbox countwords(s)
0
 
LVL 39

Expert Comment

by:nutsch
ID: 39996221
The countwords function returns a long result, so you need to do something with the result
0
 

Author Comment

by:pg111
ID: 39996733
Hi Nutsch, hi everybody,

Your solution works after I updated my sub as flws:

Sub WordCount()

Dim s As Range

Set s = Range(Selection.Address)

Dim MesDonnées As New DataObject

n = CountWords(s)

v = CStr(Round(n))

MesDonnées.SetText v
MesDonnées.PutInClipboard

m = MsgBox(v & ", soit le nombre total – indicatif – de mots dans la sélection, a été copié vers le Presse-papiers." & Chr(13) & Chr(13) & "Vous pouvez coller cette valeur à n'importe quel emplacement convenable de votre choix.", vbOKOnly, "Calculer nombre de mots dans sélection")

End Sub

Open in new window


My message basically states how many words are indicatively contained in the selection and says the user can paste this value at any suitable place.


I just have another question to you: Are empty cells ignored by the function or are they taken into account as one word?

Many thanks in advance for your answer.

Cheers.

PG
0
 

Author Closing Comment

by:pg111
ID: 40001500
Thank you so much for your promptness and efficiency.
0

Featured Post

Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

Join & Write a Comment

In case Office 2010 has not been deployed in your environment, this article may be quite useful. In our office, we wanted a way to deploy Microsoft Office Professional Plus 2010 through an automated batch file via logon script. This article is docum…
Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
This Micro Tutorial demonstrates how to create Excel charts: column, area, line, bar, and scatter charts. Formatting tips are provided as well.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now