  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1423

VB.NET Word Count algorithm

Can anyone suggest (a) a model algorithm to output the number of words in the text in the code window, and (b) a separate algorithm to count how often each word appears in the text?

The vessel, which is operated by an Italian company, carried a crew of 24, from Bulgaria, Ukraine, Russia and the Philippines, Britain's Telegraph newspaper reported.

2 Solutions
 
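ChloesDad commented:
Hi, the first one is fairly easy. There is a String.Split function which returns a string array.

' Split on spaces; RemoveEmptyEntries skips the empty strings produced by repeated spaces
Dim words() As String = mytext.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
Dim numberOfWords As Integer = words.Length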

You can then use the array to work out the number of times each word appears. It is not a problem with the given text, but in general the punctuation would have to be removed from the original string first; otherwise "vessel," and "vessel" would be treated as different words.
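
The frequency count described above could be sketched roughly as follows. This is only an illustration under some assumptions, not ChloesDad's own code: the helper name CountWords and the list of punctuation characters are made up for the example, words are compared case-insensitively, and punctuation is trimmed from the ends of each word rather than removed from the whole string.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module WordFrequencySketch

    ' Hypothetical helper: counts how often each word occurs, ignoring case
    ' and any punctuation at the start or end of a word.
    Function CountWords(ByVal input As String) As Dictionary(Of String, Integer)
        Dim counts As New Dictionary(Of String, Integer)
        Dim separators() As Char = {" "c, ControlChars.Tab, ControlChars.Cr, ControlChars.Lf}
        ' Assumed punctuation set; extend it to suit your text.
        Dim punctuation() As Char = {","c, "."c, ";"c, ":"c, "!"c, "?"c, """"c, "("c, ")"c}

        For Each rawWord As String In input.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            ' "vessel," and "Vessel" both become "vessel".
            Dim word As String = rawWord.Trim(punctuation).ToLower()
            If word.Length = 0 Then Continue For

            ' TryGetValue leaves current at 0 when the word has not been seen yet.
            Dim current As Integer = 0
            counts.TryGetValue(word, current)
            counts(word) = current + 1
        Next

        Return counts
    End Function

    Sub Main()
        Dim counts As Dictionary(Of String, Integer) = CountWords("The vessel, the crew and the vessel.")
        For Each kvp As KeyValuePair(Of String, Integer) In counts
            Console.WriteLine("{0}: {1}", kvp.Key, kvp.Value)
        Next
    End Sub

End Module
```

For the short test sentence in Main, "the" is counted three times and "vessel" twice. Trimming only the ends of each word keeps the apostrophe in words such as "Britain's".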
 
Jason Evans (Senior Software Developer) commented:
Hi there.
Here's a quick and dirty way to get the number of times each word is used in the string. As ChloesDad noted, you may need to change the code to take care of characters such as ',:; etc.
You need .NET Framework 3.5 for my code example to work, since it uses LINQ.
Hope this helps.
Jas.

Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions
 
Module Module1
 
    Sub Main()
        Dim inputString As String = "The vessel, which is operated by an Italian company, carried a crew of 24, from the country of Bulgaria, Ukraine, Russia and the Philippines, Britain's Telegraph newspaper reported."
 
        Dim wordCounts As Dictionary(Of String, Integer) = GetWordUsageCount(inputString)
 
        For Each kvp As KeyValuePair(Of String, Integer) In wordCounts
            Console.WriteLine("Word = {0}, Count = {1}", kvp.Key, kvp.Value)
        Next
 
    End Sub
 
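    Private Function GetWordUsageCount(ByVal input As String) As Dictionary(Of String, Integer)
 
        ' Match runs of characters that are not whitespace or commas.
        ' Other punctuation (for example a trailing period) stays attached to the word.
        Dim matches As MatchCollection = Regex.Matches(input, "[^\s,]+")
 
        ' MatchCollection is not generic, so type the range variable explicitly.
        Dim words = (From m As Match In matches _
                     Select m.Value).ToList()
 
        ' Group the words case-insensitively and count the members of each group.
        Dim wordGroups = From w In words _
                         Group By lowered = w.ToLower() _
                         Into wordCount = Count()
 
        Return wordGroups.ToDictionary(Function(g) g.lowered, Function(g) g.wordCount)
 
    End Function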
 
End Module
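
One possible way to deal with the remaining punctuation is to match the words themselves instead of excluding separators. The sketch below is an assumption-laden variation, not part of the code above: the pattern [\w']+ (letters, digits, underscores and apostrophes) and the name CountWordsWithRegex are chosen for illustration only. It avoids LINQ, and it also returns the total word count asked for in part (a) of the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexWordCountSketch

    ' Hypothetical helper: tokenises with a regex and counts case-insensitively.
    ' totalWords receives the overall number of words found.
    Function CountWordsWithRegex(ByVal input As String, ByRef totalWords As Integer) As Dictionary(Of String, Integer)
        ' The comparer makes "The" and "the" share one key (the first spelling seen is kept).
        Dim counts As New Dictionary(Of String, Integer)(StringComparer.OrdinalIgnoreCase)

        ' Assumed pattern: a word is a run of word characters or apostrophes,
        ' so commas, periods and colons never become part of a word.
        Dim matches As MatchCollection = Regex.Matches(input, "[\w']+")
        totalWords = matches.Count

        For Each m As Match In matches
            Dim current As Integer = 0
            counts.TryGetValue(m.Value, current)
            counts(m.Value) = current + 1
        Next

        Return counts
    End Function

    Sub Main()
        Dim total As Integer
        Dim counts As Dictionary(Of String, Integer) = CountWordsWithRegex("Britain's vessel, the vessel: reported.", total)
        Console.WriteLine("Total words = {0}", total)
        For Each kvp As KeyValuePair(Of String, Integer) In counts
            Console.WriteLine("Word = {0}, Count = {1}", kvp.Key, kvp.Value)
        Next
    End Sub

End Module
```

Note that this pattern also counts numbers such as "24" as words; whether that is wanted depends on your definition of a word.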

