Solved

VB.NET Word Count algorithm

Posted on 2009-04-06
Last Modified: 2012-05-06
Can anyone suggest a model algorithm to (a) output the number of words in the text in the code window and (b) a separate algorithm to count the frequency of each word appearing in the text?

pythonV
The vessel, which is operated by an Italian company, carried a crew of 24, from Bulgaria, Ukraine, Russia and the Philippines, Britain's Telegraph newspaper reported.

Question by:pythonV
2 Comments
 
LVL 15

Accepted Solution

by:
ChloesDad earned 250 total points
ID: 24082299
Hi, the first one is fairly easy. There is a String.Split function, which returns a string array:

Dim words() As String = myText.Split(" "c)
Dim numberOfWords As Integer = words.Length

You can then use the array to work out the number of times each word appears. Although it is not a problem with the given text, the punctuation would have to be removed from the original string first; otherwise "vessel," and "vessel" would be treated as different words.
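A minimal sketch of both steps, assuming that words are separated by whitespace and that only leading and trailing punctuation needs to be stripped. The names WordCountSketch, SplitIntoWords, and CountFrequencies are illustrative and do not come from this thread; the punctuation list is an assumption and may need extending.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module WordCountSketch

    ' Splits on whitespace, then trims punctuation from both ends of each token.
    Function SplitIntoWords(ByVal text As String) As List(Of String)
        Dim separators() As Char = {" "c, ControlChars.Tab, ControlChars.Cr, ControlChars.Lf}
        ' Assumed punctuation set; extend it as the input requires.
        Dim punctuation() As Char = {","c, "."c, ";"c, ":"c, "!"c, "?"c, """"c, "("c, ")"c}
        Dim words As New List(Of String)
        ' RemoveEmptyEntries stops repeated spaces from producing empty "words".
        For Each token As String In text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            Dim word As String = token.Trim(punctuation)
            If word.Length > 0 Then words.Add(word)
        Next
        Return words
    End Function

    ' Counts occurrences of each word, ignoring case.
    Function CountFrequencies(ByVal words As IEnumerable(Of String)) As Dictionary(Of String, Integer)
        Dim counts As New Dictionary(Of String, Integer)(StringComparer.OrdinalIgnoreCase)
        For Each word As String In words
            Dim current As Integer = 0
            counts.TryGetValue(word, current)   ' current stays 0 if the word is new
            counts(word) = current + 1
        Next
        Return counts
    End Function

    Sub Main()
        Dim text As String = "The vessel, which is operated by an Italian company, carried a crew of 24."
        Dim words As List(Of String) = SplitIntoWords(text)
        Console.WriteLine("Number of words: {0}", words.Count)
        For Each kvp As KeyValuePair(Of String, Integer) In CountFrequencies(words)
            Console.WriteLine("{0}: {1}", kvp.Key, kvp.Value)
        Next
    End Sub

End Module
```

Trimming each token, rather than deleting punctuation from the whole string, keeps an apostrophe inside a word such as "Britain's" intact. This version does not need LINQ, so it should also work on .NET Framework 2.0.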
 
LVL 10

Assisted Solution

by:MrClyfar
MrClyfar earned 250 total points
ID: 24082418
Hi there.
Here's a quick and dirty way to get the number of times each word is used in the string. As ChloesDad noted, you may need to change the code to take care of characters such as ',:; etc.
You need .NET Framework 3.5 for my code example to work, because it uses LINQ.
Hope this helps.
Jas.

Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module Module1

    Sub Main()
        Dim inputString As String = "The vessel, which is operated by an Italian company, carried a crew of 24, from the country of Bulgaria, Ukraine, Russia and the Philippines, Britain's Telegraph newspaper reported."

        Dim wordCounts As Dictionary(Of String, Integer) = GetWordUsageCount(inputString)

        For Each kvp As KeyValuePair(Of String, Integer) In wordCounts
            Console.WriteLine("Word = {0}, Count = {1}", kvp.Key, kvp.Value)
        Next

    End Sub

    ' Returns a dictionary mapping each lower-cased word to the number of times it occurs.
    Private Function GetWordUsageCount(ByVal input As String) As Dictionary(Of String, Integer)

        ' A "word" here is any run of characters other than whitespace and commas.
        Dim m As MatchCollection = Regex.Matches(input, "[^\s,]+")

        ' MatchCollection is a non-generic collection, so the range variable is typed explicitly.
        Dim words = (From word As Match In m _
                     Select word.Value).ToList()

        ' Group the words case-insensitively and count the members of each group.
        Dim wordGroups = From word In words _
                         Group By key = word.ToLower() _
                         Into wordCount = Count()

        Return wordGroups.ToDictionary(Function(g) g.key, Function(g) g.wordCount)

    End Function

End Module
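One possible way to take care of characters such as , : ; and the full stop is to match the words themselves instead of matching everything that is not a separator. The sketch below assumes a word is a run of letters, digits, underscores, or apostrophes (the pattern [\w']+); that definition is an assumption and may not suit every text. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexWordCountSketch

    ' Hypothetical helper: counts lower-cased words matched by the pattern [\w']+.
    Function CountWordsByPattern(ByVal input As String) As Dictionary(Of String, Integer)
        Dim counts As New Dictionary(Of String, Integer)
        For Each m As Match In Regex.Matches(input, "[\w']+")
            Dim key As String = m.Value.ToLowerInvariant()
            If counts.ContainsKey(key) Then
                counts(key) += 1
            Else
                counts(key) = 1
            End If
        Next
        Return counts
    End Function

    Sub Main()
        Dim sample As String = "The vessel, the crew; Britain's Telegraph newspaper reported."
        For Each kvp As KeyValuePair(Of String, Integer) In CountWordsByPattern(sample)
            Console.WriteLine("Word = {0}, Count = {1}", kvp.Key, kvp.Value)
        Next
    End Sub

End Module
```

With this pattern, "reported." is counted as "reported" and "Britain's" stays a single word. A plain loop replaces the LINQ query here, so the sketch does not depend on .NET Framework 3.5.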
