Solved

List of words and count of occurrences

Posted on 2010-09-23
Last Modified: 2012-05-10
Not sure what area this would fit in.
I have a Word document that has several medical dictations in it. I would like to find some kind of script or other way to make a list of the words in the document and note how many times each occurs.
Is there anything like that around?
Thanks, Pat
Question by:PatKung
14 Comments
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 33749540
There is no easy way to do this in Word, but this page compares various utilities that can create a concordance for you: http://cybertext.wordpress.com/2010/04/06/word-concordanceword-list-creators/
 
LVL 6

Expert Comment

by:CRJ2000
ID: 33749954
The Word object model actually makes this relatively easy. Here is an example of code you could run within Word itself. To run it, do the following:

1. Create a new Word document.
2. Create a new VBA module in that document.
3. Paste the following code into the module.
4. Enter some text into the document.
5. Execute the code. It will print a list of words and their counts in the debug window. You could, of course, put that output anywhere (another Word doc, Excel, etc.).

Sub GetWords()

  Dim Doc As Document
  Dim TheWords As Words
  Dim TheWord As Range
  Dim TheWordText As String
  Dim Key As Variant
  Dim TheDictionary As Object

  ' ThisDocument is the document that contains this module
  Set Doc = ThisDocument
  ' Late-bound dictionary: word text -> number of occurrences
  Set TheDictionary = CreateObject("Scripting.Dictionary")

  Set TheWords = Doc.Words

  ' Tally each word: add it on first sight, otherwise bump its count
  For Each TheWord In TheWords

    TheWordText = TheWord.Text

    If TheDictionary.Exists(TheWordText) Then
      TheDictionary.Item(TheWordText) = TheDictionary.Item(TheWordText) + 1
    Else
      TheDictionary.Add TheWordText, 1
    End If

  Next

  ' Print each word and its count to the Immediate window
  For Each Key In TheDictionary.Keys
    Debug.Print Key, TheDictionary(Key)
  Next

End Sub
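
One thing to watch with the macro above: each item in Doc.Words keeps its trailing space, and Word also treats punctuation marks and paragraph marks as separate items, so "heart " and "heart" can end up under different keys. Below is a minimal sketch of a helper that cleans each token before it is counted. The name CleanToken is made up for illustration and is not part of the Word object model, and the letter/digit check assumes plain ASCII text.

```vba
' Hypothetical helper (not part of the original macro): returns the token
' lower-cased with surrounding whitespace removed, or "" when the token has
' no ASCII letter or digit in it (punctuation, paragraph marks, and so on).
Function CleanToken(ByVal RawText As String) As String
  Dim i As Long
  Dim Ch As String
  Dim Cleaned As String

  ' Strip paragraph marks, line breaks and tabs, then surrounding spaces.
  Cleaned = Replace(RawText, vbCr, "")
  Cleaned = Replace(Cleaned, vbLf, "")
  Cleaned = Replace(Cleaned, vbTab, "")
  Cleaned = LCase$(Trim$(Cleaned))

  ' Keep the token only if at least one character is a letter or digit.
  For i = 1 To Len(Cleaned)
    Ch = Mid$(Cleaned, i, 1)
    If Ch Like "[a-z0-9]" Then
      CleanToken = Cleaned
      Exit Function
    End If
  Next i

  CleanToken = ""
End Function
```

Inside the loop this would be used as TheWordText = CleanToken(TheWord.Text), skipping the dictionary update whenever the result is an empty string.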
 

Author Comment

by:PatKung
ID: 33750153
OK, I tried this, but maybe I don't know how to run it correctly because I don't seem to get anything.
How do I get the debug window up?
Thanks, Pat
 
LVL 6

Expert Comment

by:CRJ2000
ID: 33750177
Hit Control-G to pull up the debug window.
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 33750265
CRJ2000,

Very nice :)

One small tweak: that code as written will be case sensitive. Thus, "the" and "The" would have two separate entries.

To make it case INsensitive, add this line:

TheDictionary.CompareMode = vbTextCompare

http://www.experts-exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/A_3391-Using-the-Dictionary-Class-in-VBA.html

Patrick
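
To see the effect of that setting in isolation, here is a small sketch that loads the same three strings into a Scripting.Dictionary under each compare mode and reports how many distinct keys result. CountDistinct and ShowCompareModes are made-up names for illustration. Note that CompareMode has to be set while the dictionary is still empty.

```vba
' Hypothetical helper: returns how many distinct keys a Scripting.Dictionary
' ends up with for a fixed sample list under the given compare mode.
Function CountDistinct(ByVal Mode As Long) As Long
  Dim Dict As Object
  Dim Sample As Variant
  Dim Token As Variant

  Set Dict = CreateObject("Scripting.Dictionary")
  ' CompareMode must be set before any key is added.
  Dict.CompareMode = Mode

  Sample = Array("the", "The", "THE")
  For Each Token In Sample
    If Dict.Exists(Token) Then
      Dict(Token) = Dict(Token) + 1
    Else
      Dict.Add Token, 1
    End If
  Next Token

  CountDistinct = Dict.Count
End Function

' Prints both results to the Immediate window.
Sub ShowCompareModes()
  Debug.Print "Binary (default):", CountDistinct(vbBinaryCompare)  ' 3 keys
  Debug.Print "Text:", CountDistinct(vbTextCompare)                ' 1 key
End Sub
```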
 
LVL 6

Expert Comment

by:CRJ2000
ID: 33750309
Good point, Patrick. Thanks for the case-insensitive suggestion.
 

Author Comment

by:PatKung
ID: 33750376
Where does the line of code go in the module? Control+G didn't bring up the debug window; it brought up the Go To page window.

Appreciate your patience, Thanks, Pat
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 33750426
If you're in the VB Editor, Ctrl+G brings up the Immediate window (which is where Debug.Print writes to).

The line I am recommending to modify the Dictionary object can go right after you set the Dictionary.
 

Author Comment

by:PatKung
ID: 33750515
How do I make the code run?  I think I ran it but it only put a "1" in the immediate window.
Not sure if I am doing it right. I have 17 pages of cardiology reports in the document.  Is it too much?
I made just a two pager and it did the same thing.

Thanks, Pat
 
LVL 6

Accepted Solution

by:
CRJ2000 earned 500 total points
ID: 33750973
Here is a new version of the script that includes the case insensitivity, and outputs the word list to a new document. I'm not sure why it's not working on your document. Could you explain exactly what you are doing to put the code in and to execute it?

Sub GetWords()

  Dim Doc As Document
  Dim TheWords As Words
  Dim TheWord As Range
  Dim TheWordText As String
  Dim Key As Variant
  Dim TheDictionary As Object
  Dim NewDoc As Document

  ' ThisDocument is the document that contains this module
  Set Doc = ThisDocument
  Set TheDictionary = CreateObject("Scripting.Dictionary")
  ' Case-insensitive keys; must be set before any word is added
  TheDictionary.CompareMode = vbTextCompare

  Set TheWords = Doc.Words

  For Each TheWord In TheWords

    ' Drop the trailing space Word keeps on each word
    TheWordText = Trim(TheWord.Text)

    If TheDictionary.Exists(TheWordText) Then
      TheDictionary.Item(TheWordText) = TheDictionary.Item(TheWordText) + 1
    Else
      TheDictionary.Add TheWordText, 1
    End If

  Next

  ' Write one "word <tab> count" line per entry into a new document
  Set NewDoc = Documents.Add
  For Each Key In TheDictionary.Keys
    NewDoc.Range.InsertAfter Key & vbTab & TheDictionary(Key) & vbNewLine
  Next

End Sub
 

Author Comment

by:PatKung
ID: 33751035
That is the question. I am not sure I know how to execute it. I followed your instructions to insert the code in a new module, clicked the Debug tab, and compiled it. Then I clicked on the Run tab and clicked Run Macro; it brought up a list, and I chose GetWords from the list. It then put the number 1 in the immediate window.
No list of words or anything else.
Thanks, Pat
 
LVL 6

Assisted Solution

by:CRJ2000
CRJ2000 earned 500 total points
ID: 33751092
Okay. Let's make sure that you have the code in the right place....

1. Close Word, if it's already open.
2. Open Word. It should give you a blank document.
3. Hit Alt-F11, or go to Tools | Macro | Visual Basic Editor
4. In the "Project" window (on the left, by default), right-click on "ThisDocument" and choose "Insert | Module"
5. Paste the latest version of the code from this thread.
6. Go back to the Word document and enter whatever text you want to. I would start with a small amount of text, just to see if things are working.
7. You can run the code with one of these two methods: 1) Go to Tools | Macro | Macros, click the name of the macro, and click Run. 2) Go back to the Visual Basic Editor (step 3, above), click anywhere inside the GetWords subroutine, and click the "Run" button (or press F5).

The code is set up to return all of the words in the document in which the code resides. You should see all of the words put into a new Word document.
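
Because ThisDocument always refers to the document (or template) that holds the macro, running it from a module stored somewhere else counts that container's text rather than the report you have open. That is one possible explanation for a run that prints only a 1, though it is a guess. A variation, sketched here on the assumption that the reports are in whichever document is currently in front, reads from ActiveDocument instead. The procedure name CountWordsInActiveDocument and the variable names are made up for illustration.

```vba
' Sketch: same counting logic, but reads the document that is currently
' active in Word rather than the one that stores the macro.
Sub CountWordsInActiveDocument()
  Dim SourceDoc As Document
  Dim ReportDoc As Document
  Dim TheWord As Range
  Dim Token As String
  Dim Key As Variant
  Dim Counts As Object

  ' Capture the active document before the report document is created.
  Set SourceDoc = ActiveDocument
  Set Counts = CreateObject("Scripting.Dictionary")
  Counts.CompareMode = vbTextCompare   ' "The" and "the" share one entry

  For Each TheWord In SourceDoc.Words
    Token = Trim$(TheWord.Text)
    If Len(Token) > 0 Then
      If Counts.Exists(Token) Then
        Counts(Token) = Counts(Token) + 1
      Else
        Counts.Add Token, 1
      End If
    End If
  Next TheWord

  ' Write one "word <tab> count" line per entry into a new blank document.
  Set ReportDoc = Documents.Add
  For Each Key In Counts.Keys
    ReportDoc.Range.InsertAfter Key & vbTab & Counts(Key) & vbNewLine
  Next Key
End Sub
```

With this version the module can live in any document or template, as long as the document to be counted is the active one when the macro starts.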

Chris
 

Author Closing Comment

by:PatKung
ID: 33753619
Great!!!
It did run, but put it in a different new document, which is fine.
I appreciate all your effort in helping me solve this.  This will be great.
Thanks, Pat
 
LVL 6

Expert Comment

by:CRJ2000
ID: 33754345
Glad I could help, Pat.

Chris