This might be a challenge, so I gave it 500 points. Or it might be easy, I really don't know.
I am trying to do some Text Mining, or Text Analytics on survey comments using MS Excel -- starting with the simplest concepts of separating the text into single words (in sequence) and then looking at two-word and three-word combinations in the text.
The attached Excel file gives some text (feel free to use your own if it might be a better example) with the columns started.
I can somewhat do this procedure in MS Word and then with Concatenate and Pivot Tables back in Excel.
In Word, I dump all the text, then remove all periods and commas. But first, I replace all periods with a paragraph mark (to get the end of a sentence). Then I replace all spaces (" ") with a paragraph mark. This creates a single column of words, in sequence -- with two paragraphs at the end of each cell.
Then with pivot tables I can count the words. With Concatenate, I can add one cell to the following cell and get two-word combinations and do the same for three-word combinations. These can be counted (with Pivot Tables) to find key themes in the text.
Now -- to do all of that in Excel with Code???
I have found the attached code which does a nice job of Counting Instances of Words in the text. No matter what I've tried though, I cannot get this to just display the single words in sequence though ( and I don't understand how the range works).
So, this might be a helpful starting point???
If not, and you're up for the challange -- could you try to write the code that would parse the strings into single words in a column, and then join them in two-word & three-word combinations?
Thanks very much.
Dim rngData As Range
Dim rngCell As Range
Dim colWords As Collection
Dim vntWord As Variant
On Error Resume Next
Set colWords = New Collection
Set rngData = Range("B30:B86")
For Each rngCell In rngData.Cells
For Each vntWord In Split(Replace(Replace(Replace(rngCell.Value, """", ""), "]", ""), "[", ""), " ")
colWords.Add colWords.Count + 1, vntWord
With Cells(29 + colWords(vntWord), 3)
.Value = vntWord
.Offset(0, 1) = .Offset(0, 1) + 1
With Range("C30", Cells(Rows.Count, 3).End(xlUp)).Resize(, 2)
.Sort .Cells(1, 2), xlDescending