Shyamulee Das
asked on
Highlight sentences with similar words!!
Hi,
I want to create a Word macro that will check for similar words in two documents and highlight those sentences with more than 6 similar words in 2nd document.
Example: Doc1 contains some sentences and 2nd document contains some sentences, so highlight sentences which contains more than 6 similar words in the 2nd document.
Thank you.
I want to create a Word macro that will check for similar words in two documents and highlight those sentences with more than 6 similar words in 2nd document.
Example: Doc1 contains some sentences and 2nd document contains some sentences, so highlight sentences which contains more than 6 similar words in the 2nd document.
Thank you.
Please define "similar words" so that we can determine whether this is possible without using external software.
ASKER
For example: (Doc1) " I am going to the market in sometime "
(Doc2) " I will be going to the market in the evening "
So both the sentences have 6 similar words, then highlight the 2nd sentence.
(Doc2) " I will be going to the market in the evening "
So both the sentences have 6 similar words, then highlight the 2nd sentence.
See if this fits your need:
Sub HighLightMatchingWords()
Dim docFirst As Document
Dim docSecond As Document
Dim iTarget As Integer
Dim iCount As Integer
Dim rngSent1 As Range
Dim rngSent2 As Range
Dim rngWord1 As Range
Dim rngWord2 As Range
iTarget = 6
Set docFirst = Documents.Open("C:\MyFolder\" & "First.docx")
Set docSecond = Documents.Open("C:\MyFolder\" & "Second.docx")
For Each rngSent1 In docFirst.Range.Sentences
For Each rngSent2 In docSecond.Range.Sentences
iCount = 0
For Each rngWord1 In rngSent1.Words
For Each rngWord2 In rngSent2.Words
If StrComp(rngWord1.Text, rngWord2.Text, vbTextCompare) = 0 Then
rngWord2.HighlightColorIndex = wdYellow
iCount = iCount + 1
End If
Next rngWord2
If iCount < iTarget Then
rngSent2.HighlightColorIndex = wdNoHighlight
End If
Next rngWord1
Next rngSent2
Next rngSent1
End Sub
Warnings:
1. The macro compares every sentence in the first document with every sentence in the second document, so that will make it slow with large documents.
2. It uses MS Word's VBA definition of a word. This might not be what you need (e.g. commas are treated as words)
1. The macro compares every sentence in the first document with every sentence in the second document, so that will make it slow with large documents.
2. It uses MS Word's VBA definition of a word. This might not be what you need (e.g. commas are treated as words)
ASKER
I want to highlight Sentences in the "DocSecond" Sentences having more than 6 similar words with compare to the sentences of "DocFirst".
Thank you.
Thank you.
ASKER
I have created a macro for this but it is not helping me. It is highlighting everything in the second doc. If you can help me out with the code it will be really helpful.
Sub comparison1()
Dim doc As Document
Dim doc1 As Document
Set doc = ActiveDocument
Dim sen As Range
Dim wrd
Dim wrd1
Dim sen1 As Range
Dim SelectedFileItem As String
Dim afile
Dim item_count As Variant
Dim total_item
Dim FDialog As FileDialog
Set FDialog = Application.FileDialog(msoFileDialogOpen)
Dim c_dict As Scripting.Dictionary
Set c_dict = New Scripting.Dictionary
Dim c_dict1 As Scripting.Dictionary
Set c_dict1 = New Scripting.Dictionary
With FDialog
.Title = "Select a file"
.AllowMultiSelect = False
.InitialFileName = "D:\OneDrive - CACTUS\Documents"
.Filters.Clear
.Filters.Add "Word files", "*.docx"
If .Show = -1 Then
SelectedFileItem = .SelectedItems(1)
afile = SelectedFileItem
Else
MsgBox "You cancelled the operation"
Exit Sub
End If
Set doc1 = Documents.Open(afile)
End With
total_item = 0
For Each sen In doc1.Range.sentences
For Each wrd In sen.words
If c_dict.Exists(LCase(wrd.Text)) = False Then
With sen
If sen.words.count > 6 Then
c_dict.Add LCase(wrd.Text), LCase(wrd.Text)
End If
End With
End If
Next
For Each sen1 In doc1.Range.sentences
For Each wrd1 In sen1.words
If c_dict1.Exists(LCase(wrd1.Text)) = False Then
If sen1.words.count >= sen.words.count Then
c_dict1.Add LCase(wrd1.Text), LCase(wrd1.Text)
End If
End If
Next
For Each item_count In c_dict.Items
If c_dict1.Exists(LCase(item_count)) = True Then
total_item = total_item + 1
If total_item > 6 Then
sen1.HighlightColorIndex = wdGray25
End If
End If
Next
c_dict.RemoveAll
c_dict1.RemoveAll
Next sen1
Next sen
For Each coll In doc1.Range.sentences
If coll.HighlightColorIndex = wdGray25 Then
highlightCount = highlightCount + 1
End If
Next
MsgBox "There are (" & highlightCount & ") similar sentences" & vbCr & "Matched items: " & total_item
End Sub
I hope this is better for you
Sub HighLightMatchingSentences()
Dim docFirst As Document
Dim docSecond As Document
Dim iTarget As Integer
Dim iCount As Integer
Dim rngSent1 As Range
Dim rngSent2 As Range
Dim rngWord1 As Range
Dim rngWord2 As Range
iTarget = 6
Set docFirst = Documents.Open("C:\MyFolder\" & "First.docx")
Set docSecond = Documents.Open("C:\MyFolder\" & "Second.docx")
For Each rngSent1 In docFirst.Range.Sentences
For Each rngSent2 In docSecond.Range.Sentences
iCount = 0
For Each rngWord1 In rngSent1.Words
For Each rngWord2 In rngSent2.Words
If StrComp(rngWord1.Text, rngWord2.Text, vbTextCompare) = 0 Then
iCount = iCount + 1
End If
Next rngWord2
If iCount < iTarget Then
rngSent2.HighlightColorIndex = wdYellow
End If
Next rngWord1
Next rngSent2
Next rngSent1
End Sub
I haven't analysed your macro in detail, but lines 44 and 55 are both stepping through doc1. I guess that you wanted to use doc in line 44
ASKER
rngSent2.HighlightColorInd ex = wdYellow
After updating the recent code it is highlighting everything.
After updating the recent code it is highlighting everything.
My code or yours?
ASKER
Yours.
Mine
ASKER
Yes!!
Sub HighLightMatchingSentences()
Dim docFirst As Document
Dim docSecond As Document
Dim iTarget As Integer
Dim iCount As Integer
Dim rngSent1 As Range
Dim rngSent2 As Range
Dim rngWord1 As Range
Dim rngWord2 As Range
iTarget = 6
Set docFirst = Documents.Open("C:\MyFolder\" & "First.docx")
Set docSecond = Documents.Open("C:\MyFolder\" & "Second.docx")
For Each rngSent1 In docFirst.Range.Sentences
For Each rngSent2 In docSecond.Range.Sentences
iCount = 0
For Each rngWord1 In rngSent1.Words
For Each rngWord2 In rngSent2.Words
If StrComp(rngWord1.Text, rngWord2.Text, vbTextCompare) = 0 Then
iCount = iCount + 1
End If
Next rngWord2
If iCount < iTarget Then
rngSent2.HighlightColorIndex = wdYellow
End If
Next rngWord1
Next rngSent2
Next rngSent1
End Sub
Sorry. Careless mistake. It was line 26
Sub HighLightMatchingSentences()
Dim docFirst As Document
Dim docSecond As Document
Dim iTarget As Integer
Dim iCount As Integer
Dim rngSent1 As Range
Dim rngSent2 As Range
Dim rngWord1 As Range
Dim rngWord2 As Range
iTarget = 6
Set docFirst = Documents.Open("C:\MyFolder\" & "First.docx")
Set docSecond = Documents.Open("C:\MyFolder\" & "Second.docx")
For Each rngSent1 In docFirst.Range.Sentences
For Each rngSent2 In docSecond.Range.Sentences
iCount = 0
For Each rngWord1 In rngSent1.Words
For Each rngWord2 In rngSent2.Words
If StrComp(rngWord1.Text, rngWord2.Text, vbTextCompare) = 0 Then
iCount = iCount + 1
End If
Next rngWord2
If iCount >= iTarget Then
rngSent2.HighlightColorIndex = wdYellow
End If
Next rngWord1
Next rngSent2
Next rngSent1
End Sub
ASKER
Still it is highlighting the whole thing !! :(
Perhaps my choice of documents is wrong. Can you post a couple of sample documents, please?
ASKER
I have uploaded the two demo files which m working on!! In doc2 New stimulation paradigms with time-varying intensity and frequency were developed to suppress the “onset responses”. you will find this sentence same as the one in doc1. so have to highlight this sentence.
Thank you.
Doc1.docx
Doc2.docx
Thank you.
Doc1.docx
Doc2.docx
It didn't highlight every sentence, but more, I suppose, than you were expecting.
That is partly because of punctuation. Also I suspect that you didn't want to include short common words like "to", "the" and "and".
In this version, the word length must be greater than 4 characters (including the trailing blank) to be counted. It only finds one sentence.
That is partly because of punctuation. Also I suspect that you didn't want to include short common words like "to", "the" and "and".
In this version, the word length must be greater than 4 characters (including the trailing blank) to be counted. It only finds one sentence.
Sub HighLightMatchingSentences1()
Dim docFirst As Document
Dim docSecond As Document
Dim iTarget As Integer
Dim iCount As Integer
Dim rngSent1 As Range
Dim rngSent2 As Range
Dim rngWord1 As Range
Dim rngWord2 As Range
iTarget = 6
Set docFirst = Documents.Open("C:\MyFolder\" & "Doc1.docx")
Set docSecond = Documents.Open("C:\MyFolder\" & "Doc2.docx")
For Each rngSent1 In docFirst.Range.Sentences
For Each rngSent2 In docSecond.Range.Sentences
iCount = 0
For Each rngWord1 In rngSent1.words
If Len(rngWord1) > 4 Then
For Each rngWord2 In rngSent2.words
If StrComp(rngWord1.Text, rngWord2.Text, vbTextCompare) = 0 Then
iCount = iCount + 1
End If
Next rngWord2
If iCount >= iTarget Then
rngSent2.HighlightColorIndex = wdYellow
End If
End If
Next rngWord1
Next rngSent2
Next rngSent1
End Sub
ASKER
This is the output which I am getting.
But I don't want this as my output. It should highlight only those sentences which has more than 6 similar words as compared to the sentences of doc1!!
Thank you.
Output.docx
But I don't want this as my output. It should highlight only those sentences which has more than 6 similar words as compared to the sentences of doc1!!
Thank you.
Output.docx
That is strange. This is what I get:
Doc2Out.docx
Doc2Out.docx
I'll add some code to capture some debugging information.
You might be interested in plagiarism detectors such as quetext or Search Engine Reports.
ASKER CERTIFIED SOLUTION
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
ASKER
Thank you so much !! :) This has solved my problem..