?
Solved

VB.Net - Search for two strings in the same paragraph MS Word

Posted on 2011-10-19
21
Medium Priority
?
272 Views
Last Modified: 2012-05-12
Hey guys,

I have a list of pairs of strings contained in Excel that I am wanting to use to loop a search through a Word document that will detect if both strings appear before the next vbCr (carriage return). If both words are detected, then copy the paragraph to a seperate word document. I want to encorporate this into a windows app that i have been developing using VB.Net.

The general process is:
 
Open "Reference.xlsx"
Open "Input.docx"
Create "Output.docx"

In Input.docx Find "stringA1"
If "stringA1" found = True Then
	
	Find "stringB1" before next vbCr
		If "stringB1" found = True Then
			Copy paragraph to "Output.docx"
		Else 
			continue search
		End If
Else
	Excel.Range("D1").Value = "Reference Not Found"
End If

Open in new window


If only scripting was that easy :o) I hope atleast that you get the general idea of what I am trying to achieve. I have attached some test data to clarify further.

Hope someone can help.
Cheers
G_M

 references.xlsx
 input.docx
0
Comment
Question by:G_M
  • 11
  • 9
21 Comments
 
LVL 35

Expert Comment

by:Norie
ID: 36996394
Why use Excel and Word?

Aren't the references you want to search for coming from a Word document?

1. Open document with original references.

2. Open document to search.

3. Search for reference(s) in first document, then search for them in the second document.

4. If found copy paragraph to new document, or even Excel.

This can probably be refined/improved a lot but kind of seems like a good idea.

PS You haven't you mentioned Access yet, you could store all this data in a database and then have code to generate the Excel/text files when needed.

PPS Or did you mention Access?
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996404
It isn't clear which bit you are stuck on; working in VB.Net,  opening Word and/or Excel documents, reading from Excel sheets,  searching in Word documents or all of the above. If it is all of them, then it is more like a project than a question.

Meanwhile, I'll try to show how to do the Word searching bit
0
 

Author Comment

by:G_M
ID: 36996425
imnorie, I am already working in too many applications to take Access into account (although i know you have experience there too).

I am just working with what I have at the moment. All my data is in the Excel document so I'll use it if I can I guess.

As for your question Graham, I just need a hand writing up the vb.net script required to achieve said task.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:G_M
ID: 36996441
All the string references are in the excel document... the word document contains the paragraph of text those references are associated with.
0
 

Author Comment

by:G_M
ID: 36996537
Not that this will help, though you may atleast laugh... This is what I have so far ;o)

 
Public Sub SearchReferences()

        Dim objExcel As Excel.Application
        Dim objWorkbook As Excel.Workbook
        Dim objWorksheet As Excel.Worksheet
        Dim objWord As Word.Application
        Dim objWordDocumentInput As Word.Document
        Dim objWordDocumentOutput As Word.Document
        Dim strInputFileExcel As String
        Dim strInputFileWord As String
        Dim misValue As Object = System.Reflection.Missing.Value

        strInputFileWord = "D:\References.docx"
        strInputFileExcel = frmGetReferences.lblOutputFolder.Text & "wf_" & frmGetReferences.txtOutputFilename.Text & ".xlsx"
        objExcel = New Excel.Application
        objWorkbook = objExcel.Workbooks.Open(strInputFileExcel)
        objWorksheet = objWorkbook.Worksheets(1)
        objWord = New Word.Application
        objWordDocumentInput = objWord.Documents.Open(strInputFileWord)
        objWordDocumentOutput = objWord.Documents.Add(misValue)

        'Required script for search....

        objWorkbook.Save()
        objWordDocumentOutput.Save()
        objWordDocumentInput.Close()
        objWordDocumentOutput.Close()
        objWord.Quit()
        objExcel.Quit()

        objWord = Nothing
        objWordDocumentInput = Nothing
        objWordDocumentOutput = Nothing
        objExcel = Nothing
        objWorkbook = Nothing
        objWorksheet = Nothing


    End Sub

Open in new window

0
 

Author Comment

by:G_M
ID: 36996549
missed a line... :o)


objWordDocumentOutput.SaveAs(frmGetReferences.lblOutputFolder.Text & "" & frmGetReferences.txtOutputFilename.Text & ".docx")

Open in new window

0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996567
Here is the bit to open and the Word app and docs and to call the sub.

I'll try to fit it with you code
Dim objWordApp As Word.Application
        Dim objSrceDoc As Word.Document
        Dim objDstDoc As Word.Document
        Dim objSelection As Word.Selection
        
        objWordApp = CreateObject("Word.Application")
        objSrceDoc = objWord.Documents.Open("C:\MyFolder\input.DocX")
        objDstDoc = objWord.Documents.Add()
        '...
        TwoStrings(objSrceDoc, objSrceDoc, "septation", "block")
        '...

Open in new window

0
 

Author Comment

by:G_M
ID: 36996586
Ok I am now trying to put these values into a range so they can be offset before the loop

 
Public Sub SearchReferences()

        Dim objExcel As Excel.Application
        Dim objWorkbook As Excel.Workbook
        Dim objWorksheet As Excel.Worksheet
        Dim objWord As Word.Application
        Dim objWordDocumentInput As Word.Document
        Dim objWordDocumentOutput As Word.Document
        Dim strInputFileExcel As String
        Dim strInputFileWord As String
        Dim misValue As Object = System.Reflection.Missing.Value
        Dim strExcelString1 As String
        Dim strExcelString2 As String
        Dim objExcelRange1 As Excel.Range
        Dim objExcelRange2 As Excel.Range

        strInputFileWord = "D:\References.docx"
        strInputFileExcel = frmGetReferences.lblOutputFolder.Text & "wf_" & frmGetReferences.txtOutputFilename.Text & ".xlsx"
        objExcel = New Excel.Application
        objWorkbook = objExcel.Workbooks.Open(strInputFileExcel)
        objWorksheet = objWorkbook.Worksheets(1)
        objWord = New Word.Application
        objWordDocumentInput = objWord.Documents.Open(strInputFileWord)
        objWordDocumentOutput = objWord.Documents.Add(misValue)
        objWordDocumentOutput.SaveAs(frmGetReferences.lblOutputFolder.Text & "" & frmGetReferences.txtOutputFilename.Text & ".docx")
        objExcelRange1 = objExcel.Range("A2")
        objExcelRange2 = objExcel.Range("B2")

        strExcelString1 = objExcel.Range(objExcelRange1).Value
        strExcelString2 = objExcel.Range(objExcelRange2).Value


        'Required script for search....

        objWorkbook.Save()
        objWordDocumentOutput.Save()
        objWordDocumentInput.Close()
        objWordDocumentOutput.Close()
        objWord.Quit()
        objExcel.Quit()

        objWord = Nothing
        objWordDocumentInput = Nothing
        objWordDocumentOutput = Nothing
        objExcel = Nothing
        objWorkbook = Nothing
        objWorksheet = Nothing


    End Sub

Open in new window

0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996629
Hmm. I've just switched from VBA (where it is much faster to develop) to .Net, opened an earlier project and found some very similar code with the object 'frmGetReferences' flagged as unknown, so I guess that I've tried to work on this before.
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996648
I don't understand where the pair of strings is coming from. Your sheet has only one text column.
0
 

Author Comment

by:G_M
ID: 36996652
That would be this one... http://rdsrc.us/SJ6yR2

I am looking at the same code
0
 

Author Comment

by:G_M
ID: 36996655
The second string is the "Year" column B2:B
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996759
I haven't set up all the externals yet, so this code is untested
Imports System
Imports Microsoft.Office.Interop
Imports Microsoft.Office.Interop.Word
Imports Microsoft.Office.Interop.Excel
Public Class Form1

    Public Sub SearchReferences()

        Dim objExcel As Excel.Application
        Dim objWorkbook As Excel.Workbook
        Dim objWorksheet As Excel.Worksheet
        Dim objWord As Word.Application
        Dim objWordDocumentInput As Word.Document
        Dim objWordDocumentOutput As Word.Document
        Dim strInputFileExcel As String
        Dim strInputFileWord As String
        Dim misValue As Object = System.Reflection.Missing.Value
        Dim strExcelString1 As String
        Dim strExcelString2 As String
        Dim objExcelRange1 As Excel.Range
        Dim objExcelRange2 As Excel.Range
        Dim r As Integer

        strInputFileWord = "D:\References.docx"
        strInputFileExcel = frmGetReferences.lblOutputFolder.Text & "wf_" & frmGetReferences.txtOutputFilename.Text & ".xlsx"
        objExcel = New Excel.Application
        objWorkbook = objExcel.Workbooks.Open(strInputFileExcel)
        objWorksheet = objWorkbook.Worksheets(1)
        objWord = New Word.Application
        objWordDocumentInput = objWord.Documents.Open(strInputFileWord)
        objWordDocumentOutput = objWord.Documents.Add(misValue)
        r = 2
        Do Until objWorksheet.Cells(r, 1).Value = ""
            strExcelString1 = objWorksheet.Cells(r, 1).Value
            strExcelString2 = objWorksheet.Cells(r, 2).Value
            TwoStrings(objWordDocumentInput, objWordDocumentOutput, strExcelString1, strExcelString2)
        Loop
        'objWorkbook.Save()
        objWordDocumentOutput.SaveAs(frmGetReferences.lblOutputFolder.Text & "" & frmGetReferences.txtOutputFilename.Text & ".docx")
        'objWordDocumentOutput.Save()
        objWordDocumentInput.Close()
        objWordDocumentOutput.Close()
        objWord.Quit()
        objExcel.Quit()

        'I believe that all this setting to Nothing is unnesessary. VB does it automatically.
        'objWord = Nothing
        'objWordDocumentInput = Nothing
        'objWordDocumentOutput = Nothing
        'objExcel = Nothing
        'objWorkbook = Nothing
        'objWorksheet = Nothing


    End Sub

    Sub TwoStrings(ByVal objSourceDoc As Word.Document, ByVal objDestDoc As Word.Document, ByVal strText1 As String, ByVal strText2 As String)
        Dim para As Word.Paragraph

        For Each para In objSourceDoc.Paragraphs
            If para.Range.Find.Execute(strText1) Then
                If para.Range.Find.Execute(strText2) Then
                    para.Range.Copy()
                    objDestDoc.Bookmarks("\EndOfDoc").Range.Paste()
                End If
            End If
        Next para
    End Sub


End Class

Open in new window

0
 
LVL 76

Accepted Solution

by:
GrahamSkan earned 2000 total points
ID: 36996767
And just as I clicked Submit, I realised that I had left out the vital r = r+1 line
Imports System
Imports Microsoft.Office.Interop
Imports Microsoft.Office.Interop.Word
Imports Microsoft.Office.Interop.Excel
Public Class Form1

    Public Sub SearchReferences()

        Dim objExcel As Excel.Application
        Dim objWorkbook As Excel.Workbook
        Dim objWorksheet As Excel.Worksheet
        Dim objWord As Word.Application
        Dim objWordDocumentInput As Word.Document
        Dim objWordDocumentOutput As Word.Document
        Dim strInputFileExcel As String
        Dim strInputFileWord As String
        Dim misValue As Object = System.Reflection.Missing.Value
        Dim strExcelString1 As String
        Dim strExcelString2 As String
        Dim objExcelRange1 As Excel.Range
        Dim objExcelRange2 As Excel.Range
        Dim r As Integer

        strInputFileWord = "D:\References.docx"
        strInputFileExcel = frmGetReferences.lblOutputFolder.Text & "wf_" & frmGetReferences.txtOutputFilename.Text & ".xlsx"
        objExcel = New Excel.Application
        objWorkbook = objExcel.Workbooks.Open(strInputFileExcel)
        objWorksheet = objWorkbook.Worksheets(1)
        objWord = New Word.Application
        objWordDocumentInput = objWord.Documents.Open(strInputFileWord)
        objWordDocumentOutput = objWord.Documents.Add(misValue)
        r = 2
        Do Until objWorksheet.Cells(r, 1).Value = ""
            strExcelString1 = objWorksheet.Cells(r, 1).Value
            strExcelString2 = objWorksheet.Cells(r, 2).Value
            TwoStrings(objWordDocumentInput, objWordDocumentOutput, strExcelString1, strExcelString2)
            r = r + 1
        Loop
        'objWorkbook.Save()
        objWordDocumentOutput.SaveAs(frmGetReferences.lblOutputFolder.Text & "" & frmGetReferences.txtOutputFilename.Text & ".docx")
        'objWordDocumentOutput.Save()
        objWordDocumentInput.Close()
        objWordDocumentOutput.Close()
        objWord.Quit()
        objExcel.Quit()

        'I believe that all this setting to Nothing is unnesessary. VB does it automatically.
        'objWord = Nothing
        'objWordDocumentInput = Nothing
        'objWordDocumentOutput = Nothing
        'objExcel = Nothing
        'objWorkbook = Nothing
        'objWorksheet = Nothing


    End Sub

    Sub TwoStrings(ByVal objSourceDoc As Word.Document, ByVal objDestDoc As Word.Document, ByVal strText1 As String, ByVal strText2 As String)
        Dim para As Word.Paragraph

        For Each para In objSourceDoc.Paragraphs
            If para.Range.Find.Execute(strText1) Then
                If para.Range.Find.Execute(strText2) Then
                    para.Range.Copy()
                    objDestDoc.Bookmarks("\EndOfDoc").Range.Paste()
                End If
            End If
        Next para
    End Sub


End Class

Open in new window

0
 
LVL 76

Assisted Solution

by:GrahamSkan
GrahamSkan earned 2000 total points
ID: 36996870
I have now tested it, and I think that it was too slow. Here is a faster version of the TwoStrings procedure. It searches the whole of the document for the first string.
Sub TwoStrings(ByVal objSourceDoc As Word.Document, ByVal objDestDoc As Word.Document, ByVal strText1 As String, ByVal strText2 As String)
        Dim para As Word.Paragraph
        Dim rng As Word.Range

        rng = objSourceDoc.Range
        Do While rng.Find.Execute(strText1)
            para = rng.Paragraphs.First
            If para.Range.Find.Execute(strText2) Then
                para.Range.Copy()
                objDestDoc.Bookmarks("\EndOfDoc").Range.Paste()
            End If
        Loop
    End Sub

Open in new window

0
 

Author Comment

by:G_M
ID: 36996890
It wasnt that it was too slow. you were passing the wrong value names to the TwoStrings Sub. I'm just running a test with those changes.

If you can post your code again with those values changed, I can give it the tick
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996916
I require some convincing about the wrong values being passed. The first worked slowly because it was searching each paragraph separately. The second version seems to work without changing the passed parameters.
0
 

Author Comment

by:G_M
ID: 36996942
hmmm... sorry my misunderstanding; no offense meant.

That second one was a lot faster.

there does seem to be some inconsistancy though. I have shortened my list so I got some hits and it does something strange with how it detects the first hit but then doesn't repeat the strangeness on the last entry.

You'll need to look at the files attached to get what I'm saying :o)

ch1.docx
 input.docx
 references.xlsx
0
 

Author Comment

by:G_M
ID: 36996966
Disregard that last one... it's 1am here I missed the second string not matching... Thank you for your help.

Thanks again Graham, I really appreciate you help.
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 36996981
Thanks. It's 1am here (Guildford), too
0
 

Author Comment

by:G_M
ID: 36998252
I cannot believe you bothered to help me out with this one at 1am... thank you again Graham :o)
0

Featured Post

Vote for the Most Valuable Expert

It’s time to recognize experts that go above and beyond with helpful solutions and engagement on site. Choose from the top experts in the Hall of Fame or on the right rail of your favorite topic page. Look for the blue “Nominate” button on their profile to vote.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

How to get Spreadsheet Compare 2016 working with the 64 bit version of Office 2016
Ever visit a website where you spotted a really cool looking Font, yet couldn't figure out which font family it belonged to, or how to get a copy of it for your own use? This article explains the process of doing exactly that, as well as showing how…
This Micro Tutorial demonstrates how to create Excel charts: column, area, line, bar, and scatter charts. Formatting tips are provided as well.
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.

864 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question