joshkrak

.net MS Word Interop - Find text and return a Word.Range

What I'm trying to do is take identifiers that I type into a Word document and set bookmarks on them.

Say I have a Word document with the following text:

"The Raven" is a narrative poem by the American <Goody> Edgar Allan Poe, first published in January 1845. It is noted for its musicality, <HereIAm> language, and supernatural atmosphere. It tells of a talking raven's mysterious visit to a distraught lover, <AnotherWord> the man's slow descent into madness. The lover, often identified as being a student,[1][2] is lamenting the loss of his love, Lenore. The raven, sitting on a bust of Pallas, seems to further instigate his distress with its constant repetition of the word, "Nevermore". Throughout the poem, Poe makes allusions to folklore and various classical works.

I would like to use something like a regex to find strings formatted like <[SomeText]> and then create a bookmark over each <[SomeText]>. For example, in the above paragraph I would create bookmarks for <AnotherWord>, <Goody>, and <HereIAm>.

To do this using Word, I would highlight the <[SomeText]> and then go to Insert->Bookmark.

I believe I can figure out how to add the bookmark as long as I can get a Range object that covers the text range of the <[SomeText]> identifier. I've tried running a regex on Word.Content.Text using Text.RegularExpressions.Regex, and I can find the matches, but when I try to create a range from the starting index of the match and the index of its last character, it's not selecting the right text.

Another problem I noticed is that Word.Content.Text does not return text that is inside a header or footer. I will also need code that can find these identifiers within headers and footers.

Thanks
GrahamSkan

Word has its own Find method, which can use wildcards similar to regular expressions.

It is a method of a Range (or Selection) object and will set the object to the found range.

This is a VBA snippet to set a series of bookmarks. It accepts text between the <> signs provided that it comprises only letters or space characters.
    Dim doc As Word.Document
    Dim rng As Word.Range
    '...
    Set rng = doc.Range.Duplicate
    Dim i As Integer
    With rng.Find
        .Text = "\<[A-Za-z ]{1,}\>"
        .MatchWildcards = True
        Do While .Execute()
            i = i + 1
            doc.Bookmarks.Add "bmk" & i, rng
            rng.Collapse wdCollapseEnd
            rng.End = doc.Range.End
        Loop
    End With


joshkrak

ASKER

Thank you for the help. Your solution does work for the most part, but there are some problems.

First, just so I'm clear, I'm not working inside the Word app, so it's not just a macro I'm looking for. However, the interop DLL supplied by Microsoft to automate Word in .NET uses the same basic syntax and object naming, so I was easily able to convert your solution.

The problems I still have are:

1) The loop never advances. It just keeps returning the first identifier and loops indefinitely. I did get around this by changing the text in the found range to something that would not match the pattern, so it's not that big of a deal, but I just wanted to let you know.

2) This method does not find identifiers that are within headers or footers. Go to View->Header and Footer, type some text and some identifiers in there, and try your script again. I MUST have it able to find and bookmark within the headers and footers as well.


All in all though, I'm much further than I could have ever gotten on my own, so thank you for that. Just try to take a stab at the Header/Footer problem and we'll call it done.

-Josh
GrahamSkan

I do realise that you are not writing a Word macro. I don't have the application that you are using, or the experience to write the precise code that will work in your scenario, but I hope to provide enough for you to see how it is done and to apply it in your situation.

Given that, I am not sure why it is not working as designed. The intention is that, once an instance that matches the criteria has been found, the next search looks for another occurrence in the rest of the main part of the document.
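
A defensive way to write that loop, in case the search keeps returning the same match, might look like the sketch below. It is only an illustration: the procedure name BookmarkIdentifiers and the lastStart guard are assumptions added here, and it does not explain why the converted .NET code failed to advance. It turns off wrapping with .Wrap = wdFindStop and leaves the loop if a match does not start after the previous one.

```vba
' Minimal sketch, assuming it runs inside Word VBA (or with a reference
' to the Word object library so the wd* constants resolve).
' BookmarkIdentifiers and lastStart are hypothetical names.
Sub BookmarkIdentifiers(doc As Word.Document)
    Dim rng As Word.Range
    Dim lastStart As Long
    Dim i As Long

    Set rng = doc.Content          ' main text story only
    lastStart = -1                 ' no match seen yet

    With rng.Find
        .ClearFormatting
        .Text = "\<[A-Za-z ]{1,}\>"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop         ' do not wrap back to the start
        Do While .Execute
            ' Guard: stop if the search did not move forward.
            If rng.Start <= lastStart Then Exit Do
            lastStart = rng.Start

            i = i + 1
            doc.Bookmarks.Add "bmk" & i, rng

            ' Continue from the end of this match to the end of the text.
            rng.Collapse wdCollapseEnd
            rng.End = doc.Content.End
        Loop
    End With
End Sub
```

Calling BookmarkIdentifiers ActiveDocument from a macro would exercise it. In .NET the same properties exist on the interop Find object, so the guard should translate directly.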
Headers and footers are not part of the main document range. They are displayed and, more importantly, printed around each page to which they apply.

Word has several separate range types, called stories, and it is possible to step through the first of each:

    Dim stry As Word.Range
    For Each stry In wdDoc.StoryRanges
        '... run the Find loop on stry here
    Next stry
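
To show how the story loop and the Find loop could fit together, here is a minimal sketch. The names BookmarkAllStories and BookmarkInRange are hypothetical, and the pattern is the one used elsewhere in this thread. Because the For Each only yields the first range of each story type, the sketch also follows NextStoryRange so that further headers and footers of the same type (for example in later sections) are searched too.

```vba
' Minimal sketch, assuming Word VBA. Both procedure names are hypothetical.
Sub BookmarkAllStories(doc As Word.Document)
    Dim stry As Word.Range
    Dim rng As Word.Range
    Dim n As Long

    For Each stry In doc.StoryRanges
        Set rng = stry
        ' Walk every range of this story type, not just the first.
        Do While Not rng Is Nothing
            n = BookmarkInRange(doc, rng.Duplicate, n)
            Set rng = rng.NextStoryRange
        Loop
    Next stry
End Sub

' Bookmarks each <Identifier> in rng; returns the running bookmark count.
Private Function BookmarkInRange(doc As Word.Document, rng As Word.Range, _
                                 ByVal n As Long) As Long
    Dim storyEnd As Long
    storyEnd = rng.End             ' end of this story, not of the main text

    With rng.Find
        .ClearFormatting
        .Text = "\<[A-Za-z0-9_]{1,}\>"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        Do While .Execute
            n = n + 1
            doc.Bookmarks.Add "bmk" & n, rng
            rng.Collapse wdCollapseEnd
            rng.End = storyEnd
        Loop
    End With
    BookmarkInRange = n
End Function
```

Resetting the range end to the story's own end, rather than to the end of the main document, keeps each search inside the story it started in.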
 
 
ASKER CERTIFIED SOLUTION
GrahamSkan

joshkrak

Lol, np. As soon as you said stories, I was able to play around and get it working before your next post. My final solution in VB.NET is shown below.

Thanks again, GrahamSkan! Wish I could give ya more than 500.
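    wordApp = New Word.ApplicationClass
    ' The first argument of Documents.Add is the template, so this creates
    ' a new document based on the file at FullFileName.
    wordDoc = wordApp.Documents.Add(DirectCast(FullFileName, Object))
    ' Stories to search: first-page header, main text, first-page footer.
    Dim SearchStorys() As Word.WdStoryType = {Word.WdStoryType.wdFirstPageHeaderStory, Word.WdStoryType.wdMainTextStory, Word.WdStoryType.wdFirstPageFooterStory}
    For Each Story As Word.WdStoryType In SearchStorys
        Dim Range As Word.Range = wordDoc.StoryRanges.Item(Story)
        Dim i As Integer = 0
        Range.Find.Text = "\<[A-Za-z0-9_]{1,}\>"
        Range.Find.MatchWildcards = True
        While Range.Find.Execute
            ' Strip the angle brackets to get the bookmark name.
            Dim BmText As String = Range.Text.Trim(New Char() {"<"c, ">"c})
            ' Rewrite the match so it no longer fits the pattern;
            ' this is the workaround that makes the loop advance.
            Range.Text = "<!" & BmText & "!>"
            ' Al is declared elsewhere and holds the names already
            ' bookmarked, so each name is bookmarked only once.
            If Not Al.Contains(BmText) Then
                Dim objRange As Object = Range
                wordDoc.Bookmarks.Add(BmText, objRange)
                Al.Add(BmText)
            End If
            ' Continue searching after the current match.
            Range.Collapse(Word.WdCollapseDirection.wdCollapseEnd)
            Range.End = wordDoc.Range.End
        End While
    Next
    Dim objFullFileName As Object = FullFileName
    wordDoc.SaveAs(objFullFileName)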


Excellent expert advice! He definitely knows his shit!
That's excellent news. Thank you.