Solved

How to access the REF Id of a heading in Word via VBA/Perl?

Posted on 2010-09-18
Last Modified: 2013-11-25
Hi,

I'm trying to write a parser in Perl to parse a Microsoft Word document. I need to retrieve the REF Id of the headings and captions in my document but have had no success. I can get the REF Id from the cross-references to these headings and captions, but that only gets me halfway there. To complete the link I also need to know where these REF Ids live in the document, namely the locations of the headings and captions themselves. Here's how I access the REF Id from a cross-reference in the document:

use Win32::OLE::Enum;
use Win32::OLE::Const 'Microsoft Word';    # provides wdFieldRef

# $document is the already-opened Word Document object
my $fieldsCollection = $document->Fields();
my $enumerate = Win32::OLE::Enum->new($fieldsCollection);
while (defined(my $field = $enumerate->Next())) {
    if ($field->Type == wdFieldRef) {
        # a cross-reference
        if ($field->Code->Text =~ /REF _Ref/) {
            my $fieldCode = $field->Code->Text;
            my $fieldText = $field->Result->Text;
            my $startAt = $field->Result->Start;
            my $endAt = $field->Result->End;
            # logMsg and TRC_DEBUG are my own logging helpers
            logMsg(TRC_DEBUG, "$fieldCode, $fieldText, $startAt, $endAt");
        }
    }
}

I can look up the lists of styles, headings, bookmarks, field codes and paragraph objects defined in the document, but I cannot see how to get access to this REF Id from Word's Object Model.


Hoping someone can help.
Question by:tricass

Expert Comment

by:GrahamSkan
ID: 33710703
That's the first Perl code that I have seen, so I won't try to give you a code sample.

The REF Id is a hidden Bookmark. It is created automatically when the user inserts a cross reference. Hidden bookmark names begin with an underscore and are not normally shown in the Bookmarks dialogue.

Bookmarks have a range. In the case of a numbered paragraph (heading) the bookmark's range includes the whole of the paragraph text.

Author Comment

by:tricass
ID: 33711327
Hi GrahamSkan,

Thanks for the clue. I thought it might be a bookmark, but when I retrieve the list of Bookmarks from the document I get an empty collection. Sorry about the Perl code :). However, if you don't mind providing an example in VB code, I might be able to translate it into a Perl equivalent. In fact, that's how I got started with this parsing work, as most of the material on the web uses VB. Here's the Perl code I used to retrieve the list of Bookmarks.

my $bookmarksCollection = $document->Bookmarks();
my $enumerate = Win32::OLE::Enum->new($bookmarksCollection);
while (defined(my $bookmark = $enumerate->Next())) {
    my $bookmarkName = $bookmark->Name;
    my $bookmarkStart = $bookmark->Start;
    my $bookmarkEnd = $bookmark->End;
    my $startAt = $bookmark->Range->Start;
    my $endAt = $bookmark->Range->End;

    logMsg(TRC_DEBUG, "$bookmarkName, $bookmarkStart, $bookmarkEnd, $startAt, $endAt");
}

Looking at the Word Object Model Reference information I could not find a method that would allow me to search for a matching bookmark based on the _Ref code defined in a cross reference.

tricass.

Accepted Solution

by:
GrahamSkan earned 500 total points
ID: 33711508
Well, I am surprised. It seems that the collection does not include hidden bookmarks in its count, so they cannot be returned by index number, but a hidden bookmark can still be found by name.

I have just created a document with a numbered heading and inserted a cross-reference field to it. I ran the VBA macro code below to demonstrate that effect.
Sub FindXReference()
    Dim fld As Field
    Dim bmk As Bookmark
    Dim strFieldText As String
    Dim strBookMarkName As String
    Dim strReferencedText As String
    Dim strWords() As String
    Dim i As Integer

    'Hidden bookmarks are neither counted nor enumerated here
    MsgBox "BookMark Count:" & ActiveDocument.Bookmarks.Count 'returns 0
    For Each bmk In ActiveDocument.Bookmarks
        MsgBox "Bookmark: " & bmk.Name & ", " & bmk.Range.Text 'No message displayed
    Next bmk

    For Each fld In ActiveDocument.Fields
        If fld.Type = wdFieldRef Then
            strFieldText = fld.Code.Text
            MsgBox "Field code: " & strFieldText
            strWords = Split(strFieldText, " ")
            'REF is the default field type, so the word REF could be omitted;
            'check every word instead of assuming a fixed position

            strBookMarkName = ""
            For i = 0 To UBound(strWords)
                If Left$(strWords(i), 4) = "_Ref" Then 'Hidden bookmark in XRef format
                    strBookMarkName = strWords(i)
                    Exit For
                End If
            Next i

            If strBookMarkName <> "" Then
                MsgBox "Bookmark: " & strBookMarkName
                'Lookup by name works even though the bookmark is hidden
                strReferencedText = ActiveDocument.Bookmarks(strBookMarkName).Range.Text
                MsgBox "Referenced Text: " & strReferencedText
            End If
        End If
    Next fld
End Sub
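
For anyone working from Perl, as in the question, the same lookup by name might be sketched roughly as below. This is only a sketch under assumptions: $document is taken to be the Win32::OLE Document object from the question, and ref_bookmark_name and referenced_range are hypothetical helper names, not part of Word's object model or of Win32::OLE.

```perl
use strict;
use warnings;

# Pull the hidden bookmark name (e.g. "_Ref272487234") out of a REF field
# code such as ' REF _Ref272487234 \h '. Returns undef when none is present.
sub ref_bookmark_name {
    my ($field_code) = @_;
    my ($name) = $field_code =~ /\b(_Ref\w+)/;
    return $name;
}

# Fetch the bookmark by name instead of enumerating the collection, and
# return its Range object (undef if there is no matching bookmark).
# $document is assumed to be a Win32::OLE Word Document object.
sub referenced_range {
    my ($document, $field_code) = @_;
    my $name = ref_bookmark_name($field_code);
    return unless defined $name;
    my $bookmark = $document->Bookmarks->Item($name);
    return unless defined $bookmark;
    return $bookmark->Range;
}
```

Called as referenced_range($document, $field->Code->Text) inside the field loop from the question, the returned range's Start and End give the location of the heading or caption, which is the other half of the link.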


Author Closing Comment

by:tricass
ID: 33714782
Exactly what I was after. Thanks!

I have to say the documentation at (http://msdn.microsoft.com/en-us/library/bb214817(v=office.12).aspx) does not indicate that you can use the Bookmarks method in such a manner! I guess it's an undocumented feature :)

Expert Comment

by:GrahamSkan
ID: 33714899
I thought that I had already set the option to show hidden bookmarks in the dialogue, so I didn't mention it as it seemed to be ineffective. However, that option does make the hidden bookmarks fully visible programmatically.

You can set it in code:

    ActiveDocument.Bookmarks.ShowHidden = True
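
In Perl the same switch could be applied before walking the collection, roughly as sketched here. The assumptions: $document is a Win32::OLE Word Document object, and list_ref_bookmarks is a hypothetical helper name. Win32::OLE sets a property through hash syntax, and the collection is walked by index with Count and Item.

```perl
use strict;
use warnings;

# Return [name, start, end] for each hidden cross-reference bookmark.
# $document is assumed to be a Win32::OLE Word Document object.
sub list_ref_bookmarks {
    my ($document) = @_;
    my $bookmarks = $document->Bookmarks;
    $bookmarks->{ShowHidden} = 1;    # property set, Win32::OLE style

    my @found;
    for my $i (1 .. $bookmarks->Count) {    # Word collections are 1-based
        my $bookmark = $bookmarks->Item($i);
        my $name     = $bookmark->Name;
        next unless $name =~ /^_Ref/;       # keep only cross-reference targets
        push @found, [ $name, $bookmark->Start, $bookmark->End ];
    }
    return @found;
}
```

This gives the full list of REF Ids with their positions in one pass, which can then be matched against the names found in the cross-reference field codes.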



Author Comment

by:tricass
ID: 33715008
Just tested it and it works! So I now have two approaches to choose from.

Thanks for the follow-up.

tricass