SiddharthRout

Retrieving text from Headers in MS WORD

Please find attached a sample file. I just want to extract the heading text, and the text below it, for each Heading 3.

So the Output will be like this in 3 separate message boxes

Heading3a
This is a sample line1

so the output for the above

Msgbox "Heading3a contains This is a sample line1"

Heading3b
This is a sample line4

Msgbox "Heading3b contains This is a sample line4"

Heading3c
This is a sample line6

Msgbox "Heading3c contains This is a sample line6"

Sid
 Sample.docx
rspahitz

How about putting this in the ThisDocument code area:
Sub GetText()
    Dim iSentenceCount As Integer
    Dim iSentenceCntr As Integer
    
    iSentenceCount = ActiveDocument.Sentences.Count
    For iSentenceCntr = 1 To iSentenceCount
        If ActiveDocument.Sentences(iSentenceCntr).Characters(1).Style = ActiveDocument.Styles("Heading 3") Then
            MsgBox ActiveDocument.Sentences(iSentenceCntr) & " " & ActiveDocument.Sentences(iSentenceCntr + 1)
        End If
    Next
End Sub


SiddharthRout (ASKER)

Thanks Rob.

It gives an "Object variable not set" error.

Sid
Also there might be a line or an entire paragraph below each heading.

Sid
I know how to extract the Header Text. That is not a problem. I need to get the para after heading 3.

The code I have so far is below. It gives me the text of every paragraph in the Heading 3 style.

Public Sub ExtractText()
    Dim para As Paragraph
 
    For Each para In ActiveDocument.Paragraphs
       If para.Format.Style Like "Heading [3]" Then
           Debug.Print para.Range.Text
       End If
    Next para
End Sub



Sid
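One possible way to reach the paragraph after each Heading 3 while keeping the For Each loop is the Paragraph.Next method, which returns the following paragraph, or Nothing at the end of the document. This is a minimal sketch rather than a tested solution: the name ExtractHeadingAndNext is hypothetical, and it assumes an English version of Word where the style is literally named "Heading 3".

```vba
' Sketch only: prints each Heading 3 together with the single paragraph after it.
' ExtractHeadingAndNext is a hypothetical name, not part of the original thread.
Public Sub ExtractHeadingAndNext()
    Dim para As Paragraph
    Dim nextPara As Paragraph

    For Each para In ActiveDocument.Paragraphs
        ' Style resolves to its name when compared with a string
        If para.Style = "Heading 3" Then
            ' Next returns Nothing when para is the last paragraph
            Set nextPara = para.Next
            If Not nextPara Is Nothing Then
                ' Range.Text already ends with the paragraph mark
                Debug.Print para.Range.Text & nextPara.Range.Text
            End If
        End If
    Next para
End Sub
```

This only picks up one paragraph per heading, so it does not cover the case where several paragraphs sit under a heading.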
Well, it seems that you're pretty much there. Maybe the trouble is that you're using For Each instead of a counted For loop, which lets you index the next paragraph:
Sub GetText()
    Dim iParagraphCount As Integer
    Dim iParagraphCntr As Integer
    
    iParagraphCount = ActiveDocument.Paragraphs.Count
    ' Stop one short of the end so that iParagraphCntr + 1 is always a valid index
    For iParagraphCntr = 1 To iParagraphCount - 1
        If ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
            ' Show the heading followed by the paragraph directly below it
            MsgBox ActiveDocument.Paragraphs(iParagraphCntr).Range.Text & " " & ActiveDocument.Paragraphs(iParagraphCntr + 1).Range.Text
        End If
    Next
End Sub


No. Here is a much better sample file. I need to extract the colored text.

Sid
Sample.docx
ASKER CERTIFIED SOLUTION
rspahitz

This solution is only available to members.
Almost. It gives me the text for the first Heading 3, but the other two are blank.

Sid
I see what you mean. Let me make a few checks.

Sid
Yeah, it works perfectly for me. As a test, you could try adding this after the first "If":


            MsgBox ActiveDocument.Paragraphs(iParagraphCntr).Style
sent you an email

Sid
The code was skipping headings when the sections came one right after the other. For example, if I have

Head3a
text
Head3b
text
Head3c
text
Head3d
text
Head3e
text

Then the output was

Head3a
text
Head3c
text
Head3e
text

Anyways, I changed it to

Sub GetText()
    Dim iParagraphCount As Integer, iParagraphCntr As Integer
    Dim bHeaderFound As Boolean
    Dim headerText As String
    Dim strSectionBody As String
   
    bHeaderFound = False
    iParagraphCount = ActiveDocument.Paragraphs.Count
    For iParagraphCntr = 1 To iParagraphCount
        If bHeaderFound Then
            If ActiveDocument.Paragraphs(iParagraphCntr).Style <> ActiveDocument.Styles("Normal") Then
                ' Section ended: print it, then step back one so this paragraph is re-checked as a possible Heading 3
                Debug.Print headerText & vbNewLine & strSectionBody
                bHeaderFound = False
                iParagraphCntr = iParagraphCntr - 1
            Else
                ' Still inside the section: collect the body text
                strSectionBody = strSectionBody & ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            End If
        ElseIf ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
            ' Start of a new Heading 3 section
            headerText = ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            bHeaderFound = True
            strSectionBody = ""
        End If
    Next
End Sub

Couldn't have got it without you :)

Sid
Sorry, I missed the part

    If strSectionBody <> "" Then
        MsgBox strSectionBody
    End If

at the end.

Thanks again.

Sid
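For reference, the loop and the end-of-document check above could be folded into one routine along these lines. This is a sketch under the same assumptions as the code in the thread (body text in the "Normal" style, headings in "Heading 3", English style names); the name GetHeading3Sections is hypothetical, and it uses For Each in place of the counter, so no step-back is needed.

```vba
' Sketch only: prints every Heading 3 with the Normal-style text below it.
' GetHeading3Sections is a hypothetical name, not part of the original thread.
Sub GetHeading3Sections()
    Dim para As Paragraph
    Dim headerText As String
    Dim strSectionBody As String
    Dim bHeaderFound As Boolean

    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            ' A new Heading 3 closes any section that is still open
            If bHeaderFound Then Debug.Print headerText & strSectionBody
            headerText = para.Range.Text
            strSectionBody = ""
            bHeaderFound = True
        ElseIf bHeaderFound Then
            If para.Style = "Normal" Then
                ' Still inside the section: collect the body text
                strSectionBody = strSectionBody & para.Range.Text
            Else
                ' Any other style ends the section
                Debug.Print headerText & strSectionBody
                bHeaderFound = False
            End If
        End If
    Next para

    ' Flush the last section if the document ends while one is still open
    If bHeaderFound Then Debug.Print headerText & strSectionBody
End Sub
```

Checking bHeaderFound at the end, rather than strSectionBody, is a deliberate choice here: it avoids printing a section a second time when it has already been closed by a paragraph in some other style.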
Yes, it looks like I missed the case of two consecutive Heading 3 sections. Glad you got it all. Word VBA just seems so much tougher than Excel VBA!
I was just getting bored, so I thought I'd experiment with headings. I got stuck, hence the above question :)

Sid