SiddharthRout

Retrieving text from Headers in MS WORD

Please find attached a sample file. For each paragraph in the Heading 3 style, I just want to extract the heading text and the text below it.

The output should appear in three separate message boxes, like this:

Heading3a
This is a sample line1

So the output for the above would be:

Msgbox "Heading3a contains This is a sample line1"

Heading3b
This is a sample line4

Msgbox "Heading3b contains This is a sample line4"

Heading3c
This is a sample line6

Msgbox "Heading3c contains This is a sample line6"

Sid
 Sample.docx
Microsoft Excel, Microsoft Word

Last comment: SiddharthRout, 8/22/2022 - Mon
rspahitz

How about putting this in the ThisDocument code area:
Sub GetText()
    Dim iSentenceCount As Integer
    Dim iSentenceCntr As Integer
    
    iSentenceCount = ActiveDocument.Sentences.Count
    For iSentenceCntr = 1 To iSentenceCount
        If ActiveDocument.Sentences(iSentenceCntr).Characters(1).Style = ActiveDocument.Styles("Heading 3") Then
            MsgBox ActiveDocument.Sentences(iSentenceCntr) & " " & ActiveDocument.Sentences(iSentenceCntr + 1)
        End If
    Next
End Sub


SiddharthRout

ASKER
Thanks Rob.

It gives an "Object variable not set" error.

Sid
SiddharthRout

ASKER
Also there might be a line or an entire paragraph below each heading.

Sid
SiddharthRout

ASKER
I know how to extract the heading text. That is not a problem. I need to get the paragraph after each Heading 3.

The code I have so far gives me the heading text of every paragraph in the Heading 3 style:

Public Sub ExtractText()
    Dim para As Paragraph
 
    For Each para In ActiveDocument.Paragraphs
       If para.Format.Style Like "Heading [3]" Then
           Debug.Print para.Range.Text
       End If
    Next para
End Sub



Sid
rspahitz

Well, it seems that you're pretty much there.  Maybe the problem is that you're using a For Each loop instead of a counted For loop, which lets you reach the next paragraph by index:
Sub GetText()
    Dim iParagraphCount As Integer
    Dim iParagraphCntr As Integer
    
    iParagraphCount = ActiveDocument.Paragraphs.Count
    ' Stop one short of the end: a heading in the last paragraph has nothing after it.
    For iParagraphCntr = 1 To iParagraphCount - 1
        If ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
            ' Show the heading followed by the paragraph directly below it.
            MsgBox ActiveDocument.Paragraphs(iParagraphCntr).Range.Text & " " & ActiveDocument.Paragraphs(iParagraphCntr + 1).Range.Text
        End If
    Next
End Sub


SiddharthRout

ASKER
No. Here is a much better sample file. I need to extract the colored text.

Sid
Sample.docx
ASKER CERTIFIED SOLUTION
rspahitz

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
SiddharthRout

ASKER
Almost. It gives me the text for the first Heading 3, but the other two are blank.

Sid
SiddharthRout

ASKER
I see what you mean. Let me make a few checks.

Sid
rspahitz

Yeah, it works perfectly for me. As a test, you could try adding this after the first "If" to see which style each paragraph reports:


            MsgBox ActiveDocument.Paragraphs(iParagraphCntr).Style
SiddharthRout

ASKER
Sent you an email.

Sid
SiddharthRout

ASKER
The code was skipping headings when the Heading 3 sections came one after the other. For example, if I have

Head3a
text
Head3b
text
Head3c
text
Head3d
text
Head3e
text

Then the output was

Head3a
text
Head3c
text
Head3e
text

Anyway, I changed it to

Sub GetText()
    Dim iParagraphCount As Long, iParagraphCntr As Long
    Dim bHeaderFound As Boolean
    Dim headerText As String
    Dim strSectionBody As String
   
    bHeaderFound = False
    iParagraphCount = ActiveDocument.Paragraphs.Count
    For iParagraphCntr = 1 To iParagraphCount
        If bHeaderFound Then
            If ActiveDocument.Paragraphs(iParagraphCntr).Style <> ActiveDocument.Styles("Normal") Then
                ' A non-Normal paragraph ends the current section, so output it.
                Debug.Print headerText & vbNewLine & strSectionBody
                bHeaderFound = False
                ' Step back so this paragraph is checked again as a possible Heading 3.
                iParagraphCntr = iParagraphCntr - 1
            Else
                ' Still inside the section: collect the body text.
                strSectionBody = strSectionBody & ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            End If
        ElseIf ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
            ' Start a new section at each Heading 3.
            headerText = ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            bHeaderFound = True
            strSectionBody = ""
        End If
    Next
End Sub

Couldn't have got it without you :)

Sid
SiddharthRout

ASKER
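Sorry, I missed this part, which goes at the end (after the loop) to output the last section:

    ' Only show the body if the last section was not already output inside the loop.
    If bHeaderFound And strSectionBody <> "" Then
        MsgBox strSectionBody
    End If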

Thanks again.

Sid
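
For comparison, here is a minimal sketch of the same idea that avoids stepping the loop counter back. It is only an illustration under stated assumptions, not the accepted solution (which is not shown in this thread): the macro name ShowHeading3Sections is made up, the styles are assumed to be the built-in "Heading 3" and "Normal", and the message format follows the one requested in the question. Each Heading 3 closes the previous section before starting its own, so consecutive sections are not skipped.

```vba
' Sketch only: assumes the built-in style names "Heading 3" and "Normal".
' ShowHeading3Sections is a hypothetical name, not from the thread.
Sub ShowHeading3Sections()
    Dim para As Paragraph
    Dim headerText As String
    Dim sectionBody As String
    Dim collecting As Boolean

    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            ' A new heading ends the previous section, then starts its own.
            If collecting Then MsgBox headerText & " contains " & sectionBody
            ' Drop the trailing paragraph mark from the heading text.
            headerText = Replace(para.Range.Text, vbCr, "")
            sectionBody = ""
            collecting = True
        ElseIf collecting Then
            If para.Style = "Normal" Then
                ' Body text: keep collecting.
                sectionBody = sectionBody & para.Range.Text
            Else
                ' Any other style (for example another heading level) ends the section.
                MsgBox headerText & " contains " & sectionBody
                collecting = False
            End If
        End If
    Next para

    ' The last section has no following paragraph to trigger its output.
    If collecting Then MsgBox headerText & " contains " & sectionBody
End Sub
```

Because the flag is only cleared when a section has actually been shown, the final check after the loop cannot repeat a section that was already output.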
rspahitz

Yes, it looks like I missed the part about two consecutive Heading 3 sections. Glad you got it all.  Word VBA just seems so much tougher than Excel VBA!
SiddharthRout

ASKER
I was getting bored, so I thought I'd experiment with headings. I got stuck, hence the above question :)

Sid