Solved

Retrieving text from Headers in MS WORD

Posted on 2011-03-07
15
303 Views
Last Modified: 2012-05-11
Please find attached a sample file. I just want to extract the header text and the text below it for Header3

So the Output will be like this in 3 separate message boxes

Heading3a
This is a sample line1

so the output for the above

Msgbox "Heading3a contains This is a sample line1"

Heading3b
This is a sample line4

Msgbox "Heading3b contains This is a sample line4"

Heading3c
This is a sample line6

Msgbox "Heading3c contains This is a sample line6"

Sid
 Sample.docx
0
Comment
Question by:SiddharthRout
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 10
  • 5
15 Comments
 
LVL 22

Expert Comment

by:rspahitz
ID: 35059603
How about putting this in the ThisDocument code area:
Sub GetText()
    Dim iSentenceCount As Integer
    Dim iSentenceCntr As Integer
    
    iSentenceCount = ActiveDocument.Sentences.Count
    For iSentenceCntr = 1 To iSentenceCount
        If ActiveDocument.Sentences(iSentenceCntr).Characters(1).Style = ActiveDocument.Styles("Heading 3") Then
                MsgBox ActiveDocument.Sentences(iSentenceCntr) & " " & ActiveDocument.Sentences(iSentenceCntr + 1)
        End If
    Next
End Sub

Open in new window

0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35059633
Thaks Rob.

It give object variable not set error.

Sid
0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35059678
Also there might be a line or an entire paragraph below each heading.

Sid
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 30

Author Comment

by:SiddharthRout
ID: 35059713
I know how to extract the Header Text. That is not a problem. I need to get the para after heading 3.

The code that I have till now is which gives me the header text of all Heading3 style.

Public Sub ExtractText()
    Dim para As Paragraph
 
    For Each para In ActiveDocument.Paragraphs
       If para.Format.Style Like "Heading [3]" Then
           Debug.Print para.Range.Text
       End If
    Next para
End Sub

Open in new window


Sid
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35059755
Well, it seems that you're pretty much there.  Maybe because you're using the for-each instead of just the for-count:
Sub GetText()
    Dim iParagraphCount As Integer
    Dim iParagraphCntr As Integer
    
    iParagraphCount = ActiveDocument.Paragraphs.Count
    For iParagraphCntr = 1 To iParagraphCount
        If ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
                MsgBox ActiveDocument.Paragraphs(iParagraphCntr).Range.Text & " " & ActiveDocument.Paragraphs(iParagraphCntr + 1).Range.Text
        End If
    Next
End Sub

Open in new window

0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35059818
No.Here is a much better sample file. I need to extract the colored text.

Sid
Sample.docx
0
 
LVL 22

Accepted Solution

by:
rspahitz earned 500 total points
ID: 35059914
I see that the colored text is set at style Heading 2.  I assume that's supposed to be "Normal"

This code seems to work for me:
 
Sub GetText()
    Dim iParagraphCount As Integer
    Dim iParagraphCntr As Integer
    Dim bHeaderFound As Boolean
    Dim strSectionBody As String
    
    bHeaderFound = False
    iParagraphCount = ActiveDocument.Paragraphs.Count
    For iParagraphCntr = 1 To iParagraphCount
        If bHeaderFound Then
            If ActiveDocument.Paragraphs(iParagraphCntr).Style <> ActiveDocument.Styles("Normal") Then
                MsgBox strSectionBody
                bHeaderFound = False
            Else
                strSectionBody = strSectionBody & ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            End If
        ElseIf ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
            bHeaderFound = True
            strSectionBody = ""
        End If
    Next
    If strSectionBody <> "" Then
        MsgBox strSectionBody
    End If
End Sub

Open in new window


Make sure you save this as a macro-enabled document (docm)
0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35059930
Almost. It gives me the text for the first header 3 but the rest 2 are blank.

Sid
0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35059941
I see what you mean. Let me make few checks.

Sid
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35059954
yeah, it works perfectly for me...for a test, you could try adding this after the first "IF"


            MsgBox ActiveDocument.Paragraphs(iParagraphCntr).Style
0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35060101
sent you an email

Sid
0
 
LVL 30

Author Closing Comment

by:SiddharthRout
ID: 35060450
The code was skipping Headers if They were one after the other for example if I have

Head3a
text
Head3b
text
Head3c
text
Head3d
text
Head3e
text

Then the output was

Head3a
text
Head3c
text
Head3e
text

Anyways, I changed it to

Sub GetText()
    Dim iParagraphCount As Integer, iParagraphCntr As Integer
    Dim bHeaderFound As Boolean
    Dim strSectionBody As String
   
    bHeaderFound = False
    iParagraphCount = ActiveDocument.Paragraphs.Count
    For iParagraphCntr = 1 To iParagraphCount
        If bHeaderFound Then
            If ActiveDocument.Paragraphs(iParagraphCntr).Style <> ActiveDocument.Styles("Normal") Then
                Debug.Print headerText & vbNewLine & strSectionBody
                bHeaderFound = False
                iParagraphCntr = iParagraphCntr - 1
            Else
                strSectionBody = strSectionBody & ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            End If
        ElseIf ActiveDocument.Paragraphs(iParagraphCntr).Style = ActiveDocument.Styles("Heading 3") Then
            headerText = ActiveDocument.Paragraphs(iParagraphCntr).Range.Text
            bHeaderFound = True
            strSectionBody = ""
        End If
    Next
End Sub

Couldn't have got it without you :)

Sid
0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35060742
Sorry, I missed the part

    If strSectionBody <> "" Then
        MsgBox strSectionBody
    End If

at the end.

Thanks again.

Sid
0
 
LVL 22

Expert Comment

by:rspahitz
ID: 35060858
yes, it looked like I missed the part about two consecutive header 3 sections ... glad you got it all.  Word VBA just seems so much tougher than Excel VBA !
0
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35061053
I was just getting bored so I thought of experimenting with headers and I got stuck and hence the above question :)

Sid
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article will guide you to convert a grid from a picture into Excel format using Microsoft OneNote and no other 3rd party application.
Do you use a spreadsheet like Microsoft's Excel?  Have you ever wanted to link out to a non excel file on your computer or network drive?  This is the way I found to do it!
This Micro Tutorial will demonstrate in Microsoft Excel how to add style and sexy appeal to horizontal bar charts.
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…

687 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question