Solved

Retrieving text from Headers in MS WORD - Stuck in Word 2003

Posted on 2011-03-07
Last Modified: 2012-06-27
The code works perfectly in Word 2007. However, when I run the same code in Word 2003 it slows down drastically. In Word 2007, on a document with 674 pages, it took approx. 30 mins to finish. In Word 2003 it just hung. I put a counter in the loop to check and found that in Word 2003 it starts off pretty fast but then slows down: after a few minutes it is processing 1 line every 1 to 2 seconds, and as time goes on that drops to 1 line every 3 or 4 seconds.

What seems to be the problem?

Sid
Question by:SiddharthRout
16 Comments
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 35068032
The title says that this is about Headers, and you have posted in the Excel zone.

This code deals with Headings and it's in Word.


Sub ShowHeadings()
    Dim para As Paragraph
    Dim bThree As Boolean
    Dim strMessage As String
    
    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                strMessage = strMessage & para.Range.Text
            End If
            MsgBox strMessage
            bThree = False
            strMessage = ""
        End If
        If para.Style = "Heading 3" Then
            strMessage = para.Range.Text
            bThree = True
        End If
    Next para
End Sub

 
LVL 30

Author Comment

by:SiddharthRout
ID: 35068270
Graham, thanks for the code. But it is almost the same code as what Rob posted in the related thread. Please note that the above will not give me the text between 2 headers. If you have a look at the related thread, you will see what I mean.

Thanks once again.

Sid
 
LVL 6

Expert Comment

by:TinTombStone
ID: 35068405
Word 2003 does seem to be slower, but then you would expect a newer version to be quicker.

My results for a 600-page document with 310 "Heading 3" paragraphs:

Word 2003: 104 seconds, virtual on Windows XP
Word 2007: 42 seconds, virtual on Windows XP
Word 2010: 24 seconds, native on Windows 7

Word 2003 was also running on a Virtual machine


I modified the procedure a little, but I don't think that would account for a massive time saving.

For the tests I commented out the MsgBox line and put in a counter to count the paragraphs

But you can try it.

Sub ShowHeadings()
    Dim para As Paragraph
    Dim strMessage As String
   
    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            If para.Next(1).Range.Style = "Normal" Then
                MsgBox para.Range.Text & vbCr & para.Next(1).Range.Text
            End If
        End If
    Next para

End Sub
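
For anyone who wants to repeat this kind of timing, here is a minimal sketch of a test along the lines described above: no MsgBox, a paragraph counter, and a stopwatch. The procedure name TimeHeadingScan and the variable names are made up for illustration, and this is not the exact code behind the figures above.

```vba
' Hypothetical timing harness: counts paragraphs and "Heading 3" paragraphs
' in the active document and reports the elapsed time.
Sub TimeHeadingScan()
    Dim para As Paragraph
    Dim lngParas As Long
    Dim lngHeadings As Long
    Dim sngStart As Single

    sngStart = Timer    ' seconds since midnight

    For Each para In ActiveDocument.Paragraphs
        lngParas = lngParas + 1
        If para.Style = "Heading 3" Then
            lngHeadings = lngHeadings + 1
        End If
    Next para

    ' Output goes to the Immediate window (Ctrl+G in the VBA editor)
    Debug.Print lngParas & " paragraphs, " & lngHeadings & _
        " Heading 3 paragraphs, " & _
        Format(Timer - sngStart, "0.0") & " seconds"
End Sub
```

Timer resets at midnight, so a run that crosses midnight will report a negative time.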

 
LVL 30

Author Comment

by:SiddharthRout
ID: 35068432
Thank you, TinTombStone. Nope, it just gives the first heading and ignores the rest. If you check the sample file in the related thread, you can see for yourself :)

Sid
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 35068979
The difference is that it uses For Each instead of paragraph indexing. This is much faster with large collections in large documents because the next paragraph is known to start where the last one ended. With indexing, the position of the indexed paragraph has to be recalculated from the beginning of the document on every access.

It seems to work OK with your sample document.
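
To make the contrast concrete, here is a rough sketch of the two loop styles side by side. The original slow code is in the related thread and is not reproduced here, so the indexed version is only an assumption about its general shape; both procedure names are made up for illustration.

```vba
' Assumed shape of the slow pattern: each Paragraphs(i) lookup makes Word
' locate paragraph i again, so the loop gets slower as i grows.
Sub CountHeadingsByIndex()
    Dim i As Long
    Dim lngCount As Long

    For i = 1 To ActiveDocument.Paragraphs.Count
        If ActiveDocument.Paragraphs(i).Style = "Heading 3" Then
            lngCount = lngCount + 1
        End If
    Next i
    Debug.Print "Indexed loop: " & lngCount & " Heading 3 paragraphs"
End Sub

' Faster pattern: For Each steps from one paragraph straight to the next.
Sub CountHeadingsForEach()
    Dim para As Paragraph
    Dim lngCount As Long

    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            lngCount = lngCount + 1
        End If
    Next para
    Debug.Print "For Each loop: " & lngCount & " Heading 3 paragraphs"
End Sub
```

Both procedures should print the same count; only the time taken on a large document differs.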
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35069063
Graham. Here is a sample file that you can test with. I just copied some text from wikipedia :)

Now I want to extract the Header3 text and the subtext from the doc and leave out the Header2 text. For example:

Header3 1 Sample
From Wikipedia's newest articles:
Hikmat Abu Zayd, the first female cabinet minister in Egypt...... till the end


Header3 2 Sample
From Wikipedia's newest articles:
"Under the Horse Chestnut Tree" (1898), a drypoint and aquatint print by Mary...........till the end


Header3 3 Sample
Text is available under the Creative Commons Attribution.... Till the end


Thanks again.

Sid
Sample.doc
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 35069486
Ah, you have multiple paragraphs of body text. I guess that you want all of it, up to the next paragraph of any style other than Normal. Try this:
Sub ShowHeadings()
    Dim para As Paragraph
    Dim bThree As Boolean
    Dim strMessage As String
    
    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                strMessage = strMessage & para.Range.Text
            Else
                MsgBox strMessage
                bThree = False
                strMessage = ""
            End If
        End If
        If para.Style = "Heading 3" Then
            strMessage = para.Range.Text
            bThree = True
        End If
    Next para
End Sub

 
LVL 30

Author Comment

by:SiddharthRout
ID: 35069523
Almost close.

I got this as output

Header3 1 Sample
From Wikipedia's newest articles:
Hikmat Abu Zayd, the first female cabinet minister in Egypt...... till the end

Header3 2 Sample
From Wikipedia's newest articles:
"Under the Horse Chestnut Tree" (1898), a drypoint and aquatint print by Mary...........till the end

Still missing the

Header3 3 Sample
Text is available under the Creative Commons Attribution.... Till the end

Or am I doing something wrong?

Sid
 
LVL 6

Expert Comment

by:TinTombStone
ID: 35069745
Ran the code on your sample and get

Header3 1 Sample
From Wikipedia's newest articles:

Header3 2 Sample
From Wikipedia's newest articles:

Header3 3 Sample
Text is available...

I don't see what the problem is.
 
LVL 76

Accepted Solution

by:
GrahamSkan earned 500 total points
ID: 35069803
It's my code that has the problem. A message was still pending when the loop ended, so the last section was never shown:
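Sub ShowHeadings()
    Dim para As Paragraph
    Dim bThree As Boolean       ' True while collecting text under a Heading 3
    Dim strMessage As String    ' Heading text plus the Normal paragraphs after it
    
    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                ' Still inside the section: append the body text
                strMessage = strMessage & para.Range.Text
            Else
                ' Any other style ends the section: show it and reset
                MsgBox strMessage
                bThree = False
                strMessage = ""
            End If
        End If
        If para.Style = "Heading 3" Then
            ' Start a new section with the heading text
            strMessage = para.Range.Text
            bThree = True
        End If
    Next para
    ' Show the last section if the document ended while still collecting
    If bThree Then
        MsgBox strMessage
    End If
End Sub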
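
On a document with hundreds of headings, one MsgBox per section is impractical. Here is a possible variant, offered only as a sketch: it uses the same loop but stores each section in a Collection and then writes them all to a new document. The procedure name CollectHeadingSections, the variable names, and the choice of a new document as the output target are assumptions for illustration, not part of the accepted solution.

```vba
' Hypothetical variant: gathers each "Heading 3" section into a Collection
' instead of showing a MsgBox, then writes the sections to a new document.
Sub CollectHeadingSections()
    Dim para As Paragraph
    Dim bThree As Boolean
    Dim strSection As String
    Dim colSections As New Collection
    Dim docOut As Document
    Dim vItem As Variant

    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                strSection = strSection & para.Range.Text
            Else
                colSections.Add strSection
                bThree = False
                strSection = ""
            End If
        End If
        If para.Style = "Heading 3" Then
            strSection = para.Range.Text
            bThree = True
        End If
    Next para
    ' Keep the last section if the document ended while still collecting
    If bThree Then colSections.Add strSection

    ' Created only after the scan, so ActiveDocument above is the source
    Set docOut = Documents.Add
    For Each vItem In colSections
        docOut.Range.InsertAfter vItem & vbCr
    Next vItem
End Sub
```

The text arrives in the new document in whatever style that document applies by default; the original heading formatting is not carried over.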

 
LVL 30

Author Comment

by:SiddharthRout
ID: 35069846
GrahamSkan: Perfect. This gives me the right results. Now let me test it in 2003 and get back to you. I am in 2007 now. Have to restart my laptop.

Sid
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35069868
TinTombStone: I am sorry but I was not getting the right results with your code.

Sid
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 35069926
TinTombStone,

Like my first attempt, your code only gathers the first normal paragraph after the heading. This works for Sid's first example, but his second example shows that there can be several such paragraphs before the next Heading paragraph.
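
For illustration, the Next-based approach could be extended to gather every Normal paragraph after a heading, roughly as below. This is a sketch and not code posted by either expert: the procedure name ShowHeadingsUsingNext and the variable paraBody are made up, and it relies on the assumption that Paragraph.Next returns Nothing after the last paragraph of the document.

```vba
' Hypothetical extension of the Next-based approach: walks forward from each
' "Heading 3" and appends every following "Normal" paragraph.
Sub ShowHeadingsUsingNext()
    Dim para As Paragraph
    Dim paraBody As Paragraph
    Dim strMessage As String

    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            strMessage = para.Range.Text
            Set paraBody = para.Next
            ' Stop at the end of the document (assumed: Next gives Nothing)
            Do While Not paraBody Is Nothing
                ' Stop at the first paragraph that is not body text
                If paraBody.Style <> "Normal" Then Exit Do
                strMessage = strMessage & paraBody.Range.Text
                Set paraBody = paraBody.Next
            Loop
            MsgBox strMessage
        End If
    Next para
End Sub
```

This visits the body paragraphs twice (once in the inner loop and once in the outer For Each), so the single-pass flag approach is likely the better choice for very large documents.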
 
LVL 30

Author Comment

by:SiddharthRout
ID: 35070128
Whoa!!!!

Quick Update:

It took 45 secs on a 674 page file :) Simply amazing. Just verifying the output.

Sid
 
LVL 30

Author Closing Comment

by:SiddharthRout
ID: 35070232
Rock On!

45 Seconds! You gotta be kidding me!!!! I was expecting approx 30 mins for the operation to complete. Lolzzz...

Sid
 
LVL 22

Expert Comment

by:rspahitz
ID: 35070269
Getting here a bit late but checking in to see how it goes :)
