Solved

Retrieving text from Headers in MS WORD - Stuck in Word 2003

Posted on 2011-03-07
Last Modified: 2012-06-27
The code works perfectly in Word 2007. However, when I run the same code in Word 2003, it slows down drastically. In Word 2007, on a document with 674 pages, it took approximately 30 minutes to finish. In Word 2003 it just hung. I put a counter in the loop to check and found that in Word 2003 it starts off fairly fast but then slows down: after a few minutes it processes one line every 1 to 2 seconds, and as time goes on that grows to one line every 3 or 4 seconds.

What seems to be the problem?

Sid
Question by:SiddharthRout
 
LVL 76

Expert Comment

by:GrahamSkan
The title says that this is about Headers, and you have posted in the Excel zone.

This code deals with Headings and it's in Word.


Sub ShowHeadings()
    Dim para As Paragraph
    Dim bThree As Boolean
    Dim strMessage As String
    
    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                strMessage = strMessage & para.Range.Text
            End If
            MsgBox strMessage
            bThree = False
            strMessage = ""
        End If
        If para.Style = "Heading 3" Then
            strMessage = para.Range.Text
            bThree = True
        End If
    Next para
End Sub

 
LVL 30

Author Comment

by:SiddharthRout
Graham, thanks for the code, but it is almost the same as what Rob posted in the related thread. Please note that the above will not give me the text between two headers. If you have a look at the related thread, you will see what I mean.

Thanks once again.

Sid
 
LVL 6

Expert Comment

by:TinTombStone
Word 2003 does seem to be slower, but then you would expect a newer version to be quicker.

My results for a 600-page document with 310 "Heading 3" paragraphs:

Word 2003: 104 seconds, virtual on Windows XP
Word 2007: 42 seconds, virtual on Windows XP
Word 2010: 24 seconds, native on Windows 7

Note that Word 2003 was also running on a virtual machine.


I modified the procedure a little, but I don't think that would account for a massive time saving.

For the tests I commented out the MsgBox line and put in a counter to count the paragraphs.

But you can try it.

Sub ShowHeadings()
    Dim para As Paragraph
    Dim strMessage As String
   
    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            If para.Next(1).Range.Style = "Normal" Then
                MsgBox para.Range.Text & vbCr & para.Next(1).Range.Text
            End If
        End If
    Next para

End Sub
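
The timing setup described above (MsgBox commented out, a paragraph counter added) could look roughly like the sketch below. TimeHeadingScan and its variable names are hypothetical and not from this thread; it assumes the standard VBA Timer function, which returns seconds since midnight, so a run that crosses midnight would report a wrong duration.

```vba
' Hypothetical timing harness; not the code that produced the figures above.
Sub TimeHeadingScan()
    Dim para As Paragraph
    Dim lngParas As Long
    Dim lngHeadings As Long
    Dim sngStart As Single

    sngStart = Timer    ' seconds since midnight

    For Each para In ActiveDocument.Paragraphs
        lngParas = lngParas + 1
        If para.Style = "Heading 3" Then
            lngHeadings = lngHeadings + 1
        End If
    Next para

    ' Report to the Immediate window (Ctrl+G in the VBA editor)
    Debug.Print lngParas & " paragraphs, " & lngHeadings & _
        " Heading 3, " & Format(Timer - sngStart, "0.0") & " seconds"
End Sub
```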
 
LVL 30

Author Comment

by:SiddharthRout
Thank you, TinTombStone. No, it just gives the first heading and ignores the rest. If you check the sample file in the related thread, you can see for yourself :)

Sid
 
LVL 76

Expert Comment

by:GrahamSkan
The difference is that it uses For Each instead of paragraph indexing. This is much faster with large collections in large documents, because the next paragraph is known to start where the last one ended. With indexing, the position of the indexed paragraph has to be recalculated from the beginning of the document on every access.

It seems to work OK with your sample document.
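
To make the contrast concrete, here is a minimal sketch of the two loop styles. The asker's original loop is not shown in this thread, so CountHeadingsByIndex is only an assumption about what an indexed version might look like; both procedure names are hypothetical.

```vba
' Assumed slow pattern: every Paragraphs(i) lookup may be resolved
' by counting from the start of the document again.
Sub CountHeadingsByIndex()
    Dim i As Long
    Dim n As Long

    For i = 1 To ActiveDocument.Paragraphs.Count
        If ActiveDocument.Paragraphs(i).Style = "Heading 3" Then
            n = n + 1
        End If
    Next i
    Debug.Print "Heading 3 paragraphs (indexed): " & n
End Sub

' Faster pattern: For Each steps from one paragraph to the next.
Sub CountHeadingsForEach()
    Dim para As Paragraph
    Dim n As Long

    For Each para In ActiveDocument.Paragraphs
        If para.Style = "Heading 3" Then
            n = n + 1
        End If
    Next para
    Debug.Print "Heading 3 paragraphs (For Each): " & n
End Sub
```

Both procedures should print the same count; only the running time on a large document is expected to differ.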
 
LVL 30

Author Comment

by:SiddharthRout
Graham, here is a sample file that you can test with. I just copied some text from Wikipedia :)

Now I want to extract the Header3 text and the text under it from the doc, and leave out the Header2 text. For example:

Header3 1 Sample
From Wikipedia's newest articles:
Hikmat Abu Zayd, the first female cabinet minister in Egypt...... till the end


Header3 2 Sample
From Wikipedia's newest articles:
"Under the Horse Chestnut Tree" (1898), a drypoint and aquatint print by Mary...........till the end


Header3 3 Sample
Text is available under the Creative Commons Attribution.... Till the end


Thanks again.

Sid
Sample.doc
 
LVL 76

Expert Comment

by:GrahamSkan
Ah, you have multiple paragraphs of body text. I guess that you want all of it, up to the next paragraph of any style other than Normal. Try this:
Sub ShowHeadings()
    Dim para As Paragraph
    Dim bThree As Boolean
    Dim strMessage As String
    
    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                strMessage = strMessage & para.Range.Text
            Else
                MsgBox strMessage
                bThree = False
                strMessage = ""
            End If
        End If
        If para.Style = "Heading 3" Then
            strMessage = para.Range.Text
            bThree = True
        End If
    Next para
End Sub

 
LVL 30

Author Comment

by:SiddharthRout
Almost there.

I got this as output:

Header3 1 Sample
From Wikipedia's newest articles:
Hikmat Abu Zayd, the first female cabinet minister in Egypt...... till the end

Header3 2 Sample
From Wikipedia's newest articles:
"Under the Horse Chestnut Tree" (1898), a drypoint and aquatint print by Mary...........till the end

Still missing the

Header3 3 Sample
Text is available under the Creative Commons Attribution.... Till the end

Or am I doing something wrong?

Sid
 
LVL 6

Expert Comment

by:TinTombStone
I ran the code on your sample and got:

Header3 1 Sample
From Wikipedia's newest articles:

Header3 2 Sample
From Wikipedia's newest articles:

Header3 3 Sample
Text is available...

I don't see what the problem is.
 
LVL 76

Accepted Solution

by:
GrahamSkan earned 500 total points
It's my code that has the problem. When the loop finished, the message for the last heading was still pending and was never shown. This version shows it after the loop:
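Sub ShowHeadings()
    Dim para As Paragraph
    Dim bThree As Boolean       ' True while collecting text under a Heading 3
    Dim strMessage As String
    
    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                ' Body text: append it to the current heading's block
                strMessage = strMessage & para.Range.Text
            Else
                ' Any other style ends the block
                MsgBox strMessage
                bThree = False
                strMessage = ""
            End If
        End If
        If para.Style = "Heading 3" Then
            ' Start a new block with the heading text itself
            strMessage = para.Range.Text
            bThree = True
        End If
    Next para
    ' Show the last block if the document ended while still collecting
    If bThree Then
        MsgBox strMessage
    End If
End Sub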
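
With hundreds of Heading 3 paragraphs, one MsgBox per heading is impractical. As a hedged variation, the same loop could collect the blocks and return them to the caller. GetHeading3Blocks and PrintHeading3Blocks are hypothetical names that do not appear in this thread; the logic is otherwise the same as ShowHeadings above.

```vba
' Sketch only: same state machine as ShowHeadings, but the blocks are
' stored in a Collection instead of being shown one by one.
Function GetHeading3Blocks() As Collection
    Dim para As Paragraph
    Dim bThree As Boolean
    Dim strBlock As String
    Dim colBlocks As New Collection

    For Each para In ActiveDocument.Paragraphs
        If bThree Then
            If para.Style = "Normal" Then
                strBlock = strBlock & para.Range.Text
            Else
                colBlocks.Add strBlock
                bThree = False
                strBlock = ""
            End If
        End If
        If para.Style = "Heading 3" Then
            strBlock = para.Range.Text
            bThree = True
        End If
    Next para

    ' Flush the block that was still open at the end of the document
    If bThree Then colBlocks.Add strBlock

    Set GetHeading3Blocks = colBlocks
End Function

' Example use: write every block to the Immediate window
Sub PrintHeading3Blocks()
    Dim v As Variant
    For Each v In GetHeading3Blocks()
        Debug.Print v
    Next v
End Sub
```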

 
LVL 30

Author Comment

by:SiddharthRout
GrahamSkan: Perfect. This gives me the right results. Now let me test it in 2003 and get back to you. I am in 2007 now. Have to restart my laptop.

Sid
 
LVL 30

Author Comment

by:SiddharthRout
TinTombStone: I am sorry but I was not getting the right results with your code.

Sid
 
LVL 76

Expert Comment

by:GrahamSkan
TinTombStone,

Like my first attempt, your code only gathers the first normal paragraph after the heading. This works for Sid's first example, but his second example shows that there can be several such paragraphs before the next Heading paragraph.
 
LVL 30

Author Comment

by:SiddharthRout
Whoa!!!!

Quick Update:

It took 45 secs on a 674 page file :) Simply amazing. Just verifying the output.

Sid
 
LVL 30

Author Closing Comment

by:SiddharthRout
Rock On!

45 Seconds! You gotta be kidding me!!!! I was expecting approx 30 mins for the operation to complete. Lolzzz...

Sid
 
LVL 22

Expert Comment

by:rspahitz
Getting here a bit late but checking in to see how it goes :)