Mike Eghtebas

asked on

Audit words in Sample file (using vba in Access 2016)

I have the attached doc file in (C:\Book1\SampleText.docx). This sample file has 276 words. I want to...

1) read the words it contains one at a time (and eventually, in a future question, enter them in an Access table along with the data points described below),
               
2) for each word and its occurrences, I need to read:
     2a- Page Number
     2b- Paragraph Number
     2c- Line Number in page
     2d- Line Number in the paragraph
     2e- Chapter Number (given in the page headers)

I intend to use VBA in Access to perform the above tasks.

Thank you.
SampleText.docx
GrahamSkan

Capturing line and page numbers from a Word document file is not to be recommended, because they are so volatile. The Word document model has section, paragraph, sentence, word and character objects.  It does not have page or line objects for the text, because the point at which a new page or a line is created is subject to non-textual influences, including the current printer driver.
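To illustrate the point, here is a minimal sketch to run inside Word. The procedure name ShowDocumentStructure is made up for this example. It prints the counts of the collections that the document does expose, then asks for the page number and vertical position of a single word, which Word works out from the current layout.

```vba
Sub ShowDocumentStructure()
    Dim rng As Word.Range

    'Collections that the document exposes directly
    With ActiveDocument
        Debug.Print "Sections:", .Sections.Count
        Debug.Print "Paragraphs:", .Paragraphs.Count
        Debug.Print "Sentences:", .Sentences.Count
        Debug.Print "Words:", .Words.Count
        Debug.Print "Characters:", .Characters.Count
    End With

    'Page and position are not stored with the text; they are
    'calculated from the layout for one range at a time
    Set rng = ActiveDocument.Words(1)
    Debug.Print "Page:", rng.Information(wdActiveEndPageNumber)
    Debug.Print "Top (points):", rng.Information(wdVerticalPositionRelativeToPage)
End Sub
```

The last two values can change if the printer, margins or fonts change, which is why they are risky to store.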

This is a Word VBA macro that tries to capture those numbers by using the vertical position of each word.

Note: it needs augmentation to run from another application.

Also, you might not agree with the way that Word splits the text into word ranges (a paragraph mark becomes a word, and 17/09/2107 produces five words), so some detailed programming would be necessary.
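As a rough sketch of that kind of programming, the hypothetical helper below (IsCountableWord is my own name, not part of Word) accepts a token only if it contains at least one letter or digit. This is only one possible rule and assumes plain English text; you may well want something stricter.

```vba
'Hypothetical helper: True when the token holds at least one
'letter or digit, so paragraph marks and lone punctuation are skipped
Function IsCountableWord(ByVal strToken As String) As Boolean
    IsCountableWord = (strToken Like "*[A-Za-z0-9]*")
End Function
```

Under this rule a paragraph mark and the "/" separators are rejected, so the date above would still be counted as three tokens rather than one. Word ranges usually carry a trailing space, so use Trim$(rngWord.Text) before storing the word.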
ASKER CERTIFIED SOLUTION
als315

Mike Eghtebas

ASKER

Thank you very much.
Since answering a further question on this subject, I have noticed that I failed to post the code for the Word macro that I mentioned in my previous comment.
For the sake of completeness, and because the code shows how to do the job in Word alone (not involving Access), I am posting it here.
Sub CountWords()
    Dim rngWord As Word.Range
    Dim sec As Word.Section
    Dim para As Word.Paragraph
    Dim iPara As Integer
    Dim iLine As Integer
    Dim iPageLine As Integer
    Dim iParaLine As Integer
    Dim iChapter As Integer
    Dim iPage As Integer
    Dim strWord As String
    Dim strHeadingText As String
    Dim strHeadingTextParts() As String
    Dim sngWordHeight As Single
    
    iPage = 1
    For Each sec In ActiveDocument.Sections
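        strHeadingText = sec.Headers(wdHeaderFooterPrimary).Range.Text
        strHeadingTextParts = Split(strHeadingText, "Chapter")
        'Read the chapter number only if the header contains "Chapter";
        'otherwise keep the previous value rather than raise a subscript error
        If UBound(strHeadingTextParts) >= 1 Then
            iChapter = Val(Trim(strHeadingTextParts(1)))
        End If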
        For Each para In sec.Range.Paragraphs
            iParaLine = 0
            iPara = iPara + 1
            For Each rngWord In para.Range.Words
                strWord = rngWord.Text
                If rngWord.Information(wdVerticalPositionRelativeToPage) <> sngWordHeight Then
                    If rngWord.Information(wdVerticalPositionRelativeToPage) > sngWordHeight Then
                        'new line
                        iPageLine = iPageLine + 1
                        iParaLine = iParaLine + 1
                    Else
                        'new page
                        iPageLine = 1
                        iParaLine = iParaLine + 1
                        iPage = rngWord.Information(wdActiveEndPageNumber)
                    End If
                    sngWordHeight = rngWord.Information(wdVerticalPositionRelativeToPage)
                End If
                Debug.Print "wrd:" & strWord,
                Debug.Print "Pg:" & iPage,
                Debug.Print "Para:" & iPara,
                Debug.Print "LPg:" & iPageLine,
                Debug.Print "LPar:" & iParaLine,
                Debug.Print "Chap:" & iChapter
            Next rngWord
        Next para
    Next sec
End Sub
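
On the earlier note that the macro needs augmentation to run from another application: a minimal sketch of driving Word from Access VBA with late binding might look like this. The procedure name ListWordsFromAccess is made up, the path is the one given in the question, and the two constants are declared locally because the Word type library is not referenced. I have not tested it against the sample file.

```vba
Sub ListWordsFromAccess()
    'Word enumeration values, needed because late binding
    'does not load the Word constants
    Const wdActiveEndPageNumber As Long = 3
    Const wdVerticalPositionRelativeToPage As Long = 6

    Dim appWord As Object   'Word.Application
    Dim doc As Object       'Word.Document
    Dim rngWord As Object   'Word.Range

    Set appWord = CreateObject("Word.Application")
    'Path taken from the question; opened read-only
    Set doc = appWord.Documents.Open(FileName:="C:\Book1\SampleText.docx", ReadOnly:=True)

    For Each rngWord In doc.Words
        Debug.Print rngWord.Text, _
                    rngWord.Information(wdActiveEndPageNumber), _
                    rngWord.Information(wdVerticalPositionRelativeToPage)
    Next rngWord

    'Close without saving and shut down the hidden Word instance
    doc.Close SaveChanges:=False
    appWord.Quit
    Set doc = Nothing
    Set appWord = Nothing
End Sub
```

To reuse CountWords itself in this way, the references to ActiveDocument would become doc, the Word.* declarations would become Object, and wdHeaderFooterPrimary would need its own constant (its value is 1).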
