VBA Code to extract a table that consists the heading "Critical Controls" from multiple word documents into Excel

HI I need your help as the below code doesn't work.  Thanks

Sub wordScrape()

Dim wrdDoc As Object, objFiles As Object, fso As Object, wordApp As Object
Dim sh1 As Worksheet
Dim x As Integer

FolderName = "C:\code" ' Change this to the folder containing your word documents

Set sh1 = ThisWorkbook.Sheets(1)
Set fso = CreateObject("Scripting.FileSystemObject")
Set wordApp = CreateObject("Word.application")
Set objFiles = fso.GetFolder(FolderName).Files

x = 1
For Each wd In objFiles
    If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then
        Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True)
        sh1.Cells(x, 1) = wd.Name
        sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(3).Cell(Row:=3, Column:=2).Range)
        'sh1.Cells(x, 3) = ....more extracted data....
        x = x + 1
    wrdDoc.Close
    End If

Next wd
wordApp.Quit
End Sub

Open in new window

Natashar7Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

GrahamSkanRetiredCommented:
What happens when you try it?
I declared a couple of undeclared variables because I have Option Explicit set, and it seemed to work.
Natashar7Author Commented:
Where do I indicate in the code to extract the table that has the heading critical controls.  If u can please show me where I indicate the folder to access the word documents and where to include the excel path.  Thanks.
Natashar7Author Commented:
Can I email u the word doc that has the table so u can see ?  

I need a vba code that will extract the tables and put Into an excel worksheet.
Starting with Angular 5

Learn the essential features and functions of the popular JavaScript framework for building mobile, desktop and web applications.

GrahamSkanRetiredCommented:
It would certainly help if we had a sample document to work with.

Make sure that it doesn't have any confidential data and attach it:
Attach File>Browse> Upload file
Natashar7Author Commented:
I am getting an error. Compile error variable not defined.  For folder name = I put the path to the word doc folder. Eg c:
Natashar7Author Commented:
Here is the word doc file what it looks like.  Thanks
GrahamSkanRetiredCommented:
Sorry for the delay.
I added this line to overcome that error:
Dim FolderName As String

Open in new window


Actually you didn't succeed in attaching any files.
Natashar7Author Commented:
Thank you for taking the time to answer. Where do I indicate for it to search only table headings that consist of "Critical Controls" heading?  Did you get the attachment I sent you?
Natashar7Author Commented:
confused do I paste the code in excel or word developer tab?
Natashar7Author Commented:
still getting an error it says variable not defined.  

For Each wd In ObjFiles

please help
Natashar7Author Commented:
i still have not heard back and got a response to my question?
GrahamSkanRetiredCommented:
As I indicated in comment 40950752, the attachement is not there.
GrahamSkanRetiredCommented:
You need to declare the wd variable:

Dim wd As Object

It would be easier if you could post a sample input document.
GrahamSkanRetiredCommented:
You have used the EE message system to send me two messages. Since they do not appear to be private, I will explain here that the first asks a supplementary question, while the second seems to be another attempt to attach a document.
 
The supplementary question is
"Can u pls edit my code and tell me where I should indicate the folder path for the word and excel?  Also the word docs are tables it has to extract the heading that says critical controls on to excel.  I will send u a snapshot later today "

For that supplementary, you have used the variable 'FolderName' to hold the path for the word documents. For the Excel path, you seem to be working with an active worksheet, so its path would be that of the ActiveWorkbook
 
There was (still) no attachment to illustrate exactly what the document looks like, especially what is meant by the table heading. It could be that there is a paragraph with a heading style just before the table. Or it could refer to one of the cells in the first row.

In the morning (GMT + 1) I will try to create some code that assumes the latter.
Natashar7Author Commented:
how do I send you a private message? please explain as I am new to this site thanks
GrahamSkanRetiredCommented:
Sorry. I have confused me. You did send me private messages, but you asked me technical questions that didn't have to be private and could have been asked in this thread.
GrahamSkanRetiredCommented:
Here is some code that should do the job. It doesn't use FileSystemObject, but relies in the intrinsic Dir()  function instead. Don't forget to set the reference to the Word object library or you will get a 'User-defined tyoe not defined' error.
Sub wordScrape()
'Early binding for so set a reference (Tools/References) to the Microsoft Word Object LIbrary
Dim wrdDoc As Word.Document, wordApp As Word.Application
Dim sh1 As Worksheet
Dim bWordAppStartedHere As Boolean
Dim x As Integer
Dim strFolderName As String
Dim strFileName As String
Dim tbl As Word.Table

strFolderName = "C:\code" ' Change this to the folder containing your word documents

Set sh1 = ThisWorkbook.Sheets(1)

'Try to use existing Word Application
On Error Resume Next 'supress error checking
    Set wordApp = GetObject(, "Word.Application")
On Error GoTo 0 'Resume error checking
If wordApp Is Nothing Then 'word application wasn't already running
    Set wordApp = CreateObject("Word.application")
    bWordAppStartedHere = True
End If

x = 1
strFileName = Dir(strFolderName & "\*.docx")
Do While Len(strFileName) > 0
    Set wrdDoc = wordApp.Documents.Open(strFolderName & "\" & strFileName)
    For Each tbl In wrdDoc.Tables
        If InStr(tbl.Cell(1, 1).Range.Text, "Critical Controls") > 0 Then
            sh1.Cells(x, 1).Value = strFileName
            sh1.Cells(x, 2).Value = Application.WorksheetFunction.Clean(tbl.Cell(3, 2).Range.Text)
            'sh1.Cells(x, 3).Value = ....more extracted data....
            x = x + 1
            Exit For
        End If
    Next tbl
    wrdDoc.Close wdDoNotSaveChanges
    strFileName = Dir() 'get name of next file in foldere
Loop
'only close Word if it wasn't already running
If bWordAppStartedHere Then
    wordApp.Quit
End If
End Sub

Open in new window

GrahamSkanRetiredCommented:
There is a typo in my penultimat comment:
'Sorry. I have confused me.'
Should (obviously?) read
'Sorry. I have confused you.
Natashar7Author Commented:
did you get my private message
Natashar7Author Commented:
I tried it and pasted the code in excel and it doesn't work? It showed it was running but there is nothing in sheet 1?
GrahamSkanRetiredCommented:
You have now sent me a sample document via the EE message system. Unless there is a good reason not to, please post in the question.
I have attached a copy here.
Critical-Controls.docx
GrahamSkanRetiredCommented:
The document has a paragraph in Normal style with the text 'Critical Controls', followed by an empty paragraph, and then by a table.
This version of the code looks for any table in the document that is preceded by the 'Critical Controls' paragraph. There can be any number of empty paragraphs in between.
Sub wordScrape()
'Early binding for so set a reference (Tools/References) to the Microsoft Word Object LIbrary
Dim wrdDoc As Word.Document, wordApp As Word.Application
Dim sh1 As Worksheet
Dim bWordAppStartedHere As Boolean
Dim x As Integer
Dim strFolderName As String
Dim strFileName As String
Dim tbl As Word.Table
Dim rng As Word.Range
Dim para As Word.Paragraph

Dim p As Integer
strFolderName = "C:\code" ' Change this to the folder containing your word documents

Set sh1 = ThisWorkbook.Sheets(1)

'Try to use existing Word Application
On Error Resume Next 'supress error checking
    Set wordApp = GetObject(, "Word.Application")
On Error GoTo 0 'Resume error checking
If wordApp Is Nothing Then 'word application wasn't already running
    Set wordApp = CreateObject("Word.application")
    bWordAppStartedHere = True
End If

x = 1
strFileName = Dir(strFolderName & "\*.docx") 'get first file name
Do While Len(strFileName) > 0
    Set wrdDoc = wordApp.Documents.Open(strFolderName & "\" & strFileName)
    For Each tbl In wrdDoc.Tables 'walk throught the document tables
        Set rng = tbl.Range
        rng.Collapse wdCollapseStart
        rng.Start = wrdDoc.Range.Start 'range now goes from start of document to start of table
        For p = rng.Paragraphs.Count To 1 Step -1 'move backwards through the range
            Set para = rng.Paragraphs(p)
            If Len(para.Range) > 1 Then 'skip empty paragraphs
                If para.Range.Text = "Critical Controls" & vbCr Then
                    sh1.Cells(x, 1).Value = strFileName
                    sh1.Cells(x, 2).Value = Application.WorksheetFunction.Clean(tbl.Cell(3, 2).Range.Text)
                    'sh1.Cells(x, 3).Value = ....more extracted data....
                    x = x + 1
                Else
                    Exit For 'text in prior paragraph is not "Critical Controls"
                End If
            End If
        Next p
    Next tbl
    wrdDoc.Close wdDoNotSaveChanges
    strFileName = Dir() 'get name of next file in folder
Loop
'only close Word if it wasn't already running
If bWordAppStartedHere Then
    wordApp.Quit
End If
End Sub

Open in new window

Natashar7Author Commented:
Thank you it worked part of it but not all.  It didn't bring in the Control ID and corresponding description?  Each control id has a unique description.  Please check and let me know.  

Thanks for your hardwork.
Natashar7Author Commented:
i haven't heard back from you on a corrected code?
[ fanpages ]IT Services ConsultantCommented:
GrahamSkanRetiredCommented:
This forum is to help anyone who is stuck on what they are trying to do and to show how to do it. You are tryng to write some macro code to do a particular job, and seemed to be stuck at several points.  Therefore I have tidied up your original code so that it works in your context.

You now have an extra requirement that should be fairly easy to achieve even to someone who is only just beginning to learn VBA programming.

Your original macro has a commented-out line that acknowledges the fact that you may need to augment the code to collect more data from the document and it shows where it should go.
 
                  'sh1.Cells(x, 3).Value = ....more extracted data....

Open in new window

The previous line shows how to get the data from one cell in the table. All you need to do is to do something very similar with another cell.

We are here to show you how - not to do your work for you.
Natashar7Author Commented:
your code still doesn't work.
GrahamSkanRetiredCommented:
Previously, you said that it was working, but yoou now wanted to transfer some more data.
Has it stopped working because you tried to make changes, or does the code that I last posted stopped?
If the former can you post the code with your changes? If the latter can you say exactly what happens?
Natashar7Author Commented:
your code doesn't bring in all the critical controls it brought it just the first line but there are many in one table and in all word docs.  I didn't make any changes to the code just used your code
[ fanpages ]IT Services ConsultantCommented:
"your code doesn't bring in all the critical controls it brought it just the first line but there are many in one table and in all word docs.  I didn't make any changes to the code just used your code"

Please see GrahamSkan's comment (ID: 40956075) above.

You are expected to continue to meet your requirements.

If you are unsure how to do this, that is a different issue, but Graham has advised you what needs to happen to complete your task.

It is unfair to say that his code "does not work".
Natashar7Author Commented:
it brings only the first line but not all the lines stacked below each other
[ fanpages ]IT Services ConsultantCommented:
it brings only the first line but not all the lines stacked below each other

Are you still using the sample file you provided to Graham'; (re-)posted in ID: 40954449?

If so, what has changed since you said the code you were using was working (in ID: 40954649)?
GrahamSkanRetiredCommented:
This version defines a new variable for the row number in the Word table.

Sub wordScrape()
'Early binding for Word, so set a reference (Tools/References) to the Microsoft Word Object LIbrary
Dim wrdDoc As Word.Document, wordApp As Word.Application
Dim sh1 As Worksheet
Dim bWordAppStartedHere As Boolean
Dim x As Integer
Dim strFolderName As String
Dim strFileName As String
Dim tbl As Word.Table
Dim rng As Word.Range
Dim para As Word.Paragraph
Dim r As Integer

Dim p As Integer
strFolderName = "C:\code" ' Change this to the folder containing your word documents

Set sh1 = ThisWorkbook.Sheets(1)

'Try to use existing Word Application
On Error Resume Next 'supress error checking
    Set wordApp = GetObject(, "Word.Application")
On Error GoTo 0 'Resume error checking
If wordApp Is Nothing Then 'word application wasn't already running
    Set wordApp = CreateObject("Word.application")
    bWordAppStartedHere = True
End If

x = 1
strFileName = Dir(strFolderName & "\*.docx") 'get first file name
Do While Len(strFileName) > 0
    Set wrdDoc = wordApp.Documents.Open(strFolderName & "\" & strFileName)
    For Each tbl In wrdDoc.Tables 'walk throught the document tables
        Set rng = tbl.Range
        rng.Collapse wdCollapseStart
        rng.Start = wrdDoc.Range.Start 'range now goes from start of document to start of table
        For p = rng.Paragraphs.Count To 1 Step -1 'move backwards through the range
            Set para = rng.Paragraphs(p)
            If Len(para.Range) > 1 Then 'skip empty paragraphs
                If para.Range.Text = "Critical Controls" & vbCr Then
                    'step through the non-header rows on the Word table
                    For r = 2 To tbl.Rows.Count
                        sh1.Cells(x, 1).Value = strFileName
                        sh1.Cells(x, 2).Value = Application.WorksheetFunction.Clean(tbl.Cell(r, 1).Range.Text)
                        sh1.Cells(x, 3).Value = Application.WorksheetFunction.Clean(tbl.Cell(r, 2).Range.Text)
                        x = x + 1
                    Next r
                Else
                    Exit For 'text in prior paragraph is not "Critical Controls"
                End If
            End If
        Next p
    Next tbl
    wrdDoc.Close wdDoNotSaveChanges
    strFileName = Dir() 'get name of next file in folder
Loop
'only close Word if it wasn't already running
If bWordAppStartedHere Then
    wordApp.Quit
End If
End Sub

Open in new window

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Word

From novice to tech pro — start learning today.