Natashar7
asked on
VBA Code to extract a table that consists the heading "Critical Controls" from multiple word documents into Excel
HI I need your help as the below code doesn't work. Thanks
Sub wordScrape()
Dim wrdDoc As Object, objFiles As Object, fso As Object, wordApp As Object
Dim sh1 As Worksheet
Dim x As Integer
FolderName = "C:\code" ' Change this to the folder containing your word documents
Set sh1 = ThisWorkbook.Sheets(1)
Set fso = CreateObject("Scripting.FileSystemObject")
Set wordApp = CreateObject("Word.application")
Set objFiles = fso.GetFolder(FolderName).Files
x = 1
For Each wd In objFiles
If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then
Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True)
sh1.Cells(x, 1) = wd.Name
sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(3).Cell(Row:=3, Column:=2).Range)
'sh1.Cells(x, 3) = ....more extracted data....
x = x + 1
wrdDoc.Close
End If
Next wd
wordApp.Quit
End Sub
ASKER
Where do I indicate in the code to extract the table that has the heading critical controls. If u can please show me where I indicate the folder to access the word documents and where to include the excel path. Thanks.
ASKER
Can I email u the word doc that has the table so u can see ?
I need a vba code that will extract the tables and put Into an excel worksheet.
I need a vba code that will extract the tables and put Into an excel worksheet.
It would certainly help if we had a sample document to work with.
Make sure that it doesn't have any confidential data and attach it:
Attach File>Browse> Upload file
Make sure that it doesn't have any confidential data and attach it:
Attach File>Browse> Upload file
ASKER
I am getting an error. Compile error variable not defined. For folder name = I put the path to the word doc folder. Eg c:
ASKER
Here is the word doc file what it looks like. Thanks
Sorry for the delay.
I added this line to overcome that error:
Actually you didn't succeed in attaching any files.
I added this line to overcome that error:
Dim FolderName As String
Actually you didn't succeed in attaching any files.
ASKER
Thank you for taking the time to answer. Where do I indicate for it to search only table headings that consist of "Critical Controls" heading? Did you get the attachment I sent you?
ASKER
confused do I paste the code in excel or word developer tab?
ASKER
still getting an error it says variable not defined.
For Each wd In ObjFiles
please help
For Each wd In ObjFiles
please help
ASKER
i still have not heard back and got a response to my question?
As I indicated in comment 40950752, the attachement is not there.
You need to declare the wd variable:
Dim wd As Object
It would be easier if you could post a sample input document.
Dim wd As Object
It would be easier if you could post a sample input document.
You have used the EE message system to send me two messages. Since they do not appear to be private, I will explain here that the first asks a supplementary question, while the second seems to be another attempt to attach a document.
The supplementary question is
"Can u pls edit my code and tell me where I should indicate the folder path for the word and excel? Also the word docs are tables it has to extract the heading that says critical controls on to excel. I will send u a snapshot later today "
For that supplementary, you have used the variable 'FolderName' to hold the path for the word documents. For the Excel path, you seem to be working with an active worksheet, so its path would be that of the ActiveWorkbook
There was (still) no attachment to illustrate exactly what the document looks like, especially what is meant by the table heading. It could be that there is a paragraph with a heading style just before the table. Or it could refer to one of the cells in the first row.
In the morning (GMT + 1) I will try to create some code that assumes the latter.
The supplementary question is
"Can u pls edit my code and tell me where I should indicate the folder path for the word and excel? Also the word docs are tables it has to extract the heading that says critical controls on to excel. I will send u a snapshot later today "
For that supplementary, you have used the variable 'FolderName' to hold the path for the word documents. For the Excel path, you seem to be working with an active worksheet, so its path would be that of the ActiveWorkbook
There was (still) no attachment to illustrate exactly what the document looks like, especially what is meant by the table heading. It could be that there is a paragraph with a heading style just before the table. Or it could refer to one of the cells in the first row.
In the morning (GMT + 1) I will try to create some code that assumes the latter.
ASKER
how do I send you a private message? please explain as I am new to this site thanks
Sorry. I have confused me. You did send me private messages, but you asked me technical questions that didn't have to be private and could have been asked in this thread.
Here is some code that should do the job. It doesn't use FileSystemObject, but relies in the intrinsic Dir() function instead. Don't forget to set the reference to the Word object library or you will get a 'User-defined tyoe not defined' error.
Sub wordScrape()
'Early binding for so set a reference (Tools/References) to the Microsoft Word Object LIbrary
Dim wrdDoc As Word.Document, wordApp As Word.Application
Dim sh1 As Worksheet
Dim bWordAppStartedHere As Boolean
Dim x As Integer
Dim strFolderName As String
Dim strFileName As String
Dim tbl As Word.Table
strFolderName = "C:\code" ' Change this to the folder containing your word documents
Set sh1 = ThisWorkbook.Sheets(1)
'Try to use existing Word Application
On Error Resume Next 'supress error checking
Set wordApp = GetObject(, "Word.Application")
On Error GoTo 0 'Resume error checking
If wordApp Is Nothing Then 'word application wasn't already running
Set wordApp = CreateObject("Word.application")
bWordAppStartedHere = True
End If
x = 1
strFileName = Dir(strFolderName & "\*.docx")
Do While Len(strFileName) > 0
Set wrdDoc = wordApp.Documents.Open(strFolderName & "\" & strFileName)
For Each tbl In wrdDoc.Tables
If InStr(tbl.Cell(1, 1).Range.Text, "Critical Controls") > 0 Then
sh1.Cells(x, 1).Value = strFileName
sh1.Cells(x, 2).Value = Application.WorksheetFunction.Clean(tbl.Cell(3, 2).Range.Text)
'sh1.Cells(x, 3).Value = ....more extracted data....
x = x + 1
Exit For
End If
Next tbl
wrdDoc.Close wdDoNotSaveChanges
strFileName = Dir() 'get name of next file in foldere
Loop
'only close Word if it wasn't already running
If bWordAppStartedHere Then
wordApp.Quit
End If
End Sub
There is a typo in my penultimat comment:
'Sorry. I have confused me.'
Should (obviously?) read
'Sorry. I have confused you.
'Sorry. I have confused me.'
Should (obviously?) read
'Sorry. I have confused you.
ASKER
did you get my private message
ASKER
I tried it and pasted the code in excel and it doesn't work? It showed it was running but there is nothing in sheet 1?
You have now sent me a sample document via the EE message system. Unless there is a good reason not to, please post in the question.
I have attached a copy here.
Critical-Controls.docx
I have attached a copy here.
Critical-Controls.docx
The document has a paragraph in Normal style with the text 'Critical Controls', followed by an empty paragraph, and then by a table.
This version of the code looks for any table in the document that is preceded by the 'Critical Controls' paragraph. There can be any number of empty paragraphs in between.
This version of the code looks for any table in the document that is preceded by the 'Critical Controls' paragraph. There can be any number of empty paragraphs in between.
Sub wordScrape()
'Early binding for so set a reference (Tools/References) to the Microsoft Word Object LIbrary
Dim wrdDoc As Word.Document, wordApp As Word.Application
Dim sh1 As Worksheet
Dim bWordAppStartedHere As Boolean
Dim x As Integer
Dim strFolderName As String
Dim strFileName As String
Dim tbl As Word.Table
Dim rng As Word.Range
Dim para As Word.Paragraph
Dim p As Integer
strFolderName = "C:\code" ' Change this to the folder containing your word documents
Set sh1 = ThisWorkbook.Sheets(1)
'Try to use existing Word Application
On Error Resume Next 'supress error checking
Set wordApp = GetObject(, "Word.Application")
On Error GoTo 0 'Resume error checking
If wordApp Is Nothing Then 'word application wasn't already running
Set wordApp = CreateObject("Word.application")
bWordAppStartedHere = True
End If
x = 1
strFileName = Dir(strFolderName & "\*.docx") 'get first file name
Do While Len(strFileName) > 0
Set wrdDoc = wordApp.Documents.Open(strFolderName & "\" & strFileName)
For Each tbl In wrdDoc.Tables 'walk throught the document tables
Set rng = tbl.Range
rng.Collapse wdCollapseStart
rng.Start = wrdDoc.Range.Start 'range now goes from start of document to start of table
For p = rng.Paragraphs.Count To 1 Step -1 'move backwards through the range
Set para = rng.Paragraphs(p)
If Len(para.Range) > 1 Then 'skip empty paragraphs
If para.Range.Text = "Critical Controls" & vbCr Then
sh1.Cells(x, 1).Value = strFileName
sh1.Cells(x, 2).Value = Application.WorksheetFunction.Clean(tbl.Cell(3, 2).Range.Text)
'sh1.Cells(x, 3).Value = ....more extracted data....
x = x + 1
Else
Exit For 'text in prior paragraph is not "Critical Controls"
End If
End If
Next p
Next tbl
wrdDoc.Close wdDoNotSaveChanges
strFileName = Dir() 'get name of next file in folder
Loop
'only close Word if it wasn't already running
If bWordAppStartedHere Then
wordApp.Quit
End If
End Sub
ASKER
Thank you it worked part of it but not all. It didn't bring in the Control ID and corresponding description? Each control id has a unique description. Please check and let me know.
Thanks for your hardwork.
Thanks for your hardwork.
ASKER
i haven't heard back from you on a corrected code?
Potential duplicate question(s):
[ https://www.experts-exchange.com/questions/28710134/VBA-Code-to-extract-Critical-Control-heading-from-a-table.html ]
[ https://www.experts-exchange.com/questions/28710134/VBA-Code-to-extract-Critical-Control-heading-from-a-table.html ]
Please confirm if any/all of these questions are for the same requested outcome.
Thank you.
[ https://www.experts-exchange.com/questions/28710134/VBA-Code-to-extract-Critical-Control-heading-from-a-table.html ]
[ https://www.experts-exchange.com/questions/28710134/VBA-Code-to-extract-Critical-Control-heading-from-a-table.html ]
Please confirm if any/all of these questions are for the same requested outcome.
Thank you.
This forum is to help anyone who is stuck on what they are trying to do and to show how to do it. You are tryng to write some macro code to do a particular job, and seemed to be stuck at several points. Therefore I have tidied up your original code so that it works in your context.
You now have an extra requirement that should be fairly easy to achieve even to someone who is only just beginning to learn VBA programming.
Your original macro has a commented-out line that acknowledges the fact that you may need to augment the code to collect more data from the document and it shows where it should go.
We are here to show you how - not to do your work for you.
You now have an extra requirement that should be fairly easy to achieve even to someone who is only just beginning to learn VBA programming.
Your original macro has a commented-out line that acknowledges the fact that you may need to augment the code to collect more data from the document and it shows where it should go.
'sh1.Cells(x, 3).Value = ....more extracted data....
The previous line shows how to get the data from one cell in the table. All you need to do is to do something very similar with another cell.We are here to show you how - not to do your work for you.
ASKER
your code still doesn't work.
Previously, you said that it was working, but yoou now wanted to transfer some more data.
Has it stopped working because you tried to make changes, or does the code that I last posted stopped?
If the former can you post the code with your changes? If the latter can you say exactly what happens?
Has it stopped working because you tried to make changes, or does the code that I last posted stopped?
If the former can you post the code with your changes? If the latter can you say exactly what happens?
ASKER
your code doesn't bring in all the critical controls it brought it just the first line but there are many in one table and in all word docs. I didn't make any changes to the code just used your code
"your code doesn't bring in all the critical controls it brought it just the first line but there are many in one table and in all word docs. I didn't make any changes to the code just used your code"
Please see GrahamSkan's comment (ID: 40956075) above.
You are expected to continue to meet your requirements.
If you are unsure how to do this, that is a different issue, but Graham has advised you what needs to happen to complete your task.
It is unfair to say that his code "does not work".
Please see GrahamSkan's comment (ID: 40956075) above.
You are expected to continue to meet your requirements.
If you are unsure how to do this, that is a different issue, but Graham has advised you what needs to happen to complete your task.
It is unfair to say that his code "does not work".
ASKER
it brings only the first line but not all the lines stacked below each other
it brings only the first line but not all the lines stacked below each other
Are you still using the sample file you provided to Graham'; (re-)posted in ID: 40954449?
If so, what has changed since you said the code you were using was working (in ID: 40954649)?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I declared a couple of undeclared variables because I have Option Explicit set, and it seemed to work.