forumware

Extract a particular section from Word files

I have about 100 Word files (.doc and .docx) and need to extract one particular section from each: "2.9: Address". This heading appears in the table of contents as well as in the body of the document. How can I find and extract this section using VBA, a macro, Python, or something similar?
GrahamSkan

Presumably the 'section' begins with an automatically-numbered Heading 2 paragraph with the text "Address". What defines the section end?
forumware

ASKER

The next section also starts with a Heading 2, but its title varies between documents: sometimes it is general information, sometimes interconnections or logical diagrams.
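Since Python was mentioned as an option, here is a minimal sketch of the rule described above: start at the Heading 2 paragraph whose text is "Address" and stop at the next heading. It reads the .docx package directly with the standard library. It assumes the heading style IDs are "Heading1" and "Heading2" (the usual IDs in English-language documents, but they can differ), and it does not handle the older binary .doc format, so those files would need converting first. The function names are made up for illustration. Because it matches on the paragraph style, the table-of-contents entry for "Address" is skipped, as TOC lines use their own styles.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_paragraphs(docx_path):
    """Return a (style_id, text) tuple for every paragraph in a .docx file."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        style = p.find(f"{W}pPr/{W}pStyle")
        style_id = style.get(W + "val") if style is not None else ""
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        paragraphs.append((style_id, text))
    return paragraphs


def extract_section(paragraphs, title="Address",
                    start_style="Heading2",
                    stop_styles=("Heading1", "Heading2")):
    """Collect non-empty paragraphs between the titled heading and the next heading.

    The style IDs are assumptions; automatic heading numbers such as "2.9"
    are not part of the paragraph text, so only the title is compared.
    """
    collected = []
    inside = False
    for style_id, text in paragraphs:
        if inside and style_id in stop_styles:
            break  # the next heading ends the section
        if not inside:
            if style_id == start_style and text.strip().lower() == title.lower():
                inside = True
            continue
        if text.strip():
            collected.append(text.strip())
    return collected
```

Calling `extract_section(read_paragraphs(path))` for each file would give a list of the paragraphs in that file's Address section, or an empty list if no matching heading is found.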
SOLUTION
Anastasia D. Gavanas
ASKER CERTIFIED SOLUTION
forumware, let us know how GrahamSkan's macro works for you; I think it will do the job just fine :)
Thanks to you both; I will give it a try shortly. I forgot to mention that I want to extract this information to a CSV file rather than to another Word document.
SOLUTION
xtermie is right. There is no built-in procedure to export a Word document to a CSV file. Such an export does exist in Excel and Access, because those applications work with data in table form.
The new task, therefore, is to fit the data to a table. It can then be converted to comma-separated text and saved as a text file.
Graham, I think it would be easier to copy and paste the extracted text into an Excel file and then save that file in .csv format. What do you think? If we change that in your macro, everything should work exactly as forumware wants.
xtermie,
Sorry, I'm a few hours behind you.
Yes, that seems like a good idea.
We would need to know the rules for splitting the text into fields (columns) - by words, sentences, paragraphs or something else.
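As an illustration of fitting the text to a table, here is a small Python sketch that uses one possible splitting rule: one row per paragraph, with the source file name and the paragraph's position as extra columns. The rule, the column names, and the sample file name are assumptions made for the example, not something agreed in this thread. The standard `csv` module takes care of quoting, so commas inside an address do not break the columns.

```python
import csv
import io


def section_rows(file_name, section_paragraphs):
    """Build one row per paragraph: source file, position in the section, text."""
    return [(file_name, i, text)
            for i, text in enumerate(section_paragraphs, start=1)]


def write_csv(rows, stream):
    """Write a header line followed by the rows to an open text stream."""
    writer = csv.writer(stream)
    writer.writerow(["file", "paragraph", "text"])
    writer.writerows(rows)


# Hypothetical sample data; a real run would pass the extracted section text.
buffer = io.StringIO(newline="")
write_csv(section_rows("site-a.docx", ["1 Main St, Suite 4", "London"]), buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, open it with `open(path, "w", newline="", encoding="utf-8")` and pass that as the stream. Splitting by sentence or by word would only change `section_rows`.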
Apologies for the delay. I tried running the script and got run-time error '5174':

"This file could not be found." However, I can see the file listed in the c:\test directory.
Check whether it is a .docx or a .doc file.
If it is a .doc, change the script accordingly.
I did that: I left a single .docx file in c:\test and got the same run-time error.
Which line does it fail on?
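The cause of this particular error is not established in the visible part of the thread. One frequent source of "file not found" errors in folder loops, though, is opening a file by its bare name rather than its full path (VBA's `Dir` function, for instance, returns only the file name). Another is picking up the temporary `~$` lock files that Word creates next to open documents. The Python sketch below avoids both by returning full paths and skipping lock files; the function name is made up for illustration, and the folder is whatever directory holds the documents.

```python
from pathlib import Path


def find_word_files(folder):
    """Return full paths of the .doc and .docx files in a folder, sorted by name.

    Word's "~$" lock files are skipped, and the extension check ignores case.
    """
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() in {".doc", ".docx"}
        and not p.name.startswith("~$")
    )
```

Each returned item already includes the folder, so it can be handed straight to whatever opens the document, whether the extension is .doc or .docx.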
SOLUTION
SOLUTION
The experts provided a solid, collaborative working solution to the author.