Solved

Automated import of data from Word to Excel with VBA

Posted on 2016-07-29
26
77 Views
Last Modified: 2016-08-15
Hi Experts,

I'm after a little bit of advice...

I have a word document that users essentially fill out. It's a relatively large document (50 pages or so) and contains a number of features such as bookmarks and references etc. Once the guys that are completing the document have finished I then need to bring certain pieces of data that they've completed into a 4 page summary.

I've created a template in Excel (at this stage) that will house the desired information in the desired format.

I'm wondering if you guys have some advice on the best way to do this?

Some of the features that will need to be imported include:

- paragraphs from specific sections of the document
- values from tables in the word document
- a number of pictures that they will have placed in predetermined locations in the word document

I was hoping that I could write a script that would find the various references (i.e. locate section 2.5.1 and copy the paragraph / table / picture before pasting it in the corresponding location in the Excel template).

I may have bitten off more than I can chew I'm not sure but keen to get an opinion or two / some good articles that I can use to guide me. Perhaps I need to re-think and revert to PPT instead of Excel.

Any ideas really appreciated.

Cheers

Martywal
0
Comment
Question by:martywal
  • 16
  • 9
26 Comments
 
LVL 15

Expert Comment

by:WalkaboutTigger
ID: 41735441
Well...two ways to approach this and I recommend using both.

The first is a Word VBA macro which creates an Excel file with the document elements you want to import into Excel.  Somehow associate this newly-created Excel workbook.

You will then use your template file to get the data from this intermediate Excel file and manipulate it in Excel.

Today I spent over 8 hours working in Excel VBA, so my ability to continue to code has been depleted.  I do hope this helps to give you a starting point.

If the blank documents are not proprietary and are neither company nor industry identifiable, your uploading them here would help us help you reach a solution.

If that is not possible, if you can create some kind of mock-up Word document and upload it, that, would be helpful.
1
 

Author Comment

by:martywal
ID: 41735468
Thanks WalkaboutTigger I'll post versions of the files later.

So, with your word macro suggestion, I'm guessing that I could fire that from within the Excel?
From the Excel open the word, fire the macro, create another excel with the data, close the word file, copy from the created excel to the template...

It may become clearer when I post the files.

Thanks for your help thus far! :-)
0
 

Author Comment

by:martywal
ID: 41735482
It won't let me update a docm or xlsm but I've uploaded docx & xlsx. Hopefully it gives the idea..

I've added descriptions in red text to see the sections I'll be taking over.
Project-Handbook-v1-demox.docx
Business-Casex.xlsx
0
 

Author Comment

by:martywal
ID: 41736202
Hi again Experts,
Just thought I'd follow up on this again.
I've managed to bring most of the data into my workbook (which is great!) the trick now is going to be bringing the images/pictures in... Can that be done?
Fingers crossed
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41736395
You will have to use Copy and Paste,
Your pictures all seem to be inline, so at the Word end you will need something like:
MyDoc.inlineShapes(1).Range.Copy

Open in new window

1
 

Author Comment

by:martywal
ID: 41737260
Thanks GrahamSkan, I was hoping that I could do this from within the Excel though. Fingers crossed, Thanks again
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41737424
The code could run in either application. By 'Word end', I was referring to getting the image from the document as opposed to putting it somewhere into the workbook.
0
 

Author Comment

by:martywal
ID: 41738157
OK that's cool. I'll explore it in the coming days. You did mention that the pictures are in line. They are now but that's just because I've placed them in the document, the users won't be so accurate I fear. Is there a where to use the bookmarks / contents headings to pick up a picture and copy it?

Thanks again :-)
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41739065
A floating picture is implemented as a Shape object, and this will find the first shape anchored in the given range.
Function GetFloatingPicture(rng As Range) As Shape
    Set GetFloatingPicture = rng.ShapeRange(1)
End Function

Open in new window

0
 
LVL 76

Accepted Solution

by:
GrahamSkan earned 500 total points
ID: 41739194
Here is a way to find the Gantt chart from the sample document.
Function GetInlineShapeAfterText(doc As Document, strText As String) As InlineShape
    Dim rng As Range
    Dim para As Paragraph
    
    Set rng = doc.Range
    With rng.Find
        .Text = strText
        If .Execute() Then
            Set para = GetNextParagraph(rng)
            Set GetInlineShapeAfterText = para.Range.InlineShapes(1)
        End If
    End With
End Function
 
Function GetNextParagraph(rng As Range) As Paragraph
    rng.Expand wdParagraph
    rng.Collapse wdCollapseEnd
    rng.Move wdCharacter, 1
    rng.Expand wdParagraph
    Set GetNextParagraph = rng.Paragraphs.First
End Function

Open in new window

1
 

Author Comment

by:martywal
ID: 41745051
Hi GrahamSkan, sorry not to have got back on this sooner. I'm back into it over the weekend. Let you know how I go. Thanks again for you tips!
0
 

Author Comment

by:martywal
ID: 41745743
Hi GrahamSkan, hopefully this isn't too silly a question but I'm curious where I need to place this script? I have a module that opens the word document and extracts the tables "ImportWordTable()" but I'm not too sure where to call the function.

Here's the table import:

Sub ImportWordTable()
Dim wdDoc As Object
Dim wdFileName As Variant
Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
On Error Resume Next

Application.ScreenUpdating = False
   
Worksheets("TableImporter").Select
 'Need to select the
    'ActiveSheet.Range("A:AZ").ClearContents
    Worksheets("TableImporter").Range("A:N").ClearContents
   
    wdFileName = Application.GetOpenFilename("Word Documents, *.doc*", , _
    "Browse for Project Handbook that will be used to create the Automated Business Case!", "OK")
   
    If wdFileName = False Then Exit Sub '(user cancelled import file browser)
   
    Set wdDoc = GetObject(wdFileName) 'open Word file
   
    With wdDoc
        tableNo = wdDoc.tables.Count
        tableTot = wdDoc.tables.Count
        If tableNo = 0 Then
            MsgBox "This document is not an Automated Handbook", _
            vbExclamation, "Import Word Table"
        ElseIf tableNo > 1 Then

        End If
       
        resultRow = 4
       
        For tableStart = 1 To tableTot
            'Call showProgress
            With .tables(tableStart)
                'copy cell contents from Word table cells to Excel cells
                totRow = .Rows.Count
                For iRow = 1 To totRow
                DoEvents
                'Call showProgress
                DoEvents
                    For iCol = 1 To .Columns.Count
                        Cells(resultRow, iCol) = WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text)
                    Next iCol
                    resultRow = resultRow + 1
                Next iRow
            End With
            resultRow = resultRow + 1
        Next tableStart
    End With
Application.ScreenUpdating = True
End Sub
0
 

Author Comment

by:martywal
ID: 41745921
Hi experts, I've managed to pull the GANTT through (as it's the first image this seems ok) I'm wondering how to grab ALL images.

From the above script I've used wdDoc.inlineshapes(1) & wdDoc.shapes(1) and both are getting me the GANTT - it's great but I need to get the remaining images.

Is there a way that I can use the text in the document (e.g. after 2.1 I need to get the pictures of the 2 tables located in that section...)?

Thanks again

Martywal
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41746531
There are several improvements to make in your code.

1. Have Option Explicit at the top of the code
2. Use Early Binding, at least  in the development phase
3. Use For Each when stepping through all the members of a collection
4. Declare variables as tightly as possible (at least  in the development phase)
5. Try to avoid Selecting the range in question

i'll try to work through your code to incorporate the recommendations
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41746553
Early Binding will require a reference (Tools/References) to the relevant library. In this case it is the Microsoft Word x.x Object Library. Intellisense and F1 Help are extended to the referenced application. It is recommended that Late Binding be used for distribution in case of differences between library versions. This would require 'foreign' objects such as Word.Table being declared as objects. Also any built-in constants used would require declaration.

Here you can show the code in a snippet box  by selecting the text and clicking 'CODE' in the tool bar above the edit box.
Option Explicit

Sub ImportWordTable()
'Late Binding
'Dim wdDoc As Object

'Early Binding
Dim wdDoc As Word.Document

Dim wdFileName As Variant
'Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
Dim wdTable As Word.Table
Dim wdRow As Word.Row
Dim wdCell As Word.Cell
Dim xlSheet As Worksheet

'Supressing errors makes debugging difficult
'On Error Resume Next

Application.ScreenUpdating = False
    
Set xlSheet = Worksheets("TableImporter")
 'No Need to select the sheet
    xlSheet.Range("A:N").ClearContents
    
    wdFileName = Application.GetOpenFilename("Word Documents, *.doc*", , _
    "Browse for Project Handbook that will be used to create the Automated Business Case!", "OK")
    
    If wdFileName = False Then Exit Sub '(user cancelled import file browser)
    
    Set wdDoc = GetObject(wdFileName) 'open Word file
    
    
    With wdDoc
        If .Tables.Count = 0 Then
            MsgBox "This document is not an Automated Handbook", _
            vbExclamation, "Import Word Table"
        Else
            resultRow = 4
            For Each wdTable In wdDoc.Tables
                'Call showProgress
                With wdTable
                    For Each wdRow In wdTable.Rows
                    'copy cell contents from Word table cells to Excel cells
                        DoEvents
                        'Call showProgress
                        DoEvents
                        For Each wdCell In wdRow.Cells
                            xlSheet.Cells(resultRow, wdCell.ColumnIndex) = WorksheetFunction.Clean(wdCell.Range.Text)
                        Next wdCell
                        resultRow = resultRow + 1
                    Next wdRow
                End With
            Next wdTable
        End If
    End With
Application.ScreenUpdating = True
End Sub

Open in new window

Oh, and don't suppress errors. If an error is raised, it is best to know straight away.
0
 

Author Comment

by:martywal
ID: 41747084
Hi GrahamSkan, I should probably add that the script I added is working what I've got it doing is opening the target word document and bringing all the tables in to the "TableImporter" worksheet. In that worksheet there are a number of formulas that then map the values from the tables into my template. I'm happy enough with that portion.

The issue is that wdDoc.inlineshapes(1) & wdDoc.shapes(1) are only bringing me the first image (the GANTT) I have to try and get the other images to and wondering where (even from your initial functions) they need to go to bring the remaining images through.

I'm starting to think I may have bitten off more than I can chew...
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41747392
There are four Shapes and four inline shapes in the sample document. The Shapes are all the blue line drawings in the Costs section.

You could step through the InLine shapes in the document:
    For Each ilsh In doc.InlineShapes
        DoEvents
        ilsh.Range.Copy
        MsgBox "Inline shape now in clipboard"
    Next ilsh

Open in new window

but how will you differentiate between them?

This code divides the document into blocks separated at the your level 2 style headings and puts each into the clipboard
Sub GetInLineShapes(doc As Word.Document)
Dim ilsh As Word.InlineShape
Dim rng As Word.Range
Dim rng1 As Range
Dim rngBlocks() As Word.Range
Dim strHeadings() As String
Dim b As Integer

Set rng = doc.Range
With rng.Find
    .Style = "Heading 2,heading 2,Intro Text Bold,proj2,proj21,proj22,proj23,proj24,proj25,proj26,proj27,proj28,proj29,proj210,proj211,proj212,proj221,proj231,proj241,proj251,proj261,proj271,proj281,proj291,proj2101,proj2111,proj213,proj222,proj232,proj242,proj252,2"
    Do While .Execute()
        DoEvents
        Set rng1 = rng.Duplicate
        'Debug.Print rng1.Start, rng1.End, rng1.Text
        ReDim Preserve strHeadings(b)
        ReDim Preserve rngBlocks(b)
        strHeadings(b) = rng1.Text
        Set rngBlocks(b) = rng1 'range of heading only
        If b > 0 Then
            rngBlocks(b - 1).End = rng1.Start 'extend range of previous block to the start of current one
        End If
        b = b + 1
    Loop
    rngBlocks(b - 1).End = doc.Range.End
End With
For b = 0 To UBound(strHeadings)
    For Each ilsh In rngBlocks(b).InlineShapes
        DoEvents
        ilsh.Range.Copy
        MsgBox "Inline shape under " & strHeadings(b) & " now in clipboard"
    Next ilsh
Next b
    
End Sub

Open in new window

1
 

Author Comment

by:martywal
ID: 41751396
I'll be back tonight. Thanks again! You guys rock :-)
0
 

Author Comment

by:martywal
ID: 41754525
Hi GrahamSkan,
I'm not to sure where the "GetInLineShapes" needs to go? I've added it as a module but I'm getting a compile error.
Sorry hope that's not too silly a question
0
 

Author Comment

by:martywal
ID: 41754539
Hi GrahamSkan, I've opened a new question as I think this one is probably too open. The new one (28963147) focuses only on the image import. This way you'll get the points for this one and the other one may get more hits. I'll close this and allocate the points if that sounds ok to you. Thanks again
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41754601
Just put the procedure in the same module as the rest of the code.

I guess that you are getting 'User-defined type not defined'. I usually develop using early binding. Please add a reference to the Microsoft Word Object library (Tools/References in the VBA IDE). This will set up the early binding for Word objects. This means that the will have Intellisence and F1 Help for Word objects as Excel objects will already have.
0
 

Author Comment

by:martywal
ID: 41754758
Ok where do I then call the functions though?
0
 

Author Comment

by:martywal
ID: 41754759
I also think that there are too many streams to follow in this trail hence the more direct question update. Sorry?
0
 
LVL 76

Expert Comment

by:GrahamSkan
ID: 41754767
You can call it as soon as you have your Word document object, so you can put this at about line 33 in your original code:
GetInLineShapes wdDoc

Open in new window

Be aware, that it is illustrative, in that it shows one way of getting at the inline shapes. It doesn't actually do anything but report.
0
 

Author Comment

by:martywal
ID: 41757155
I'm closing this question and awarding the points as it has significantly deviated from the initial question. As mentioned above, I've opened a new question that only looks at the importing of images from Word to Excel. Thanks for all your help.
Martywal
0
 

Author Closing Comment

by:martywal
ID: 41757157
I'm closing this question and awarding the points as it has significantly deviated from the initial question. As mentioned above, I've opened a new question that only looks at the importing of images from Word to Excel. Thanks for all your help.
Martywal
0

Featured Post

Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

Join & Write a Comment

Nice table. Huge mess. Maybe this was something you created way back before you figured out tabs or a document you received from someone else. Either way, using the spacebar to separate the columns resulted in a mess. Trying to convert text to t…
Shortcuts in Word Just the other day I had a training for Microsoft and they wanted me to show how well the new Windows and Office behaved on a touch device, which by the way is great, but it was only then that I realized that using keyboard shortc…
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.
Office 365 is currently available in five editions. Three of them are for business use: Office 365 Business Essentials, Office 365 Business, and Office 365 Business Premium. Two of them are for home/personal use: Office 365 Home and Office 365 Perso…

706 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now