Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 171
  • Last Modified:

Automated import of data from Word to Excel with VBA

Hi Experts,

I'm after a little bit of advice...

I have a word document that users essentially fill out. It's a relatively large document (50 pages or so) and contains a number of features such as bookmarks and references etc. Once the guys that are completing the document have finished I then need to bring certain pieces of data that they've completed into a 4 page summary.

I've created a template in Excel (at this stage) that will house the desired information in the desired format.

I'm wondering if you guys have some advice on the best way to do this?

Some of the features that will need to be imported include:

- paragraphs from specific sections of the document
- values from tables in the word document
- a number of pictures that they will have placed in predetermined locations in the word document

I was hoping that I could write a script that would find the various references (i.e. locate section 2.5.1 and copy the paragraph / table / picture before pasting it in the corresponding location in the Excel template).

I may have bitten off more than I can chew I'm not sure but keen to get an opinion or two / some good articles that I can use to guide me. Perhaps I need to re-think and revert to PPT instead of Excel.

Any ideas really appreciated.

Cheers

Martywal
0
martywal
Asked:
martywal
  • 16
  • 9
1 Solution
 
WalkaboutTiggerCommented:
Well...two ways to approach this and I recommend using both.

The first is a Word VBA macro which creates an Excel file with the document elements you want to import into Excel.  Somehow associate this newly-created Excel workbook.

You will then use your template file to get the data from this intermediate Excel file and manipulate it in Excel.

Today I spent over 8 hours working in Excel VBA, so my ability to continue to code has been depleted.  I do hope this helps to give you a starting point.

If the blank documents are not proprietary and are neither company nor industry identifiable, your uploading them here would help us help you reach a solution.

If that is not possible, if you can create some kind of mock-up Word document and upload it, that, would be helpful.
1
 
martywalAuthor Commented:
Thanks WalkaboutTigger I'll post versions of the files later.

So, with your word macro suggestion, I'm guessing that I could fire that from within the Excel?
From the Excel open the word, fire the macro, create another excel with the data, close the word file, copy from the created excel to the template...

It may become clearer when I post the files.

Thanks for your help thus far! :-)
0
 
martywalAuthor Commented:
It won't let me update a docm or xlsm but I've uploaded docx & xlsx. Hopefully it gives the idea..

I've added descriptions in red text to see the sections I'll be taking over.
Project-Handbook-v1-demox.docx
Business-Casex.xlsx
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
martywalAuthor Commented:
Hi again Experts,
Just thought I'd follow up on this again.
I've managed to bring most of the data into my workbook (which is great!) the trick now is going to be bringing the images/pictures in... Can that be done?
Fingers crossed
0
 
GrahamSkanCommented:
You will have to use Copy and Paste,
Your pictures all seem to be inline, so at the Word end you will need something like:
MyDoc.inlineShapes(1).Range.Copy

Open in new window

1
 
martywalAuthor Commented:
Thanks GrahamSkan, I was hoping that I could do this from within the Excel though. Fingers crossed, Thanks again
0
 
GrahamSkanCommented:
The code could run in either application. By 'Word end', I was referring to getting the image from the document as opposed to putting it somewhere into the workbook.
0
 
martywalAuthor Commented:
OK that's cool. I'll explore it in the coming days. You did mention that the pictures are in line. They are now but that's just because I've placed them in the document, the users won't be so accurate I fear. Is there a where to use the bookmarks / contents headings to pick up a picture and copy it?

Thanks again :-)
0
 
GrahamSkanCommented:
A floating picture is implemented as a Shape object, and this will find the first shape anchored in the given range.
Function GetFloatingPicture(rng As Range) As Shape
    Set GetFloatingPicture = rng.ShapeRange(1)
End Function

Open in new window

0
 
GrahamSkanCommented:
Here is a way to find the Gantt chart from the sample document.
Function GetInlineShapeAfterText(doc As Document, strText As String) As InlineShape
    Dim rng As Range
    Dim para As Paragraph
    
    Set rng = doc.Range
    With rng.Find
        .Text = strText
        If .Execute() Then
            Set para = GetNextParagraph(rng)
            Set GetInlineShapeAfterText = para.Range.InlineShapes(1)
        End If
    End With
End Function
 
Function GetNextParagraph(rng As Range) As Paragraph
    rng.Expand wdParagraph
    rng.Collapse wdCollapseEnd
    rng.Move wdCharacter, 1
    rng.Expand wdParagraph
    Set GetNextParagraph = rng.Paragraphs.First
End Function

Open in new window

1
 
martywalAuthor Commented:
Hi GrahamSkan, sorry not to have got back on this sooner. I'm back into it over the weekend. Let you know how I go. Thanks again for you tips!
0
 
martywalAuthor Commented:
Hi GrahamSkan, hopefully this isn't too silly a question but I'm curious where I need to place this script? I have a module that opens the word document and extracts the tables "ImportWordTable()" but I'm not too sure where to call the function.

Here's the table import:

Sub ImportWordTable()
Dim wdDoc As Object
Dim wdFileName As Variant
Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
On Error Resume Next

Application.ScreenUpdating = False
   
Worksheets("TableImporter").Select
 'Need to select the
    'ActiveSheet.Range("A:AZ").ClearContents
    Worksheets("TableImporter").Range("A:N").ClearContents
   
    wdFileName = Application.GetOpenFilename("Word Documents, *.doc*", , _
    "Browse for Project Handbook that will be used to create the Automated Business Case!", "OK")
   
    If wdFileName = False Then Exit Sub '(user cancelled import file browser)
   
    Set wdDoc = GetObject(wdFileName) 'open Word file
   
    With wdDoc
        tableNo = wdDoc.tables.Count
        tableTot = wdDoc.tables.Count
        If tableNo = 0 Then
            MsgBox "This document is not an Automated Handbook", _
            vbExclamation, "Import Word Table"
        ElseIf tableNo > 1 Then

        End If
       
        resultRow = 4
       
        For tableStart = 1 To tableTot
            'Call showProgress
            With .tables(tableStart)
                'copy cell contents from Word table cells to Excel cells
                totRow = .Rows.Count
                For iRow = 1 To totRow
                DoEvents
                'Call showProgress
                DoEvents
                    For iCol = 1 To .Columns.Count
                        Cells(resultRow, iCol) = WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text)
                    Next iCol
                    resultRow = resultRow + 1
                Next iRow
            End With
            resultRow = resultRow + 1
        Next tableStart
    End With
Application.ScreenUpdating = True
End Sub
0
 
martywalAuthor Commented:
Hi experts, I've managed to pull the GANTT through (as it's the first image this seems ok) I'm wondering how to grab ALL images.

From the above script I've used wdDoc.inlineshapes(1) & wdDoc.shapes(1) and both are getting me the GANTT - it's great but I need to get the remaining images.

Is there a way that I can use the text in the document (e.g. after 2.1 I need to get the pictures of the 2 tables located in that section...)?

Thanks again

Martywal
0
 
GrahamSkanCommented:
There are several improvements to make in your code.

1. Have Option Explicit at the top of the code
2. Use Early Binding, at least  in the development phase
3. Use For Each when stepping through all the members of a collection
4. Declare variables as tightly as possible (at least  in the development phase)
5. Try to avoid Selecting the range in question

i'll try to work through your code to incorporate the recommendations
0
 
GrahamSkanCommented:
Early Binding will require a reference (Tools/References) to the relevant library. In this case it is the Microsoft Word x.x Object Library. Intellisense and F1 Help are extended to the referenced application. It is recommended that Late Binding be used for distribution in case of differences between library versions. This would require 'foreign' objects such as Word.Table being declared as objects. Also any built-in constants used would require declaration.

Here you can show the code in a snippet box  by selecting the text and clicking 'CODE' in the tool bar above the edit box.
Option Explicit

Sub ImportWordTable()
'Late Binding
'Dim wdDoc As Object

'Early Binding
Dim wdDoc As Word.Document

Dim wdFileName As Variant
'Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
Dim wdTable As Word.Table
Dim wdRow As Word.Row
Dim wdCell As Word.Cell
Dim xlSheet As Worksheet

'Supressing errors makes debugging difficult
'On Error Resume Next

Application.ScreenUpdating = False
    
Set xlSheet = Worksheets("TableImporter")
 'No Need to select the sheet
    xlSheet.Range("A:N").ClearContents
    
    wdFileName = Application.GetOpenFilename("Word Documents, *.doc*", , _
    "Browse for Project Handbook that will be used to create the Automated Business Case!", "OK")
    
    If wdFileName = False Then Exit Sub '(user cancelled import file browser)
    
    Set wdDoc = GetObject(wdFileName) 'open Word file
    
    
    With wdDoc
        If .Tables.Count = 0 Then
            MsgBox "This document is not an Automated Handbook", _
            vbExclamation, "Import Word Table"
        Else
            resultRow = 4
            For Each wdTable In wdDoc.Tables
                'Call showProgress
                With wdTable
                    For Each wdRow In wdTable.Rows
                    'copy cell contents from Word table cells to Excel cells
                        DoEvents
                        'Call showProgress
                        DoEvents
                        For Each wdCell In wdRow.Cells
                            xlSheet.Cells(resultRow, wdCell.ColumnIndex) = WorksheetFunction.Clean(wdCell.Range.Text)
                        Next wdCell
                        resultRow = resultRow + 1
                    Next wdRow
                End With
            Next wdTable
        End If
    End With
Application.ScreenUpdating = True
End Sub

Open in new window

Oh, and don't suppress errors. If an error is raised, it is best to know straight away.
0
 
martywalAuthor Commented:
Hi GrahamSkan, I should probably add that the script I added is working what I've got it doing is opening the target word document and bringing all the tables in to the "TableImporter" worksheet. In that worksheet there are a number of formulas that then map the values from the tables into my template. I'm happy enough with that portion.

The issue is that wdDoc.inlineshapes(1) & wdDoc.shapes(1) are only bringing me the first image (the GANTT) I have to try and get the other images to and wondering where (even from your initial functions) they need to go to bring the remaining images through.

I'm starting to think I may have bitten off more than I can chew...
0
 
GrahamSkanCommented:
There are four Shapes and four inline shapes in the sample document. The Shapes are all the blue line drawings in the Costs section.

You could step through the InLine shapes in the document:
    For Each ilsh In doc.InlineShapes
        DoEvents
        ilsh.Range.Copy
        MsgBox "Inline shape now in clipboard"
    Next ilsh

Open in new window

but how will you differentiate between them?

This code divides the document into blocks separated at the your level 2 style headings and puts each into the clipboard
Sub GetInLineShapes(doc As Word.Document)
Dim ilsh As Word.InlineShape
Dim rng As Word.Range
Dim rng1 As Range
Dim rngBlocks() As Word.Range
Dim strHeadings() As String
Dim b As Integer

Set rng = doc.Range
With rng.Find
    .Style = "Heading 2,heading 2,Intro Text Bold,proj2,proj21,proj22,proj23,proj24,proj25,proj26,proj27,proj28,proj29,proj210,proj211,proj212,proj221,proj231,proj241,proj251,proj261,proj271,proj281,proj291,proj2101,proj2111,proj213,proj222,proj232,proj242,proj252,2"
    Do While .Execute()
        DoEvents
        Set rng1 = rng.Duplicate
        'Debug.Print rng1.Start, rng1.End, rng1.Text
        ReDim Preserve strHeadings(b)
        ReDim Preserve rngBlocks(b)
        strHeadings(b) = rng1.Text
        Set rngBlocks(b) = rng1 'range of heading only
        If b > 0 Then
            rngBlocks(b - 1).End = rng1.Start 'extend range of previous block to the start of current one
        End If
        b = b + 1
    Loop
    rngBlocks(b - 1).End = doc.Range.End
End With
For b = 0 To UBound(strHeadings)
    For Each ilsh In rngBlocks(b).InlineShapes
        DoEvents
        ilsh.Range.Copy
        MsgBox "Inline shape under " & strHeadings(b) & " now in clipboard"
    Next ilsh
Next b
    
End Sub

Open in new window

1
 
martywalAuthor Commented:
I'll be back tonight. Thanks again! You guys rock :-)
0
 
martywalAuthor Commented:
Hi GrahamSkan,
I'm not to sure where the "GetInLineShapes" needs to go? I've added it as a module but I'm getting a compile error.
Sorry hope that's not too silly a question
0
 
martywalAuthor Commented:
Hi GrahamSkan, I've opened a new question as I think this one is probably too open. The new one (28963147) focuses only on the image import. This way you'll get the points for this one and the other one may get more hits. I'll close this and allocate the points if that sounds ok to you. Thanks again
0
 
GrahamSkanCommented:
Just put the procedure in the same module as the rest of the code.

I guess that you are getting 'User-defined type not defined'. I usually develop using early binding. Please add a reference to the Microsoft Word Object library (Tools/References in the VBA IDE). This will set up the early binding for Word objects. This means that the will have Intellisence and F1 Help for Word objects as Excel objects will already have.
0
 
martywalAuthor Commented:
Ok where do I then call the functions though?
0
 
martywalAuthor Commented:
I also think that there are too many streams to follow in this trail hence the more direct question update. Sorry?
0
 
GrahamSkanCommented:
You can call it as soon as you have your Word document object, so you can put this at about line 33 in your original code:
GetInLineShapes wdDoc

Open in new window

Be aware, that it is illustrative, in that it shows one way of getting at the inline shapes. It doesn't actually do anything but report.
0
 
martywalAuthor Commented:
I'm closing this question and awarding the points as it has significantly deviated from the initial question. As mentioned above, I've opened a new question that only looks at the importing of images from Word to Excel. Thanks for all your help.
Martywal
0
 
martywalAuthor Commented:
I'm closing this question and awarding the points as it has significantly deviated from the initial question. As mentioned above, I've opened a new question that only looks at the importing of images from Word to Excel. Thanks for all your help.
Martywal
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 16
  • 9
Tackle projects and never again get stuck behind a technical roadblock.
Join Now