splitting up a word document into multiple pdfs using vbs script

Hi,

I have a number of large word documents that need to be cut up into multiple peices at ad hoc intervals so will need to use a delimeter of some sort eg. Doc1Start filename1 [data] Doc1End, Doc2 Start filename2 [data] Doc2 End.

I need to run a vbs script to open up the file and then create a new word document for each designated section using the file name designated.

Any help?

Cheers,
Shap.
PrimedWebbieAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

BovardThoCommented:
Are the files csv types or generic word letter documents and how would we identify where each file would need to be cut? For example is there a line of text which would be used to locate or a page number.

Once the above is answered, you just want a vbs script to loop through the .doc files in a folder and split based on the above criteria and then place in a completed folder?

0
GrahamSkanRetiredCommented:
If it's a Word document, you will need Word to work on it. Given that, you might as well use a Word VBA macro instead of VBS.  Insert Next Page Section Breaks at the points that you need the document to be split, and then run this macro.
Sub SplitIntoSections()
    Dim sec As Word.Section
    Dim rng As Word.Range
    Dim doc As Word.Document
    
    For Each sec In ActiveDocument.Sections
        Set rng = sec.Range
        If rng.End < ActiveDocument.Endnotes Then
            rng.MoveEnd wdCharacter, -1
        End If
    Set doc = Documents.Add
    rng.Copy
    doc.Range.Paste
    doc.SaveAs "C:\myfolder\MyFile" & sec.Index & ".doc"
    doc.Close wdDoNotSaveChanges
End Sub
    

Open in new window

0
PrimedWebbieAuthor Commented:
BrovardTho yep, there is. The delimeter would be put into the code.
Eg.
      For SectionNumber = 1 to NumberOfSections
      StartString = "StartingSection"  & CHR(SectionNumber)
      EndString = "EndingSection" & CHR(SectionNumber)

GrahamSkan, Unfortunately i have many documents which i need to open and close and then do other functions like print to pdf and create other html code from these so i am trying to steer away from macros.
0
Big Business Goals? Which KPIs Will Help You

The most successful MSPs rely on metrics – known as key performance indicators (KPIs) – for making informed decisions that help their businesses thrive, rather than just survive. This eBook provides an overview of the most important KPIs used by top MSPs.

PrimedWebbieAuthor Commented:
Another note Graham, because there are parts of the document at the start and the end we are not wanting to use, we don't want to create a separate document for those peices of the large document.
0
GrahamSkanRetiredCommented:
This is the macro adapted to VBS
    Dim docIn 'As Word.Document
    Dim wdApp 'As Word.Application
    Dim sec 'As Word.Section
    Dim rng 'As Word.Range
    Dim doc 'As Word.Document

    Const wdDoNotSaveChanges = 0
    Const wdCharacter = 1

    Set wdApp = CreateObject("Word.Application")
    wdApp.Visible = True
    Set docIn = wdApp.Documents.Open("C:\MyFolder\MyInFile.Doc")
    For Each sec In docIn.Sections
        Set rng = sec.Range
        If rng.End < docIn.Range.End Then
            rng.MoveEnd wdCharacter, -1
        End If
        Set doc = wdApp.Documents.Add
        rng.Copy
        doc.Range.Paste
        doc.SaveAs "C:\myfolder\MyOutFile" & sec.Index & ".doc"
        doc.Close wdDoNotSaveChanges
    Next 'sec
    docIn.Close wdDoNotSaveChanges
    wdApp.Quit

Open in new window

0
GrahamSkanRetiredCommented:
You could then miss out the first and last sections


    Dim docIn 'As Word.Document
    Dim wdApp 'As Word.Application
    Dim sec 'As Word.Section
    Dim rng 'As Word.Range
    Dim doc 'As Word.Document
    
    Set wdApp = CreateObject("Word.Application")
        Set docIn = wdApp.Documents.Open("C:\MyFolder\MyInFile.Doc")
            For Each sec In docIn.Sections
                If sec.Index > 1 And sec.Index < docIn.Sections.Count Then
                    Set rng = sec.Range
                    rng.MoveEnd wdCharacter, -1 'omit section break
                    Set doc = Documents.Add
                    rng.Copy
                    doc.Range.Paste
                    doc.SaveAs "C:\myfolder\MyOutFile" & sec.Index & ".doc"
                    doc.Close wdDoNotSaveChanges
                End If
            Next sec
        docIn.Close wdDoNotSaveChanges
    wdApp.Quit

Open in new window

0
PrimedWebbieAuthor Commented:
Hi Graham,
I've been getting wierd memory issues when i run your code.  Its like its doing something in the background. ie. Process winword is running.  Also last bit of code you wrote has a syntax error in line 19 when i run it.
0
PrimedWebbieAuthor Commented:
I guess a better way of explaining this now is this.
I need to copy the content in a word document between two tags and then either assign it to a variable for use in the code or to paste into a new document.
Eg. Word document is 100 pages long.
on page 3 there is a tag [sectionTitle1]A really kewl document[/sectionTitle1] [startOfContent1] yada...picture....tables and stuff....[/endOfContent1]

I then need to use the variable sectionTitle in a number of places so need it stored in a variable.
I then need to paste the data between the content tags into a new word document which is actually a template.

Hope that makes it clearer why i can't use sections, but thanks for efforts so far.
0
PrimedWebbieAuthor Commented:
Hi,

I have worked on the code a bit and have this function i did ages ago in VB.  I need to convert it into VBScript. The problem is I don't know the equivalents.


Sub CopyNotes(objWord,ThisSectionNumber,ThisDataType)
    
	
	' objWord is the file already open 
	' ThisSectionNumber is the current section of the document i'm wanting to copy
	' ThisDataType is the string value of the tag eg. [section] and [/section]
	
    Dim strStart 
    Dim strEnd 
    
    strStart = "[" & ThisDataType & CStr(ThisSectionNumber) & "]"   '
    strEnd = "[/" & ThisDataType & CStr(ThisSectionNumber) & "]"        

    objWord.Selection.Find.ClearFormatting  'find start in doc
    With objWord.Selection.Find
        .Text = strStart
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
    End With
    objWord.Selection.Find.Execute
    objWord.Selection.Extend                'extend the selection to the end word
    objWord.Selection.Find.ClearFormatting
    With objWord.Selection.Find
        .Text = strEnd
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindAsk
    End With
    objWord.Selection.Find.Execute
    objWord.Selection.MoveStart wdCharacter, Len(strStart) + 1     'adjust selection to ignore
    objWord.Selection.MoveEnd wdCharacter, -1 * (Len(strEnd) + 1)  'start and end identifiers
    objWord.Selection.Copy
    
End Sub

Open in new window

0
GrahamSkanRetiredCommented:
Do the parts inside the square brackets confor to exactly the pattern implied, i.e.
"sectionTitle1", "/sectionTitle1", "startOfContent1","/endOfContent1" and then "sectionTitle2",  "/sectionTitle2", etc?

0
GrahamSkanRetiredCommented:
This is a Word macro, again, written with a view to converting it to VBS. It is untested. If it doesn't work and/or you need it converted fully to VBS, it would help to see a sample document.
0
PrimedWebbieAuthor Commented:
Hi Graham, yes to your first question re conforming to pattern.

Re the second comment re untested macro....can't see it. Did you mean to attach it to your message?
0
GrahamSkanRetiredCommented:
Duh!
Sub SplitIntoSections2()
    Dim docIn As Word.Document 'Word Macro - As Word.Document not needed for VBS
    Dim wdApp As Word.Application 'Word Macro
    Dim rng As Word.Range 'Word Macro
    Dim rngContent As Word.Range 'Word Macro
    Dim rngTitle As Word.Range 'Word Macro
    Dim docOut As Word.Document 'Word Macro
    Dim sec As Long 'Word Macro
    Dim strTitle As String
    
    'Const wdCollapseEnd = 0 'VBS
    'Const wdDoNotSaveChanges = 0 'VBS
    'Const wdCharacter = 1 'VBS
    
    sec = 1
    'Set wdApp = CreateObject("Word.Application") 'VBS
    Set wdApp = Application 'Word Macro
    Set docIn = wdApp.Documents.Open("C:\MyFolder\MyInFile.Doc")
    Set rng = docIn.Range
    Do
        With rng.Find
            'Title
            .Text = "[sectionTitle" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngTitle = rng
             .Text = "[/sectionTitle" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngTitle.End = rng.Start
            rng.Collapse wdCollapseEnd
            strTitle = rngTitle.Text
            
            'Content
            .Text = "[startOfContent" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngContent = rng
            .Text = "[/endOfContent" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngContent.End = rng.Start
            rng.Collapse wdCollapseEnd
            
            'New document
            Set docOut = wdApp.Documents.Add("C:\MyFolder\.MyTemplate.dot")
            rngContent.Copy
            docOut.Bookmarks("\EndOfDoc").Range.Paste
            docOut.SaveAs "C:\MyFolder\" & strTitle & ".doc"
            docOut.Close wdDoNotSaveChanges
        End With
    Loop While True
    docIn.Close wdDoNotSaveChanges
    'wdApp.Quit 'VBS

End Sub

Open in new window

0
PrimedWebbieAuthor Commented:
Hi There,
When i try to run this code nothing happens at all. Do i comment out the Word Macro only items?
I need it to run in vbscript.
0
GrahamSkanRetiredCommented:
If it doesn't work in VBA, there is no reason to believe that it will be any better in VBScript.

I have created a document that I think matches your input document. The markup text is in red font.

The new snippet has a couple of tweaks and works with the document.

With VBA, as opposed to VBS, you can step through the code with the F8 key to see what is happening and hence debug it. If you don't feel up to that, I suggest that you post a sample document.


Sub SplitIntoSections2a()
    Dim docIn As Word.Document 'Word Macro - As Word.Document not needed for VBS
    Dim wdApp As Word.Application 'Word Macro
    Dim rng As Word.Range 'Word Macro
    Dim rngContent As Word.Range 'Word Macro
    Dim rngTitle As Word.Range 'Word Macro
    Dim docOut As Word.Document 'Word Macro
    Dim sec As Long 'Word Macro
    Dim strTitle As String
    
    'Const wdCollapseEnd = 0 'VBS
    'Const wdDoNotSaveChanges = 0 'VBS
    'Const wdCharacter = 1 'VBS
    
    sec = 1
    'Set wdApp = CreateObject("Word.Application") 'VBS
    Set wdApp = Application 'Word Macro
    Set docIn = wdApp.Documents.Open("C:\MyFolder\MyInFile.Doc")
    Set rng = docIn.Range
    Do
        With rng.Find
            'Title
            .Text = "[sectionTitle" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngTitle = rng.Duplicate
             .Text = "[/sectionTitle" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngTitle.End = rng.Start
            rng.Collapse wdCollapseEnd
            strTitle = rngTitle.Text
            
            'Content
            .Text = "[startOfContent" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngContent = rng.Duplicate
            .Text = "[/endOfContent" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngContent.End = rng.Start
            rng.Collapse wdCollapseEnd
            
            'New document
            Set docOut = wdApp.Documents.Add("C:\MyFolder\MyTemplate.dot")
            rngContent.Copy
            docOut.Bookmarks("\EndOfDoc").Range.Paste
            docOut.SaveAs "C:\MyFolder\" & strTitle & ".doc"
            docOut.Close wdDoNotSaveChanges
            sec = sec + 1
        End With
    Loop While True
    docIn.Close wdDoNotSaveChanges
    'wdApp.Quit 'VBS

End Sub

Open in new window

MyInFile.doc
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
PrimedWebbieAuthor Commented:
Ok, i'm almost there. Thanks for your help.
I've rewritten everything and its splitting up the files as requested perfectly, but the thing is now its only now restricted to one file. So i am trying to pass in the file path as a string but its not working.  Can you call argumented macros from a cmd line .  Eg.  Run "thismacro(filename)" ???
0
GrahamSkanRetiredCommented:
If so, I don't know how.

You could put all the files in one folder and process them all. I think that the VBS version of this macro would have to use FSO instead of the Dir$() function.
Sub SplitIntoSections2b()
    Dim docIn As Word.Document 'Word Macro - As Word.Document not needed for VBS
    Dim wdApp As Word.Application 'Word Macro
    Dim rng As Word.Range 'Word Macro
    Dim rngContent As Word.Range 'Word Macro
    Dim rngTitle As Word.Range 'Word Macro
    Dim docOut As Word.Document 'Word Macro
    Dim sec As Long 'Word Macro
    Dim strTitle As String
    
    Dim strFile As String
    Const strInfolder = "C:\MyInFolder\"

    'Const wdCollapseEnd = 0 'VBS
    'Const wdDoNotSaveChanges = 0 'VBS
    'Const wdCharacter = 1 'VBS
    
    sec = 1
    'Set wdApp = CreateObject("Word.Application") 'VBS
    Set wdApp = Application 'Word Macro
    strFile = Dir$(strInfolder & "*.doc*")
    Do Until strFile = ""
        Set docIn = wdApp.Documents.Open(strInfolder & strFile)
        Set rng = docIn.Range
        Do
            With rng.Find
                'Title
                .Text = "[sectionTitle" & sec & "]"
                If Not .Execute Then
                    Exit Do
                End If
                rng.Collapse wdCollapseEnd
                Set rngTitle = rng.Duplicate
                 .Text = "[/sectionTitle" & sec & "]"
                If Not .Execute Then
                    Exit Do
                End If
                rngTitle.End = rng.Start
                rng.Collapse wdCollapseEnd
                strTitle = rngTitle.Text
                
                'Content
                .Text = "[startOfContent" & sec & "]"
                If Not .Execute Then
                    Exit Do
                End If
                rng.Collapse wdCollapseEnd
                Set rngContent = rng.Duplicate
                .Text = "[/endOfContent" & sec & "]"
                If Not .Execute Then
                    Exit Do
                End If
                rngContent.End = rng.Start
                rng.Collapse wdCollapseEnd
                
                'New document
                Set docOut = wdApp.Documents.Add("C:\MyTemplateFolder\MyTemplate.dot")
                rngContent.Copy
                docOut.Bookmarks("\EndOfDoc").Range.Paste
                docOut.SaveAs "C:\MyOutFolder\" & strTitle & ".doc"
                docOut.Close wdDoNotSaveChanges
                sec = sec + 1
            End With
        Loop While True
        
        docIn.Close wdDoNotSaveChanges
        strFile = Dir$()
    Loop
    'wdApp.Quit 'VBS

End Sub

Open in new window

0
PrimedWebbieAuthor Commented:
Graham your genius status is well deserved. Although i feel we were on different pages a lot of the time during this process, i did learn a lot from your guidance so thank you so very much.  The only problem i have now is that if there are more than two sections, it doesn't continue. Your thoughts?
0
GrahamSkanRetiredCommented:
As an attempt to fix the two-sections limit, try adding a line after line 54 in the snippet for comment 31172260 (above)

 54:               rng.Collapse wdCollapseEnd
 55:               rng.End =  docIn.Range.End            


If it doesn't work, please post a sample doc. Thanks
0
PrimedWebbieAuthor Commented:
Hi Graham,
Here is the cod i'm using. I added in the line you mentioned at the end, ie. rng.End=docin.Range.End.

Its not working still.

There are references to other functions which do other things.
Sub GetSectionDataMacro(thisMasterFileName)
    Dim docIn As Word.Document 'Word Macro - As Word.Document not needed for VBS
    Dim wdApp As Word.Application 'Word Macro
    Dim rng As Word.Range 'Word Macro
    Dim rngContent As Word.Range 'Word Macro
    Dim rngTitle As Word.Range 'Word Macro
    Dim rngInteractionNumber As Word.Range 'Word Macro
    Dim rngIntroText As Word.Range 'Word Macro
    Dim rngUnitCode As Word.Range 'Word Macro
    Dim docOut As Word.Document 'Word Macro
    Dim sec As Long 'Word Macro
    Dim strTitle As String
    Dim strCurrentInteractionNumber As String
    Dim strCurrentIntroText As String
    Dim strCurrentUnitCode As String
    Dim strCFMPath As String
    
    'Const wdCollapseEnd = 0 'VBS
    'Const wdDoNotSaveChanges = 0 'VBS
    'Const wdCharacter = 1 'VBS
     
    sec = 1
    'Set wdApp = CreateObject("Word.Application") 'VBS
    Set wdApp = Application 'Word Macro
    'Set docIn = wdApp.Documents.Open("e:\_epiq\masterDocs\Test.doc")
    'MsgBox ("Opening master file " & thisMasterFileName)
    Set docIn = wdApp.Documents.Open(thisMasterFileName)
    Set rng = docIn.Range
    Do
        With rng.Find
            'Title
            'MsgBox ("Getting the sectionTitle")
            .Text = "[sectionTitle" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngTitle = rng.Duplicate
             .Text = "[/sectionTitle" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngTitle.End = rng.Start
            rng.Collapse wdCollapseEnd
            strTitle = rngTitle.Text
            
            'MsgBox ("The strTitle is " & strTitle)
            
            'Interaction Number
            'MsgBox ("Getting the InteractionNumber")
            .Text = "[intnum" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngInteractionNumber = rng.Duplicate
             .Text = "[/intnum" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngInteractionNumber.End = rng.Start
            rng.Collapse wdCollapseEnd
            strCurrentInteractionNumber = rngInteractionNumber.Text
            
            'MsgBox ("The interaction number is " & strCurrentInteractionNumber)
            
            'UnitCode
            'MsgBox ("Getting the Unit Code")
            .Text = "[unitcode" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngUnitCode = rng.Duplicate
             .Text = "[/unitcode" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngUnitCode.End = rng.Start
            rng.Collapse wdCollapseEnd
            strCurrentUnitCode = rngUnitCode.Text
            
            'MsgBox ("The Unit Code is " & strCurrentUnitCode)
            
            'IntroText
            'MsgBox ("Getting the IntroText")
            .Text = "[introtext" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngIntroText = rng.Duplicate
             .Text = "[/introtext" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngIntroText.End = rng.Start
            rng.Collapse wdCollapseEnd
            strCurrentIntroText = rngIntroText.Text
            
            'MsgBox ("The Intro Text is " & strCurrentIntroText)
                        
            'Content
            'MsgBox ("Getting the Content")
            .Text = "[startOfContent" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rng.Collapse wdCollapseEnd
            Set rngContent = rng.Duplicate
            .Text = "[/endOfContent" & sec & "]"
            If Not .Execute Then
                Exit Do
            End If
            rngContent.End = rng.Start
            rng.End = docIn.Range.End
            rng.Collapse wdCollapseEnd
            
            'New document
            Set docOut = wdApp.Documents.Add("e:\_epiq\templates\epiq_template_portrait.doc")
            rngContent.Copy
            docOut.Bookmarks("\EndOfDoc").Range.Paste
            docOut.SaveAs "e:\_epiq\exportedDocs\" & strTitle & ".doc"
            docOut.Close wdDoNotSaveChanges
            sec = sec + 1
        End With
        
        strCFMPath = "e:\_epiq\CFMS\"
        
        Call CreateTheCFM(strCurrentUnitCode, strCurrentInteractionNumber, strTitle, strCurrentIntroText, strCFMPath)
        
        
    Loop While True
    docIn.Close wdDoNotSaveChanges

Call CreateThePDFs
Call MoveThePDFs

'wdApp.Quit 'VBS

End Sub

Open in new window

0
PrimedWebbieAuthor Commented:
Hi Graham, all good and all working. I had the statements around the wrong way. Wish i could add more points as you totally deserve. Thank you.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Computer Games

From novice to tech pro — start learning today.