Word segment to html


I am trying to write some code in a VB macro to take a Word document and save it as an HTML file without all the horrid Word HTML code.

It would need to strip out the images and save them as JPGs or GIFs (it doesn't really matter which), but most importantly it must keep the same formatting.

Any ideas?

Muhammad Khan (Manager, IT) commented:
You can use MS Word Automation to save the Word document as HTML. Visit the following link

I have cleaned up the script that aiklamha linked to as it was VBScript rather than VB/VBA.

This code is for controlling the Word application from another application, e.g. Excel:
Public Sub Doc2HTML()

' This subroutine opens a Word document,
' then saves it as HTML, and closes Word.
' If the HTML file exists, it is overwritten.
' If Word was already active, the subroutine
' will leave the other document(s) alone and
' close only its "own" document.
' Written by Rob van der Woude
' http://www.robvanderwoude.com
    ' Standard housekeeping
    Const DocFile As String = "C:\xxx.doc"
    Const wdFormatFilteredHTML As Integer = 10
    Dim objDoc As Object, objFile As Object, objFSO As Object, objWord As Object
    Dim strFile As String, strHTML As String

    ' Create a File System object
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    ' Create a Word object
    Set objWord = CreateObject("Word.Application")

    With objWord
        ' True: make Word visible; False: invisible
        .Visible = True

        ' Check if the Word document exists
        If objFSO.FileExists(DocFile) Then
            Set objFile = objFSO.GetFile(DocFile)
            strFile = objFile.Path
        Else
            MsgBox "FILE OPEN ERROR: The file does not exist"
            ' Close Word
            .Quit
            Exit Sub
        End If
        ' Build the fully qualified HTML file name
        strHTML = objFSO.BuildPath(objFile.ParentFolder, _
                  objFSO.GetBaseName(objFile) & ".html")

        ' Open the Word document
        .Documents.Open strFile

        ' Make the opened file the active document
        Set objDoc = .ActiveDocument

        ' Save as HTML
        objDoc.SaveAs strHTML, wdFormatFilteredHTML

        ' Close the active document
        objDoc.Close

        ' Close Word
        .Quit
    End With
End Sub

If you were in Word and wanted to run a macro on the active document, it would just be something like this:
Sub testtest()
Dim doc As Document

    Set doc = ActiveDocument
    doc.SaveAs2 "C:\Pathetc\name.html", wdFormatFilteredHTML

End Sub
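
On the images side of the question, Word's WebOptions settings can be adjusted before saving. The following is only a sketch: the output path is a placeholder, the Sub name is made up for illustration, and it assumes the macro runs inside Word so that wdFormatFilteredHTML is already defined. Setting AllowPNG to False should make Word export pictures as GIF/JPEG instead of PNG, and OrganizeInFolder keeps the picture files in a supporting folder next to the HTML file.

```vba
Sub SaveActiveDocAsFilteredHtml()
    ' Hypothetical output path; change it to suit.
    Const OutFile As String = "C:\Pathetc\name.html"
    Dim doc As Document

    Set doc = ActiveDocument

    With doc.WebOptions
        ' Do not allow PNG, so pictures are written as GIF/JPEG
        .AllowPNG = False
        ' Put pictures and other supporting files in a folder beside the HTML
        .OrganizeInFolder = True
    End With

    ' Filtered HTML is the leanest HTML format Word offers
    doc.SaveAs2 FileName:=OutFile, FileFormat:=wdFormatFilteredHTML
End Sub
```

This does not remove the inline styles that filtered HTML still contains; it only controls how the pictures are written out.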

PrimedWebbie (Author) commented:
Thanks, that works, but it's still not what I need as it has all of the HTML styles.
Muhammad Khan (Manager, IT) commented:
>>Thanks, that works, but it's still not what I need as it has all of the HTML styles.
Can you explain this? What do you mean by "all of the HTML styles"?
PrimedWebbie (Author) commented:
Sorry, I didn't clarify this myself.
When you save a document as HTML from Word, it puts in inline CSS and a heap of other stuff which is unnecessary. This can be removed by opening the file in Dreamweaver and clicking "Clean Up Word HTML". That's the type of stuff I can't have in the file.
The filtered version (the "Web Page, Filtered" save option) is the best option that Word has for creating an HTML version of a document with the minimum amount of extra code included. The other two options, saving as HTML and as MHTML, will put a lot more rubbish into the file.

Sorry but that is all there is - I can only recommend that you run them through Dreamweaver (as you suggested) to remove some of the unnecessary junk in them.
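
If Dreamweaver is not to hand, a rough post-processing pass can be done in VBA itself. This is a minimal sketch, not a replacement for a proper clean-up tool: the function and Sub names are made up for illustration, the file path is a placeholder matching the earlier example, and it assumes the saved file is ANSI-encoded. It removes the embedded style block and every class and style attribute with regular expressions, which also throws away the formatting those styles carried, so it only suits cases where clean markup matters more than appearance.

```vba
' Strip <style> blocks and class/style attributes from an HTML string.
' Late binding is used, so no extra references are needed.
Public Function StripWordAttributes(ByVal html As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True

    ' Remove embedded <style>...</style> blocks
    re.Pattern = "<style[\s\S]*?</style>"
    html = re.Replace(html, "")

    ' Remove class= and style= attributes, whether the value is
    ' double-quoted, single-quoted or unquoted
    re.Pattern = "\s+(?:class|style)\s*=\s*(?:""[^""]*""|'[^']*'|[^\s>]+)"
    html = re.Replace(html, "")

    StripWordAttributes = html
End Function

Public Sub CleanHtmlFile()
    ' Hypothetical path: the HTML file produced by the earlier macro
    Const HtmlFile As String = "C:\xxx.html"
    Dim fso As Object, ts As Object
    Dim html As String

    Set fso = CreateObject("Scripting.FileSystemObject")

    ' Read the whole file (1 = ForReading)
    Set ts = fso.OpenTextFile(HtmlFile, 1)
    html = ts.ReadAll
    ts.Close

    ' Overwrite the file with the cleaned markup
    Set ts = fso.CreateTextFile(HtmlFile, True)
    ts.Write StripWordAttributes(html)
    ts.Close
End Sub
```

Regular expressions do not really parse HTML, so text in the document body that happens to look like an attribute (for example " class=x") would be removed as well. Try it on a copy of the file first.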

PrimedWebbie (Author) commented:
Thanks. :)