joex

Convert HTML to RTF, remove tables, copy RTF and text to clipboard

If anyone knows how to implement the VB.NET code below in C++, please let me know.

Thanks.

            Dim fileName as String = "htmlToRtf"
            Dim wd As Object = CreateObject("word.application")

            ' open html in word
            wd.Documents.Open(fileName:=fileName + ".htm")
            ' remove tables
            Dim tblToConvert As Object
            For Each tblToConvert In wd.ActiveDocument.Tables
                tblToConvert.ConvertToText(vbTab, True)
             Next
             
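            ' create RTF and text files
            ' wd is late-bound, so Word's WdSaveFormat values are spelled out here
            Const wdFormatRTF As Integer = 6
            Const wdFormatText As Integer = 2
            wd.ActiveDocument.SaveAs(fileName:=fileName + ".rtf", FileFormat:=wdFormatRTF)
            wd.ActiveDocument.SaveAs(fileName:=fileName + ".txt", FileFormat:=wdFormatText)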
            wd.quit()
            wd = Nothing

            ' read in RTF and text files
            Dim rtfFile As String
            Dim sr As System.IO.StreamReader = New System.IO.StreamReader(fileName + ".rtf")
            rtfFile = sr.ReadToEnd
            sr.Close()

            Dim textFile As String
            sr = New System.IO.StreamReader(fileName + ".txt")
            textFile = sr.ReadToEnd
            sr.Close()

            ' store RTF and text in clipboard
            Dim dataObject As New DataObject
            dataObject.SetData(DataFormats.Text, True, textFile)
            dataObject.SetData(DataFormats.Rtf, True, rtfFile)
            Clipboard.SetDataObject(dataObject)

     
joex

ASKER

The aforementioned link pertains to accessing a VB DLL from a VC++ DLL.

This question is about implementing the HTML-to-RTF conversion in C++ rather than VB.NET.



Check out this link:
http://www.codeproject.com/com/word_ole_bm.asp

It sets you up with an OLE project and shows how to use OLE automation, the technology that lets you do what you need with Microsoft Word. That's the first step.

The next few steps are pretty basic C++ string and file handling routines.
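
As a rough illustration of the file handling step, here is a minimal sketch that reads a whole file into a string, which is what the StreamReader.ReadToEnd calls in the question do. The function name readWholeFile is made up for this example, and it assumes the .rtf and .txt files saved by Word are small enough to hold in memory.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads an entire file into a std::string, byte for byte.
// Binary mode avoids newline translation, so the RTF stays exactly as saved.
std::string readWholeFile(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();   // copy the whole stream into the buffer
    return buffer.str();
}
```

Calling it once for "htmlToRtf.rtf" and once for "htmlToRtf.txt" would give the two strings that go to the clipboard in the last step.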

Then the final step is to copy to the clipboard, and a simple tutorial exists here:
http://www.codeproject.com/clipboard/archerclipboard1.asp

Hope this helps!
joex

ASKER

Thanks a lot.

My plan is to look into this over the weekend.

I want to incorporate this into a DLL. Should the DLL be of a certain type, for instance an MFC DLL?
ASKER CERTIFIED SOLUTION
thegilb

joex

ASKER

Most of the clipboard code samples simply set the clipboard to one type of data.

It is not clear how to do the following in C++, where two formats are placed on the clipboard together:

            dataObject.SetData(DataFormats.Text, True, textFile)
            dataObject.SetData(DataFormats.Rtf, True, rtfFile)
            Clipboard.SetDataObject(dataObject)
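
One possible way to do this with the plain Win32 clipboard API is to call SetClipboardData once per format between a single OpenClipboard/EmptyClipboard and CloseClipboard. The sketch below is not tested against Word output; the names withTerminator, putFormat and copyTextAndRtf are invented for illustration, the text is assumed to be ANSI (CF_TEXT), and "Rich Text Format" is the registered format name that DataFormats.Rtf stands for. The Windows part is guarded so that only the small helper compiles elsewhere.

```cpp
#include <cstring>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: the bytes handed to the clipboard are the string's
// contents plus a terminating NUL. CF_TEXT data must be NUL-terminated;
// the same is done for the RTF here.
std::vector<char> withTerminator(const std::string& s) {
    std::vector<char> bytes(s.begin(), s.end());
    bytes.push_back('\0');
    return bytes;
}

#ifdef _WIN32
// Copies the bytes into a movable global block and hands it to the clipboard.
static bool putFormat(UINT format, const std::vector<char>& bytes) {
    HGLOBAL mem = GlobalAlloc(GMEM_MOVEABLE, bytes.size());
    if (!mem) return false;
    void* dst = GlobalLock(mem);
    if (!dst) { GlobalFree(mem); return false; }
    std::memcpy(dst, &bytes[0], bytes.size());
    GlobalUnlock(mem);
    if (!SetClipboardData(format, mem)) {
        GlobalFree(mem);    // the block is still ours if the call failed
        return false;
    }
    return true;            // on success the clipboard owns the block
}

// Puts plain text and RTF on the clipboard together. A real window handle is
// taken as a parameter because the OpenClipboard documentation warns that
// opening with NULL can make SetClipboardData fail after EmptyClipboard.
bool copyTextAndRtf(HWND owner, const std::string& text, const std::string& rtf) {
    const UINT cfRtf = RegisterClipboardFormatA("Rich Text Format");
    if (cfRtf == 0 || !OpenClipboard(owner)) return false;
    EmptyClipboard();
    const bool okText = putFormat(CF_TEXT, withTerminator(text));
    const bool okRtf = putFormat(cfRtf, withTerminator(rtf));
    CloseClipboard();
    return okText && okRtf;
}
#endif
```

An application that pastes then picks whichever of the two formats it understands, which is the same effect as the two SetData calls on one DataObject.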



joex

ASKER

The project this link references does not appear to build successfully under Visual Studio .NET 2003:

      http://www.codeproject.com/com/word_ole_bm.asp
joex

ASKER

Also, the code in the project referenced by the above link does not appear to support converting tables to text, which the following critical step needs:
           ' remove tables
            Dim tblToConvert As Object
            For Each tblToConvert In wd.ActiveDocument.Tables
                tblToConvert.ConvertToText(vbTab, True)
             Next
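
To make clear what this step is expected to produce, here is a small stand-alone illustration of the result of converting a table to text with a tab separator: cells in a row joined by the separator, one row per line. It does not call Word and is not how Word implements ConvertToText; the Row type and the tableToText name are invented for this example, and '\n' stands in for Word's paragraph mark.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one table row: a list of cell texts.
typedef std::vector<std::string> Row;

// Joins the cells of each row with the separator and ends each row with a
// newline, mimicking the shape of the text left behind by the conversion.
std::string tableToText(const std::vector<Row>& table, char separator) {
    std::string out;
    for (std::size_t r = 0; r < table.size(); ++r) {
        for (std::size_t c = 0; c < table[r].size(); ++c) {
            if (c > 0) out += separator;   // separator between cells only
            out += table[r][c];
        }
        out += '\n';                       // one line per table row
    }
    return out;
}
```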

joex

ASKER

The following link includes a project with a file that contains the ConvertToText method:

    http://www.codeproject.com/com/AutoSpellCheck.asp
How far have you progressed with this problem now?
joex

ASKER

Not very far. msword9.h contains ConvertToText, the method that supports removing tables. The call itself looks straightforward, but opening the Word document is a problem.
Well, post your code and let's have a look.
joex

ASKER

The following links are useful for learning how to access Word from .NET C++:

http://support.microsoft.com/kb/308338/
http://support.microsoft.com/kb/307473/EN-US/ 

Some issues remain to be resolved, but this is a useful start.