joex
 asked on

Convert HTML to RTF, remove tables, copy RTF and text to clipboard

If anyone knows how to implement the VB.Net code below in C++, please let me know.

Thanks.

            ' WdSaveFormat values, defined here in case they are not already in scope
            ' (Word is late-bound via CreateObject, so its constants are not imported)
            Const wdFormatText As Integer = 2
            Const wdFormatRTF As Integer = 6

            Dim fileName As String = "htmlToRtf"
            Dim wd As Object = CreateObject("Word.Application")

            ' open html in word
            wd.Documents.Open(fileName:=fileName + ".htm")
            ' remove tables
            Dim tblToConvert As Object
            For Each tblToConvert In wd.ActiveDocument.Tables
                tblToConvert.ConvertToText(vbTab, True)
            Next

            ' create RTF and text files
            wd.ActiveDocument.SaveAs(fileName:=fileName + ".rtf", FileFormat:=wdFormatRTF)
            wd.ActiveDocument.SaveAs(fileName:=fileName + ".txt", FileFormat:=wdFormatText)
            wd.Quit()
            wd = Nothing

            ' read in RTF and text files
            Dim rtfFile As String
            Dim sr As System.IO.StreamReader = New System.IO.StreamReader(fileName + ".rtf")
            rtfFile = sr.ReadToEnd
            sr.Close()

            Dim textFile As String
            sr = New System.IO.StreamReader(fileName + ".txt")
            textFile = sr.ReadToEnd
            sr.Close()

            ' store RTF and text in clipboard
            Dim dataObject As New DataObject
            dataObject.SetData(DataFormats.Text, True, textFile)
            dataObject.SetData(DataFormats.Rtf, True, rtfFile)
            Clipboard.SetDataObject(dataObject)

     
C++

Last Comment
joex

8/22/2022 - Mon
joex

ASKER
The aforementioned link pertains to accessing a VB DLL from a VC++ DLL.

This question pertains to implementing an HTML-to-RTF conversion in C++ rather than VB.Net.



thegilb

Check out this link:
http://www.codeproject.com/com/word_ole_bm.asp

That sets you up with an OLE project and shows you how to use OLE automation, the technology that lets you drive Microsoft Word from your own code. That's the first step.

The next few steps are pretty basic C++ string and file handling routines.
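
For the file-handling step, a minimal sketch of the C++ counterpart to the StreamReader.ReadToEnd calls in the question. The helper name readFile is hypothetical, and the sketch assumes the RTF and text files saved by Word can be treated as plain bytes:

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Read an entire file into a std::string, byte for byte.
// Throws std::runtime_error if the file cannot be opened.
std::string readFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the whole stream into the buffer
    return buffer.str();
}
```

With this in place, something like `std::string rtfFile = readFile("htmlToRtf.rtf");` would mirror the VB.Net code. The text file's encoding depends on how Word saved it, so it may need converting before use.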

Then the final step is to copy to the clipboard, and a simple tutorial exists here:
http://www.codeproject.com/clipboard/archerclipboard1.asp

Hope this helps!
joex

ASKER
Thanks a lot.

My plan is to look into this over the weekend.

If I incorporate this into a DLL, should the DLL be of a particular type, for instance an MFC DLL?
Your help has saved me hundreds of hours of internet surfing.
ASKER CERTIFIED SOLUTION
thegilb

joex

ASKER
Most of the clipboard code samples simply set the clipboard to one type of data.

It is not clear how to do the following in C++:

            dataObject.SetData(DataFormats.Text, True, textFile)
            dataObject.SetData(DataFormats.Rtf, True, rtfFile)
            Clipboard.SetDataObject(dataObject)
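
One possible Win32 counterpart, offered as a sketch rather than a tested solution: between OpenClipboard and CloseClipboard, SetClipboardData can be called once per format, so plain text (CF_TEXT) and RTF (the registered format named "Rich Text Format") can be placed on the clipboard together. The names copyTextAndRtf, putOnClipboard and kRtfFormatName are hypothetical, the owner window is passed as a void* that is assumed to be a valid HWND, and the non-Windows branch is only a stub so the sketch compiles elsewhere.

```cpp
#include <cstring>
#include <string>

// Name under which RTF-aware applications register their clipboard format.
const char* const kRtfFormatName = "Rich Text Format";

#ifdef _WIN32
#include <windows.h>

// Copy a NUL-terminated string into movable global memory and hand it to
// the clipboard under the given format. The clipboard must already be open.
static bool putOnClipboard(UINT format, const std::string& data) {
    HGLOBAL mem = GlobalAlloc(GMEM_MOVEABLE, data.size() + 1);
    if (!mem) return false;
    void* dst = GlobalLock(mem);
    if (!dst) {
        GlobalFree(mem);
        return false;
    }
    std::memcpy(dst, data.c_str(), data.size() + 1);
    GlobalUnlock(mem);
    if (!SetClipboardData(format, mem)) {
        GlobalFree(mem);  // the system takes ownership only on success
        return false;
    }
    return true;
}

// Place both representations on the clipboard in one open/close cycle.
// ownerWindow is assumed to be the HWND of a window owned by the caller.
bool copyTextAndRtf(void* ownerWindow, const std::string& text,
                    const std::string& rtf) {
    const UINT cfRtf = RegisterClipboardFormatA(kRtfFormatName);
    if (cfRtf == 0) return false;
    if (!OpenClipboard(static_cast<HWND>(ownerWindow))) return false;
    bool ok = EmptyClipboard() != 0;
    if (ok) {
        ok = putOnClipboard(CF_TEXT, text);
        ok = putOnClipboard(cfRtf, rtf) && ok;
    }
    CloseClipboard();
    return ok;
}
#else
// Stub for non-Windows builds: there is no Win32 clipboard to write to.
bool copyTextAndRtf(void*, const std::string&, const std::string&) {
    return false;
}
#endif
```

An application pasting from the clipboard then picks whichever format it understands, which is the same effect the two SetData calls have in the VB.Net version.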



joex

ASKER
The project referenced by this link does not appear to build successfully under Visual Studio .NET 2003:

      http://www.codeproject.com/com/word_ole_bm.asp
joex

ASKER
Also, the code in the project referenced by the above link does not appear to include anything for converting tables to text, which is needed for the following critical step:

            ' remove tables
            Dim tblToConvert As Object
            For Each tblToConvert In wd.ActiveDocument.Tables
                tblToConvert.ConvertToText(vbTab, True)
            Next

joex

ASKER
The following link includes a project with a file that contains the ConvertToText method:

    http://www.codeproject.com/com/AutoSpellCheck.asp
thegilb

How far have you progressed with this problem now?
joex

ASKER
Not very far. msword9.h contains the method that supports the removal of tables, ConvertToText, but even though the call looks straightforward, opening the Word document is a problem.
thegilb

Well, post your code and let's have a look.
joex

ASKER
The following links are useful for learning how to access Word from .NET C++:

http://support.microsoft.com/kb/308338/
http://support.microsoft.com/kb/307473/EN-US/ 

There are issues that remain to be resolved, but this is a useful start.