joex
asked on
Convert HTML to RTF, remove tables, copy RTF and text to clipboard
If anyone knows how to implement the VB.NET code below in C++, please let me know.
Thanks.
Dim fileName As String = "htmlToRtf"
Dim wd As Object = CreateObject("word.application")
' open html in word
wd.Documents.Open(fileName:=fileName + ".htm")
' remove tables
Dim tblToConvert As Object
For Each tblToConvert In wd.ActiveDocument.Tables
    tblToConvert.ConvertToText(vbTab, True)
Next
' create RTF and text files
wd.ActiveDocument.SaveAs(fileName:=fileName + ".rtf", FileFormat:=wdFormatRTF)
wd.ActiveDocument.SaveAs(fileName:=fileName + ".txt", FileFormat:=wdFormatText)
wd.quit()
wd = Nothing
' read in RTF and text files
Dim rtfFile As String
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(fileName + ".rtf")
rtfFile = sr.ReadToEnd
sr.Close()
Dim textFile As String
sr = New System.IO.StreamReader(fileName + ".txt")
textFile = sr.ReadToEnd
sr.Close()
' store RTF and text in clipboard
Dim sData As String
Dim dataObject As New DataObject
dataObject.SetData(DataFormats.Text, True, textFile)
dataObject.SetData(DataFormats.Rtf, True, rtfFile)
Clipboard.SetDataObject(dataObject)
Check out this link:
http://www.codeproject.com/com/word_ole_bm.asp
That sets you up with an OLE project and shows you how to use OLE automation, the technology that will let you do what you need with Microsoft Word. That's the first step.
The next few steps are pretty basic C++ string and file handling routines.
Then the final step is to copy to the clipboard, and a simple tutorial exists here:
http://www.codeproject.com/clipboard/archerclipboard1.asp
Hope this helps!
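As a minimal sketch of the file-handling step mentioned above: once Word has saved the .rtf and .txt files, each can be read into a string with standard C++ streams. The helper name readWholeFile is hypothetical, and this is only one reasonable way to do it.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: read an entire file into a std::string.
// Binary mode keeps the bytes exactly as Word wrote them (no newline translation).
std::string readWholeFile(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the whole stream into the buffer
    return buffer.str();
}
```

Mirroring the VB.NET code, it would be called as readWholeFile("htmlToRtf.rtf") and readWholeFile("htmlToRtf.txt").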
ASKER
Thanks a lot.
My plan is to look into this over the weekend.
To incorporate this into a DLL, should the DLL be of a particular type, for instance an MFC DLL?
ASKER
Most of the clipboard code samples set the clipboard to only one type of data.
It is not clear how to perform the following in C++:
dataObject.SetData(DataFormats.Text, True, textFile)
dataObject.SetData(DataFormats.Rtf, True, rtfFile)
Clipboard.SetDataObject(dataObject)
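For reference, in plain Win32 the rough equivalent of calling SetData twice is to call SetClipboardData once per format between a single OpenClipboard/CloseClipboard pair. The sketch below assumes the text is ANSI (CF_TEXT) and uses the registered format name "Rich Text Format" for the RTF data. The function names copyTextAndRtf and putOnClipboard are hypothetical, and the owner window is passed as void* only so the sketch also compiles off Windows. Passing a real window handle is the safer choice, since the Microsoft documentation warns that SetClipboardData can fail when the clipboard was opened with a NULL owner.

```cpp
#include <cstring>
#include <string>

#ifdef _WIN32
#include <windows.h>

// Copy `bytes` plus a terminating NUL into movable global memory and hand
// that block to the (already open) clipboard under `format`.
static bool putOnClipboard(UINT format, const std::string& bytes) {
    HGLOBAL block = GlobalAlloc(GMEM_MOVEABLE, bytes.size() + 1);
    if (!block) return false;
    void* dest = GlobalLock(block);
    if (!dest) { GlobalFree(block); return false; }
    std::memcpy(dest, bytes.c_str(), bytes.size() + 1);
    GlobalUnlock(block);
    if (!SetClipboardData(format, block)) {
        GlobalFree(block);  // still ours because the call failed
        return false;
    }
    return true;  // on success the clipboard owns the block
}

// Hypothetical helper: place plain text and RTF on the clipboard together.
bool copyTextAndRtf(void* ownerWindow, const std::string& text, const std::string& rtf) {
    const UINT rtfFormat = RegisterClipboardFormatA("Rich Text Format");
    if (rtfFormat == 0) return false;
    if (!OpenClipboard(static_cast<HWND>(ownerWindow))) return false;
    // Empty once, then add each format in turn before closing.
    bool ok = EmptyClipboard() != 0
           && putOnClipboard(CF_TEXT, text)
           && putOnClipboard(rtfFormat, rtf);
    CloseClipboard();
    return ok;
}
#else
// Not Windows: there is no Win32 clipboard, so this sketch just reports failure.
bool copyTextAndRtf(void*, const std::string&, const std::string&) { return false; }
#endif
```

A target application that understands RTF can then paste the formatted version, while plain-text editors fall back to the CF_TEXT copy.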
ASKER
The project at this link does not appear to build successfully under Visual Studio .NET 2003:
http://www.codeproject.com/com/word_ole_bm.asp
ASKER
Also, the code in the project referenced by the above link does not appear to have any feature for converting tables to text, which is needed for the following critical step:
' remove tables
Dim tblToConvert As Object
For Each tblToConvert In wd.ActiveDocument.Tables
    tblToConvert.ConvertToText(vbTab, True)
Next
ASKER
The following link includes a project with a file that contains the ConvertToText method:
http://www.codeproject.com/com/AutoSpellCheck.asp
How far have you progressed with this problem now?
ASKER
Not very far. msword9.h contains ConvertToText, the method that supports the removal of tables. The call itself looks straightforward, but opening the Word document is a problem.
Well, post your code and let's have a look.
ASKER
The following links are useful for accessing Word from .NET C++:
http://support.microsoft.com/kb/308338/
http://support.microsoft.com/kb/307473/EN-US/
There are issues that remain to be resolved, but this is a useful start.
ASKER
This email pertains to implementing an HTML-to-RTF conversion in C++ rather than VB.NET.