Convert doc to Html to database coln thru lateBinding

Posted on 2002-05-29
Last Modified: 2011-09-20
Hi All,

I am trying to convert an Word doc to Html and this Html's source code is read and put in sql database column(ntext datatype).

I had few problems trying to do this, but purchased the code from EE. I guess the accepted code was of "Richie_Simonetti" of date 08/03/2001 10:29AM

The problem is as follows:

1)First of all, the code works fine if i give a "Reference" to the Word object library, and then use the word object to open a doc file, and "SaveAs" html file.

After I save the doc as Htm file, I give reference to filesystem and text stream to read the html file and insert into my database 'ntext' column. The reason I am stressing on 'ntext' datatype is because anything other than 'ntext' does not take the formatted doc fike, like bold, the colors, tables etc. That's why 'ntext'

But my machine has Windows2000 Professional configured so in my source code i had referenced word 10.0 library.

And my clients are all using Word 97, so there is this problem of 'version conflict'

So I thought of using 'late binding' which took care of the version conflict, put was not converting the doc to html in the proper way(I could see all junk, small boxes)

2)Second is my clients may have in their doc files some images, like charts and pie diagrams, so when I convert this doc to html it creates a separate folder(with images as it always does if you save as html).

I was wondering if sql database has a datatype which will hold text, formatted text and 'images' as in this case.

Am I on the right track, any info, help would be apprciated or maybe a workaround.

Here's the synopsis:
-how to late bind because of version conflict,
-convert it into proper html format without junk,
-and third if at all images are there in the doc file, what is the workaround.

Thanks all,


Question by:priya_pbk
LVL 142

Accepted Solution

Guy Hengel [angelIII / a3] earned 50 total points
ID: 7041180
-how to late bind:
 use code like this
 DIM objWord as Object
 SET objWord = CreateObject("Word.Application")

-convert without junk:
 I fear this could be a Word97 problem (checking...)
-other datatype:
 IMAGE, similar use than NTEXT, but only stored binary data instead of interpreting caracter strings, which could lead to junk data.

LVL 43

Expert Comment

ID: 7041200
The only work-around for the images would be to create another table which references the original one and store the images in seperate records in IMAGE datatype in this table. This way you can preserve the structure with the html.

Alternatively you can skip the formatting as html, if you have embedded images etc and really must store them in your sql database then why don't you simply stream the original .doc into an IMAGE datatype?

Author Comment

ID: 7041211
yes, i too wrote the synatax for late binding that way. But with late binding and SaveAs syntax, it gives me junk.
I tried both the ways:

1)windows2000 Professional(My PC):
 a)Early binding(refr Word 10.0 library)+saveAs-->"Success"

2)Different Machine(Word 97 installed pc) opened the source code and referenced the windows97 object library
 b)lateBinding-->did not check at all


Master Your Team's Linux and Cloud Stack!

The average business loses $13.5M per year to ineffective training (per 1,000 employees). Keep ahead of the competition and combine in-person quality with online cost and flexibility by training with Linux Academy.


Author Comment

ID: 7041263
I think the images problem can wait for a while. What I am stuck up is with the conversion of doc to html.

If I can make this workable, i can proceed with the images part. I am not able to convert word97 doc to into html?

But i guess the problem is with late binding, Am i doing this right? This is what I am doing:
Dim wapp As Object
Set wapp = CreateObject("Word.Application")
Set Doc = wapp.Documents.Open(CStr(txtWFileToOpen), True)

Doc.SaveAs FileName:="C:\TmpstockIdeaFiles\toShow.htm", FileFormat:= wdFormatHTML

So where am i doing wrong ....???

Expert Comment

ID: 7041915
I don't think your problem is with the binding. Your Junk HTML is being created by Word '97. The HTML Converter in Word '97 is notoriously poor at generating clean HTML. Later versions of the Office products generate XML documents which are much prettier. (In fact the only Office '97 product that created clean HTML was Excel, no wonder there)

Author Comment

ID: 7043801
This question is to TimCottee:

You said:
"Alternatively you can skip the formatting as html, if you have embedded images etc and really must store
them in your sql database then why don't you simply stream the original .doc into an IMAGE datatype? "

If this is so then,

-1)how can I do that, i mean the synatx for converting the doc to image and then store that image in sql database, is it simlar to SaveAs with a different parameter for Htmlformat?

-2) What about the size and wont the image be heavy and take time loading in the web page (I guess the image will be large enough coz conversion of 4-5 pages of doc will surely yield into a huge size of image.

just curious, wanted to know?



Expert Comment

ID: 7045186
TimCottee is not saying convert the word doc to an image but actually storing the document itself in the database using an 'image' datatype. You can then query the DB for the document and it will have all your images embedded into the binary file stored in the field

Author Comment

ID: 7048389
I think I will grant angelIII the points, because angelIII  was closer to the probable solution and also the first to answer.

Thanks everyone for the inputs and suggestion



Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I’ve seen a number of people looking for examples of how to access web services from VB6.  I’ve been using a test harness I built in VB6 (using many resources I found online) that I use for small projects to work out how to communicate with web serv…
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
Get people started with the process of using Access VBA to control Outlook using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Microsoft Outlook. Using automation, an Access applic…
Get people started with the utilization of class modules. Class modules can be a powerful tool in Microsoft Access. They allow you to create self-contained objects that encapsulate functionality. They can easily hide the complexity of a process from…

778 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question