Convert doc to HTML to database column through late binding

Posted on 2002-05-29
Medium Priority
Last Modified: 2011-09-20
Hi All,

I am trying to convert a Word doc to HTML, then read the HTML source and store it in a SQL database column (ntext datatype).

I had a few problems trying to do this, but purchased the code from EE. I believe the accepted code was from "Richie_Simonetti", dated 08/03/2001 10:29AM.

The problem is as follows:

1) First of all, the code works fine if I add a reference to the Word object library, then use the Word object to open a doc file and "SaveAs" an HTML file.

After I save the doc as an .htm file, I use the FileSystemObject and a TextStream to read the HTML file and insert it into my database 'ntext' column. The reason I am stressing the 'ntext' datatype is that nothing other than 'ntext' holds the formatted doc file, with its bold text, colors, tables etc. That's why 'ntext'.
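
For reference, the read-and-store step described above might look roughly like the following late-bound VB6 sketch. The connection string, the table name StockIdeas and the column name HtmlBody are hypothetical placeholders, not names from the original code.

```vb
' Sketch only: server, database, table and column names are assumptions.
Public Sub SaveHtmlToNText(ByVal htmlPath As String)
    Const ForReading = 1          ' Scripting IOMode value
    Const adOpenKeyset = 1        ' ADO CursorTypeEnum value
    Const adLockOptimistic = 3    ' ADO LockTypeEnum value

    Dim fso As Object, ts As Object
    Dim cn As Object, rs As Object
    Dim html As String

    ' Read the whole HTML file into a string
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(htmlPath, ForReading)
    html = ts.ReadAll
    ts.Close

    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=SQLOLEDB;Data Source=MyServer;" & _
            "Initial Catalog=MyDb;Integrated Security=SSPI"

    ' Open an empty, updatable recordset and add one row
    Set rs = CreateObject("ADODB.Recordset")
    rs.Open "SELECT HtmlBody FROM StockIdeas WHERE 1 = 0", _
            cn, adOpenKeyset, adLockOptimistic
    rs.AddNew
    rs.Fields("HtmlBody").Value = html
    rs.Update

    rs.Close
    cn.Close
End Sub
```

Writing through a recordset field avoids building a long SQL string by hand, so quotes inside the HTML do not need escaping.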

But my machine runs Windows 2000 Professional, so in my source code I had referenced the Word 10.0 library.

My clients are all using Word 97, so there is this problem of 'version conflict'.

So I thought of using 'late binding', which took care of the version conflict, but it was not converting the doc to HTML properly (I could see all junk, small boxes).

2) Second, my clients may have images in their doc files, like charts and pie diagrams, so when I convert such a doc to HTML it creates a separate folder (with the images, as it always does when you save as HTML).

I was wondering if SQL database has a datatype which will hold text, formatted text and 'images', as in this case.

Am I on the right track? Any info or help would be appreciated, or maybe a workaround.

Here's the synopsis:
-how to late bind because of version conflict,
-convert it into proper html format without junk,
-and third if at all images are there in the doc file, what is the workaround.

Thanks all,


Question by:priya_pbk
LVL 143

Accepted Solution

Guy Hengel [angelIII / a3] earned 200 total points
ID: 7041180
-how to late bind:
 use code like this
 Dim objWord As Object
 Set objWord = CreateObject("Word.Application")

-convert without junk:
 I fear this could be a Word97 problem (checking...)
-other datatype:
 IMAGE, used similarly to NTEXT, but it stores raw binary data instead of interpreting character strings (interpreting them could lead to junk data).
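
A slightly fuller late-binding sketch, under some assumptions, is shown here. Without a reference to the Word library, the named wd* constants are not available, so they have to be declared by hand. If wdFormatHTML is left undeclared (and Option Explicit is off), it is just an empty Variant, and Word will most likely save in its default document format rather than HTML, which could look like junk when opened as a web page. The value 8 is the one used by Word 2000 and later and may not apply to Word 97. The procedure and argument names are made up for illustration.

```vb
' Sketch only: named wd* constants live in the Word type library, so
' without a reference they must be declared by hand.
Public Sub DocToHtmlLateBound(ByVal docPath As String, ByVal htmlPath As String)
    Const wdFormatHTML = 8            ' value in Word 2000 and later
    Const wdDoNotSaveChanges = 0

    Dim objWord As Object
    Dim objDoc As Object

    Set objWord = CreateObject("Word.Application")
    Set objDoc = objWord.Documents.Open(docPath)

    ' Save a copy as HTML using the numeric format value
    objDoc.SaveAs FileName:=htmlPath, FileFormat:=wdFormatHTML

    ' Close the document and shut Word down so no hidden instance lingers
    objDoc.Close wdDoNotSaveChanges
    objWord.Quit
    Set objDoc = Nothing
    Set objWord = Nothing
End Sub
```

Adding Option Explicit at the top of the module makes an undeclared constant a compile error instead of a silent empty value.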

LVL 43

Expert Comment

ID: 7041200
The only workaround for the images would be to create another table which references the original one and store the images in separate records in an IMAGE datatype column in this table. This way you can preserve the structure with the HTML.

Alternatively you can skip the formatting as html, if you have embedded images etc and really must store them in your sql database then why don't you simply stream the original .doc into an IMAGE datatype?

Author Comment

ID: 7041211
Yes, I too wrote the syntax for late binding that way. But with late binding and the SaveAs syntax, it gives me junk.
I tried both ways:

1) Windows 2000 Professional (my PC):
 a) Early binding (referencing the Word 10.0 library) + SaveAs --> "Success"

2) Different machine (PC with Word 97 installed): opened the source code and referenced the Word 97 object library
 b) Late binding --> did not check at all


Author Comment

ID: 7041263
I think the images problem can wait for a while. What I am stuck on is the conversion of doc to HTML.

If I can make this work, I can proceed with the images part. I am not able to convert a Word 97 doc into HTML.

But I guess the problem is with late binding. Am I doing this right? This is what I am doing:
Dim wapp As Object
Set wapp = CreateObject("Word.Application")
Set Doc = wapp.Documents.Open(CStr(txtWFileToOpen), True)

Doc.SaveAs FileName:="C:\TmpstockIdeaFiles\toShow.htm", FileFormat:= wdFormatHTML

So where am I going wrong?

Expert Comment

ID: 7041915
I don't think your problem is with the binding. Your Junk HTML is being created by Word '97. The HTML Converter in Word '97 is notoriously poor at generating clean HTML. Later versions of the Office products generate XML documents which are much prettier. (In fact the only Office '97 product that created clean HTML was Excel, no wonder there)
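
If the conversion has to run on Word 97 itself, one thing worth trying is asking Word for the save format of its installed HTML converter instead of passing a fixed number. The sketch below assumes the HTML converter is installed on the client and that its ClassName is "HTML"; both are assumptions to verify on the target machine, and the function name is made up for illustration.

```vb
' Sketch only: assumes the Word 97 HTML converter is installed and that its
' ClassName is "HTML"; check the ClassName values on the target machine.
Public Function FindHtmlSaveFormat(ByVal objWord As Object) As Long
    Dim fc As Object

    FindHtmlSaveFormat = -1           ' -1 means no usable converter was found
    For Each fc In objWord.FileConverters
        If UCase$(fc.ClassName) = "HTML" Then
            ' Only converters that can save expose a usable SaveFormat
            If fc.CanSave Then
                FindHtmlSaveFormat = fc.SaveFormat
                Exit For
            End If
        End If
    Next fc
End Function
```

The returned value would then be passed as the FileFormat argument of SaveAs, after checking that it is not -1.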

Author Comment

ID: 7043801
This question is for TimCottee:

You said:
"Alternatively you can skip the formatting as html, if you have embedded images etc and really must store them in your sql database then why don't you simply stream the original .doc into an IMAGE datatype?"

If this is so, then:

1) How can I do that? I mean the syntax for converting the doc to an image and then storing that image in the SQL database. Is it similar to SaveAs with a different parameter in place of the HTML format?

2) What about the size? Won't the image be heavy and take time loading in the web page? (I guess the image will be large, because converting 4-5 pages of doc will surely yield a huge image.)

Just curious, wanted to know.



Expert Comment

ID: 7045186
TimCottee is not saying to convert the Word doc to an image, but to store the document itself in the database using an 'image' datatype. You can then query the DB for the document, and it will have all your images embedded in the binary file stored in the field.
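
As a rough illustration of storing the document itself, the following late-bound sketch loads the .doc file as bytes with an ADODB.Stream (which needs ADO 2.5 or later) and writes it to an IMAGE column. The connection string, the table name StockDocs and the columns DocName and DocData are hypothetical.

```vb
' Sketch only: connection string, table and column names are assumptions.
Public Sub SaveDocToImageColumn(ByVal docPath As String)
    Const adTypeBinary = 1        ' ADO StreamTypeEnum value
    Const adOpenKeyset = 1        ' ADO CursorTypeEnum value
    Const adLockOptimistic = 3    ' ADO LockTypeEnum value

    Dim stm As Object, cn As Object, rs As Object

    ' Load the .doc file unchanged, as raw bytes
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeBinary
    stm.Open
    stm.LoadFromFile docPath

    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=SQLOLEDB;Data Source=MyServer;" & _
            "Initial Catalog=MyDb;Integrated Security=SSPI"

    ' Add one row holding the file name and the file contents
    Set rs = CreateObject("ADODB.Recordset")
    rs.Open "SELECT DocName, DocData FROM StockDocs WHERE 1 = 0", _
            cn, adOpenKeyset, adLockOptimistic
    rs.AddNew
    rs.Fields("DocName").Value = docPath
    rs.Fields("DocData").Value = stm.Read
    rs.Update

    rs.Close
    cn.Close
    stm.Close
End Sub
```

Reading the document back would be the reverse: write the field value into a binary stream with Write and save it to disk with SaveToFile.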

Author Comment

ID: 7048389
I think I will grant angelIII the points, because angelIII was closer to the probable solution and also the first to answer.

Thanks everyone for the inputs and suggestions.



Question has a verified solution.
