Solved

Convert doc to HTML and store in a database column through late binding

Posted on 2002-05-29
Last Modified: 2011-09-20
Hi All,

I am trying to convert a Word doc to HTML, then read the HTML source and store it in a SQL database column (ntext datatype).

I had a few problems trying to do this, but purchased the code from EE. I believe the accepted code was from "Richie_Simonetti", dated 08/03/2001 10:29AM.

The problem is as follows:

1) First of all, the code works fine if I add a reference to the Word object library, then use the Word object to open a doc file and "SaveAs" an HTML file.

After I save the doc as an .htm file, I add a reference to the file system object and use a text stream to read the HTML file and insert it into my database 'ntext' column. The reason I am stressing the 'ntext' datatype is that anything other than 'ntext' does not hold the formatted doc file (bold, colors, tables, etc.). That's why 'ntext'.

But my machine has Windows 2000 Professional, so in my source code I had referenced the Word 10.0 library.

And my clients are all using Word 97, so there is this problem of 'version conflict'.

So I thought of using 'late binding', which took care of the version conflict but was not converting the doc to HTML properly (all I could see was junk, small boxes).

2) Second, my clients may have images in their doc files, such as charts and pie diagrams, so when I convert such a doc to HTML it creates a separate folder (with the images, as it always does if you save as HTML).

I was wondering if a SQL database has a datatype that will hold text, formatted text and 'images', as in this case.

Am I on the right track? Any info or help would be appreciated, or maybe a workaround.

Here's the synopsis:
-how to late bind because of version conflict,
-convert it into proper html format without junk,
-and third if at all images are there in the doc file, what is the workaround.

Thanks all,

-Priya

Question by:priya_pbk
8 Comments
 

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 50 total points
-how to late bind:
 use code like this
 Dim objWord As Object
 Set objWord = CreateObject("Word.Application")
 ...

-convert without junk:
 I fear this could be a Word 97 problem (checking...)
 
-other datatype:
 IMAGE is used much like NTEXT, but it stores raw binary data rather than interpreting it as character strings (an interpretation that could lead to junk data).

Cheers
 
 

Expert Comment

by:TimCottee
The only workaround for the images would be to create another table that references the original one and store the images as separate records in an IMAGE column in this table. This way you can preserve the structure along with the HTML.

Alternatively you can skip the formatting as html, if you have embedded images etc and really must store them in your sql database then why don't you simply stream the original .doc into an IMAGE datatype?
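
As a rough illustration of streaming the original .doc into an IMAGE column, here is a minimal late-bound sketch using an ADO Stream (which needs ADO 2.5 or later). The procedure name, the table "Documents" and its columns "DocName" and "DocData" are assumptions made up for this example and do not come from this thread; the connection string is left to the caller, and there is no error handling.

```vb
Option Explicit

' ADO constant values written out by hand because the objects are late-bound.
Private Const adTypeBinary As Long = 1
Private Const adOpenKeyset As Long = 1
Private Const adLockOptimistic As Long = 3

' Hypothetical helper. Assumes a table "Documents" with the columns
' DocName (varchar) and DocData (image); both names are invented here.
Public Sub StoreDocInDatabase(ByVal connString As String, ByVal docPath As String)
    Dim conn As Object
    Dim rs As Object
    Dim stm As Object

    ' Load the whole file as raw bytes.
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeBinary
    stm.Open
    stm.LoadFromFile docPath

    Set conn = CreateObject("ADODB.Connection")
    conn.Open connString

    ' Open an empty but updatable recordset, then add one row.
    Set rs = CreateObject("ADODB.Recordset")
    rs.Open "SELECT DocName, DocData FROM Documents WHERE 1 = 0", _
            conn, adOpenKeyset, adLockOptimistic
    rs.AddNew
    rs.Fields("DocName").Value = docPath
    rs.Fields("DocData").Value = stm.Read   ' no argument: reads all bytes
    rs.Update

    rs.Close
    conn.Close
    stm.Close
End Sub
```

Because the file is stored byte for byte, any charts or pictures embedded in the document travel with it, and no separate image folder is involved.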

Author Comment

by:priya_pbk
Yes, I wrote the syntax for late binding that way too. But with late binding and the SaveAs syntax, it gives me junk.
I tried it both ways:

1) Windows 2000 Professional (my PC):
 a) Early binding (referenced the Word 10.0 library) + SaveAs --> "Success"
 b) Late binding + SaveAs --> "junk"

2) Different machine (PC with Word 97 installed): opened the source code and referenced the Word 97 object library
 a) Early binding --> "junk"
 b) Late binding --> did not check at all

-priya


Author Comment

by:priya_pbk
I think the images problem can wait for a while. What I am stuck on is the conversion of doc to HTML.

If I can make this work, I can proceed with the images part. Is it that I cannot convert a Word 97 doc into HTML?

But I guess the problem is with late binding. Am I doing this right? This is what I am doing:
---------------------
Dim wapp As Object
Set wapp = CreateObject("Word.Application")
Set Doc = wapp.Documents.Open(CStr(txtWFileToOpen), True)

Doc.SaveAs FileName:="C:\TmpstockIdeaFiles\toShow.htm", FileFormat:= wdFormatHTML

---------------------
So where am I going wrong?
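
One likely cause, offered as an observation rather than something confirmed in this thread: with late binding there is no reference to the Word object library, so the name wdFormatHTML has no definition. Without Option Explicit, VB treats it as an empty Variant, so Word is not told to save as HTML and writes an ordinary binary document under the .htm name, which would show up as junk. The usual remedy is to declare the constant yourself. The sketch below assumes Word 2000 or later, where wdFormatHTML is 8; the procedure name is made up for illustration, and error handling is left out.

```vb
Option Explicit

' With late binding the wd* names are not defined, so the value is
' declared by hand. 8 is wdFormatHTML in Word 2000 and later.
Private Const wdFormatHTML As Long = 8

' Hypothetical helper: opens a .doc and saves a copy as HTML.
' Minimal sketch with no error handling.
Public Sub SaveDocAsHtml(ByVal sourcePath As String, ByVal targetPath As String)
    Dim wapp As Object
    Dim doc As Object

    Set wapp = CreateObject("Word.Application")

    ' Named arguments avoid mixing up the positional parameters of Open
    ' (its second positional parameter is ConfirmConversions, not ReadOnly).
    Set doc = wapp.Documents.Open(FileName:=sourcePath, ReadOnly:=True)
    doc.SaveAs FileName:=targetPath, FileFormat:=wdFormatHTML

    doc.Close False      ' do not save changes to the source document
    wapp.Quit False
    Set doc = Nothing
    Set wapp = Nothing
End Sub
```

A call such as SaveDocAsHtml CStr(txtWFileToOpen), "C:\TmpstockIdeaFiles\toShow.htm" would mirror the snippet above. Adding Option Explicit at the top of the module also makes the compiler flag any undefined wd* name instead of silently passing an empty value.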

Expert Comment

by:MCummings111400
I don't think your problem is with the binding. Your junk HTML is being created by Word '97. The HTML converter in Word '97 is notoriously poor at generating clean HTML. Later versions of the Office products generate XML documents, which are much prettier. (In fact, the only Office '97 product that created clean HTML was Excel; no wonder there.)
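
A related point, stated as an assumption rather than something established in this thread: Word 97 saves HTML through an installable file converter and, as far as I know, has no wdFormatHTML constant of its own, so the format number may have to be looked up at run time. The sketch below walks Word's FileConverters collection and returns the SaveFormat of the first converter that can save and mentions HTML in its class or format name. The function name and the "HTML" text match are illustrative guesses, not a documented recipe.

```vb
Option Explicit

' Hypothetical helper: returns the SaveFormat number of the first installed
' converter that can save and whose class or format name mentions "HTML".
' Returns -1 when no such converter is found.
Public Function FindHtmlSaveFormat(ByVal wapp As Object) As Long
    Dim fc As Object

    FindHtmlSaveFormat = -1
    For Each fc In wapp.FileConverters
        ' SaveFormat is only meaningful for converters that can save.
        If fc.CanSave Then
            If InStr(1, fc.ClassName & " " & fc.FormatName, _
                     "HTML", vbTextCompare) > 0 Then
                FindHtmlSaveFormat = fc.SaveFormat
                Exit Function
            End If
        End If
    Next fc
End Function
```

If the function returns something other than -1, that number can be passed as the FileFormat argument of SaveAs; if it returns -1, the HTML converter is probably not installed on that machine.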

Author Comment

by:priya_pbk
This question is to TimCottee:

You said:
"Alternatively you can skip the formatting as html, if you have embedded images etc and really must store
them in your sql database then why don't you simply stream the original .doc into an IMAGE datatype? "

If this is so then,

-1) How can I do that? I mean, what is the syntax for converting the doc to an image and then storing that image in the SQL database? Is it similar to SaveAs with a different parameter in place of the HTML format?

-2) What about the size? Won't the image be heavy and take time to load in the web page? (I guess the image will be large, because converting 4-5 pages of a doc will surely yield a huge image.)

Just curious, I wanted to know.

-priya


Expert Comment

by:MCummings111400
TimCottee is not saying to convert the Word doc to an image, but to store the document itself in the database using an 'image' datatype. You can then query the DB for the document, and it will have all your images embedded in the binary file stored in the field.
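
To make the retrieval side concrete, here is a minimal late-bound sketch that reads such a stored document back out of an 'image' column and writes it to disk with an ADO Stream (ADO 2.5 or later). The procedure name, the table "Documents" and the columns "DocName" and "DocData" are assumptions invented for this example; there is no error handling.

```vb
Option Explicit

' ADO constant values written out by hand because the objects are late-bound.
Private Const adTypeBinary As Long = 1
Private Const adSaveCreateOverWrite As Long = 2

' Hypothetical helper. Assumes a table "Documents" with the columns
' DocName (varchar) and DocData (image); both names are invented here.
Public Sub ExtractDocFromDatabase(ByVal connString As String, _
                                  ByVal docName As String, _
                                  ByVal targetPath As String)
    Dim conn As Object
    Dim rs As Object
    Dim stm As Object

    Set conn = CreateObject("ADODB.Connection")
    conn.Open connString

    ' Single quotes are doubled so the name cannot break the SQL string.
    Set rs = conn.Execute("SELECT DocData FROM Documents WHERE DocName = '" & _
                          Replace(docName, "'", "''") & "'")

    If Not rs.EOF Then
        ' Write the stored bytes back out as a file.
        Set stm = CreateObject("ADODB.Stream")
        stm.Type = adTypeBinary
        stm.Open
        stm.Write rs.Fields("DocData").Value
        stm.SaveToFile targetPath, adSaveCreateOverWrite
        stm.Close
    End If

    rs.Close
    conn.Close
End Sub
```

The file written this way is the original .doc, so it opens in Word with its formatting and embedded pictures intact; it is not something a browser can display directly.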

Author Comment

by:priya_pbk
I think I will grant angelIII the points, because angelIII was closer to the probable solution and was also the first to answer.

Thanks, everyone, for the input and suggestions.

-priya
