asked on

VB6/SQLServer/ADO stream : Corruptions when inserting/updating binary/image fields

Hi all - there are . . . FIVE HUNDRED ( 500 ) points up for grabs on this one . . .

Background:
^^^^^^^^
I have created a bespoke document management system that allows a user to read in documents from the file system into a local MSDE database, and check files in and out between the local MSDE "cache" and the central SQL Server 2000 "repository". I am unable to store text pointers to external file system objects in this case, so I am using binary/image objects.

There are three tables used, over two databases...
CACHE.cDocument_New : When new documents are imported locally on a workstation, they are inserted into this table. The table contains numerous fields of meta-data relating to the document, and a field called, "XMLContent" (it's not always XML) which contains binary data.

REPOSITORY.rDocument : When the document is "checked in", it is inserted into this table and becomes "under version control" status. This table is synchronised to:

CACHE.cDocument - which "downloads" a copy of the rDocument data, such that a user can continue to access files when not connected to the REPOSITORY.

The Problem:
^^^^^^^^

I can insert a new record into cDocument_New using rs.AddNew and write it to file using ADODB.stream -> this file opens correctly

When I check the file in (DELETE from rDocument, then rs.AddNew) and then attempt to open it by writing it to file and opening -> the file is corrupted.

Exceptions:
^^^^^^^

.txt files do not corrupt - even when as big as 128kb.
.jpg files corrupt, but can still be opened - however the bottom 3/4 of the image is usually inverted in colour (corruptions).

-------------

Anyone - please help! Suggestions welcome ;)

Thanks,

Mike

RadimHampel

Give little more information, what is the way, you are storing information into database, how are you getting it back, what is your db structure, stored procedures? ...

Radim

liddlefeesh

ASKER

Hi there Radim,

The scripts to generate the tables are all similar to:

CREATE TABLE [dbo].[cDocument] (
      [Document_ID] [int] NOT NULL ,
      [Document_Status_ID] [int] NOT NULL ,
      [Folder_ID] [int] NULL ,
      [Author_ID] [int] NULL ,
      [DocumentName] [varchar] (100) COLLATE Latin1_General_CI_AS NOT NULL ,
      [Length] [bigint] NULL ,
      [XMLContent] [image] NULL ,
      [CheckInDate] [datetime] NOT NULL ,
      [Comments] [varchar] (255) COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

No stored procedures are being used (I removed them whilst trying to debug this issue).

Since you asked for it - in verbose the code to take a document and check it in to version control:

' delete any existing repository documents with the same name as the check-in-ee (should do a user prompt really)
strSQL = "DELETE " & _
"FROM REPOSITORY.dbo.rDocument " & _
"WHERE DocumentName = '" & DocumentName & "'"
cnn.Execute strSQL

' insert new repository document
rs.Open "SELECT * FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic

rs.AddNew
rs.Fields("Document_Status_ID").Value = 1
rs.Fields("Folder_ID").Value = pintCurrentFolder_ID
rs.Fields("Author_ID").Value = pstrAuthor_ID
rs.Fields("DocumentName").Value = DocumentName
rs.Fields("XMLContent").AppendChunk DocumentContent
rs.Fields("CheckInDate").Value = Now()
rs.Fields("Comments").Value = CheckInComment.mstrComments
rs.Update
rs.Close

' get the latest document_ID and persist it
rs.Open "SELECT Document_ID FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
intDocument_ID = rs!Document_ID
rs.Close

' do cache stuff next
Set cnn = DC.Connection("CACHE")

' delete any existing cache documents
strSQL = "DELETE " & _
"FROM CACHE.dbo.cDocument " & _
"WHERE DocumentName = '" & DocumentName & "'"
cnn.Execute strSQL

' insert new cache document

rs.Open "SELECT * FROM cDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic

rs.AddNew
rs.Fields("Document_ID").Value = intDocument_ID
rs.Fields("Document_Status_ID").Value = 1
rs.Fields("Folder_ID").Value = pintCurrentFolder_ID
rs.Fields("DocumentName").Value = DocumentName
rs.Fields("XMLContent").AppendChunk (DocumentContent)
rs.Fields("CheckInDate").Value = Now()
rs.Fields("Comments").Value = CheckInComment.mstrComments
rs.Update
rs.Close

My code for writing an image field to a file works -> it's something to do with writing to the REPOSITORY database... that is causing corruptions in certain filetypes.

I'm assuming I'm incorrectly utilising the rs.fields("FIELD").Appendchunk method... I am passing it "DocumentContent" which is a string value obtained earlier in the application with the following code:

strSQL = "SELECT * " & _
"FROM rDocument " & _
"WHERE DocumentName = '" & lvwRepExplorer.SelectedItem.Text & "'"
rs.Open strSQL, cnn, adOpenStatic
strXMLContent = ReplaceQuotes(IIf(IsNull(rs!XMLContent), "", rs!XMLContent))

------
Essentially - I read the XMLContent into a vb string from an ADODB.Recordset field, and then I use the above code to write the new content into the repository document (rDocument) table.

When I attempt to save the repository XMLContent data to file and open the file - the file is corrupted.

The "save-image-to-file-using-ADODB-stream" function works fine, I will post that up too in a moment

--------

Thanks for your reply!!

Mike

liddlefeesh

ASKER

Abridged version of the save-image to file function:

Dim stm as adodb.stream
Set stm = New ADODB.Stream

With stm
.Type = adTypeBinary
.Open
Set cnn = DC.Connection("CACHE")
rs.Open "SELECT Length, XMLContent FROM rDocument WHERE DocumentName = '" & lvwRepExplorer.SelectedItem.Text & "'", cnn, adOpenStatic
.Write rs.Fields("XMLContent").Value
.SaveToFile "C:\TEMP\" & lvwRepExplorer.SelectedItem.Text, adSaveCreateOverWrite
cnn.Close
.Close
End With

Anthony Perkins

First of all it should be pointed out that saving binary data to the database is generally not recommended. Having said that I suggest you use the Stream object instead of the AppendChunk method to save the binary data to the XMLContent column, in much the same way you are using it to retrieve the data. Let me know if you need a sample.

Anthony

liddlefeesh

ASKER

Hi Anthony,

Sample would be great - thanks.

Cheers,

Mike

Anthony Perkins

Don't get me wrong, there should not be anything wrong with using the AppendChunk, if coded correctly.

Anthony

ASKER CERTIFIED SOLUTION

Anthony Perkins

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Anthony Perkins

Of course if you are just updating the recordset than comment out the line that reads:
rs.AddNew

Anthony

liddlefeesh

ASKER

Hi Anthony,

Thanks for your reply, and your example.

However, what I want to do is to update the XMLContent field in a table that may be located in a seperate database, with an XMLContent field in the first database.

No files are being read/written to at this stage. My ADODB.stream method to write to file works fine.

Thanks again,

Mike

Anthony Perkins

Then all you need is an update statement. It is not necessary to use the Stream object or the AppendChunk method (though you could use them if you wanted).

Your Update SQL statement would look something like this (assuming that Document_ID was your key field between the two tables):

Update rDocument
Set XMLContent = cDocument_New.XMLContent
From CACHE.dbo.cDocument_New
Where cDocument_New.Document_ID = 1234567890

Notice that there is no need to do a Delete and an Insert (though you could).

Suggest you test that out first in Query Analyzer, modifying the value for Document_ID appropriately.

Anthony

Anthony Perkins

By the way, I may have got the database names wrong. I was assuming that you wanted to update the table in the REPOSITORY database using the data from the CACHE database and you were connected to the first database. If that is not the case, than change as appropriate.

Anthony

liddlefeesh

ASKER

Hi Anthony,

That is correct - thanks... however it doesn't explain that when I use ADODB.stream afterwards to stream XMLContent from both tables into seperate files that the repository file is corrupted...

I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.

WinDiff tells me that start file (in cDocument_New) differs from cDocument/rDocument in white space. File sizes on stream to file are the same - but word documents and excel documents - and images are corrupted...

I can do the SQL, but I can't explain the corruptions!

Cheers,

Mike

Anthony Perkins

>>I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.<<
It looks like you may be deleting it and then inserting it. That is the reason I recommended an update, however if you want to do it this way than change the Update statement I posted to an Insert:

Insert rDocument
Select *
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName = 'document name goes here'

I realize now that DocumentName is your key field and not Document_ID. Also, if Document_ID is an IDENTITY column that you will have to name all the columns in the Insert statement, except for the
Documnet_ID. As in:

Insert rDocument
Select Document_Status_ID,
Folder_ID,
Author_ID,
DocumentName,
Length,
XMLContent,
CheckInDate,
Comments
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName = 'document name goes here'

>>I can do the SQL, but I can't explain the corruptions!<<
I suspect it has to do something with the AppendChunk method and how you are doing it. But to be honest I have not used that method for several years now, ever since the Stream object was introduced. Besides I felt that doing a straight Update (or Delete/Insert) was far more efficient than opening a recordset and doing an AddNew.

Anthony

Anthony Perkins

Do you need any more help with this question?

liddlefeesh

ASKER

Hi Anthony,

Using the Stream method of the ADODB object worked - although I have to read/write to a temporary file first in order to use the .loadfromfile command.

It's only a minor workaround - but hey, it's working!

I've awarded you the 500 points.

Kind Regards,

Mike

Anthony Perkins

>>although I have to read/write to a temporary file first in order to use the .loadfromfile command.<<
I believe, this is not necessary. Show me your code and I can make more suggestions.