Link to home
Start Free TrialLog in
Avatar of liddlefeesh
liddlefeesh

asked on

VB6/SQLServer/ADO stream : Corruptions when inserting/updating binary/image fields

Hi all - there are . . . FIVE HUNDRED ( 500 ) points up for grabs on this one . . .

Background:
^^^^^^^^
I have created a bespoke document management system that allows a user to read in documents from the file system into a local MSDE database, and check files in and out between the local MSDE "cache" and the central SQL Server 2000 "repository". I am unable to store text pointers to external file system objects in this case, so I am using binary/image objects.

There are three tables used, over two databases...
CACHE.cDocument_New : When new documents are imported locally on a workstation, they are inserted into this table. The table contains numerous fields of meta-data relating to the document, and a field called, "XMLContent" (it's not always XML) which contains binary data.

REPOSITORY.rDocument : When the document is "checked in", it is inserted into this table and becomes "under version control" status. This table is synchronised to:

CACHE.cDocument - which "downloads" a copy of the rDocument data, such that a user can continue to access files when not connected to the REPOSITORY.

The Problem:
^^^^^^^^

I can insert a new record into cDocument_New using rs.AddNew and write it to file using ADODB.stream -> this file opens correctly

When I check the file in (DELETE from rDocument, then rs.AddNew) and then attempt to open it by writing it to file and opening -> the file is corrupted.

Exceptions:
^^^^^^^

.txt files do not corrupt - even when as big as 128kb.
.jpg files corrupt, but can still be opened - however the bottom 3/4 of the image is usually inverted in colour (corruptions).

-------------

Anyone - please help! Suggestions welcome ;)

Thanks,

Mike
Avatar of RadimHampel
RadimHampel

Give little more information, what is the way, you are storing information into database, how are you getting it back, what is your db structure, stored procedures? ...

Radim
Avatar of liddlefeesh

ASKER

Hi there Radim,

The scripts to generate the tables are all similar to:

CREATE TABLE [dbo].[cDocument] (
      [Document_ID] [int] NOT NULL ,
      [Document_Status_ID] [int] NOT NULL ,
      [Folder_ID] [int] NULL ,
      [Author_ID] [int] NULL ,
      [DocumentName] [varchar] (100) COLLATE Latin1_General_CI_AS NOT NULL ,
      [Length] [bigint] NULL ,
      [XMLContent] [image] NULL ,
      [CheckInDate] [datetime] NOT NULL ,
      [Comments] [varchar] (255) COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

No stored procedures are being used (I removed them whilst trying to debug this issue).

Since you asked for it - in verbose the code to take a document and check it in to version control:
       
        ' delete any existing repository documents with the same name as the check-in-ee (should do a user prompt really)
        strSQL = "DELETE  " & _
                 "FROM    REPOSITORY.dbo.rDocument " & _
                 "WHERE   DocumentName = '" & DocumentName & "'"
        cnn.Execute strSQL
       
        ' insert new repository document
        rs.Open "SELECT * FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
           
        rs.AddNew
        rs.Fields("Document_Status_ID").Value = 1
        rs.Fields("Folder_ID").Value = pintCurrentFolder_ID
        rs.Fields("Author_ID").Value = pstrAuthor_ID
        rs.Fields("DocumentName").Value = DocumentName
        rs.Fields("XMLContent").AppendChunk DocumentContent
        rs.Fields("CheckInDate").Value = Now()
        rs.Fields("Comments").Value = CheckInComment.mstrComments
        rs.Update
        rs.Close
       
        ' get the latest document_ID and persist it
        rs.Open "SELECT Document_ID FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
        intDocument_ID = rs!Document_ID
        rs.Close
       
        ' do cache stuff next
        Set cnn = DC.Connection("CACHE")
           
        ' delete any existing cache documents
        strSQL = "DELETE  " & _
                 "FROM    CACHE.dbo.cDocument " & _
                 "WHERE   DocumentName = '" & DocumentName & "'"
        cnn.Execute strSQL
       
        ' insert new cache document
       
        rs.Open "SELECT * FROM cDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
       
        rs.AddNew
        rs.Fields("Document_ID").Value = intDocument_ID
        rs.Fields("Document_Status_ID").Value = 1
        rs.Fields("Folder_ID").Value = pintCurrentFolder_ID
        rs.Fields("DocumentName").Value = DocumentName
        rs.Fields("XMLContent").AppendChunk (DocumentContent)
        rs.Fields("CheckInDate").Value = Now()
        rs.Fields("Comments").Value = CheckInComment.mstrComments
        rs.Update
        rs.Close

My code for writing an image field to a file works -> it's something to do with writing to the REPOSITORY database... that is causing corruptions in certain filetypes.

I'm assuming I'm incorrectly utilising the rs.fields("FIELD").Appendchunk method... I am passing it "DocumentContent" which is a string value obtained earlier in the application with the following code:

                strSQL = "SELECT    * " & _
                         "FROM      rDocument " & _
                         "WHERE     DocumentName = '" & lvwRepExplorer.SelectedItem.Text & "'"
                rs.Open strSQL, cnn, adOpenStatic
                strXMLContent = ReplaceQuotes(IIf(IsNull(rs!XMLContent), "", rs!XMLContent))

------
Essentially - I read the XMLContent into a vb string from an ADODB.Recordset field, and then I use the above code to write the new content into the repository document (rDocument) table.

When I attempt to save the repository XMLContent data to file and open the file - the file is corrupted.

The "save-image-to-file-using-ADODB-stream" function works fine, I will post that up too in a moment

--------

Thanks for your reply!!

Mike
Abridged version of the save-image to file function:

Dim stm as adodb.stream
Set stm = New ADODB.Stream

                    With stm
                        .Type = adTypeBinary
                        .Open
                        Set cnn = DC.Connection("CACHE")
                        rs.Open "SELECT Length, XMLContent FROM rDocument WHERE DocumentName = '" & lvwRepExplorer.SelectedItem.Text & "'", cnn, adOpenStatic
                        .Write rs.Fields("XMLContent").Value
                        .SaveToFile "C:\TEMP\" & lvwRepExplorer.SelectedItem.Text, adSaveCreateOverWrite
                        cnn.Close
                        .Close
                    End With
Avatar of Anthony Perkins
First of all it should be pointed out that saving binary data to the database is generally not recommended.  Having said that I suggest you use the Stream object instead of the AppendChunk method to save the binary data to the XMLContent column, in much the same way you are using it to retrieve the data.  Let me know if you need a sample.

Anthony
Hi Anthony,

Sample would be great - thanks.

Cheers,

Mike
Don't get me wrong, there should not be anything wrong with using the AppendChunk, if coded correctly.

Anthony
ASKER CERTIFIED SOLUTION
Avatar of Anthony Perkins
Anthony Perkins
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Of course if you are just updating the recordset than comment out the line that reads:
  rs.AddNew

Anthony
Hi Anthony,

Thanks for your reply, and your example.

However, what I want to do is to update the XMLContent field in a table that may be located in a seperate database, with an XMLContent field in the first database.

No files are being read/written to at this stage. My ADODB.stream method to write to file works fine.

Thanks again,

Mike
Then all you need is an update statement.  It is not necessary to use the Stream object or the AppendChunk method (though you could use them if you wanted).  

Your Update SQL statement would look something like this (assuming that Document_ID was your key field between the two tables):

Update rDocument
Set XMLContent = cDocument_New.XMLContent
From CACHE.dbo.cDocument_New
Where cDocument_New.Document_ID = 1234567890

Notice that there is no need to do a Delete and an Insert (though you could).

Suggest you test that out first in Query Analyzer, modifying the value for Document_ID appropriately.

Anthony
By the way, I may have got the database names wrong.  I was assuming that you wanted to update the table in the REPOSITORY database using the data from the CACHE database and you were connected to the first database.  If that is not the case, than change as appropriate.

Anthony
Hi Anthony,

That is correct - thanks... however it doesn't explain that when I use ADODB.stream afterwards to stream XMLContent from both tables into seperate files that the repository file is corrupted...

I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.

WinDiff tells me that start file (in cDocument_New) differs from cDocument/rDocument in white space. File sizes on stream to file are the same - but word documents and excel documents - and images are corrupted...

I can do the SQL, but I can't explain the corruptions!

Cheers,

Mike
>>I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.<<
It looks like you may be deleting it and then inserting it.  That is the reason I recommended an update, however if you want to do it this way than change the Update statement I posted to an Insert:

Insert rDocument
Select *
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName = 'document name goes here'

I realize now that DocumentName is your key field and not Document_ID.  Also, if Document_ID is an IDENTITY column that you will have to name all the columns in the Insert statement, except for the
Documnet_ID.  As in:

Insert rDocument
Select  Document_Status_ID,
           Folder_ID,
           Author_ID,
           DocumentName,
           Length,
           XMLContent,
           CheckInDate,
           Comments
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName = 'document name goes here'

>>I can do the SQL, but I can't explain the corruptions!<<
I suspect it has to do something with the AppendChunk method and how you are doing it.  But to be honest I have not used that method for several years now, ever since the Stream object was introduced.  Besides I felt that doing a straight Update (or Delete/Insert) was far more efficient than opening a recordset and doing an AddNew.

Anthony
Do you need any more help with this question?
Hi Anthony,

Using the Stream method of the ADODB object worked - although I have to read/write to a temporary file first in order to use the .loadfromfile command.

It's only a minor workaround - but hey, it's working!

I've awarded you the 500 points.

Kind Regards,

Mike
>>although I have to read/write to a temporary file first in order to use the .loadfromfile command.<<
I believe, this is not necessary.  Show me your code and I can make more suggestions.