liddlefeesh
asked on
VB6/SQLServer/ADO stream : Corruptions when inserting/updating binary/image fields
Hi all - there are . . . FIVE HUNDRED ( 500 ) points up for grabs on this one . . .
Background:
^^^^^^^^
I have created a bespoke document management system that allows a user to read in documents from the file system into a local MSDE database, and check files in and out between the local MSDE "cache" and the central SQL Server 2000 "repository". I am unable to store text pointers to external file system objects in this case, so I am using binary/image objects.
There are three tables used, over two databases...
CACHE.cDocument_New : When new documents are imported locally on a workstation, they are inserted into this table. The table contains numerous fields of meta-data relating to the document, and a field called, "XMLContent" (it's not always XML) which contains binary data.
REPOSITORY.rDocument : When the document is "checked in", it is inserted into this table and becomes "under version control" status. This table is synchronised to:
CACHE.cDocument - which "downloads" a copy of the rDocument data, such that a user can continue to access files when not connected to the REPOSITORY.
The Problem:
^^^^^^^^
I can insert a new record into cDocument_New using rs.AddNew and write it to file using ADODB.stream -> this file opens correctly
When I check the file in (DELETE from rDocument, then rs.AddNew) and then attempt to open it by writing it to file and opening -> the file is corrupted.
Exceptions:
^^^^^^^
.txt files do not corrupt - even when as big as 128kb.
.jpg files corrupt, but can still be opened - however the bottom 3/4 of the image is usually inverted in colour (corruptions).
-------------
Anyone - please help! Suggestions welcome ;)
Thanks,
Mike
Background:
^^^^^^^^
I have created a bespoke document management system that allows a user to read in documents from the file system into a local MSDE database, and check files in and out between the local MSDE "cache" and the central SQL Server 2000 "repository". I am unable to store text pointers to external file system objects in this case, so I am using binary/image objects.
There are three tables used, over two databases...
CACHE.cDocument_New : When new documents are imported locally on a workstation, they are inserted into this table. The table contains numerous fields of meta-data relating to the document, and a field called, "XMLContent" (it's not always XML) which contains binary data.
REPOSITORY.rDocument : When the document is "checked in", it is inserted into this table and becomes "under version control" status. This table is synchronised to:
CACHE.cDocument - which "downloads" a copy of the rDocument data, such that a user can continue to access files when not connected to the REPOSITORY.
The Problem:
^^^^^^^^
I can insert a new record into cDocument_New using rs.AddNew and write it to file using ADODB.stream -> this file opens correctly
When I check the file in (DELETE from rDocument, then rs.AddNew) and then attempt to open it by writing it to file and opening -> the file is corrupted.
Exceptions:
^^^^^^^
.txt files do not corrupt - even when as big as 128kb.
.jpg files corrupt, but can still be opened - however the bottom 3/4 of the image is usually inverted in colour (corruptions).
-------------
Anyone - please help! Suggestions welcome ;)
Thanks,
Mike
ASKER
Hi there Radim,
The scripts to generate the tables are all similar to:
CREATE TABLE [dbo].[cDocument] (
[Document_ID] [int] NOT NULL ,
[Document_Status_ID] [int] NOT NULL ,
[Folder_ID] [int] NULL ,
[Author_ID] [int] NULL ,
[DocumentName] [varchar] (100) COLLATE Latin1_General_CI_AS NOT NULL ,
[Length] [bigint] NULL ,
[XMLContent] [image] NULL ,
[CheckInDate] [datetime] NOT NULL ,
[Comments] [varchar] (255) COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
No stored procedures are being used (I removed them whilst trying to debug this issue).
Since you asked for it - in verbose the code to take a document and check it in to version control:
' delete any existing repository documents with the same name as the check-in-ee (should do a user prompt really)
strSQL = "DELETE " & _
"FROM REPOSITORY.dbo.rDocument " & _
"WHERE DocumentName = '" & DocumentName & "'"
cnn.Execute strSQL
' insert new repository document
rs.Open "SELECT * FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
rs.AddNew
rs.Fields("Document_Status _ID").Valu e = 1
rs.Fields("Folder_ID").Val ue = pintCurrentFolder_ID
rs.Fields("Author_ID").Val ue = pstrAuthor_ID
rs.Fields("DocumentName"). Value = DocumentName
rs.Fields("XMLContent").Ap pendChunk DocumentContent
rs.Fields("CheckInDate").V alue = Now()
rs.Fields("Comments").Valu e = CheckInComment.mstrComment s
rs.Update
rs.Close
' get the latest document_ID and persist it
rs.Open "SELECT Document_ID FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
intDocument_ID = rs!Document_ID
rs.Close
' do cache stuff next
Set cnn = DC.Connection("CACHE")
' delete any existing cache documents
strSQL = "DELETE " & _
"FROM CACHE.dbo.cDocument " & _
"WHERE DocumentName = '" & DocumentName & "'"
cnn.Execute strSQL
' insert new cache document
rs.Open "SELECT * FROM cDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
rs.AddNew
rs.Fields("Document_ID").V alue = intDocument_ID
rs.Fields("Document_Status _ID").Valu e = 1
rs.Fields("Folder_ID").Val ue = pintCurrentFolder_ID
rs.Fields("DocumentName"). Value = DocumentName
rs.Fields("XMLContent").Ap pendChunk (DocumentContent)
rs.Fields("CheckInDate").V alue = Now()
rs.Fields("Comments").Valu e = CheckInComment.mstrComment s
rs.Update
rs.Close
My code for writing an image field to a file works -> it's something to do with writing to the REPOSITORY database... that is causing corruptions in certain filetypes.
I'm assuming I'm incorrectly utilising the rs.fields("FIELD").Appendc hunk method... I am passing it "DocumentContent" which is a string value obtained earlier in the application with the following code:
strSQL = "SELECT * " & _
"FROM rDocument " & _
"WHERE DocumentName = '" & lvwRepExplorer.SelectedIte m.Text & "'"
rs.Open strSQL, cnn, adOpenStatic
strXMLContent = ReplaceQuotes(IIf(IsNull(r s!XMLConte nt), "", rs!XMLContent))
------
Essentially - I read the XMLContent into a vb string from an ADODB.Recordset field, and then I use the above code to write the new content into the repository document (rDocument) table.
When I attempt to save the repository XMLContent data to file and open the file - the file is corrupted.
The "save-image-to-file-using- ADODB-stre am" function works fine, I will post that up too in a moment
--------
Thanks for your reply!!
Mike
The scripts to generate the tables are all similar to:
CREATE TABLE [dbo].[cDocument] (
[Document_ID] [int] NOT NULL ,
[Document_Status_ID] [int] NOT NULL ,
[Folder_ID] [int] NULL ,
[Author_ID] [int] NULL ,
[DocumentName] [varchar] (100) COLLATE Latin1_General_CI_AS NOT NULL ,
[Length] [bigint] NULL ,
[XMLContent] [image] NULL ,
[CheckInDate] [datetime] NOT NULL ,
[Comments] [varchar] (255) COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
No stored procedures are being used (I removed them whilst trying to debug this issue).
Since you asked for it - in verbose the code to take a document and check it in to version control:
' delete any existing repository documents with the same name as the check-in-ee (should do a user prompt really)
strSQL = "DELETE " & _
"FROM REPOSITORY.dbo.rDocument " & _
"WHERE DocumentName = '" & DocumentName & "'"
cnn.Execute strSQL
' insert new repository document
rs.Open "SELECT * FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
rs.AddNew
rs.Fields("Document_Status
rs.Fields("Folder_ID").Val
rs.Fields("Author_ID").Val
rs.Fields("DocumentName").
rs.Fields("XMLContent").Ap
rs.Fields("CheckInDate").V
rs.Fields("Comments").Valu
rs.Update
rs.Close
' get the latest document_ID and persist it
rs.Open "SELECT Document_ID FROM rDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
intDocument_ID = rs!Document_ID
rs.Close
' do cache stuff next
Set cnn = DC.Connection("CACHE")
' delete any existing cache documents
strSQL = "DELETE " & _
"FROM CACHE.dbo.cDocument " & _
"WHERE DocumentName = '" & DocumentName & "'"
cnn.Execute strSQL
' insert new cache document
rs.Open "SELECT * FROM cDocument WHERE DocumentName = '" & DocumentName & "' ", cnn, adOpenForwardOnly, adLockOptimistic
rs.AddNew
rs.Fields("Document_ID").V
rs.Fields("Document_Status
rs.Fields("Folder_ID").Val
rs.Fields("DocumentName").
rs.Fields("XMLContent").Ap
rs.Fields("CheckInDate").V
rs.Fields("Comments").Valu
rs.Update
rs.Close
My code for writing an image field to a file works -> it's something to do with writing to the REPOSITORY database... that is causing corruptions in certain filetypes.
I'm assuming I'm incorrectly utilising the rs.fields("FIELD").Appendc
strSQL = "SELECT * " & _
"FROM rDocument " & _
"WHERE DocumentName = '" & lvwRepExplorer.SelectedIte
rs.Open strSQL, cnn, adOpenStatic
strXMLContent = ReplaceQuotes(IIf(IsNull(r
------
Essentially - I read the XMLContent into a vb string from an ADODB.Recordset field, and then I use the above code to write the new content into the repository document (rDocument) table.
When I attempt to save the repository XMLContent data to file and open the file - the file is corrupted.
The "save-image-to-file-using-
--------
Thanks for your reply!!
Mike
ASKER
Abridged version of the save-image to file function:
Dim stm as adodb.stream
Set stm = New ADODB.Stream
With stm
.Type = adTypeBinary
.Open
Set cnn = DC.Connection("CACHE")
rs.Open "SELECT Length, XMLContent FROM rDocument WHERE DocumentName = '" & lvwRepExplorer.SelectedIte m.Text & "'", cnn, adOpenStatic
.Write rs.Fields("XMLContent").Va lue
.SaveToFile "C:\TEMP\" & lvwRepExplorer.SelectedIte m.Text, adSaveCreateOverWrite
cnn.Close
.Close
End With
Dim stm as adodb.stream
Set stm = New ADODB.Stream
With stm
.Type = adTypeBinary
.Open
Set cnn = DC.Connection("CACHE")
rs.Open "SELECT Length, XMLContent FROM rDocument WHERE DocumentName = '" & lvwRepExplorer.SelectedIte
.Write rs.Fields("XMLContent").Va
.SaveToFile "C:\TEMP\" & lvwRepExplorer.SelectedIte
cnn.Close
.Close
End With
First of all it should be pointed out that saving binary data to the database is generally not recommended. Having said that I suggest you use the Stream object instead of the AppendChunk method to save the binary data to the XMLContent column, in much the same way you are using it to retrieve the data. Let me know if you need a sample.
Anthony
Anthony
ASKER
Hi Anthony,
Sample would be great - thanks.
Cheers,
Mike
Sample would be great - thanks.
Cheers,
Mike
Don't get me wrong, there should not be anything wrong with using the AppendChunk, if coded correctly.
Anthony
Anthony
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Of course if you are just updating the recordset than comment out the line that reads:
rs.AddNew
Anthony
rs.AddNew
Anthony
ASKER
Hi Anthony,
Thanks for your reply, and your example.
However, what I want to do is to update the XMLContent field in a table that may be located in a seperate database, with an XMLContent field in the first database.
No files are being read/written to at this stage. My ADODB.stream method to write to file works fine.
Thanks again,
Mike
Thanks for your reply, and your example.
However, what I want to do is to update the XMLContent field in a table that may be located in a seperate database, with an XMLContent field in the first database.
No files are being read/written to at this stage. My ADODB.stream method to write to file works fine.
Thanks again,
Mike
Then all you need is an update statement. It is not necessary to use the Stream object or the AppendChunk method (though you could use them if you wanted).
Your Update SQL statement would look something like this (assuming that Document_ID was your key field between the two tables):
Update rDocument
Set XMLContent = cDocument_New.XMLContent
From CACHE.dbo.cDocument_New
Where cDocument_New.Document_ID = 1234567890
Notice that there is no need to do a Delete and an Insert (though you could).
Suggest you test that out first in Query Analyzer, modifying the value for Document_ID appropriately.
Anthony
Your Update SQL statement would look something like this (assuming that Document_ID was your key field between the two tables):
Update rDocument
Set XMLContent = cDocument_New.XMLContent
From CACHE.dbo.cDocument_New
Where cDocument_New.Document_ID = 1234567890
Notice that there is no need to do a Delete and an Insert (though you could).
Suggest you test that out first in Query Analyzer, modifying the value for Document_ID appropriately.
Anthony
By the way, I may have got the database names wrong. I was assuming that you wanted to update the table in the REPOSITORY database using the data from the CACHE database and you were connected to the first database. If that is not the case, than change as appropriate.
Anthony
Anthony
ASKER
Hi Anthony,
That is correct - thanks... however it doesn't explain that when I use ADODB.stream afterwards to stream XMLContent from both tables into seperate files that the repository file is corrupted...
I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.
WinDiff tells me that start file (in cDocument_New) differs from cDocument/rDocument in white space. File sizes on stream to file are the same - but word documents and excel documents - and images are corrupted...
I can do the SQL, but I can't explain the corruptions!
Cheers,
Mike
That is correct - thanks... however it doesn't explain that when I use ADODB.stream afterwards to stream XMLContent from both tables into seperate files that the repository file is corrupted...
I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.
WinDiff tells me that start file (in cDocument_New) differs from cDocument/rDocument in white space. File sizes on stream to file are the same - but word documents and excel documents - and images are corrupted...
I can do the SQL, but I can't explain the corruptions!
Cheers,
Mike
>>I'm not using an update SQL statement here - only an insert from cDocument_New into rDocument and cDocument.<<
It looks like you may be deleting it and then inserting it. That is the reason I recommended an update, however if you want to do it this way than change the Update statement I posted to an Insert:
Insert rDocument
Select *
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName = 'document name goes here'
I realize now that DocumentName is your key field and not Document_ID. Also, if Document_ID is an IDENTITY column that you will have to name all the columns in the Insert statement, except for the
Documnet_ID. As in:
Insert rDocument
Select Document_Status_ID,
Folder_ID,
Author_ID,
DocumentName,
Length,
XMLContent,
CheckInDate,
Comments
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName = 'document name goes here'
>>I can do the SQL, but I can't explain the corruptions!<<
I suspect it has to do something with the AppendChunk method and how you are doing it. But to be honest I have not used that method for several years now, ever since the Stream object was introduced. Besides I felt that doing a straight Update (or Delete/Insert) was far more efficient than opening a recordset and doing an AddNew.
Anthony
It looks like you may be deleting it and then inserting it. That is the reason I recommended an update, however if you want to do it this way than change the Update statement I posted to an Insert:
Insert rDocument
Select *
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName
I realize now that DocumentName is your key field and not Document_ID. Also, if Document_ID is an IDENTITY column that you will have to name all the columns in the Insert statement, except for the
Documnet_ID. As in:
Insert rDocument
Select Document_Status_ID,
Folder_ID,
Author_ID,
DocumentName,
Length,
XMLContent,
CheckInDate,
Comments
From CACHE.dbo.cDocument_New
Where cDocument_New.DocumentName
>>I can do the SQL, but I can't explain the corruptions!<<
I suspect it has to do something with the AppendChunk method and how you are doing it. But to be honest I have not used that method for several years now, ever since the Stream object was introduced. Besides I felt that doing a straight Update (or Delete/Insert) was far more efficient than opening a recordset and doing an AddNew.
Anthony
Do you need any more help with this question?
ASKER
Hi Anthony,
Using the Stream method of the ADODB object worked - although I have to read/write to a temporary file first in order to use the .loadfromfile command.
It's only a minor workaround - but hey, it's working!
I've awarded you the 500 points.
Kind Regards,
Mike
Using the Stream method of the ADODB object worked - although I have to read/write to a temporary file first in order to use the .loadfromfile command.
It's only a minor workaround - but hey, it's working!
I've awarded you the 500 points.
Kind Regards,
Mike
>>although I have to read/write to a temporary file first in order to use the .loadfromfile command.<<
I believe, this is not necessary. Show me your code and I can make more suggestions.
I believe, this is not necessary. Show me your code and I can make more suggestions.
Radim