Office 2007 docx file corrupted when retrieved from SQL 2005 DB in C++

Posted on 2008-10-21
Last Modified: 2012-05-05
The attached code uses chunking to store a .doc file as binary data in a SQL Server database. It works fine for Office 2003 .doc files, but Office 2007 .docx files come back from the database corrupted.

When I try to open the file retrieved from the DB, Word says it is corrupted. If I select Repair, the original file is recovered.

I have been at a loss to explain the difference. I know a docx is essentially a zip file containing XML files, but the files are stored as binary in the database; the field type in the DB is image. I feel that this method of storage should be OK, but I am not sure.

I have read online that SQL 2005 stores docx files in the database with an extra byte. However, I have compared the byte array length and the content at the end, and they are similar for the file being read in and the data read out of the DB.
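The comparison described above covers what goes into and comes out of the database, not the file that is finally written to disk. A minimal sketch of a whole-file comparison follows, assuming both files fit in memory; `readAllBytes` and `firstDifference` are hypothetical helper names, not part of the posted code.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into memory as raw bytes (binary mode, no translation).
std::vector<unsigned char> readAllBytes(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Return the offset of the first differing byte, or -1 if the buffers match.
// If one buffer is a strict prefix of the other, the offset where the
// shorter one ends is returned.
std::ptrdiff_t firstDifference(const std::vector<unsigned char>& a,
                               const std::vector<unsigned char>& b) {
    const std::ptrdiff_t common =
        static_cast<std::ptrdiff_t>(std::min(a.size(), b.size()));
    auto diff = std::mismatch(a.begin(), a.begin() + common, b.begin());
    if (diff.first != a.begin() + common) {
        return diff.first - a.begin();   // a byte inside the common part differs
    }
    if (a.size() != b.size()) {
        return common;                   // same prefix, different lengths
    }
    return -1;                           // identical
}
```

Calling `firstDifference(readAllBytes(original), readAllBytes(retrieved))` on the original document and the file written after retrieval would show whether the two differ only in length, and at which offset.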

This has halted our migration to Office 2007 and any help would be greatly appreciated.

Many thanks
// storing the data on DB
	// get the file in memory
	ULONG datasize = (ULONG)file.GetLength();
	int counter=0;
	char ch;
	SAFEARRAYBOUND rgsabound[1];
	rgsabound[0].lLbound = 0;
	rgsabound[0].cElements = datasize;
    	psa = SafeArrayCreate(VT_UI1,1,rgsabound);
	long index1 = 0;
	std::ifstream in(filename, std::ios::in | std::ios::binary);
		HRESULT hr = SafeArrayPutElement(psa,&index1,(void*)&ch);
	// update the CV image data
	cmd.Format("select * from CV where DocID = %ld", cd.m_nDocID);
	pCVInfo->CursorType = adOpenKeyset;
	pCVInfo->LockType = adLockOptimistic;
	pCVInfo->Open(_bstr_t(cmd), _variant_t((IDispatch*)m_pConn,true),adOpenKeyset,adLockOptimistic,adCmdText);
	_variant_t varChunk;
	varChunk.vt = VT_ARRAY|VT_UI1;
	varChunk.parray = psa;
		bRet = true;
// retrieving the data from DB
	CString filter;
	filter.Format("select data from CV where DocID = %ld", docID);
	const int nChunkSize = 1024;
	_RecordsetPtr pCVInfo = NULL;
	pCVInfo->CursorType = adOpenStatic;
	pCVInfo->LockType = adLockOptimistic;
	pCVInfo->Open(_bstr_t(filter), _variant_t((IDispatch*)m_pConn,true),adOpenStatic,adLockReadOnly,adCmdText);
	ULONG datasize = pCVInfo->ADOFields->Item["Data"]->ActualSize;
	//Create a safe array to store the array of BYTES  
	ULONG lngOffSet = 0;
	std::ofstream out(filename, std::ios::out | std::ios::binary);
	UCHAR chData;
	while(lngOffSet < datasize)
	{
		_variant_t varChunk = pCVInfo->ADOFields->Item["Data"]->GetChunk(nChunkSize);
		//Copy the data only up to the Actual Size of Field.  
		for(long index=0;index<=(nChunkSize-1);index++)
		{
			HRESULT hr = SafeArrayGetElement(varChunk.parray,&index,(void*)&chData);
			out.put(chData);
		}
		lngOffSet = lngOffSet + nChunkSize;
	}
	lngOffSet = 0;		
	pCVInfo = NULL;


Question by:husaam
LVL 31

Accepted Solution

Zoppo earned 500 total points
ID: 22766472
Hi husaam,

I guess the problem is that your function writes too much at the end of the file when the last chunk is shorter than 'nChunkSize': the 'for' loop always writes 'nChunkSize' chars, even when 'lngOffSet + nChunkSize' > 'datasize'.
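To put numbers on the overshoot, here is a small sketch of the arithmetic. The function names and the 2500-byte file size are made up for illustration; only the chunk size of 1024 comes from the posted code.

```cpp
#include <cassert>
#include <cstddef>

// Bytes emitted by a loop that always writes one full chunk per iteration
// while the offset is still below dataSize.
std::size_t bytesWrittenByFullChunks(std::size_t dataSize, std::size_t chunkSize) {
    assert(chunkSize > 0);
    const std::size_t chunks = (dataSize + chunkSize - 1) / chunkSize;  // round up
    return chunks * chunkSize;
}

// Trailing bytes that do not belong to the original file.
std::size_t extraBytes(std::size_t dataSize, std::size_t chunkSize) {
    return bytesWrittenByFullChunks(dataSize, chunkSize) - dataSize;
}
```

For a hypothetical 2500-byte document and nChunkSize = 1024, the loop runs three times and writes 3072 bytes, so 572 bytes of padding end up after the real content. Only files whose size is an exact multiple of the chunk size come out unpadded.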

I found a sample for how to use this in MSDN. There, the return value of SafeArrayGetElement is evaluated to detect when no more data is present. Maybe this works for you too:

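                for(long index=0;index<=(nChunkSize-1);index++)
                {
                        HRESULT hr = SafeArrayGetElement(varChunk.parray,&index,(void*)&chData);
                        if ( SUCCEEDED( hr ) )
                                out.put(chData);  // write only bytes that really exist in the chunk
                        else break;               // index is past the end of a short last chunk
                }
                lngOffSet = lngOffSet + nChunkSize;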
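
The same idea can be shown without ADO. The sketch below is an illustration only: it uses standard streams as a stand-in for the recordset field, and `copyInChunks` and `roundTripsExactly` are hypothetical names. Instead of checking each element, it limits every read to the bytes that remain and writes only what was actually read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Copy totalSize bytes from in to out in pieces of at most chunkSize bytes.
// The last piece is shortened so nothing beyond totalSize is written.
std::size_t copyInChunks(std::istream& in, std::ostream& out,
                         std::size_t totalSize, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::vector<char> buffer(chunkSize);
    std::size_t written = 0;
    while (written < totalSize) {
        const std::size_t want = std::min(chunkSize, totalSize - written);
        in.read(buffer.data(), static_cast<std::streamsize>(want));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0) break;  // source ended early
        out.write(buffer.data(), static_cast<std::streamsize>(got));
        written += got;
    }
    return written;
}

// Usage sketch: copy an in-memory buffer and check that it is unchanged.
bool roundTripsExactly(std::size_t size, std::size_t chunkSize) {
    const std::string original(size, 'x');
    std::istringstream in(original);
    std::ostringstream out;
    const std::size_t copied = copyInChunks(in, out, original.size(), chunkSize);
    return copied == size && out.str() == original;
}
```

With a chunk size of 1024, a 2500-byte buffer is copied as 1024 + 1024 + 452 bytes, so the output has the same length as the input.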

Hope that helps,


Author Comment

ID: 22766646
Fantastic Zoppo,

you were spot on.
I guess Word in Office 2003 did not mind the extra characters, but Word 12 does.
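
One plausible reason (an assumption, not something confirmed in this thread) is that a docx is a ZIP container, and a ZIP archive ends with an "end of central directory" record that readers look for at the end of the file. Here is a rough sketch of a check for trailing junk; the function name is hypothetical, and it assumes the archive carries no ZIP comment, so an archive with a comment would also fail the check.

```cpp
#include <cstddef>
#include <vector>

// True when the buffer ends exactly with a ZIP end-of-central-directory
// record (signature "PK\x05\x06", 22 bytes) whose comment length is zero.
bool endsWithPlainZipEocd(const std::vector<unsigned char>& bytes) {
    const std::size_t kEocdSize = 22;  // fixed-size part of the record
    if (bytes.size() < kEocdSize) return false;
    const std::size_t start = bytes.size() - kEocdSize;
    const bool hasSignature = bytes[start] == 'P' && bytes[start + 1] == 'K' &&
                              bytes[start + 2] == 0x05 && bytes[start + 3] == 0x06;
    // The last two bytes of the record hold the comment length.
    const bool noComment = bytes[start + 20] == 0 && bytes[start + 21] == 0;
    return hasSignature && noComment;
}
```

A file padded with extra bytes after the real content no longer ends with this record, so the check returns false for it.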

Thanks once again,

Author Closing Comment

ID: 31508213
Thanks !
LVL 31

Expert Comment

ID: 22767128
Yes, it seems so ...

you're welcome, I'm glad I could help you.

Have a nice day,

best regards,

