Solved

Office 2007 docx file corrupted when retrieved from SQL 2005 DB in C++

Posted on 2008-10-21
Last Modified: 2012-05-05
The attached code uses chunking (ADO AppendChunk/GetChunk) to store a Word document as binary data in a SQL Server 2005 database. It works fine for Office 2003 .doc files, but Office 2007 .docx files come back from the database corrupted.

When I try to open a file retrieved from the DB, Word says it is corrupted. If I select Repair, the original file is recovered.

I have been at a loss to explain the difference. I know a .docx is essentially a zip file containing XML files, but it is stored as binary in the database; the column type is image. I feel this method of storage should be OK, but I am not sure.

I have read online that SQL 2005 stores docx files with an extra byte. However, I compared the byte array length and the content at the end, and they look similar for the file being read in and the data read out of the DB.

This has halted our migration to Office 2007 and any help would be greatly appreciated.

Many thanks
// storing the data on DB

	// get the file in memory
	ULONG datasize = (ULONG)file.GetLength();
	file.Close();

	int counter=0;
	char ch;
	SAFEARRAY FAR *psa;
	SAFEARRAYBOUND rgsabound[1];
	rgsabound[0].lLbound = 0;
	rgsabound[0].cElements = datasize;
	psa = SafeArrayCreate(VT_UI1,1,rgsabound);
	long index1 = 0;
	std::ifstream in(filename, std::ios::in | std::ios::binary);
	while(!in.eof())
	{
		in.get(ch);
		HRESULT hr = SafeArrayPutElement(psa,&index1,(void*)&ch);
		index1++;
	}
	in.close();

	// update the CV image data
	cmd.Format("select * from CV where DocID = %ld", cd.m_nDocID);
	pCVInfo->CursorType = adOpenKeyset;
	pCVInfo->LockType = adLockOptimistic;
	pCVInfo->Open(_bstr_t(cmd), _variant_t((IDispatch*)m_pConn,true),adOpenKeyset,adLockOptimistic,adCmdText);
	_variant_t varChunk;
	varChunk.vt = VT_ARRAY|VT_UI1;
	varChunk.parray = psa;
	pCVInfo->ADOFields->GetItem("Data")->AppendChunk(varChunk);
	if(pCVInfo->Update()==S_OK)
	{
		pCVInfo->Close();
		bRet = true;
	}
	pCVInfo=NULL;


// retrieving the data from DB

	CString filter;
	filter.Format("select data from CV where DocID = %ld", docID);

	const int nChunkSize = 1024;
	_RecordsetPtr pCVInfo = NULL;

	pCVInfo.CreateInstance(__uuidof(Recordset));
	pCVInfo->CursorType = adOpenStatic;
	pCVInfo->LockType = adLockOptimistic;
	pCVInfo->Open(_bstr_t(filter), _variant_t((IDispatch*)m_pConn,true),adOpenStatic,adLockReadOnly,adCmdText);

	ULONG datasize = pCVInfo->ADOFields->Item["Data"]->ActualSize;
	//Create a safe array to store the array of BYTES
	ULONG lngOffSet = 0;
	std::ofstream out(filename, std::ios::out | std::ios::binary);
	UCHAR chData;
	while(lngOffSet < datasize)
	{
		_variant_t varChunk = pCVInfo->ADOFields->Item["Data"]->GetChunk(nChunkSize);
		//Copy the data only upto the Actual Size of Field.
		for(long index=0;index<=(nChunkSize-1);index++)
		{
			HRESULT hr = SafeArrayGetElement(varChunk.parray,&index,(void*)&chData);
			out.put((char)chData);
		}
		lngOffSet = lngOffSet + nChunkSize;
	}
	lngOffSet = 0;
	out.close();

	pCVInfo->Close();
	pCVInfo = NULL;

Question by:husaam
Accepted Solution

by:Zoppo (earned 500 total points)
Hi husaam,

I guess the problem is that your function writes too much at the end of the file when the last chunk is shorter than 'nChunkSize': the 'for' loop always writes 'nChunkSize' chars, even when 'lngOffSet + nChunkSize' exceeds 'datasize'.

I found a sample of how to use this in MSDN. There, the return value of SafeArrayGetElement is checked to detect when no more data is present. Maybe this works for you too:

...
		for(long index=0;index<=(nChunkSize-1);index++)
		{
			HRESULT hr = SafeArrayGetElement(varChunk.parray,&index,(void*)&chData);
			if ( SUCCEEDED( hr ) )
			{
				out.put((char)chData);
			}
			else
			{
				break;
			}
		}
		lngOffSet = lngOffSet + nChunkSize;
...
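
The same idea can be shown without ADO, as a minimal portable sketch: write only as many bytes as each chunk actually holds, so a short final chunk is never padded out to the full chunk size. The names getChunk and copyField are hypothetical stand-ins for illustration (getChunk only imitates the way GetChunk returns a shorter array for the last chunk), not part of the ADO API.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Field::GetChunk: returns up to `chunkSize` bytes
// starting at `offset`, and fewer (possibly none) when the data runs out.
std::vector<unsigned char> getChunk(const std::vector<unsigned char>& field,
                                    std::size_t offset, std::size_t chunkSize)
{
    if (offset >= field.size()) {
        return std::vector<unsigned char>();
    }
    const std::size_t remaining = field.size() - offset;
    const std::size_t n = std::min(chunkSize, remaining);
    return std::vector<unsigned char>(field.begin() + offset,
                                      field.begin() + offset + n);
}

// Copies the whole field, appending chunk.size() bytes per iteration
// instead of a fixed chunkSize.
std::string copyField(const std::vector<unsigned char>& field,
                      std::size_t chunkSize)
{
    std::string out;
    std::size_t offset = 0;
    while (offset < field.size()) {
        const std::vector<unsigned char> chunk = getChunk(field, offset, chunkSize);
        if (chunk.empty()) {
            break;  // no progress possible (for example chunkSize == 0)
        }
        out.append(chunk.begin(), chunk.end());
        offset += chunk.size();  // advance by what was really read
    }
    return out;
}
```

With a 2500-byte field and a chunk size of 1024, this writes 1024 + 1024 + 452 bytes, whereas a loop that always writes a full chunk would produce 3072 bytes.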


Hope that helps,

ZOPPO

Author Comment

by:husaam
Fantastic, Zoppo,

you were spot on.
I guess Word in Office 2003 did not mind the extra characters, but Word 2007 (Word 12) does.

Thanks once again,

Author Closing Comment

by:husaam
Thanks !
Expert Comment

by:Zoppo
Yes, it seems so ...

you're welcome, I'm glad I could help you.

Have a nice day,

best regards,

ZOPPO