gateguard

How can I read MS Word files with C programs?

I have some Borland C++ programs I wrote that search text files and edit them automatically, and I would like to adapt these programs to do the same thing with Word97 files.  How do I alter them to get past all the Word97 "wrapping" and into the text of the files, which is the only part I care about?

Sorry, but mainly bad news here...

There is no "wrapper" to get past so that you can access the text part of a Word document. Since each individual character can have distinct properties, the special characters are interspersed with "text" throughout the document.

Your only chance is to find someone who has hacked this proprietary format. Unfortunately, the only people who have gone through the considerable expense of doing this are selling products based on that ability.

HOWEVER, the good news is that I located one for you. Here is the URL for one company that sells such a library.
http://wwwwbs.cs.tu-berlin.de/~schwartz/pmh/laola.html

Another company that has done this is DataViz, but I believe they only sell a product to translate the Word document to something else.

Tom

Word 97 files are not ASCII.  They contain enormous amounts of binary data that describe the fonts, character formatting, embedded fields, embedded objects, page format etc.  As far as I know, the format of this information is not published, so you will not be able to parse it to extract the text.
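
One quick way to see the "not ASCII" point for yourself: Word 97 saves its .doc files as OLE compound files (structured storage), and those start with a fixed 8-byte signature rather than with text. Here is a minimal sketch of that check. The function name looksLikeCompoundFile and the idea of handing it the first few bytes of the file are only for illustration; it identifies the container, not Word specifically.

```cpp
#include <cstddef>
#include <cstring>

// Returns true if the buffer starts with the OLE compound-file signature
// (D0 CF 11 E0 A1 B1 1A E1). This only identifies the container format;
// it does not prove the file is a Word document.
// The function name is hypothetical, chosen for this sketch.
bool looksLikeCompoundFile(const unsigned char* data, std::size_t size) {
    static const unsigned char kSignature[8] =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    return size >= sizeof(kSignature) &&
           std::memcmp(data, kSignature, sizeof(kSignature)) == 0;
}
```

Read the first 8 bytes of the file in binary mode and pass them in; a plain text file will fail the check.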

You can have Word save the file in other formats, such as text only, which is easy to parse but has no formatting.  Or you can have it save in RTF, which is a well-documented format that does carry the formatting information; because you can parse it, you can skip the formatting and keep just the text.
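
To give an idea of what "skip the formatting" means for RTF, here is a rough sketch, not a full RTF reader. It assumes the whole file is already in a std::string, keeps the visible text, turns \par, \line and \tab into whitespace, decodes \'hh bytes, and drops control words plus a handful of non-text groups (font table, color table, stylesheet, info, pictures, and anything marked with \*). The function names are made up for the example, and things like Unicode escapes and fields are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hex digit to value, or -1 if the character is not a hex digit.
static int hexValue(char h) {
    if (h >= '0' && h <= '9') return h - '0';
    if (h >= 'a' && h <= 'f') return h - 'a' + 10;
    if (h >= 'A' && h <= 'F') return h - 'A' + 10;
    return -1;
}

// Destinations whose contents are not document text.
static bool isSkippedDestination(const std::string& word) {
    return word == "fonttbl" || word == "colortbl" || word == "stylesheet" ||
           word == "info" || word == "pict";
}

// Hypothetical helper: strips RTF markup and returns the plain text.
std::string rtfToPlainText(const std::string& rtf) {
    std::string out;
    int depth = 0;       // current { } nesting level
    int skipDepth = -1;  // when >= 0, drop everything until that group closes
    for (std::size_t i = 0; i < rtf.size(); ++i) {
        const char c = rtf[i];
        if (c == '{') { ++depth; continue; }
        if (c == '}') {
            if (depth == skipDepth) skipDepth = -1;
            --depth;
            continue;
        }
        if (c == '\r' || c == '\n') continue;  // raw line breaks carry no text
        if (c != '\\') {
            if (skipDepth < 0) out += c;
            continue;
        }
        if (i + 1 >= rtf.size()) break;        // lone trailing backslash
        const char next = rtf[i + 1];
        if (std::isalpha(static_cast<unsigned char>(next))) {
            // Control word: letters, optional signed number, optional one space.
            std::size_t j = i + 1;
            std::string word;
            while (j < rtf.size() && std::isalpha(static_cast<unsigned char>(rtf[j])))
                word += rtf[j++];
            if (j + 1 < rtf.size() && rtf[j] == '-' &&
                std::isdigit(static_cast<unsigned char>(rtf[j + 1]))) ++j;
            while (j < rtf.size() && std::isdigit(static_cast<unsigned char>(rtf[j]))) ++j;
            if (j < rtf.size() && rtf[j] == ' ') ++j;
            i = j - 1;
            if (skipDepth >= 0) continue;
            if (word == "par" || word == "line") out += '\n';
            else if (word == "tab") out += '\t';
            else if (isSkippedDestination(word)) skipDepth = depth;
        } else if (next == '\'') {
            // \'hh : one byte written as two hex digits.
            if (i + 3 < rtf.size() && skipDepth < 0) {
                const int hi = hexValue(rtf[i + 2]);
                const int lo = hexValue(rtf[i + 3]);
                if (hi >= 0 && lo >= 0) out += static_cast<char>(hi * 16 + lo);
            }
            i += 3;
        } else if (next == '*') {
            if (skipDepth < 0) skipDepth = depth;  // {\*\dest ...} is ignorable
            ++i;
        } else {
            // Control symbol; \\ \{ \} stand for literal characters.
            if (skipDepth < 0 && (next == '\\' || next == '{' || next == '}')) out += next;
            ++i;
        }
    }
    return out;
}
```

Once the text is out, the existing search code for plain text files can run on the returned string more or less unchanged.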
>>>As far as I know, the format of this information is not published, so you will not be able to parse it to extract the text.
====================
Sorry, nietod, but I have to disagree with you. The fact that a file's format is not published doesn't prevent anyone from hacking it (or buying software to do so). I listed two such companies above.

At Peachtree Software we had to hack Quicken and QuickBooks files in order to provide translation to our file format. Intuit did the same thing with our files. Obviously, since Peachtree and Intuit are number 1 and 2 in the small business accounting software field, neither company gave the other the file formats needed. Each company had to spend the time and money to hack the format of the other company's files themselves.

However, gateguard, the point is that it is going to cost you either money (to buy a third-party library) or time (to hack the file yourself) in order to get the Word file format.

Tom

>>>As far as I know, the format of this information is not published, so you will not be able to parse it to extract the text.
====================
>> Sorry, nietod, but I have to disagree with you. The fact that a file's format is not published doesn't prevent anyone from hacking it (or buying software to do so). I listed two such companies above.

But realistically, gateguard is not going to be able to hack through it himself.  (If he was that experienced, he wouldn't have asked.)  In fact I wonder how well these other companies have done handling the advanced features.  It might not be an issue, in this case anyway, but I suspect that one is capable of producing Word documents that these libraries can't handle.

I'd like to suggest an alternative.  I agree that the MS Word file format is not published but there are products out there that can read/write it.  The problem with them is that as soon as Word is updated, they are broken until the "code" is cracked again.  MS Word, however, has a published interface that will allow you to do whatever you want with its data.  It's called OLE and works very well indeed.  Granted, the learning curve to write your first OLE program is significant, but it's the "right" way to solve this problem (on a Windows platform for now at least) and keeps you from having to worry about what version of Word you are working with.  The biggest drawback that I can think of would be the requirement to have Word available on the system as an OLE server to provide the data.

In addition to jhance's remarks I think it really comes down to what gateguard is trying to do.

My answer was originally directed solely at his question of how to read a Word file. However, if he wants to do an automated search and replace, he can use a combination of Automation and Word macros to accomplish this. The only thing that worries me here is that he explicitly stated "c-programs". As we all know, Automation in 'C' is a helluva lot more difficult than in C++. He may have been using the term 'C' in a generic sense and meant C/C++, or he may only need to know how to write Word macros.

In any case, I think at this point we need a little more detail to fully and correctly answer the question.

Tom

Thanks, everyone.  I am using Borland C++.  Here's what I have, and here's what I'm doing.  I have files with personally written notes.  Each file has 100 notes.  Each note has a personally constructed header with a number, title, date, synopsis, and list of categories.  Beneath that is the body of the note.  My Borland C++ program lets me input the category I want to search the note database for, and then it pulls out as much of each note in that category as I want.  Maybe I want titles only.  Or maybe headers only.  Or maybe only the first five lines of the body.  Whatever.

I don't care about C++.  I just don't want text files anymore because I'm starting to use links in my notes and I want to keep the Word97 format.  So it sounds like OLE is the way to go.  Maybe.  Definitely not Word macros as far as I can tell.  I can't really tell who locked the question but I'm giving you the points.  This is a great discussion.

gateguard,

This is one of those situations where nietod and I focused on what you asked and weren't able to guess what you needed. It's obvious now that you are going to need to use Automation. That is going to be the only way you are going to control Word from another application. In that case, since jhance was the first person to realize this, you need to reject my answer and ask him to answer so that he can get the points.

Tom

Also you say that

>> I can't really tell who locked the question but I'm giving you the points.

The currently proposed answer appears right below the question.  That is how you tell who will get the points.  Please try to make sure that the right person (who you feel is the right person) gets the points.  Nothing makes experts more -- well, let's say grumpy -- than having the wrong answer get the points.

ok, jhance, go ahead, how do i use automation to do this?
RONSLOW,

If you have an example of how to do this, jump right in.  I know it can be done using OLE, but I've not done it myself and don't have any examples handy.

The Word document file format isn't a secret. It's quite well documented. Just because you haven't done the research doesn't mean there's a secret.

Get the Office Developer's SDK, or look in MSDN for the topic "Microsoft Word 97 Binary File Format".  You can also find an article or two on ordering this specific documentation if you don't already have MSDN.
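
To give a flavour of what that documentation describes: the main stream of a Word 97 file (named "WordDocument" inside the compound file) begins with a header called the File Information Block. The sketch below reads a few of its fields. It assumes you have already pulled the stream's bytes into memory, and the offsets and flag bits are as I understand the published format, so check them against the documentation before relying on them. The struct and function names are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>

// A few fields from the start of the "WordDocument" stream (the File
// Information Block). Struct and function names are hypothetical.
struct WordFibSummary {
    bool valid;          // header long enough and wIdent == 0xA5EC
    std::uint16_t nFib;  // format version number (Word 97 is expected to write 193)
    bool complex;        // fComplex: file was fast-saved
    bool encrypted;      // fEncrypted
    bool usesTable1;     // fWhichTblStm: tables are in "1Table", not "0Table"
};

// Reads a 16-bit little-endian value.
static std::uint16_t readLe16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

WordFibSummary readFibSummary(const unsigned char* stream, std::size_t size) {
    WordFibSummary s = { false, 0, false, false, false };
    if (size < 12) return s;                    // need bytes 0..11
    if (readLe16(stream) != 0xA5EC) return s;   // wIdent magic at offset 0
    s.valid = true;
    s.nFib = readLe16(stream + 2);              // version at offset 2
    const std::uint16_t flags = readLe16(stream + 10);  // flag word at offset 0x0A
    s.complex    = (flags & 0x0004) != 0;
    s.encrypted  = (flags & 0x0100) != 0;
    s.usesTable1 = (flags & 0x0200) != 0;
    return s;
}
```

Getting from this header to the actual text means following further offsets in the same block to the piece table, which is where the documentation becomes essential.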

B ekiM


Refer to the DrawCLI MFC sample .. this has classes for reading property info streams from a file.

My classes sit on top of that.

Following are two classes. The first is COlePropertySet, which gives some utility routines and provides a 'friendlier' interface to the property sets.

Derived from this is CSummaryInformationPropertySet, which has members specific to the OLE summary information info.

They are long .. but here you go...
/////////////////////////////////////////////////////////////////////////////
//  OlePropertySet.h OLE property sets
//

#ifndef _COlePropertySet_
#define _COlePropertySet_

#include <winnls.h>

#define LPCTSTR_SUMMARYINFORMATION "SummaryInformation"
#define LPCTSTR_DOCUMENTSUMMARYINFORMATION "DocumentSummaryInformation"

static const GUID GUID_SUMMARYINFORMATION =
{ 0xf29f85e0, 0x4ff9, 0x1068,
{ 0xab, 0x91, 0x08, 0x00, 0x2b, 0x27, 0xb3, 0xd9 } };
static const GUID GUID_DOCUMENTSUMMARYINFORMATION =
{ 0xd5cdd502, 0x2e9c, 0x101b,
{ 0x93, 0x97, 0x08, 0x00, 0x2b, 0x2c, 0xf9, 0xae } };

extern const CLSID theCLSID;

// these come from the DrawCLI sample
class CPropertySet;
class CPropertySection;

class COlePropertySet : public CObject {
      DECLARE_DYNAMIC(COlePropertySet)
      CLSID m_AppID;            // CLSID of Set(same as application)
      CString m_Name;            // Name of Set and Section (eg. "SummaryInformation")
      GUID m_FormatID;      // Format GUID
      CPropertySet* m_pSet;
      CPropertySection* m_pSection;
protected:
      static GUID NameToGUID(LPCTSTR name);
      GUID NameToGUID() { return NameToGUID(m_Name); }
public:
      COlePropertySet(LPCTSTR name = LPCTSTR_SUMMARYINFORMATION);
      ~COlePropertySet();
      void Reset();
      bool WriteToStorage( IStorage* pIStorage ) const;
      bool ReadFromStorage( IStorage* pIStorage );
private:
      LPVOID Get( DWORD dwPropID ) const;
      bool Set( DWORD dwPropID, LPVOID pValue, DWORD dwType );
      bool GetVariant( DWORD dwID, COleVariant& v ) const;
      bool SetVariant( DWORD dwID, const COleVariant& v );
      DWORD GetType ( DWORD dwID) const;
public:
      CString GetString( DWORD dwID) const { COleVariant v; GetVariant(dwID,v); v.ChangeType(VT_BSTR); return v.bstrVal; }
      bool SetString( DWORD dwID, LPCSTR v ) { return Set(dwID,LPVOID(v),VT_LPSTR); }      // Set() takes LPVOID, so the const pointer needs a cast
      int GetInt( DWORD dwID) const { COleVariant v; GetVariant(dwID,v); v.ChangeType(VT_I4); return v.lVal; }
      bool SetInt( DWORD dwID, int i) { return Set(dwID,&i,VT_I4); }
      bool GetBOOL( DWORD dwID) const { COleVariant v; GetVariant(dwID,v); v.ChangeType(VT_BOOL); return v.boolVal != 0; }
      bool SetBOOL( DWORD dwID, bool b) { VARIANT_BOOL vb = b ? VARIANT_TRUE : VARIANT_FALSE; return Set(dwID,&vb,VT_BOOL); }      // VARIANT_TRUE is -1, not 1
      FILETIME GetFILETIME( DWORD dwID) const { static const FILETIME z = {0,0}; FILETIME* pFILETIME = (FILETIME*)Get(dwID); return pFILETIME ? *pFILETIME : z; }
      bool SetFILETIME( DWORD dwID, const FILETIME& filetime) { return Set(dwID,LPVOID(&filetime),VT_FILETIME); }
};

#endif


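/////////////////////////////////////////////////////////////////////////////
//  OlePropertySet.cpp OLE property sets
//

#include "stdafx.h"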
#include "OlePropertySet.h"
#include "PropSet.h"

#ifdef _DEBUG
#undef THIS_FILE
static char BASED_CODE THIS_FILE[] = __FILE__;
#endif

////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////

IMPLEMENT_DYNAMIC(COlePropertySet, CObject)

GUID COlePropertySet::NameToGUID(LPCTSTR name) {
#define CBIT_CHARMASK      5
#define CBIT_BYTE            8
#define CBIT_GUID            (CBIT_BYTE * sizeof(GUID))
#define CCH_MAP                  (1 << CBIT_CHARMASK)        /* 32 */
#define CHARMASK            (CCH_MAP - 1)            /* 0x1f */
      if (0 == strcmp(name,LPCTSTR_SUMMARYINFORMATION)) {
            return GUID_SUMMARYINFORMATION;
      } else if (0 == strcmp(name,LPCTSTR_DOCUMENTSUMMARYINFORMATION)) {
            return GUID_DOCUMENTSUMMARYINFORMATION;
      } else {
            GUID id = {0};
            LPCSTR pc = (LPCSTR)name;
            LPBYTE pb = ((LPBYTE)&id) - 1;
            for (ULONG cbit = 0; cbit < CBIT_GUID; cbit += CBIT_CHARMASK) {
                  ULONG cbitUsed = cbit % CBIT_BYTE;
                  ULONG cbitStored;
                  if (cbitUsed == 0) pb++;
                  int c = *pc;
                  if (c) pc++;
                  if (isascii(c) && islower(c)) c = toupper(c);
                  int i = isupper(c) ? c-'A' : isdigit(c) ? c-'0'+'Z'-'A'+1 : -1;
                  if (i < 0 || i >= CCH_MAP) i = CCH_MAP-1;
                  *pb |= (BYTE)(i << cbitUsed);
                  cbitStored = min(CBIT_BYTE - cbitUsed, CBIT_CHARMASK);
                  // If the translated bits wouldn't all fit in the current byte
                  if (cbitStored < CBIT_CHARMASK) {
                        i >>= CBIT_BYTE - cbitUsed;
                        if (cbit + cbitStored == CBIT_GUID) break;
                        pb++;
                        *pb |= (BYTE)i;
                  }
            }
            return id;
      }
}

COlePropertySet::COlePropertySet(LPCTSTR name)
: m_Name(name)
{
      m_FormatID = NameToGUID();
      m_pSet = new CPropertySet(theCLSID);
      m_pSection = m_pSet->AddSection(m_FormatID);
      m_pSection->SetSectionName(m_Name);
      short cp = (short)GetACP();      // the code page property is stored as a 2-byte VT_I2
      m_pSection->Set( PID_CODEPAGE, (void*)&cp, VT_I2);
}

COlePropertySet::~COlePropertySet() {
      delete m_pSet;
}

void COlePropertySet::Reset() {
      m_FormatID = NameToGUID();
      if (m_pSet) delete m_pSet;
      m_pSet = new CPropertySet(theCLSID);
      m_pSection = m_pSet->AddSection(m_FormatID);
      m_pSection->SetSectionName(m_Name);
      short cp = (short)GetACP();      // the code page property is stored as a 2-byte VT_I2
      m_pSection->Set( PID_CODEPAGE, (void*)&cp, VT_I2);
}

bool COlePropertySet::WriteToStorage( LPSTORAGE lpRootStg ) const {
      LPSTREAM lpStream = NULL;
      CString streamname = '\005'+m_Name;
      if (lpRootStg == NULL) {
            TRACE(_T("No root storage for %s\n"),LPCTSTR(m_Name));
            return false;
      } else {
            HRESULT result = lpRootStg->CreateStream(
                  LPCOLESTRVAR(LPCTSTR(streamname)),
                  STGM_SHARE_EXCLUSIVE|STGM_CREATE|STGM_READWRITE,
                  0, 0, &lpStream
                  );
            if (FAILED(result)) {
                  TRACE(_T("CreateStream %s failed %d\n"),LPCTSTR(m_Name),result);
                  return false;
            } else {
                  bool ok = m_pSet->WriteToStream(lpStream);
                  if (ok) {
                        lpRootStg->Commit(STGC_DEFAULT);
                  } else {
                        lpRootStg->Revert();
                        TRACE(_T("WriteToStream %s failed\n"),LPCTSTR(m_Name));
                  }
                  lpStream->Release();
                  return ok;
            }
      }
}

bool COlePropertySet::ReadFromStorage( LPSTORAGE lpRootStg ) {
      LPSTREAM lpStream = NULL;
      CString streamname = '\005'+m_Name;
      if (lpRootStg == NULL) {
            TRACE(_T("No root storage for %s\n"),LPCTSTR(m_Name));
            return false;
      } else {
            HRESULT result = lpRootStg->OpenStream(
                  LPCOLESTRVAR(LPCTSTR(streamname)),
                  NULL, STGM_SHARE_EXCLUSIVE|STGM_READ,
                  0, &lpStream
                  );
            if (FAILED(result)) {
                  TRACE(_T("OpenStream %s failed %d\n"),LPCTSTR(m_Name),result);
                  return false;
            } else {
                  bool ok = m_pSet->ReadFromStream(lpStream);
                  lpStream->Release();
                  m_pSection = m_pSet->GetSection(m_FormatID);
                  if (m_pSection) {
                        m_pSection->SetSectionName(m_Name);
                  } else {
                        m_pSet->RemoveAll();
                  }
                  FixupNames();      // NOTE: FixupNames() is not declared in the header as posted; supply it or remove this call
                  if (! ok) {
                        TRACE(_T("ReadFromStream %s failed\n"),LPCTSTR(m_Name));
                  }
                  return ok;
            }
      }
}

LPVOID COlePropertySet::Get( DWORD dwPropID ) const {
      return m_pSection->Get(dwPropID);
}

bool COlePropertySet::Set( DWORD dwPropID, LPVOID pValue, DWORD dwType ) {
      return m_pSection->Set(dwPropID,pValue,dwType);
}

bool COlePropertySet::GetVariant( DWORD dwID, COleVariant& v ) const {
      CProperty* pProp = m_pSection->GetProperty(dwID);
      if (! pProp) {
            v.Clear();
            return false;
      }
      DWORD dwType = pProp->GetType();
      LPVOID pValue = pProp->Get();
      switch( dwType ) {
      case VT_EMPTY:          /* nothing                     */
            v.ChangeType((VARTYPE)dwType);
            return true;
      case VT_I2:             /* 2 byte signed int           */
            v = COleVariant(*(short*)pValue,(VARTYPE)dwType);
            return true;
      case VT_BOOL:           /* True=-1, False=0            */
            v = COleVariant(*(VARIANT_BOOL*)pValue,(VARTYPE)dwType);
            return true;
      case VT_I4:             /* 4 byte signed int           */
            v = COleVariant(*(long*)pValue,(VARTYPE)dwType);
            return true;
      case VT_R4:             /* 4 byte real                 */
            v = *(float*)pValue;
            return true;
      case VT_R8:             /* 8 byte real                 */
            v = *(double*)pValue;
            return true;
      case VT_CY:             /* currency                    */
            v = COleCurrency(*(CURRENCY*)pValue);
            return true;
      case VT_DATE:           /* date                        */
            v = COleDateTime(*(DATE*)pValue);
            return true;
      case VT_BSTR:           /* binary string               */
            v = CString((BSTR)pValue);
            return true;
      case VT_STREAM:         /* Name of the stream follows  */
      case VT_STORAGE:        /* Name of the storage follows */
      case VT_STREAMED_OBJECT:/* Stream contains an object   */
      case VT_STORED_OBJECT:  /* Storage contains an object  */
      case VT_STREAMED_PROPSET:/* Stream contains a propset  */
      case VT_STORED_PROPSET: /* Storage contains a propset  */
            v = CString((LPCTSTR)pValue);
            return true;
      case VT_LPSTR:          /* null terminated string      */
            v = CString((LPCSTR)pValue);
            return true;
      default:
            v.Clear();
            return false;
      }
}
bool COlePropertySet::SetVariant( DWORD dwID, const COleVariant& v ) {
      if (v.vt == VT_BSTR) {
            return m_pSection->Set(dwID,LPVOID(LPCSTRVAR(v.bstrVal)),VT_LPSTR);
      } else if (v.vt == VT_UI1) {
            short iVal = v.bVal;
            return m_pSection->Set(dwID,LPVOID(&iVal),VT_I2);
      } else if (v.vt == VT_ERROR) {
            return m_pSection->Set(dwID,LPVOID(&v.scode),VT_I4);
      } else {
            return m_pSection->Set(dwID,LPVOID(&v.bVal),v.vt);
      }
}

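// NOTE: GetValue() and SetValue() rely on QAnyValue, a class that is not part of
// this post, and neither member is declared in the header as posted. Omit both
// functions unless you supply your own variant-like type.
bool COlePropertySet::GetValue( DWORD dwID, QAnyValue& v ) const {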
      CProperty* pProp = m_pSection->GetProperty(dwID);
      if (! pProp) {
            v.Clear();
            return false;
      }
      DWORD dwType = pProp->GetType();
      LPVOID pValue = pProp->Get();
      switch( dwType ) {
      case VT_EMPTY:          /* nothing                     */
            v.Clear();
            return true;
      case VT_I2:             /* 2 byte signed int           */
            v = (int)*(short*)pValue;
            return true;
      case VT_BOOL:           /* True=-1, False=0            */
            v = (bool)*(VARIANT_BOOL*)pValue;
            return true;
      case VT_I4:             /* 4 byte signed int           */
            v = (int)*(long*)pValue;
            return true;
      case VT_R4:             /* 4 byte real                 */
            v = *(float*)pValue;
            return true;
      case VT_R8:             /* 8 byte real                 */
            v = *(double*)pValue;
            return true;
      case VT_CY:             /* currency                    */
            v = (*(CURRENCY*)pValue).int64/10000.0;
            return true;
      case VT_DATE:           /* date                        */
            v = *(DATE*)pValue;
            return true;
      case VT_BSTR:           /* binary string               */
            v = CString((BSTR)pValue);
            return true;
      case VT_STREAM:         /* Name of the stream follows  */
      case VT_STORAGE:        /* Name of the storage follows */
      case VT_STREAMED_OBJECT:/* Stream contains an object   */
      case VT_STORED_OBJECT:  /* Storage contains an object  */
      case VT_STREAMED_PROPSET:/* Stream contains a propset  */
      case VT_STORED_PROPSET: /* Storage contains a propset  */
            v = CString((LPCTSTR)pValue);
            return true;
      case VT_LPSTR:          /* null terminated string      */
            v = CString((LPCSTR)pValue);
            return true;
      default:
            v.Clear();
            return false;
      }
}
bool COlePropertySet::SetValue( DWORD dwID, const QAnyValue& v ) {
      QAnyValue vv(v);
      if (vv.IsString()) {
            return m_pSection->Set(dwID,LPVOID(LPCSTRVAR(vv.MakeCString())),VT_LPSTR);
      } else if (vv.IsBool()) {
            VARIANT_BOOL vb = vv.MakeBool();
            return m_pSection->Set(dwID,LPVOID(&vb),VT_BOOL);
      } else if (vv.IsInt()) {
            int vi = vv.MakeInt();
            return m_pSection->Set(dwID,LPVOID(&vi),VT_I4);
      } else if (vv.IsDouble()) {
            double vd = vv.MakeDouble();
            return m_pSection->Set(dwID,LPVOID(&vd),VT_R8);
      } else {
            double vd = vv.MakeDouble();
            return m_pSection->Set(dwID,LPVOID(&vd),VT_R8);
      }
}

DWORD COlePropertySet::GetType( DWORD dwID) const {
      CProperty* pProp = m_pSection->GetProperty(dwID);
      return pProp ? pProp->GetType() : (DWORD)-1;      // (DWORD)-1 means "no such property"
}



/////////////////////////////////////////////////////////////////////////////
//  SummaryInformationPropertySet.h OLE property sets
//

#ifndef _CSummaryInformationPropertySet_
#define _CSummaryInformationPropertySet_

#include "OlePropertySet.h"

class CSummaryInformationPropertySet : public COlePropertySet {
      __int64 m_startEdit;
public:
      CSummaryInformationPropertySet();
      void Reset();
public:
      bool SetTitle(LPCTSTR szTitle) { return SetString(PIDSI_TITLE, szTitle); }
      bool SetSubject(LPCTSTR szSubject) { return SetString(PIDSI_SUBJECT, szSubject); }
      bool SetAuthor(LPCTSTR szAuthor) { return SetString(PIDSI_AUTHOR, szAuthor); }
      bool SetKeywords(LPCTSTR szKeywords) { return SetString(PIDSI_KEYWORDS, szKeywords); }
      bool SetComments(LPCTSTR szComments) { return SetString(PIDSI_COMMENTS, szComments); }
      bool SetTemplate(LPCTSTR szTemplate) { return SetString(PIDSI_TEMPLATE, szTemplate); }
      bool SetLastAuthor(LPCTSTR szLastAuthor) { return SetString(PIDSI_LASTAUTHOR, szLastAuthor); }
      bool SetRevNumber(ULONG nRevNumber);
      bool SetEditTime(const FILETIME& ftEditTime) { return SetFILETIME(PIDSI_EDITTIME, ftEditTime); }
      bool SetLastPrintDate(const FILETIME& ftLastPrintDate) { return SetFILETIME(PIDSI_LASTPRINTED, ftLastPrintDate); }
      bool SetCreateDate(const FILETIME& ftCreateDate) { return SetFILETIME(PIDSI_CREATE_DTM, ftCreateDate); }
      bool SetLastSaveDate(const FILETIME& ftLastSaveDate) { return SetFILETIME(PIDSI_LASTSAVE_DTM, ftLastSaveDate); }
      bool SetNumPages(ULONG nNumPages) { return SetInt(PIDSI_PAGECOUNT, nNumPages); }
      bool SetNumWords(ULONG nNumWords) { return SetInt(PIDSI_WORDCOUNT, nNumWords); }
      bool SetNumChars(ULONG nNumChars) { return SetInt(PIDSI_CHARCOUNT, nNumChars); }
      bool SetAppname(LPCTSTR szAppname) { return SetString(PIDSI_APPNAME, szAppname); }
      bool SetSecurity(ULONG nSecurity) { return SetInt(PIDSI_DOC_SECURITY, nSecurity); }
public:
      bool IncrRevNumber();
      void StartEditTimeCount();
      bool AddCountToEditTime();
      bool RecordPrintDate();
      bool RecordCreateDate();
      bool RecordSaveDate();
      bool SetUserAsAuthor();
      bool SetUserAsLastAuthor();
public:
      CString GetTitle() const { return GetString(PIDSI_TITLE); }
      CString GetSubject() const { return GetString(PIDSI_SUBJECT); }
      CString GetAuthor() const { return GetString(PIDSI_AUTHOR); }
      CString GetKeywords() const { return GetString(PIDSI_KEYWORDS); }
      CString GetComments() const { return GetString(PIDSI_COMMENTS); }
      CString GetTemplate() const { return GetString(PIDSI_TEMPLATE); }
      CString GetLastAuthor() const { return GetString(PIDSI_LASTAUTHOR); }
      ULONG GetRevNumber() const;
      FILETIME GetEditTime() const { return GetFILETIME(PIDSI_EDITTIME); }
      FILETIME GetLastPrintDate() const { return GetFILETIME(PIDSI_LASTPRINTED); }
      FILETIME GetCreateDate() const { return GetFILETIME(PIDSI_CREATE_DTM); }
      FILETIME GetLastSaveDate() const { return GetFILETIME(PIDSI_LASTSAVE_DTM); }
      ULONG GetNumPages() const { return GetInt(PIDSI_PAGECOUNT); }
      ULONG GetNumWords() const { return GetInt(PIDSI_WORDCOUNT); }
      ULONG GetNumChars() const { return GetInt(PIDSI_CHARCOUNT); }
      CString GetAppname() const { return GetString(PIDSI_APPNAME); }
      ULONG GetSecurity() const { return GetInt(PIDSI_DOC_SECURITY); }
};

#endif



/////////////////////////////////////////////////////////////////////////////
//  SummaryInformationPropertySet.cpp OLE property sets
//

#include "stdafx.h"

#include "SummaryInformationPropertySet.h"

#ifdef _DEBUG
#undef THIS_FILE
static char BASED_CODE THIS_FILE[] = __FILE__;
#endif

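// Convert through ULARGE_INTEGER rather than casting FILETIME* to __int64*,
// which can cause alignment faults on some platforms.
inline __int64 toint64(const FILETIME& ft) { ULARGE_INTEGER u; u.LowPart = ft.dwLowDateTime; u.HighPart = ft.dwHighDateTime; return (__int64)u.QuadPart; }
inline FILETIME toFILETIME(__int64 i) { ULARGE_INTEGER u; u.QuadPart = (ULONGLONG)i; FILETIME ft; ft.dwLowDateTime = u.LowPart; ft.dwHighDateTime = u.HighPart; return ft; }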

CSummaryInformationPropertySet::CSummaryInformationPropertySet()
: COlePropertySet()
, m_startEdit(0)
{
      Reset();
}

void CSummaryInformationPropertySet::Reset() {
      COlePropertySet::Reset();

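      // NOTE: SetNewName() is not declared in the classes as posted. These calls
      // appear only to attach names to the property IDs; supply it or drop them.
      SetNewName(PIDSI_TITLE,_T("title"));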
      SetNewName(PIDSI_SUBJECT,_T("subject"));
      SetNewName(PIDSI_AUTHOR,_T("author"));
      SetNewName(PIDSI_KEYWORDS,_T("keywords"));
      SetNewName(PIDSI_COMMENTS,_T("comments"));
      SetNewName(PIDSI_TEMPLATE,_T("template"));
      SetNewName(PIDSI_LASTAUTHOR,_T("lastauthor"));
      SetNewName(PIDSI_REVNUMBER,_T("revnumber"));
      SetNewName(PIDSI_EDITTIME,_T("edittime"));
      SetNewName(PIDSI_LASTPRINTED,_T("lastprinted"));
      SetNewName(PIDSI_CREATE_DTM,_T("create_dtm"));
      SetNewName(PIDSI_LASTSAVE_DTM,_T("lastsave_dtm"));
      SetNewName(PIDSI_PAGECOUNT,_T("pagecount"));
      SetNewName(PIDSI_WORDCOUNT,_T("wordcount"));
      SetNewName(PIDSI_CHARCOUNT,_T("charcount"));
      SetNewName(PIDSI_THUMBNAIL,_T("thumbnail"));
      SetNewName(PIDSI_APPNAME,_T("appname"));
      SetNewName(PIDSI_DOC_SECURITY,_T("security"));

      SetUserAsAuthor();
      SetUserAsLastAuthor();
      SetAppname(AfxGetAppName());
}

bool CSummaryInformationPropertySet::SetRevNumber(ULONG nRev) {
      char buff[20];
      sprintf(buff, "%lu", nRev);
      return SetString (PIDSI_REVNUMBER, buff);
}

ULONG CSummaryInformationPropertySet::GetRevNumber() const {
      ULONG nRev = 0;
      CString strRev = GetString(PIDSI_REVNUMBER);
      if (!strRev.IsEmpty()) sscanf(strRev, "%lu", &nRev);      // "if (strRev)" tested a pointer that is never null
      return nRev;
}

bool CSummaryInformationPropertySet::IncrRevNumber() {
      ULONG nRev = GetRevNumber();
      nRev++;
      return SetRevNumber(nRev);
}

void CSummaryInformationPropertySet::StartEditTimeCount() {
      FILETIME now;
      CoFileTimeNow(&now);
      m_startEdit = toint64(now);
}

bool CSummaryInformationPropertySet::AddCountToEditTime() {
      FILETIME now;
      CoFileTimeNow(&now);
      __int64 currTime = toint64(now);
      __int64 thisSession = currTime - m_startEdit;
      __int64 lastTotal = toint64(GetEditTime());
      __int64 newTotal = lastTotal + thisSession;
      return SetEditTime(toFILETIME(newTotal));
}

bool CSummaryInformationPropertySet::RecordPrintDate() {
      FILETIME printDate;
      CoFileTimeNow(&printDate);
      return SetLastPrintDate(printDate);
}

bool CSummaryInformationPropertySet::RecordCreateDate() {
      FILETIME createDate;
      CoFileTimeNow(&createDate);
      return SetCreateDate(createDate);
}

bool CSummaryInformationPropertySet::RecordSaveDate() {
      FILETIME saveDate;
      CoFileTimeNow(&saveDate);
      return SetLastSaveDate(saveDate);
}

// Named so that it cannot be confused with the Win32 GetUserName() API it wraps.
static LPCTSTR CurrentUserName () {
      #define MAXUSERNAME 128
      static char username[MAXUSERNAME+1] = "";
      DWORD len = sizeof(username);
      if (! ::GetUserName(username,&len)) username[0] = '\0';      // empty name on failure
      return username;
}

bool CSummaryInformationPropertySet::SetUserAsAuthor() {
      return SetAuthor(CurrentUserName());
}

bool CSummaryInformationPropertySet::SetUserAsLastAuthor() {
      return SetLastAuthor(CurrentUserName());
}

my apologies for the formatting ... forgot to detab it.

This code has been trimmed and massaged from working code .. I hope it works for you.
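
A note on the edit-time bookkeeping in AddCountToEditTime: a FILETIME is a 64-bit count of 100-nanosecond ticks split into two 32-bit halves, and the summary info stores total editing time in that same form as a duration. The arithmetic can be sketched without any Windows headers as below. FileTimeParts is only a stand-in for FILETIME, and all the names are invented for this example.

```cpp
#include <cstdint>

// Portable stand-in for the Win32 FILETIME layout (two 32-bit halves).
// Hypothetical type, used here so the sketch builds without <windows.h>.
struct FileTimeParts {
    std::uint32_t low;
    std::uint32_t high;
};

const std::int64_t kTicksPerSecond = 10000000;  // one tick is 100 ns

// Joins the two halves into a single tick count.
std::int64_t toTicks(const FileTimeParts& ft) {
    return static_cast<std::int64_t>(
        (static_cast<std::uint64_t>(ft.high) << 32) | ft.low);
}

// Splits a tick count back into the two halves.
FileTimeParts fromTicks(std::int64_t ticks) {
    const std::uint64_t u = static_cast<std::uint64_t>(ticks);
    FileTimeParts ft = { static_cast<std::uint32_t>(u & 0xFFFFFFFFu),
                         static_cast<std::uint32_t>(u >> 32) };
    return ft;
}

// Adds one editing session (start..end) to a running total duration.
FileTimeParts addSession(const FileTimeParts& total,
                         const FileTimeParts& start,
                         const FileTimeParts& end) {
    return fromTicks(toTicks(total) + (toTicks(end) - toTicks(start)));
}

// Whole minutes contained in a duration, rounded down.
std::int64_t wholeMinutes(const FileTimeParts& duration) {
    return toTicks(duration) / (kTicksPerSecond * 60);
}
```

Working on the joined 64-bit value means carries between the low and high halves are handled by ordinary integer addition.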

My hat's off to you all.  I'll try it and post a comment on what worked.