• Status: Solved
  • Priority: Medium
  • Security: Public

How to get word document content

Hi,
After opening a Word document, we can get its text content with the following code:

IDispatch* pDispRange = oDocument.GetContent();
Range objRange(pDispRange);
AfxMessageBox(objRange.GetText());

How can I get the entire Word document content (text, images, and tables) as a byte array?
Asked by: ILGDRM
 
alb66 Commented:
Why use Automation to get the binary content?
You can use the standard file functions.
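
A minimal sketch of what "standard file functions" could look like in C++, assuming the document is closed and reachable by path. The helper name ReadAllBytes is hypothetical and does not come from the thread; it returns the raw bytes of the file exactly as stored on disk, which already include text, images, and tables.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file (for example a closed .doc)
// into a byte vector, without interpreting its contents.
std::vector<std::uint8_t> ReadAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    // istreambuf_iterator reads raw characters and does not skip whitespace.
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}
```

This only works on the file as a whole; it does not give access to the document Word currently has open in memory.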
 
ILGDRM (Author) Commented:
Thanks for the reply.
Actually, I want to retrieve the document content in the DocumentOpen event. I want to decrypt the content on the fly and assign it to the _Document pointer so that the user can access the decrypted content.
In DocumentBeforeSave, I want to encrypt the content and store it in the file.
 
ambience Commented:
>> On the fly i want to decrypt the content and assign this content to _Document Ptr so that user can access the Decrypted Content.

Why do you think Word would be able to open your encrypted content, given that you want to fetch the whole thing into a byte array and encrypt it? The DocumentOpen handler fires only after Word has opened the document successfully, and that requires the format and contents of the document to be something Word can understand.

The way Word stores content can differ from the way we see it. For example, a picture may be linked rather than embedded, and there are many other things, such as embedded objects, that make it very difficult to fetch everything into one array of bytes. It may not make sense to encrypt an embedded Excel chart object.

I suggest that you revisit your requirements and decide exactly what you want to achieve. There are various alternatives that you may want to try.

Option 1:
-------------
- Cut the entire contents to the clipboard.
- Fetch the contents as RTF.
- Encrypt the RTF, encode it as text, and put it back onto the clipboard.
- Paste from the clipboard and save.

Option 2:
-------------
Consider encrypting the whole document file rather than the contents of the document. You can encrypt the file after Word has written it and, conversely, decrypt it before feeding it to Word.
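
A rough sketch of the whole-file approach, assuming the file is closed when it is transformed. The names XorTransform and TransformFileInPlace are hypothetical, and the repeating-key XOR is only a stand-in so the example stays self-contained: it is not secure and would need to be replaced by a real cipher from a vetted cryptographic library.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// PLACEHOLDER ONLY: repeating-key XOR is NOT real encryption.
// Applying it twice with the same key restores the input.
Bytes XorTransform(const Bytes& data, const Bytes& key)
{
    if (key.empty())
        throw std::invalid_argument("empty key");
    Bytes out(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
        out[i] = static_cast<std::uint8_t>(data[i] ^ key[i % key.size()]);
    return out;
}

// Hypothetical helper: read the closed file, transform every byte,
// and write the result back over the original.
void TransformFileInPlace(const std::string& path, const Bytes& key)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    Bytes data((std::istreambuf_iterator<char>(in)),
               std::istreambuf_iterator<char>());
    in.close();

    const Bytes result = XorTransform(data, key);

    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        throw std::runtime_error("cannot write " + path);
    out.write(reinterpret_cast<const char*>(result.data()),
              static_cast<std::streamsize>(result.size()));
    if (!out)
        throw std::runtime_error("write failed for " + path);
}
```

Calling TransformFileInPlace once after Word has saved and closed the file scrambles it; calling it again with the same key before handing the path to Word restores the original bytes. Because the file is treated as opaque bytes, embedded images and objects need no special handling.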

Option 3:
-------------
Encrypt the file after it is saved and put the result as text into another document (a surrogate document). When the surrogate is opened, read the text, recreate the real document on the fly, and open it. The benefit over Option 2 is that you will have a valid .doc around.
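
Storing encrypted bytes as text in a surrogate document needs a binary-to-text encoding. The sketch below uses plain hexadecimal as one simple possibility (Base64 would be more compact); ToHex and FromHex are hypothetical helper names, not anything from Word or the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Encode arbitrary bytes as lowercase hex text, two characters per byte.
std::string ToHex(const std::vector<std::uint8_t>& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string text;
    text.reserve(bytes.size() * 2);
    for (std::uint8_t b : bytes) {
        text.push_back(digits[b >> 4]);
        text.push_back(digits[b & 0x0F]);
    }
    return text;
}

// Convert one hex digit (either case) to its value, or throw.
static int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Decode hex text back into the original bytes.
std::vector<std::uint8_t> FromHex(const std::string& text)
{
    if (text.size() % 2 != 0)
        throw std::invalid_argument("odd-length hex string");
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() / 2);
    for (std::size_t i = 0; i < text.size(); i += 2)
        bytes.push_back(static_cast<std::uint8_t>(
            (HexValue(text[i]) << 4) | HexValue(text[i + 1])));
    return bytes;
}
```

The surrogate would hold the output of ToHex as its body text; on open, FromHex recovers the encrypted bytes, which are then decrypted and written out as the real document.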

Option 4:
-------------
Use intelligent in-place encryption. Encrypt text and replace it with text. Encrypt images and replace the originals with valid encrypted images. Repeat this for every sensitive object. You can attach VBA macros to the document that do the actual work in the handlers for the document Open/Close events.


Hope this helps ...
 
ILGDRM (Author) Commented:
Thanks for the approaches you suggested.
Can you provide sample source code for them?
 
isprabu Commented:
Alternatively, you can use VBA, which is better suited to Office Automation. In VBA, you can access the document content easily:
        ActiveDocument.Content



 
ambience Commented:
ActiveDocument.Content will give you a Range object. You can use that object to iterate over images, text, tables, etc., but there is no easy way to convert it into a byte array.

Before we embark on a particular option, can you please provide a little insight into your requirements?

What exactly do you want to achieve by encrypting/decrypting a Word document?
What were your initial thoughts that made you try to encrypt/decrypt the contents in the open/close handlers?
When you encrypt a document, is it necessary that the result also be a Word document?
 
ILGDRM (Author) Commented:
Hi Ambience,

What exactly do you want to achieve by encrypting/decrypting a Word document?
What were your initial thoughts that made you try to encrypt/decrypt the contents in the open/close handlers?
->
My requirement is to protect (encrypt) the output file generated by Word (i.e. *.doc).
I want to encrypt it immediately after the document is stored on disk (on the DocumentSave event). I want to decrypt the same encrypted document on the fly when the user tries to open it (on the DocumentBeforeOpen event). I want to encrypt the entire document or its content irrespective of its data type.
How can I achieve Option 2 suggested by you, i.e. "Consider encrypting the whole document, rather than contents of the doc. You can encrypt the file after it is written by Word and conversely, decrypt before feeding it to word."?

When you encrypt a document, is it necessary that the result also be a Word document?
-> Not necessary.
 
ambience Commented:
Well then, the simplest thing I can recommend is the EncryptFile API provided by Windows.

BOOL bRes = EncryptFile(szPathToWordDoc);

where szPathToWordDoc is the path to the file created by Word. Similarly, you can use the DecryptFile function to do the reverse before opening the document (it takes the path plus a reserved second argument that must be 0).


For more info see:
http://msdn.microsoft.com/en-us/library/aa364021(VS.85).aspx
http://msdn.microsoft.com/en-us/library/aa363903(VS.85).aspx


Hope this helps ...
 
ILGDRM (Author) Commented:
Hi,
Thanks for the reply!
The suggested solution is possible only if the intended document is closed (offline).
My actual requirement is this:
I want to retrieve the document content, or the whole document, on the DocumentOpen event or while the encrypted document is being opened. While saving, or on DocumentBeforeSave, I want to encrypt the document content or the whole document. The file on the hard disk should always be in encrypted form. When the Word application accesses the encrypted file, its content should be decrypted on the fly.
 
ambience Commented:
Strange, the code snippet that you originally posted

IDispatch* pDispRange = oDocument.GetContent();
Range objRange(pDispRange);
AfxMessageBox(objRange.GetText());

suggested something different. It appeared as though you have access to the file and automate Word to open it. In that case, I thought you could insert an extra decryption step.

Is it OK for you to add VBA macros to the document that you are trying to encrypt? Or do you want to use the C++ application to encrypt the doc?
 
isprabu Commented:
ILGDRM,
As I said earlier, VBA is my choice for this problem. Are you open to using VBA? If so, you can get the content of the document easily (see my earlier post), and there are ways to encrypt the data from within VBA. See the link below:
http://www.webace.com.au/~balson/InsaneExcel/Encryption.htm
 
ILGDRM (Author) Commented:
Hi,
I want to use a C++ application (a COM add-in) to encrypt/decrypt the document.
If possible, please provide sample source code.
 
esps Commented:
Hi,

Did you ever get this to work? I am currently trying to achieve the same thing.