Solved

How to get word document content

Posted on 2008-10-06
Last Modified: 2013-12-14
Hi,
After opening a Word document, the following code gives us the text content of the document:

IDispatch* pDispRange = oDocument.GetContent();
Range objRange(pDispRange);
AfxMessageBox(objRange.GetText());

How can I get the entire Word document content (text, images, and tables) as a byte array?
Question by:ILGDRM
15 Comments
 
LVL 19

Expert Comment

by:alb66
ID: 22648685
Why use Automation to get the binary content?
You can use the standard file function.
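For example, here is a minimal sketch in standard C++ that loads a saved .doc into a byte vector and writes one back. The names ReadFileBytes and WriteFileBytes are my own, not part of any Word or MFC API, and the sketch assumes the file is on disk and not locked by Word at the time:

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file (text, images, tables, everything Word stored) as raw bytes.
std::vector<std::uint8_t> ReadFileBytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Write a byte buffer back to disk, replacing any existing file.
void WriteFileBytes(const std::string& path,
                    const std::vector<std::uint8_t>& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot create " + path);
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    if (!out) throw std::runtime_error("write failed for " + path);
}
```

The byte vector is the exact on-disk file, so it includes the document's images and tables without going through Automation.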
 

Author Comment

by:ILGDRM
ID: 22649523
Thanks for the reply.
Actually, I want to retrieve the document content in the DocumentOpen event. I want to decrypt the content on the fly and assign it to the _Document pointer so that the user can access the decrypted content.
In DocumentBeforeSave I want to encrypt the content and store it in the file.
 
LVL 22

Accepted Solution

by:
ambience earned 500 total points
ID: 22700583
>> On the fly i want to decrypt the content and assign this content to _Document Ptr so that user can access the Decrypted Content.

Why do you think Word would be able to open your encrypted content, given that you want to fetch the whole thing into a byte array and encrypt it? OnOpenDocument is fired only after Word has opened the document successfully, and that requires the format and contents of the document to be something Word can understand.

The way Word stores content can differ from the way we see it. For example, a picture may be linked rather than stored, and there are many other things, such as embedded objects, that make it very difficult to fetch everything into one array of bytes. It may not make sense to encrypt an embedded Excel chart object.

I suggest that you revisit your requirements and decide exactly what you want to achieve. There are various alternatives that you may want to try out.

Option 1:
-------------
- Cut the entire contents to the clipboard.
- Fetch the contents as RTF.
- Encrypt the RTF, encode it, and put it back onto the clipboard.
- Paste from the clipboard and save.

Option 2:
-------------
Consider encrypting the whole document file rather than the contents of the document. You can encrypt the file after Word has written it and, conversely, decrypt it before feeding it to Word.

Option 3:
-------------
Encrypt the file after it is saved and put it as text into another document (a surrogate document). When the surrogate is opened, read the text, create the real document on the fly, and open it. The benefit over Option 2 is that you will have a valid .doc around.

Option 4:
-------------
Use intelligent in-place encryption. Encrypt text and replace it with text. Encrypt images and replace the originals with valid encrypted images. Repeat this for every sensitive object. You can attach VB scripts to the document that do the actual work in the handlers for the document Open/Close events.
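
As a rough sketch of the whole-file approach (Option 2) in standard C++: read the saved file, transform every byte, and write the result out. Running the same step again with the same key reverses it. The XOR step is only a placeholder standing in for a real cipher and gives no real protection. The names XorInPlace and TransformFile are made up for this example, and it assumes Word has already closed the file.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Placeholder cipher: XOR each byte with a repeating key.
// NOT secure; swap in a real cipher from a crypto library for actual use.
void XorInPlace(std::vector<char>& data, const std::string& key) {
    assert(!key.empty());
    for (std::size_t i = 0; i < data.size(); ++i)
        data[i] = static_cast<char>(data[i] ^ key[i % key.size()]);
}

// Read srcPath, transform all of its bytes, and write them to dstPath.
// Because XOR is its own inverse, the same call both "encrypts" and "decrypts".
void TransformFile(const std::string& srcPath, const std::string& dstPath,
                   const std::string& key) {
    std::ifstream in(srcPath, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + srcPath);
    std::vector<char> data((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    in.close();

    XorInPlace(data, key);

    std::ofstream out(dstPath, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot create " + dstPath);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    if (!out) throw std::runtime_error("write failed for " + dstPath);
}
```

With hypothetical paths, you would call TransformFile("report.doc", "report.enc", key) after Word saves and closes the file, and TransformFile("report.enc", "report.doc", key) before handing it back to Word.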


Hope this helps ...
 

Author Comment

by:ILGDRM
ID: 22700655
Thanks for the approaches you provided.
Can you provide sample source code for the suggested approaches?
 
LVL 5

Expert Comment

by:isprabu
ID: 22700747
Alternatively, you can use VBA. VBA is better suited for Office Automation. In VBA, you can access the document content easily:
        ActiveDocument.Content
 
LVL 22

Expert Comment

by:ambience
ID: 22700982
ActiveDocument.Content returns a Range object. You can use that object to iterate over images, text, tables, etc., but there is no easy way to convert it into a byte array.

Before we embark upon a particular option, can you please provide a little insight into your requirements?

What exactly do you want to achieve by enc/dec a word document?
What were your initial thoughts that made you try enc/dec the contents in open/close handlers?
When you enc a document, is it necessary that the resultant be also a word doc?
 

Author Comment

by:ILGDRM
ID: 22702033
Hi Ambience,

What exactly do you want to achieve by enc/dec a word document?
What were your initial thoughts that made you try enc/dec the contents in open/close handlers?
->
My requirement is to protect (encrypt) the output generated by Word (i.e. *.doc files).
I want to encrypt the document immediately after it is stored on disk (in the DocumentSave event). I want to decrypt the same encrypted document on the fly when the user tries to open it (in the DocumentBeforeOpen event). I want to encrypt the entire document or its content irrespective of data type. How can I achieve
Option 2 as you suggested, i.e. "Consider encrypting the whole document, rather than contents of the doc. You can encrypt the file after it is written by Word and conversely, decrypt before feeding it to word."

When you enc a document, is it necessary that the resultant be also a word doc?
-> Not necessary.
 
LVL 22

Expert Comment

by:ambience
ID: 22718408
Well then, the simplest thing I can recommend is the EncryptFile API provided by Windows.

BOOL bRes = EncryptFile(szPathToWordDoc);

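where szPathToWordDoc is the path to the file created by Word. Similarly, you can use the DecryptFile function to do the reverse (before opening the document). Note that EncryptFile relies on the Encrypting File System, so the file must be on an NTFS volume.


For more info see: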
http://msdn.microsoft.com/en-us/library/aa364021(VS.85).aspx
http://msdn.microsoft.com/en-us/library/aa363903(VS.85).aspx


Hope this helps ...
 

Author Comment

by:ILGDRM
ID: 22720393
Hi,
Thanks for the reply!
The suggested solution is possible only if the intended document is closed (in offline mode).
Actually, my requirement is this:
I want to retrieve the document content, or the whole document, in the DocumentOpen event or while the encrypted document is being opened. While saving, or in DocumentBeforeSave, I want to encrypt the document content or the whole document. The file on the hard disk should always be in encrypted form. When the Word application accesses the encrypted file, the content should be decrypted on the fly.
 
LVL 22

Expert Comment

by:ambience
ID: 22728309
Strange, the code snippet that you originally posted

IDispatch* pDispRange = oDocument.GetContent();
Range objRange(pDispRange);
AfxMessageBox(objRange.GetText());

suggested something different. It appeared as though you have access to the file and automate Word to open it. In that case I thought that you could insert an extra decryption step.

Is it OK for you to insert VB script macros into the document that you are trying to encrypt? Or do you want to use the C++ application to encrypt the doc?
 
LVL 5

Expert Comment

by:isprabu
ID: 22728359
ILGDRM,
As I said earlier, using VBA is my choice for this problem. Are you open to using VBA? If yes, you can get the content of the document easily (see my earlier post), and there are ways to encrypt the data from within VBA. See the link below:
http://www.webace.com.au/~balson/InsaneExcel/Encryption.htm
 

Author Comment

by:ILGDRM
ID: 22728374
Hi,
I want to use the C++ application (a COM add-in) to encrypt/decrypt the document.
If possible, please provide sample source code.
 

Expert Comment

by:esps
ID: 24121321
Hi,

Did you ever get this to work? I am currently trying to achieve the same thing.