Text documents (grabbing the first few sentences)

rossryan asked
Last Modified: 2010-04-15
Alright, I'm going to hell for asking this question (wait for it): how can I grab the first xxx characters from a text (I know this one), doc, or rtf file?

Commented:
Are you doing this on a server? Or can you use Word automation?
Once you have the file contents read into a string, you can use Substring. No offense if you already knew this much. In .NET, Substring takes two arguments: the index of the first character and the number of characters to take (not the index of the last character). You could do something like

aStringFormofsomeWorddoc.Substring(0,3);

Hope that helps
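
For the plain-text case, the idea above can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python rather than the .NET-style call shown in the thread, the helper names `first_chars` and `first_sentences` are hypothetical, the file is assumed to be UTF-8, and the sentence splitting is deliberately naive (it breaks after `.`, `!`, or `?` followed by whitespace, so abbreviations such as "e.g. this" will be split too).

```python
import re


def first_chars(path, count, encoding="utf-8"):
    """Return at most `count` characters from the start of a text file."""
    with open(path, "r", encoding=encoding) as handle:
        # read(count) stops after `count` characters, so a large file
        # is never loaded into memory in full.
        return handle.read(count)


def first_sentences(text, limit):
    """Return up to `limit` leading sentences from `text` (naive split)."""
    # Split on whitespace that follows a sentence-ending punctuation mark.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(parts[:limit])
```

Reading a generous number of characters first and then trimming to whole sentences avoids cutting the excerpt off mid-sentence. This does not work on .doc files, which are not plain text.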

Author

Commented:
It's getting the file contents (particularly Word documents) into the string that I am interested in ;).

What is Word automation (an API?).

I need to programmatically grab the first few sentences, i.e. I need an API or code that grabs them (for use in a program).
Word, along with many other MS Office products, has a means of writing scripts. This is called VBA (Visual Basic for Applications). It's under Tools/Macro/Visual Basic Editor. Perhaps this is what is meant by API?

Have you tried reading a .doc file with standard IO?

I've got to run now. If you haven't figured this out by the time I get back, I will try to help. Good luck.
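
On the standard IO question: a binary .doc file is stored in an OLE compound file container, so reading it as text returns mostly binary data, while an .rtf file is plain text wrapped in control words and starts with `{\rtf`. A minimal sketch of telling the three cases apart before choosing how to read the file is shown below. It is written in Python as an assumption (the thread names no language), `sniff_format` is a hypothetical helper, and the check is a rough one: other OLE files such as .xls share the same signature, and "text" is only the fallback guess.

```python
# Signature of the OLE compound file container used by binary .doc files.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Every RTF document begins with this control word.
RTF_MAGIC = b"{\\rtf"


def sniff_format(path):
    """Guess 'doc', 'rtf', or 'text' from the first bytes of a file."""
    with open(path, "rb") as handle:
        head = handle.read(8)  # the longest signature checked is 8 bytes
    if head.startswith(OLE_MAGIC):
        return "doc"
    if head.startswith(RTF_MAGIC):
        return "rtf"
    return "text"  # fallback: nothing recognised, assume plain text
```

Only the "text" result is safe to read directly with ordinary file IO; the other two need a parser or Word itself to get at the readable content.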

Commented:
>>It's getting the file contents (particularly Word documents) into the string that I am interested in ;).
That's why I asked. If you are running a Winforms app, you can use the Word automation objects to open and manipulate a document. Using the Word objects, it is relatively trivial to grab the first X characters.
http://msdn.microsoft.com/vstudio/office/default.aspx?pull=/library/en-us/odc_vsto2003_ta/html/wordobject.asp

If, however, you are ripping through lots of documents being uploaded to a server, this might not be the best option.
Yes, I agree with the above. It depends on what kind of solution you are trying to provide. A couple of years back I worked with a content management system that published web templates from Word. It worked with a VBA script and validated documents to be published against a DTD. I didn't build it, so I'm afraid I don't know much about it.

Author

Commented:
Right. Now, does this work with Office XP, or shall I install 2003? (I've been hoping to hold off on that one).