Convert Word doc to Plain text before database insert

Is there any way, when uploading a Word document, to convert the main body of the document to text/HTML for storage in a varchar or similar column in SQL Server, instead of those dreadful BLOB fields? The maximum size of an uploaded document is 128K.

I hope this method would greatly improve searching and output.

By the way, has anyone tried out the Verity Ultraseek products?

500 points up for grabs.

jyokum Commented:
Download the Text Extractor and put the jar file in your CF classpath. This extractor is based on Jakarta POI. If you need more functionality than just reading the file, go get the full-blown POI from Apache (

Here's the link to

Once you get it set up, it's simple to use:

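<cfscript>
fileName = ExpandPath('ee.doc'); // this should be the full path to your file
docText = ''; // default so the output below still works if extraction fails
try {
      input = CreateObject('java','').init(fileName);
      // extractText returns the document's body text as a string
      docText = CreateObject('java','org.textmining.text.extraction.WordExtractor').extractText(input);
} catch(Any e) {
      WriteOutput('ERROR: ' & e.detail);
}
</cfscript>

<cfoutput>
contents of "#fileName#"<br />
<textarea cols="50" rows="12">#docText#</textarea>
</cfoutput>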
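
For the database side, something along these lines might work once the text has been extracted. This is only a rough sketch: the datasource name "myDSN", the Documents table, and its FileName and BodyText columns are all made up for illustration, and it assumes docText and fileName were set as in the snippet above.

```cfml
<!--- Hypothetical datasource, table, and column names; adjust to your own schema --->
<cfquery datasource="myDSN">
    INSERT INTO Documents (FileName, BodyText)
    VALUES (
        <!--- store just the file name, not the full server path --->
        <cfqueryparam cfsqltype="cf_sql_varchar" value="#GetFileFromPath(fileName)#">,
        <!--- bind the extracted text as a long character value --->
        <cfqueryparam cfsqltype="cf_sql_longvarchar" value="#docText#">
    )
</cfquery>
```

Keep in mind that on SQL Server 2000 a varchar column tops out at 8,000 characters, so for documents up to 128K the BodyText column would probably need to be a text column rather than varchar.
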

Samuel Neff put together some really good information that he presented at CFUN this year regarding Office integration. I'm sure there's stuff in here that can help.

What type of server are you running? Windows/*nix, Apache/IIS?
Options will vary according to your architecture.
jturkington (Author) Commented:
Windows 2003 Web Edition / IIS6, Windows 2003 Standard, SQL Server Enterprise 2000
jturkington (Author) Commented:
Thanks jyokum, but that's not what I'm looking for.

Question has a verified solution.
