Garbonzo_Horowitz


Parsing an MS Word document with ColdFusion and XPath, and creating a new one with the same formatting as the original.

Greetings, Experts! I have a coding issue that I could use some help with. I'm using ColdFusion 10 to parse a Microsoft Word document, extract the contents while keeping its formatting, and put it into a SQL Server database in multiple parts. For example, if the Word doc were HTML, each row would go into the database separately.

This is turning out to be a tough nut to crack, and the XPath part is where I'm stuck. I can find individual cells in the XML structure that ColdFusion creates, but I can't seem to list all child nodes for a given element. This is all an attempt to keep the MS Word formatting, including special characters. I know parsing HTML would be easier, but the goal is to do it with the original MS Word document.

Given a key word from the left-hand column, the output Word doc should show only the matching rows, with their Details Text from the right-hand column. For example, if the key word search was "Bob", then only the rows that have "Bob" in the first column would be displayed. I'd like to stay away from third-party plug-ins. Any help is appreciated. Thank you.
EE_CFM_XML_DB.txt
TheDocument.docx
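
Editor's illustration of the row-level extraction described in the question. This is a minimal sketch in Python, not ColdFusion, and not the asker's code: the function names are hypothetical, and it assumes the attached document stores its key words and Details Text in an ordinary WordprocessingML table (`w:tbl` / `w:tr` / `w:tc`) inside `word/document.xml`. The namespace URI is the standard WordprocessingML one. The keyword test is a plain substring match on the first cell, which is also an assumption.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace, bound to the conventional "w" prefix.
W_URI = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_URI}
# Keep the "w:" prefix when rows are serialised back to text.
ET.register_namespace("w", W_URI)


def read_document_xml(docx_bytes):
    """A .docx file is a ZIP archive; the body text lives in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.read("word/document.xml")


def cell_text(cell):
    """Concatenate every w:t text node found anywhere inside one table cell."""
    return "".join(t.text or "" for t in cell.iterfind(".//w:t", NS))


def matching_rows(document_xml, keyword):
    """Return (first-cell text, raw row XML) for rows whose first cell contains keyword."""
    root = ET.fromstring(document_xml)
    hits = []
    for row in root.iterfind(".//w:tbl/w:tr", NS):
        # "w:tc" with no leading "//" lists only the direct child cells of this row.
        cells = row.findall("w:tc", NS)
        if cells and keyword in cell_text(cells[0]):
            # Serialising the whole w:tr keeps the run properties (bold, fonts, etc.).
            hits.append((cell_text(cells[0]), ET.tostring(row, encoding="unicode")))
    return hits
```

Storing the serialised `w:tr` element, rather than only its text, is what would preserve the Word formatting for each database row. The same path expressions (`//w:tbl/w:tr`, then `w:tc` relative to each row) are the kind of thing one would pass to ColdFusion's `XmlSearch`, though that translation is untested here.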
XML, Microsoft Word, ColdFusion Language, XPath, SQL

Last Comment
_agx_

8/22/2022 - Mon