No1Coder

Convert Word document to text using C# running ASP.NET

I have a web application built using C# and ASP.NET. I need to convert an uploaded Word document to a text string. I have looked at Microsoft.Office.Interop.Word (which works locally), but when I run it on a server (where Word is not installed) it doesn't work because the DLL is not registered. Is there another method of accomplishing this task? OpenXML?

Here is code I have worked with thus far:

        public string convertWordToText(string fileName)
        {
            object missing = Type.Missing;
            object readOnly = true;
            object saveChanges = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges;
            Microsoft.Office.Interop.Word.Application application = null;

            try
            {
                // Starting Word through interop requires Word to be installed on this machine.
                application = new Microsoft.Office.Interop.Word.Application();
                Microsoft.Office.Interop.Word.Document document = application.Documents.Open(fileName, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);

                return document.Content.Text;
            }
            catch (Exception ex)
            {
                Log.error("Error converting document to text: " + ex.Message);
                return string.Format("An error occurred when converting this document to text.  The error returned is '{0}'.  \n\nYou can try copy and paste instead.", ex.Message);
            }
            finally
            {
                // Quit Word exactly once, without saving, even if Open failed.
                if (application != null)
                {
                    ((Microsoft.Office.Interop.Word._Application)application).Quit(ref saveChanges, ref missing, ref missing); //cast as _Application because there's ambiguity
                }
            }
        }
quizwedge

You definitely want to use OpenXML. Using Microsoft.Office.Interop.Word on the server is unsupported and may violate your licensing. Microsoft recommends working with the Open XML file formats instead, via the System.IO.Packaging namespace. See http://support.microsoft.com/en-us/kb/257757 for more details, specifically the section entitled "Alternatives to server-side Automation".
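
To illustrate the OpenXML route, here is a minimal sketch that reads the main document part of a .docx file with System.IO.Packaging and collects the text of its paragraphs. The class and method names (DocxTextExtractor, ExtractText) are made up for this example. It assumes the upload is a .docx (Open XML) file on a seekable stream; older binary .doc files cannot be read this way. Headers, footers, and footnotes live in separate parts and are not included, and text boxes nested inside a paragraph may appear twice.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework; System.IO.Packaging package on newer .NET
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any Microsoft library.
public static class DocxTextExtractor
{
    // WordprocessingML namespace used by w:p, w:t, w:tab and w:br.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Relationship type that points from the package root to the main document part.
    private const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    public static string ExtractText(Stream docxStream)
    {
        using (Package package = Package.Open(docxStream, FileMode.Open, FileAccess.Read))
        {
            PackageRelationship rel = package
                .GetRelationshipsByType(OfficeDocumentRelType)
                .FirstOrDefault();
            if (rel == null)
            {
                throw new InvalidDataException("No main document part found; is this a .docx file?");
            }

            // Package-level relationship targets are resolved against the root "/".
            Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), rel.TargetUri);
            PackagePart part = package.GetPart(partUri);

            XDocument xml;
            using (Stream partStream = part.GetStream(FileMode.Open, FileAccess.Read))
            {
                xml = XDocument.Load(partStream);
            }

            var sb = new StringBuilder();
            foreach (XElement paragraph in xml.Descendants(W + "p"))
            {
                foreach (XElement node in paragraph.Descendants())
                {
                    if (node.Name == W + "t") sb.Append(node.Value);      // run text
                    else if (node.Name == W + "tab") sb.Append('\t');     // tab character
                    else if (node.Name == W + "br") sb.Append('\n');      // manual line break
                }
                sb.AppendLine(); // one output line per paragraph
            }
            return sb.ToString();
        }
    }
}
```

In an ASP.NET page this could be called with the uploaded file's stream, for example DocxTextExtractor.ExtractText(FileUpload1.PostedFile.InputStream), where FileUpload1 is a hypothetical FileUpload control. Nothing here needs Word on the server.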
Please download the Office runtime from the following location and install it on your server:

http://www.microsoft.com/en-us/download/details.aspx?id=18346
ASKER CERTIFIED SOLUTION
quizwedge

Éric Moreau

There are also third-party libraries that can help you. The one I have used is Aspose: http://www.aspose.com/.net/word-component.aspx