Convert Word document to text using C# running ASP.NET

I have a web application built using C# and ASP.NET.  I need to convert an uploaded Word document to a text string.  I have looked at Microsoft.Office.Interop.Word (which works locally), but when I run it on a server (where Word is not installed) it doesn't work because the DLL is not registered.  Is there another method of accomplishing this task?  OpenXML?

Here is code I have worked with thus far:

        public string convertWordToText(string fileName)
        {
            try
            {
                object missing = Type.Missing;
                object readOnly = true;

                Microsoft.Office.Interop.Word.Application application = new Microsoft.Office.Interop.Word.Application();
                Microsoft.Office.Interop.Word.Document document = application.Documents.Open(fileName, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);

                string text = document.Content.Text;
                // Quit Word once, without saving changes.
                object saveChanges = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges;
                ((Microsoft.Office.Interop.Word._Application)application).Quit(ref saveChanges, ref missing, ref missing); //cast as _Application because there's ambiguity
                return text;
            }
            catch (Exception ex)
            {
                Log.error("Error converting document to text.");
                return string.Format("An error occurred when converting this document to text.  The error returned is '{0}'.  \n\nYou can try copy and paste instead.", ex.Message);
            }
        }
No1Coder asked:
 
quizwedge commented:
More on OpenXML: it looks like Microsoft has an SDK for OpenXML built on top of System.IO.Packaging at http://www.microsoft.com/en-us/search/Results.aspx?q=Open%20xml%20sdk&form=DLC (there are a few different versions and samples).

Then you should be able to use:

using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
// Open a WordprocessingDocument read-only using the file path
// (false = not editable; extracting text does not need write access).
using (WordprocessingDocument wordprocessingDocument =
    WordprocessingDocument.Open(filepath, false))
{
    // Assign a reference to the existing document body.
    Body body = wordprocessingDocument.MainDocumentPart.Document.Body;
    // InnerText concatenates the text of every element in the body.
    string text = body.InnerText;
}

Code adapted from https://msdn.microsoft.com/en-us/library/office/ff478255.aspx?cs-save-lang=1&cs-lang=csharp#code-snippet-1
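
Building on the snippet above, here is a minimal sketch of a complete helper that returns the document text, assuming the Open XML SDK (the DocumentFormat.OpenXml package) is referenced by the web project. The class name WordTextExtractor and the method name ConvertDocxToText are made up for illustration. It only handles .docx files, not the legacy binary .doc format.

```csharp
using System;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class WordTextExtractor
{
    // Hypothetical helper: returns the body text of a .docx file,
    // one line per paragraph.
    public static string ConvertDocxToText(string filePath)
    {
        // Second argument false = open read-only; nothing is written back.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(filePath, false))
        {
            Body body = doc.MainDocumentPart?.Document?.Body;
            if (body == null)
            {
                return string.Empty;
            }

            var sb = new StringBuilder();
            // Descendants<Paragraph>() also finds paragraphs inside tables.
            foreach (Paragraph paragraph in body.Descendants<Paragraph>())
            {
                // InnerText joins the text of all runs in the paragraph.
                sb.AppendLine(paragraph.InnerText);
            }
            return sb.ToString();
        }
    }
}
```

This is plain-text extraction only: formatting, headers, footers and footnotes are ignored, and tabs or manual line breaks inside a paragraph are not turned into whitespace. WordprocessingDocument.Open also has an overload that takes a Stream, which may be more convenient for an uploaded file than saving it to disk first.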
 
quizwedge commented:
You definitely want to use OpenXML. Using Microsoft.Office.Interop.Word on the server is unsupported and may violate your licensing. Microsoft recommends using OpenXML via System.IO.Packaging. See http://support.microsoft.com/en-us/kb/257757 for more details, specifically the section entitled "Alternatives to server-side Automation".
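
If adding the SDK to the server is not an option, a .docx file is a ZIP package whose main part is XML, so the text can be pulled out with framework classes alone. The sketch below swaps in System.IO.Compression.ZipArchive and LINQ to XML rather than System.IO.Packaging, and it assumes the main part is stored at word/document.xml, which is the usual location but is strictly defined by the package relationships. The names DocxZipReader and ExtractText are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class DocxZipReader
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: reads a .docx from a stream and returns its
    // text, one line per paragraph. The caller keeps ownership of the stream.
    public static string ExtractText(Stream docxStream)
    {
        using (var zip = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the main document part lives at the conventional path.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; is this a .docx file?");
            }

            XDocument xml;
            using (Stream partStream = entry.Open())
            {
                xml = XDocument.Load(partStream);
            }

            var sb = new StringBuilder();
            foreach (XElement paragraph in xml.Descendants(W + "p"))
            {
                // Each w:t element holds one fragment of the paragraph's text.
                sb.AppendLine(string.Concat(
                    paragraph.Descendants(W + "t").Select(t => t.Value)));
            }
            return sb.ToString();
        }
    }
}
```

In an ASP.NET page this could be called with the upload's stream, for example DocxZipReader.ExtractText(FileUpload1.PostedFile.InputStream), where FileUpload1 is a placeholder control name. As with any plain-text extraction, headers, footers, footnotes and formatting are not included, and legacy .doc files will fail because they are not ZIP packages.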
 
atulvjain1 commented:
Please download the Office runtime from the following location and install it on your server:

http://www.microsoft.com/en-us/download/details.aspx?id=18346
 
Éric Moreau (Senior .Net Consultant) commented:
There are also some third-party components that can help you. The one I have used is Aspose: http://www.aspose.com/.net/word-component.aspx
Question has a verified solution.