No1Coder
asked on
Convert Word document to text using C# running ASP.NET
I have a web application built using C# and ASP.NET. I need to convert an uploaded Word document to a text string. I have looked at Microsoft.Office.Interop.Word (which works locally), but when I run it on a server (where Word is not installed) it doesn't work because the DLL is not registered. Is there another method of accomplishing this task? OpenXML?
Here is code I have worked with thus far:
public string convertWordToText(string fileName)
{
try
{
object missing = Type.Missing;
object readOnly = true;
Microsoft.Office.Interop.Word.Application application = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document document = application.Documents.Open(fileName, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
string text = document.Content.Text;
// Quit only once, discarding changes; a second Quit call would fail because Word has already exited.
object saveChanges = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges;
((Microsoft.Office.Interop.Word._Application)application).Quit(ref saveChanges, ref missing, ref missing); //cast as _Application because there's ambiguity
return text;
}
catch (Exception ex)
{
Log.error("Error converting document to text.");
return string.Format("An error occurred when converting this document to text. The error returned is '{0}'. \n\nYou can try copy and paste instead.", ex.Message);
}
}
You definitely want to use OpenXML. Using Microsoft.Office.Interop.Word on the server is unsupported and may violate your licensing. Microsoft recommends using OpenXML via the System.IO.Packaging namespace. See http://support.microsoft.com/en-us/kb/257757 for more details, specifically the section entitled "Alternatives to server-side Automation".
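As a rough illustration of the OpenXML route, here is a minimal sketch that reads the main document part of a .docx through System.IO.Packaging and concatenates the text runs of each paragraph. The class name DocxText and the method name Extract are made up for this example. It assumes the upload is a .docx in the usual (transitional) WordprocessingML namespace and that the project references WindowsBase (or the System.IO.Packaging package on newer .NET). It will not read legacy binary .doc files, and it ignores tabs, line breaks inside a paragraph, headers, footers and footnotes.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any library.
public static class DocxText
{
    // WordprocessingML main namespace (transitional schema).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Relationship type that points from the package root to the main document part.
    private const string OfficeDocumentRel =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Expects a readable, seekable stream containing a .docx package.
    public static string Extract(Stream docx)
    {
        using (Package package = Package.Open(docx, FileMode.Open, FileAccess.Read))
        {
            PackageRelationship rel =
                package.GetRelationshipsByType(OfficeDocumentRel).FirstOrDefault();
            if (rel == null)
                throw new InvalidDataException("No main document part found; is this a .docx file?");

            // Resolve the relationship target against the package root.
            Uri partUri = PackUriHelper.ResolvePartUri(
                new Uri("/", UriKind.Relative), rel.TargetUri);
            PackagePart part = package.GetPart(partUri);

            XDocument xml;
            using (Stream partStream = part.GetStream(FileMode.Open, FileAccess.Read))
            {
                xml = XDocument.Load(partStream);
            }

            // One output line per w:p element, built from its w:t text runs.
            var sb = new StringBuilder();
            foreach (XElement paragraph in xml.Descendants(W + "p"))
            {
                foreach (XElement run in paragraph.Descendants(W + "t"))
                    sb.Append(run.Value);
                sb.AppendLine();
            }
            return sb.ToString();
        }
    }
}
```

In an ASP.NET page this could be called with the InputStream of the uploaded HttpPostedFile, so nothing has to be written to disk and Word does not need to be installed on the server. Paragraphs nested inside text boxes may appear twice with this simple traversal, so treat it as a starting point only.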
Please download the Office runtime from the following location and install it on your server:
http://www.microsoft.com/en-us/download/details.aspx?id=18346
There are also some third-party components that can help you. The one I have used is http://www.aspose.com/.net/word-component.aspx