Reading an MS Word document through Automation is very slow

I am trying to read the data in an MS Word document through Automation, but it is very slow. How can I speed up the reading? I have attached the code; in it, OnGetTextFromWord() is an event handler that is invoked whenever the user presses the GetText button in my app.

MsWord.GetLine(i) is the call that gets the data from the Word document line by line; each line is then displayed in a RichEditControl using the AppendToLog() function.



CString CWordAutomation::GetLine(int nLine)
{
	CString szLine = _T("");
	if(NULL  == m_pdispWordApp)
		return szLine;
 
	VARIANTARG varg1, varg2;
	int wdGoToLine = 3;		//MsWord constant
	int wdGoToAbsolute = 1;	//MsWord constant
	int wdLine = 5;			//MsWord constant
	int wdExtend = 1;		//MsWord constant
	
	//Go to line
	ClearAllArgs();
	if (!WordInvoke(m_pdispWordApp, L"Selection", &varg1, DISPATCH_PROPERTYGET, 0))
		return szLine;
	ClearAllArgs();
	AddArgumentInt2(L"What", 0, wdGoToLine);
	AddArgumentInt2(L"Which", 0, wdGoToAbsolute);
	AddArgumentInt2(L"Count", 0, nLine);
	if (!WordInvoke(varg1.pdispVal, L"GoTo", NULL, DISPATCH_METHOD, 0))
		return szLine;
	
	//Selection.HomeKey Unit:=wdLine
	ClearAllArgs();
	AddArgumentInt2(L"Unit", 0, wdLine);
	if (!WordInvoke(varg1.pdispVal, L"HomeKey", NULL, DISPATCH_METHOD, 0))
		return szLine;
	//Selection.EndKey Unit:=wdLine, Extend:=wdExtend
	ClearAllArgs();
	AddArgumentInt2(L"Unit", 0, wdLine);
	AddArgumentInt2(L"Extend", 0, wdExtend);
	if (!WordInvoke(varg1.pdispVal, L"EndKey", &varg2, DISPATCH_METHOD, 0))
		return szLine;
	ClearAllArgs();
	if (!WordInvoke(varg1.pdispVal, L"Text", &varg2, DISPATCH_PROPERTYGET, 0))
		return szLine;
 
	//Get text from varg2
	VARTYPE Type = varg2.vt;
	switch (Type) 
		{
			case VT_UI1:
				{
					unsigned char nChr = varg2.bVal;
					szLine.Format("%c", nChr);
				}
				break;
			case VT_I4:
				{
					long nVal = varg2.lVal;
					szLine.Format("%i", nVal);
				}
				break;
			case VT_R4:
				{
					float fVal = varg2.fltVal;
					szLine.Format("%f", fVal);
				}
				break;
			case VT_R8:
				{
					double dVal = varg2.dblVal;
					szLine.Format("%f", dVal);
				}
				break;
			case VT_BSTR:
				{
					BSTR b = varg2.bstrVal;
					szLine = b;
				}
				break;
			case VT_BYREF|VT_UI1:
				{
					//Not tested
					unsigned char* pChr = varg2.pbVal;
					szLine.Format("%c", *pChr);
				}
				break;
			case VT_BYREF|VT_BSTR:
				{
					//Not tested
					BSTR* pb = varg2.pbstrVal;
					szLine = *pb;
				}
				break;	//without this break the case fell through and cleared the text
			case VT_EMPTY:
				{
					//Empty
					szLine = _T("");
				}
				break;
			}
 
	
	return szLine;
 
}
/****************************************************************************************/
 
void CMSWordDemoDlg::OnGetTextFromWord() 
{
	//Use Windows file dialog to obtain FileName 
	char szFilter[] =
      "Word Files (*.doc)|*.doc|Text Files (*.txt)|*.txt|All Files (*.*)|*.*||";
 
	CFileDialog	DataRead(TRUE, // TRUE for FileOpen, FALSE for FileSaveAs
		NULL, NULL,
		OFN_PATHMUSTEXIST|OFN_FILEMUSTEXIST,	//open dialog: the file must already exist
		szFilter,
		NULL);
 
		int nFileRead = DataRead.DoModal();
		
		if(IDOK == nFileRead)
		{
			//Get file name for opening Word file
			CString szFileName = DataRead.GetPathName();
			if(szFileName.IsEmpty())
				return;
			//Do not make Word visible
			CEzWordAutomation MsWord(FALSE);	
			MsWord.OpenWordFile(szFileName);
 
			int nLineCount = MsWord.GetLineCount();
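			CString szLine, szFirstLine, szLastLine;
			for (int i = 1; i <= nLineCount; i++)
			{
				szLine = MsWord.GetLine(i);
				if (1 == i)
					szFirstLine = szLine;	//remember the first line for the summary
				szLastLine = szLine;		//holds the last line once the loop ends
				AppendToLog(szLine, RGB(0, 0, 0));
				AppendToLog("\n", RGB(0, 0x99, 0));
			}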
			MsWord.CloseDocument(FALSE);
			MsWord.ReleaseWord();
 
			CString szMessage;
			szMessage.Format("Found %i line(s) in this file. \n First and Last Lines in this file are: \n", nLineCount);
			szMessage = szMessage + szFirstLine+_T("\n ... \n") + szLastLine;
			MessageBox(szMessage);
		}
}

Rajeshm8484 asked:
chip3d commented:
Hi Rajeshm8484,

Using your GetLine function to read a Word document line by line can be quite time-consuming. You call GoTo with wdGoToAbsolute and the number of the line to read. Every time you do that, Word moves the range from the beginning of the document to the line you specified. Done once, even for a line at the end of a big document, that is fast enough for most circumstances. But doing it for every line in a big document can be very slow, because the total work has a complexity of O(n²), where n is the number of lines you read: reaching line k takes about k steps, so reading n lines takes about n(n+1)/2 steps in total. Instead you could use GoTo with wdGoToRelative to step through the document line by line, which gives a complexity that is linear instead of quadratic.
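To make the difference concrete, here is a small stand-alone sketch of that reasoning. It is only a cost model, not a measurement of Word: it assumes that an absolute GoTo to line k walks k lines from the top and that a relative GoTo to the next line walks one line. The function names AbsoluteGoToCost and RelativeGoToCost are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cost model (an assumption, not Word's documented behaviour):
// an absolute GoTo to line k walks k lines from the top of the document.
// Returns the total number of lines walked when lines 1..nLines are each
// reached with an absolute GoTo.
std::uint64_t AbsoluteGoToCost(std::uint64_t nLines)
{
	std::uint64_t total = 0;
	for (std::uint64_t k = 1; k <= nLines; ++k)
		total += k;	// reaching line k costs k steps
	// Sanity check against the closed form n * (n + 1) / 2.
	assert(total == nLines * (nLines + 1) / 2);
	return total;
}

// Same model for a relative GoTo: each call moves one line forward from
// the current position, so reading nLines lines walks nLines lines.
std::uint64_t RelativeGoToCost(std::uint64_t nLines)
{
	return nLines;
}
```

Under this model a 1000-line document costs 500500 steps with absolute positioning and 1000 steps with relative positioning, which is why the slowdown only becomes noticeable on larger documents.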
 
cup commented:
Is it slow reading the document or starting Word? In my experience, starting Word has always been the slow part (15-20 seconds). Once it has started, it just zips through. Excel is the same: it takes about 20 s on a 3 GHz machine running XP.

Also, is AppendToLog working in memory, i.e. is the log stored in memory? Memory reallocation is pretty expensive if you're reallocating large chunks in small steps.
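If repeated small appends do turn out to be a cost, one option is to collect the lines first and hand them to the log in a single call. A minimal sketch, assuming the lines are already available as std::string values; JoinLines is a hypothetical helper and not part of the posted code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: concatenate all lines into one buffer, each followed
// by '\n', so the log control can be updated with one append instead of
// two appends per line.
std::string JoinLines(const std::vector<std::string>& lines)
{
	// First pass: compute the final size so the buffer is allocated once.
	std::size_t total = 0;
	for (const std::string& line : lines)
		total += line.size() + 1;	// +1 for the newline

	std::string joined;
	joined.reserve(total);

	// Second pass: append into the reserved buffer, which does not need
	// to grow again because the final size is already known.
	for (const std::string& line : lines)
	{
		joined += line;
		joined += '\n';
	}
	assert(joined.size() == total);
	return joined;
}
```

The result could then be passed to AppendToLog once after the loop. Whether that helps here depends on how AppendToLog is implemented, which the question does not show.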
 
Rajeshm8484 (author) commented:
Reading the document from my application is what is slow.

Also, AppendToLog is a function that displays the content of the document in a RichEditControl.
 
Rajeshm8484 (author) commented:
Hi chip3d,

Thanks for your suggestion. Can you modify my GetLine function to use wdGoToRelative and attach the code snippet? I am new to Automation.
 
Rajeshm8484 (author) commented:
Thanks for your solution.
 
ILGDRM commented:
Hi,
After opening the Word document, you can get its text content with the following code:

IDispatch* pDispRange = oDocument.GetContent();
Range objRange(pDispRange);
AfxMessageBox(objRange.GetText());
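
Text fetched in one call like this can then be split locally, without further Automation calls. A minimal sketch, assuming the text has been copied into a std::string and that paragraphs are separated by '\r' (Word's paragraph mark); SplitParagraphs is a hypothetical helper. Note that this yields paragraphs, which are not the same as the layout lines that GetLine returns.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split text on '\r', the character Word uses as a
// paragraph mark. A trailing '\r' does not produce an extra empty entry.
std::vector<std::string> SplitParagraphs(const std::string& text)
{
	std::vector<std::string> paragraphs;
	std::string::size_type start = 0;
	while (start < text.size())
	{
		std::string::size_type end = text.find('\r', start);
		if (end == std::string::npos)
			end = text.size();	// last paragraph has no terminator
		assert(end >= start);
		paragraphs.push_back(text.substr(start, end - start));
		start = end + 1;	// skip past the paragraph mark
	}
	return paragraphs;
}
```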

How can I get the entire Word document content (text, images and tables) as a byte array?

Question has a verified solution.