Solved

Reading an MS Word document through Automation is very slow

Posted on 2008-06-15
Last Modified: 2013-11-20
I am trying to read the text of an MS Word document through Automation, but it is very slow. How can I speed up the reading? I have attached the code. In it, OnGetTextFromWord() is the event handler that is invoked whenever the user presses the GetText button in my app.

MsWord.GetLine(i) is the call that fetches the text from Word line by line; each line is then displayed in a rich edit control using AppendToLog(szFirstLine, RGB(0, 0, 0)).



CString CWordAutomation::GetLine(int nLine)
{
	CString szLine = _T("");
	if(NULL  == m_pdispWordApp)
		return szLine;
 
	VARIANTARG varg1, varg2;
	int wdGoToLine = 3;		//MsWord constant
	int wdGoToAbsolute = 1;	//MsWord constant
	int wdLine = 5;			//MsWord constant
	int wdExtend = 1;		//MsWord constant
	
	//Got to line
	ClearAllArgs();
	if (!WordInvoke(m_pdispWordApp, L"Selection", &varg1, DISPATCH_PROPERTYGET, 0))
		return szLine;
	ClearAllArgs();
	AddArgumentInt2(L"What", 0, wdGoToLine);
	AddArgumentInt2(L"Which", 0, wdGoToAbsolute);
	AddArgumentInt2(L"Count", 0, nLine);
	if (!WordInvoke(varg1.pdispVal, L"GoTo", NULL, DISPATCH_METHOD, 0))
		return szLine;
	
	//Selection.HomeKey Unit:=wdLine
	ClearAllArgs();
	AddArgumentInt2(L"Unit", 0, wdLine);
	if (!WordInvoke(varg1.pdispVal, L"HomeKey", NULL, DISPATCH_METHOD, 0))
		return szLine;
	//Selection.EndKey Unit:=wdLine, Extend:=wdExtend
	ClearAllArgs();
	AddArgumentInt2(L"Unit", 0, wdLine);
	AddArgumentInt2(L"Extend", 0, wdExtend);
	if (!WordInvoke(varg1.pdispVal, L"EndKey", &varg2, DISPATCH_METHOD, 0))
		return szLine;
	ClearAllArgs();
	if (!WordInvoke(varg1.pdispVal, L"Text", &varg2, DISPATCH_PROPERTYGET, 0))
		return szLine;
 
	//Get text from varg2
	VARTYPE Type = varg2.vt;
	switch (Type) 
		{
			case VT_UI1:
				{
					unsigned char nChr = varg2.bVal;
					szLine.Format("%c", nChr);
				}
				break;
			case VT_I4:
				{
					long nVal = varg2.lVal;
					szLine.Format("%i", nVal);
				}
				break;
			case VT_R4:
				{
					float fVal = varg2.fltVal;
					szLine.Format("%f", fVal);
				}
				break;
			case VT_R8:
				{
					double dVal = varg2.dblVal;
					szLine.Format("%f", dVal);
				}
				break;
			case VT_BSTR:
				{
					BSTR b = varg2.bstrVal;
					szLine = b;
				}
				break;
			case VT_BYREF|VT_UI1:
				{
					//Not tested
					unsigned char* pChr = varg2.pbVal;
					szLine.Format("%c", *pChr);
				}
				break;
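			case VT_BYREF|VT_BSTR:
				{
					//Not tested
					BSTR* pb = varg2.pbstrVal;
					szLine = *pb;
				}
				break;	//was missing: control fell through and cleared szLine
			case VT_EMPTY:
			default:
				{
					//Empty or unsupported type
					szLine = _T("");
				}
				break;
			}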
 
	
	return szLine;
 
}
/****************************************************************************************/
 
void CMSWordDemoDlg::OnGetTextFromWord() 
{
	//Use Windows file dialog to obtain FileName 
	char szFilter[] =
      "Word Files (*.doc)|*.doc|Text Files (*.txt)|*.txt|All Files (*.*)|*.*||";
 
	CFileDialog	DataRead(TRUE, // TRUE for FileOpen, FALSE for FileSaveAs
		NULL, NULL,
		OFN_PATHMUSTEXIST|OFN_OVERWRITEPROMPT,
		szFilter,
		NULL);
 
		int nFileRead = DataRead.DoModal();
		
		if(IDOK == nFileRead)
		{
			//Get the path of the Word file to open
			CString szFileName = DataRead.GetPathName();
			if(szFileName.IsEmpty())
				return;
			//Do not make Word visible
			CEzWordAutomation MsWord(FALSE);	
			MsWord.OpenWordFile(szFileName);
 
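			int nLineCount = MsWord.GetLineCount();
			CString szLine, szFirstLine, szLastLine;
			for (int i = 1; i <= nLineCount; i++)
			{
				szLine = MsWord.GetLine(i);
				if (1 == i)
					szFirstLine = szLine;	//keep the first line for the summary message
				szLastLine = szLine;		//holds the last line once the loop ends
				AppendToLog(szLine, RGB(0, 0, 0));
				AppendToLog("\n", RGB(0, 0x99, 0));
			}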
			MsWord.CloseDocument(FALSE);
			MsWord.ReleaseWord();
 
			CString szMessage;
			szMessage.Format("Found %i line(s) in this file. \n First and Last Lines in this file are: \n", nLineCount);
			szMessage = szMessage + szFirstLine+_T("\n ... \n") + szLastLine;
			MessageBox(szMessage);
		}
}

Question by:Rajeshm8484
8 Comments
 
LVL 11

Expert Comment

by:cup
ID: 21788719
Is it slow reading the document or starting Word? From past experience, starting Word has always been the slow part (15-20 seconds). Once it has started, it just zips through. Excel is the same: it takes about 20 s on a 3 GHz machine running XP.

Also, is AppendToLog working in memory, i.e. is the log stored in memory? Memory reallocation is pretty expensive if you're reallocating large chunks in small steps.
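
If the per-line appends do turn out to be part of the cost, one option is to collect the lines in a buffer and hand them to the control in a single call. The sketch below is only an illustration of that idea: BatchedLog and its sink callback are hypothetical names, not part of the posted code, and the sink stands in for whatever AppendToLog does with the rich edit control.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: gathers lines in memory and passes them to the
// display code in one call instead of one call per line.
class BatchedLog {
public:
    explicit BatchedLog(std::function<void(const std::string&)> sink,
                        std::size_t expectedBytes = 0)
        : sink_(std::move(sink)) {
        buffer_.reserve(expectedBytes);  // reserve up front to limit reallocations
    }

    // Adds one line to the in-memory buffer; nothing is displayed yet.
    void Append(const std::string& line) {
        buffer_ += line;
        buffer_ += '\n';
    }

    // Sends everything collected so far to the sink in a single call.
    void Flush() {
        if (!buffer_.empty()) {
            sink_(buffer_);
            buffer_.clear();
        }
    }

private:
    std::function<void(const std::string&)> sink_;
    std::string buffer_;
};
```

In the posted loop this would mean calling Append for each line and Flush once after the loop, with the sink forwarding the text to AppendToLog.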
 

Author Comment

by:Rajeshm8484
ID: 21790844
Reading the document from my application is the slow part.

AppendToLog is a function that displays the content of the document in a rich edit control.
 
LVL 4

Accepted Solution

by:
chip3d earned 1500 total points
ID: 21791717
Hi Rajeshm8484,

Using your GetLine function to read a Word document line by line can be quite time consuming. You call GoTo with wdGoToAbsolute and the number of the line to read. Every time you do that, Word moves the range from the beginning of the document to the line you specified. Done once, even for a line at the end of a big document, that is fast enough for most purposes. Done for every line of a big document, it becomes very slow, because the total work has a complexity of O(n²), where n is the number of lines you read. Instead, you could use GoTo with wdGoToRelative to step through the document line by line, which gives a complexity that is linear rather than quadratic.
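
The difference can be seen with a small self-contained model. This is a sketch, not Word automation code: MockDocument is a hypothetical stand-in that charges one step per line travelled, which is the cost model described in this answer rather than a measured property of Word. In the real GetLine, the corresponding change would be to position on line 1 once and then pass wdGoToRelative with a Count of 1 as the "Which" argument (the constant should be 2, but check it against Word's WdGoToDirection documentation before hard-coding it).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Word document. Each cursor move is charged one
// "step" per line travelled, following the cost model described above.
class MockDocument {
public:
    explicit MockDocument(std::vector<std::string> lines)
        : lines_(std::move(lines)) {}

    std::size_t LineCount() const { return lines_.size(); }
    std::size_t Steps() const { return steps_; }

    // Models GoTo(What:=wdGoToLine, Which:=wdGoToAbsolute, Count:=n):
    // the cursor restarts at the top and travels down to line n.
    void GoToAbsolute(std::size_t n) {
        assert(n >= 1 && n <= lines_.size());
        steps_ += n;
        cursor_ = n;
    }

    // Models GoTo(What:=wdGoToLine, Which:=wdGoToRelative, Count:=count):
    // the cursor moves forward from where it already is.
    void GoToRelative(std::size_t count) {
        assert(cursor_ + count <= lines_.size());
        steps_ += count;
        cursor_ += count;
    }

    const std::string& CurrentLine() const { return lines_[cursor_ - 1]; }

private:
    std::vector<std::string> lines_;
    std::size_t cursor_ = 1;  // 1-based, like Word's line numbers
    std::size_t steps_ = 0;   // total lines travelled so far
};

// The pattern from the question: one absolute jump per line.
std::vector<std::string> ReadAllAbsolute(MockDocument& doc) {
    std::vector<std::string> out;
    for (std::size_t i = 1; i <= doc.LineCount(); ++i) {
        doc.GoToAbsolute(i);
        out.push_back(doc.CurrentLine());
    }
    return out;
}

// The suggested pattern: jump to line 1 once, then advance one line at a time.
std::vector<std::string> ReadAllRelative(MockDocument& doc) {
    std::vector<std::string> out;
    if (doc.LineCount() == 0) return out;
    doc.GoToAbsolute(1);
    out.push_back(doc.CurrentLine());
    for (std::size_t i = 2; i <= doc.LineCount(); ++i) {
        doc.GoToRelative(1);
        out.push_back(doc.CurrentLine());
    }
    return out;
}
```

Under this model, reading a 1,000-line document costs 500,500 steps with absolute jumps and 1,000 steps with relative ones, while both functions return the same lines.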

Author Comment

by:Rajeshm8484
ID: 21793427
Hi chip3d,

Thanks for your suggestion. Can you modify my GetLine function to use wdGoToRelative and attach the code snippet? I am new to Automation.
 

Author Closing Comment

by:Rajeshm8484
ID: 31467307
Thanks for your solution.
 

Expert Comment

by:ILGDRM
ID: 22648632
Hi,
After opening the Word document, you can get its text content with the following code.

IDispatch* pDispRange = oDocument.GetContent();
Range objRange(pDispRange);
AfxMessageBox(objRange.GetText());
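
Reading the whole text in one call like this avoids the per-line round trips to Word, but the result is a single string. A minimal sketch of breaking it up afterwards follows; it assumes the text uses a carriage return ('\r') as the paragraph mark, which is how Word normally returns Range text, and SplitParagraphs is a hypothetical helper name. Note that this yields paragraphs, not the displayed lines that GetLine returns.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits text on '\r', assumed here to be the
// paragraph mark in the string returned by the range's text property.
std::vector<std::wstring> SplitParagraphs(const std::wstring& text) {
    std::vector<std::wstring> out;
    std::wstring::size_type start = 0;
    while (start < text.size()) {
        std::wstring::size_type end = text.find(L'\r', start);
        if (end == std::wstring::npos)
            end = text.size();  // last paragraph has no trailing mark
        out.push_back(text.substr(start, end - start));
        start = end + 1;        // skip past the paragraph mark
    }
    return out;
}
```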

How can I get the entire Word document content (text, images and tables) as a byte array?


Question has a verified solution.
