Solved

Extracting text from a Word doc, but every line ends with CR/CR

Posted on 2015-02-21
Last Modified: 2015-02-24
Hi, I'm using this code to extract text from a Word .doc file:

uses ComObj; // provides CreateOleObject

function ExtractTextFromWordFile(const FileName: string): string;
var
  WordApp    : Variant;
  CharsCount : Integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    WordApp.Documents.Open(FileName);
    CharsCount := WordApp.Documents.Item(1).Characters.Count; // number of characters to read
    Result := WordApp.Documents.Item(1).Range(0, CharsCount).Text; // text of the whole range
    WordApp.Documents.Item(1).Close;
  finally
    WordApp.Quit;
  end;
end;



It works well except for one thing: every line of text it returns is terminated by CR/CR (i.e. #13#13) instead of CR/LF (i.e. #13#10). Is there a way to have the lines of the extracted text terminated by CR/LF?

Thanks!
    Shawn
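
Before changing anything, it can help to look at the raw control characters in the returned string. The following is only a minimal sketch: RevealControlChars is a hypothetical helper (it is not part of Delphi's RTL), and the hard-coded sample stands in for whatever Word actually returns.

```pascal
program ShowControlChars;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: replaces every control character (code below 32)
// with its decimal code in angle brackets so it becomes visible.
function RevealControlChars(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    if Ord(S[I]) < 32 then
      Result := Result + '<' + IntToStr(Ord(S[I])) + '>'
    else
      Result := Result + S[I];
end;

begin
  // Assumed sample; in practice pass the result of ExtractTextFromWordFile.
  WriteLn(RevealControlChars('Line one'#13#13'Line two'#13));
end.
```

Running the extracted text through a helper like this shows whether every line really ends in two CRs, or whether a single CR ends each paragraph and the second one belongs to an empty paragraph.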
Question by:shawn857

Expert Comment

by:jimyX
ID: 40623914
It seems that copying the text by range loses the CR/LF pairs.
Let's use the clipboard instead:

uses ClipBrd, ComObj;

function ExtractTextFromWordFile(const FileName: string): string;
var
  WordApp    : Variant;
  CharsCount : Integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    WordApp.Documents.Open(FileName);
    CharsCount := WordApp.Documents.Item(1).Characters.Count; // number of characters to select
    WordApp.Selection.SetRange(0, CharsCount); // make the selection
    WordApp.Selection.Copy; // copy to the clipboard
    Result := Clipboard.AsText; // get the text from the clipboard
    WordApp.Documents.Item(1).Close;
  finally
    WordApp.Quit;
  end;
end;


Author Comment

by:shawn857
ID: 40624690
Thanks Jimy, but the clipboard method runs so much slower than copying text by range. So nothing can be done in the original method to replace CRCR to CRLF?

Thanks
    Shawn

Accepted Solution

by:jimyX
ID: 40625030
> "So nothing can be done in the original method to replace CRCR to CRLF?"

It is possible with StringReplace, but replacing every occurrence of #13#13 sounds unsafe, so test it carefully.

In general form (CopiedText, CRCR and CRLF are placeholders here):

Result := StringReplace(CopiedText, CRCR, CRLF, [rfReplaceAll]);

Applied to your function:

uses ComObj, SysUtils; // CreateOleObject, StringReplace

function ExtractTextFromWordFile(const FileName: string): string;
var
  WordApp    : Variant;
  CharsCount : Integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    WordApp.Documents.Open(FileName);
    CharsCount := WordApp.Documents.Item(1).Characters.Count; // number of characters to read
    Result := WordApp.Documents.Item(1).Range(0, CharsCount).Text; // text of the whole range
    Result := StringReplace(Result, #13#13, #13#10, [rfReplaceAll]); // each CR/CR pair becomes CR/LF
    WordApp.Documents.Item(1).Close;
  finally
    WordApp.Quit;
  end;
end;
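
One case worth testing: as far as I know, Word's Range.Text marks the end of each paragraph with a single #13, so a #13#13 pair is usually a paragraph end followed by an empty paragraph. If that holds for your documents, replacing pairs drops the blank lines and leaves any single #13 untouched. A possible alternative, sketched here under that assumption, is to convert every lone CR with SysUtils.AdjustLineBreaks. WordTextToCRLF is a hypothetical name, and the sample string stands in for the text returned by Word.

```pascal
program NormalizeWordText;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not from the thread): turns every lone CR or LF
// into a CR/LF pair and leaves existing CR/LF pairs as they are.
function WordTextToCRLF(const S: string): string;
begin
  Result := AdjustLineBreaks(S, tlbsCRLF);
end;

var
  Sample: string;
begin
  // Assumed sample: two paragraphs separated by an empty paragraph.
  Sample := 'First'#13#13'Second'#13;
  // Each #13 becomes #13#10, so the blank line between the paragraphs is kept.
  WriteLn(WordTextToCRLF(Sample) = 'First'#13#10#13#10'Second'#13#10);
end.
```

In the function above this would replace the StringReplace line with Result := AdjustLineBreaks(Result, tlbsCRLF);. If the document contains manual line breaks (Shift+Enter), I believe those come back as #11 rather than #13 and would need separate handling.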


Author Closing Comment

by:shawn857
ID: 40629301
Thanks Jimy!

Cheers
    Shawn