[Last Call] Learn about multicloud storage options and how to improve your company's cloud strategy. Register Now

x
?
Solved

Extracting text from a Word doc, but every line ends with CR/CR

Posted on 2015-02-21
4
Medium Priority
?
178 Views
Last Modified: 2015-02-24
Hi, I'm using this code to extract text from a Word .doc file:

function ExtractTextFromWordFile(const FileName:string):string;
var
  WordApp    : Variant;
  CharsCount : integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    WordApp.Documents.open(FileName);
    CharsCount:=Wordapp.Documents.item(1).Characters.Count;//get the number of chars to select
    Result:=WordApp.Documents.item(1).Range(0, CharsCount).Text;//Select the text and retrieve the selection
    WordApp.documents.item(1).Close;
  finally
   WordApp.Quit;
  end;
end;

Open in new window


It works good except for one thing - every line of text it returns is terminated by a CR/CR (ie. #13#13), instead of a CR/LF (ie. #13#10). Is there a way to have the lines of the extracted text terminated my CR/LF?

Thanks!
    Shawn
0
Comment
Question by:shawn857
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 24

Expert Comment

by:jimyX
ID: 40623914
Seems like when copying text by range, it loses the CR&LF.
Better let's use Clipboard:

uses ClipBrd, ComObj;

function ExtractTextFromWordFile(const FileName:string):string;
var
  WordApp    : Variant;
  CharsCount : integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    WordApp.Documents.open(FileName);
    CharsCount:=Wordapp.Documents.item(1).Characters.Count; //get the number of chars to select
    WordApp.Selection.SetRange(0, CharsCount); //make the selection
    WordApp.Selection.Copy;//copy to the clipboard
    Result:= Clipboard.AsText;//get the text from the clipboard
    WordApp.documents.item(1).Close;
  finally
   WordApp.Quit;
  end;
end;

Open in new window

0
 

Author Comment

by:shawn857
ID: 40624690
Thanks Jimy, but the clipboard method runs so much slower than copying text by range. So nothing can be done in the original method to replace CRCR to CRLF?

Thanks
    Shawn
0
 
LVL 24

Accepted Solution

by:
jimyX earned 2000 total points
ID: 40625030
> "So nothing can be done in the original method to replace CRCR to CRLF?"

It is possible by using StringReplace. But sounds unsafe to replace every occurrence of #13#13. You better test it carefully.

Result:= StringReplace(CopiedText, CRCR, CRLF, [rfReplaceAll]);

function ExtractTextFromWordFile(const FileName:string):string;
var
  WordApp    : Variant;
  CharsCount : integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    WordApp.Documents.open(FileName);
    CharsCount:=Wordapp.Documents.item(1).Characters.Count;//get the number of chars to select
    Result:=WordApp.Documents.item(1).Range(0, CharsCount).Text;//Select the text and retrieve the selection
    Result:=StringReplace(Result, #13#13, #13#10, [rfReplaceAll]);
    WordApp.documents.item(1).Close;
  finally
   WordApp.Quit;
  end;
end;

Open in new window

0
 

Author Closing Comment

by:shawn857
ID: 40629301
Thanks Jimy!

Cheers
    Shawn
0

Featured Post

Hire Technology Freelancers with Gigs

Work with freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely, and get projects done right.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A lot of questions regard threads in Delphi.   One of the more specific questions is how to show progress of the thread.   Updating a progressbar from inside a thread is a mistake. A solution to this would be to send a synchronized message to the…
Creating an auto free TStringList The TStringList is a basic and frequently used object in Delphi. On many occasions, you may want to create a temporary list, process some items in the list and be done with the list. In such cases, you have to…
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses

656 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question