Extract text from Word document ...

Hi,

I have a client who needs me to write a small tool that opens a Word document, searches for a specific text string, reads in some text immediately following it, and closes the document without making any changes.

All I can find are lots of examples of how to open a Word document and write data, replace strings, etc.

Can anybody help?

The OLEContainer methods are not very well documented, which makes it difficult to try anything on the fly.
JustinByrom Asked:
Pierre Cornelius Commented:
Here's a demo. It creates the document on the fly, but in your case you would probably call wa.Documents.Open to load the file.

PAS File:
=============================================================
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

const
  wdMove          = 0;
  wdExtend        = 1;

  wdCharacter     = 1;
  wdParagraph     = 4;
  wdLine          = 5;
  wdStory         = 6;

  wdFindStop      = 0;
  wdFindContinue  = 1;
  wdFindAsk       = 2;

type
  TForm1 = class(TForm)
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  end;

var
  Form1: TForm1;

implementation

uses ComObj;

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
var wa, wd: Variant;
    s: string;
begin
  wa:=CreateOleObject('Word.Application');
  wa.Visible:=True;
  wd:= wa.Documents.Add();
  wd.Select;
  wa.Selection.TypeText('blah blah findtext1=value1;'#13#10);
  wa.Selection.TypeText('blah findtext2=value2; blah'#13#10);
  wa.Selection.TypeText('blah blah findtext3=value3;'#13#10);
  wa.Selection.TypeText('blah findtext4=value4;'#13#10);

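  // Configure the search; Find operates on the current selection.
  wa.Selection.Find.ClearFormatting;
  wa.Selection.Find.Text:= 'findtext2=';
  wa.Selection.Find.Replacement.Text:= '';
  wa.Selection.Find.Forward:= True;
  wa.Selection.Find.Wrap:= wdFindContinue;
  wa.Selection.Find.Format:= False;
  wa.Selection.Find.MatchCase:= False;
  wa.Selection.Find.MatchWholeWord:= False;
  wa.Selection.Find.MatchWildcards:= False;
  wa.Selection.Find.MatchSoundsLike:= False;
  wa.Selection.Find.MatchAllWordForms:= False;
  // Execute returns True on a match and selects the matched text.
  if wa.Selection.Find.Execute then
  begin
    // Collapse to the end of the match (0 = wdCollapseEnd) so that only
    // the text following the search string is read, then extend the
    // selection over the next 8 characters.
    wa.Selection.Collapse(0);
    wa.Selection.MoveRight(wdCharacter, 8, Extend:=wdExtend);
  end;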

  s:= wa.Selection.range.text;
  ShowMessage(s);
end;

end.


DFM File:
=================================================================
object Form1: TForm1
  Left = 192
  Top = 114
  Width = 696
  Height = 480
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  OldCreateOrder = False
  PixelsPerInch = 96
  TextHeight = 13
  object Button1: TButton
    Left = 32
    Top = 32
    Width = 75
    Height = 25
    Caption = 'Button1'
    TabOrder = 0
    OnClick = Button1Click
  end
end


Regards
Pierre
 
JustinByrom (Author) Commented:
Atul,

I've looked at this one, but it concentrates on tables. I have a plain text document and need to find 'sampletext', then read a number of characters after it.

Justin
 
JustinByrom (Author) Commented:
Further to the initial post, I am also having problems including the Word_TLB.pas file.

It falls over at the 'FOnXMLBeforeDelete' call within the InvokeEvent (...) procedure.

I think it has something to do with the fact that I'm compiling under Delphi 7, but have Delphi 8.NET installed (which was installed after D7).  I am looking into this currently.
 
atul_parmar Commented:
Hi

Use the following code. It does not require Word_TLB.pas to be included.

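// Requires ComObj in the uses clause (for CreateOleObject).
var
  WordApp : Variant;
  TextToFind : String;
  SelStart, SelEnd : Integer;
begin
  WordApp := CreateOleObject('Word.Application');
  WordApp.Visible := True;
  WordApp.Documents.Open('c:\Test.doc');
  TextToFind := 'Test';
  // Search first: a successful Execute selects the matched text.
  if WordApp.Selection.Find.Execute(TextToFind) then
  begin
    // Collapse to the end of the match (0 = wdCollapseEnd) so that
    // Selection.Start is the position just after the found text.
    WordApp.Selection.Collapse(0);
    SelStart := WordApp.Selection.Start;
    SelEnd := SelStart + 5; // number of characters following the found text
    WordApp.Selection.SetRange(SelStart, SelEnd);
    WordApp.Selection.Copy;
    Memo1.Lines.Clear;
    Memo1.PasteFromClipboard;
  end;
  // Close without saving (0 = wdDoNotSaveChanges), then shut Word down.
  WordApp.Documents.Item(1).Close(0);
  WordApp.Quit;
end;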
 
JustinByrom (Author) Commented:
Atul,

Thanks for your reply, but Pierre was first and I have used his solution (and modified it to suit) successfully. I think it is only fair to award him all the points, although your solution is more concise.

Thanks very much,

Justin
 
atul_parmar Commented:
That's fine. :)
Question has a verified solution.
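
For later readers, here is a minimal sketch that combines the two answers above for the original use case: open an existing document, find a string, read the characters right after it, and close without changes. It is not code from the thread. The unit name WordTextReader and the function ReadTextAfter are hypothetical, it searches a Range instead of the Selection so the user's cursor is not involved, and it assumes Word is installed and FileName is a full path.

```pascal
unit WordTextReader;

interface

// Returns up to CharCount characters that follow the first occurrence of
// SearchText in the document, or an empty string if there is no match.
function ReadTextAfter(const FileName, SearchText: string;
  CharCount: Integer): string;

implementation

uses
  ComObj, Variants;

const
  // Values from the Word object model, declared here so that
  // Word_TLB.pas is not needed.
  wdCharacter        = 1;
  wdCollapseEnd      = 0;
  wdDoNotSaveChanges = 0;

function ReadTextAfter(const FileName, SearchText: string;
  CharCount: Integer): string;
var
  WordApp, Doc, Rng: Variant;
begin
  Result := '';
  WordApp := CreateOleObject('Word.Application');
  try
    WordApp.Visible := False;
    // Documents.Open(FileName, ConfirmConversions, ReadOnly)
    Doc := WordApp.Documents.Open(FileName, False, True);
    // Search the whole document body.
    Rng := Doc.Content;
    // On success the range is redefined to cover the matched text.
    if Rng.Find.Execute(SearchText) then
    begin
      // Move to the end of the match, then extend over the next characters.
      Rng.Collapse(wdCollapseEnd);
      Rng.MoveEnd(wdCharacter, CharCount);
      Result := Rng.Text;
    end;
    Doc.Close(wdDoNotSaveChanges);
  finally
    // Always shut down the Word instance that was started above.
    WordApp.Quit(wdDoNotSaveChanges);
    WordApp := Unassigned;
  end;
end;

end.
```

A call such as ReadTextAfter('c:\Test.doc', 'findtext2=', 8) would return the characters following the first match, or an empty string when nothing is found.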