Link to home
Start Free TrialLog in
Avatar of nickjfox
nickjfox

asked on

Delphi - string manipulation

Using TMemo I need to look for a phrase and delete the phrase plus two words after the phrase.

So for instance if the phrase is The quick brown fox, the string
"The quick brown fox jumps over the lazy dog"  becomes "the lazy dog"
ASKER CERTIFIED SOLUTION
Avatar of Geert G
Geert G
Flag of Belgium image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of nickjfox
nickjfox

ASKER

OK, thanks. Line10 is Results:=PosEx(  rather than Pos( ?

Also, sorry I've done hardly any work with strings before, but once I have found the position
how do I then delete the word in that position?

Thanks for your patience.



I thought I found the solution here http://www.delphicorner.f9.co.uk/articles/misc9.htm.

This adds each word of the TMemo line to a string list. I then need to find the index numbers of the words I am looking for and then delete them together with Index numbers +1 and +2.

However the code I used below seems not to populate the string list?
procedure fillstringlist;
var memoline: TStringList; fulltext, wrd: String;
begin
memoline:=TStringList.Create;
while AnsiPos(' ',fulltext)>0 do begin
///while there are spaces continue
wrd:=copy(fulltext,1,Ansipos(' ',fulltext)-1);
///get first word 
memoline.Add(Wrd);
///add it to the TString list
fulltext:=Copy(fulltext,AnsiPos(' ',fulltext)+1,Length(fulltext)-Length(Wrd)+1); 
//take the first word out of the main text
 end;
 if Length(fulltext)>0 then memoline.Add(fulltext);
///check to see nothing is left
end;

Open in new window

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@Aikimark

I get error messages from your function

line1: default value required for FoundEnd
line12:  too many actual parameters

BTW, there are always two words after the phrase. In real life the phrase is always 'passed by' and there  always follows two names.

I need to delete: "passed by XXXXX YYYYY" from the memo line.
oops.  I wrongly placed my two parameters after the optional Startat parameter.  Try this.
function FindPhrasePlusTwoWords(aText, aPhrase: string; FoundStart, FoundEnd: integer; StartAt: integer = 1): Boolean;
var p, n: integer;
begin
  Result := False;
  p := PosEx(aPhrase, aText, StartAt);
  if p > 0 then  
  begin
    FoundStart := p;
    n := PosEx(' ', aText, p + Length(aPhrase));
    if n > 0 then 
      begin
        FoundEnd := Pos(' ', aText, n+1); 
        Result := True;
      end;
    else
     FoundEnd := p + Length(aPhrase);
  end;
end; 

Open in new window

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
damn it, everybody's extending my code ... looool
nice work guys, i thought everybody would be out like me enjoying the nice weather
i guess some of you didn't get lucky with the weather
c'mon, Geert.  Look at the top of the Delphi leader board.  You won't see me up there.  If I'm going to offer a Delphi solution, I'm probably going to start with some trustworthy source, such as you. :-)

No luck on the weather either.  Rainy today.
OK everyone, I'm sure it's me to blame - but I couldn't get any of the above to work.

Geert - I get an error message too many parameters line 10, ditto Akimark's revised version. Jonas I get a string of nothing.

The code below splits the relevant memo1 line into a string list, from where I can get the string list index of the relevant words in the phrase.

Now, I just have to figure out how to delete the string list indexed at the 4 points and return the result back into a string, which might well be tomorrow's question.

Anyway, thanks for the help. I'll split the points.




Procedure Deletion;
var  fullstring, wrd: String; ILine: Integer; memoline: TStringList;
begin
for iLine := Form1.Memo1.Lines.Count-1 downto 0 do begin
   fullstring:=TrimLeft(Form1.Memo1.Lines[iLine]);
  if AnsiContainsStr(fullstring,'passed by') then begin
   memoline:=TStringList.Create;
    while (Pos(' ',fullstring)>0) do begin
   wrd:=copy(fullstring,1,Pos(' ',fullstring)-1);
   Memoline.Add(Wrd);
   fullstring:=Copy(fullstring,Pos(' ',fullstring)+1,Length(fullstring)-Length(Wrd)+1);
end;

Open in new window

hey guys, just joking
for the asker if you post a remark, don't close it immediately, with a B
i'm on holiday, so i'm slow in answering ...
now lets see if we can earn an A grade

wow, looks like you'r in bad need of some delphi programming advise ...
fwiw, TStringlist.Create and no free ? = memory leak,

so you want to know what words after your text ?
let me code a few minutes i'll get back to you


i would, this is an interesting Q ...

here is where i am at the moment.
I created some helper functions to find positions in the string
function PhraseStart(aPhrase, aHaystack: string; aStartAt: Integer = 1): integer;
begin
  Result := PosEx(UpperCase(aPhrase), UpperCase(aHaystack), aStartAt);
end;
 
function PhraseEnd(aPhrase: string; aStartAt: Integer = 1): integer;
begin
  if aStartAt < 1 then
    aStartAt := 1;
  Result := aStartAt + Length(aPhrase) -1;
end;
 
function NextWordEnd(aText: string; aStartAt: integer = 1): integer;
var n, l: Integer;
  IsLetter: Boolean;
begin
  Result := 0;
  if aStartAt < 1 then
    aStartAt := 1;
  n := 0;
  l := Length(aText);
  repeat
    IsLetter := False;
    if n + aStartAt < l then
    begin
      if Pos(aText[n + aStartAt], WordLetters) > 0 then
        IsLetter := True;
    end;
    if IsLetter then
      n := n + 1;
  until not IsLetter;
end;
 
function NextWordBegin(aText: string; aStartAt: integer = 1): integer;
var n, l: Integer;
  IsLetter, EndOfLine: Boolean;
begin
  Result := 0;
  if aStartAt < 1 then
    aStartAt := 1;
  n := 0;
  l := Length(aText);
  EndOfLine := False;
  repeat
    IsLetter := False;
    if n + aStartAt < l then
    begin
      if Pos(aText[n + aStartAt], WordLetters) > 0 then
        IsLetter := True;
    end else EndOfLine := True;
    if not IsLetter then
      n := n + 1;
  until IsLetter or EndOfLine;
end;

Open in new window

here is code (in D2009)

unit Unit1;
 
interface
 
uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;
 
type
  TForm1 = class(TForm)
    btnSearch: TButton;
    memLines: TMemo;
    edSearch: TEdit;
    memTests: TMemo;
    edStartAt: TEdit;
    procedure btnSearchClick(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;
 
var
  Form1: TForm1;
 
implementation
 
uses StrUtils;
 
{$R *.dfm}
 
const WordLetters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZéèçàùâêûîôäëüïöÿáúíóàèùìò';
 
function PhraseStart(aPhrase, aHaystack: string; aStartAt: Integer = 1): integer;
begin
  Result := PosEx(aPhrase, aHaystack, aStartAt);
end;
 
function PhraseEnd(aPhrase: string; aStartAt: Integer = 1): integer;
begin
  if aStartAt < 1 then
    aStartAt := 1;
  Result := aStartAt + Length(aPhrase) -1;
end;
 
function NextWordEnd(aText: string; aStartAt: integer = 1): integer;
var n, l: Integer;
  IsLetter: Boolean;
begin
  Result := 0;
  if aStartAt < 1 then
    aStartAt := 1;
  n := 0;
  l := Length(aText);
  repeat
    IsLetter := False;
    if n + aStartAt < l then
    begin
      if Pos(aText[n + aStartAt], WordLetters) > 0 then
        IsLetter := True;
    end;
    if IsLetter then
      n := n + 1;
  until not IsLetter;
  if n > 0 then
    Result := aStartAt + n;
end;
 
function NextWordBegin(aText: string; aStartAt: integer = 1): integer;
var n, l: Integer;
  IsLetter, EndOfLine: Boolean;
begin
  Result := 0;
  if aStartAt < 1 then
    aStartAt := 1;
  n := 0;
  l := Length(aText);
  EndOfLine := False;
  repeat
    IsLetter := False;
    if n + aStartAt < l then
    begin
      if Pos(aText[n + aStartAt], WordLetters) > 0 then
        IsLetter := True;
    end else EndOfLine := True;
    if not IsLetter then
      n := n + 1;
  until IsLetter or EndOfLine;
  if EndOfLine then
    Result := -1
  else if n > 0 then
    Result := aStartAt + n;
end;
 
procedure TForm1.btnSearchClick(Sender: TObject);
var searchText, memText, origText: string;
  sa: integer; // start at
  ps: integer; // start of phrase
  pe: integer; // end of phrase (last letter pos)
  wb, wb2: integer; // beginning of first and second word
  we, we2: integer; // end of first and second word
begin
  searchText := UpperCase(Trim(edSearch.Text)); // Make all uppercase (has no bearings on positions)
  origText := memLines.Lines.Text;
  memText := UpperCase(origText);
  sa := StrToInt(edStartAt.Text);
  if searchText <> '' then
  begin
    ps := PhraseStart(searchText, memText, sa);
    pe := PhraseEnd(searchText, ps);
    // now we need to find the beginning of the next word
    // it's possibly on the next line
    wb := NextWordBegin(memText, pe+1);
    // if we found a word, let's continue
    if wb > 0 then
    begin
      we := NextWordEnd(memText, wb);
      if we > 0 then
      begin
        wb2 := NextWordBegin(memText, we);
        if wb2 > 0 then
        begin
          we2 := NextWordEnd(memText, wb2);
          if we2 > 0 then
          begin
            // now we have all needed positions ...
            // Extract all items to other memo and delete in original memo
 
            memTests.Lines.Add('phrase: "' + Copy(origText, ps, pe-ps+1) + '"');
            memTests.Lines.Add('word 1 after phrase : "' + Copy(origText, wb, we-wb) + '"');
            memTests.Lines.Add('word 2 after phrase : "' + Copy(origText, wb2, we2-wb2) + '"');
            memTests.Lines.Add('phrase + 2 words : "' + Copy(origText, ps, we2-ps) + '"');
            memTests.Lines.Add('------ modified text below ------');
            Delete(origText, ps, we2-ps);
            memTests.Lines.Add(origText);
            memTests.Lines.Add('------ modified text above ------');
            edStartAt.Text := IntToStr(we2);
 
            // you would only need the following code
            // Delete(origText, ps, we2-ps);
            // memTests.Lines.Add(origText);
          end;
        end;
      end;
    end;
  end else
    ShowMessage('Look for an empty string in the wardrobe ... !');
end;
 
end.
 
 
-- dfm --
object Form1: TForm1
  Left = 0
  Top = 0
  Caption = 'Form1'
  ClientHeight = 673
  ClientWidth = 726
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'Tahoma'
  Font.Style = []
  OldCreateOrder = False
  PixelsPerInch = 96
  TextHeight = 13
  object btnSearch: TButton
    Left = 24
    Top = 28
    Width = 75
    Height = 25
    Caption = 'btnSearch'
    TabOrder = 0
    OnClick = btnSearchClick
  end
  object memLines: TMemo
    Left = 112
    Top = 58
    Width = 585
    Height = 137
    Lines.Strings = (
      
        'Using TMemo I need to look for a phrase and delete the phrase pl' +
        'us two words after the phrase.'
      ''
      'So for instance if the phrase is The quick brown fox, the string'
      
        '"The quick brown fox jumps over the lazy dog"  becomes "the lazy' +
        ' dog"')
    TabOrder = 1
  end
  object edSearch: TEdit
    Left = 112
    Top = 32
    Width = 241
    Height = 21
    TabOrder = 2
    Text = 'The quick brown fox'
  end
  object memTests: TMemo
    Left = 112
    Top = 201
    Width = 585
    Height = 456
    TabOrder = 3
  end
  object edStartAt: TEdit
    Left = 576
    Top = 31
    Width = 121
    Height = 21
    Alignment = taRightJustify
    NumbersOnly = True
    TabOrder = 4
    Text = '1'
  end
end

Open in new window

it will need to be searched 2 times in this example
the edit in the top right get's adjusted after every search
this coding is not built for speed, i does the job,
russel libby is a lot better with fast string manipulations ...
@Geert

* Why look at letters when you could just look for the next space and back up one?

* If I were seeing this for the first time, I might ask if there shouldn't be a parameter that specifies the number of nextwords to find rather than executing the NextWordBegin twice.
looking at the sample code from the asker, i'm guessing he is new to delphi
that's why this way

finding the next word is not allways easy, what about a comma,
this code has no problem with 3 or 4 spaces in between words
(it does have other problems)
@Geert

That is another question for nickjfox...are we including other punctuation as delimiters or just spaces?

I think there should probably be a mixture of Pos() and single character inspection.  When looking for the start of the next word, use a single character inspection for not-a-space.

Some delimiter rules can get VERY tricky -- when is an apostrophe used as a contraction versus a quote?  Things get even trickier when we consider multiple language support.

Another unknown is the cleanliness of the text.  Is it messy?  Are sentences well formed and punctuated properly?
all that can only be solved with a very good parser ...
but i don't think this Q will see one posted here (at least not for points)
speaking of parsers...what about a TRegExp solution?  If you could get the pattern right it would be the minimum amount of Delphi code.  However, it wouldn't perform very well and would suffer the same flexibility problems and messiness issues we've been discussing.
I'm sorry, I didn't mean to offend anyone by giving a B but it was hard to give an A when I couldn't get the solutions to work. The reason I gave the points was because I did find a solution, final code below.
It's probably rubbish but it does do the job. Anyway, thanks again.
Procedure DeletePhrasePlusTwo;
var  oldtext, wrd, newtext: String; ILine, i, phraseposition: Integer; memoline: TStringList;
begin
for iLine := Form1.Memo1.Lines.Count-1 downto 0 do begin
   oldtext:=TrimLeft(Form1.Memo1.Lines[iLine]);
   ///look for phrase//
  if AnsiContainsStr(oldtext,'passed by') then begin  
   memoline:=TStringList.Create;
   ///if found, parse text to Stringlist
    while (Pos(' ',oldtext)>0) do begin     
   wrd:=copy(oldtext,1,Pos(' ',oldtext)-1);
   Memoline.Add(Wrd);
   oldtext:=Copy(oldtext,Pos(' ',oldtext)+1,Length(oldtext)-Length(Wrd)+1);
end;
 
   if Length(oldtext)>0 then memoline.Add(oldtext); //check text is //finished
 
  phraseposition:=Memoline.IndexOf('passed');   ///get position of //'passed'
  if Memoline[phraseposition+1]='by' then begin  ///check that ///'passed' is followed by 'by'
 
 for i:=  0 to 3 do
   Memoline.Delete(phraseposition);  ////delete 'passed', 'by' and two //words that follow
 
 for I := 0 to Memoline.Count - 1 do  ///get list back to string 
  newtext := newtext +' '+ memoline[i]; 
 
   memoline.Free;
 
  end;
 end;
  end;
end;
 
 
end.

Open in new window

no offence taken, but let me point out, the code has a memory leak ...
i just redid the indentation and found the TStringList.Create on indent = 3 and free on indent = 4

basically you only free the stringlist when you find the word 'by'

are you a beginner with Delphi ?


Procedure DeletePhrasePlusTwo;
var  oldtext, wrd, newtext: String; ILine, i, phraseposition: Integer; memoline: TStringList;
begin
  // indent = 1
  for iLine := Form1.Memo1.Lines.Count-1 downto 0 do
  begin
    // indent = 2
    oldtext:=TrimLeft(Form1.Memo1.Lines[iLine]);
    ///look for phrase//
    if AnsiContainsStr(oldtext,'passed by') then
    begin
      // indent = 3
      memoline:=TStringList.Create;
      ///if found, parse text to Stringlist
      while (Pos(' ',oldtext)>0) do
      begin
        wrd:=copy(oldtext,1,Pos(' ',oldtext)-1);
        Memoline.Add(Wrd);
        oldtext:=Copy(oldtext,Pos(' ',oldtext)+1,Length(oldtext)-Length(Wrd)+1);
      end;
 
      if Length(oldtext)>0 then
        memoline.Add(oldtext); //check text is //finished
 
      phraseposition:=Memoline.IndexOf('passed');   ///get position of //'passed'
 
      if Memoline[phraseposition+1]='by' then
      begin  ///check that ///'passed' is followed by 'by'
        // indent = 4
        for i:=  0 to 3 do
          Memoline.Delete(phraseposition);  ////delete 'passed', 'by' and two //words that follow
 
        for I := 0 to Memoline.Count - 1 do  ///get list back to string
          newtext := newtext +' '+ memoline[i];
 
        memoline.Free;
      end;
    end;
  end;
end;

Open in new window

@ Geert, thanks, I'll take your advice re the memory leak
@ aikimark, I awarded the points yesterday
@nickjfox

I know, but I unaccepted your selections thinking that Geert could supply a satisfactory answer.  I didn't know that you had found your solution separately.
your solution is not very flexible ...
let me explain :
it only works on Form1
it's only for Ansi (D2009 is unicode)
your cleanup is not error proof (you don't use try finally)
it only works for the phrase 'passed by'
it only works if words are separated by spaces (EOL and LF) ?

@Geert

I don't want marks for my code. How many times do I have to say my code is crap and I don't know what I am doing.

I am a beginner. Although you keep asking whether I am a beginner, you can get the clue from the top where it says:

"Author: nickjfox, Limited Member
Beginner on this subject. "

I've spent more time on this site looking for a solution than doing my own solution myself
And I am a beginner as it says at the top of the page
@nickjfox

I appologize for this poor experience.  Please reconsider.  

Your earlier experience (beginner status) comment highlights one of the shortcomings with the EE member profiles that we are currently working to fix.  Most subject matter experts at EE don't think that it is reliable and usually ignore it.  In this case, we should have explicitly asked you if you needed help with the code that we posted.  

This non-reliance on questioner experience can cut both ways, so I usually assume that the questioner has some experience.  I've angered some questioners by asking them if they were beginners, when they were just non-native English speakers.  It's a tough call.
@ModernMatt

[quote]I  I am sure if you had posted the original code you used to resolve this matter when you closed the question yesterday, everything would have made complete sense and this would have been less of an issue.[/quote]

Do you people actually see threads written by the asker?

Read  24781159



Have I been banned?
@nickjfox

I hope not.  You really didn't have to award ANY points to ANY expert in this thread.  If it will make you feel any better, we can do this for you.  You were working independently towards a solution and you found yours first.  I really didn't expect any points when I reopened the question, since I thought that Geert was already close to posting the best solution.

As Geert commented, this was an interesting problem to be solved.  I only reopened the question because I thought you weren't satisfied.  This action might have been the opening action to this 'can of worms'.  If so, I'm very sorry to you, the other experts participating in this thread, and to the mods/admins who have been brought into this.