Solved

Extract all words from a richedit and insert into a stringlist

Posted on 2007-03-19
12
445 Views
Last Modified: 2010-04-05
Hello guys,

I have a text in a richedit and I would like a function where I could get all words from my richedit and insert into a stringlist. this function need to take away the simbols ",!@?/'etc.

Thanks
0
Comment
Question by:hidrau
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 5
  • 2
12 Comments
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18748573
Try something like this:

procedure Explode(myStr: String; List: TStrings);
var
  MyParser: TParser;
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  MS.Position := 0;
  MS.Write(MyStr[1], Length(MyStr));
  MS.Position := 0;
  MyParser := TParser.Create(MS);
  MyStr := MyParser.TokenString;
  while MyParser.Token <> toEOF do
  begin
    if MyParser.TokenSymbolIs(MyParser.TokenString) then
    begin
      MyStr := MyParser.TokenString;
      List.Add(MyStr);
    end;
    MyParser.NextToken;
  end;
  MyParser.Free;
  MS.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Explode(StringReplace(RichEdit1.Lines.Text, '''', '''''', [rfReplaceAll]), ListBox1.Items);
end;
0
 
LVL 1

Author Comment

by:hidrau
ID: 18748742
EddieShipman,

I am having problem with your function because I have some word with accent. Like

ação = action
sábia = knew
fábrica = factory

maybe your function doesn't work with accented words

Could you take a look at?

0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18748968
ok, hold on.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 26

Expert Comment

by:EddieShipman
ID: 18749207
Seems the NextToken function of TParser does not handle extended chars.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18750415
no way?
0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18750758
I found a free component on Torry's that seems to work OK, give it a try, here's a demo:

SQL Parser By Kei.
TQueryParserComp is a simple text parser which recognizes comments, strings, statement delimiters
and special characters.

Fully functional
Source: Included

Download: http://www.torry.net/db/access/db_sql/kqparser.zip

******************************************************************************************
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls, SQLParserComp;

type
  TForm1 = class(TForm)
    Button1: TButton;
    ListBox1: TListBox;
    RichEdit1: TRichEdit;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
    procedure QueryParserStatementDelimiter(Sender: TObject;
      SQLStatement: String);
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure Explode(myStr: String; List: TStrings);
var
  MyParser: TParser;
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  MS.Position := 0;
  MS.Write(MyStr[1], Length(MyStr));
  MS.Position := 0;
  MyParser := TParser.Create(MS);
  MyStr := MyParser.TokenString;
  //
  while MyParser.Token <> toEOF do
  begin
    if MyParser.TokenSymbolIs(MyParser.TokenString) then
    begin
      MyStr := MyParser.TokenString;
      List.Add(MyStr);
    end;
    MyParser.NextToken;
  end;
  MyParser.Free;
  MS.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
  function AddToken(var AToken: String):Boolean;
  var
    i: Integer;
  begin
    Result := False;
    if (Length(Trim(AToken)) > 0) then
    begin
      for i := Length(AToken) downto 1 do
        if (AToken[i] in [',', '.', '"', '(', ')', '@']) then
          Delete(AToken, i, 1);
        if Length(Trim(AToken)) > 0 then
          Result := True;
    end;
  end;

var
  QueryParser: TQueryParserComp;
  sToken: String;
begin
  try
    QueryParser := TQueryParserComp.Create(nil);
    QueryParser.IsEOFStmtDelimiter := False;
    QueryParser.StringDelimiters   := #39'"';
    QueryParser.RemoveStrDelimiter := False;
    QueryParser.CountFromStatement := True;
    QueryParser.TextToParse := RichEdit1.Lines.Text;
    QueryParser.FirstToken;
    while not QueryParser.EOF do
    begin
      sToken := QueryParser.Token;
      if AddToken(sToken)then
        ListBox1.Items.Add(sToken);
      QueryParser.NextToken;
    end;
  finally
    ShowMessage(IntToStr(QueryParser.SymbolsCount));
    QueryParser.Free;
  end;
end;

procedure TForm1.QueryParserStatementDelimiter(Sender: TObject;
  SQLStatement: String);
begin
end;

end.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18750983
EddieShipman:

If I knew that you will try with some component I would have advised you. it is not allowed to install component here in company. Sorry to say :(

it seems that solution is hard
0
 
LVL 28

Expert Comment

by:2266180
ID: 18751150
an aprox 5 min solution (includes starting up the IDE :) )

type TCharSet=set of char;

procedure moveto(r:TRichEdit; l:TStrings; strip:TCharSet);
var s,t:string;
    i:integer;
begin
  s:='';
  t:=r.Lines.Text;
  for i:=1 to length(t) do
    if not (t[i] in strip) then
      s:=s+t[i];
  l.Text:=s;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  moveto(richedit1, memo1.lines, ['"',',','!','@','?','/','''']);
end;

there is a lot to optimize here speedwise (not using a set but an array if char, using pchar for s and t, etc, but I concentrated on making the solution fast rather than maing a fast solution :) 1st important thing is to make it work :) (this is a pour excuse for the above algorithm :P )
0
 
LVL 1

Author Comment

by:hidrau
ID: 18751242
I have a text in my richedit and I need to create a listbox or a tstringlist of words from my text.
This Tstringlist or listbox can't have repeated words.

0
 
LVL 28

Accepted Solution

by:
2266180 earned 500 total points
ID: 18751660
>This Tstringlist or listbox can't have repeated words.
this is a very important detail you forgot to mention.

and I missunderstood the word list part :)

here is an updated code:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls;

type
  TForm1 = class(TForm)
    RichEdit1: TRichEdit;
    Button1: TButton;
    Memo1: TMemo;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

function isWordDelimiter(c:char):boolean;
begin
{  result:=(c<#48) or // calculated according to www.asciitable.com
            ( (c>#57) and
              ( (c<#65) or
                ( (c>#90) and
                  ( (c<#97) or
                    ( (c>#122) and
                      ( (c<#128) or
                        ( (c>#151) and
                          ( (c<#153) or
                            ( (c>#154) and
                              ( (c<#160) or
                                (c>#165)
                              )
                            )
                          )
                        )
                      )
                    )
                  )
                )
              )
            );{}
  result:=(c<#48) or // calculated according to www.lookuptables.com
            ( (c>#57) and
              ( (c<#65) or
                ( (c>#90) and
                  ( (c<#97) or
                    ( (c>#122) and
                      ( (c<#192) or
                        ( (c>#214) and
                          (c<#216)
                        )
                      )
                    )
                  )
                )
              )
            );{}
end;

function findWord(w:string; l:TStrings; caseSensitive:boolean):boolean;
var i:integer;
begin
  if caseSensitive then result:=l.IndexOf(w)>-1
                   else
  begin
    i:=0;
    while (i<l.Count) and (not SameText(l[i],w)) do
      inc(i);
    result:=i<l.Count;
  end;
end;

procedure moveto(r:TRichEdit; l:TStrings; caseSensitive:boolean);
var s,t,w:string;
    i:integer;
begin
  s:='';
  w:='';
  t:=r.Lines.Text;
  i:=1;
  while i<=length(t) do
  begin
    while (i<=length(t)) and (isWordDelimiter(t[i])) do
      inc(i);
    w:='';
    while (i<=length(t)) and (not isWordDelimiter(t[i])) do
    begin
      w:=w+t[i];
      inc(i);
    end;
    if (w<>'') and not findWord(w, l, caseSensitive)then
      l.add(w);
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  moveto(richedit1, memo1.lines, false);
end;

end.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18751713
thanks ciuly

that was I needed.

0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18751957
hidrau, I did not install the component, only had the source code avail in my path.
I created the component at run-time.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
control image tags in a string ? 12 154
Base1 Encode/Decode 3 89
Way to decrease size of apk file 9 110
Firemonkey how to swip panel with finger tap 1 23
Hello everybody This Article will show you how to validate number with TEdit control, What's the TEdit control? TEdit is a standard Windows edit control on a form, it allows to user to write, read and copy/paste single line of text. Usua…
Introduction Raise your hands if you were as upset with FireMonkey as I was when I discovered that there was no TListview.  I use TListView in almost all of my applications I've written, and I was not going to compromise by resorting to TStringGrid…
The goal of the video will be to teach the user the difference and consequence of passing data by value vs passing data by reference in C++. An example of passing data by value as well as an example of passing data by reference will be be given. Bot…
The viewer will learn how to user default arguments when defining functions. This method of defining functions will be contrasted with the non-default-argument of defining functions.

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question