Solved

Extract all words from a richedit and insert into a stringlist

Posted on 2007-03-19
12
443 Views
Last Modified: 2010-04-05
Hello guys,

I have a text in a richedit and I would like a function where I could get all words from my richedit and insert into a stringlist. this function need to take away the simbols ",!@?/'etc.

Thanks
0
Comment
Question by:hidrau
  • 5
  • 5
  • 2
12 Comments
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18748573
Try something like this:

procedure Explode(myStr: String; List: TStrings);
var
  MyParser: TParser;
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  MS.Position := 0;
  MS.Write(MyStr[1], Length(MyStr));
  MS.Position := 0;
  MyParser := TParser.Create(MS);
  MyStr := MyParser.TokenString;
  while MyParser.Token <> toEOF do
  begin
    if MyParser.TokenSymbolIs(MyParser.TokenString) then
    begin
      MyStr := MyParser.TokenString;
      List.Add(MyStr);
    end;
    MyParser.NextToken;
  end;
  MyParser.Free;
  MS.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Explode(StringReplace(RichEdit1.Lines.Text, '''', '''''', [rfReplaceAll]), ListBox1.Items);
end;
0
 
LVL 1

Author Comment

by:hidrau
ID: 18748742
EddieShipman,

I am having problem with your function because I have some word with accent. Like

ação = action
sábia = knew
fábrica = factory

maybe your function doesn't work with accented words

Could you take a look at?

0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18748968
ok, hold on.
0
Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
LVL 26

Expert Comment

by:EddieShipman
ID: 18749207
Seems the NextToken function of TParser does not handle extended chars.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18750415
no way?
0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18750758
I found a free component on Torry's that seems to work OK, give it a try, here's a demo:

SQL Parser By Kei.
TQueryParserComp is a simple text parser which recognizes comments, strings, statement delimiters
and special characters.

Fully functional
Source: Included

Download: http://www.torry.net/db/access/db_sql/kqparser.zip

******************************************************************************************
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls, SQLParserComp;

type
  TForm1 = class(TForm)
    Button1: TButton;
    ListBox1: TListBox;
    RichEdit1: TRichEdit;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
    procedure QueryParserStatementDelimiter(Sender: TObject;
      SQLStatement: String);
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure Explode(myStr: String; List: TStrings);
var
  MyParser: TParser;
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  MS.Position := 0;
  MS.Write(MyStr[1], Length(MyStr));
  MS.Position := 0;
  MyParser := TParser.Create(MS);
  MyStr := MyParser.TokenString;
  //
  while MyParser.Token <> toEOF do
  begin
    if MyParser.TokenSymbolIs(MyParser.TokenString) then
    begin
      MyStr := MyParser.TokenString;
      List.Add(MyStr);
    end;
    MyParser.NextToken;
  end;
  MyParser.Free;
  MS.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
  function AddToken(var AToken: String):Boolean;
  var
    i: Integer;
  begin
    Result := False;
    if (Length(Trim(AToken)) > 0) then
    begin
      for i := Length(AToken) downto 1 do
        if (AToken[i] in [',', '.', '"', '(', ')', '@']) then
          Delete(AToken, i, 1);
        if Length(Trim(AToken)) > 0 then
          Result := True;
    end;
  end;

var
  QueryParser: TQueryParserComp;
  sToken: String;
begin
  try
    QueryParser := TQueryParserComp.Create(nil);
    QueryParser.IsEOFStmtDelimiter := False;
    QueryParser.StringDelimiters   := #39'"';
    QueryParser.RemoveStrDelimiter := False;
    QueryParser.CountFromStatement := True;
    QueryParser.TextToParse := RichEdit1.Lines.Text;
    QueryParser.FirstToken;
    while not QueryParser.EOF do
    begin
      sToken := QueryParser.Token;
      if AddToken(sToken)then
        ListBox1.Items.Add(sToken);
      QueryParser.NextToken;
    end;
  finally
    ShowMessage(IntToStr(QueryParser.SymbolsCount));
    QueryParser.Free;
  end;
end;

procedure TForm1.QueryParserStatementDelimiter(Sender: TObject;
  SQLStatement: String);
begin
end;

end.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18750983
EddieShipman:

If I knew that you will try with some component I would have advised you. it is not allowed to install component here in company. Sorry to say :(

it seems that solution is hard
0
 
LVL 28

Expert Comment

by:2266180
ID: 18751150
an aprox 5 min solution (includes starting up the IDE :) )

type TCharSet=set of char;

procedure moveto(r:TRichEdit; l:TStrings; strip:TCharSet);
var s,t:string;
    i:integer;
begin
  s:='';
  t:=r.Lines.Text;
  for i:=1 to length(t) do
    if not (t[i] in strip) then
      s:=s+t[i];
  l.Text:=s;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  moveto(richedit1, memo1.lines, ['"',',','!','@','?','/','''']);
end;

there is a lot to optimize here speedwise (not using a set but an array if char, using pchar for s and t, etc, but I concentrated on making the solution fast rather than maing a fast solution :) 1st important thing is to make it work :) (this is a pour excuse for the above algorithm :P )
0
 
LVL 1

Author Comment

by:hidrau
ID: 18751242
I have a text in my richedit and I need to create a listbox or a tstringlist of words from my text.
This Tstringlist or listbox can't have repeated words.

0
 
LVL 28

Accepted Solution

by:
2266180 earned 500 total points
ID: 18751660
>This Tstringlist or listbox can't have repeated words.
this is a very important detail you forgot to mention.

and I missunderstood the word list part :)

here is an updated code:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls;

type
  TForm1 = class(TForm)
    RichEdit1: TRichEdit;
    Button1: TButton;
    Memo1: TMemo;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

function isWordDelimiter(c:char):boolean;
begin
{  result:=(c<#48) or // calculated according to www.asciitable.com
            ( (c>#57) and
              ( (c<#65) or
                ( (c>#90) and
                  ( (c<#97) or
                    ( (c>#122) and
                      ( (c<#128) or
                        ( (c>#151) and
                          ( (c<#153) or
                            ( (c>#154) and
                              ( (c<#160) or
                                (c>#165)
                              )
                            )
                          )
                        )
                      )
                    )
                  )
                )
              )
            );{}
  result:=(c<#48) or // calculated according to www.lookuptables.com
            ( (c>#57) and
              ( (c<#65) or
                ( (c>#90) and
                  ( (c<#97) or
                    ( (c>#122) and
                      ( (c<#192) or
                        ( (c>#214) and
                          (c<#216)
                        )
                      )
                    )
                  )
                )
              )
            );{}
end;

function findWord(w:string; l:TStrings; caseSensitive:boolean):boolean;
var i:integer;
begin
  if caseSensitive then result:=l.IndexOf(w)>-1
                   else
  begin
    i:=0;
    while (i<l.Count) and (not SameText(l[i],w)) do
      inc(i);
    result:=i<l.Count;
  end;
end;

procedure moveto(r:TRichEdit; l:TStrings; caseSensitive:boolean);
var s,t,w:string;
    i:integer;
begin
  s:='';
  w:='';
  t:=r.Lines.Text;
  i:=1;
  while i<=length(t) do
  begin
    while (i<=length(t)) and (isWordDelimiter(t[i])) do
      inc(i);
    w:='';
    while (i<=length(t)) and (not isWordDelimiter(t[i])) do
    begin
      w:=w+t[i];
      inc(i);
    end;
    if (w<>'') and not findWord(w, l, caseSensitive)then
      l.add(w);
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  moveto(richedit1, memo1.lines, false);
end;

end.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18751713
thanks ciuly

that was I needed.

0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18751957
hidrau, I did not install the component, only had the source code avail in my path.
I created the component at run-time.
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

There is an easy way, in .NET, to centralize the treatment of all unexpected errors. First of all, instead of launching the application directly in a Form, you need first to write a Sub called Main, in a module. Then, set the Startup Object to th…
If you haven’t already, I encourage you to read the first article (http://www.experts-exchange.com/articles/18680/An-Introduction-to-R-Programming-and-R-Studio.html) in my series to gain a basic foundation of R and R Studio.  You will also find the …
Viewers will learn how to properly install Eclipse with the necessary JDK, and will take a look at an introductory Java program. Download Eclipse installation zip file: Extract files from zip file: Download and install JDK 8: Open Eclipse and …
The viewer will learn how to use the return statement in functions in C++. The video will also teach the user how to pass data to a function and have the function return data back for further processing.

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question