Solved

Extract all words from a richedit and insert into a stringlist

Posted on 2007-03-19
12
436 Views
Last Modified: 2010-04-05
Hello guys,

I have a text in a richedit and I would like a function where I could get all words from my richedit and insert into a stringlist. this function need to take away the simbols ",!@?/'etc.

Thanks
0
Comment
Question by:hidrau
  • 5
  • 5
  • 2
12 Comments
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18748573
Try something like this:

procedure Explode(myStr: String; List: TStrings);
var
  MyParser: TParser;
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  MS.Position := 0;
  MS.Write(MyStr[1], Length(MyStr));
  MS.Position := 0;
  MyParser := TParser.Create(MS);
  MyStr := MyParser.TokenString;
  while MyParser.Token <> toEOF do
  begin
    if MyParser.TokenSymbolIs(MyParser.TokenString) then
    begin
      MyStr := MyParser.TokenString;
      List.Add(MyStr);
    end;
    MyParser.NextToken;
  end;
  MyParser.Free;
  MS.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Explode(StringReplace(RichEdit1.Lines.Text, '''', '''''', [rfReplaceAll]), ListBox1.Items);
end;
0
 
LVL 1

Author Comment

by:hidrau
ID: 18748742
EddieShipman,

I am having problem with your function because I have some word with accent. Like

ação = action
sábia = knew
fábrica = factory

maybe your function doesn't work with accented words

Could you take a look at?

0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18748968
ok, hold on.
0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18749207
Seems the NextToken function of TParser does not handle extended chars.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18750415
no way?
0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18750758
I found a free component on Torry's that seems to work OK, give it a try, here's a demo:

SQL Parser By Kei.
TQueryParserComp is a simple text parser which recognizes comments, strings, statement delimiters
and special characters.

Fully functional
Source: Included

Download: http://www.torry.net/db/access/db_sql/kqparser.zip

******************************************************************************************
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls, SQLParserComp;

type
  TForm1 = class(TForm)
    Button1: TButton;
    ListBox1: TListBox;
    RichEdit1: TRichEdit;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
    procedure QueryParserStatementDelimiter(Sender: TObject;
      SQLStatement: String);
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure Explode(myStr: String; List: TStrings);
var
  MyParser: TParser;
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  MS.Position := 0;
  MS.Write(MyStr[1], Length(MyStr));
  MS.Position := 0;
  MyParser := TParser.Create(MS);
  MyStr := MyParser.TokenString;
  //
  while MyParser.Token <> toEOF do
  begin
    if MyParser.TokenSymbolIs(MyParser.TokenString) then
    begin
      MyStr := MyParser.TokenString;
      List.Add(MyStr);
    end;
    MyParser.NextToken;
  end;
  MyParser.Free;
  MS.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
  function AddToken(var AToken: String):Boolean;
  var
    i: Integer;
  begin
    Result := False;
    if (Length(Trim(AToken)) > 0) then
    begin
      for i := Length(AToken) downto 1 do
        if (AToken[i] in [',', '.', '"', '(', ')', '@']) then
          Delete(AToken, i, 1);
        if Length(Trim(AToken)) > 0 then
          Result := True;
    end;
  end;

var
  QueryParser: TQueryParserComp;
  sToken: String;
begin
  try
    QueryParser := TQueryParserComp.Create(nil);
    QueryParser.IsEOFStmtDelimiter := False;
    QueryParser.StringDelimiters   := #39'"';
    QueryParser.RemoveStrDelimiter := False;
    QueryParser.CountFromStatement := True;
    QueryParser.TextToParse := RichEdit1.Lines.Text;
    QueryParser.FirstToken;
    while not QueryParser.EOF do
    begin
      sToken := QueryParser.Token;
      if AddToken(sToken)then
        ListBox1.Items.Add(sToken);
      QueryParser.NextToken;
    end;
  finally
    ShowMessage(IntToStr(QueryParser.SymbolsCount));
    QueryParser.Free;
  end;
end;

procedure TForm1.QueryParserStatementDelimiter(Sender: TObject;
  SQLStatement: String);
begin
end;

end.
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 1

Author Comment

by:hidrau
ID: 18750983
EddieShipman:

If I knew that you will try with some component I would have advised you. it is not allowed to install component here in company. Sorry to say :(

it seems that solution is hard
0
 
LVL 28

Expert Comment

by:2266180
ID: 18751150
an aprox 5 min solution (includes starting up the IDE :) )

type TCharSet=set of char;

procedure moveto(r:TRichEdit; l:TStrings; strip:TCharSet);
var s,t:string;
    i:integer;
begin
  s:='';
  t:=r.Lines.Text;
  for i:=1 to length(t) do
    if not (t[i] in strip) then
      s:=s+t[i];
  l.Text:=s;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  moveto(richedit1, memo1.lines, ['"',',','!','@','?','/','''']);
end;

there is a lot to optimize here speedwise (not using a set but an array if char, using pchar for s and t, etc, but I concentrated on making the solution fast rather than maing a fast solution :) 1st important thing is to make it work :) (this is a pour excuse for the above algorithm :P )
0
 
LVL 1

Author Comment

by:hidrau
ID: 18751242
I have a text in my richedit and I need to create a listbox or a tstringlist of words from my text.
This Tstringlist or listbox can't have repeated words.

0
 
LVL 28

Accepted Solution

by:
2266180 earned 500 total points
ID: 18751660
>This Tstringlist or listbox can't have repeated words.
this is a very important detail you forgot to mention.

and I missunderstood the word list part :)

here is an updated code:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls;

type
  TForm1 = class(TForm)
    RichEdit1: TRichEdit;
    Button1: TButton;
    Memo1: TMemo;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

function isWordDelimiter(c:char):boolean;
begin
{  result:=(c<#48) or // calculated according to www.asciitable.com
            ( (c>#57) and
              ( (c<#65) or
                ( (c>#90) and
                  ( (c<#97) or
                    ( (c>#122) and
                      ( (c<#128) or
                        ( (c>#151) and
                          ( (c<#153) or
                            ( (c>#154) and
                              ( (c<#160) or
                                (c>#165)
                              )
                            )
                          )
                        )
                      )
                    )
                  )
                )
              )
            );{}
  result:=(c<#48) or // calculated according to www.lookuptables.com
            ( (c>#57) and
              ( (c<#65) or
                ( (c>#90) and
                  ( (c<#97) or
                    ( (c>#122) and
                      ( (c<#192) or
                        ( (c>#214) and
                          (c<#216)
                        )
                      )
                    )
                  )
                )
              )
            );{}
end;

function findWord(w:string; l:TStrings; caseSensitive:boolean):boolean;
var i:integer;
begin
  if caseSensitive then result:=l.IndexOf(w)>-1
                   else
  begin
    i:=0;
    while (i<l.Count) and (not SameText(l[i],w)) do
      inc(i);
    result:=i<l.Count;
  end;
end;

procedure moveto(r:TRichEdit; l:TStrings; caseSensitive:boolean);
var s,t,w:string;
    i:integer;
begin
  s:='';
  w:='';
  t:=r.Lines.Text;
  i:=1;
  while i<=length(t) do
  begin
    while (i<=length(t)) and (isWordDelimiter(t[i])) do
      inc(i);
    w:='';
    while (i<=length(t)) and (not isWordDelimiter(t[i])) do
    begin
      w:=w+t[i];
      inc(i);
    end;
    if (w<>'') and not findWord(w, l, caseSensitive)then
      l.add(w);
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  moveto(richedit1, memo1.lines, false);
end;

end.
0
 
LVL 1

Author Comment

by:hidrau
ID: 18751713
thanks ciuly

that was I needed.

0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 18751957
hidrau, I did not install the component, only had the source code avail in my path.
I created the component at run-time.
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this tutorial I will show you how to use the Windows Speech API in Delphi. I will only cover basic functions such as text to speech and controlling the speed of the speech. SAPI Installation First you need to install the SAPI type library, th…
Windows Script Host (WSH) has been part of Windows since Windows NT4. Windows Script Host provides architecture for building dynamic scripts that consist of a core object model, scripting hosts, and scripting engines. The key components of Window…
This tutorial covers a step-by-step guide to install VisualVM launcher in eclipse.
The viewer will be introduced to the member functions push_back and pop_back of the vector class. The video will teach the difference between the two as well as how to use each one along with its functionality.

911 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now