• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 375
  • Last Modified:

Remove all sentences shorter than X words

Hi

I'm looking for some code that will remove all sentences shorter than X words from a memo.

Thanks
0
zattz
Asked:
zattz
4 Solutions
 
CodedKCommented:
Hi zattz i've made a demo for you here :
http://www.geocities.com/kalatz_gr_fisher/TheFiles/Censor.zip

Didnt test it thoroughly... Its just to give you an idea :)

Hope this helps.
0
 
zattzAuthor Commented:
Hi,

I'm not sure what your demo is doing, but it's not removing the sentences shorter than X words.

Top memo:
This sentence has five words. This sentence has six six words.  This has three.  This has four words.

Press button.

Bottom memo:
 entence entence words. three.


0
 
CodedKCommented:
The top is the original text, bottom is the one with the words smaller than x removed and the left is
just a refference of what has been deleted.

I pasted this at the top text :
------------------------------------------------------
I'm looking for some code that will remove all sentences shorter than X words from a memo.
------------------------------------------------------


and wrote to cut everything smaller than 5... Result :
------------------------------------------------------
 looking remove sentences shorter
------------------------------------------------------
0
Cloud Class® Course: CompTIA Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

 
CodedKCommented:
Sorry zattz...
I just saw sentences not words... sorry i misread your question...
0
 
mikelittlewoodCommented:
This assumes every sentence finishes with a full stop (.) and words separated with a space ( ).
Not tested but you should get the idea.

function removeSentences( sentences: String; TotWords: Integer): String;
var
  sl: TStringList;
  i: Integer;
begin
  sl := TStringList.Create;
  try
    sl.Delimiter := '.';
    sl.Delimitedtext := sentences;

    for i := sl.Count - 1 downto 0 do
      if CountWords( sl.Strings[ i]) < TotWords then
        sl.Delete( i);
  finally
    FreeAndNil( sl)
  end;
  Result := sl.text;
end;

function countWords( sentence: String; TotWords: Integer): Boolean;
var
  s: String;
  i: Integer;
begin
  i := 0;
  while Pos( ' ', sentence) > 0 do
  begin
     Inc( i);
     sentence := Copy( sentence, Pos( ' ', sentence) + 1, MAXINT);
  end;
  // last word no space but full stop so add 1 to total
  Result := TotWords > ( i+1);
end;
0
 
2266180Commented:
I didn't look over CodedK's example, but using the code I provided in the other question and to which I added some more methods you simply do something like:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    Memo1: TMemo;
    procedure FormCreate(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

uses Contnrs;

const WhiteSpaces=[' ',#9{tab},#13,#10];

      punctuation=['.',',','!','?'];

      other=['(',')','+','=','<','>','/','\','{','}','[',']'];
      // I put anything that is not whitspace but is can be a word delimiter here.

      AlphaNumeric=['a'..'z','A'..'Z','0'..'9'];

type TSentence=class
     private
       FWords:TStringList;
       FDelimiters:TStringList;
       function GetCount: integer;
       function GetWord(index: integer): string;
       procedure SetWord(index: integer; const Value: string);
       function GetIsEmpty: boolean;
       function GetText: string;
    function GetNonEmpty: integer;
     public
       constructor Create; overload;
       constructor Create(s:string); overload;
       destructor Destroy; override;

       class function isWordDelimiter(c:char):boolean;

       property Count:integer read GetCount;
       property NonEmpty:integer read GetNonEmpty;
       property Words[index:integer]:string read GetWord write SetWord; default;
       property Text:string read GetText;
       property IsEmpty:boolean read GetIsEmpty;
     end;

     TSentences=class
     private
       FSentences:TObjectList;
       FDelimiters:TStringList;
       function GetCount: integer;
       function GetSentence(index: integer): TSentence;
       function GetText: string;
       function GetNonEmpty: integer;
     public
       constructor Create; overload;
       constructor Create(s:string); overload;
       destructor Destroy; override;

       class function isSentenceDelimiter(c:char):boolean;
       procedure delete(index:integer);

       property Count:integer read GetCount;
       property NonEmpty:integer read GetNonEmpty;
       property Sentences[index:integer]:TSentence read GetSentence; default;
       property Text:string read GetText;
     end;

{ TSentence }

constructor TSentence.Create(s: string);
var i,j:integer; w:string;
begin
  Create;
  i:=1;
  while i<=length(s) do
  begin
    setlength(w,length(s));// maximum possible
    j:=1;
    while (i<=length(s)) and (not isWordDelimiter(s[i])) do
    begin
      w[j]:=s[i];
      inc(i);
      inc(j);
    end;
    dec(j);
    setlength(w,j);
    FWords.Add(w);

    setlength(w,length(s));// maximum possible
    j:=1;
    while (i<=length(s)) and (isWordDelimiter(s[i])) do
    begin
      w[j]:=s[i];
      inc(i);
      inc(j);
    end;
    dec(j);
    setlength(w,j);
    FDelimiters.Add(w);
  end;
end;

constructor TSentence.Create;
begin
  FWords:=TStringList.Create;
  FDelimiters:=TStringList.Create;
end;

destructor TSentence.Destroy;
begin
  FDelimiters.Free;
  FWords.Free;
  inherited;
end;

function TSentence.GetCount: integer;
begin
  result:=FWords.Count;
end;

function TSentence.GetIsEmpty: boolean;
begin
  result:=GetNonEmpty=0;
end;

function TSentence.GetNonEmpty: integer;
var i:integer;
begin
  result:=0;
  for i:=1 to Count do
    if FWords[i-1]<>'' then
      inc(result);
end;

function TSentence.GetText: string;
var i:integer;
begin
  result:='';
  for i:=1 to count do
    result:=result+FWords[i-1]+FDelimiters[i-1];
end;

function TSentence.GetWord(index: integer): string;
begin
  result:=FWords[index];
end;

class function TSentence.isWordDelimiter(c: char): boolean;
begin
  result:=(c in whitespaces) or (c in punctuation) or (c in other);
  // another definition of a word might be: any set oc alphanumerical characters
  // so it all depends on the definition.
end;

procedure TSentence.SetWord(index: integer; const Value: string);
begin
  FWords[index]:=value;
end;

{ TSentences }

constructor TSentences.Create(s: string);
var i,j:integer; t:string;
begin
  Create;
  i:=1;
  while i<=length(s) do
  begin
    setlength(t,length(s));
    j:=1;
    while (i<=length(s)) and (not isSentenceDelimiter(s[i])) do
    begin
      t[j]:=s[i];
      inc(i);
      inc(j);
    end;
    dec(j);
    setlength(t,j);
    FSentences.Add(TSentence.Create(t));

    setlength(t,length(s));
    j:=1;
    while (i<=length(s)) and (isSentenceDelimiter(s[i])) do
    begin
      t[j]:=s[i];
      inc(i);
      inc(j);
    end;
    dec(j);
    setlength(t,j);
    FDelimiters.Add(t);
  end;
end;

constructor TSentences.Create;
begin
  FSentences:=TObjectList.Create(true);
  FDelimiters:=TStringList.Create;
end;

procedure TSentences.delete(index: integer);
begin
  FSentences.Delete(index);
  FDelimiters.Delete(index);
end;

destructor TSentences.Destroy;
begin
  FDelimiters.Free;
  FSentences.Free;
  inherited;
end;

function TSentences.GetCount: integer;
begin
  result:=FSentences.Count;
end;

function TSentences.GetNonEmpty: integer;
var i:integer;
begin
  result:=0;
  for i:=1 to Count do
    if not Sentences[i-1].isEmpty then
      inc(result);
end;

function TSentences.GetSentence(index: integer): TSentence;
begin
  result:=TSentence(FSentences[index]);
end;

function TSentences.GetText: string;
var i:integer;
begin
  result:='';
  assert(Count=FDelimiters.count);
  for i:=1 to Count do
    result:=result+Sentences[i-1].Text+FDelimiters[i-1];
end;

class function TSentences.isSentenceDelimiter(c: char): boolean;
begin
  result:=c in punctuation;
end;

procedure TForm1.FormCreate(Sender: TObject);
var s:tsentences; i:integer;
begin
  memo1.lines.clear;
  memo1.lines.add('word word s. next. making 3 on a line.');
  memo1.lines.add('now splitting');
  memo1.lines.add('one sentence on 2 lines. adding some more.');
  memo1.lines.add('now we have exactly 6 sentences on 4 lines.');
  s:=TSentences.create(memo1.lines.text);

  showmessage(inttostr(s.NonEmpty));

  i:=0;
  while i<s.count do
    if (not s[i].IsEmpty) and (s[i].NonEmpty<=3) then// remove all sentences that have 3 or less words.
      s.delete(i)                                else
      inc(i);

  memo1.lines.text:=s.text;
  s.free;
end;

end.
0
 
mikelittlewoodCommented:
I changed what I was orginially going to do so replace

if CountWords( sl.Strings[ i]) < TotWords then

with

if CountWords( sl.Strings[ i]) then
0
 
CodedKCommented:
:/  Argx...
 i was about to post the same code in a demo mikelittlewood..

:)
0
 
CodedKCommented:
This is my demo...
It doesnt work with "." but with different lines... :)
http://www.geocities.com/kalatz_gr_fisher/TheFiles/Censor2.zip
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: Ruby Fundamentals

This course will introduce you to Ruby, as well as teach you about classes, methods, variables, data structures, loops, enumerable methods, and finishing touches.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now