Solved

Extract sentences from file

Posted on 2006-11-23
9
443 Views
Last Modified: 2010-04-21
Hi

Say I have two files, I want to create a third file, by extracting the 5 sentences from file1, adding it to file3, then extracting 5 sentences from file2, and adding it to file3, then going back to file1, doing the same, then file2, doing the same until the end of both files. If either file1 or file2 file runs out then the rest of the other file must be added to file3.

It is safe to presume a sentence ends in a "." and then next one begins after the "."

Can somebody whip up some code to do this?

Thanks


0
Comment
Question by:zattz
  • 3
  • 2
  • 2
  • +2
9 Comments
 
LVL 28

Accepted Solution

by:
TName earned 125 total points
ID: 18006484
Hi, you could try something like this (put a button on a form. Demofiles: 'C:\Test1.txt', 'C:\Test2.txt'):

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  private
    function FileToString(FN: String):String;
    procedure Dismantle(Sep, Text: String; var SList:TStringList);
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}


procedure TForm1.Button1Click(Sender: TObject);
var
SL1,SL2,SL3:TStringList;
str:String;
i,j,c:Integer;
begin
  SL1:=TStringList.Create;
  SL2:=TStringList.Create;
  SL3:=TStringList.Create;

  if FileExists('C:\Test1.txt') then
    str:=FileToString('C:\Test1.txt');
  if FileExists('C:\Test2.txt') then
     str:=FileToString('C:\Test2.txt');
  try
    Dismantle('.', str, SL1);
    Dismantle('.', str, SL2);
  except
     Exit;
  end;

  c:=SL1.Count;
  if SL2.Count>c then
      c:=SL2.Count;

  try
    for i:=1 to c do begin
      j:=5;
      while j>0 do begin
         if SL1.Count>0 then begin
           SL3.Add(SL1[0]);
           SL1.Delete(0);
         end;
         Dec(j);
      end;
      j:=5;
      while j>0 do begin
         if SL2.Count>0 then begin
           SL3.Add(SL2[0]);
           SL2.Delete(0);
         end;
         Dec(j);
      end;
    end;
  except
    //
  end;
  SL3.SaveToFile('C:\Test3.txt');

  SL1.Free;
  SL2.Free;
  SL3.Free;
end;


procedure TForm1.Dismantle(Sep, Text: String; var SList:TStringList);
var
  i: Integer;
  S: String;
begin
  S := Text;
  i := 0;
  while Pos(Sep, S) > 0 do begin
    SList.Add(Copy(S, 1, Pos(Sep, S)-1));
    S := Copy(S, Pos(Sep, S) + Length(Sep), Length(S));
    Inc(i);
  end;
  SList.Add(Copy(S, 1, Length(S)));
end;


function TForm1.FileToString(FN: String):String;
var
  p: PChar;
begin
  with TFileStream.Create(FN,fmOpenRead) do
    try
      GetMem(p,Size);
      try
        Read(p^,Size);
        Result := p;
      finally
        FreeMem(p);
      end;
    finally
      Free;
    end;
end;

end.
0
 
LVL 15

Assisted Solution

by:mikelittlewood
mikelittlewood earned 125 total points
ID: 18006501
Don't you need 2 different string holder values for the files and not just the one "str"

Instead of doing

  try
    Dismantle('.', str1, SL1);
    Dismantle('.', str2, SL2);
  except
     Exit;
  end;

change it to

  SL1.Delimiter := '.';
  SL2.Delimiter := '.';
  SL1.DelimitedText := str1;
  SL2.DelimitedText := str2;
0
 
LVL 28

Expert Comment

by:TName
ID: 18006538
Thanks Mike, I've done some late changes in the edit field of the browser... !   Very clever idea :\
Glad to see someone really checks! :)


procedure TForm1.Button1Click(Sender: TObject);
var
SL1,SL2,SL3:TStringList;
str:String;
i,j,c:Integer;
begin
  SL1:=TStringList.Create;
  SL2:=TStringList.Create;
  SL3:=TStringList.Create;

  if FileExists('C:\Test1.txt') then
    str:=FileToString('C:\Test1.txt');
  try
    Dismantle('.', str, SL1);  
  except
     Exit;
  end;

 if FileExists('C:\Test2.txt') then
    str:=FileToString('C:\Test2.txt');
  try
    Dismantle('.', str, SL2);  
  except
     Exit;
  end;

  c:=SL1.Count;
  if SL2.Count>c then
      c:=SL2.Count;

{...}
0
 
LVL 15

Expert Comment

by:mikelittlewood
ID: 18006542
he he no probs
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 28

Expert Comment

by:TName
ID: 18006544
 >SL1.Delimiter := '.';
  >SL2.Delimiter := '.';
  >SL1.DelimitedText := str1;
  >SL2.DelimitedText := str2;

Or as Mike suggested.
I'll have to check this. Can you believe that I've used TStringlist for years, without noticing this function?

0
 
LVL 9

Assisted Solution

by:alkisg
alkisg earned 125 total points
ID: 18009263
A shorter version, using TStrings.LineBreak to seperate the sentences and TStrings.LoadFromFile to load them.
WARNING, I didn't compile this, there could be errors in the code:

var
  t1, t2: TStrings;
begin
  t1 := TStringList.Create;
  t1.LineBreak := '.';
  t1.LoadFromFile('text1.txt');
  t2 := TStringList.Create;
  t2.LineBreak := '.';
  t2.LoadFromFile('text1.txt');
//...now the files are read and broken to sentences,
//then you continue with the rest of the code as TName suggested
0
 
LVL 9

Expert Comment

by:alkisg
ID: 18009269
P.S. the dots at the end of the sentences are lost, so you'll have to write them manually at file3, i.e.
Write(t1[i] + '.')

The line breaks are preserved thought, and it makes code simpler...
0
 
LVL 9

Assisted Solution

by:bernani
bernani earned 125 total points
ID: 18018667
Hi,

If it can help, a little demo with display of the results in 3 memos

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls;

type
  TForm1 = class(TForm)
    Memo1: TMemo;
    Button1: TButton;
    Memo2: TMemo;
    Memo3: TMemo;
    procedure Button1Click(Sender: TObject);
  private
    { Déclarations privées }
  public
    { Déclarations publiques }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

procedure StringPickToPieces(Str: AnsiString; Separator: Char; Strings: TStrings);
begin
  try
    Strings.Clear;
    while AnsiPos(Separator, Str) > 0 do
    begin
      Strings.Add(Copy(Str, 1, AnsiPos(Separator, Str) - 1));
      Str := Copy(Str, AnsiPos(Separator, Str) + 1, Length(Str));
    end;
    Strings.Add(Str);
  except
  {}
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
x, c, i,j: integer;
sl1, sl2, sl3: TStringList;


begin
Memo1.clear;
sl1:= TStringlist.Create;
sl1.LoadFromFile('File1.txt');
StringPickToPieces(sl1.text,'.',sl1);
// only for visual correct text display
for x:= 0 to sl1.count -1 do
begin
memo1.lines.Add(sl1[x]+ '.');
end;

Memo2.clear;
sl2:= TStringlist.Create;
sl2.LoadFromFile('File2.txt');
StringPickToPieces(sl2.text,'.',sl2);
// only for visual correct text display
for x:=0 to sl2.count-1 do
begin
memo2.lines.Add(sl2[x]+ '.');
end;

Memo3.clear;
sl3:= TStringlist.Create;

c:=SL1.Count;

if SL2.Count>c then c:= SL2.Count;

  try
    for i:=1 to c do begin
      j:=5;
      while j>0 do begin
         if SL1.Count -1 > 0 then begin
           SL3.Add(SL1[0]+'.');
           SL1.Delete(0);
         end;
         Dec(j);
      end;
      j:=5;
      while j>0 do begin
         if SL2.Count -1 > 0 then begin
           SL3.Add(SL2[0] + '.');
           SL2.Delete(0);
         end;
         Dec(j);
      end;
    end;
  except
    //
  end;

SL3.SaveToFile('File3.txt');
// only for visual correct text display
Memo3.lines.LoadFromFile('File3.txt');

sl1.free;
sl2.free;
sl3.free;
end;

end.

______________ dfm _____________

object Form1: TForm1
  Left = 342
  Top = 214
  Width = 531
  Height = 431
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Shell Dlg 2'
  Font.Style = []
  OldCreateOrder = False
  PixelsPerInch = 96
  TextHeight = 13
  object Memo1: TMemo
    Left = 112
    Top = 32
    Width = 193
    Height = 169
    Lines.Strings = (
      'Memo1')
    TabOrder = 0
  end
  object Button1: TButton
    Left = 16
    Top = 360
    Width = 75
    Height = 25
    Caption = 'Text splitting '
    TabOrder = 1
    OnClick = Button1Click
  end
  object Memo2: TMemo
    Left = 312
    Top = 32
    Width = 185
    Height = 169
    Lines.Strings = (
      'Memo2')
    TabOrder = 2
  end
  object Memo3: TMemo
    Left = 112
    Top = 208
    Width = 393
    Height = 185
    Lines.Strings = (
      'Memo3')
    TabOrder = 3
  end
end


0
 

Author Closing Comment

by:zattz
ID: 31425433
sorry about the delay
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction Raise your hands if you were as upset with FireMonkey as I was when I discovered that there was no TListview.  I use TListView in almost all of my applications I've written, and I was not going to compromise by resorting to TStringGrid…
In my programming career I have only very rarely run into situations where operator overloading would be of any use in my work.  Normally those situations involved math with either overly large numbers (hundreds of thousands of digits or accuracy re…
This Micro Tutorial will teach you how to censor certain areas of your screen. The example in this video will show a little boy's face being blurred. This will be demonstrated using Adobe Premiere Pro CS6.
As a trusted technology advisor to your customers you are likely getting the daily question of, ‘should I put this in the cloud?’ As customer demands for cloud services increases, companies will see a shift from traditional buying patterns to new…

895 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now