Solved

how to count words from file

Posted on 2009-02-12
Last Modified: 2013-11-23
I want to read the content of a file.
The content may contain commas, spaces and periods.

How do I count the words in the file?

Say I have this:

Hi! Want to tell you a story:
There once was a bear, it lived in a forest and the bear love to eat meat. He has a friend named Sticky.  When she came, the bear went to the beach and danced with some noodles on his head in a forest happily ever after.

Story end? Yes.
Question by:tankergoblin
LVL 38

Expert Comment

by:Geert Gruwez
ID: 23630640
 
LVL 4

Expert Comment

by:dprochownik
ID: 23630739
Of course, you can change the cDelimiters set :)

function WordCount(const pFileName: String): Integer;
const
  cDelimiters = [#0..#31,' ','.',',','?','(',')','[',']','\','/'];
var
  fFile: TFileStream;
  vBuffer: array[0..1023] of AnsiChar; //one byte per char, also on Unicode Delphi
  vWord: Boolean;
  vi, vBufSize: Integer;
begin
  result := 0;
  if not FileExists(pFileName) then exit;
  try
    fFile := TFileStream.Create(pFileName,fmOpenRead);
  except
    exit; //Access denied for file or other errors
  end;
  try
    //Reading from file and counting;
    vWord := false;
    while fFile.Position < fFile.Size do
    begin
      vBufSize := fFile.Read(vBuffer,sizeOf(vBuffer));

      //Read returns the number of bytes read, so the last valid index is vBufSize-1
      for vi := 0 to vBufSize-1 do
      begin
        if not(vBuffer[vi] in cDelimiters) then
          vWord := true
        else begin
          if vWord then
            inc(result);
          vWord := false;
        end;
      end;
    end;
    //Count the last word when the file does not end with a delimiter
    if vWord then
      inc(result);
  finally
    FreeAndNil(fFile);
  end;
end;

 
LVL 7

Author Comment

by:tankergoblin
ID: 23630815
500 points for 2 links?

I did it as below.

The program below works when the words are separated by spaces, for example:

how do you go

But if I add ? or !, for example:
hi! where are you going?
then my word count program does not work.



var 
 FileWordList,FilenameList: TStringList;
 Filename: string; 
 i: integer;
begin
 FilenameList := TStringList.Create;
 FileWordList := TStringList.Create;
 
 OpenDialog := TOpenDialog.create(self);
 
 if openDialog.execute then
 begin
  Filename := OpenDialog.Filename;
  FilenameList.LoadFromFile(Filename);
 end;
 OpenDialog.Free;
 
 for i := 0 to FilenameList.count-1 do
 begin
  FileWordList.DelimitedText := FilenameList[i];
 end;
 showmessage(intToStr(FileWordList.count));

 
LVL 7

Author Comment

by:tankergoblin
ID: 23630818
Sorry, that should be *below.
 
LVL 4

Expert Comment

by:dprochownik
ID: 23630838
Loading a file into a TStringList is OK, but only for small files. With a large file it will be inefficient.
I prefer using file streams as in my example, where the file is read while the counting is in progress. In my opinion that is much faster than TStringList.LoadFromFile on large files.
 
LVL 7

Author Comment

by:tankergoblin
ID: 23630845
Also, say I have two lines:

how are you
going to city

It will only read the second line.
How do I fix that?
 
LVL 38

Expert Comment

by:Geert Gruwez
ID: 23630906
>>500 points for 2 links?
the second link i thought was exactly what you needed

and do you really expect me to copy all the code in here ?
I could of course write my own interpretation, but why should i reinvent the wheel all over again ?

are you saying that when you start a totally new application,
you don't use any code of your previous applications ?

so you basically reinvent the wheel every time ?

there is nothing wrong with providing a link to very good code,
and very well documented too ...
 
LVL 7

Author Comment

by:tankergoblin
ID: 23630909
What about memory usage?
I think that with a TStringList you store everything in an object, so you do not have to read the file every time you need it.
I think this makes it faster.
Furthermore, the code is shorter and easier to write.
It is just that it only processes the last line.
How do I fix that?
 
LVL 4

Expert Comment

by:dprochownik
ID: 23630916
Below is your code, changed so that it counts all rows, but I think it is still wrong, because text like:
how are you.Going to city

will be counted as 5 words, because "you.Going" is one word for it, and you can't do anything about that because a TStringList can have only one Delimiter char.
The sample I gave you a few posts earlier handles that case.

var
 FileWordList,FilenameList: TStringList;
 OpenDialog: TOpenDialog;
 i: integer;
 vResult: Integer;
begin
 FilenameList := TStringList.Create;
 FileWordList := TStringList.Create;
 try
  OpenDialog := TOpenDialog.create(self);
  try
   if openDialog.execute then
    FilenameList.LoadFromFile(OpenDialog.Filename);
  finally
   OpenDialog.Free;
  end;

  vresult := 0;
  for i := 0 to FilenameList.count-1 do
  begin
    //DelimitedText replaces the list content, so add up the count of every row
    FileWordList.DelimitedText := FilenameList[i];
    vResult := vResult + FileWordList.count;
  end;
  showmessage(intToStr(vResult));
 finally
  FileWordList.Free;
  FilenameList.Free;
 end;
end;
 
LVL 7

Author Comment

by:tankergoblin
ID: 23630919
Also, you store a bunch of characters in an array, which takes up space.
 
LVL 4

Expert Comment

by:dprochownik
ID: 23630954
You can modify the cDelimiters set.
Chars #0..#31 are non-printable chars like
#10 - end of line
#13 - Return
#8 - Tabulator
etc.
The ' ' char, which means space, is also defined there. You can put any chars you like in it.
I wouldn't delete #8, #10, #13, ' ', '.', ',' but it is your choice :)
 
LVL 38

Expert Comment

by:Geert Gruwez
ID: 23631105
#8 is backspace
#9 is tab

just so you know ...
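
For reference, the control characters mentioned above could be given names so they are harder to mix up. This is only a minimal sketch: the constant names and the IsWordDelimiter helper are made up for illustration, and the set has the same members as the cDelimiters set posted earlier in the thread.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

const
  // Illustrative names, not taken from the thread.
  cBackspace = #8;
  cTab       = #9;
  cLineFeed  = #10;
  cReturn    = #13;
  // #0..#31 already covers all four control characters named above.
  cDelimiters = [#0..#31, ' ', '.', ',', '?', '(', ')', '[', ']', '\', '/'];

// Returns True when C is a member of the delimiter set.
function IsWordDelimiter(C: AnsiChar): Boolean;
begin
  Result := C in cDelimiters;
end;
```

Because the range #0..#31 is part of the set, tab, line feed and carriage return are all treated as word separators without being listed one by one.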
 
LVL 4

Expert Comment

by:dprochownik
ID: 23631152
My mistake, sorry.
 
LVL 7

Author Comment

by:tankergoblin
ID: 23646834
dprochownik:
I get your point,
and I have tried your code and mine, as below.

Inc(vResult,FileWordList.Count); which I think is equivalent to your
vResult := vResult + FileWordList.count;

However, as you said earlier, if my string is

"how are you.In good condition"

it is counted as 5 words instead of 6.
How do I solve that?
 
LVL 4

Expert Comment

by:dprochownik
ID: 23648073
Set the dot '.' as a delimiter: add it to the cDelimiters set. If you do so, the algorithm will treat dots the same way as spaces, and 'you.In' will be two words, not one.
That's why I advise you to add as many characters as possible to the cDelimiters set, '(', ')', '?' etc., because there is always a probability that someone won't put a space after '?'.
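
The effect of the delimiter set can be tried on a plain string, without any file. The sketch below is an illustration only: CountWordsInString and TCharSet are hypothetical names, not part of the code posted in this thread, and the function assumes single-byte (ANSI) text.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  TCharSet = set of AnsiChar;

// Counts maximal runs of non-delimiter characters in S.
function CountWordsInString(const S: AnsiString; const Delims: TCharSet): Integer;
var
  i: Integer;
  InWord: Boolean;
begin
  Result := 0;
  InWord := False;
  for i := 1 to Length(S) do
    if S[i] in Delims then
      InWord := False
    else
    begin
      if not InWord then
        Inc(Result); // count each word once, at its first character
      InWord := True;
    end;
end;
```

With only a space in the set, CountWordsInString('how are you.In good condition', [' ']) returns 5; with [' ', '.'] it returns 6. Counting a word at its first character also means a word at the very end of the text is not lost.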
 
LVL 38

Expert Comment

by:Geert Gruwez
ID: 23648081
>>tankergoblin
also you store a bunch of character in array where it takes space

what is wrong with that ?
 
LVL 4

Expert Comment

by:dprochownik
ID: 23648153
Of course you can use another approach, as in the code below. You can declare a set of characters which can be used in words, and all other characters will be treated as delimiters. But you have to know that this is VERY DANGEROUS, because you have to declare the set of all allowed characters, which can be hard to do if your app will be used on text written with a keyboard language different from yours.
With this approach, if the source text can be written in a language other than English, you have to add all national characters to the cAllowed set too.

function WordCount(const pFileName: String): Integer;
const
  cAllowed = ['a'..'z','A'..'Z','-'];
var
  fFile: TFileStream;
  vBuffer: array[0..1023] of AnsiChar; //one byte per char, also on Unicode Delphi
  vWord: Boolean;
  vi, vBufSize: Integer;
begin
  result := 0;
  if not FileExists(pFileName) then exit;
  try
    fFile := TFileStream.Create(pFileName,fmOpenRead);
  except
    exit; //Access denied for file or other errors
  end;
  try
    //Reading from file and counting;
    vWord := false;
    while fFile.Position < fFile.Size do
    begin
      vBufSize := fFile.Read(vBuffer,sizeOf(vBuffer));

      //Read returns the number of bytes read, so the last valid index is vBufSize-1
      for vi := 0 to vBufSize-1 do
      begin
        if vBuffer[vi] in cAllowed then
          vWord := true
        else begin
          if vWord then
            inc(result);
          vWord := false;
        end;
      end;
    end;
    //Count the last word when the file does not end with a non-allowed char
    if vWord then
      inc(result);
  finally
    FreeAndNil(fFile);
  end;
end;
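
To make the danger of the allowed-set approach concrete, here is a small string-based sketch. CountAllowedWords is a hypothetical helper written for this illustration, and it assumes single-byte (ANSI) text in which #239 stands for a national character (i with diaeresis in Windows-1252).

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

const
  cAllowed = ['a'..'z', 'A'..'Z', '-'];

// String-based variant of the allowed-set idea: any character
// outside cAllowed ends the current word.
function CountAllowedWords(const S: AnsiString): Integer;
var
  i: Integer;
  InWord: Boolean;
begin
  Result := 0;
  InWord := False;
  for i := 1 to Length(S) do
    if S[i] in cAllowed then
    begin
      if not InWord then
        Inc(Result); // a new word starts here
      InWord := True;
    end
    else
      InWord := False;
end;
```

CountAllowedWords('naive') gives 1, but CountAllowedWords('na'#239've') gives 2, because the national character is not in cAllowed and splits the word. Digits are not in the set either, so a text such as '2009' counts as zero words.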

 
LVL 7

Author Comment

by:tankergoblin
ID: 23655722
How can cDelimiters be applied in my code?
 
LVL 4

Accepted Solution

by:
dprochownik earned 2000 total points
ID: 23657276
In your code you could do it as below, but in my opinion it is completely inefficient because:
  1. your code copies all of the file data into memory at FileWordList.LoadFromFile(OpenDialog.Filename);
    so the memory manager has to assign quite a large block of memory for larger files,
  2. processing TStringList.DelimitedText := ...... reads all of this data, and you have to do this for every delimiter, so your code reads and processes the whole block of data as many times as there are delimiters declared.
Why don't you just copy the code from my first post, which:
  1. holds a maximum of 1024 bytes of data in memory at once (using TFileStream),
  2. reads the data and checks for all delimiters only once,
so it is much more efficient.

const
  cDelimiters: array[0..41] of char = (#0,#1,#2,#3,#4,#5,#6,#7,#8,#9,#10,#11,#12,#13,#14,#15,#16,#17,#18,#19,#20,#21,#22,#23,#24,#25,#26,#27,#28,#29,#30,#31,
' ','.',',','?','(',')','[',']','\','/');
var
  FileWordList: TStringList;
  i: integer;
  OpenDialog: TOpenDialog;
begin
  FileWordList := TStringList.Create;
  try
    OpenDialog := TOpenDialog.create(self);
    try
      if openDialog.execute then
        FileWordList.LoadFromFile(OpenDialog.Filename);
    finally
      Opendialog.Free;
    end;

    //Split the whole text again for every delimiter in the array
    for i := 0 to high(cDelimiters) do
    begin
      FileWordList.Delimiter := cDelimiters[i];
      FileWordList.DelimitedText := FileWordList.Text;
    end;

    showmessage(intToStr(FileWordList.Count));
  finally
    FileWordList.Free;
  end;
end;
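
A possible middle way, if the file is small enough to keep in a TStringList, is to load it once and test every character against a delimiter set in a single pass, instead of re-splitting the text for each delimiter. The following is a minimal sketch under stated assumptions: CountWordsInLines is a hypothetical name, '!' has been added to the set as an example, and the plain "in" test assumes single-byte (ANSI) strings as in Delphi 7 or Free Pascal.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  Classes;

const
  // Same members as the thread's delimiter list, plus '!' as an example addition.
  cDelimiters = [#0..#31, ' ', '.', ',', '?', '!', '(', ')', '[', ']', '\', '/'];

// Counts the words in every line of Lines, testing each character once.
function CountWordsInLines(Lines: TStrings): Integer;
var
  i, j: Integer;
  Line: string;
  InWord: Boolean;
begin
  Result := 0;
  for i := 0 to Lines.Count - 1 do
  begin
    Line := Lines[i];
    InWord := False; // a line break always ends a word
    for j := 1 to Length(Line) do
      if Line[j] in cDelimiters then
        InWord := False
      else
      begin
        if not InWord then
          Inc(Result); // count each word once, at its first character
        InWord := True;
      end;
  end;
end;
```

After FileWordList.LoadFromFile(...), a call such as showmessage(intToStr(CountWordsInLines(FileWordList))) would display the total. The whole file is still held in memory, so the stream version remains the better choice for large files.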
