tankergoblin
asked on
how to count words from file
I want to read the contents of a file and count the words in it.
The content may contain commas, spaces, and periods.
Say I have this:
Hi! Want to tell you a story:
There once was a bear, it lived in a forest and the bear love to eat meat. He has a friend named Sticky. When she came, the bear went to the beach and danced with some noodles on his head in a forest happily ever after.
Story end? Yes.
Of course, you can change the cDelimiters set :)
function WordCount(const pFileName: String): Integer;
const
  // Characters that separate words; extend this set as needed (e.g. '!', ':', ';')
  cDelimiters = [#0..#31,' ','.',',','?','(',')','[',']','\','/'];
var
  fFile: TFileStream;
  vBuffer: array[0..1023] of AnsiChar; // one byte per element, also in Unicode Delphi
  vWord: Boolean;
  vi, vBufSize: Integer;
begin
  Result := 0;
  if not FileExists(pFileName) then Exit;
  try
    fFile := TFileStream.Create(pFileName, fmOpenRead);
  except
    Exit; // access denied or another error while opening the file
  end;
  try
    // Read the file in chunks and count words as we go
    vWord := False;
    while fFile.Position < fFile.Size do
    begin
      vBufSize := fFile.Read(vBuffer, SizeOf(vBuffer));
      for vi := 0 to vBufSize - 1 do // only the bytes actually read
      begin
        if not (vBuffer[vi] in cDelimiters) then
          vWord := True
        else begin
          if vWord then
            Inc(Result);
          vWord := False;
        end;
      end;
    end;
    if vWord then
      Inc(Result); // count a last word that is not followed by a delimiter
  finally
    FreeAndNil(fFile);
  end;
end;
ASKER
500 points for 2 links?
This is what I did. The program below works with an example that contains only spaces:
how do you go
But if I add ? or !, for example:
hi! where are you going?
The above does not work with my word count program.
var
  FileWordList, FilenameList: TStringList;
  Filename: string;
  i: integer;
begin
  FilenameList := TStringList.Create;
  FileWordList := TStringList.Create;
  OpenDialog := TOpenDialog.Create(self);
  if OpenDialog.Execute then
  begin
    Filename := OpenDialog.Filename;
    FilenameList.LoadFromFile(Filename);
  end;
  OpenDialog.Free;
  for i := 0 to FilenameList.Count-1 do
  begin
    FileWordList.DelimitedText := FilenameList[i];
  end;
  ShowMessage(IntToStr(FileWordList.Count));
ASKER
Sorry, I meant *below.
Loading a file into a TStringList is OK, but only for small files. With a large file it will be inefficient.
I prefer using a TFileStream as in my example, where the file is read while it is being counted. In my opinion this is much faster than TStringList.LoadFromFile on large files.
ASKER
Also, say I have two lines:
how are you
going to city
It will only read the second line.
How do I fix this?
>>500 points for 2 links?
I thought the second link was exactly what you needed.
And do you really expect me to copy all the code in here?
I could of course write my own interpretation, but why should I reinvent the wheel all over again?
Are you saying that when you start a totally new application,
you don't use any code from your previous applications?
So you basically reinvent the wheel every time?
There is nothing wrong with providing a link to very good code,
and very well documented too ...
ASKER
What about memory usage?
I think that with a TStringList you store everything in an object, so you do not have to read the file every time you need it.
I think this will make it faster.
Furthermore, the code is easier to write and shorter.
The only problem is that it only counts the last line.
How do I fix this?
Below is your code, changed so that it counts all rows, but I think it is still wrong, because text like:
how are you.Going to city
will be counted as 5 words instead of 6, because "you.Going" is one word for it, and you can't do anything about that because a TStringList can have only one delimiter char.
The sample I gave you a few posts earlier handles that case.
var
  FileWordList, FilenameList: TStringList;
  Filename: string;
  i: integer;
  vResult: Integer;
begin
  FilenameList := TStringList.Create;
  FileWordList := TStringList.Create;
  OpenDialog := TOpenDialog.Create(self);
  if OpenDialog.Execute then
  begin
    Filename := OpenDialog.Filename;
    FilenameList.LoadFromFile(Filename);
  end;
  OpenDialog.Free;
  vResult := 0;
  for i := 0 to FilenameList.Count-1 do
  begin
    // Split the current line and add its word count to the running total
    FileWordList.DelimitedText := FilenameList[i];
    vResult := vResult + FileWordList.Count;
  end;
  ShowMessage(IntToStr(vResult));
  // Release the lists when done
  FileWordList.Free;
  FilenameList.Free;
ASKER
Also, you store a bunch of characters in an array, which takes up space.
You can modify the cDelimiters set yourself.
Chars #0..#31 are non-printable characters, like
#10 - end of line
#13 - Return
#8 - Tabulator
etc.
The ' ' char, which means space, is also defined there. You can put any chars you like in it.
I wouldn't delete #8, #10, #13, ' ', '.', ',' but it is your choice :)
#8 is backspace
#9 is tab
just so you know ...
My mistake, sorry.
ASKER
dprochownik:
I get your point.
I have tried your code and mine, as below:
Inc(vResult,FileWordList.Count); which I think is equivalent to your
vResult := vResult + FileWordList.count;
However, as you said earlier, if my string is
"how are you.In good condition"
it will be counted as 5 words instead of 6.
How do I solve this?
Set the dot '.' as a delimiter by adding it to the cDelimiters set. If you do so, the algorithm will treat dots the same as spaces, and 'you.In' will be two words, not one.
That's why I advise you to add as many characters as possible to the cDelimiters set, such as '(', ')', '?' etc., because there is always a probability that someone won't put a space after '?'.
>>tankergoblin
Also, you store a bunch of characters in an array, which takes up space.
What is wrong with that?
Of course, you can use another approach, as in the code below. You can declare the set of characters which can be used in words, and all other characters will be treated as delimiters. But you have to know that this is VERY DANGEROUS, because you have to declare the set of all allowed characters, which can be hard to do if your app will be used on text written with a keyboard language different from yours.
With this approach, if the source text can be written in a language other than English, you have to add all national characters to the cAllowed set too.
function WordCount(const pFileName: String): Integer;
const
  // Only these characters can be part of a word; everything else is a delimiter
  cAllowed = ['a'..'z','A'..'Z','-'];
var
  fFile: TFileStream;
  vBuffer: array[0..1023] of AnsiChar; // one byte per element, also in Unicode Delphi
  vWord: Boolean;
  vi, vBufSize: Integer;
begin
  Result := 0;
  if not FileExists(pFileName) then Exit;
  try
    fFile := TFileStream.Create(pFileName, fmOpenRead);
  except
    Exit; // access denied or another error while opening the file
  end;
  try
    // Read the file in chunks and count words as we go
    vWord := False;
    while fFile.Position < fFile.Size do
    begin
      vBufSize := fFile.Read(vBuffer, SizeOf(vBuffer));
      for vi := 0 to vBufSize - 1 do // only the bytes actually read
      begin
        if vBuffer[vi] in cAllowed then
          vWord := True
        else begin
          if vWord then
            Inc(Result);
          vWord := False;
        end;
      end;
    end;
    if vWord then
      Inc(Result); // count a last word that is not followed by a delimiter
  finally
    FreeAndNil(fFile);
  end;
end;
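One possible way to soften the national-characters problem, as a rough sketch only: treat every byte with a value of 128 or more as part of a word. This assumes the text is in a single-byte code page or in UTF-8, where letters outside English are stored as such bytes. The names IsWordByte and CountWordsAllowed are made up for this illustration, and the trade-off is that non-ASCII punctuation (for example typographic quotes) will also be counted as word characters.

```pascal
// Hypothetical helper: True when the byte C should count as part of a word.
function IsWordByte(C: AnsiChar): Boolean;
begin
  // ASCII letters and '-' as in cAllowed, plus every byte >= 128
  // (assumption: those bytes belong to national letters).
  Result := (C in ['a'..'z', 'A'..'Z', '-']) or (Ord(C) >= 128);
end;

// Counts the words in S using the "allowed characters" rule.
function CountWordsAllowed(const S: AnsiString): Integer;
var
  i: Integer;
  InWord: Boolean;
begin
  Result := 0;
  InWord := False;
  for i := 1 to Length(S) do
    if IsWordByte(S[i]) then
    begin
      if not InWord then
        Inc(Result); // first character of a new word
      InWord := True;
    end
    else
      InWord := False;
end;
```

Because the counter is incremented at the start of each word, a last word without a trailing delimiter is counted too.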
ASKER
How can cDelimiters be applied in my code?
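One way the delimiter idea could be combined with the TStringList approach, given as a sketch and not as the accepted solution from this thread: load the file into a TStringList as before, but count the words of each line with a small function instead of DelimitedText. The names IsWordDelimiter, CountWordsInLine and CountWordsInFile are made up for this illustration, and the delimiter list is the one from the posts above plus '!', ':' and ';'.

```pascal
// Requires Classes in the uses clause (for TStringList).

// Hypothetical helper: True when C separates words.
function IsWordDelimiter(C: Char): Boolean;
begin
  case C of
    #0..#32, '.', ',', '?', '!', ':', ';', '(', ')', '[', ']', '\', '/':
      Result := True;
  else
    Result := False;
  end;
end;

// Counts the words in one line of text.
function CountWordsInLine(const S: string): Integer;
var
  i: Integer;
  InWord: Boolean;
begin
  Result := 0;
  InWord := False;
  for i := 1 to Length(S) do
    if IsWordDelimiter(S[i]) then
      InWord := False
    else
    begin
      if not InWord then
        Inc(Result); // first character of a new word
      InWord := True;
    end;
end;

// Sums the word counts of all lines in the file.
// LoadFromFile raises an exception if the file cannot be opened.
function CountWordsInFile(const FileName: string): Integer;
var
  Lines: TStringList;
  i: Integer;
begin
  Result := 0;
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    for i := 0 to Lines.Count - 1 do
      Inc(Result, CountWordsInLine(Lines[i]));
  finally
    Lines.Free;
  end;
end;
```

With this, the form code shrinks to something like ShowMessage(IntToStr(CountWordsInFile(OpenDialog.Filename))); and a line such as "how are you.In good condition" is counted as 6 words.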
ASKER CERTIFIED SOLUTION
http://www.delphi3000.com/articles/article_524.asp?SK=
http://www.delphibasics.co.uk/Article.asp?Name=OOExamplePlus