sdesar
asked on
counting words
I know how to count words and their frequencies in a text file.
However, I do not know how to count the words and their frequencies within each paragraph.
Any suggestions greatly appreciated.
Thanks
NOOOOOOO!
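If you would like to keep the script in a file, save this as cnt.pl:
#!/usr/bin/perl -00 -n
# -00 reads the input one paragraph at a time; -n loops over every record
@w = /\w+/g; print scalar(@w)."\n";   # scalar(@w) is the word count ($#w is only the last index)
and call it from the command line:
cnt.pl file_1 file_2 file_3
where file_X is the name of a text file whose words you want to count per paragraph.
Good luck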
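For anyone who prefers a standalone script over command-line switches, here is a minimal sketch of the same idea. It assumes that paragraphs are separated by blank lines and that a "word" is any run of \w characters; the sub name paragraph_word_counts and the sample text are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the number of words in each paragraph read from a filehandle.
# Hypothetical helper: the name is not from the original answer.
sub paragraph_word_counts {
    my ($fh) = @_;
    local $/ = "";                  # paragraph mode, the in-script equivalent of -00
    my @counts;
    while (my $para = <$fh>) {
        my @words = $para =~ /\w+/g;
        push @counts, scalar @words;
    }
    return @counts;
}

# Demonstrate on an in-memory "file" holding two paragraphs.
my $sample = "one two three\nfour\n\nfive six\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print join(",", paragraph_word_counts($fh)), "\n";   # prints 4,2
close $fh;
```

Setting $/ to the empty string is what -00 does behind the scenes, so the loop sees one paragraph per iteration.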
ASKER
This seems to count only the words, not each word and its frequency in the individual paragraphs.
Anything else that I should do?
PS. Thanks for your time on these suggestions.
Well, you have all the words in the @w array. If you want frequencies, then add
%wc = ();                 # reset the tally so each paragraph is counted separately
map { $wc{$_}++; } @w;
foreach $wd (keys %wc) { print $wd.':'.$wc{$wd}."\n"; }
print "---------------\n";
This will additionally print every word and the number of occurrences of that word in the paragraph.
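The tallying step can also be sketched as a small sub that builds a fresh hash on every call, so counts from one paragraph cannot leak into the next. The sub name word_freq, the lowercasing of words, and the sorted output are assumptions added here for illustration, not part of the answer above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a hash reference mapping each word in $para to its count.
# Words are lowercased first (an assumption, so "The" and "the" count together).
sub word_freq {
    my ($para) = @_;
    my %count;
    $count{ lc $_ }++ for $para =~ /\w+/g;
    return \%count;
}

my $freq = word_freq("The cat saw the other cat");

# Most frequent words first; ties are broken alphabetically.
for my $word (sort { $freq->{$b} <=> $freq->{$a} || $a cmp $b } keys %$freq) {
    print "$word:$freq->{$word}\n";
}
```

Because %count is declared inside the sub, there is nothing to reset between paragraphs.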
What's a word?
ASKER
I tested this and it works.
How do I list the paragraph numbers, i.e.
Parah1
word freq
Parah2
word freq
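#!/usr/bin/perl -00 -n
print "Parah ".$..":";                 # $. is the paragraph number under -00
@w = /\w+/g; print scalar(@w)."\n";    # number of words in this paragraph
%wc = ();                              # start a fresh tally for each paragraph
map { $wc{$_}++; } @w;
foreach $wd (keys %wc) { print "\t".$wd."\t".$wc{$wd}."\n"; }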
ASKER
Thanks monas!!
I gave U excellent points.
Have they been recorded?
Yes, TNX
ASKER
How can I use perl for
word recognition?
Example - If there are a bunch of words in a text file like -
this text is derived from the book and to see from information on deriving check out the textbook.
Since "derived" and "deriving" both stem from the root "derive", how can I use Perl to parse the text and recognize DERIVE?
use Lingua::Stem qw(:all);
set_locale('en');
#add_exceptions({derived=> 'DERIVE', deriving=>'DERIVE'});
#print "@{stem(qw(Since derived and deriving stem from the root - derive. How can I use perl to parse the text and recognize DERIVE'))}\n";
print "@{stem(map{/(\w+)/g}<>)}\n";
ASKER
Thanks ozo!!
It works as expected.
OZO or MONAS-
I am implementing the paragraph and word counting program in a web application.
I wanted to know how this routine will handle multiple files.
That is, if I have one text file, file_1.in, I want the word and frequency count for it saved in file_1.out.
Then, if I generate a similar count on another file, file_2.in, I want the results stored in file_2.out.
What's an efficient way to handle a frequency count on multiple files?
Also, is map() a function in Perl, and is it using a list data structure to perform the word counts?
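One possible way to handle several files is sketched below: each input named on the command line gets its own output file, with the .in suffix swapped for .out. This is only a sketch under assumptions; the sub name count_file, the sorted output, and the rule of skipping names that do not end in .in are all illustrative choices. On the map() question: map is a built-in Perl function that evaluates a block for each element of a list, and the counts themselves are kept in a hash (%wc in the earlier snippets), not in a list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Write per-paragraph word frequencies for file $in to file $out.
# Returns the number of paragraphs processed. Hypothetical helper name.
sub count_file {
    my ($in, $out) = @_;
    open my $ifh, '<', $in  or die "Cannot read $in: $!";
    open my $ofh, '>', $out or die "Cannot write $out: $!";
    local $/ = "";                      # paragraph mode, like -00
    my $n = 0;
    while (my $para = <$ifh>) {
        $n++;
        my %count;                      # fresh hash for every paragraph
        $count{$_}++ for $para =~ /\w+/g;
        print {$ofh} "Parah $n:\n";
        print {$ofh} "\t$_\t$count{$_}\n" for sort keys %count;
    }
    close $ofh or die "Cannot close $out: $!";
    close $ifh;
    return $n;
}

# Each NAME.in given on the command line produces NAME.out.
for my $in (@ARGV) {
    (my $out = $in) =~ s/\.in$/.out/ or next;   # skip names not ending in .in
    count_file($in, $out);
}
```

Saved under a name such as cnt_files.pl (also made up), it could be run as: perl cnt_files.pl file_1.in file_2.in. Each file is read once and its hash is discarded after every paragraph, so memory use stays small even with many files.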
ASKER
Unrecognized file test: -n at line 3
I made some assumptions-
I typed
#!/usr/bin/perl
per -00 -n -e '@.........."\n"list_of_fi
I am assuming that list_of_files is a file that contains the text.
I thought that there was a semicolon missing after list_of_files;
however, that didn't solve it either.
Any other suggestions?