  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 316

Perl/UNIX shell command

Is there a Perl/UNIX shell command (a single command) that takes an input file and produces the output below, counting the occurrences of "myword" in each line?

myword 2
myword 3
myword 1

Input file:
This is myword and another myword
There is no myword
This is myword and another myword another myword
There is no myword
This is myword only
Asked:
toooki
5 Solutions
 
nemws1 Commented:
I'm sure ozo will have a one-liner for this, but I prefer more readable code.  Put this into a script (I name mine 'countword.pl'):

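#!/usr/bin/perl
use strict;
use warnings;
# Word to count, taken from the first command-line argument.
my $target = shift(@ARGV);
while (<STDIN>) {
    # Assigning the matches to an empty list in scalar context yields their count;
    # \Q...\E keeps any regex metacharacters in the word literal.
    my $count = () = $_ =~ m/\Q$target\E/gi;
    print "$target $count\n";
}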



Run with:
./countword.pl myword < your_text_file

 
wilcoxon Commented:
I'm not ozo but here's a one-liner:
perl -ne '$word = "myword"; print $word, " ", scalar(()=m{\b$word\b}g), "\n"' your_text_file

 
nemws1 Commented:
Yeah, but you're a pretty darn good perl expert as well, wilcoxon. :)
 
woolmilkporc Commented:
There is always awk:

awk '{W="myword"; for(i=1;i<=NF;i++) if($i==W) c++; if(c>0) print W, c; c=0}' inputfile
 
woolmilkporc Commented:
A really ugly one, just for fun:

W="myword"; sed "s/$W/@/g" inputfile | grep -w '@' | while read line; do echo  -en "$W\t"; echo $line |tr -dc '@' |wc -c; done

 
ozo Commented:
Given the input file:
This is myword and another myword
There is no myword
This is myword and another myword another myword
There is no myword
This is myword only

shouldn't the counts be
myword 2
myword 1
myword 3
myword 1
myword 1
?
Or, if we are counting /(?<!\bno )\bmyword\b/
would that be
myword 2
myword 0
myword 3
myword 0
myword 1
?
Or, if there is a "no" on the line, would we ignore the entire line, or just the "myword" immediately following the "no"?
What would you want the count to be for
This is myword and no myword
?
 
toooki (Author) Commented:
I am sorry for incorrectly writing the question:

Given the input file:
This is myword and another myword
There is no m y w o r d
This is myword and another myword another myword
There is no my word
This is myword only

The output that I am looking for is:
myword 2
myword 3
myword 1
 
ozo Commented:
perl -lne 'BEGIN{$w=shift}print "$w ".@F if @F=/\b\Q$w\E\b/g' 'myword'  your_text_file
 
wilcoxon Commented:
My solution only needs a simple change to work with your modified requirements:
perl -ne '$word = "myword"; $c = scalar(()=m{\b$word\b}g); print $word, " ", $c, "\n" if $c' your_text_file

 
skullnobrains Commented:
I like woolmilkporc's idea, so here is a simpler one:

sed -n "s/MYWORD/@/gp" inputfile | tr -c -d "@\n" | while read line ; do echo -n 'MYWORD : ' ; expr "$line" : '.*' ; done



i'm assuming the file does not contain '@' but you can use something like £ or µ as well

can't figure out one without a while but there should be a way to make it even simpler
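
One possible way to drop the while loop, offered only as a sketch: awk's gsub() returns the number of replacements it made, so replacing each match with itself ("&") counts the matches without changing the line. The function name count_with_gsub and the temporary file are made up for illustration. Like the sed version, this matches substrings rather than whole words, and the word is treated as a regular expression.

```shell
# Corrected sample input from the asker (the temp file name is arbitrary).
input=$(mktemp)
cat > "$input" <<'EOF'
This is myword and another myword
There is no m y w o r d
This is myword and another myword another myword
There is no my word
This is myword only
EOF

count_with_gsub() {
    # n holds the per-line match count; the second rule prints only when n is non-zero.
    awk -v w="$1" '{ n = gsub(w, "&") } n { print w, n }' "$2"
}

count_with_gsub myword "$input"
# prints:
# myword 2
# myword 3
# myword 1
```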
 
toooki (Author) Commented:
Many thanks:
perl -lne 'BEGIN{$w=shift}print "$w ".@F if @F=/\b\Q$w\E\b/g' 'myword'  your_text_file
perl -ne '$word = "myword"; $c = scalar(()=m{\b$word\b}g); print $word, " ", $c, "\n" if $c' your_text_file

Both of the above worked for me. Could you kindly point me to a tutorial where I could learn such parsing commands, so that I can try similar parsing tasks myself?

Many thanks!