Solved

Perl/UNIX shell command

Posted on 2013-07-01
Medium Priority
315 Views
Last Modified: 2013-11-27
Is there a single Perl/UNIX shell command that will take an input file and output this (count the occurrences of "myword" in each line):

myword 2
myword 3
myword 1

Input file:
This is myword and another myword
There is no myword
This is myword and another myword another myword
There is no myword
This is myword only
Question by:toooki
11 Comments
 
LVL 23

Assisted Solution

by:nemws1
nemws1 earned 400 total points
ID: 39290627
I'm sure ozo will have a one-liner for this, but I prefer more readable code.  Put this into a script (I name mine 'countword.pl'):

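#!/usr/bin/perl
use strict;
use warnings;

# Word to count, taken from the first command-line argument.
my $target = shift(@ARGV);
while (<STDIN>) {
    # Assigning the match list to () in scalar context yields the number of matches.
    # \Q...\E quotes any regex metacharacters in the word.
    my $count = () = $_ =~ m/\Q$target\E/gi;
    print "$target $count\n";
}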

Run with:
./countword.pl myword < your_text_file

 
LVL 27

Assisted Solution

by:wilcoxon
wilcoxon earned 400 total points
ID: 39291246
I'm not ozo but here's a one-liner:
perl -ne '$word = "myword"; print $word, " ", scalar(()=m{\b$word\b}g), "\n"' your_text_file

 
LVL 23

Expert Comment

by:nemws1
ID: 39291251
Yeah, but you're a pretty darn good perl expert as well, wilcoxon. :)

 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39292476
There is always awk:

awk '{W="myword"; for(i=1;i<=NF;i++) if($i==W) c++; if(c>0) print W, c; c=0}' inputfile
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39292657
A really ugly one, just for fun:

W="myword"; sed "s/$W/@/g" inputfile | grep -w '@' | while read line; do echo  -en "$W\t"; echo $line |tr -dc '@' |wc -c; done

 
LVL 84

Assisted Solution

by:ozo
ozo earned 800 total points
ID: 39292680
Given the input file:
This is myword and another myword
There is no myword
This is myword and another myword another myword
There is no myword
This is myword only

shouldn't the counts be
myword 2
myword 1
myword 3
myword 1
myword 1
?
Or, if we are counting /(?<!\bno )\bmyword\b/
would that be
myword 2
myword 0
myword 3
myword 0
myword 1
?
Or, if there is a "no" on the line, would we ignore the entire line, or just the "myword" immediately following the "no"?
What would you want the count to be for
This is myword and no myword
?
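
To make the two readings concrete, here is a rough awk sketch. The function name count_both is made up for illustration, and words are compared as whitespace-separated fields rather than with \b as in the regex above. For each input line it prints the word, the total count, and the count that skips a "myword" immediately preceded by "no".

```shell
# count_both WORD: read lines on stdin and print "WORD total kept" per line.
# Hypothetical helper; "kept" ignores an occurrence whose previous field is "no".
count_both() {
    awk -v w="$1" '{
        all = 0; kept = 0
        for (i = 1; i <= NF; i++) {
            if ($i == w) {
                all++
                # Count this occurrence only if it is not directly after "no".
                if (i == 1 || $(i - 1) != "no") kept++
            }
        }
        print w, all, kept
    }'
}
```

Running `echo 'This is myword and no myword' | count_both myword` shows both numbers side by side, so the asker can say which one is wanted.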
 

Author Comment

by:toooki
ID: 39292884
I am sorry for incorrectly writing the question:

Given the input file:
This is myword and another myword
There is no m y w o r d
This is myword and another myword another myword
There is no my word
This is myword only

The output that I am looking for is:
myword 2
myword 3
myword 1
 
LVL 84

Assisted Solution

by:ozo
ozo earned 800 total points
ID: 39293053
perl -lne 'BEGIN{$w=shift}print "$w ".@F if @F=/\b\Q$w\E\b/g' 'myword'  your_text_file
 
LVL 27

Expert Comment

by:wilcoxon
ID: 39293584
My solution only needs a simple change to work with your modified requirements:
perl -ne '$word = "myword"; $c = scalar(()=m{\b$word\b}g); print $word, " ", $c, "\n" if $c' your_text_file

 
LVL 27

Accepted Solution

by:skullnobrains
skullnobrains earned 400 total points
ID: 39312496
I like woolmilkporc's idea, so here is a simpler one:

sed -n "s/MYWORD/@/gp" inputfile | tr -c -d "@\n" | while read line ; do echo -n 'MYWORD : ' ; expr "$line" : '.*' ; done


I'm assuming the file does not contain '@', but you can use something like £ or µ instead.

I can't figure out a version without a while loop, but there should be a way to make it even simpler.
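
One possible loop-free variant, sketched with awk and assuming the clarified requirement (lines without a match are skipped). awk's gsub() returns the number of replacements it made, so replacing the word with itself ("&") gives a per-line count. The function name count_word is only illustrative, and the word is treated as a regular expression, so a longer token such as "mywords" would be counted as well.

```shell
# count_word WORD [FILE...]: print "WORD N" for every line that contains WORD.
# Hypothetical helper name; WORD is used as an awk regular expression.
count_word() {
    word=$1
    shift
    # gsub() replaces each match with itself and returns how many it replaced.
    awk -v w="$word" '{ n = gsub(w, "&") } n > 0 { print w, n }' "$@"
}
```

Use it as `count_word myword inputfile`; with no file argument it reads standard input.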
 

Author Comment

by:toooki
ID: 39337059
Many thanks:
perl -lne 'BEGIN{$w=shift}print "$w ".@F if @F=/\b\Q$w\E\b/g' 'myword'  your_text_file
perl -ne '$word = "myword"; $c = scalar(()=m{\b$word\b}g); print $word, " ", $c, "\n" if $c' your_text_file

Both the above worked for me. Could you kindly let me know a tutorial where I could learn such parsing commands so that I can try myself on similar parsings?

Many thanks!