Solved

frequency of words

Posted on 2000-03-20
Last Modified: 2010-03-05
If I have a variable $test that contains a block of text like this:

"this is a test file when i will test out this program. the test will
now be set. i have to test out this program as the test will be used when i test out other programs."

and another variable $word which is "test"

(bear in mind that $test and $word can each be anything,
so I'd like a program that works on any example)

what I'd like to print out is
file 1
out 3
will 2

i.e. after "test" the word "file" occurs once,
the word "out" occurs 3 times, and the word "will" occurs twice.

Would anyone have the code to print out this data given the
$test and $word variables?

thanks
Question by:boofulls
13 Comments
 
LVL 4

Accepted Solution

by:
binkzz earned 200 total points
ID: 2636208
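#!/usr/bin/perl

my $text = "this is a test file when i will test out this program. the test will now be set. i have to test out this program as the test will be used when i test out other programs.";

my $word = "test";

my %found;             # word that follows $word => number of occurrences
my $item;              # the word captured by the current match
my $templine = $text;  # work on a copy so $text is left untouched

# Find " $word <next word>", record the next word, then cut the match
# out of the copy so the loop moves on to the next occurrence.
# Because of the leading space, $word at the very start of the text is skipped.
while ($templine =~ m/ $word (\w+)/s)
{
  $templine =~ s/ $word (\w+)//s;
  $item = $1;

  if (!$found{$item})
  {
    $found{$item} = 1;
  } else
  {
    $found{$item}++;
  }
}

# Print every word seen after $word with its count (hash order is arbitrary).
foreach my $key (keys %found)
{
  print "Word [$key] - Count - [$found{$key}]\n";
}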
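
For comparison, here is a minimal sketch of the same count done with a global match instead of cutting matches out of a copy. It is not binkzz's code: the helper name following_counts is made up for illustration, and the sketch assumes that $word should be treated as literal text (hence \Q...\E) and matched only as a whole word (hence \b). The next word is captured inside a lookahead so that it can itself be matched as $word on the next pass.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the words that directly follow $word in $text.
# following_counts is a hypothetical helper name, not from the thread.
sub following_counts {
    my ($text, $word) = @_;
    my %found;
    # \b        : $word must start at a word boundary ("contest" is ignored)
    # \Q...\E   : treat $word as literal text, not as a pattern
    # (?=(\w+)) : capture the next word without consuming it
    # /g        : each loop pass resumes where the previous match ended
    while ($text =~ /\b\Q$word\E\s+(?=(\w+))/g) {
        $found{$1}++;
    }
    return %found;
}

my $text = "this is a test file when i will test out this program. the test will now be set. i have to test out this program as the test will be used when i test out other programs.";

my %found = following_counts($text, "test");

# Sorted by word, this prints: file 1, out 3, will 2 (one pair per line).
print "$_ $found{$_}\n" for sort keys %found;
```

Unlike the pattern with a leading space, this version also counts $word when it is the very first word of the text.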

Author Comment

by:boofulls
ID: 2636750
Adjusted points from 100 to 150

Author Comment

by:boofulls
ID: 2636751
Thanks!
Could you adjust the code so
that it can also do it for
file test
out test
will test
(i.e. $word comes after the result this time
instead of before)?

LVL 4

Expert Comment

by:binkzz
ID: 2636796
Replace the while loop with:


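# Same loop, but the captured word now comes BEFORE $word.
# The spaces on both sides mean $word followed by punctuation is skipped.
while ($templine =~ m/ (\w+) $word /s)
{
  $templine =~ s/ (\w+) $word //s;
  $item = $1;

  if (!$found{$item})
  {
    $found{$item} = 1;
  } else
  {
    $found{$item}++;
  }
}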
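
The "word before" variant can also be sketched with a global match. As before, this is only an illustration under assumptions: preceding_counts is a made-up helper name, $word is treated as literal text, and it only has to end at a word boundary, so "test." and "test," count as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the words that come directly before $word in $text.
# preceding_counts is a hypothetical helper name, not from the thread.
sub preceding_counts {
    my ($text, $word) = @_;
    my %found;
    # (\w+)\s+ captures a whole word plus the whitespace after it;
    # the lookahead then checks that $word follows, without consuming it,
    # so $word can itself be the captured word of the next match.
    while ($text =~ /\b(\w+)\s+(?=\Q$word\E\b)/g) {
        $found{$1}++;
    }
    return %found;
}

my $text = "this is a test file when i will test out this program. the test will now be set. i have to test out this program as the test will be used when i test out other programs.";

my %found = preceding_counts($text, "test");

# Sorted by word, this prints: a 1, i 1, the 2, to 1, will 1 (one pair per line).
print "$_ $found{$_}\n" for sort keys %found;
```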

Author Comment

by:boofulls
ID: 2639488
Adjusted points from 150 to 200

Author Comment

by:boofulls
ID: 2639489
Just one final question, honestly ;)

Say that $word is "hello".
It doesn't seem to pick up on phrases
in $text such as

Hello There

(note the capital H at the start of the "Hello"
that is in $text).
Can you make sure that it picks up all instances
of the word (in this case "hello"), even if it
is "Hello there" or "HELLO there" etc.?
Thanks
LVL 4

Expert Comment

by:binkzz
ID: 2639498
If you want the search to match both upper and lower case, add the i modifier after the regular expression:

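# The i modifier makes both the match and the substitution ignore case,
# so "Hello", "HELLO" and "hello" are all found.
while ($templine =~ m/ (\w+) $word /si)
{
  $templine =~ s/ (\w+) $word //si;
  $item = $1;

  if (!$found{$item})
  {
    $found{$item} = 1;
  } else
  {
    $found{$item}++;
  }
}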
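
With the i modifier the match ignores case, but the captured words still keep their own case, so "There" and "there" would be counted under separate keys. The sketch below shows one way to merge them by lowercasing the key. That merging is an assumption on my part rather than something asked for in the thread, and following_counts_ci is a made-up helper name. It uses the "word after" direction from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Case-insensitive count of the words that directly follow $word.
# following_counts_ci is a hypothetical helper name, not from the thread.
sub following_counts_ci {
    my ($text, $word) = @_;
    my %found;
    # /i ignores case when matching $word; /g walks every occurrence.
    while ($text =~ /\b\Q$word\E\s+(?=(\w+))/gi) {
        # Assumption: "There" and "there" should share one count,
        # so the key is folded to lower case.
        $found{ lc($1) }++;
    }
    return %found;
}

my %found = following_counts_ci("Hello there. HELLO There, hello world", "hello");

# Sorted by word, this prints: there 2, world 1 (one pair per line).
print "$_ $found{$_}\n" for sort keys %found;
```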

Author Comment

by:boofulls
ID: 2639707
thanks
LVL 84

Expert Comment

by:ozo
ID: 2640175
Why if(!$found{$item}) ?
LVL 4

Expert Comment

by:binkzz
ID: 2640203
I wasn't certain whether $found{$item}++ would start counting from 0 as an integer if I had not used $found{$item} before.

Why are you questionning my programming!!! :)

Binkzz
LVL 4

Expert Comment

by:binkzz
ID: 2640214
That was meant to be 'questioning' before you start doubting my spelling as well. 8)

Author Comment

by:boofulls
ID: 2640515
What would you recommend, ozo?
LVL 4

Expert Comment

by:binkzz
ID: 2640940
$found{$item}++;

would work just as well on its own, but I wasn't certain, so I added an additional check.

Tom
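
A small sketch of the point under discussion: in Perl, ++ on a hash element that does not exist yet treats the missing value as 0, and it does so without an "uninitialized" warning even under use warnings. The word list here is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %found;

# No existence check is needed: the first ++ on a new key yields 1.
$found{$_}++ for qw(out file out will out will);

# Sorted by word, this prints: file 1, out 3, will 2 (one pair per line).
print "$_ $found{$_}\n" for sort keys %found;
```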