hankknight asked:

Array with words and frequency

I want an array of words and the number of times each word occurs.

Something like this:

Array
(
    [this] => 2
    [is] => 3
    [a] => 3
)
<pre><?php 
 
$text = "This is a test and this is only a test.  It is a test to test ability and the capability of this code.";
 
preg_match_all('/[\'0-9\-\x41-\x5a\x5f\x61-\x7a\xc0-\xd6\xd8-\xf6\xf8-\xff]+/', strlower($text), $words);
 
var_dump($words);
 
?></pre>


Richard Quadling

Outputs ...

Array
(
    [this] => 3
    [is] => 3
    [a] => 3
    [test] => 4
    [and] => 2
    [only] => 1
    [it] => 1
    [to] => 1
    [ability] => 1
    [the] => 1
    [capability] => 1
    [of] => 1
    [code] => 1
)
<?php 
 
$text = "This is a test and this is only a test.  It is a test to test ability and the capability of this code.";
 
preg_match_all('/[\'0-9\-\x41-\x5a\x5f\x61-\x7a\xc0-\xd6\xd8-\xf6\xf8-\xff]+/', strtolower($text), $words);
 
$a_CountedWords = array();
foreach($words[0] as $word)
	{
	if (!isset($a_CountedWords[$word]))
		{
		$a_CountedWords[$word] = 0;
		}
	++$a_CountedWords[$word];
	}
 
print_r($a_CountedWords);


Terry Woods
Output:
This is a test and this is only a test.  It is a test to test ability and the capability of this code.Array
(
    [0] => this
    [1] => is
    [2] => a
    [3] => test
    [4] => and
    [5] => this
    [6] => is
    [7] => only
    [8] => a
    [9] => test
    [10] => it
    [11] => is
    [12] => a
    [13] => test
    [14] => to
    [15] => test
    [16] => ability
    [17] => and
    [18] => the
    [19] => capability
    [20] => of
    [21] => this
    [22] => code
)
Array
(
    [this] => 3
    [is] => 3
    [a] => 3
    [test] => 4
    [and] => 2
    [only] => 1
    [it] => 1
    [to] => 1
    [ability] => 1
    [the] => 1
    [capability] => 1
    [of] => 1
    [code] => 1
)


<pre><?php 
 
$text = "This is a test and this is only a test.  It is a test to test ability and the capability of this code.";
echo $text; 
preg_match_all('/[\'0-9\-\x41-\x5a\x5f\x61-\x7a\xc0-\xd6\xd8-\xf6\xf8-\xff]+/', strtolower($text), $words);
 
print_r($words[0]);
 
foreach($words[0] as $key=>$value) {
  $wordCount[$value]++;
}
 
print_r($wordCount);
 
?></pre>


TerryAtOpus,

If you have error_reporting(E_ALL); set, you get the following messages.

Notice: Undefined variable: wordCount in C:\uw2.php on line 10

Notice: Undefined index:  this in C:\uw2.php on line 10

Notice: Undefined index:  is in C:\uw2.php on line 10

Notice: Undefined index:  a in C:\uw2.php on line 10

Notice: Undefined index:  test in C:\uw2.php on line 10

Notice: Undefined index:  and in C:\uw2.php on line 10

Notice: Undefined index:  only in C:\uw2.php on line 10

Notice: Undefined index:  it in C:\uw2.php on line 10

Notice: Undefined index:  to in C:\uw2.php on line 10

Notice: Undefined index:  ability in C:\uw2.php on line 10

Notice: Undefined index:  the in C:\uw2.php on line 10

Notice: Undefined index:  capability in C:\uw2.php on line 10

Notice: Undefined index:  of in C:\uw2.php on line 10

Notice: Undefined index:  code in C:\uw2.php on line 10
Array
(
    [this] => 3
    [is] => 3
    [a] => 3
    [test] => 4
    [and] => 2
    [only] => 1
    [it] => 1
    [to] => 1
    [ability] => 1
    [the] => 1
    [capability] => 1
    [of] => 1
    [code] => 1
)


Whilst PHP still runs this and produces the right counts, each notice means PHP has had to guess: it treats the missing entry as null and increments it to 1. In most cases the guess is fine, but it is still a guess.
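
One way to avoid those notices in the loop above is to give each word a starting count before adding to it. This is a minimal sketch, assuming PHP 7.0 or later for the `??` operator; the short word list is a hypothetical stand-in for `$words[0]`.

```php
<?php
// Hypothetical stand-in for $words[0] as returned by preg_match_all().
$words = array('this', 'is', 'a', 'test', 'and', 'this', 'is', 'only', 'a', 'test');

// Declare the array first so PHP does not have to create it implicitly.
$wordCount = array();

foreach ($words as $word) {
    // Read the current count, falling back to 0 for a word not seen yet,
    // then store the incremented value.
    $wordCount[$word] = ($wordCount[$word] ?? 0) + 1;
}

print_r($wordCount);
```

On older PHP versions, the isset() check in Richard's snippet earlier in the thread does the same job.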

ASKER CERTIFIED SOLUTION
ddrudik

Oh. Nice one ddrudik. I wondered why I had to write code for the function which I was sure existed.

I spotted the missing 'to' too.
hankknight, thanks for the question and the points.

RQuadling, thanks, I recalled looking that up for a previous question.
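
The accepted answer is not visible here, so what follows is an assumption rather than a reconstruction of it. PHP's built-in array_count_values() counts how often each value occurs in an array, which would fit the remark about a function that "I was sure existed". A sketch applying it to the matches from the code earlier in the thread:

```php
<?php
$text = "This is a test and this is only a test.  It is a test to test ability and the capability of this code.";

// Same pattern as earlier in the thread; note strtolower(), not strlower().
preg_match_all('/[\'0-9\-\x41-\x5a\x5f\x61-\x7a\xc0-\xd6\xd8-\xf6\xf8-\xff]+/', strtolower($text), $words);

// array_count_values() returns an array keyed by each distinct value,
// with the number of occurrences as the value.
$counts = array_count_values($words[0]);

print_r($counts);
```

This replaces the hand-written counting loop with a single call and raises no notices, since no array entry is read before it is set.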