
Finding occurrences of phrases within a string

hankknight asked
Hello,

I need to parse meta-tag keywords and find phrases separated by commas. I am not looking for words, just phrases.

 - All phrases should have at least 6 characters but no more than 30 characters.
 - All phrases should have whitespace somewhere in the middle of the text.

For the example text in the code below, the result should be:

                  baking pies (4)
                  making food (2)
                  baking from scratch (2)
                  cooking with butter (1)
                  cooking without salt (1)
<?php 
 
$text = "baking from scratch, food, baking pies, recipe, recipes, cook, butter, making food, cooking, bake, baking, making food, baking pies, baking from scratch, cooking with butter, milk, cooking without salt, fruit, fruit pies, apples oranges peaches cherries pear candy salt baking food pie apple corn beans bread, baking pies, food, apples, candy, baking pies";
 
$result =array();
foreach (str_word_count($text,1) as $k) {
    preg_match_all('/\b' . preg_quote($k, '/') . '\b/', $text, $m);
    $result[$k] = count($m[0]);
}
 
foreach ($result as $word => $num) {
    echo $word . ' <span style="color: orange">(' . $num . ")</span><br />\n";
}
 
?>



Top Expert 2007
Commented:
It seems you missed "fruit pies" in your expected result ;-)
<?php 
 
$text = "baking from scratch, food, baking pies, recipe, recipes, cook, butter, making food, cooking, bake, baking, making food, baking pies, baking from scratch, cooking with butter, milk, cooking without salt, fruit, fruit pies, apples oranges peaches cherries pear candy salt baking food pie apple corn beans bread, baking pies, food, apples, candy, baking pies";
 
$phrases = explode(',', $text);
$result = array();
foreach ($phrases as $phrase) {
    $phrase = trim($phrase);
    if (strlen($phrase) < 6 || strlen($phrase) > 30) {
        // too long or too short
        continue;
    }
    if (!preg_match('/\s/', $phrase)) {
        // no whitespace in between
        continue;
    }
    if (!isset($result[$phrase])) {
        $result[$phrase] = 1;
    } else {
        ++$result[$phrase];
    }
}
arsort($result);
var_dump($result);
?>
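
The snippet above dumps the counts with var_dump(). To print them in the "phrase (count)" format shown in the question, something like the following might work. This is only a sketch: the helper name format_phrase_counts is hypothetical, and it assumes an array of phrase => count pairs like the $result array built above.

```php
<?php
// Hypothetical helper (not part of the original answer): turns an array of
// phrase => count pairs into "phrase (count)" lines, highest count first.
function format_phrase_counts(array $counts): string
{
    arsort($counts); // sort by count, descending, keeping the phrase keys
    $lines = array();
    foreach ($counts as $phrase => $num) {
        // Escape the phrase in case it is echoed into an HTML page.
        $lines[] = htmlspecialchars($phrase) . ' (' . $num . ')';
    }
    return implode("<br />\n", $lines);
}

echo format_phrase_counts(array('making food' => 2, 'baking pies' => 4));
```

Phrases with equal counts may come out in any order relative to each other, since arsort() only orders by the count.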


CERTIFIED EXPERT
Top Expert 2007

Commented:
Another way to do it.
<?php 
 
$text = "baking from scratch, food, baking pies, recipe, recipes, cook, butter, making food, cooking, bake, baking, making food, baking pies, baking from scratch, cooking with butter, milk, cooking without salt, fruit, fruit pies, apples oranges peaches cherries pear candy salt baking food pie apple corn beans bread, baking pies, food, apples, candy, baking pies";
 
$exp = explode(",",$text);
 
array_walk($exp,"trim_array");
 
$array = array_count_values($exp);
 
foreach($array as $key=>$value) {
 
	$len = strlen($key);
	
	if(str_word_count($key,0) > 1 && ($len >= 6 && $len <= 30)) {
	
		print $key." (".$value.")<br>";  
 
	}
}
 
function trim_array(&$value) { 
    
	$value = trim($value); 
}
 
?>
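
Both approaches could also be folded into one reusable function. The following is a minimal sketch, not either expert's code: the name count_phrases and its $min / $max parameters are assumptions made for illustration, with the defaults taken from the 6 to 30 character limits in the question. It uses the whitespace test from the first answer and array_count_values() from the second.

```php
<?php
// Hypothetical helper: counts comma-separated phrases that contain whitespace
// and whose length is between $min and $max characters, inclusive.
function count_phrases(string $text, int $min = 6, int $max = 30): array
{
    // Split on commas and strip the surrounding spaces from each piece.
    $phrases = array_map('trim', explode(',', $text));

    // Keep only pieces within the length limits that contain whitespace.
    $phrases = array_filter($phrases, function ($p) use ($min, $max) {
        $len = strlen($p);
        return $len >= $min && $len <= $max && preg_match('/\s/', $p) === 1;
    });

    $counts = array_count_values($phrases);
    arsort($counts); // most frequent phrases first
    return $counts;
}

print_r(count_phrases("baking pies, food, making food, baking pies, pie"));
```

Note that strlen() counts bytes, so keywords containing multibyte characters would probably need mb_strlen() instead.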

