extract most common words in row of database titles

I'm looking for an efficient solution to finding the most common words in a column of a database that comprises of text strings.

The solution would most likely be a combination of a mysql query and probably need some PHP to do some looping on results...

E.g. I have a table with 200,000 rows with a column for "titles" - these are names/titles of photos
I need a way to find the most common words in these 200,000 strings. This script need only run daily via a cron so it has room to be a little inefficient.
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Sander StadSysteemontwikkelaar, Database AdministratorCommented:
You can use the str_word_count() function in PHP. At this website there are several examples how to search for repeating words: http://us3.php.net/manual/en/function.str-word-count.php.
What this function does it return an array with the words inside it. When you have the query you can loop the array to see if the word is already in your database.
The table with the words could look like this:
Word                        Count
You loop through the array and than increment the count in the table if it already exists and if it doesn't exist add itto the table.
Hope this helps you out a bit
mixartAuthor Commented:
This helps a little but I'm not quite there with figuring out the PHP code to do this.

Any chance you could help - if I had an array of strings like this:

$strings = ['The dog and cat run fast','A cat at a bird fast','The fast bird and cat jump','a dog run a cat','The cat fast run dog cat','A horse swim fast in a dog tree'];

Open in new window

Sander StadSysteemontwikkelaar, Database AdministratorCommented:
I thought it over and you'd also like to empty the table when the new job starts because you don't want the results from last time to get in the way of the new results.
It would look something like this:

$link = mysql_connect({SERVER}, {USERNAME}, {PASSWORD});
/* Delete all the old entries */
$query = "DELETE FROM wordstatictics";
$results = mysql_query($query, $link);
/*Get all the titles from the database */
$query = "SELECT titles FROM {TABLENAME}";
$results = mysql_query($query, $link);
/* Loop through the results */
while($result = mysql_fetch_array($results)){
  /*Get all the words inside an array */
  $words = array();
  $words = str_word_count($result['title'], 1));
  /* Call the function updateWordStatistics
function updateWordStatistics($wordsArray = array()){
  /*Loop through every word, check if it already exists in the db
    If it doesn exist the count value is incremented by 1. If it 
    doesn't exists it is inserted in the table */
  foreach($wordsArray as $word){
    $query = "SELECT * FROM wordstatictics WHERE word = $word";
    $results = mysql_query($query, $link);
    if(mysql_num_rows($results) == 0){
      $query = "INSERT INTO wordstatistics(word, count) VALUES(\"$word\",1)";
      $results = mysql_query($query, $link);
      $query = "UPDATE wordstatistics SET count = count + 1 WHERE word = $word";
      $results = mysql_query($query, $link);

Open in new window


Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.