PHP Search Help

This search code in PHP is not efficient in that too many irrelevant results are returned.  Can anyone help?

if(!empty($_get['s'])) {
    $varsearch = $_GET['s'] ;
} else {
    $varsearch = $_POST['s'] ;
}
 
$trimmed = trim($varsearch);
$trimmedlength = strlen($trimmed);
$countword = count(explode(" ",$trimmed));
$words = explode(" ",$trimmed);
 
 $querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";
 
$j = 0;
 
for($i=0; $i < $countword; $i++) {
 
if($j != 0) {
$querystr .= " OR wposts.post_title LIKE '%$words[$i]%' OR wposts.post_content LIKE '%$words[$i]%'";
} else {
$querystr .= "wposts.post_title LIKE '%$words[$i]%' OR wposts.post_content LIKE '%$words[$i]%'";
}
$j++;
}
benc007Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Roger BaklundCommented:
You should remove some "noisewords" from the query, like "a", "is", "the" and so on. Then require all words present in the same row using AND:
$crit = array(); 
for($i=0; $i < $countword; $i++) {
  $crit[] = "(wposts.post_title LIKE '%$words[$i]%' OR ".
            "wposts.post_content LIKE '%$words[$i]%')";
}
$querystr .= implode(' AND ',$crit);

Open in new window

0
quincydudeCommented:
How do you define irrelevant ? The query you given search out those records with title or content containing the search keywords. May be you would like to verify what result do you want the search to get?
0
benc007Author Commented:
Hi Cxr,

I am very new to PHP. How would I add your query to the current code?

Does MySQL have a full-text feature?  
0
Acronis True Image 2019 just released!

Create a reliable backup. Make sure you always have dependable copies of your data so you can restore your entire system or individual files.

Roger BaklundCommented:
Your code was incomplete, it did not include the mysql_query() call, which is the function that actually executes the query. The code I posted replaces the end of what you posted, from $j = 0; and down. To make it a valid SQL statement, you would have to add a closing parantheses, but that was also missing in your code. Below code will create a complete sql statement:
$querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";  # note starting parantheses here ...
$crit = array(); 
for($i=0; $i < $countword; $i++) {
  $crit[] = "(wposts.post_title LIKE '%$words[$i]%' OR ".
            "wposts.post_content LIKE '%$words[$i]%')";
}
$querystr .= implode(' AND ',$crit);
$querystr .= ')';  # ... closing the parantheses here

Open in new window

0
Roger BaklundCommented:
...and yes, mysql has a fulltext feature.

http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
0
benc007Author Commented:
In your code, where are you defining the "noisewords" such as  "a", "is", "the" and so on.  

For this search, do you think fulll-text would work better?  Do you have any suggestions on how to code this?
0
Roger BaklundCommented:
You could hardcode noisewords, or you could have them in a table or in a text file. The example below is with hardcoded noisewords, also removing double spaces between words.
$noise = array('a','an','the','in','is','if','it','of');
$trimmed = trim($varsearch);
$trimmedlength = strlen($trimmed);
$words = explode(" ",$trimmed);
foreach($words as $key=>$word) {
  $word = trim($word);
  if(in_array(strtolower($word),$noise)) unset($words[$key]);
  else $words[$key] = $word;
}
$countword = count($words);

Open in new window

0
benc007Author Commented:
I would like to hardcode noise words.  Should I replace:
 $querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";

$j = 0;

for($i=0; $i < $countword; $i++) {

if($j != 0) {
$querystr .= " OR wposts.post_title LIKE '%$words[$i]%' OR wposts.post_content LIKE '%$words[$i]%'";
} else {
$querystr .= "wposts.post_title LIKE '%$words[$i]%' OR wposts.post_content LIKE '%$words[$i]%'";
}
$j++;
}

$querystr .= ")
    AND wposts.post_status = 'publish'
    AND wposts.post_type = 'post'
    AND wzensor.moderation_status = 'approved'
    ORDER BY wposts.post_date DESC
 ";

 $pageposts = $wpdb->get_results($querystr, OBJECT);


with:

$noise = array('a','an','the','in','is','if','it','of');
$trimmed = trim($varsearch);
$trimmedlength = strlen($trimmed);
$words = explode(" ",$trimmed);
foreach($words as $key=>$word) {
  $word = trim($word);
  if(in_array(strtolower($word),$noise)) unset($words[$key]);
  else $words[$key] = $word;
}
$countword = count($words);

$querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";  # note starting parantheses here ...
$crit = array();
for($i=0; $i < $countword; $i++) {
  $crit[] = "(wposts.post_title LIKE '%$words[$i]%' OR ".
            "wposts.post_content LIKE '%$words[$i]%')";
}
$querystr .= implode(' AND ',$crit);
$querystr .= ')';  # ... closing the parantheses here

?
0
Roger BaklundCommented:
Yes, except you should keep the rest of your query, and execute the query in the end:

Drop this:

$querystr .= ')';  # ... closing the parantheses here

... and keep this:

$querystr .= ")
    AND wposts.post_status = 'publish'
    AND wposts.post_type = 'post'
    AND wzensor.moderation_status = 'approved'
    ORDER BY wposts.post_date DESC
 ";
 $pageposts = $wpdb->get_results($querystr, OBJECT);
0
benc007Author Commented:
The code seems to work but the search results are still not relevant enough.  =(
0
Roger BaklundCommented:
What does that mean, "relevant enough"? Can you show us some examples of search queries and results, and explain which results are relevant enough, and which are not?
0
benc007Author Commented:
One of the post titles is "The Sales Presentation The Bottom Line Is Selling" (without the quotes).  When I search for "The Sales Presentation The Bottom Line Is Selling", I should be only getting this post and maybe a couple more.  

However I am getting posts that do not even have the following words in the title or content: "bottom, line, presentation"

If full-text is used, would the results be more relevant?
0
Roger BaklundCommented:
The code we implemented above should NOT find records where not ALL words except noisewords in the search phrase is present. Can you show your current code?

When using fulltext, the query and the methods to use to get relavant answers is rather different from what we are doing now. It's hard to know what results you would get, but it should be possible to get relavant results with both methods.
0
benc007Author Commented:
My current code is:

if(!empty($_get['s'])) {
    $varsearch = $_GET['s'] ;
} else {
    $varsearch = $_POST['s'] ;
}

$trimmed = trim($varsearch);
$trimmedlength = strlen($trimmed);
$countword = count(explode(" ",$trimmed));
$words = explode(" ",$trimmed);

$querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";

$j = 0;

for($i=0; $i < $countword; $i++) {

if($j != 0) {
$querystr .= " OR wposts.post_title LIKE '%$words[$i]%' OR wposts.post_content LIKE '%$words[$i]%'";
} else {
$querystr .= "wposts.post_title LIKE '%$words[$i]%' OR wposts.post_content LIKE '%$words[$i]%'";
}
$j++;
}

$querystr .= ")
    AND wposts.post_status = 'publish'
    AND wposts.post_type = 'post'
    AND wzensor.moderation_status = 'approved'
    ORDER BY wposts.post_date DESC
 ";

 $pageposts = $wpdb->get_results($querystr, OBJECT);

/* new */
$noise = array('a','an','the','in','is','if','it','of');
$trimmed = trim($varsearch);
$trimmedlength = strlen($trimmed);
$words = explode(" ",$trimmed);
foreach($words as $key=>$word) {
  $word = trim($word);
  if(in_array(strtolower($word),$noise)) unset($words[$key]);
  else $words[$key] = $word;
}
$countword = count($words);

$querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";  # note starting parantheses here ...
$crit = array();
for($i=0; $i < $countword; $i++) {
  $crit[] = "(wposts.post_title LIKE '%$words[$i]%' OR ".
            "wposts.post_content LIKE '%$words[$i]%')";
}
$querystr .= implode(' AND ',$crit);
0
Roger BaklundCommented:
You are still sending the old query to the database server, the new code is ignored. Try the code below:
if(isset($_GET['s'])) {
    $varsearch = $_GET['s'] ;
} elseif(isset($_POST['s'])) {
    $varsearch = $_POST['s'] ;
} else {
  $varsearch = '';
  echo 'No search criteria was provided';
}
 
$noise = array('a','an','the','in','is','if','it','of');
$trimmed = trim($varsearch);
$words = explode(" ",$trimmed);
foreach($words as $key=>$word) {
  $word = trim($word);
  if(in_array(strtolower($word),$noise)) unset($words[$key]);
  else $words[$key] = $word;
}
 
$querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";
$crit = array();
foreach($words as $word) 
  $crit[] = "(wposts.post_title LIKE '%$word%' OR ".
            "wposts.post_content LIKE '%$word%')";
$querystr .= implode(' AND ',$crit);
$querystr .= ")
    AND wposts.post_status = 'publish'
    AND wposts.post_type = 'post'
    AND wzensor.moderation_status = 'approved'
    ORDER BY wposts.post_date DESC";
 
$pageposts = $wpdb->get_results($querystr, OBJECT);

Open in new window

0
benc007Author Commented:
There is validation to check that at least 3 characters are entered into the search field, but with the updated code II keep getting "The search field requires valid input." even when there are more than 3 characters in the search phrase.

Updated code:
/* new */
if(isset($_GET['s'])) {
    $varsearch = $_GET['s'] ;
} elseif(isset($_POST['s'])) {
    $varsearch = $_POST['s'] ;
} else {
  $varsearch = '';
  echo 'No search criteria was provided';
}
 
$noise = array('a','an','the','in','is','if','it','of');
$trimmed = trim($varsearch);
$words = explode(" ",$trimmed);
foreach($words as $key=>$word) {
  $word = trim($word);
  if(in_array(strtolower($word),$noise)) unset($words[$key]);
  else $words[$key] = $word;
}
 
$querystr = "
    SELECT wposts.*, wzensor.*
    FROM $wpdb->posts wposts, wp_zensor wzensor
    WHERE wposts.ID = wzensor.post_id
    AND ( ";
$crit = array();
foreach($words as $word)
  $crit[] = "(wposts.post_title LIKE '%$word%' OR ".
            "wposts.post_content LIKE '%$word%')";
$querystr .= implode(' AND ',$crit);
$querystr .= ")
    AND wposts.post_status = 'publish'
    AND wposts.post_type = 'post'
    AND wzensor.moderation_status = 'approved'
    ORDER BY wposts.post_date DESC";
 
$pageposts = $wpdb->get_results($querystr, OBJECT);
/* end of new */


/* Finish */
?>


<div id="page-index">
      
      <div id="content">

<?php if($trimmedlength < 3) { ?>
<?php if($trimmedlength == '') { ?>
<h2 class="center">The search field requires valid input.</h2>
<?php } else { ?>
<h2 class="center">The search field requires valid input.</h2>
<?php } ?>
<?php } else { ?>
0
Roger BaklundCommented:
I removed the line

$trimmedlength = strlen($trimmed);

because it seemed not to be needed. just put it back in after this line:

$trimmed = trim($varsearch);

Note that if the user enters only noisewords, this test will fail: the trimmedlenght contains the full search phrase, including the noise words. Also, the current implementation is flawed: It executes the query first, then refuses to show the result when the length of the search phrase is too short. It should not execute the query if the search phrase exclusive noise words is too short.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
benc007Author Commented:
Cxr - the code works great.  Thank you so much for your help!
0
benc007Author Commented:
Excellent coder!  Thank you for your help.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
PHP

From novice to tech pro — start learning today.