Solved

MySQL and/or PHP search engine

Posted on 2014-01-14
Last Modified: 2014-01-19
Hi,

I'm currently trying to develop a small search engine for a PHP/MySQL-based website.

This search engine needs to be able to search across several text fields (5) and return results even if the word is misspelled.

For example, a search on the term "ward" could return the result "search your words".
If possible, it should also have a precision setting that I could adjust.

I've developed a MySQL function that inserts every character of the search term and of the search fields into two temporary tables and runs a SQL query to determine whether the result is a good match. This works well for a database with 20 - 50 results. But when I try it with a database of 1000 - 5000 rows, it doesn't work because I get a timeout.
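For comparison with the temporary-table approach, here is a minimal PHP sketch of word-level fuzzy matching using the built-in levenshtein() function. It is not the function from searchengine.txt; the name fuzzy_contains() and the $maxDistance parameter are hypothetical, and the sketch assumes PHP 7 or later (for the type hints) and rows that have already been fetched from MySQL. $maxDistance plays the role of the adjustable precision setting: lower values are stricter.

```php
<?php
// Illustrative sketch only; fuzzy_contains() is a hypothetical helper,
// not the function attached to the question.
function fuzzy_contains(string $term, string $text, int $maxDistance = 2): bool
{
    $term = strtolower($term);
    // Break the field into lowercase words on any run of non-alphanumerics.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $distance = levenshtein($term, $word);
        // Older PHP versions return -1 for strings over 255 characters.
        if ($distance >= 0 && $distance <= $maxDistance) {
            return true;
        }
    }
    return false;
}

var_dump(fuzzy_contains('ward', 'search your words', 2)); // bool(true)
var_dump(fuzzy_contains('ward', 'search your words', 1)); // bool(false)
```

"ward" is two edits away from "words" (one substitution, one insertion), so it matches at a distance of 2 but not at 1. Comparing whole words keeps the work per row small, but every row still has to be scanned in PHP, so this is only a starting point for larger tables.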

I also looked into "soundex", but didn't find a way to make it work with wildcard characters.
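One possible way around the wildcard limitation, sketched here in PHP rather than MySQL, is to split each field into words and compare Soundex keys word by word instead of applying Soundex to the whole field. The helper name soundex_contains() and the $prefixLength parameter are hypothetical; comparing only a prefix of the key is an assumption made for this sketch, used as a rough looseness setting.

```php
<?php
// Illustrative sketch; soundex_contains() and $prefixLength are hypothetical.
function soundex_contains(string $term, string $text, int $prefixLength = 4): bool
{
    // soundex() returns a 4-character key for a word, e.g. "W630" for "ward".
    $termKey = substr(soundex($term), 0, $prefixLength);
    // Split the field into words made of letters only.
    $words = preg_split('/[^A-Za-z]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        // Compare the first $prefixLength characters of each key.
        if (substr(soundex($word), 0, $prefixLength) === $termKey) {
            return true;
        }
    }
    return false;
}

var_dump(soundex_contains('ward', 'search your word'));     // bool(true)
var_dump(soundex_contains('ward', 'search your words'));    // bool(false)
var_dump(soundex_contains('ward', 'search your words', 3)); // bool(true)
```

"ward" and "word" share the key W630, while "words" is W632, so the plural only matches once the comparison is shortened to three characters. Shorter prefixes also produce more false positives.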

Here is the question:
I would like to know if anyone has an algorithm in MySQL or PHP that could fit my needs.
Anything helps, even if it's just a link or a theoretical algorithm.

I've attached my current algorithm in a txt file.

Thank you
searchengine.txt
Question by:luminis86
4 Comments
 
LVL 12

Accepted Solution

by:
sivagnanam chandrakanth earned 500 total points
ID: 39781353
I think that instead of trying to do this with MySQL, you should try a dedicated text search engine. I would suggest Solr, since it has many built-in features for different types of searches; it also reduces the load on the database and improves performance.

http://www.installationpage.com/solr/how-to-use-solr-search-in-php-tutorial/
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 39781417
Google has kind of "done the job" in search.  Maybe they have a site search capability you could add.
https://support.google.com/customsearch/answer/72326?hl=en

If you are willing to reindex the site manually, I've had excellent results with Wrensoft Zoom.
http://www.wrensoft.com/zoom/

I would avoid Sphider (paralyzingly slow).

I used Atomz for several sites, but it has context-aware advertising.  When I went to put it into a church web site, where (as you can imagine) people search for many deeply personal and controversial ideas, the advertisements were unacceptable.
http://www.atomz.com/

If you succeed in developing a PHP search engine that has satisfactory performance, I hope you will write an article and publish it here on EE.  It's quite a challenging project, especially when your database grows beyond a few thousand rows.
 
LVL 6

Expert Comment

by:Mahesh Bhutkar
ID: 39781492
I would recommend a Perl script for the search engine, for better performance than PHP. You can call your Perl script from within PHP.

Check out these Perl search engines:

Plucene
KinoSearch
Dansie Search Engine
Extropia Site Search
F3DSearch
FluffySearch
Fluid Dynamics Search
Global Data SiteSearch
Htgrep
HTTP::Index module
KSearch
Matt's Simple Search
Perlfect Search
RuterSearch
RiSearch
Selena Sol's Keyword Search
Sphinx (open source; Unix-based, Windows-based, and Mac OS X)
WebSearch Perl Script
 
LVL 5

Expert Comment

by:MichaelT_
ID: 39781587
I would second Solr. It has a slightly steeper learning curve, but it was built for search and can do what you want with a little configuration. It is also scalable and can provide highlighting, suggestions, etc.  To connect Solr to PHP you can use the Solarium library:

Solr: http://lucene.apache.org/solr/
Solarium: http://www.solarium-project.org/
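As a rough illustration of what a misspelling-tolerant search looks like on the Solr side, the sketch below builds a plain HTTP select URL that uses the fuzzy operator of Solr's standard query parser ("term~N", where N is the maximum number of edits, up to 2). It deliberately does not use Solarium, to avoid guessing at that library's API. The host, the core name "site", the field name "content", and the helper name solr_fuzzy_url() are all assumptions for the example.

```php
<?php
// Illustrative sketch; host, core ("site") and field ("content") are
// assumptions, not values taken from this thread.
function solr_fuzzy_url(string $term, int $maxEdits = 2): string
{
    // Keep only letters and digits so the term cannot inject query syntax.
    $clean = preg_replace('/[^A-Za-z0-9]/', '', $term);
    // Solr accepts an edit distance between 0 and 2 for fuzzy terms.
    $maxEdits = max(0, min(2, $maxEdits));
    // "term~N" asks the standard query parser for matches within N edits.
    $query = http_build_query([
        'q'  => 'content:' . $clean . '~' . $maxEdits,
        'wt' => 'json',
    ]);
    return 'http://localhost:8983/solr/site/select?' . $query;
}

// The resulting URL can be fetched with cURL or file_get_contents().
echo solr_fuzzy_url('ward', 2), "\n";
```

Here $maxEdits acts as the precision setting: 1 is stricter, 2 is looser.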

If you want to play around with it a little, BitNami offers a packaged version (as a VM or as a standalone installer) that will let you get up and running pretty quickly:

http://bitnami.com/stack/solr

Good luck. If you have more questions, feel free to ask.

Question has a verified solution.