How to create a meta search engine

Posted on 2014-03-24
Last Modified: 2014-11-12
How do I create a meta search engine that gets results from (1) Google Custom Search, (2) the Bing API via Azure DataMarket, and (3) Yahoo BOSS, using PHP and cURL with results in JSON format? I already have the API keys for all three search engines. I want the results cached for a time period. I don't know a lot about PHP programming, but I can follow directions. I have seen some code on this site about it but am not sure how it pertains to my project. I hope I can be helped through this project.
Question by:Domcast8

Accepted Solution

Scott Fell,  EE MVE earned 500 total points
ID: 39952244
I think it will help to break this request into small parts. First, by creating a meta search engine, do you mean taking a user's search term and querying both Google and Bing at the same time?

If so, you only need Google plus one of Yahoo or Bing, since Bing powers Yahoo search.

Since each engine has its own API, it might be a good idea to first build a separate search page for each of the search APIs you have. For instance, Google has a PHP library for this. Use the library just for Google and figure out how it works and what the different options are.

Bing also has a PHP sample ready for you. I don't guarantee this link will last forever. I also found a PHP library for Yahoo BOSS.

This is not a small project, and you should first be able to create 3 separate (or just 2) PHP pages, one for each of the search APIs. This ensures you understand the concepts and can move on to the next step, which will be combining the searches into one view.
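
As a starting point for one of those per-engine pages, here is a minimal PHP sketch of the core of a Google page. The function names (fetch_json, google_search_url, normalize_google) are made up for illustration and are not part of any Google library. The endpoint and the response fields (items, title, link, snippet) are those of Google's Custom Search JSON API; the API key and the cx engine ID are placeholders you supply. Bing and Yahoo BOSS would each need their own URL builder and normalizer, which are not shown here.

```php
<?php
// Sketch only: all three function names below are hypothetical helpers.

// Build the request URL for the Google Custom Search JSON API.
// $apiKey and $cx (the custom search engine ID) are placeholders.
function google_search_url(string $apiKey, string $cx, string $query): string
{
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query([
        'key' => $apiKey,
        'cx'  => $cx,
        'q'   => $query,
    ]);
}

// Fetch a URL with cURL and decode the JSON body into an associative array.
// Returns an empty array on a network error or invalid JSON.
function fetch_json(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ch);

    if ($body === false) {
        return [];
    }
    $data = json_decode($body, true);
    return is_array($data) ? $data : [];
}

// Reduce a decoded Google response to a common shape that every
// engine's results can share later on.
function normalize_google(array $response): array
{
    $results = [];
    foreach ($response['items'] ?? [] as $item) {
        $results[] = [
            'title'   => $item['title'] ?? '',
            'url'     => $item['link'] ?? '',
            'snippet' => $item['snippet'] ?? '',
            'source'  => 'google',
        ];
    }
    return $results;
}
```

A page would then call something like normalize_google(fetch_json(google_search_url($apiKey, $cx, $query))) and loop over the returned array to print each title and link. Keeping every engine's output in the same title/url/snippet/source shape is a design choice that makes the later merging step simpler.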

Lastly, you want to create a cache somehow. Do you mean storing searches on your server?
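
If the goal is to keep results on the server for a fixed time period, as the question describes, one simple approach is a file per query with a time-to-live. The sketch below is only one way to do it: cached_search is a hypothetical helper name, the cache directory and TTL are values you choose, and $fetch stands in for whatever function actually calls the search API.

```php
<?php
// Sketch only: cached_search() is a hypothetical helper, not a library function.
// $fetch is any callable that takes the query string and returns an array of results.
function cached_search(string $query, int $ttlSeconds, string $cacheDir, callable $fetch): array
{
    // One cache file per query; trimming and lowercasing lets
    // "PHP Curl" and " php curl " share the same entry.
    $key  = md5(strtolower(trim($query)));
    $file = rtrim($cacheDir, '/') . '/' . $key . '.json';

    // Serve from the cache if the file exists and is younger than the TTL.
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        $cached = json_decode(file_get_contents($file), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    // Otherwise fetch fresh results and store them for next time.
    $results = $fetch($query);
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0775, true);
    }
    file_put_contents($file, json_encode($results), LOCK_EX);
    return $results;
}
```

With a TTL of 3600, a repeated search within the hour is answered from disk and does not use up API quota. A database table or memcached would work along the same lines; files are just the least setup.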

I would start by getting your Google search working. Use the PHP library; if you have trouble with it, post a new question just on using the Google search API and the PHP library. Make sure you give it a try first before asking for help. This is not a beginner's request and will take some time on your part.

Once you have Google going, repeat for Yahoo and Bing. At that point, if you are unsure how to merge all the results together, you can start another question on that topic.

The hard part of all of this will be mashing the results together, and that will take a bit of computer science. If you are not sure about getting the PHP libraries for the individual APIs working, mashing this together may be a bit of a stretch.
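
To give a feel for what the mashing step involves, here is a minimal PHP sketch that merges ranked lists from several engines, removes duplicate URLs, and orders the combined list by a summed reciprocal-rank score. This is one common heuristic, not the only option and not how Dogpile or any other site is known to do it. merge_results is a hypothetical name, and the sketch assumes each result is an array with at least 'title', 'url' and 'source' keys, listed in the order the engine returned it.

```php
<?php
// Sketch only: merge_results() is a hypothetical helper.
// $resultLists is an array of per-engine lists, each already in ranked order.
function merge_results(array $resultLists): array
{
    $merged = [];
    foreach ($resultLists as $list) {
        foreach (array_values($list) as $rank => $result) {
            // Crude duplicate detection: URLs that differ only by letter case
            // or a trailing slash are treated as the same page.
            $key = rtrim(strtolower($result['url']), '/');
            if (!isset($merged[$key])) {
                $merged[$key] = $result + ['score' => 0.0, 'sources' => []];
            }
            // Reciprocal rank: position 1 adds 1.0, position 2 adds 0.5, and so on.
            $merged[$key]['score'] += 1.0 / ($rank + 1);
            $merged[$key]['sources'][] = $result['source'];
        }
    }

    // Highest combined score first; usort also resets the keys to 0, 1, 2, ...
    usort($merged, function (array $a, array $b): int {
        return $b['score'] <=> $a['score'];
    });
    return $merged;
}
```

A page that two engines both rank highly ends up above a page only one engine returned, which is usually the point of a meta search. Lowercasing the whole URL is a simplification, since paths can be case sensitive; a stricter version would lowercase only the host name.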

Dogpile is already doing this, and I don't think they get much play. One of the newest search engines is DuckDuckGo.

Personally, I think you should stick with one search engine. Dogpile has been doing this for a long time and I don't find much use for them. Google and Bing really have this covered. I know from my own websites that, of all search-generated traffic, Google represents 70% to 80%, and Bing and Yahoo make up the rest, with Bing bringing in a little more than Yahoo.

Expert Comment

by:Ray Paseur
ID: 39953036
I especially agree about the Dogpile comment. This could be a fun academic project, but it is a really useless activity in terms of achieving business success. For an object lesson, look at how much (or how little) money Dogpile is worth today after years of doing exactly what you've described. And where are AltaVista, Lycos, etc.? They all got subsumed by Google. Bing would not exist at all unless Microsoft was supporting it; it would fail if it had to live or die on its own P&L statement.

If you're new to PHP and want some good learning resources to jump-start your work, this article will help.  I agree with Scott, that you need to break the project down into smaller bites that can be solved individually.  A good guide for asking questions is the SSCCE.

Author Closing Comment

ID: 39953337
Thank you for your response. This does help me take it one step at a time. This is a long project, but I believe I can do it with the help of Experts Exchange. I will first try to set up the search engines separately and then go from there.
I've seen other meta search engine scripts, like inoutscripts, that do cache results for a time period on a server. They say it's to become more independent in the future. I would like to do it for that reason and more.
Anyway, thanks for your quick reply helping me with my project.
I will start on what you said and I will get back to you with any other questions that may come up.
Thanks. It's day 1 and you're already helping me get somewhere.
