Solved

How to create a meta search engine?

Posted on 2010-09-11
Last Modified: 2013-12-13
A meta search engine passes queries through many search engines, such as Google and Yahoo, but I want to know how the programming behind it works.
Is it possible to build one in PHP?
Do Google and the other search engines grant the right to use their search results in a meta search engine?


Question by:savsoft
 
LVL 31

Expert Comment

by:Marco Gasi
ID: 33652395
Yes, they do. You can start by reading these pages for Google and Yahoo:

http://code.google.com/intl/it-IT/apis/ajax/
http://developer.yahoo.com/everything.html

You will also need to learn about cURL: http://php.net/curl

What you want to do is not trivial, and it cannot be done in a few lines of code. Good luck.
 

Author Comment

by:savsoft
ID: 33652421
Thank you.
OK, I will read these.
 
LVL 111

Accepted Solution

by:
Ray Paseur earned 2000 total points
ID: 33653149
The general design pattern would be to call each search engine API and save the results, perhaps in an array with one response element for each search engine.  You might want to include Bing, in addition to Google, Yahoo, and the lesser engines.

You would probably take your search terms from a URL argument (in the PHP script this appears in $_GET).  You might want to have a database to store your results for some period of time, so you are not so dependent on the foreign sites.
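
As a rough illustration of that pattern, here is a minimal sketch. Everything in it is an assumption made for the example: the function names (meta_search, cached_search, merge_results), the result row shape (url, title), and the stub engines, which return canned data instead of calling any real search API. The $cache array is only a stand-in for the database mentioned above.

```php
<?php
// HYPOTHETICAL SKETCH - NAMES AND RESULT SHAPES ARE ASSUMPTIONS, NOT ANY ENGINE'S REAL API
error_reporting(E_ALL);

// CALL EVERY ENGINE AND KEEP ONE RESPONSE ELEMENT PER ENGINE
// $engines MAPS A NAME TO A CALLABLE THAT RETURNS AN ARRAY OF ROWS, OR FALSE ON FAILURE
function meta_search($query, array $engines)
{
    $responses = array();
    foreach ($engines as $name => $fetch)
    {
        $rows = call_user_func($fetch, $query);
        $responses[$name] = ($rows === FALSE) ? array() : $rows;
    }
    return $responses;
}

// REUSE SAVED RESPONSES UNTIL THEY ARE OLDER THAN $ttl SECONDS
// $cache IS A PLAIN ARRAY HERE - A DATA BASE TABLE WOULD PLAY THIS ROLE IN PRACTICE
function cached_search($query, array $engines, array &$cache, $ttl=3600, $now=NULL)
{
    $now = ($now === NULL) ? time() : $now;
    $key = strtolower(trim($query));
    if (isset($cache[$key]) && ($now - $cache[$key]['saved']) < $ttl)
    {
        return $cache[$key]['responses'];
    }
    $responses   = meta_search($query, $engines);
    $cache[$key] = array('saved' => $now, 'responses' => $responses);
    return $responses;
}

// MERGE THE PER-ENGINE RESULTS INTO ONE LIST, DROPPING DUPLICATE URLS
function merge_results(array $responses)
{
    $merged = array();
    foreach ($responses as $name => $rows)
    {
        foreach ($rows as $row)
        {
            // NORMALIZE THE URL SO TRIVIAL VARIANTS COUNT AS THE SAME PAGE
            $key = strtolower(rtrim($row['url'], '/'));
            if (!isset($merged[$key]))
            {
                $merged[$key] = $row + array('engines' => array());
            }
            $merged[$key]['engines'][] = $name;
        }
    }
    return array_values($merged);
}

// USAGE EXAMPLE - STUB ENGINES WITH CANNED DATA STAND IN FOR REAL API CALLS
$engines = array(
    'engine_a' => function ($q) {
        return array(
            array('url' => 'http://example.com/', 'title' => "A result for $q"),
        );
    },
    'engine_b' => function ($q) {
        return array(
            array('url' => 'http://example.com',      'title' => 'Example'),
            array('url' => 'http://example.org/page', 'title' => 'Another'),
        );
    },
);

// TAKE THE SEARCH TERMS FROM THE URL ARGUMENT, WITH A DEFAULT FOR COMMAND LINE RUNS
$query  = isset($_GET['q']) ? trim($_GET['q']) : 'php';
$cache  = array();
$merged = merge_results(cached_search($query, $engines, $cache));
```

In a real script each stub would be replaced by a function that requests the engine's API URL and decodes its response into rows. Recording which engines returned each URL is one possible basis for ranking, since a page found by several engines may deserve a higher position.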

These links might be helpful:
http://lmgtfy.com?q=google+search+api
http://developer.yahoo.com/search/boss/
http://msdn.microsoft.com/en-us/library/dd251056.aspx

In case you find that learning CURL is a daunting task (I did), here is a little script that will make a CURL request.  Put the URL you want to retrieve into line 54.

Good luck with your project, ~Ray


<?php // RAY_temp_curl_example.php
error_reporting(E_ALL);

function my_curl($url, $timeout=2, $error_report=FALSE)
{
    $curl = curl_init();

    // HEADERS FROM FIREFOX - APPEARS TO BE A BROWSER REFERRED BY GOOGLE
    $header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: "; // BROWSERS USUALLY LEAVE BLANK

    // SET THE CURL OPTIONS - SEE http://php.net/manual/en/function.curl-setopt.php
    curl_setopt($curl, CURLOPT_URL,            $url);
    curl_setopt($curl, CURLOPT_USERAGENT,      'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6');
    curl_setopt($curl, CURLOPT_HTTPHEADER,     $header);
    curl_setopt($curl, CURLOPT_REFERER,        'http://www.google.com');
    curl_setopt($curl, CURLOPT_ENCODING,       'gzip,deflate');
    curl_setopt($curl, CURLOPT_AUTOREFERER,    TRUE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($curl, CURLOPT_TIMEOUT,        $timeout);

    // RUN THE CURL REQUEST AND GET THE RESULTS
    $htm = curl_exec($curl);

    // ON FAILURE HANDLE ERROR MESSAGE
    if ($htm === FALSE)
    {
        if ($error_report)
        {
            $err = curl_errno($curl);
            $inf = curl_getinfo($curl);
            echo "CURL FAIL: $url TIMEOUT=$timeout, CURL_ERRNO=$err";
            var_dump($inf);
        }
        curl_close($curl);
        return FALSE;
    }

    // ON SUCCESS RETURN XML / HTML STRING
    curl_close($curl);
    return $htm;
}




// USAGE EXAMPLE - PUT YOUR FAVORITE URL HERE
$url = "http://finance.yahoo.com/d/quotes.csv?s=lulu&f=snl1c1ohgvt1";
$htm = my_curl($url);
if (!$htm) die("NO $url");


// SHOW WHAT WE GOT
echo "<pre>";
echo htmlentities($htm);


Author Comment

by:savsoft
ID: 33653355
Thank you, Ray and margusG.

I am reading the pages you referred me to and have found them very useful.
Actually, I want to start my own search engine that shows better search results than the existing search engines. But I know that I can't crawl the whole web like Google and Bing, so I chose meta search technology.
Please suggest more if you have any better ideas.
I have a dedicated server with 500 GB of disk space and 4 GB of RAM.

 
LVL 111

Expert Comment

by:Ray Paseur
ID: 33653833
Not to discourage you, but what you're working on puts you in direct competition against Google, Yahoo, and Bing -- all of them are spending millions of dollars each month trying to get better search results.  They constantly study each other, and they have unlimited access to the top scientists and engineers.

I think you're in fine shape just using their search results through their APIs.  Just be careful of the terms of service - you may need to pay them if you use their data for commercial purposes.
 

Author Comment

by:savsoft
ID: 33654966
Is there any service where we can pay for their data usage?
I have also found an Amazon web information service:
http://aws.amazon.com
Through it we can use information about any website.
Can I use this information for my search engine? It charges $0.00015 per request. Alexa is also powered by it.

 

Author Comment

by:savsoft
ID: 33655472
As I understand search technology, all existing search engines have a web spider (web crawler) program that starts visiting websites from some initial URLs (known as seeds) and stores every page in a database for later indexing or page ranking. The crawler then detects all hyperlinks on each page and adds them to its seed list. It needs a very large amount of space to store this information; in effect it downloads the whole web. I think it repeats this whole process at least once every 10 days.
If that is true, then I don't think it is an advanced method, because it requires very large data transfers.
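
The crawl loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (extract_links, crawl) are made up for the example, the $pages array stands in for the web so that nothing is downloaded, only absolute links are handled, and the regular expression is a crude substitute for a real HTML parser.

```php
<?php
// HYPOTHETICAL SKETCH OF A CRAWL FRONTIER - $pages STANDS IN FOR THE WEB
error_reporting(E_ALL);

// FIND THE HREF VALUES OF ANCHOR TAGS (CRUDE - A REAL HTML PARSER IS MORE ROBUST)
function extract_links($html)
{
    preg_match_all('/<a\s[^>]*href=["\']([^"\']+)["\']/i', $html, $matches);
    return $matches[1];
}

// VISIT PAGES STARTING FROM THE SEEDS, STORING EACH ONE AND QUEUEING ITS NEW LINKS
// $fetch IS A CALLABLE THAT RETURNS THE PAGE HTML, OR FALSE ON FAILURE
function crawl(array $seeds, $fetch, $max_pages=100)
{
    $queue = $seeds;
    $seen  = array_fill_keys($seeds, TRUE);
    $store = array();

    while ($queue && count($store) < $max_pages)
    {
        $url  = array_shift($queue);
        $html = call_user_func($fetch, $url);
        if ($html === FALSE) continue;

        // KEEP THE PAGE FOR LATER INDEXING
        $store[$url] = $html;

        // ADD EVERY LINK WE HAVE NOT SEEN BEFORE TO THE LIST OF URLS TO VISIT
        foreach (extract_links($html) as $link)
        {
            if (!isset($seen[$link]))
            {
                $seen[$link] = TRUE;
                $queue[]     = $link;
            }
        }
    }
    return $store;
}

// USAGE EXAMPLE - A TINY FAKE WEB OF TWO PAGES THAT LINK TO EACH OTHER
$pages = array(
    'http://a.example/' => '<a href="http://b.example/">b</a>',
    'http://b.example/' => '<a href="http://a.example/">a</a> <a href="http://c.example/">c</a>',
);
$fetch = function ($url) use ($pages) {
    return isset($pages[$url]) ? $pages[$url] : FALSE;
};
$store = crawl(array('http://a.example/'), $fetch);
```

The $seen array is what keeps the loop from visiting the same page twice, and $max_pages bounds the amount of data stored, which is the storage and transfer cost the comment is concerned about.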

There is a need to develop a new search engine technology.
