
Remote searching

homerdoh asked
Last Modified: 2013-12-25
Hi,
There is a PHP search form on a website that searches a MySQL database and displays the results. Is it possible to make a CGI script that could get the results from that site and display them on my site?
E.g. someone comes to my site and searches for "Simpsons". Can a CGI script go to the other site, run the search against its PHP/MySQL back end, and display the results on my page? It would be something like a metasearch engine, but would it be able to get results from PHP? If I can do that, how would I go about it?

Homer

CERTIFIED EXPERT

Commented:
You would need to know the name and password of the remote database - do you have that information?

Author

Commented:
I'm not connecting to the database directly. Wouldn't it be possible to use an LWP script and fetch the results over the web instead?
CERTIFIED EXPERT

Commented:
Ah.  Well, you could do that - you'd need LWP and the HTML:: modules that allow you to parse HTML... of course, there *are* some ethical and potentially legal issues with doing this.

Author

Commented:
Let me worry about the legal stuff :) Anyway, it's just like a metasearch engine but with only one engine.
But is it possible with PHP? How can I get the results to display on my page? If I point my search form at the PHP search engine, the results are displayed on the page specified by their form. Do you think there are scripts like this out there, so I can at least have a look and see what I have to do?
CERTIFIED EXPERT

Commented:
D'oh.  PHP.  I was thinking CGI/Perl.  Sorry.

Author

Commented:
Nooooooo, I meant their search engine is PHP. I want a CGI script that can do that.
CERTIFIED EXPERT
Commented:
Ah!  Ok, yes, it looks like you could do this with the LWP and HTML modules - the O'Reilly & Associates book called 'Perl Cookbook' goes into how to use these modules for various tasks.  I haven't done it myself, but perhaps this will get you started.... all typos are most likely mine :)

Combining a couple of samples, the following unchecked code (or code similar to it) will retrieve content from a URL and strip the HTML from it.  Then you'd parse the information and format it as you'd like.

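use strict;
use warnings;
use LWP::Simple;
use HTML::Parse;       # older interface; HTML::TreeBuilder is its replacement
use HTML::FormatText;

my $URL = 'http://www.thedomain.com/cgi-bin/script';   # placeholder address

# get() returns undef on failure, so test for definedness
my $html_text = get($URL);
die "could not get $URL\n" unless defined $html_text;

# Parse the HTML into a tree, then render the tree as plain text
my $plain_text = HTML::FormatText->new->format(parse_html($html_text));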

... the other half is that you have to 'stuff' their form - you can do this using the LWP::Simple and URI::URL modules, according to the O'Reilly book.  Are they using the POST or GET method on the form?

GET:

use strict;
use warnings;
use LWP::Simple;
use URI::URL;

# Build the URL, then append the form fields as a query string
my $url = url('http://www.thedomain.com/cgi-bin/script');
$url->query_form(module => 'DB_File', readme => 1);
my $content = get($url);


POST:

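use strict;
use warnings;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new();
# POST sends the form fields in the request body instead of the URL
my $req = POST 'http://www.thedomain.com/cgi-bin/script',
    [ module => 'DB_File', readme => 1 ];
# as_string returns the status line and headers along with the body
my $content = $ua->request($req)->as_string;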

Author

Commented:
This is what i got:

#!/usr/bin/perl

use LWP::Simple;

$doc=get("http://www.somewhere.com/search.php?word=BART");

print "Content-Type: text/html\n\n";
print "$doc";

For example, if I put BART into the line, it works fine and brings me all the results for BART. But I don't know how to put new words in its place. I've tried putting something like this in and pointing my form to it:

$doc=get("http://www.somewhere.com/search.php?word=$word")

$word=form{word}


But I get Error 500s. What would be the right syntax to avoid these errors?
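For reference, here is a minimal sketch of one way the form value could be read and passed along. It assumes the CGI module is installed (it is no longer bundled with recent Perl releases) and that the form field is named "word", as in the snippet above. The helper names build_search_url and handle_request are made up for illustration. The two points it shows: the value has to be read before the URL is built, and every statement needs its terminating semicolon.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use LWP::Simple qw(get);
use URI;

# Build the remote search URL for a given search word.
# Host and parameter name are the placeholders used in this thread.
sub build_search_url {
    my ($word) = @_;
    my $uri = URI->new('http://www.somewhere.com/search.php');
    $uri->query_form(word => $word);   # escapes spaces and other special characters
    return $uri->as_string;
}

# Handle one CGI request: read the "word" field, fetch the remote page, relay it.
sub handle_request {
    my $q    = CGI->new;
    my $word = $q->param('word');
    $word = '' unless defined $word;
    my $doc = get(build_search_url($word));
    print "Content-Type: text/html\n\n";
    print defined $doc ? $doc : "Could not fetch results.\n";
}

# Only run the request handler when invoked by a web server as CGI.
handle_request() if $ENV{GATEWAY_INTERFACE};
```

Building the URL through URI's query_form, rather than interpolating $word into a string, means a search phrase containing spaces or ampersands should still produce a valid query string.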



CERTIFIED EXPERT

Commented:
What is in your error_log?

Try doing this test:

#!/usr/bin/perl

use LWP::Simple;

$word = 'BART';
$getphrase = "http://www.somewhere.com/search.php?word=$word";
$doc = get($getphrase);

print "Content-Type: text/html\n\n"
print "$doc";

Does this work?  

If so, try changing it so that

$word = "NOT BART";

(with a space).  If this DOESN'T work, I would suspect that you have to change spaces to %20's.
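If the spaces do turn out to be the problem, a small sketch of doing the conversion with the URI::Escape module (part of the URI distribution, assumed to be installed) rather than by hand:

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

my $word = 'NOT BART';

# uri_escape percent-encodes the space (and any other unsafe characters)
my $getphrase = 'http://www.somewhere.com/search.php?word=' . uri_escape($word);

print "$getphrase\n";   # prints http://www.somewhere.com/search.php?word=NOT%20BART
```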

Author

Commented:
The script didn't work with BART. It said:

syntax error at /home/cgi-bin/test3.cgi line 10, near "print"
Execution of /home/cgi-bin/test3.cgi aborted due to compilation errors.
CERTIFIED EXPERT

Commented:
Ok, weird.  Here's the exact example from the book:

use LWP::Simple;
unless(defined($content = get $URL) ) {
   die "could not get $URL\n";
}

So you *must* be able to use a variable - there has to be a different issue. (to use the above example, you'd have to set $URL to something).
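One likely candidate, judging only from the error text quoted above: the test script's Content-Type print statement has no trailing semicolon, and Perl usually reports a missing semicolon at the statement that follows, which would match "line 10, near print". A small sketch of the two output lines with the semicolon in place, using a placeholder string instead of a real fetch:

```perl
use strict;
use warnings;

# Placeholder standing in for the page that get() would return
my $doc = '<p>placeholder for the fetched page</p>';

# Each statement needs its own terminating semicolon; without the one
# after the header line, perl reports a syntax error near the next "print".
print "Content-Type: text/html\n\n";
print $doc;
```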

Apparently, the get function from LWP::Simple returns undef on error.

Also from the book - to determine the cause of the error, you need to go beyond LWP::Simple.

The book uses a series of modules to find the errors: LWP::UserAgent (which creates a virtual browser), HTTP::Request (which is used to create a request without sending it), and HTTP::Response (the object type returned when the user agent actually runs the request, which you can check for errors and contents).

It suggests checking the documentation for the CPAN module LWP::Simple, the lwpcook(1) manpage that comes with LWP, and the documentation for the modules LWP::UserAgent, HTTP::Request, HTTP::Response, and URI::Heuristic.
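A rough sketch of how those three modules fit together, not taken from the book. The helper names body_or_error and fetch are invented for illustration, and the 10-second timeout is an arbitrary choice. The last two lines run offline against a hand-built response object, just to show what the diagnostic looks like.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

# Turn an HTTP::Response into the page body, or a diagnostic string on failure.
sub body_or_error {
    my ($res) = @_;
    return $res->is_success
        ? $res->decoded_content
        : 'request failed: ' . $res->status_line;
}

# Build the request without sending it, then let the user agent run it.
sub fetch {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new(timeout => 10);
    my $req = HTTP::Request->new(GET => $url);
    return body_or_error($ua->request($req));
}

# Offline demonstration with a hand-built response object
my $fake = HTTP::Response->new(404, 'Not Found');
print body_or_error($fake), "\n";   # prints "request failed: 404 Not Found"
```

In a real script you would call fetch() with the search URL. Unlike LWP::Simple's get, which only returns undef on failure, the status line tells you whether the remote site refused the request, timed out, or could not be found.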
Comment from expert accepted as answer.

Thank you
Computer101
E-E Moderator
