Method Needed: Website Price Comparison Results From Multiple Datasources

Sorry for the novel; I'm a newbie who may have bitten off more than I can chew, and I wanted to capture everything...
I have an ambitious client with contacts in the marketplace who operate websites selling variants of a niche product: circa 100 websites, each with its own product database.
Some of these sites have MySQL databases and most are MS SQL, but there are also sites using Oracle and niche database technologies.
I have been tasked with building a 'price comparison' style website where the visitor can input a search term and receive a combined set of distinct results from the 100 other sites.
The schema and technology behind each of the [source] databases may differ, but the price comparison site will only show three fields.
These are "Price", "Product Title", and "Product ID". In addition, the results page of the price comparison site would need a fourth field, "Site ID", identifying the site from which each price was sourced. Product Title is the only field searched.
The price comparison will be done with the co-operation and permission of the source sites offering the products; however, any development on the other sites to accommodate this would be at our expense, and we're on a budget.
I think that, given the number of databases (other websites) to query, page scraping would be messy. I also get the feeling that some may be unwilling to provide views.
I'm looking for an expert who can tell me the best method to query the third-party data and consolidate it as quickly as possible into a single set of search results on the price comparison site.
Minimum outlay, as simple as possible, and something that can be used across different datasource types. Ideally, once built, it could be re-used/distributed on other datasources/websites using that technology, so the portfolio of websites being queried can be expanded to include even more sites and variants of the niche product. Datafeeds are not an option; for every search performed, the resulting data must be live.
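One way to make the querying layer reusable across datasource types is a small adapter contract that every source site implements, whatever its backend. A minimal sketch in PHP (all names are hypothetical):

<?php
// Hypothetical adapter contract: one implementation per source site,
// regardless of whether the backend is MySQL, MS SQL, Oracle, etc.
interface PriceSourceAdapter
{
    /**
     * Search the source site's catalogue by product title, live, and
     * return rows normalised to the fields the comparison site shows:
     * site_id, product_id, product_title, price.
     */
    public function search(string $term): array;
}

Adding a new site to the portfolio would then mean writing one adapter rather than touching the comparison site itself.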
For the host site displaying the results, the existing sample is PHP, which is also my own specialism. If another technology were cheaper/easier I'd go with it, though, as nothing is set in stone.
I have a feeling I may have to outsource the development of this site's inner workings, but I'm really looking for a clear sense of direction. Thanks, experts.
1bigwink asked:
fridom (CEO/Programmer) commented:
It depends on how long it'll take to "get" all the results from the diverse sites. Then it would be your task to modify the data such that you can sort or filter them. But I'd not care too much about that yet; first get the data and extract what you are interested in. You can just put the data in some hash table, and maybe that is all you need to be "happy"...

Regards
Friedrich
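
A minimal PHP sketch of the hash-table idea, assuming the hypothetical PriceSourceAdapter interface sketched above ($adapters and $term are placeholders):

<?php
// Merge live results from every source site into one hash table,
// keyed by site + product ID so duplicates collapse naturally.
$combined = [];
foreach ($adapters as $siteId => $adapter) {
    foreach ($adapter->search($term) as $row) {
        $combined[$siteId . ':' . $row['product_id']] = $row;
    }
}

// Once merged, sorting or filtering is straightforward, e.g. cheapest first:
uasort($combined, fn ($a, $b) => $a['price'] <=> $b['price']);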
 
fridom (CEO/Programmer) commented:
Well, if you know where the price lists are, you could grab them, e.g. with Hpricot or some other tool for fetching web pages. You could then use XPath queries to extract the desired information.

I doubt there is any pre-fabricated software for that task, so you'd have to write it yourself.
Regards
Friedrich
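
Hpricot is a Ruby library; since the host site is PHP, here is a rough equivalent of the fetch-plus-XPath idea using PHP's built-in DOM extension. The URL and XPath expressions are placeholders, and each source site's markup would need its own.

<?php
// Fetch a source site's results page and pull fields out with XPath.
// URL and XPath expressions are placeholders for illustration only.
$html = file_get_contents('https://example-source-site.test/search?q=widget');

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real-world HTML is rarely well-formed
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
foreach ($xpath->query('//table[@id="products"]//tr') as $tr) {
    $title = trim($xpath->evaluate('string(td[1])', $tr));
    $price = trim($xpath->evaluate('string(td[2])', $tr));
    echo $title, ' => ', $price, PHP_EOL;
}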
 
1bigwink (Author) commented:
Thanks, fridom. How fast would this be? For example, each page I wanted to grab may contain up to 1,000 results, and I'd want results from 25 sites, so potentially 25,000 results. Once returned, is the data in a format that the user can sort or filter? With this many results they'd need to be spread over multiple pages, so where would the data be stored while the user was sorting through it?
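
The thread leaves the storage question open; one common approach (an assumption, not something fridom suggested) is to stage the merged set server-side once per search and then sort/paginate against that copy, rather than re-querying all 25 sites on every page change. A sketch using PHP sessions:

<?php
// Sketch: cache the merged result set once per search, then sort and
// paginate against the cached copy. $combined comes from the merge
// sketch above; all names here are placeholders.
session_start();
$_SESSION['results'] = $combined;

// Later, when the user asks for page N sorted by price:
$page    = max(1, (int) ($_GET['page'] ?? 1));
$perPage = 50;

$rows = array_values($_SESSION['results']);
usort($rows, fn ($a, $b) => $a['price'] <=> $b['price']);
$pageRows = array_slice($rows, ($page - 1) * $perPage, $perPage);

For 25,000 rows a session can get heavy; a temporary database table on the host site would scale better, but the shape of the idea is the same.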
 
1bigwink (Author) commented:
Thanks for the tips