Solved

Method Needed: Website Price Comparison Results From Multiple Datasources

Posted on 2009-04-01
Medium Priority
415 Views
Last Modified: 2013-11-23
Sorry for the novel. I'm a newbie who may have bitten off more than I can chew, and I wanted to capture everything...
I have an ambitious client who has contacts in the marketplace who operate websites selling variants of a niche product: circa 100 websites, each with its own product database.
Some of these sites have MySQL databases, most are MS SQL but there are also sites which use databases such as Oracle and niche database technologies.
I have been tasked with building a 'price comparison' style website where the visitor can input a search term and receive a combined set of distinct results from the 100 other sites.
The schema and technology behind each of the [source] databases may be different but the price comparison site will only show 3 fields.
These are "Price", "Product Title" and "Product ID". In addition, the results page of the price comparison site would need a fourth field, "Site ID", which would identify the site from which that price was sourced. Product title is the only field searched.
The price comparison will be done with co-operation and permission from the source sites offering the products; however, any development on the other sites to accommodate this would be at our expense and we're on a budget.
Given the number of databases (other websites) to query, I think page scraping would be messy. I also get the feeling that some may be unwilling to provide views.
Looking for an answer that tells me the best method to query the third party data and consolidate it as quickly as possible into a single set of search results on the price comparison site.
Minimum outlay, as simple as possible, and something that can be used for different datasource types. Ideally something that once built can be re-used/distributed on other datasources/websites using that technology, so the portfolio of websites being queried can be expanded to include even more sites and variants of the niche product. Datafeeds are not an option, and for every search performed the resulting data must be live.
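For illustration only, here is a rough Python sketch of one way the "build once, reuse per datasource" idea could look: a small adapter per site that maps its own schema onto the four fields above, plus a fan-out that queries all sites in parallel and merges the results. Nothing here comes from the thread itself. The names `Result`, `make_sql_adapter`, `search_all` and `fake_site` are hypothetical, the two "sites" are in-memory SQLite databases standing in for remote MySQL/MS SQL/Oracle servers, and the `?` parameter style is SQLite's (other drivers may use a different placeholder).

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """The four fields the comparison page shows."""
    site_id: str
    product_id: str
    title: str
    price: float


def make_sql_adapter(site_id, connect, query):
    """Wrap one site's schema behind a common search(term) function.

    `connect` opens a DB-API connection; `query` is that site's own SQL and
    must return (product_id, title, price) for one LIKE parameter.
    """
    def search(term):
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute(query, (f"%{term}%",))
            rows = cur.fetchall()
        finally:
            conn.close()
        return [Result(site_id, str(pid), title, float(price))
                for pid, title, price in rows]
    return search


def search_all(adapters, term):
    """Query every site in parallel; merge results, cheapest first."""
    merged = []
    with ThreadPoolExecutor(max_workers=max(1, len(adapters))) as pool:
        futures = [pool.submit(adapter, term) for adapter in adapters]
        for future in futures:
            try:
                merged.extend(future.result())
            except Exception:
                continue  # one broken site should not sink the whole search
    return sorted(merged, key=lambda r: r.price)


def fake_site(create_sql, insert_sql, rows):
    """Stand-in for a remote database: a freshly seeded in-memory SQLite DB."""
    def connect():
        conn = sqlite3.connect(":memory:")
        conn.execute(create_sql)
        conn.executemany(insert_sql, rows)
        return conn
    return connect


# Two hypothetical sites with different schemas and price units.
site_a = make_sql_adapter(
    "site-a",
    fake_site("CREATE TABLE products (id INTEGER, name TEXT, price REAL)",
              "INSERT INTO products VALUES (?, ?, ?)",
              [(1, "Blue Widget", 9.99), (2, "Red Gadget", 4.50)]),
    "SELECT id, name, price FROM products WHERE name LIKE ?",
)
site_b = make_sql_adapter(
    "site-b",
    fake_site("CREATE TABLE items (sku TEXT, title TEXT, cost_pence INTEGER)",
              "INSERT INTO items VALUES (?, ?, ?)",
              [("W-7", "Widget, blue", 899)]),
    "SELECT sku, title, cost_pence / 100.0 FROM items WHERE title LIKE ?",
)

results = search_all([site_a, site_b], "widget")
for r in results:
    print(r.site_id, r.product_id, r.title, r.price)
```

Only the SQL string and the connection function change per site, so adding a site means adding one adapter. In a live setting each adapter would also need its own connection timeout, since the search is only as fast as the slowest site that is waited for.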
For the host site displaying the results the existing sample is PHP, which is also my own specialism. If cheaper/easier I'd go with other technologies though as it's not set in stone.
I have the feeling I may have to outsource the dev of this site's inner workings, but I'm really looking for a clear sense of direction. Thanks experts.
Question by:1bigwink
LVL 24

Expert Comment

by:fridom
ID: 24070297
Well, if you know where the price lists are, you could grab them, e.g. with Hpricot or some other tool for fetching web pages. You could then use XPath queries to get the desired information.

I doubt there is any pre-fabricated software for that task, so you'd have to write it yourself.
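A minimal sketch of the XPath step, in Python rather than Hpricot and using only the standard library. It assumes the page has already been fetched and is well-formed markup. The page layout, class names and the `extract_products` helper are all made up for illustration. Real-world HTML is often not well-formed, in which case a tolerant parser would be needed instead of `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing page; every real site's markup will differ.
PAGE = """
<html><body>
  <table id="results">
    <tr class="product" data-id="A100">
      <td class="title">Blue Widget</td><td class="price">£9.99</td>
    </tr>
    <tr class="product" data-id="A200">
      <td class="title">Widget Deluxe</td><td class="price">£1,024.00</td>
    </tr>
    <tr class="advert"><td>Not a product row</td></tr>
  </table>
</body></html>
"""


def extract_products(page_source, site_id):
    """Pull (site, id, title, price) out of one listing page."""
    root = ET.fromstring(page_source.strip())
    products = []
    # ElementTree supports a subset of XPath, including attribute tests.
    for row in root.findall(".//tr[@class='product']"):
        title = row.findtext("td[@class='title']", default="").strip()
        price_text = row.findtext("td[@class='price']", default="").strip()
        if not title or not price_text:
            continue  # skip rows that do not look like products
        price = float(price_text.lstrip("£$€").replace(",", ""))
        products.append({
            "site_id": site_id,
            "product_id": row.get("data-id"),
            "title": title,
            "price": price,
        })
    return products


products = extract_products(PAGE, "site-a")
for p in products:
    print(p)
```

Each site would need its own set of path expressions, which is where most of the per-site effort would go.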
Regards
Friedrich
 

Author Comment

by:1bigwink
ID: 24133935
Thanks Fridom. How fast would this be? For example, each page I wanted to grep may contain up to 1000 results. I'd want results from 25 sites, so potentially 25000 results. Once returned, is the data in a format that the user can sort or filter? With this many results they'd need to be on multiple pages, so where would the data be stored while the user was sorting through it?
 
LVL 24

Accepted Solution

by:
fridom earned 2000 total points
ID: 24154766
It depends on how long it'll take to "get" all the results from the diverse sites. Then it would be your task to modify the data such that you can sort or filter it. But I wouldn't worry too much about that yet. First get the data and extract what you are interested in; you can just put the data in some hash table, and maybe that is all you need to be "happy"....
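To make the hash table idea concrete, here is a small Python sketch. It is only one possible reading of the suggestion. The row layout, the `index_results` and `page_of` helpers, and the choice of `(site_id, product_id)` as the key are assumptions.

```python
def index_results(rows):
    """Store rows in a dict keyed by (site_id, product_id).

    A repeated key overwrites the earlier row, so the combined
    result set stays distinct.
    """
    table = {}
    for row in rows:
        table[(row["site_id"], row["product_id"])] = row
    return table


def page_of(table, page, per_page=2, max_price=None):
    """Filter by price, sort cheapest first, and return one page."""
    rows = [r for r in table.values()
            if max_price is None or r["price"] <= max_price]
    rows.sort(key=lambda r: (r["price"], r["title"]))
    start = (page - 1) * per_page
    return rows[start:start + per_page]


# Made-up sample rows, including one duplicate.
rows = [
    {"site_id": "site-a", "product_id": "A100", "title": "Blue Widget", "price": 9.99},
    {"site_id": "site-b", "product_id": "W-7", "title": "Widget, blue", "price": 8.99},
    {"site_id": "site-c", "product_id": "55", "title": "Widget Deluxe", "price": 24.00},
    {"site_id": "site-a", "product_id": "A100", "title": "Blue Widget", "price": 9.99},
]

table = index_results(rows)
first_page = page_of(table, page=1)
second_page = page_of(table, page=2)
cheap_only = page_of(table, page=1, max_price=9.00)
print(len(table), [r["product_id"] for r in first_page])
```

Between page requests the table would have to live somewhere on the server, for instance in the user's session or a short-lived cache keyed by the search term. That choice is not settled in this thread.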

Regards
Friedrich
 

Author Closing Comment

by:1bigwink
ID: 31565570
Thanks for the tips
