Solved

Method Needed: Website Price Comparison Results From Multiple Datasources

Posted on 2009-04-01
Last Modified: 2013-11-23
Sorry for the novel. I'm a newbie who may have bitten off more than I can chew, and I wanted to capture everything...
I have an ambitious client who has contacts in the marketplace who operate websites selling variants of a niche product: circa 100 websites, each with its own product database.
Some of these sites have MySQL databases, most are MS SQL, but there are also sites that use Oracle and niche database technologies.
I have been tasked with building a 'price comparison' style website where the visitor can input a search term and receive a combined set of distinct results from the 100 other sites.
The schema and technology behind each of the [source] databases may be different but the price comparison site will only show 3 fields.
These are "Price", "Product Title" and "Product ID". In addition, the results page of the price comparison site would need a fourth field, "Site ID", which would identify the site from which that price was sourced. Product title is the only field searched.
The price comparison will be done with co-operation and permission from the source sites offering the products; however, any development on the other sites to accommodate this would be at our expense and we're on a budget.
I think that, given the number of databases (other websites) to query, page scraping would be messy. I also get the feeling that some may be unwilling to provide views.
Looking for an answer that tells me the best method to query the third party data and consolidate it as quickly as possible into a single set of search results on the price comparison site.
Minimum outlay, as simple as possible, and something that can be used for different datasource types. Ideally something that, once built, can be re-used/distributed on other datasources/websites using that technology, so the portfolio of websites being queried can be expanded to include even more sites and variants of the niche product. Datafeeds are not an option, and for every search performed the resulting data must be live.
For the host site displaying the results the existing sample is PHP, which is also my own specialism. If cheaper/easier I'd go with other technologies though as it's not set in stone.
I have the feeling I may have to outsource the dev of this site's inner workings, but I'm really looking for a clear sense of direction. Thanks experts.
Question by:1bigwink
4 Comments
 
LVL 24

Expert Comment

by:fridom
ID: 24070297
Well, if you know where the price lists are, you could grep them, e.g. with Hpricot or some other tool for fetching web pages. You could then use XPath queries to get the desired information.
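
As a rough illustration of the XPath idea, here is a minimal Python sketch using only the standard library. The sample markup, the class names (product, id, title, price) and the function name extract_products are all made up for the example; a real site's pages will be laid out differently and will usually need a lenient HTML parser, since ElementTree only accepts well-formed markup and only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample of one site's results page.
SAMPLE_PAGE = """
<html><body>
  <table id="results">
    <tr class="product">
      <td class="id">A100</td><td class="title">Blue Widget</td><td class="price">19.99</td>
    </tr>
    <tr class="product">
      <td class="id">A101</td><td class="title">Red Widget</td><td class="price">24.50</td>
    </tr>
  </table>
</body></html>
"""


def extract_products(markup, site_id):
    """Pull Product ID, Product Title and Price out of one page."""
    root = ET.fromstring(markup.strip())
    products = []
    # ElementTree understands a limited XPath subset, including
    # descendant search (.//) and attribute predicates ([@class='...']).
    for row in root.findall(".//tr[@class='product']"):
        products.append({
            "site_id": site_id,
            "product_id": row.findtext("td[@class='id']"),
            "title": row.findtext("td[@class='title']"),
            "price": float(row.findtext("td[@class='price']")),
        })
    return products


if __name__ == "__main__":
    for product in extract_products(SAMPLE_PAGE, "site-1"):
        print(product)
```

Each site would need its own set of XPath expressions, so in practice the expressions are probably best kept as per-site configuration rather than hard-coded.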

I doubt there is any pre-fabricated software for that task, so you'd have to write it yourself.
Regards
Friedrich
 

Author Comment

by:1bigwink
ID: 24133935
Thanks Fridom. How fast would this be? For example, each page I wanted to grep may contain up to 1000 results. I'd want results from 25 sites, so potentially 25000 results. Once returned, is the data in a format that the user can sort or filter? With this many results they'd need to be on multiple pages, so where would the data be stored while the user was sorting through it?
 
LVL 24

Accepted Solution

by:
fridom earned 500 total points
ID: 24154766
It depends on how long it takes to "get" all the results from the various sites. Then it would be your task to modify the data so that you can sort or filter it. But I wouldn't worry too much about that yet. First get the data and extract what you are interested in; you can just put the data in some hash table, and maybe that is all you need to be "happy"...
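
To make the hash table suggestion concrete, here is a small Python sketch under stated assumptions. The two fetch functions are hypothetical stand-ins that return canned rows instead of contacting a real site, and the names FETCHERS, gather and page_of are invented for the example. The sites are queried in parallel so the total wait is closer to the slowest site than to the sum of all of them, and the rows go into a dict keyed by (site ID, product ID).

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the per-site "fetch and extract" step.
def fetch_site_a(term):
    return [
        {"product_id": "A100", "title": "Blue Widget", "price": 19.99},
        {"product_id": "A101", "title": "Red Widget", "price": 24.50},
    ]


def fetch_site_b(term):
    return [
        {"product_id": "77", "title": "Blue Widget", "price": 18.75},
        {"product_id": "78", "title": "Green Gadget", "price": 5.00},
    ]


FETCHERS = {"site-a": fetch_site_a, "site-b": fetch_site_b}


def gather(term):
    """Query every site concurrently and merge matching rows into one dict."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(FETCHERS)) as pool:
        futures = {site: pool.submit(fn, term) for site, fn in FETCHERS.items()}
        for site, future in futures.items():
            for row in future.result():
                # Product title is the only field searched.
                if term.lower() in row["title"].lower():
                    results[(site, row["product_id"])] = dict(row, site_id=site)
    return results


def page_of(results, page, per_page=50):
    """Return one page of results, cheapest first (pages start at 1)."""
    ordered = sorted(results.values(), key=lambda r: r["price"])
    start = (page - 1) * per_page
    return ordered[start:start + per_page]


if __name__ == "__main__":
    table = gather("widget")
    for row in page_of(table, page=1, per_page=2):
        print(row["site_id"], row["title"], row["price"])
```

One possible answer to where the data lives while the user pages through it is to keep this dict in the user's session or a short-lived cache keyed by the search term, although that is a design choice rather than something this sketch settles.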

Regards
Friedrich
 

Author Closing Comment

by:1bigwink
ID: 31565570
Thanks for the tips