Solved

Pulling product data from ANY website

Posted on 2014-04-03
Last Modified: 2014-04-22
A client has approached me and wants to create a tool similar to Amazon's wish list.

Basically, you browse to a website page, add the URL to the tool, and the tool scrapes the website to extract the product name, image, price etc.

Scraping itself isn't the issue. However, the client wants the tool to pull product data from almost any website, and I'm struggling to figure out how to do that when each website structures its HTML differently.

I'm guessing Amazon's wish list uses some kind of AI that's been trained to accurately determine which data on the page it needs.

I think this is a huge task for a single developer, but I wondered whether I'm overthinking it and whether a solution is already available.
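
Editor's note: nobody in this thread proposes the following, but one partial way around per-site HTML differences is to read the structured metadata that many shops embed for search engines and social sharing, namely schema.org JSON-LD and Open Graph tags. The sketch below assumes a page publishes a JSON-LD `Product` block or `og:`/`product:` meta tags. The names `ProductMetadataParser` and `extract_product` are made up for illustration, and pages that publish neither format will return empty fields.

```python
import json
from html.parser import HTMLParser


class ProductMetadataParser(HTMLParser):
    """Collects Open Graph <meta> tags and JSON-LD <script> blocks."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.json_ld = []
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            prop = attrs.get("property") or ""
            content = attrs.get("content")
            if prop.startswith(("og:", "product:")) and content is not None:
                self.open_graph[prop] = content
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except ValueError:
                pass  # skip malformed JSON-LD rather than fail the whole page


def extract_product(html):
    """Return name/image/price/currency, preferring JSON-LD over Open Graph."""
    parser = ProductMetadataParser()
    parser.feed(html)
    product = {"name": None, "image": None, "price": None, "currency": None}

    for block in parser.json_ld:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if not isinstance(item, dict) or item.get("@type") != "Product":
                continue
            offers = item.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            if not isinstance(offers, dict):
                offers = {}
            image = item.get("image")
            if isinstance(image, list):
                image = image[0] if image else None
            product["name"] = item.get("name")
            product["image"] = image
            product["price"] = offers.get("price")
            product["currency"] = offers.get("priceCurrency")

    # Fall back to Open Graph tags for anything JSON-LD did not supply.
    og = parser.open_graph
    product["name"] = product["name"] or og.get("og:title")
    product["image"] = product["image"] or og.get("og:image")
    product["price"] = product["price"] or og.get("product:price:amount")
    product["currency"] = product["currency"] or og.get("product:price:currency")
    return product
```

This only parses HTML that has already been downloaded. It does not help with sites that omit structured metadata or that fill in product details with JavaScript after the page loads.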
Question by:SheppardDigital
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 39975223
Amazon's wish list is a huge task even for Amazon's hundreds of developers.

Try this Google search: https://www.google.com/?q=wish+list+maker

One problem you may encounter (and it will become more prevalent over time) is that many sites no longer publish product information in their HTML. Instead, they serve a placeholder and use jQuery/AJAX to load the information directly into the DOM. The reason for doing this is to prevent bots from scraping data out of their HTML.
 
LVL 54

Expert Comment

by:Scott Fell, EE MVE
ID: 39975315
Some easy options:

You could use Pinterest's APIs and capture people's pins by having your app send them to a special user's pinboard called "wish list". Then you can grab that wish list as XML via the user ID and board ID.

Alternatively, limit the tool to whatever is available through a single API, such as Amazon's (http://docs.aws.amazon.com/AWSECommerceService/latest/DG/CHAP_FindingItemstoBuy.html), and build your app around that.
 

Accepted Solution

by:SheppardDigital
ID: 40008364
Hi, it seems my client has found a third-party API: you provide it with a URL and it returns the product information for you.

The service found was http://www.diffbot.com/
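
Editor's note: the "give it a URL, get product data back" workflow described above could look roughly like the sketch below. The endpoint path and the `token` and `url` query parameters are assumptions based on Diffbot's public v3 Product API and should be checked against the current documentation before use. `build_request_url` and `fetch_product` are hypothetical helper names, and the token is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the service's current documentation.
DIFFBOT_PRODUCT_ENDPOINT = "https://api.diffbot.com/v3/product"


def build_request_url(token, page_url):
    """Build the API request URL, percent-encoding the product page URL."""
    query = urlencode({"token": token, "url": page_url})
    return DIFFBOT_PRODUCT_ENDPOINT + "?" + query


def fetch_product(token, page_url, timeout=30):
    """Call the API and return the decoded JSON response as a dict."""
    with urlopen(build_request_url(token, page_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `fetch_product("YOUR_TOKEN", "http://example.com/product/123")` needs a valid account token and network access. The shape of the returned JSON is defined by the service and is not shown here.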
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 40008566
If your client is happy, we're happy.  But I would test that service very, very carefully (and check the terms of use) before I relied on it for more than hobby applications.
 

Author Closing Comment

by:SheppardDigital
ID: 40014355
This was the most suitable answer.