Pulling product data from ANY website
Posted on 2014-04-03
A client has approached me wanting a tool similar to Amazon's wish list.
Basically, you browse to a product page, add its URL to the tool, and the tool scrapes the page to extract the product name, image, price, etc.
Scraping itself isn't really the issue. The client wants the tool to pull product data from almost any website, and I'm struggling to see how to do that when every website structures its HTML differently.
I'm guessing Amazon's wish list uses some kind of AI that's been trained to work out which data on the page it needs.
This seems like a huge task for a single developer, but I wondered whether I'm overthinking it and whether a solution already exists.
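To make the "any website" requirement concrete, here is a minimal sketch of one site-independent starting point. It assumes the page exposes Open Graph meta tags (`og:title`, `og:image`, `product:price:amount`), which many shops publish but plenty don't, so it is not a general solution and certainly not how Amazon does it. The names `OG_FIELDS`, `OpenGraphParser` and `extract_product` are made up for illustration, and the sample HTML is invented.

```python
from html.parser import HTMLParser

# Hypothetical mapping from Open Graph property names to our own field names.
OG_FIELDS = {
    "og:title": "name",
    "og:image": "image",
    "product:price:amount": "price",
    "product:price:currency": "currency",
}


class OpenGraphParser(HTMLParser):
    """Collects selected <meta property="..." content="..."> values."""

    def __init__(self):
        super().__init__()
        self.product = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        field = OG_FIELDS.get(attrs.get("property"))
        content = attrs.get("content")
        # Keep only the first value seen for each field.
        if field and content and field not in self.product:
            self.product[field] = content


def extract_product(html):
    """Return whichever product fields the page declares; may be empty."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.product


# Invented sample page, standing in for HTML fetched from a product URL.
sample = """<html><head>
<meta property="og:title" content="Blue Widget">
<meta property="og:image" content="https://example.com/widget.jpg">
<meta property="product:price:amount" content="19.99">
<meta property="product:price:currency" content="GBP">
</head><body></body></html>"""

print(extract_product(sample))
```

Pages without these tags simply produce an empty dict, so the hard part of the question still stands: what to fall back on when a site declares nothing machine-readable.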