Recommendations for web scraping based on certain criteria?
Posted on 2011-10-29
Well, experts, I have a bit of a challenge ...
I'm unfamiliar with what is available in the data collection/web scraping arena (either free or chargeable). Bottom line, this is the type of information I need to get:
Photography-related websites or blogs (NOT photographers) which meet certain SEO criteria (a high volume of visitor traffic would be a good example). I know there are numerous ways to gauge traffic (Alexa rank, back links, etc.). The ideal information, although I have no idea how it could be obtained, would be the number of visitors (monthly, annually, etc.). The other critical piece of information is an e-mail address by which I could contact each website or blog (typically found on most websites under one or more of the following categories: support, information, contact, etc.).
The ultimate goal is to assemble a list of at least several hundred websites that meet the criteria (several thousand would be better, if that is realistic). I guess the minimum information for each would be: URL, brief website description, some indication of traffic rank, and e-mail address. The other criteria are harder to define for purposes of this post, and since I'm just trying to get a handle on this whole web scraping/data collection area, I don't want to muddy the waters with difficult-to-understand selection criteria.
I've done numerous searches, none of which turned up anything close to what I need. My hope is that someone at EE is aware of an online or standalone software package which could supply most or all of what I need. Alternatively, I suppose I could purchase an e-mail list; however, I have never done that, so I don't know where to start.
I do not program, so any solution involving that would not work in my case. I also don't have the time or money to have custom programs developed to accomplish this (unless what it would actually take is much less than I imagine).
I can't help but believe that somewhere, someone has developed this type of software tool, but I have no idea where to even start looking. Any suggestions or guidance would be greatly appreciated. Thank you.
If anyone can suggest a better zone for this question, please let me know, because I don't understand what half of the zones mean anyway.