Solved

What tool or utility can speed up collecting text and pictures (jokes, sayings, clipart) returned by website queries?

Posted on 2014-01-26
5
161 Views
Last Modified: 2014-03-03
I need to collect a great many jokes, sayings, quotes, clipart, etc. related to specific subjects. Is there any software, utility, robot or similar tool that will help collect or harvest this text and these picture files, and allow them to be stored and categorized in MS Excel or a similar application?
Question by:Dov_B
5 Comments
 
LVL 53

Assisted Solution

by:Scott Fell, EE MVE
Scott Fell,  EE MVE earned 350 total points
ID: 39811247
You would need to start with a manual search for a site you like. From there you can download it using HTTrack (http://www.httrack.com/), but please be aware of how NOT to use it (http://www.httrack.com/html/abuse.html), including:

Are the pages copyrighted?
Can you copy them only for private purpose?
Do not make online mirrors unless you are authorized to do so
Do not steal private information
Do not grab emails
Do not grab private information
 
LVL 27

Accepted Solution

by:MacroShadow
MacroShadow earned 150 total points
ID: 39811465
I don't know of any such utility and it would seem that neither do any of EE's experts.

Using VBA you can get the HTML of a website and TRY to parse it properly to separate out the jokes etc., but it is probably more work than collecting them manually.
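
To give a feel for what "get the HTML and try to parse it" involves, here is a minimal sketch in Python rather than VBA (a swapped-in language, not what this answer proposes). It assumes, purely for illustration, that each joke sits in an element with the class "joke"; the class name, the function names and the sample markup are all hypothetical, and every real site will need its own rules.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.items = []    # finished text snippets
        self._depth = 0    # greater than 0 while inside a matching element
        self._buffer = []  # text pieces of the snippet being collected

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested tag inside a match
        elif self.target_class in classes:
            self._depth = 1             # entering a new match
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:            # leaving the matching element
            text = " ".join("".join(self._buffer).split())
            if text:
                self.items.append(text)

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_by_class(html, target_class):
    """Return the text of every element whose class list has target_class."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.items


def fetch_html(url):
    """Download a page; only use this on sites whose terms allow it."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical markup; a real page will be structured differently.
    sample = ('<div><p class="joke">First joke.</p>'
              '<p class="ad">Buy now</p>'
              '<p class="joke">Second <b>joke</b>.</p></div>')
    for item in extract_by_class(sample, "joke"):
        print(item)
```

Even in this small form, the extraction rule is tied to one site's markup, which is why the parsing step tends to cost more effort than it first appears.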
 
LVL 53

Assisted Solution

by:Scott Fell, EE MVE
Scott Fell,  EE MVE earned 350 total points
ID: 39811589
The tool I suggested would be the easiest way I can think of. You don't need any special coding skills or data repository. Collecting manually is going to be the easiest approach and will help you weed out what is copyrighted.

The only other option would be an automatic search. Search APIs from Google or Bing are not meant for screen scraping, and therefore your option is to create your own search logs. There are services like 80legs (http://80legs.com/) that will do the crawl work for you. You will still need to program how to find jokes and extract only the joke content. This is not trivial in either money or time.

Manual searching for what you want will lead you to the sources you need. For instance, my first Google result for "wc fields quotes" is http://www.brainyquote.com/quotes/authors/w/w_c_fields.html. However, read their TOS (http://www.brainyquote.com/inquire/terms.html), which says:
"In other words, by accepting this Agreement, you can use our stuff for legitimate academic, research, and reporting projects, but you can't use it to just copy and paste a bunch of our stuff on your own website. That hurts our search engine rankings, not to mention our feelings. We'd also point out that we don't pay for anything you submit to us via our submission form or suggestion email inbox simply because you provide it of your own volition. By submitting material to us, you acknowledge that you have the right to do so, and that you completely transfer to us any rights you might have had in the submission."


Good luck on your project.
 

Author Comment

by:Dov_B
ID: 39811598
Super cool Hashgocha Protis! Interestingly, after Googling forever, I suddenly got an email asking me to make a spreadsheet to help automate a bikur cholim effort. As I began working on the bikur cholim project, lo and behold, a link showing how to use MS Excel to get data from a web page showed up! It worked like a dream! (Link title: "Access web data from Excel".)
 

Author Comment

by:Dov_B
ID: 39811611
I very much appreciate your emphasis on respecting the hard work and rights of other people. I do not put any jokes on my own website. I am a teacher and public speaker and spend a great deal of time looking for interesting things to keep my listeners awake while I lecture. The riddles, quotes, etc. are kept for easy access in my own Excel spreadsheet on my personal hard drive.
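
For anyone who wants to keep a personal collection like this in a form Excel can open, the sketch below (in Python) shows one possible layout: a CSV file with one row per item, sorted by category. The column names, the function names and the sample rows are assumptions made for illustration, not anything prescribed in this thread.

```python
import csv
import io

# Hypothetical column layout; rename or extend to suit your own collection.
FIELDS = ["category", "kind", "text", "source"]


def write_collection(rows, stream):
    """Write the items as CSV, grouped by category and then by kind."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for row in sorted(rows, key=lambda r: (r["category"], r["kind"])):
        writer.writerow(row)


def save_collection(rows, path):
    """Save to a file; utf-8-sig adds a byte-order mark that helps Excel
    detect the encoding when the file is opened directly."""
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        write_collection(rows, handle)


if __name__ == "__main__":
    # Made-up placeholder rows, only to show the shape of the data.
    items = [
        {"category": "teaching", "kind": "quote",
         "text": "A placeholder quote, with a comma", "source": "my notes"},
        {"category": "animals", "kind": "riddle",
         "text": "A placeholder riddle", "source": "my notes"},
    ]
    preview = io.StringIO()
    write_collection(items, preview)
    print(preview.getvalue())
```

Keeping a "source" column alongside each item makes it easier to go back later and check the terms under which it was published.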
