Solved

Web tool for scraping a site

Posted on 2014-11-21
Last Modified: 2015-01-02
The browser print functions do not produce a condensed, compact document.
I am seeking a web tool where I can enter a URL and get back all the text and pictures from all pages on that site.
The text and pictures would be used for printing and for easy reading when away from devices.
I would also put the text into a Word document for further analysis and processing.
The tool should strip out (filter) all HTML and extraneous spacing.
For instance, I am investigating a company, and instead of plowing through all of its pages of fancy graphics, it would be nice to have just the facts from the website's pages.
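
To make the filtering step concrete, here is a minimal sketch of what "strip out all HTML and extraneous spacing" could look like for a single page, using only the Python standard library. The names `TextAndImages` and `extract` are made up for this illustration, and the list of block-level tags is an assumption; this is not how any of the tools mentioned in this thread work internally.

```python
import re
from html.parser import HTMLParser


class TextAndImages(HTMLParser):
    """Collect visible text and image URLs from one HTML page (illustrative)."""

    SKIP = {"script", "style", "noscript"}          # content that is never visible text
    BLOCK = {"p", "div", "br", "li", "tr", "td", "th",
             "h1", "h2", "h3", "h4", "h5", "h6"}    # assumed set of block-level tags

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.images = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)
        elif tag in self.BLOCK:
            self.chunks.append(" ")                 # keep neighbouring blocks apart

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.chunks.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract(html):
    """Return (text, image_urls) with tags removed and whitespace collapsed."""
    parser = TextAndImages()
    parser.feed(html)
    parser.close()
    text = re.sub(r"\s+", " ", "".join(parser.chunks)).strip()
    return text, parser.images
```

Calling `extract(page_html)` on each saved page would give one compact line of text per page plus a list of image addresses, which could then be pasted into a document for printing.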
Question by:AndyPandy

Accepted Solution

by: selvol (earned 250 total points)
ID: 40458911
I have used Offline Explorer for over a decade.
Excellent program, and the best customer support I have encountered.

Follow this example after installing Offline Explorer (it has a 30-day unrestricted trial).

Example
http://www.experts-exchange.com/Other/New_Net_Users/Q_24820024.html?sfQueryTermInfo=1+10+30+explor+offlin+selvol

In the first picture, there is a menu on the left side that I have marked in red.
Uncheck the file types you do not want to save.

In your case, leave images checked and, under "User defined" on the same menu,
add the extensions for the files you want to save.

Experiment a bit; I'll try to get to any questions you have.

Download OE
http://www.metaproducts.com/OEPR.html



SElvol

Assisted Solution

by: Eirman (earned 250 total points)
ID: 40459392
HTTrack is a well-established open-source website scraping tool, ideal for offline browsing.
You choose the depth (levels) that you want to scrape and store, and whether you want to store graphics, external links, etc.

http://www.httrack.com/
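
For anyone wondering what "depth (levels)" means in practice, the idea can be sketched as a breadth-first crawl that stops following links after a set number of hops from the start page. This is only an illustration of the concept in Python, not HTTrack's actual implementation. The names `LinkCollector` and `crawl` are hypothetical, and `fetch` is an assumed callable you supply that takes a URL and returns the page's HTML (or `None` if it cannot be loaded).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Gather the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl of same-host pages, at most max_depth links from start_url.

    fetch: assumed callable, URL -> HTML string, or None on failure.
    Returns a dict mapping each visited URL to its HTML.
    """
    host = urlparse(start_url).netloc
    pages = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        if depth == max_depth:
            continue                      # deepest level: store it, follow nothing
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop "#fragment" parts.
            target = urldefrag(urljoin(url, href)).url
            # Stay on the starting host; skip external links and repeats.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return pages
```

With `max_depth=0` only the start page is stored; each extra level adds the pages linked from the previous level. A real `fetch` would need to download the page over the network and should respect the site's terms and robots.txt.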