Solved

Web tool for scraping a site

Posted on 2014-11-21
Last Modified: 2015-01-02
The browser's print function does not produce a condensed, compact document.
I am looking for a web tool where I enter a URL and get back all the text and pictures from all pages on that site.
The text and pictures would be used for printing and for easy reading when away from devices.
I would also put the text into a Word document for further analysis and processing.
The tool should strip out (filter) all HTML and extraneous spacing.
For instance, I am investigating a company, and instead of plowing through all of its pages of fancy graphics, it would be nice to have just the facts from the website's pages.
Question by:AndyPandy
4 Comments
 

Accepted Solution

by:
selvol earned 250 total points
ID: 40458911
I have used Offline Explorer for more than a decade.
It is an excellent program, with the best customer support I have encountered.

Follow this example after installing Offline Explorer (it has a 30-day unrestricted trial).

Example
http://www.experts-exchange.com/Other/New_Net_Users/Q_24820024.html?sfQueryTermInfo=1+10+30+explor+offlin+selvol

In the first picture there is a menu on the left side, which I have marked in red.
Uncheck the file types you do not want to save.

In your case, leave images checked and, under "User defined" on the same menu,
add the extensions of the files you want to save.

Experiment a bit; I'll try to get to any questions you have.

Download OE
http://www.metaproducts.com/OEPR.html



SElvol

Assisted Solution

by:Eirman
Eirman earned 250 total points
ID: 40459392
HTTrack is a well-established open-source website scraping tool, ideal for offline browsing.
You choose the depth (number of link levels) that you want to scrape and store, and whether to store graphics, external links, etc.

http://www.httrack.com/
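
Neither tool above is claimed to produce the "text only, no HTML, no extra spacing" output the question asks for, so the saved pages may still need a clean-up pass. The following is a minimal sketch of that step using only the Python standard library; it is not part of HTTrack or Offline Explorer. The names `TextExtractor`, `html_to_text`, and `mirror_to_text`, and the idea that the mirror is a folder of `.htm`/`.html` files, are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect text content, skipping anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags from one page and collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def mirror_to_text(folder):
    """Join the text of every saved .htm/.html file under a (hypothetical) mirror folder."""
    pages = sorted(Path(folder).rglob("*.htm*"))
    return "\n\n".join(
        html_to_text(p.read_text(encoding="utf-8", errors="replace")) for p in pages
    )
```

Calling `mirror_to_text` on the folder where the site was saved returns one plain-text string that can be pasted into a Word document. Pictures are not handled here; they would stay as the image files the download tool saved.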