Solved

wget saving web page help

Posted on 2011-03-23
Last Modified: 2016-05-10
Hello,

I need a bit of help with wget.

When a user submits a URL, I want to use wget to create an archive/backup of that specific page, including everything the page needs to render: CSS, images, JS, etc.

I have the following code, and it does about 90% of what I need.

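// escapeshellarg() keeps a user-submitted URL from injecting extra shell commands
exec("wget -e robots=off --limit-rate=250k -F -P " . escapeshellarg("/home/USERNAME/public_html/results/" . $rnd1 . "/" . $rnd2 . "/") . " -p -k -E " . escapeshellarg($site_url));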


The problem shows up when a user submits a URL like this:

http://techcrunch.com/2011/03/22/digital-textbook-startup-inkling-nabs-multi-million-dollar-investment-from-mcgraw-hill-and-pearson/

The backup will be structured this way:

[ techcrunch.com - Folder ] / [ 2011 - Folder ] / [ 03 - Folder ] / [ 22 - Folder ] / [ digital-textbook-startup-inkling-nabs-multi-million-dollar-investment-from-mcgraw-hill-and-pearson - Folder ]

Techcrunch File
The saved HTML still loads all of its images from the live site (techcrunch.com) instead of from the backup.

However if the user submits a URL like this:

http://blog.joerogan.net/archives/2889

The backup contains all the images, CSS, etc.

Joerogan File


I hope this makes sense.  If not I will try to clarify.
Question by:jambla
Accepted Solution

by:
absx earned 2000 total points
ID: 35198540
Hi,

There are just too many options in wget to get the command right by hand. I would suggest experimenting with a tool like wgetGUI (http://www.jensroesner.de/wgetgui/) until you have a set of options that does exactly what you need, and then using those arguments in the script.
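
As one possible starting point for that experimentation, here is a minimal sketch of a wrapper around the command from the question. It assumes (the thread does not confirm this) that the missing TechCrunch images are served from a different host than the page itself, which wget skips unless -H (--span-hosts) is given. The names archive_page and ARCHIVE_DRY_RUN are made up for illustration. -F is left out because --force-html only affects input files read with -i.

```shell
#!/bin/sh
# Sketch only: archive_page and ARCHIVE_DRY_RUN are hypothetical names,
# not part of wget or of the original post.
archive_page() {
    dest_dir=$1   # directory to save into (what -P pointed at in the question)
    url=$2        # page submitted by the user

    # -p  fetch page requisites (images, CSS, JS)
    # -k  rewrite links in the saved HTML to point at the local copies
    # -E  add .html/.css extensions where the URL lacks them
    # -H  let requisites come from other hosts (the assumed missing piece)
    set -- wget -e robots=off --limit-rate=250k -p -k -E -H -P "$dest_dir" "$url"

    if [ "${ARCHIVE_DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"   # print the command without touching the network
    else
        "$@"
    fi
}

# Example (dry run):
#   ARCHIVE_DRY_RUN=1; archive_page /tmp/backup "http://blog.joerogan.net/archives/2889"
```

If the dry-run output looks right, the same flags could be dropped into the exec() string in the PHP script, ideally with escapeshellarg() around the user-supplied URL. Without -r, -H should only pull in the requisites of that one page rather than crawling the other hosts.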

Author Comment

by:jambla
ID: 35202147
Hello absx,

Thanks for the link, I will have a look to see if it can help me out.


Does anyone else have any suggestions?