Solved

wget overwriting files

Posted on 2011-02-26
Last Modified: 2012-05-11
I am issuing a wget statement:

wget -c -np -r -N -l inf -w 5 --limit-rate=200K http://www.mysite.com/files

The problem is that it's wasting time downloading files that it already has. If a local file matches the remote one, I don't want to spend the time downloading it again.
Question by:hrolsons
LVL 68

Expert Comment

by:woolmilkporc
ID: 34989146
Hi,

that's a case for timestamping.

Use "-S" for the first download ("preserve timestamps") and "-N" ("newer") for subsequent downloads.

With "-N" ...

... Wget will ask the server for the last-modified date. If the local file has the same timestamp as the server, or a newer one, the remote file will not be re-fetched. However, if the remote file is more recent, Wget will proceed to fetch it.

The above quote is from here:
http://www.gnu.org/software/wget/manual/wget.html#Time_002dStamping-Usage

Good luck!

wmp
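
The rule quoted above can be sketched as a small shell function. This is only an illustration of the decision, not wget's actual code: the name should_fetch and the use of epoch seconds as arguments are assumptions made for the example.

```shell
# Hypothetical helper: decide whether a remote file should be fetched.
# $1 = local modification time in seconds since the epoch
#      (empty string if there is no local copy yet)
# $2 = remote modification time in seconds since the epoch
# Returns 0 (fetch) or 1 (skip).
should_fetch() {
    local_mtime=$1
    remote_mtime=$2
    if [ -z "$local_mtime" ]; then
        return 0    # no local copy: fetch
    fi
    if [ "$remote_mtime" -gt "$local_mtime" ]; then
        return 0    # remote file is more recent: fetch
    fi
    return 1        # local copy has the same timestamp or a newer one: skip
}

# Example: the local copy is older than the remote one, so it is fetched.
if should_fetch 1298678400 1298764800; then
    echo "fetch"
else
    echo "skip"
fi
```

If the server sends no last-modified date at all, this comparison cannot be made, which is one reason files may be downloaded again.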
 

Author Comment

by:hrolsons
ID: 34989201
So, I would change it to:

wget -c -np -r -S -l inf -w 5 --limit-rate=200K http://www.mysite.com/files

for the first, and then:

wget -c -np -r -N -l inf -w 5 --limit-rate=200K http://www.mysite.com/files

Bummer that I have to start this huge download all over again.
 
LVL 12

Expert Comment

by:mccracky
ID: 34997534
I don't think so.  From the wget man page:

-N
--timestamping
           Turn on time-stamping.
-S
--server-response
           Print the headers sent by HTTP servers and responses sent by FTP
           servers.

The "-N" already does the timestamping.

What filesystem is on the disk that you are downloading to? Can it preserve the permissions and timestamps of the original system? Also, would rsync be an option?
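
Since "-S" prints the server's headers, it can be used to check whether the server sends a Last-Modified date at all. A minimal sketch, assuming a POSIX shell with grep and sed available; the function name last_modified is made up for this example.

```shell
# Hypothetical helper: read HTTP headers on stdin (as printed by "wget -S")
# and print the value of the first Last-Modified header, if there is one.
last_modified() {
    grep -i '^[[:space:]]*Last-Modified:' \
        | head -n 1 \
        | sed 's/^[^:]*:[[:space:]]*//'
}

# Usage (needs network access). --spider checks the URL without saving it,
# and wget writes the headers to stderr, hence the 2>&1:
# wget -S --spider http://www.mysite.com/files/ 2>&1 | last_modified
```

If the function prints nothing, the server is not reporting a modification time for that URL.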

 

Author Comment

by:hrolsons
ID: 34998674
I am downloading to Windows XP, running GnuWin.

Yes, rsync would be an option.  What would I gain?
 
LVL 12

Expert Comment

by:mccracky
ID: 34998991
Is it NTFS or FAT that you are downloading to?
 

Author Comment

by:hrolsons
ID: 34999080
NTFS
 
LVL 12

Expert Comment

by:mccracky
ID: 34999291
Here is a page talking about the differences between Windows and Linux with respect to timestamps:
http://samba.anu.edu.au/rsync/daylight-savings.html

That might be the problem. Or it could be that the web server is not sending the modification times (see http://permalink.gmane.org/gmane.comp.web.wget.general/9977).

With rsync you usually connect through a remote shell rather than through the web server to download the files, so the timestamps might work better.

If that fails, rsync has the option "-c", which compares file checksums rather than modification times and sizes when deciding what to transfer.
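
The idea behind comparing checksums instead of timestamps can be sketched locally with cksum. This is only an illustration of the principle, not how rsync is implemented; the function name same_content, the host, and the path in the commented rsync line are all made up for the example.

```shell
# Hypothetical helper: succeed (exit 0) when two files have identical
# content according to cksum, regardless of their timestamps.
same_content() {
    sum_a=$(cksum < "$1") || return 2   # 2 = a file could not be read
    sum_b=$(cksum < "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}

# A checksum-based rsync transfer would look roughly like this
# (hypothetical host and path):
# rsync -avc user@example.com:/path/to/files/ ./files/
```

Checksumming reads every file in full on both sides, so it trades extra disk I/O for not depending on timestamps.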
 

Author Comment

by:hrolsons
ID: 34999375
It seems to be working right now, which makes no sense. Also, I had to add a trailing slash to the URL in my original command to get it to work right.
 

Accepted Solution

by:hrolsons
ID: 35186498
It was all about that trailing slash:

This is incorrect:
wget -c -np -r -N -l inf -w 5 --limit-rate=200K http://www.mysite.com/files

This works:
wget -c -np -r -N -l inf -w 5 --limit-rate=200K http://www.mysite.com/files/
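
The wget manual notes that the trailing slash matters to "-np": HTTP has no concept of a directory, so without the slash "files" is treated as a file name rather than a directory. A minimal sketch of a guard against forgetting it, assuming a POSIX shell; the function name with_trailing_slash is made up for this example.

```shell
# Hypothetical helper: print the URL with exactly one trailing slash added
# if it does not already end in one.
with_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

url=$(with_trailing_slash "http://www.mysite.com/files")

# Print the command instead of running it, so it can be checked first.
echo wget -c -np -r -N -l inf -w 5 --limit-rate=200K "$url"
```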
 

Author Closing Comment

by:hrolsons
ID: 35221365
It worked
