hrolsons
asked on
wget overwriting files
I am issuing a wget statement:
wget -c -np -r -N -l inf -w 5 --limit-rate=200K http://www.mysite.com/files
The problem is that it wastes time downloading files it already has. If the local and remote files match, I don't want to spend time downloading them again.
ASKER
So, I would change it to:
wget -c -np -r -S -l inf -w 5 --limit-rate=200K http://www.mysite.com/files
for the first, and then:
wget -c -np -r -N -l inf -w 5 --limit-rate=200K http://www.mysite.com/files
Bummer that I have to start this huge download over again.
I don't think so. From the wget man page:
-N
--timestamping
    Turn on time-stamping.
-S
--server-response
    Print the headers sent by HTTP servers and responses sent by FTP servers.
The "-N" already does the timestamping.
Which filesystem are you downloading to? Can it preserve the permissions and timestamps of the original system? Also, would rsync be an option?
ASKER
I am downloading to Windows XP, running GnuWin.
Yes, rsync would be an option. What would I gain?
Is it NTFS or FAT that you are downloading to?
ASKER
NTFS
Here is a page talking about the differences between Windows and Linux with respect to timestamps:
http://samba.anu.edu.au/rsync/daylight-savings.html
That might be the problem. Alternatively, the web server may not be sending modification times at all (see http://permalink.gmane.org/gmane.comp.web.wget.general/9977)
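One way to check the second possibility is to look at the response headers for a Last-Modified line. The following is a minimal sketch, not a definitive test: the function name has_last_modified and the URL variable are made up for illustration, and the commented usage line assumes a wget build that supports --spider.

```shell
# Succeeds (exit 0) if the header text on stdin contains a Last-Modified line.
# wget -S indents each header line, so leading whitespace is allowed.
has_last_modified() {
    grep -qi '^[[:space:]]*Last-Modified:'
}

# Hypothetical usage (needs network access; URL is a placeholder for one
# of the files being mirrored):
# wget -S --spider "$URL" 2>&1 | has_last_modified \
#     && echo "server sends Last-Modified" \
#     || echo "no Last-Modified header, so -N has no date to compare"
```

If the header is missing, "-N" has no server date to compare against, which would explain files being fetched again.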
With rsync you usually transfer the files over a remote shell rather than through the web server, so the timestamps might be handled better.
If that fails, rsync has the "-c" option, which compares file checksums instead of modification times and sizes when deciding what to transfer.
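To illustrate what comparing by checksum means, here is a rough local sketch using the standard cksum utility. It is not how rsync is implemented; same_content is a hypothetical helper, and the host and paths in the commented rsync line are placeholders.

```shell
# Succeeds (exit 0) when both files have identical content, judged by
# checksum and byte count only; modification times are ignored.
same_content() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Hypothetical rsync equivalent (host and paths are placeholders):
# rsync -avc user@host:/path/to/files/ ./files/
```

Two copies of a file with different timestamps still count as a match here, which is the behaviour you want when the timestamps cannot be trusted. The trade-off is that every file has to be read in full to compute its checksum.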
ASKER
It seems to be working right now, which makes no sense. Also, I had to add a trailing slash to the URL in my original command to get it to work right.
ASKER
It worked
That's a case for timestamping.
Use "-S" for the first download ("preserve timestamps") and "-N" ("newer") for subsequent downloads.
With "-N" ...
... Wget will ask the server for the last-modified date. If the local file has the same timestamp as the server, or a newer one, the remote file will not be re-fetched. However, if the remote file is more recent, Wget will proceed to fetch it.
The above quote is from here:
http://www.gnu.org/software/wget/manual/wget.html#Time_002dStamping-Usage
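The rule in that quote can be sketched as a small shell function. This is a simplification for illustration only: needs_fetch is a made-up name, timestamps are assumed to be seconds since the Unix epoch, and it models only the date comparison (the same manual section also says Wget re-fetches when the file sizes differ).

```shell
# Succeeds (exit 0) when the remote copy should be fetched: the local file is
# missing, or the remote timestamp is strictly more recent than the local one.
needs_fetch() {
    local_mtime=$1    # empty string means "no local file"
    remote_mtime=$2
    [ -z "$local_mtime" ] || [ "$remote_mtime" -gt "$local_mtime" ]
}

# Example: same timestamp on both sides, so nothing is re-fetched.
# needs_fetch 1200000000 1200000000 || echo "skip"
```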
Good luck!
wmp