Need a reliable method to copy large amounts of files

Posted on 2013-11-12
Last Modified: 2013-11-12
Hello Experts!

My problem is that I have to transfer hundreds of gigabytes of files from a Linux server to an external USB drive, and the transfer will literally take weeks to complete.

I have the fear that once the transfer begins, something will happen that will cause the copying to stop, and I will be stuck with part of the files transferred and the rest not.

Is there a method I can use whereby I can start the transfer, and if anything happens in mid-stream to stop it, I can restart where I left off?

Thanks!
Question by:OmniUnlimited
LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 39643249
I think "rsync" is the tool you should use.

"rsync" will transfer only those files which either do not yet exist on the target or those which have changed in size or in last-modified time.

You use it just like "rcp", specifying source and target, one of which may be remote.

There are a lot of options; it is best to see the "rsync" man page for details:

http://rsync.samba.org/ftp/rsync/rsync.html

A quite common way of invoking rsync is

- for local copy:

rsync -avz /source/data /target/directory

- for remote copy:

rsync -avz /local/data remotehost:/target/directory

-a implies recursion, copy symlinks as links, preserve permissions, time stamps, group and owner

-v means "verbose", and "-z" forces compression while transferring

Please note that rsync will create an additional directory level on the target if invoked the way I posted above: the copied files and directories will go to /target/directory/data.
To avoid creating this additional directory level at the destination, add a trailing slash to the source specification, e.g.

rsync -avz /local/data/ remotehost:/target/directory

The copied files and directories will now go directly to /target/directory.
 
LVL 17

Author Comment

by:OmniUnlimited
ID: 39643275
Thanks, woolmilkporc (love the moniker, by the way), for your response.  Do I have to run it in verbose mode?  I really don't need to see output; I was going to check from time to time by seeing how many files had been transferred and whether the process is still running.  I plan on running this job in the background.  And do I need compression?  (I'm transferring to a disk that is exclusively for these files.)

So are you saying that if, for whatever reason, the copying (or in this case resyncing) stops, I can restart using the same commands, and it will restart from the point it left off?  Or does it have to go through and check everything again?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39643294
You don't have to run it in verbose mode, nor do you have to use compression (just omit "v" and "z").

You restart the transfer by specifying exactly the same commands. "rsync" will not really resume where it left off, but the check it performs is really fast, so you won't lose too much time.

If you're going to transfer really big files, consider using the "--partial" option. This way partially transferred files will be kept on the target, and rsync will transfer just the remainder of the partial file at its next invocation.
LVL 17

Author Comment

by:OmniUnlimited
ID: 39643304
Ok, another question: the reason I asked this question is because I already had a failure on one other disk I have.  On that disk I had used the cp command (with obviously no way to start from where it ended.)

Could I use rsync on it and catch up to where it is quickly and finish up its transfer?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39643307
Yes, of course.

"rsync" performs its checks regardless of whether the previous copy has been done by "rsync" itself or by any other tool, that's to say regardless of how the files found their way to their current location.

Oviparous Woolmilkporc

(by full name)
 
LVL 17

Author Comment

by:OmniUnlimited
ID: 39643317
LOL!  I think we have a new king of the beasts in our midst.

Oviparous Woolmilkporc, you have been very helpful.  One last question:  will the --partial option slow down things a bit?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39643341
Not really.

During the initial transfer this option just makes rsync keep a partially transferred file on the target instead of deleting the stub.

On subsequent transfers, when rsync finds a file on the target to be a fragment it adds the missing data to it. A little effort is needed to find the right point to resume, but this should be minute.
 
LVL 17

Author Closing Comment

by:OmniUnlimited
ID: 39643343
Oviparous Woolmilkporc, you certainly rule in this kingdom.  Thank you so much for your expert help.  It is much appreciated!