Need a reliable method to copy a large number of files

Hello Experts!

I have to transfer hundreds of gigabytes of files from a Linux server to an external USB drive, and the transfer will literally take weeks to complete.

I am afraid that once the transfer begins, something will cause the copying to stop, and I will be stuck with only part of the files transferred.

Is there a method I can use whereby I can start the transfer, and if anything happens in mid-stream to stop it, I can restart where I left off?

Thanks!
OmniUnlimited Asked:

woolmilkporc Commented:
I think "rsync" is the tool you should use.

"rsync" will transfer only those files which either do not yet exist on the target or those which have changed in size or in last-modified time.

You use it just like "rcp", specifying source and target, one of which may be remote.

There are a lot of options; it is best to see the "rsync" man page for details:

http://rsync.samba.org/ftp/rsync/rsync.html

A quite common way of invoking rsync is

- for local copy:

rsync -avz /source/data /target/directory

- for remote copy:

rsync -avz /local/data remotehost:/target/directory

-a implies recursion, copy symlinks as links, preserve permissions, time stamps, group and owner

-v means "verbose", and -z enables compression while transferring

Please note that rsync will create an additional directory level on the target if invoked the way I posted above: the copied files and directories will go to /target/directory/data.
To avoid creating this additional directory level at the destination, add a trailing slash to the source specification, e.g.

rsync -avz /local/data/ remotehost:/target/directory

The copied files and directories will now go directly to /target/directory.

OmniUnlimited (Author) Commented:
Thanks woolmilkporc (love the moniker, by the way) for your response. Do I have to run it verbose? I really don't need to see the output; I was going to check from time to time by seeing how many files have been transferred and whether the process is still running. I plan on running this job in the background. And do I need compression? (I'm transferring to a disk that is exclusively for these files.)

So are you saying that if, for whatever reason, the copying (or in this case resyncing) stops, I can restart using the same command, and it will restart from the point where it left off? Or does it have to go through and check everything again?

woolmilkporc Commented:
You don't have to run it in verbose mode, nor do you have to use compression (just omit "v" and "z").

You restart the transfer by running exactly the same command. "rsync" will not really resume where it left off, but the check it performs is really fast, so you won't lose too much time.

If you're going to transfer really big files, consider using the "--partial" option. This way partially transferred files will be kept on the target, and rsync will transfer just the remainder of the partial file at its next invocation.

OmniUnlimited (Author) Commented:
OK, another question. The reason I asked is that I already had a failure on another disk. On that disk I had used the cp command (which obviously has no way to restart from where it ended).

Could I use rsync on it and catch up to where it is quickly and finish up its transfer?

woolmilkporc Commented:
Yes, of course.

"rsync" performs its checks regardless of whether the previous copy has been done by "rsync" itself or by any other tool, that's to say regardless of how the files found their way to their current location.

Oviparous Woolmilkporc

(by full name)

OmniUnlimited (Author) Commented:
LOL!  I think we have a new king of the beasts in our midst.

Oviparous Woolmilkporc, you have been very helpful.  One last question:  will the --partial option slow down things a bit?

woolmilkporc Commented:
Not really.

During the initial transfer this option just makes rsync keep a partially transferred file on the target instead of deleting the stub.

On subsequent transfers, when rsync finds that a file on the target is a fragment, it adds the missing data to it. A little effort is needed to find the right point to resume, but this should be minimal.

OmniUnlimited (Author) Commented:
Oviparous Woolmilkporc, you certainly rule in this kingdom.  Thank you so much for your expert help.  It is much appreciated!