Solved

Using rsync to back up a Linux machine to a Windows network file share

Posted on 2014-11-06
Last Modified: 2014-12-02
I need to come up with a method of backing up a Linux (CentOS 5.11) machine to a Windows network file share. I'm very much not a Linux admin, so I wanted to see what some of the experts out there would suggest doing.

For one of the Linux machines that I need to back up, the data is located at /data and the network share is already mounted at /mnt/filesharename. I want a cron job to run daily that will automatically create a folder on /mnt/fileshare using the date as the folder name (creating a log file in the same location), copy everything on the first run, and then copy only files that changed (incremental) going forward. The backed-up files don't necessarily need to be compressed. Each date-named folder would be kept for 5 years.

The closest article I've found about doing this is here, http://www.marksanborn.net/howto/use-rsync-for-daily-weekly-and-full-monthly-backups/, but there are some differences between what is done in the link and what I want to do.

Any suggestions of ways to use rsync to accomplish this?
Question by:futureman0
 
LVL 62

Expert Comment

by:gheist
ID: 40427226
rsync transfers only file differences, but to save network bandwidth you need to run something like cwRsync on the Windows side. Copying to a mounted share saves nothing over a simple file copy.
 
LVL 29

Expert Comment

by:serialband
ID: 40427487
You can follow the instructions in the article and name your folders with the date command in your cron calls.
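As a rough illustration of naming the folder with the date command, here is a minimal sketch. It assumes the /data source and /mnt/fileshare mount from the question; the script path in the crontab comment, the `data` subfolder, the `backup.log` name, and the `RUN_BACKUP` switch are hypothetical. It makes a full copy into each dated folder and does not attempt the incremental part, and the rsync options may need adjusting for an SMB mount.

```shell
#!/bin/sh
# Sketch of a daily backup into a date-named folder.
# SRC and DEST_ROOT default to the paths from the question; override via environment.
SRC="${SRC:-/data/}"
DEST_ROOT="${DEST_ROOT:-/mnt/fileshare}"

# Print the destination folder for today, e.g. /mnt/fileshare/2014-11-06.
dated_dir() {
    printf '%s/%s\n' "$1" "$(date +%Y-%m-%d)"
}

# Create today's folder, copy SRC into it, and keep the rsync output
# as a log file in the same dated folder.
run_backup() {
    dest="$(dated_dir "$DEST_ROOT")"
    mkdir -p "$dest/data" || return 1
    rsync -av "$SRC" "$dest/data/" > "$dest/backup.log" 2>&1
}

# Only run when explicitly asked, so the file can also be sourced safely.
if [ "${RUN_BACKUP:-0}" = "1" ]; then
    run_backup
fi

# Hypothetical crontab entry (02:00 every day):
# 0 2 * * * RUN_BACKUP=1 /usr/local/bin/daily-backup.sh
```

If you call date directly inside a crontab line instead of a script, each % has to be written as \% because cron treats an unescaped % as a line break.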

If you're making full backups, rsync isn't going to help you.  It will actually slow you down when you make each first copy.

If you're handling a lot of data and saving it long term, I suggest using rsnapshot to save some space.  It would also be a bit easier to back up to a Linux file server than a Windows one, because you can then write a script to easily create hard links to each folder with the correct dates that you want for the backup folder instances.  So you would create a snapshot each day and just use the ln command to create a link, named with the date, to each snapshot folder.

rsnapshot makes use of rsync and basically deduplicates the data each day.  It makes use of the previous day's saved snapshot and copies only the differences.  It should be more space efficient than rsync as described by the post.  As gheist mentions, you'll also need to run a cwRsync server on Windows to benefit from rsync.  http://www.rsnapshot.org/faq.html

http://www.rsnapshot.org/
 
LVL 62

Expert Comment

by:gheist
ID: 40427805
If you use rsync to sync files to a mounted share, it will actually read back all the destination files and write only the changes, i.e., it transfers more data over the share than a blind copy.

 

Author Comment

by:futureman0
ID: 40429290
Thanks all! I appreciate all of the suggestions. I've been playing around with using rsnapshot as serialband suggested.  I found this article (http://how-to.linuxcareer.com/guide-to-rsnapshot-and-incremental-backups-on-linux) and have tried out a couple of different scenarios.

Using that article as a base, what type of configuration would you suggest would accomplish the following:

* Using rsnapshot to back up from the Linux host's /data to the Windows file share at /mnt/fileshare. The Windows file share is then backed up to disk by Symantec Backup Exec.
* Each snapshot would be kept for up to 5 years (taken once per day), the purpose being that any data generated on this computer from a scientific instrument needs to be saved per retention rules for 5 years even if it's deleted off the machine.
* Minimizing the size of each snapshot by copying only what has changed from the previous day

I've tried setting the daily interval to 1825 and the hourly, weekly, and monthly intervals to 1. When I then run rsnapshot daily manually, daily.0, daily.1, daily.2, and so on are generated. What I've found is that each snapshot folder has all the files, not just the ones that have changed. I'm a bit unclear on what hard links are, so that may have something to do with it.

As rsnapshot renames the different snapshot folders when the next one is made, what could I put in rsnapshot.conf that would put the snapshot in a folder (within daily.1 .2 .3 ...) using the date when it was made?
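For reference, the retention setup described above might look roughly like the fragment written by this sketch. It is not a complete rsnapshot.conf (a real file also needs entries such as config_version and the command paths), and the output path `CONF` is hypothetical. It assumes the older `interval` keyword used in the discussion; newer rsnapshot releases spell it `retain`. rsnapshot expects tabs between fields, which is why the lines are generated with printf rather than typed with spaces.

```shell
#!/bin/sh
# Write a minimal rsnapshot config fragment with TAB-separated fields.
# CONF is a hypothetical output path; merge the lines into your real rsnapshot.conf.
CONF="${CONF:-${TMPDIR:-/tmp}/rsnapshot-fragment.conf}"

{
    # Where the daily.N snapshot folders are created (the mounted share).
    printf 'snapshot_root\t/mnt/fileshare/\n'
    # Keep 1825 daily snapshots, roughly 5 years of one snapshot per day.
    printf 'interval\tdaily\t1825\n'
    # Back up /data/ into the localhost/ subfolder of each snapshot.
    printf 'backup\t/data/\tlocalhost/\n'
} > "$CONF"

echo "wrote $CONF"
```

After merging the lines into the real config, `rsnapshot configtest` should report whether the syntax is accepted.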
 
LVL 29

Accepted Solution

by:serialband
serialband earned 250 total points
ID: 40429717
The program creates links, like Time Machine does.  Files that haven't changed get a link and aren't really copied.  It's actually the same file, just referenced from various folder locations, at least on any Unix-type file system.  I'm not sure what it does on Windows.  The deltas are copied to the new folder and kept there.  It's what SAN and NAS vendors call deduplication.  This ability has always existed on Unix filesystems.
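Since the question was what hard links are, here is a small sketch you can run on any Linux box. It works in a throwaway temp folder, and the file names are made up for the demo. It shows that a hard link is just a second name for the same data, which is why every snapshot folder appears to contain all the files.

```shell
#!/bin/sh
# Demonstrate that a hard link is a second name for the same file data.
demo="$(mktemp -d)"

echo "instrument data" > "$demo/original.txt"
ln "$demo/original.txt" "$demo/linked.txt"    # hard link; no data is copied

# Print the inode number of a file (first column of `ls -i`).
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Both names report the same inode number, i.e. the same file on disk.
inode_a="$(inode_of "$demo/original.txt")"
inode_b="$(inode_of "$demo/linked.txt")"
echo "original inode: $inode_a, linked inode: $inode_b"

# Removing one name leaves the data reachable through the other.
rm "$demo/original.txt"
cat "$demo/linked.txt"
```

To see how much space the snapshots really take, running du once over all the snapshot folders together should count each hard-linked file only one time.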

I don't know exactly how NTFS handles those, as I mostly copied them going the other way.  That's partly why I mentioned creating them on a Linux file server instead.  I know Windows uses those clunky shortcut files that work in Explorer but not on the command line, and they don't have a command-line tool to create an equivalent to a hard link like Unix does.

EDIT:  Addendum
fsutil.exe on Windows creates hard links.  I've never used it before, but it looks like NTFS has hard links too.
https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/fsutil_hardlink.mspx
 
LVL 62

Assisted Solution

by:gheist
gheist earned 250 total points
ID: 40429991
To gain efficiency with rbackup or rsync, you need an rsync (cwRsync) share, not an SMB mount. Sure, rbackup and rsync will work over a mounted Windows share, but then they will read all the backup data back to Linux, compare it with the active data on the disk, and write out the differences.
With the rsync protocol, the rsync server checksums the data it has on the share, the rsync client checksums the data it has, and the two quickly compare checksums and transfer only the changed pieces of data.
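The checksum comparison can be pictured with a much-simplified sketch. This is not rsync's actual algorithm, which works on blocks within a file; it compares whole files with cksum, and the function name `copy_if_changed` is made up for illustration. Note that when the destination is a mounted share, computing its checksum means reading the file back over the network, which is the cost described above; with an rsync server, each side checksums its own copy locally.

```shell
#!/bin/sh
# Copy src to dst only when their checksums differ.
# Whole-file simplification of "compare checksums, transfer only what changed".
copy_if_changed() {
    src="$1"
    dst="$2"
    # `cksum < file` prints checksum and size without the file name.
    if [ -f "$dst" ] && [ "$(cksum < "$src")" = "$(cksum < "$dst")" ]; then
        echo "unchanged: $src"
        return 0
    fi
    cp "$src" "$dst" && echo "copied: $src"
}
```

Calling `copy_if_changed /data/run1.dat /mnt/fileshare/run1.dat` (hypothetical file names) would copy the file the first time and report it as unchanged on later runs until the source is modified.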


Question has a verified solution.
