Enterprise backup in a heterogeneous network is a subject full of complications and restrictions. Issues such as filename and path structure, attributes and extended metadata tend to complicate matters to the point where the solution either costs the earth or simply doesn’t function the way you want, something that normally becomes obvious around the time of a restore.
My situation isn’t all that unusual. I work for a medium-sized employer in the UK, a business which relies for the most part on a Windows server infrastructure and is geographically spread across multiple sites. Recently I was asked to provide an Apple server infrastructure to allow a more efficient co-existence between the company’s Windows and Apple desktop fleets, which took the form of an OSX 10.6 server implementation and some interesting integration work between the MS and Apple environments.
My employer’s network (or at least the part relevant to this article) is based on a Windows Server 2003 file server and a Windows 2003 backup server running Symantec’s Backup Exec 10, writing data to an LTO3 tape robot. Symantec have just announced support for Mac OSX 10.6 clients in the ‘almost released’ Backup Exec 2010 R2, but if your employer is like mine and either doesn’t have or is unwilling to pay for the most recent version, or hasn’t maintained an active support agreement (allowing upgrades), then this isn’t much use.
My employer’s requirement is to allow users to store network home folders and departmental shares on the OSX 10.6 server, and for my team to provide a method of making nightly, weekly and monthly backups to an offsite location. I should add that I have complete flexibility in the method used. That isn’t true of every employer, so it’s always worth checking for any policy or regulatory restrictions on backups before you begin an implementation.
In developing this solution the backup ‘rule of three’ should be kept in mind, that is:
• Keep your files in three different locations
• Back up your files following three different schedules
• Utilize three different types of media
Ensuring my solution met the above criteria was fairly trivial. Firstly, my employer is geographically diverse and offers three specific locations. In addition, every site uses an offsite location for data storage, so in theory we had six discrete locations in which our data could be stored. Scheduling could be taken care of by using a grandfather, father and son backup rotation (daily, weekly and monthly backups). Lastly, the media requirement would be inherently met, as backups would be written to different disk and tape types.
Let's move on to specifics. To keep my employer's details safe I’ll use some level of generalization when describing the network layout and server names. For the purposes of this article we will use the following nomenclature:
Local Site -> The site in which you're implementing the backup system; more simply, the location of the data you need to back up.
Remote Site(s) -> The site(s) to which you will be backing up your data.
File Server -> A Windows 2003 file server to which the data will be copied as an intermediary step prior to being saved to tape.
(Additionally, most institutions will have a geographic naming convention. To maintain my employer's anonymity I will be using ‘local’ and ‘remote’ as the site names; you should substitute your own site codes, particularly if you intend to apply the guide to more than one site.)
The method we will be using to back up the servers is rsync over an SSH tunnel, scheduled with cron (using a daily, weekly and monthly schedule). The data will be archived from an OSX server on the local site to a mirror on one or more remote sites.
The first step is optional but highly recommended. The version of rsync included in OSX can be a little slow and inefficient; luckily there is the option to upgrade.
Assuming you haven't made any system modifications or installed a more recent version of rsync, running rsync --version should report the standard OSX version:
rsync version 2.6.9 protocol version 29
Copyright (C) 1996-2006 by Andrew Tridgell, Wayne Davison, and others.
The most recent version of rsync (3.0.7 at the time of writing) is available from samba.org and should be obtained together with the latest patch archive. From a folder (I used the desktop, you can use anywhere) run the following commands.
curl -O http://rsync.samba.org/ftp/rsync/src/rsync-3.0.7.tar.gz
curl -O http://rsync.samba.org/ftp/rsync/src/rsync-patches-3.0.7.tar.gz
tar -xzvf rsync-3.0.7.tar.gz
tar -xzvf rsync-patches-3.0.7.tar.gz
Compiling rsync at this stage could cause problems. As those familiar with OSX will testify, 'a file isn't just a file': OSX attaches a fair amount of extended attributes and metadata to a file, and we need to apply the relevant patches to rsync so that it understands those elements. Run the following from inside the unpacked rsync-3.0.7 folder:
patch -p1 <patches/fileflags.diff
patch -p1 <patches/crtimes.diff
Once the patches have been applied the source needs to be compiled. It's worth mentioning that a number of development tools are used during compilation, and these aren't installed with a standard OSX Server installation. To provide them, install the Xcode.mpkg located in the 'Other Installs' section of the OSX Server CD. Once installed, you can go ahead and begin the rsync compile process:
sudo make install
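For reference, the whole patch-and-build sequence can be sketched as a single shell function. This is a sketch under assumptions rather than the exact procedure used here: build_rsync is a hypothetical helper name, the source is assumed to be unpacked in rsync-3.0.7, and the ./prepare-source and ./configure steps follow the standard rsync source layout, which installs under /usr/local by default. It applies the two patches itself, so run it against a freshly unpacked tree rather than one that has already been patched.

```shell
# Sketch: patch, configure, build and install rsync from an unpacked tree.
# build_rsync is a hypothetical helper; $1 is the source folder
# (assumed to be rsync-3.0.7 if omitted).
build_rsync() {
    src_dir="${1:-rsync-3.0.7}"
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$src_dir" || exit 1
        # Apply the OSX-specific patches before generating the build files.
        patch -p1 <patches/fileflags.diff || exit 1
        patch -p1 <patches/crtimes.diff || exit 1
        # Regenerate build files for the patched source, then build.
        ./prepare-source || exit 1
        ./configure || exit 1
        make || exit 1
        # Installing into /usr/local needs administrator rights.
        sudo make install
    )
}
```

Calling build_rsync rsync-3.0.7 from the folder holding the unpacked source runs every step in order and stops at the first failure.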
Once completed, you can test the new version by running /usr/local/bin/rsync --version, which should report:
rsync version 3.0.7 protocol version 30
Copyright (C) 1996-2009 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
append, ACLs, xattrs, no iconv, symtimes, file-flags
At this stage it's important to check that the ACLs and xattrs capabilities are present; without those you will run into trouble should a restore ever be needed.
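If you would rather not check the capability list by eye, the check can be scripted. The following is a minimal sketch: has_osx_caps is a hypothetical helper name, and the list of capabilities it looks for (ACLs, xattrs and file-flags) is taken from the output shown above. It relies on rsync listing a missing capability as "no <name>", as it does for "no iconv" in that output.

```shell
# has_osx_caps reads `rsync --version` output on stdin and succeeds only
# if every capability this article depends on is listed as present.
has_osx_caps() {
    version_text=$(cat)
    for cap in ACLs xattrs file-flags; do
        # A missing capability is printed as "no <name>", e.g. "no iconv".
        if printf '%s\n' "$version_text" | grep -q "no $cap"; then
            echo "missing capability: $cap" >&2
            return 1
        fi
        # Also fail if the capability is not mentioned at all.
        if ! printf '%s\n' "$version_text" | grep -q "$cap"; then
            echo "capability not listed: $cap" >&2
            return 1
        fi
    done
}
```

To use it, pipe the new binary's version output into the function, for example: /usr/local/bin/rsync --version | has_osx_caps && echo "capabilities present"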
Let's summarize. At this stage we have decided to use rsync for the backup process and to back up our data to one or more remote servers. We have compiled a version of rsync from source and added two patches to provide the additional capabilities needed for OSX file information. The next stage is to allow password-less authentication between the local and remote servers involved in the backup 'agreement'.
(I should mention that I have specifically chosen to perform my system backups as the root user. This may not be best practice in terms of security, but we have a specific need for it. You should consider which user to use, and alter the instructions to fit.)
sudo ssh-keygen -t dsa -f ./.ssh/id_dsa
This will generate a pair of keys, id_dsa and id_dsa.pub. The first is the private key; the second is its public partner, and is the part you will copy to the remote server to allow password-less, transparent authentication and login.
Next the public key should be appended to the authorized_keys file on the remote system, allowing you to authenticate using a key rather than a traditional username and password.
sudo cat ./.ssh/id_dsa.pub | ssh email@example.com 'cat - >> ~/.ssh/authorized_keys'
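The append command above assumes that the .ssh folder already exists on the remote server. If key login still asks for a password afterwards, one common cause is that sshd (with its default StrictModes setting) ignores key files whose permissions are too open. A minimal sketch of a fix, to be run on the remote server as the account receiving the key, follows; fix_ssh_perms is a hypothetical helper name and not part of OpenSSH.

```shell
# fix_ssh_perms creates the key folder and file if needed and restricts
# both to their owner. $1 defaults to the current user's ~/.ssh.
fix_ssh_perms() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys" || return 1
    # Owner-only access: 700 on the folder, 600 on the key file.
    chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}
```

Running fix_ssh_perms with no arguments works on the logged-in user's own .ssh folder.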
As a test, open an SSH session from the local server to the remote server as root.
You should be prompted to accept the server fingerprint once for the root user, and what follows should be the remote server's prompt, without a password being requested. This is confirmation that you have a successful SSH connection between the local and remote machines. The instructions should be followed in reverse for the remote -> local backup and adapted for any other site<->site relationships you have within your business.
Backup Folder Structure
To keep the article simple I'm going to describe only the structure of our user network home folders. The same principles apply to departmental or project-based file shares, but focusing on one keeps everything simple. On each local server we have a share point for network homes. For illustrative purposes we can call this UsersLOCAL (LOCAL being the site code).
On the remote server a similar nomenclature will be used, with one folder for each backup schedule, for example UsersLOCAL.backup.nightly and UsersLOCAL.backup.weekly.
Again, replace 'LOCAL' with the site code of the local server you have been working on during this article, or with the naming convention of your choice.
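Creating the destination folders on the remote server can be sketched as follows. The nightly and weekly names match the rsync commands used in this article; the monthly name is an assumption that follows the same pattern, and make_backup_dirs is a hypothetical helper name.

```shell
# make_backup_dirs creates one destination folder per backup schedule.
# $1 is the volume root and $2 is the share name, e.g. UsersLOCAL.
# The ".backup.monthly" name is an assumption based on the other two.
make_backup_dirs() {
    root="$1"
    share="$2"
    for schedule in nightly weekly monthly; do
        mkdir -p "${root}/${share}.backup.${schedule}" || return 1
    done
}
```

On the remote server this would be called as make_backup_dirs /Volumes/DATA1 UsersLOCAL, substituting your own volume and site code.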
At this point it might be a good idea to test the process. Create a folder named something obvious (I always prefer 'if this works this folder should have copied'), make sure it's created in the UsersLOCAL folder on the LOCAL server, and then run the following command.
/usr/local/bin/rsync -aNHAXx --protect-args --fileflags --force-change --rsync-path="/usr/local/bin/rsync" /Volumes/DATA1/UsersLOCAL firstname.lastname@example.org:/Volumes/DATA1/UsersLOCAL.backup.nightly --delete
If everything is working you shouldn't get any prompts, and the folder you created above should be copied into the .backup.nightly folder on the remote system. Check that all ACLs and attributes have been copied.
At this point you should be ready to automate the process, and for this we will be using cron. Make sure you're logged in as root and run crontab -e to edit root's crontab.
By default the vi editor will be started; those unfamiliar with it should do some reading before going any further. Likewise, if you're not familiar with the concepts of cron you should read up on them first.
Again, to keep the article as brief as possible I'll concentrate on the nightly and weekly backup entries; adding a monthly one uses the same process, so you should be able to figure it out. From the crontab editor opened above, add the following lines.
0 23 * * * /usr/local/bin/rsync -aNHAXx --protect-args --fileflags --force-change --rsync-path="/usr/local/bin/rsync" /Volumes/DATA1/UsersLOCAL email@example.com:/Volumes/DATA1/UsersLOCAL.backup.nightly --delete
0 01 * * 6 /usr/local/bin/rsync -aNHAXx --protect-args --fileflags --force-change --rsync-path="/usr/local/bin/rsync" /Volumes/DATA1/UsersLOCAL firstname.lastname@example.org:/Volumes/DATA1/UsersLOCAL.backup.weekly --delete
This will run our rsync command automatically at 2300 hours each night, copying the data to the nightly folder on the remote server. In addition, at 0100 hours on day 6 of each week the command will run again, making a copy into the weekly folder. Note that cron counts Sunday as 0, so day 6 is Saturday morning; use 5 in the fifth field if you want the copy made on Friday morning, ahead of weekly backups run on a Friday evening.
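The monthly entry is left as an exercise above. A minimal sketch, assuming a folder named UsersLOCAL.backup.monthly exists on the remote server and using root@remote.example.com purely as a placeholder for your own remote login, might run at 0300 hours on the first day of each month:

```crontab
# Assumed monthly entry: the schedule, the placeholder host and the
# .backup.monthly folder name are illustrative, not taken from the article.
0 3 1 * * /usr/local/bin/rsync -aNHAXx --protect-args --fileflags --force-change --rsync-path="/usr/local/bin/rsync" /Volumes/DATA1/UsersLOCAL root@remote.example.com:/Volumes/DATA1/UsersLOCAL.backup.monthly --delete
```

The five leading fields are minute, hour, day of month, month and day of week, so 0 3 1 * * fires once a month regardless of which weekday the first falls on. Pick a time that doesn't overlap with the nightly or weekly runs.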
The final stage of the process is to back up the resultant files to tape. This will depend entirely on your situation and IT policy and is therefore outside the scope of this article. For my business I'll be creating a DMG image, using rsync to copy the files into it, and finally transferring the completed image to a share on our Windows 2003 file server, from where it can be written to tape at a later stage.
In summary, we have created a folder structure that will be used to back up a copy of the Users network home folder share to a remote server on a daily, weekly and monthly basis. We have configured SSH to allow key-based authentication, avoiding the need for passwords and allowing automation. We have recompiled rsync to improve performance and include extended attribute support, and we have automated the process via cron. While not necessarily tidy, the system does work and avoids the need for additional expensive investment while being reliable and easy to maintain.
Please keep an eye on my profile for other 'Apple in the enterprise' articles coming soon.