Articles February 19, 2005 at 7:20 pm

rsync Backups on OS X

Clones across the network

With a little bit of work you can have rsync cloning a primary server to a backup server. If the primary server ever fails, all you have to do is reboot the backup from the cloned system and away you go.

Ed. Note: This is a very nice step-by-step for getting rsync working across the network. This is not necessarily hard to set up. However, I would strongly urge you to do this on a test system before deploying. Rsync, in whatever flavor you use, can be rather cryptic in configuration and use.

rsync (http://rsync.samba.org) is a favorite tool of many UNIX sysadmins for doing snapshot-style backups of hosts across the network. It’s more useful than other utilities like tar or scp because it’s smart about only copying the differences between files across a connection. rsync also has the ability to run as a server, making it possible to sync many slaves up to a master or vice-versa. We recently migrated the majority of our services (mail, web, directory, etc) from Linux to Mac OS X and I was surprised at just how tricky it was to get rsync working properly on OS X.

Matthew Phillips has a pretty good tutorial on using rsync for backups on OS X at http://www.labf.org/~egon/mac_backup/, but he describes the process with rsyncX (http://archive.macosxlabs.org/rsyncx/rsyncx.html), which suffers from a number of bugs that make it unreliable.

I’ve put together a little tutorial here on how I got snapshot backups working.

Getting Started

For this tutorial, we’ll be creating a snapshot fail-over server on an older G4 which we want to mirror an Xserve. In the event of a failure on our Xserve, we’d like to be able to change the startup disk on our G4 to the partition mirroring our Xserve system disk, reboot, and be back on our feet with last night’s configuration. This is an ideal solution for a small business with a tight budget but high-availability requirements.

On the Xserve, we have the following partitions which need to be mirrored:

/dev/disk0s3 on / (local, journaled)
/dev/disk1s10 on /Volumes/Users (NFS exported, local, journaled)
/dev/disk1s12 on /Volumes/Space (NFS exported, local, journaled)

We’ve put a 120GB and a 40GB hard drive in the G4 and created the following partitions:

/dev/disk1s10 on / (local, journaled)
/dev/disk1s12 on /Volumes/System (local, journaled)
/dev/disk1s14 on /Volumes/Users (local, journaled)
/dev/disk0s5 on /Volumes/Space (local)

Having separate partitions is necessary for the quick failover to work and it keeps things neat on the reboot with Share Points and whatnot. On the Xserve, our / partition is labeled System and on the G4 it’s labeled Backup. This should keep confusion to a minimum.

Getting rsync to work on OS X

Once your machines are configured and the OS is installed, the next trick is getting rsync to work properly with HFS+ filesystems. A few issues make this a little complicated. The first, of course, is resource forks. rsync_hfs and rsyncX address this by bundling the resource fork with the data fork and tossing both across the wire to a (hopefully) resource fork-aware rsync on the other side. This is good (and rsyncX comes with a neat GUI), but there are two problems with rsyncX. The first is that rsync (designed on Linux) expects a Linux-friendly lchown system call for changing the ownership of a symbolic link, and gives us a “File not found” error when running under Darwin. This issue is documented by Ulrich Hoffmann at http://www.honkbude.org/article.php?story=20040801233916583&mode=print (with patch to boot!). The second issue is that the latest release of rsyncX is 2.1, which suffers from a nasty double-free bug in the bundled zlib. Not only does this create a security issue; it also causes backups to fail when rsync bombs out.

Ed. Note: I use rsyncX a lot on a local basis, without going across the network, with no issues. Of course, local-only is much less complex than a network synch, and it’s certainly true that rsyncX is rather crusty in its base version of rsync. I’ve found that for admins new to rsync, using the rsyncX GUI’s script generator is a decent way to get a feel for what’s going on. From there you can move on to writing the scripts yourself with whatever flavor of rsync you choose.

To fix the HFS+ problem, D Andrew Reynhout has proposed a neat solution at http://www.quesera.com/reynhout/misc/rsync+hfsmode/. He separates the data and resource forks into AppleDouble format before transmission. Using his code has two benefits. First, his source code is easy to get to and he keeps his diffs up to date with the latest release of rsync proper, which makes it easy for us to get around the zlib issue, which was patched in 2.6.3. Second, since we’re splitting the files into AppleDouble format, we can receive the files on non-forked filesystems, like UFS, ext3, and Xsan.

Reynhout releases patched rsync binaries on his site, but as far as I know, they don’t include the lchown patch. So, we’ll build from scratch. First, grab all the source we need:

Unpack the rsync sources and apply the patches as best as you can. Hoffmann’s patch is against an earlier version of rsync than 2.6.3 and I had to manually edit rsync.c. It’s not too tricky, though: he just bails out of the chown process if the file is a symbolic link. Reynhout’s patch applies cleanly to the current sources. When compiling rsync on OS X, you have to specify LDFLAGS="-framework CoreServices", because autoconf doesn’t pick that up. rsync’s Makefile supports the DESTDIR variable, so making a package with PackageMaker is a snap. Just set DESTDIR to some friendly location like /tmp/rsync/Files and use that as your “Files” directory in the package. Make a /tmp/rsync/Resources directory, throw in a copy of the README and the GPL for good measure, and you’re done. Once you’ve made the package, you can use Remote Desktop to push it out to your machines or install it the old-fashioned way.
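
The build-and-stage steps above could be scripted roughly as follows. This is only a sketch: it assumes the patches are already applied and that you run it from the top of the rsync source tree. The `run` and `build_and_stage` functions and the `STAGE` and `DRY_RUN` variables are names invented for this example, and by default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the build-and-stage steps described above.
# Assumptions: patches already applied; run from the top of the rsync
# source tree. STAGE and DRY_RUN are illustrative names only.

STAGE="${STAGE:-/tmp/rsync}"   # package root holding Files/ and Resources/
DRY_RUN="${DRY_RUN:-1}"        # 1 = print the commands, 0 = run them

run() {
    # Show the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_and_stage() {
    # autoconf does not pick up CoreServices, so pass it explicitly.
    run env LDFLAGS="-framework CoreServices" ./configure &&
    run make &&
    # DESTDIR redirects the install into the package's Files directory.
    run make install DESTDIR="$STAGE/Files" &&
    run mkdir -p "$STAGE/Resources" &&
    # README and COPYING (the GPL) come from the rsync source tree.
    run cp README COPYING "$STAGE/Resources/"
}
```

Calling `build_and_stage` with `DRY_RUN=0` set would perform the build for real; PackageMaker can then be pointed at the staged Files and Resources directories.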

The package I made is available here:

  • http://people.aiscomputers.com/~msolberg/rsync.dmg

Setting up SSH

    Once we’ve gotten past the rsync hurdle, we need to be able to SSH between the machines as root without a password (unless you intend to stand at the console and type in your root password every morning at 3:00 AM). There are some security concerns any time you set up a login process like this, but chances are good that if someone has root on one of your boxes, they can get root on the others pretty easily. There are two ways to allow SSH between machines without a password: host-based authentication and authorized keys. Host-based authentication is a real pain in the neck, so we’ll use authorized keys, which work out of the box on OS X.

  • First, we’ll enable the root account on the machine with OS X client installed by setting a password for root
    $ sudo passwd root
    
  • Next, we’ll generate a DSA key on the Xserve for root’s account. Don’t set a passphrase on the key; that would defeat the purpose.
    # ssh-keygen -t dsa -f ~/.ssh/id_dsa
    
  • Then, we’ll copy that DSA key over to the client (192.168.1.5 in this case) and append it to the .ssh/authorized_keys file.
    # scp ~/.ssh/id_dsa.pub root@192.168.1.5:
    # ssh root@192.168.1.5
    # mkdir -p ~/.ssh
    # cat id_dsa.pub >> ~/.ssh/authorized_keys
    # rm id_dsa.pub
    # chmod 700 .ssh
    
  • Now we should be able to log out of the client machine and log back in without a password. If you’re a security nut, you can modify authorized_keys to allow root on the Xserve to run only the rsync command, but that’s outside the scope of this howto.
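
For the security-minded, here is a rough idea of what restricting the key to rsync might look like. OpenSSH lets you prefix an authorized_keys entry with a `command="..."` option; sshd then runs that command instead of whatever the client requested and exports the original request in `SSH_ORIGINAL_COMMAND`. The wrapper below is a sketch, not a hardened tool: the wrapper path in the comment, its name, and the function names are assumptions, and it presumes the sending side uses `--rsync-path=/usr/local/bin/rsync`.

```shell
#!/bin/sh
# Hypothetical forced-command wrapper for root's authorized_keys entry.
# The key line would be prefixed with something like:
#   command="/usr/local/bin/rsync-only",no-pty,no-port-forwarding ssh-dss AAAA...
# (the wrapper path and name are assumptions made for this sketch)

is_allowed_command() {
    # The sending rsync starts its remote half as "<rsync-path> --server ...".
    case "$1" in
        "/usr/local/bin/rsync --server "*) return 0 ;;
        *) return 1 ;;
    esac
}

rsync_only() {
    if is_allowed_command "$SSH_ORIGINAL_COMMAND"; then
        set -f    # no filename expansion on the requested arguments
        # Split into words and run directly; no shell ever interprets it.
        exec $SSH_ORIGINAL_COMMAND
    else
        echo "rejected: $SSH_ORIGINAL_COMMAND" >&2
        return 1
    fi
}
```

The wrapper script itself would end with a call to `rsync_only`. Anything other than a server-side rsync invocation is refused, though rsync running as root can of course still write wherever it is told to.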

    Writing a sync script

    The basic form of an rsync command looks something like this:

    # rsync -e ssh --archive --update --delete --verbose ${MYDIR} root@192.168.1.5:${MYDIR}
    

    The command line options are all well documented in the rsync man page. Like ditto, rsync treats trailing slashes specially: a trailing slash on the source means “copy the contents of this directory,” so add one to the source and leave it off the destination if you want the sync to behave as you’d expect. We’ll be syncing System (our / on the Xserve), Users, and a sub-directory of Space over to the G4. To simplify the code, we’ll place it in a for loop:

    for dir in System Users Space/admin
    do
      rsync -e ssh --archive --update --delete --verbose /Volumes/${dir}/ root@192.168.1.5:/Volumes/${dir}
    done
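
Since the trailing-slash rule is easy to get wrong, a pair of small helpers can enforce it before paths reach rsync. This is a sketch; `with_slash` and `without_slash` are made-up names, not part of rsync or of the script in this article.

```shell
#!/bin/sh
# Illustrative helpers that normalise paths the way the loop above
# expects: a slash on the source, none on the destination.

with_slash() {
    # Append a trailing slash unless one is already there.
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

without_slash() {
    # Strip one trailing slash, but leave a bare "/" alone.
    case "$1" in
        /)  printf '/\n' ;;
        */) printf '%s\n' "${1%/}" ;;
        *)  printf '%s\n' "$1" ;;
    esac
}

# rsync SRC/ DEST  copies the contents of SRC into DEST.
# rsync SRC  DEST  creates DEST/<name of SRC> and copies into that instead.
```

For example, `with_slash /Volumes/Users` prints `/Volumes/Users/`, and `without_slash /Volumes/Users/` prints `/Volumes/Users`.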
    

    Ed. Note: The author’s boot volume is “System,” not just the System folder. He has also redirected his users’ folders to a partition called Users and has a third partition, Space, where his admin home folder resides. This is probably not how you have your server set up, so you’ll want to change the script to reflect just the volumes that you need. Again, this is where you want to get this working in the lab before putting it on your production systems.

    If you run this chunk of code as it stands, which I thought would work pretty well, your rsync will spin off towards the bit bucket in an infinite loop from Hell as it ponders syncing some of the entries in the /dev directory and whoever’s machines it finds in /Network. To get around this, we’ll need a Mac OS X friendly exclude file, which should look something like this:

    tmp/*
    proc
    Network/*
    Volumes/*
    cores/*
    */.Trash
    dev/*
    afs/*
    automount/*
    private/tmp/*
    private/var/vm/*
    private/var/run/*
    private/var/spool/postfix/private/*
    private/var/spool/postfix/public/*
    private/var/imap/socket/*
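
Because a missing exclude is what sends rsync into /dev and /Network, it may be worth checking the exclude file before each run. The following is a sketch under that assumption; `check_excludes` is a hypothetical helper, and the three patterns it insists on are simply the ones tied to the runaway behaviour described above.

```shell
#!/bin/sh
# Sketch: refuse to sync "/" unless the exclude file lists the patterns
# whose absence causes the runaway sync. The list of required patterns
# is an assumption for this example; extend it to taste.

check_excludes() {
    # $1 = path to the exclude file; reports each missing pattern.
    missing=0
    for pattern in 'dev/*' 'Network/*' 'Volumes/*'; do
        # -F: fixed string, -x: whole line, -q: quiet
        if ! grep -Fxq -- "$pattern" "$1"; then
            echo "missing exclude: $pattern" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A script could call `check_excludes "$EXCLUDEFILE" || exit 1` before starting. Adding rsync's `--dry-run` option to a trial run is another way to see what would be copied without touching the backup.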
    

    If you’re astute, you’ll also notice that rsync on our Xserve is trying to use the rsync in /usr/bin on the G4. It’s no big problem, but to get around it, we’ll set the --rsync-path option to /usr/local/bin/rsync. The command line is starting to get hairy, so we’ll pull out the options and put them into shell variables:

    RSYNC="/usr/local/bin/rsync"
    RSYNCOPTS="-e ssh --archive --update --delete --verbose --hfs-mode=appledouble"
    EXCLUDEFILE="/Library/Scripts/Sync/sync_exclude"
    PUSH="System Users Space/admin"
    LOCALPREFIX="/Volumes"
    REMOTEPREFIX="root@192.168.1.5:/Volumes"
    
    for dir in $PUSH
    do
      $RSYNC $RSYNCOPTS --exclude-from=$EXCLUDEFILE --rsync-path=$RSYNC $LOCALPREFIX/$dir/ $REMOTEPREFIX/$dir
    done
    

    That’s looking pretty good. In the final version, we’ll want to check the exit status, send output to logfiles, and send email if there are errors. Then we’ll want to drop the script in the appropriate directory. (I picked /Library/Scripts/Sync.) For quick and easy deployment to my customers’ sites, I went ahead and made a package of this as well and included my rsync binary. You can take a peek at that here:

  • http://people.aiscomputers.com/~msolberg/sync.dmg

    Then we’ll run the script by hand on the Xserve, make sure that everything looks good on the G4, change the Startup Disk on the G4, remember to unplug the ethernet cable from it so we don’t get an IP conflict, and reboot. It should come up looking just like the Xserve. Woot!

    Ed. Note: For this to work correctly, since the remote G4 is a live system, you’ll want to have a System volume with a different name than the volumes that you are synching over. In case of disaster on the primary volume, you would set the boot volume on the secondary server to the cloned copy of the primary server and reboot it. At this point the secondary would be a spitting image of the primary from the last time the rsync was run.
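
The exit-status checking, logging, and email mentioned for the final version could be sketched along these lines. This is not the script shipped in the package; `LOGFILE`, `MAILTO`, and the function names are assumptions made for the example, while the rsync options are the ones used earlier in the article.

```shell
#!/bin/sh
# Sketch of a wrapper that logs each run, checks rsync's exit status,
# and mails a summary on failure. LOGFILE, MAILTO and the function
# names are assumptions made for this example.

RSYNC="/usr/local/bin/rsync"
RSYNCOPTS="-e ssh --archive --update --delete --verbose --hfs-mode=appledouble"
EXCLUDEFILE="/Library/Scripts/Sync/sync_exclude"
LOGFILE="/var/log/sync.log"    # assumed location
MAILTO="root"                  # assumed recipient

log() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOGFILE"
}

notify() {
    # mail(1) reads the message body from standard input.
    echo "$1" | mail -s "sync failed on $(hostname)" "$MAILTO"
}

sync_dir() {
    # $1 = local source (with trailing slash), $2 = remote destination
    log "sync start: $1"
    $RSYNC $RSYNCOPTS --exclude-from="$EXCLUDEFILE" --rsync-path="$RSYNC" \
        "$1" "$2" >> "$LOGFILE" 2>&1
    status=$?
    log "sync end: $1 (exit $status)"
    return "$status"
}

sync_all() {
    # $1 = local prefix, $2 = remote prefix, remaining args = directories
    local_prefix=$1
    remote_prefix=$2
    shift 2
    failed=""
    for dir in "$@"; do
        sync_dir "$local_prefix/$dir/" "$remote_prefix/$dir" || failed="$failed $dir"
    done
    if [ -n "$failed" ]; then
        notify "rsync failed for:$failed (see $LOGFILE)"
        return 1
    fi
    return 0
}
```

With the variables from the script above, the call would be something like `sync_all "$LOCALPREFIX" "$REMOTEPREFIX" $PUSH`. Every directory is attempted even if an earlier one fails, so a single bad volume doesn't block the rest of the backup.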

    Running the script from cron

    Once we have a good sync script, we’ll want to run it every night from cron. The easiest way to do this is to create a daily.local script and drop it in our /etc directory. It will run after all the other daily tasks are finished. If you already have an /etc/daily.local, just add the running of the script to it. Be sure to roll any logs that you create in your daily script so that you can keep track of what happened while you weren’t looking. If you want some fancy execution times, go ahead and edit /etc/crontab or root’s crontab. Just remember that the script will have to run with root permissions to keep the permissions on the target host correct.
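
Rolling the logs can be as simple as the function below. It is a sketch: the script name `sync.sh`, the log path, and the number of old logs kept are all assumptions, and `roll_log` is a made-up helper rather than anything OS X provides.

```shell
#!/bin/sh
# Sketch of an /etc/daily.local fragment. The script name "sync.sh",
# the log path, and the retention count are assumptions.

roll_log() {
    # $1 = log file, $2 = highest numbered suffix to keep (.0 is newest)
    logfile=$1
    i=$2
    while [ "$i" -gt 0 ]; do
        prev=$((i - 1))
        # Shift logfile.(i-1) up to logfile.i, overwriting the oldest.
        if [ -f "$logfile.$prev" ]; then
            mv -f "$logfile.$prev" "$logfile.$i"
        fi
        i=$prev
    done
    if [ -f "$logfile" ]; then
        mv -f "$logfile" "$logfile.0"
    fi
    : > "$logfile"    # start the new run with an empty log
}

# In /etc/daily.local one might then add:
#   roll_log /var/log/sync.log 7
#   /Library/Scripts/Sync/sync.sh
#
# Or, for a fixed 3:00 AM run, a line like this in /etc/crontab:
#   0 3 * * * root /Library/Scripts/Sync/sync.sh
```

Both of the commented-out variants run the script as root, which is what the sync needs to preserve ownership on the target.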

    Ed. Note: Do keep in mind that while this article has you synchronizing the entire system from one machine to another, you don’t have to do it all or nothing. Instead you can just synch data folders or other pieces to another system if that’s all you need. However, a remote, up to date, bootable clone of your production server is quite a sexy thing to have around in case things go bad.

    No Comments

    • if you don’t need as much control, and have the luxury of an unused spare
      box next to your main server, putting the spare box in target disk mode,
      attaching to the main server via firewire, and using psync works, too. if the
      main server goes down, just restart the box that was in target disk mode –
      but this solution is NOT as flexible as the rsync method above.


      Professional Services/Training
      MacOutfitters of Cranberry, PA
      “What guy would call his company ‘micro’ and ‘soft’?”

    • rdiff bootable?


      Changing the world, one server at a time.

      Joel Rennich
      http://www.afp548.com

    • I hope so, how about a whole suite of free backup tools? I just had a look at
      BRU for the OS X, ai-ya, 187 page manual for the LE version. I’m sure it’s
      great but I don’t have the patience to read through that. Using Rsync (or maybe
      rdiff-backup, never heard of it till now) is much simpler even with the work-
      arounds.

      By the way thanks to the author for providing a patched installer. I had no
      clue how to run patch.

      If anyone can tell me how to backup ~/Library/Application Support in the
      script that comes with the installer, please clue me in. I’ve tried quoting and
      escaping it but the script still throws an error that the directory Support
      doesn’t exist.

      I can get it to work if I type in the rsync command with the source being:
      ~/Library/Application\ Support
      –and the dest being–
      "[email protected]:~/Library/Application\ Support"

      SSH wants the dest to be quoted and escaped which is fine but I can’t figure
      out how to do that to the variables in the script.

      –johne.mac

      • That’s a good one.

        The easiest answer would be just to transfer all of ~/Library and
        exclude what you don’t want in the exclude file. Otherwise, what I did
        to make it work was something like this:

        LOCALDIR="Test Dir"
        REMOTEDIR="user@server:Test\ Dir"
        
        rsync --archive "$LOCALDIR"/ "$REMOTEDIR"
        

        Quoting the variables in the actual rsync line and escaping the space in
        the REMOTEDIR variable seems to make the difference. The first set of
        quotes protects the strings as they’re passed to the rsync line, the
        quotes in the rsync line protect the strings passed to rsync itself, and the
        escape gets passed to the remote server. Messy.
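
For what it's worth, the escaping step can be wrapped in a small helper so the variables themselves stay readable. A sketch only: `remote_escape` is a made-up name, it handles spaces and nothing else, and `user@server` is the placeholder from the example above.

```shell
#!/bin/sh
# Sketch: add the backslashes the remote shell needs, so the variables
# can hold the plain directory name. Only spaces are handled here.

remote_escape() {
    # Turn each space into backslash-space.
    printf '%s\n' "$1" | sed 's/ /\\ /g'
}

# Example use (user@server is a placeholder):
#   DIR="Test Dir"
#   rsync --archive "$DIR"/ "user@server:$(remote_escape "$DIR")"
```

The local side still just needs double quotes around the variable; only the remote path gets the extra backslashes.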

    • --eahfs is for the rsyncX supplied version of rsync,
      which I think is the same as the rsync_hfs project listed on
      OpenDarwin.org. The one that ships with OS X doesn’t contain that
      option:

      $ nroff -man /usr/share/man/man1/rsync.1 | grep hfs
      $
      
    • Hi!

      So, someone asked if I could provide a patched patch. This one
      includes both patches mentioned above and applies cleanly to rsync
      2.6.3.

      http://people.aiscomputers.com/~msolberg/
      rsync-2.6.3-macosx-patches.diff.gz

      To compile, do something like this:

      $ tar zfxv rsync-2.6.3.tar.gz
      $ cd rsync-2.6.3
      $ gzip -dc ../rsync-2.6.3-macosx-patches.diff.gz | patch -p1
      $ LDFLAGS='-framework CoreServices' ./configure
      $ make
      $ sudo make install
      

      • I have not been able to get this to compile on 10.4 at all. I will try installing the dev tools on a 10.3 machine around the office to get my own compile.
        Here is the output for the compiling of hfs.c.

        
        hfs.c: In function 'do_open_ephemeral':
        hfs.c:84: warning: return makes integer from pointer without a cast
        hfs.c: In function 'map_file_ephemeral':
        hfs.c:146: warning: implicit declaration of function 'applefile_hdr'
        hfs.c:146: warning: assignment makes pointer from integer without a cast
        hfs.c: In function 'hfs_rfmd':
        hfs.c:200: warning: implicit declaration of function 'get_hfs_metadata'
        hfs.c:200: warning: assignment makes pointer from integer without a cast
        hfs.c: At top level:
        hfs.c:230: error: conflicting types for 'get_hfs_metadata'
        hfs.c:200: error: previous implicit declaration of 'get_hfs_metadata' was here
        hfs.c: In function 'get_hfs_metadata':
        hfs.c:304: warning: passing argument 2 of 'FSGetCatalogInfo' makes integer from pointer without a cast
        hfs.c: At top level:
        hfs.c:387: error: conflicting types for 'applefile_hdr'
        hfs.c:146: error: previous implicit declaration of 'applefile_hdr' was here
        hfs.c: In function 'applefile_hdr':
        hfs.c:446: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:447: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:448: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:449: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:452: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:453: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:454: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:458: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:459: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:463: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:464: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:465: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        hfs.c:466: warning: pointer targets in passing argument 2 of 'swrite' differ in signedness
        

        I use rsync to copy large amounts of data (cloning sides of an Xserve RAID across two servers), and I am having trouble getting the provided dmg to properly copy resource forks. I am not sure what I am doing wrong. I usually set up rsync in xinetd on the host server, set up a read-only module for my RAID, and then cron a command to pull down the files, and it does not copy resource forks on 10.3 or 10.4. I even tried the basic push of the files through ssh from the host machine, but while it says it is copying the ._* files, I am not getting custom folder icons, etc.

        RsyncX is the only thing I have gotten to copy rsrc over the network but it constantly bombs out and I have even had it crash servers on several occasions.


        David H. Aronsohn
        Assistant Desktop Server Administrator
        :wq

    • Thanks for the fix.
      Pointing to the entire library is easier, but getting spaces to work in the script
      is also good to know.
      Spaces, another UNIX/Mac gap that Tiger may bridge?

    • Could you post an example of putting the AppleDouble files back together?

    • This method moves the forks, but splits them first.

      You need to rejoin them on the other side.


      Breaking my server to save yours.

      Josh Wisenbaker
      http://www.afp548.com

    • I’m using 10.4.4

      The article by Ulrich Hoffmann mentions that he couldn’t get chown -h to
      work as a means to change ownership of symbolic links, but it works for me.
      Was this a problem that existed before and was fixed, or could Hoffmann
      have made a mistake?

      I still have problems with the native rsync that ships with Tiger, so I assume
      that Apple hasn’t incorporated Hoffmann’s patch.

    • Can someone explain why NFS Exported partitions are used? (I have a basic
      understanding of the benefits this might give but I probably do not fully
      understand). Ta.
