Articles March 29, 2005 at 5:55 pm

IP Failover in OS X 10.3 Server

Stupid-simple high availability server solutions using the built-in failover scripts in 10.3.

OS X Server has had a variant of this system since 10.2, but few people have used it or even realize it exists. Here’s a walkthrough of how to provide an active/passive solution for web and MySQL services.

Updated 3/30/05: Fixed information about LUN masking on Xserve RAID.

Ed. Note: OS X Server has built-in active/passive failover capacity. This means that the secondary server in the HA (high availability) pair does nothing until the primary one fails. This is different from an active/active HA solution, where both servers load balance connections and one assumes all of the other’s traffic in case of a failover.

This is another one of those server functions that’s fairly easy to walk through the first time, but the devil’s in the details. Especially when your organization’s website or client data is on the line, it’s best to test your configuration thoroughly in the lab before deploying.

There are several things to consider before setting up a failover server: which services are running on your primary machine, and where your critical data resides. All of these services and data must be available to the secondary system for this to work.

Failover services are provided by two daemons. The heartbeatd daemon runs on the primary server and broadcasts every second on port 1694 on both specified interfaces. The failoverd daemon runs on the secondary server and listens on port 1694, on both interfaces, for broadcasts from a specific address. When the secondary machine stops receiving messages on both interfaces, it takes over the public IP address of the primary server.

For this example I’m going to lay out the process of setting up failover for a primary server that’s acting as a web server and running MySQL and PHP (version 4.3.x from the Entropy install).

The most important thing I can stress before embarking on this sort of project is to plan out exactly what you want to do, and then test your solution on a couple of spare machines. Planning where your data resides, working out how you’re going to keep your config files in sync, and writing your failover scripts will take longer than actually setting up a machine to acquire the IP address of your primary server.

So, let’s start with our plan. For our primary server, server1.test.com, we know that we’re going to have unique data that will have to be accessed by the failover, or secondary, machine. While we could accomplish this with rsync, which we use to keep the config files up to date, the best solution is for both servers to have access to the data in a common location without having to copy copious amounts of data all the time. For us, this problem is solved by using an attached Xserve RAID. To keep this simple we’re going to symlink the data that is usually stored in /Library/WebServer/Documents and in /var/mysql to the Xserve RAID.

Ed. Note: While the method of using a single piece of storage that is “shared” between two servers will work, you should also explore true clustered file systems like Xsan for this. Xsan won’t always be the best solution, but it will greatly simplify the shared storage requirement.

After we’ve found a home for the data, we next need a way to keep the configuration files for our services up to date. We’ll be dealing with the Apache configuration file (/etc/httpd/httpd.conf), the MySQL configuration file if you have one (/etc/my.cnf), and your PHP config file (/usr/local/php/lib/php.ini). We’re not going to store these on the RAID. Since they should only change on rare occasions, we’re going to rsync them to the failover machine every 2 hours. These files are so small that it only takes seconds to copy them over, and any config change will reach the failover machine within at most 2 hours.

Now that we have a good idea of what we’re planning to do here, let’s sketch it out and set up our two test servers.

  • 1. Move /Library/WebServer/Documents contents to /Volumes/RAID/WebSite_Data
  • 2. link /Library/WebServer/Documents to /Volumes/RAID/WebSite_Data
    ln -s /Volumes/RAID/WebSite_Data /Library/WebServer/Documents
  • 3. Move /var/mysql to /Volumes/RAID/WebSite_mysql
  • 4. link /var/mysql to /Volumes/RAID/WebSite_mysql
    ln -s /Volumes/RAID/WebSite_mysql /var/mysql
  • 5. Set up public and private SSH keys between the root users of the primary and secondary server – see Josh’s article on The Keys to the Door of the SSH Tunnel.
  • 6. Write & test an rsync script to be invoked by cron every 2 hours – see Josh’s article on Crontab Basics.
  • 7. Write & test your Pre & Post scripts for starting and stopping your services on failover and failback
  • 8. Set up failover and failback on your test servers, and test thoroughly until both work seamlessly.
  • 9. Plan on twice as much time as you think you’re going to need to implement failover on the real server and make a full backup of your primary server before you start any work.
    With this plan in mind, start setting up your two testing servers. The two machines do not have to be identical in hardware, but their software installations should match.
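
Steps 1 through 4 of the plan can be sketched as one small helper. This is only a sketch under assumptions: the function name relocate_to_raid is made up for illustration, the RAID is assumed to be mounted at /Volumes/RAID as in the plan, and Apache and MySQL are assumed to be stopped before it runs. Ownership and permissions still need to be checked by hand afterwards.

```shell
#!/bin/bash
# relocate_to_raid SRC DEST
# Hypothetical helper: moves the directory SRC to DEST and leaves a
# symlink at SRC pointing to DEST, so the default paths keep working.
relocate_to_raid() {
    local src="$1" dest="$2"

    if [ -L "$src" ]; then
        echo "$src is already a symlink, nothing to do" >&2
        return 0
    fi
    if [ ! -d "$src" ]; then
        echo "$src is not a directory" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists, refusing to overwrite it" >&2
        return 1
    fi

    mv "$src" "$dest" && ln -s "$dest" "$src"
}

# Example (run as root, with Apache and MySQL stopped):
#   relocate_to_raid /Library/WebServer/Documents /Volumes/RAID/WebSite_Data
#   relocate_to_raid /var/mysql /Volumes/RAID/WebSite_mysql
```

The checks up front make the helper safe to run twice: a second run sees the symlink and does nothing instead of nesting directories.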

    Now, you may have a question here about how we can have two machines attached to the same storage without worrying about problems with your data. The simple answer lies within the /etc/fstab file on your secondary server (server2.test.com). Edit the file to contain the following:

    LABEL="whatever_your_raid_is_called" 	none hfs xx

    This will make server2.test.com ignore your RAID on boot, and we will have the Pre and Post scripts below mount and unmount the storage when we need it.

    Ed. Note: You can also use the UUID of the disk instead of the label. The manpage on fstab has more information on this. Also note that with the Xserve RAID admin tools you can use LUN masking to accomplish the same goal, but there is no easy way of changing LUN masking on the RAID programmatically during failover. So instead we use /etc/fstab, which fills all our needs.
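
Before rebooting server2.test.com it can be reassuring to confirm that the entry really is in place. The following is a minimal sketch, not part of the original setup: the function name fstab_ignores_label is made up for illustration, and it assumes the entry is written exactly in the form shown above (quoted label, no spaces in the label, xx as the last field).

```shell
#!/bin/bash
# fstab_ignores_label LABEL [FSTAB]
# Hypothetical helper: succeeds if FSTAB (default /etc/fstab) has an
# uncommented entry for LABEL="..." whose last field is "xx".
# Assumes the label contains no spaces.
fstab_ignores_label() {
    local label="$1" fstab="${2:-/etc/fstab}"
    awk -v want="LABEL=\"$label\"" '
        $1 == want && $NF == "xx" { found = 1 }
        END { exit !found }
    ' "$fstab"
}

# Example:
#   fstab_ignores_label BigRAID && echo "BigRAID has an xx entry in /etc/fstab"
```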

    Although it’s entirely possible for you to configure Apache, MySQL and PHP to look directly to the RAID for their data, the reason we’re using symbolic links in steps 1 through 4 is to preserve the default paths for anyone other than you who has to administer your server. It also means that you don’t have to change anything in httpd.conf or in my.cnf.

    One small point, which may be obvious: while you’re moving all this data, Apache and MySQL should be stopped. Only restart the services after you’ve created the links and checked the ownership on all the files you’ve moved.

    Step 5 is fully explained in Josh’s article. I set up the SSH key on server2.test.com and copied the public key over to the root home directory on server1.test.com, so let’s move on to setting up the rsync.

    I set up the script on server2.test.com in /usr/local/share and called it conf_rsync. Remember to make this an executable file:

    #!/bin/bash
    
    # This is a file that rsyncs the critical conf files from
    # server1.test.com to server2.test.com, so in the
    # event of a failover all conf files will be up to date.
    # No passwords are required due to SSH keys being used by
    # the root accounts.
    
    # This script is called by /etc/crontab
    
    rsync server1:"/etc/httpd/httpd.conf" /etc/httpd
    rsync server1:"/etc/my.cnf" /etc
    rsync server1:"/usr/local/php/lib/php.ini" /usr/local/php/lib
    

    My addition to crontab to run this script every two hours looks like this:

    # Run rsync scripts every two hours.
    0   */2 *   *   *   root    /usr/local/share/conf_rsync
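
A slightly more defensive variant of the same script is sketched below. It is an illustration rather than the script used above: the function name sync_conf_files, the RSYNC override and the -a and -e ssh options are my additions (-a keeps permissions and timestamps, -e ssh forces the copy to go over SSH). A failed copy is noted in the system log so that a stale config file does not go unnoticed.

```shell
#!/bin/bash
# Variant of conf_rsync. RSYNC can be overridden for a dry run,
# e.g. RSYNC="echo rsync"; it defaults to the real rsync.
CONF_FILES="/etc/httpd/httpd.conf /etc/my.cnf /usr/local/php/lib/php.ini"

# sync_conf_files HOST
# Copies each file in CONF_FILES from HOST into the same directory locally.
# Returns non-zero if any copy failed and notes the failure in the system log.
sync_conf_files() {
    local host="$1" failed=0 f
    for f in $CONF_FILES; do
        if ! ${RSYNC:-rsync} -a -e ssh "$host:$f" "$(dirname "$f")/"; then
            logger "conf_rsync: failed to copy $f from $host"
            failed=1
        fi
    done
    return "$failed"
}

# In the cron-driven script this would be called as:
#   sync_conf_files server1
```

Running it once with RSYNC="echo rsync" prints the three commands without copying anything, which is a cheap way to check the paths before handing the script to cron.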
    

    We’re on to writing our Pre and Post scripts now. To give you a brief idea of what these scripts do:

    PreAcq: runs before acquiring IP address from primary server
    PostAcq: runs after acquiring IP address from primary server
    PreRel: runs before relinquishing IP address back to primary server
    PostRel: runs after relinquishing IP address back to primary server

    You can have multiples of these scripts. For instance, you could have files called PostAcq_apache, PostAcq_mysql and PostAcq_mountraid; these would be executed in the order they are listed in the terminal by ls -l. If you have some quite complicated tasks to perform, this may be an option for you. However, to keep things simple in this example I’m going to stick to one script per stage, and in fact, for the tasks we want to accomplish we’re only going to look at the PostAcq and PreRel scripts. We could have a PostRel that checks that server1.test.com is indeed running, and running all of its services, but let’s just stick to the basics for now.

    In the PostAcq script we need to inform server administrators that there’s been a problem, mount the storage, start apache, start MySQL and make a note in the server log. This is what my PostAcq looks like:

    #!/bin/bash
    
    # Post Acquire failover script
    
    subject="server1.test.com has failed"
    to="[email protected], [email protected], [email protected]"
    body="server1.test.com has failed. server2.test.com
    has taken over all responsibilities and will continue
    to function as server1.test.com until the primary server
    comes back on-line.  Your emergency contacts are: 
    Elmer Fudd - (123) 555-1234  
    Daffy Duck - (123) 555-5678  
    and Marvin the Martian - (123) 555-9999"
    
    
    # Mount RAID
    diskutil mount `diskutil list | grep BigRAID | awk '{print $6}'`
    
    # Start Apache
    apachectl start
    
    # Start MySQL
    /usr/bin/mysqld_safe --user=root &
    
    # Send e-mail advising that a failover event has occurred
    echo "${body}" | mail -s "${subject}" "${to}"
    
    # Single quotes inside the message keep it one properly quoted argument
    logger "Sent alert email '${subject}' to '${to}'."
    

    This should all be fairly self-explanatory except the mounting of the storage. This is more difficult because OS X does not guarantee that a drive will always be mounted as the same device, nor can you mount a drive by name alone. So we’ve used a quick grep/awk construct to find the drive named “BigRAID” in the output of diskutil list and get the device identifier from that. Obviously you’ll need to swap in the name of your own disk for “BigRAID”.
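
The lookup can be made a little stricter than a plain grep. The sketch below is an assumption-laden alternative, not the script used above: the function name device_for_volume is made up, it assumes the device identifier is the last column of the matching line in diskutil list output (check this on your own system), and it assumes the volume name contains no spaces. Matching the name as a whole field avoids picking up a volume called, say, BigRAID2.

```shell
#!/bin/bash
# device_for_volume NAME
# Reads "diskutil list" output on stdin and prints the last column
# (assumed to be the device identifier) of the first line that has
# NAME as a whole field. Assumes the volume name contains no spaces.
device_for_volume() {
    awk -v name="$1" '
        {
            for (i = 1; i <= NF; i++)
                if ($i == name) { print $NF; exit }
        }'
}

# In PostAcq this could replace the grep/awk pipeline:
#   dev=$(diskutil list | device_for_volume BigRAID)
#   [ -n "$dev" ] && diskutil mount "$dev"
```

Checking that the result is not empty before calling diskutil mount means a missing RAID shows up as a skipped mount rather than as a diskutil usage error.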

    It would be wise to test both the diskutil mount part of the script and the entire failover functionality before deploying. Just a thought.

    Once server2.test.com starts hearing the heartbeat from server1.test.com on both interfaces again it can run the PreRel script:

    #!/bin/bash
    
    # Pre-Release script - run on failover server before returning priority to main server
    
    subject="server2.test.com is returning priority to server1.test.com"
    to="[email protected], [email protected], [email protected]"
    body="server1.test.com has become available again. server2.test.com
    has stopped all its failover services, unmounted the RAID
    and is resuming its position as the failover server.
    Your emergency contacts are: Elmer Fudd - (123) 555-1234
    Daffy Duck - (123) 555-5678  and Marvin the Martian - (123) 555-9999"
    
    # Stop Apache
    apachectl stop
    
    # Stop MySQL
    mysqladmin shutdown --user=root
    
    # Send e-mail advising that a failback event has occurred
    echo "${body}" | mail -s "${subject}" "${to}"
    
    logger "Sent failback alert email '${subject}' to '${to}'."
    
    # Eject RAID volume so it can be mounted by the other server
    diskutil eject /Volumes/BigRAID
    

    Make both of these scripts executable and try them out on your test server. They can live anywhere for testing purposes.

    Now we’re actually on to modifying /etc/hostconfig on both machines and setting up our private network. I’m assuming that your machines are both already set up with their own public IP addresses, and both are working fine. I like to set up my private network over FireWire, not only because as Mac users we can, but because it helps me identify more quickly which is which. You can use either FireWire 400 or FireWire 800 for this. Set up your new interface in System Preferences, and in the Network Port Configurations panel ensure the interface used for your public network is at the top of the list.

    I’m going to stop here and show you exactly what my IP details for each server are:

    server1.test.com
    
    Public Network (en0)
    IP: 10.1.0.1
    Subnet: 255.255.255.0
    Router: 10.1.0.1
    DNS: 10.1.0.1
    Search Domain: test.com
    
    Private Network (fw0)
    
    IP: 192.168.1.1
    Subnet: 255.255.0.0
    
    
    server2.test.com
    
    Public Network (en0)
    IP: 10.1.0.2
    Subnet: 255.255.255.0
    Router: 10.1.0.1
    DNS: 10.1.0.1
    Search Domain: test.com
    
    Private Network (fw0)
    
    IP: 192.168.1.2
    Subnet: 255.255.0.0
    

    Do note that the private FireWire network only has an IP address and subnet mask, and that its subnet is different from that of the public Ethernet interface.

    Now you should have your machines connected by the public IP addresses, and by a FireWire cable over the private IP addresses. You can make sure the machines can see each other with a simple ping to both the public and private IP addresses.
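
The ping check is easy to wrap in a few lines so it can be repeated after every cabling change. This is a minimal sketch with made-up names (check_peer, and a PING override that exists only so the helper can be exercised without a network); the addresses in the example are the ones used in this article.

```shell
#!/bin/bash
# check_peer IP...
# Sends one ping to each address and reports the result.
# PING can be overridden (useful for testing the script itself).
check_peer() {
    local ip status=0
    for ip in "$@"; do
        if ${PING:-ping} -c 1 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip UNREACHABLE"
            status=1
        fi
    done
    return "$status"
}

# From server1.test.com, using the addresses in this example:
#   check_peer 10.1.0.2 192.168.1.2
```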

    On server1.test.com use your favorite text editor to edit the /etc/hostconfig file. Add the following line:

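    FAILOVER_BCAST_IPS="192.168.255.255 10.1.0.255"
    

    This sets the addresses that heartbeatd on your primary server should broadcast to: the private broadcast address first, and the public broadcast address second. Each one must be the broadcast address of the matching interface’s subnet, so with the 255.255.0.0 mask used on the private network above it is 192.168.255.255 (192.168.1.255 would only be correct with a 255.255.255.0 mask). If the two don’t match, nothing is broadcast over the private network.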
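
The broadcast address for an interface follows from its IP address and subnet mask, so it is worth computing rather than guessing. A minimal sketch of the arithmetic (the function name broadcast_for is made up for illustration):

```shell
#!/bin/bash
# broadcast_for IP MASK
# Prints the broadcast address for a dotted-quad IP and subnet mask.
broadcast_for() {
    local old_ifs="$IFS"
    IFS=.
    set -- $1 $2        # split both dotted quads into eight octets
    IFS="$old_ifs"
    # Broadcast = IP with every host bit (the zero bits of the mask) set to 1.
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}

broadcast_for 192.168.1.1 255.255.0.0      # prints 192.168.255.255
broadcast_for 10.1.0.1 255.255.255.0       # prints 10.1.0.255
```

Whatever mask you choose for the private network, the first entry in FAILOVER_BCAST_IPS should be the address this prints for it.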

    Also in /etc/hostconfig on server1.test.com set the value of IPFORWARDING:

    IPFORWARDING=-YES-
    

    Restart server1.test.com. The IPFailover startup item will run on reboot; heartbeatd will launch, move to the background and then continue to send out the heartbeat messages.

    On server2.test.com use your favorite text editor again to modify /etc/hostconfig. Add the following lines:

    FAILOVER_PEER_IP="192.168.1.1"
    FAILOVER_PEER_IP_PAIRS="en0:10.1.0.1"
    

    The first line indicates which IP address to listen for broadcasts from, and the second line indicates which IP address to take over as a public IP, and on which interface, in the event of the heartbeat being lost from server1.test.com.
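
With two machines and two slightly different edits to the same file, it is easy to lose track of which settings ended up where. The following sketch reports what role a given hostconfig file describes. The function name failover_role is made up for illustration, and the check is purely textual: it looks at the file and says nothing about whether the daemons are actually running.

```shell
#!/bin/bash
# failover_role [HOSTCONFIG]
# Prints "secondary" if the file sets FAILOVER_PEER_IP, "primary" if it
# sets FAILOVER_BCAST_IPS, and "none" otherwise.
failover_role() {
    local conf="${1:-/etc/hostconfig}"
    if grep -q '^FAILOVER_PEER_IP=' "$conf"; then
        echo secondary
    elif grep -q '^FAILOVER_BCAST_IPS=' "$conf"; then
        echo primary
    else
        echo none
    fi
}

# Example:
#   failover_role                 # inspects /etc/hostconfig on this machine
```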

    Also in /etc/hostconfig set the value of IPFORWARDING as was done on server1.test.com:

    IPFORWARDING=-YES-
    

    Disconnect the Ethernet cable and the FireWire cable (public and private interfaces) from server2.test.com and reboot the machine. Once the server has fully rebooted, reconnect the private interface (FireWire in our case) and wait 15 seconds. After 15 seconds, reconnect the public interface (Ethernet in our case) to server2.test.com.

    Your two testing machines are now set up for failover. To test this out, connect a client machine to your test network and ping both public interfaces. Now disconnect the private interface from server1.test.com. At this point server2.test.com is aware that it’s lost one of the heartbeats from server1.test.com, but not both, so it hasn’t taken over yet. Now disconnect the Ethernet cable from server1.test.com (go ahead, try it) and you’ll notice that you can still ping both 10.1.0.1 and 10.1.0.2. Remarkable, isn’t it! Follow the same order for reconnecting server1.test.com as shown above: the private interface first, wait 15 seconds, and then the public interface.

    Remember those Pre and Post scripts we wrote earlier? Now it’s time to put them into action.

    On server2.test.com create a directory in /Library called IPFailover:

    /Library/IPFailover
    

    In the directory IPFailover create another directory named for the public IP address that server2.test.com is going to take over:

    /Library/IPFailover/10.1.0.1
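
Creating the directory and putting the scripts in it can be sketched as follows. Treat this as an illustration: the function name install_failover_scripts and the optional third argument (an alternative root directory, handy for a dry run) are made up, and it assumes the scripts were written under the stage names listed earlier (PreAcq, PostAcq, PreRel, PostRel) and are meant to live in the per-address directory.

```shell
#!/bin/bash
# install_failover_scripts IP SRC_DIR [ROOT]
# Creates ROOT/IP (ROOT defaults to /Library/IPFailover) and copies any of
# the four stage scripts found in SRC_DIR into it, marked executable.
install_failover_scripts() {
    local ip="$1" src="$2" root="${3:-/Library/IPFailover}" name
    mkdir -p "$root/$ip" || return 1
    for name in PreAcq PostAcq PreRel PostRel; do
        if [ -f "$src/$name" ]; then
            cp "$src/$name" "$root/$ip/$name" || return 1
            chmod 755 "$root/$ip/$name" || return 1
        fi
    done
    ls -l "$root/$ip"
}

# Example, run as root on server2.test.com (the source path is hypothetical):
#   install_failover_scripts 10.1.0.1 /path/to/your/tested/scripts
```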

    With the PostAcq and PreRel scripts in place in this directory (and still executable), point the web browser on your client machine to server1.test.com. You should see the website that’s running from server1.test.com, with all the data hosted from your attached RAID volume. Disconnect your private and public interfaces from server1.test.com. Refresh your web browser. If you still have content, give yourself a pat on the back; if not, it’s time to double-check all of your settings and scripts and be glad that you’re doing this on a testing set of servers.

    Some things to note while you’re running through this whole setup: check your logs frequently. This is why we wrote entries to system.log; it makes it easier to follow what’s going on. Also have a look at /Library/Logs/failoverd.log, where all failover activity is logged.

    Now we’re on step 8 of our plan: testing everything we possibly can on our test server setup. This is where you can play with things and try to make it break. The more you break and fix it here, the better you’ll understand things if anything goes wrong on the real servers. Once you’re confident in your test setup and have shown it off to all your friends, it’s time to schedule your real server failover project. Allow yourself lots of time, back up all your data first, and double-check everything. After this is completed, think how much easier you’ll be able to sleep at night knowing that all your clients will be able to access your data even if the power supply dies at 3AM on your primary server.

    About

    Andrina Kelly is responsible for anything and everything touched by, or connected to, a Mac at Bell Media, Canada's premiere multimedia company. You may recognize her name from the end credits of Canada's evening news broadcast. She has previously spoken at MacSysAdmin, JAMF National Users Conference, Apple's WWDC, Macworld IT conferences, Mac Networkers Retreat, and Canada MacExpo.

    Comments

    • Well, Xsan is half the battle since that clusters your storage. Ideally you’d put a
      hardware load balancer in front of the two servers, and then you wouldn’t have to
      worry about any IPFailover config on the servers themselves as the hardware
      would do all the work for you.


      Changing the world, one server at a time.

      Joel Rennich
      http://www.afp548.com

    • Doh! Forgot about the update. Corrected in the article.


      Changing the world, one server at a time.

      Joel Rennich
      http://www.afp548.com

    • The SharePoints you’re looking for are held in netinfo… if you do

      nicl . list /config/SharePoints

      you’ll see the listing of your
      shares, and for details you can

      nicl . read /config/SharePoints/your_sharename_here

      You should be able to do a nidump and niload of this information between
      your two servers –

      nidump -r /config/SharePoints

      should start
      you in the right direction…

      • Also note the little-used "sharing" command, which will allow you to script any
        sharepoint changes. Not quite like synchronizing them, but might be of use.


        Changing the world, one server at a time.

        Joel Rennich
        http://www.afp548.com

        • "man sharing" on server


          Changing the world, one server at a time.

          Joel Rennich
          http://www.afp548.com

        • From

          man sharing

          sharing – create share points for afp, ftp and smb services.
          You can add, edit, delete and list all existing sharepoints with this command.

          For example,

          sharing -l

          will give you a listing of all your sharepoints, from this listing you should be
          able to pull out that data to feed back into a script to create sharepoints on
          the other server… let us see what you come up with!

          • Can you sync user info between servers? I’m guessing if you nidump passwd, the passwords won’t actually load into the backup server, so do you just manually keep users up to date in both servers?

            Thanks…

            • Local users, or LDAP users? For ldap you can set up your failover
              machine as a replica. For local netinfo, I believe the passwd flag of
              nidump is what you’re looking for – check out man nidump for more
              info.

    • Has anyone had trouble getting the secondary server to relinquish the IP? I’ve
      got Failover working but when I bring up the primary server again, it
      complains that its IP address is in use.
      I’m not using firewire for the private network but instead en1 on both
      xserves. The private IPs are
      primary
      10.0.1.1
      255.255.0.0

      secondary
      10.0.1.2
      255.255.0.0

      Nothing in any other fields. Any ideas why it’s not working?
      Thanks

      • Got it. To relinquish, you have to unplug the public ethernet cable from the
        secondary server before reconnecting the primary server’s public interface. Is
        that correct? I guess this is what keeps the primary server from complaining
        that someone else has that manual IP address already?

        • In reality, when bringing my primary server back on-line I tend to down/
          disconnect the secondary server, rather than letting the failback take place –
          if you’re there putting the primary back in action anyway, why complicate the
          issue 😉

      • Isn’t IPFORWARDING used by NAT? I’ve got IP Failover working with
        IPFORWARDING=-NO-

    • I see the events in the syslog, but I don’t have /Library/Logs/failoverd.log on
      either machine.
      The man page for failoverd says; "The server logs events that may be of
      interest to the administrator. The log file is named failoverd.log and is stored
      in /Library/Logs."
      So I guess none of my events have been interesting to the administrator?
      😉
      Do I need to touch this file or something?

    • This is a great article but I had some troubles with this set up (apparent by all
      my previous comments). I’d like to share what I found for future readers.

      1. The default IPFORWARDING=-NO- in /etc/hostconfig is fine. This isn’t
      used for ip failover, if you switch it to -YES- it won’t mess ip forwarding up,
      but it’s not necessary.

      2. The Private Network (fw0) Subnet Mask should be 255.255.255.0 if the line
      FAILOVER_BCAST_IPS="192.168.1.255 10.1.0.255" is in your primary server’s
      /etc/hostconfig, otherwise nothing is broadcast over the private network and
      the second server won’t relinquish the shared ip when the primary comes
      back up. (tcpdump helped me figure this one out)
      OR
      change the line to
      FAILOVER_BCAST_IPS="192.168.255.255 10.1.0.255"
      if you use the Subnet Mask 255.255.0.0 for your private network ip. Basically
      these need to match.

    • No need to indicate the path – this is the default that the failover daemon is
      looking for.

    • We were planning to use IPFailover in conjunction with StornextFS, but
      we’ve just had our budget cut for Stornext. We would still like to
      implement failover, but have had problems in the past with (accidental)
      concurrent access to a Raid.

      Has anyone got any ideas on guaranteeing that ServerA is completely
      dead/not accessing the Raid before getting Server B to mount it RW as
      part of its PreAcq script?

      The consequences of having two hosts with RW access to a filesystem at
      the same time are too painful to think about…

    • Hi,

      Is there a way to allow IP failover between 2 interfaces found on one host? For example, interface en0 configured with 192.168.168.1, and en1 with 192.168.168.2. Can anyone suggest a way, using heartbeatd and failoverd, so, when one interface is down, the other one will acquire its ip?

      Thanks.

    • See http://docs.info.apple.com/article.html?artnum=305066

      “IP failover does not work with Intel-based Macs
      With Mac OS X Server 10.4.7 (Universal) or later through 10.4.8, IP failover does not work with Intel-based Macs.”
