A redundant load-balancing firewall system, using FreeBSD.


Goal:   Configure two or more redundant IPF-based firewalls, which
	will also act as load-balancers (henceforth referred to as
	"FWLBs") for an internet services cluster.  If one firewall
	fails, the second will take over as the firewall/load
	balancer.  Barring any catastrophes, this should result in
	a highly-available, almost zero downtime environment. This
	hopefully saves you loads of money on expensive dedicated 
	hardware devices.
       
        We'll end up with a configuration like the following:


                         ~~~
                       ( net )
                         ~~~
                          |
                          |
                          |
                      ----------
                     |  switch  |
                      ---------- 
                      /        \
                     /   VIP    \
             ---------         --------
            |  fwlb1  |       | fwlb2  |
             ---------         --------
                    \    VIP   /
                     \        /
                      ---------
                     | switch |
                      ---------
                      /      \ 
                     /        \
              ---------        ---------
             | server1 |      | server2 |
              ---------        ---------

Where the first VIP (Virtual IP) is the publicly known IP of your site,
and the second is an RFC 1918 private address that your servers will use
as a default route and NAT through.


Prerequisites:

- Two dual-homed FreeBSD-STABLE boxes
- Two switches/hubs
- Two or more real servers for a cluster
- /usr/ports/net/pen
- /usr/ports/net/freevrrpd
- /usr/ports/sysutils/daemontools or /usr/ports/sysutils/runit
- kernel compiled with IPFILTER and IPFILTER_LOG

Pen is a simple but flexible load balancer.  FreeVRRPd handles our
failover.  Daemontools (or Gerrit Pape's excellent, more
licensing-friendly and featureful workalike, runit) is not strictly
necessary, but it's a tool I find quite useful for process control and
supervision, so I've used it in the examples here.


Procedure:


PART 1: The Firewall

Create an IPF (or IPFW, or PF, for that matter) ruleset, allowing traffic
in on the port of the service you'll be balancing.  Optionally, allow the
cluster to NAT through this box, if the servers will need to initiate
outbound connections.  IPF/IPNAT configuration is beyond the scope of
this document, but there's been plenty written on the subject.  The two
important points to keep in mind are to allow free multicast
communication between both firewalls (VRRP requires it; advertisements
are sent to 224.0.0.18 as IP protocol 112), and to make sure other hosts
can't send VRRP traffic to them, since a host that can speak VRRP to the
firewalls could try to claim the virtual IPs.

For more IPF info, please refer to:

http://www.nwo.net/ipf/ipf-howto.html
http://coombs.anu.edu.au/~avalon/ip-filter.html

Short story - build a kernel with the options above, add the following
to /etc/rc.conf:

ipfilter_enable="YES"
ipfilter_rules="/etc/ipf.conf"
ipnat_rules="/etc/ipnat.conf"
ipnat_flags="-CF"
ipmon_enable="YES"

And put your rulesets in the proper places.


PART 2: The balancer

1) Create the necessary users and directories.

mkdir -p /etc/supervise/pen/log
mkdir -p /var/chroot/pen
mkdir -p /var/log/pen
pw useradd pen -s /bin/false -d /var/chroot/pen
pw useradd penlog -s /bin/false -d /var/chroot/pen
chown penlog:pen /var/log/pen

2) Create the runfiles for pen.

cd /etc/supervise/pen
cat << _EOF_ > run
#!/bin/sh
exec 2>&1
exec pen -d -u pen -j /var/chroot/pen -C localhost:8888 -f -r 80 hostname1 hostname2
_EOF_
chmod 755 run

cd log
cat << _EOF_ > run
#!/bin/sh
exec /usr/local/bin/setuidgid penlog /usr/local/bin/multilog s999999 n20 /var/log/pen
_EOF_
chmod 755 run

This will configure pen to run chrooted in /var/chroot/pen, with a
control port of 8888.  It will balance incoming connections on port 80
to port 80 on hostname1 and hostname2.  This is configured for
round-robin balancing - if you require sticky sessions, remove the "-r"
flag.  This example has pen logging somewhat verbosely, to aid in
debugging.  You may wish to remove the "-d" in a production environment.

3) Start up the load-balancing services.

cd /service
ln -s /etc/supervise/pen
echo "csh -cf '/usr/local/bin/svscanboot &'" >> /etc/rc.local
csh -cf '/usr/local/bin/svscanboot &'
sleep 5 && svstat pen

You should now be able to point your browser, DNS resolver, etc. to
either of the IPs of these machines, and see it balancing out to your
real servers.  You can confirm this by tailing /var/log/pen/current on
both machines.


PART 3: Redundancy

1) First, configure syslog to log VRRP info to its own file.

touch /var/log/freevrrpd.log
cat << _EOF_ >> /etc/syslog.conf
!freevrrpd
*.*	/var/log/freevrrpd.log
_EOF_
killall -HUP syslogd

The HUP is needed because syslogd only rereads syslog.conf when it is
signalled or restarted.

2) Configure FreeVRRPd

Until this point, both machines have been equal.  Now, you need to
choose which FWLB is going to be your primary.  On this machine, copy
/usr/local/etc/freevrrpd.conf.sample to /usr/local/etc/freevrrpd.conf.
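
A minimal sketch of that copy step, wrapped in a shell function so that
re-running it never clobbers a configuration you've already edited.  The
function name install_vrrp_conf and the chmod 600 (the file will hold
VRRP passwords) are my own additions for illustration, not anything
FreeVRRPd requires; the directory defaults to /usr/local/etc as used
above.

```shell
#!/bin/sh
# install_vrrp_conf [DIR]
# Hypothetical helper: copy DIR/freevrrpd.conf.sample to DIR/freevrrpd.conf
# unless the target already exists.  DIR defaults to /usr/local/etc.
install_vrrp_conf() {
    dir="${1:-/usr/local/etc}"
    sample="$dir/freevrrpd.conf.sample"
    conf="$dir/freevrrpd.conf"

    # Refuse to continue if the port's sample file is not there.
    if [ ! -f "$sample" ]; then
        echo "missing sample file: $sample" >&2
        return 1
    fi

    # Never overwrite a configuration that has already been created.
    if [ -e "$conf" ]; then
        echo "keeping existing $conf" >&2
        return 0
    fi

    # Assumption: restrict permissions, since the file contains passwords.
    cp "$sample" "$conf" && chmod 600 "$conf"
}
```

Calling "install_vrrp_conf" with no argument on the primary FWLB performs
the copy described above; the optional argument is only there so the
function can be tried out against a scratch directory first.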
Edit the file, and configure it along the following lines:

# public-facing VRID
[VRID]
serverid = 1
interface = fxp0
priority = 255
addr = 198.123.111.1/32
password = vrid1
vridsdep = 2

# backend VRID
[VRID]
serverid = 2
interface = fxp1
priority = 255
addr = 10.0.0.1/32
password = vrid2
vridsdep = 1

This results in two VRIDs being created - one for the front-facing
network, and one for the rear-facing one that the cluster will be using.
In this example, both VRIDs are configured to consider this host the
master server during VRRP elections.  Note that each VRID depends on the
other, as specified by the "vridsdep" field.  This is important - it
means that if one of the interfaces in a FWLB fails, the other one will
automatically go into backup mode, failing both interfaces over to the
slave FWLB.  This avoids the backend servers trying to route through a
machine with a dead front-end connection.

You should now copy this file over to the slave FWLB, and change both
priority fields to 100.

Change the password fields on both machines to something more original,
but certainly don't rely on VRRP passwords as a security measure.  If
another box outside of this cluster is in a position to communicate with
it over VRRP, you've got a problem.

3) Start FreeVRRPd

You can now start up freevrrpd on both boxes:

cp /usr/local/etc/rc.d/freevrrpd.sh.sample /usr/local/etc/rc.d/freevrrpd.sh
/usr/local/etc/rc.d/freevrrpd.sh start


PART 4: Failover testing

Now you just need to verify that this whole setup works in the case of a
failure.  First, configure both FWLB boxes to start an SSH daemon, so we
have something to connect to in order to verify that the interfaces fail
over properly.  Try the following scenarios:

- From one of the machines in your cluster, ssh to 10.0.0.1, and log in.
  Verify the hostname of the machine is the hostname of the master FWLB.
- While watching /var/log/pen/current on FWLB1, connect to
  198.123.111.1, port 80 from a machine on the front-end network.
  Verify that you see the connection occur.
- Pull the front-end interface of FWLB1.
- Watch the logs on FWLB2.  Connect to 198.123.111.1, port 80, and
  verify that the connection occurred.
- SSH again to 10.0.0.1.  You should see the hostname of FWLB2 in the
  SSH banner.
- Reconnect the front-end interface of FWLB1.  Verify that both
  interfaces on FWLB1 recover back to master state.

Now conduct the same tests while unplugging the backend interface.  For
fun, you may want to just hit the reset button on FWLB1 while actively
hitting the web servers.


Notes:

Removing servers from a pool:

Pen cannot permanently remove servers from a pool, but if you need to
have it ignore a server while doing upgrades or such, you can just do:

penctl localhost:8888 server $servername blacklist 99999

This will blacklist the server for 99999 seconds (a little under 28
hours), giving adequate time to seed the server.  When the server is
back in service, just do:

penctl localhost:8888 server $servername blacklist 1

This will reset the blacklist timeout to 1 second, bringing it back into
the pool.

Permanently adding or removing servers to/from a pool:

In the case that servers need to be added or removed, the
/service/pen/run file will need to be edited.  Simply add the hostname
to (or remove it from) the end of the pen command, and do:

svc -t /service/pen

This will TERM and restart pen.  While this may not cause a noticeable
interruption for users, it's probably wise to do this while in
maintenance mode or during off hours.


References:

http://www.faqs.org/rfcs/rfc2338.html
http://www.bsdshell.net/hut_fvrrpd.html
http://siag.nu/pen/
http://cr.yp.to/daemontools.html
http://smarden.org/runit


© 2004 David Thiel --- lx [@ at @] redundancy.redundancy.org
Updated Sun Mar 21 18:23:28 PST 2004