Friday, May 7, 2010

gwping2 - A Python Script for Dual WAN Redundancy and Load Balancing in Linux

Some time ago I was faced with the task of making a Linux box use two different WAN connections to two different Internet Service Providers (ISPs), in this case Telus and Shaw, in a particular way. The challenge was to do a sort of load balancing while both links were up and running, by carrying services such as HTTP, HTTPS and SMTP on one link and the VoIP system on the second WAN link. Then, any time one of these links went down, its services had to be moved automatically to the link still running, to keep the company services online. In this case there was no server behind the Linux box acting as a firewall, only clients connecting to the Internet and their phones, which connected through a VPN tunnel to the VoIP server located at the company headquarters.


This was the network layout I was working with for this task. I believe this will help you understand every step from now on.

So I started by searching the Internet for someone who had already done something similar, and after some days of investigation I found the right idea, which allowed me to develop the final Python script. Click here to see the post with the original idea.

The main idea behind the post was very simple. It used the Linux ping utility to monitor the availability of each gateway, since ping on Linux lets you choose the source IP address of the packets it sends. Combining this feature with the Linux ip rule policy routing tools made it possible to develop something as simple, and at the same time as useful, as the script named gwping.
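To make the idea concrete, here is a minimal Python sketch of such a check. It is not the code from gwping or gwping2; the function names and the one-probe, two-second timeout are assumptions for illustration. It relies on the `-I`, `-c` and `-W` options of the Linux (iputils) ping, and it only tests the intended link if policy routing sends packets from that source address out through the matching gateway.

```python
import subprocess


def build_ping_command(source_ip, test_ip, count=1, timeout_s=2):
    """Return the argv for a ping sent from a specific source address."""
    # -I: source address (or interface) to send from
    # -c: number of probes to send
    # -W: seconds to wait for a reply
    return ["ping", "-I", source_ip, "-c", str(count),
            "-W", str(timeout_s), test_ip]


def link_is_up(source_ip, test_ip):
    """Return True if test_ip answers a ping sent from source_ip."""
    result = subprocess.run(
        build_ping_command(source_ip, test_ip),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # ping exits with 0 only when at least one reply was received.
    return result.returncode == 0
```

Calling `link_is_up()` once per WAN interface address, against the same test host, gives one up/down flag per link.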

For me this script was not enough, given the specifics of the project, but I felt the idea behind it was correct, so I started working on my own version, which I decided to name gwping2. The gwping script was written in Bash, but for my version I decided to go with Python. It is one of my favorite languages, in my opinion the most powerful programming language today, and the one I know best. Python is also installed by default on most Linux distributions, just as Bash is.

So I started by recreating the network infrastructure in the lab, and the fun began. The requirements were the following:
  • Dual WAN Connection to the Internet through two different providers on the Linux Box.
  • While both links were up, the Internet services would run on the TELUS (ISP1) link and the VoIP services on the SHAW (ISP2) link.
  • If the TELUS link went down, all the services would be automatically moved to the SHAW link to keep the company services online.
  • If the connection to TELUS was reestablished, all the Internet services would be automatically moved back to the TELUS link, keeping the VoIP services running over the SHAW link.
  • If the SHAW link went down, the VoIP services would be automatically moved to the TELUS link to keep the company services online.
  • If the connection to SHAW was reestablished, the Internet services would be moved to the SHAW link and the VoIP services would stay on the TELUS link. Then, after 8 p.m., everything would be reset to its normal operating state. The reason for treating a SHAW recovery differently is that every time the VoIP service is moved, all the phones have to reconnect to the VoIP server, which causes 1-3 minutes of phone downtime.
With all the requirements set, I started working on the script. First I decided it would also need a config file for every setting I (or my boss) could possibly want to change. This config file holds all the settings the script needs to work properly, and it should be easy for anyone to understand, so I divided it into sections. Click here to download the gwping.conf file.
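
The rules above can be read as a small decision function. The sketch below is one possible reading of them, not the logic taken from gwping2.py: the names `assign_links`, `TELUS` and `SHAW` are hypothetical, and the behavior when both links are down (keep the current assignment) is an assumption, since the requirements do not cover that case.

```python
TELUS, SHAW = "telus", "shaw"


def assign_links(telus_up, shaw_up, current, after_reset_hour):
    """Return (internet_link, voip_link) for the given link states.

    current is the (internet_link, voip_link) pair in use right now;
    after_reset_hour is True once the 8 p.m. reset time has passed.
    """
    _, voip_link = current
    if telus_up and shaw_up:
        # VoIP was moved to TELUS by an earlier SHAW outage: leave the
        # phones alone until the evening reset and swap the Internet
        # services over to SHAW instead.
        if voip_link == TELUS and not after_reset_hour:
            return SHAW, TELUS
        return TELUS, SHAW  # normal operation
    if telus_up:
        return TELUS, TELUS  # SHAW is down
    if shaw_up:
        return SHAW, SHAW  # TELUS is down
    return current  # assumption: nothing to fail over to


# SHAW comes back during the day: VoIP stays on TELUS.
print(assign_links(True, True, (TELUS, TELUS), after_reset_hour=False))
```

Keeping the decision separate from the commands that change routes and firewall rules makes it easy to check each requirement in isolation.
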

As you can see, there are a few sections in this file that, at first sight, don't relate directly to the nature of the script: for example, the iptables, dhcp and openvpn sections. The iptables section was included because the Linux box acting as a firewall runs iptables to protect the internal network, and moving services from one link to the other requires several changes to the iptables rule set so that the services run smoothly on whichever link they are moved to. You will see this later when we get to the gwping2 script. A similar reason led to the addition of a dhcp section. The connections to the ISPs were residential-class connections, chosen to save some money, and IP addresses for residential services are assigned through the ISP's DHCP service. This means that every time one of the links goes down, the interface will probably come back with a different IP address. The openvpn section was included because the VoIP system runs over a VPN connection to the headquarters, and every time one of the links goes down the OpenVPN connection has to be reestablished over the available link so that the phones can reconnect to the VoIP server.

Now it is time for the main dish: the gwping2.py Python script that ties all this together. Keep in mind that indentation is critical in Python if you decide to copy and paste this script. I suggest you use Notepad++ or any other advanced text editor to make modifications to the gwping2 script. Click here to download the gwping2.py script.

You will want to place the gwping.conf file under /etc on your Linux box. This is the only file you need to change to adapt the setup explained here. As you can see, you only need a few general details to make this work, for example:

testip=192.5.5.241 --> This is the IP address used to test the links. It needs to be a host that is always responsive; in this case we are using the IP address of a DNS root server.

teleworker=X.X.X.X --> This is the IP address of the VoIP server to which the clients' phones connect.

You will need to set up the variables listed under the [network] section to match your particular environment. The same applies to the [email] and [iptables] sections. Everything else should be OK.
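As an illustration of how a sectioned file like this can be read from Python, here is a minimal sketch using the standard `configparser` module (Python 3). The sample text is not the real gwping.conf: the `[general]` section name, the keys under `[network]` and all addresses other than `testip` are assumptions made up for the example.

```python
import configparser

# Hypothetical excerpt; only testip and teleworker are names from the post.
SAMPLE_CONF = """
[general]
testip = 192.5.5.241
# placeholder address for the VoIP server
teleworker = 192.0.2.10

[network]
# hypothetical key names for the two gateways
isp1_gateway = 198.51.100.1
isp2_gateway = 203.0.113.1
"""


def load_settings(text):
    """Parse INI-style text and return the ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


settings = load_settings(SAMPLE_CONF)
print(settings["general"]["testip"])
print(settings.sections())
```

For a file on disk, `parser.read("/etc/gwping.conf")` would take the place of `read_string()`.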

After adapting gwping.conf to your particular environment, place gwping2.py under /usr/sbin/ so that it is in the system's path and can be run at any time.

This is pretty much it with regard to installing the software. From now on it will take care of the routing on your Linux box, as well as the changes to the iptables rules any time a link goes down or comes back up, according to the requirements above. It will also restart the OpenVPN tunnels every time it notices a state change on either link. This little script will also alert you any time a link state change takes place, by sending an email to the specified address through the specified SMTP relay server.
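For readers curious about the alerting part, a notification of this kind can be sketched with Python's standard `smtplib` and `email` modules. This is not the code from gwping2.py; the function names, the subject format and the default port 25 are assumptions for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_alert(link_name, is_up, sender, recipient):
    """Build the notification message for a link state change."""
    state = "UP" if is_up else "DOWN"
    msg = EmailMessage()
    msg["Subject"] = f"gwping2: {link_name} link is {state}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"The {link_name} link changed state and is now {state}.")
    return msg


def send_alert(msg, relay_host, relay_port=25):
    """Hand the message to an SMTP relay (no authentication assumed)."""
    with smtplib.SMTP(relay_host, relay_port, timeout=10) as smtp:
        smtp.send_message(msg)
```

Building the message separately from sending it lets you inspect the alert text without needing a reachable relay.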

I hope some of you find this helpful and save some time by not having to program all this on your own. Please let me know if you run into problems when trying to use my script, by posting your thoughts below. Feel free to modify anything you need in any of the files provided.
