Wednesday, March 13, 2013

Autoremove unreachable node from balanced DNS

When you are offering a high availability service, you often balance users among nodes using DNS. The problem with DNS is the propagation time so, in case of a node failure, a quick response is very important. This is the reason why I developed a tiny script that checks the status of the balanced nodes for the bind9 DNS server.

The script will need a little modification on your bind9 zone files. This is the syntax:
[..] 3600 IN A 258.421.3.125

balancer-www 120 IN A 258.421.3.182
balancer-www 120 IN A 258.421.3.183

balancer-http 3600 IN CNAME

Here we have the subdomain balancer-www balanced between two hosts with IP's 258.421.3.182, 258.421.3.183. At the top of these A records we have the "code" that we have to add for the script to know how to proceed. The sintax is simple: ;; BALANCED_AUTOCHECK::<service_port>. The BALANCED_AUTOCHECK part is only a matching pattern and the <service_port> is the port of the service to check. In the above example, we are checking a balancer for an http service, so we are using the port 80.

NOTE: For your interest, the rule matches the regular expression: ;;\s*BALANCED_AUTOCHECK::\d+$

Please, have in mind that no protocol check is made (i.e.: HTTP 400 errors, etc) but only a plain socket connection. If the socket connection fails, the IP is marked as down by commenting the A record it and if a recover is detected, the A record is uncommented.

Here is the help output of the script:
Usage: bind9_check_balancer [options] [dns_files]

  -h, --help          show this help message and exit
  -c COMMAND, --command=COMMAND
                        Command to be executed if changes are made.
                        (example:'service bind9 restart')
  -t TIMEOUT, --timeout=TIMEOUT
                        Socket timeout for unreachable hosts in seconds.
                        Default: 5
I think it explains quite well how it works but, just in case, here are some examples:

# Check and If a change is made
# (down/recover detected), exec the command: /etc/init.d/bind9 reload
bind9_check_balancer -c '/etc/init.d/bind9 reload' /etc/named/ /etc/bind/

# Check all files in /etc/named/zones directory and set a timeout
# of 1 second for connection checks. Also exec: service bind9 reload
bind9_check_balancer -c 'service bind9 reload' /etc/named/zones/*

This script is intempted to be executed as a cron job each minute or each 5 minutes (or each time you want). You can get the script form dgtool github repository at:

Tuesday, February 5, 2013

NGINX as sticky balancer for HA using cookies

I recently needed to find a solution to implement several http load balancers that balance among several backend servers so each balancer can proxyfy any of the backend servers. The abstract architecture of the deploy is represented in this image.

Yes, the backends have an NGINX, I'll explain that later...

Of course one of the requirements was that the session had to be kept but there was a problem: the backend server did not synchronized the session among themselves. This situation directly required the balancers to send each user to the same backend server each time the user makes a request. This way,  as the one user is using only one backend, there won't be any problem with the session persistence.

I know that the Upstream Module from NGINX provide the above feature through the ip_hash parameter as, if you have in all the balancer servers the same list of backends in the same order, the IP for a user will always match the same backend server. You can check this taking a look at the module source code at
However, using this module has several cons:
  • No support for IPv6. In fact, all request from an IPv6 will be redirected to the first backend server (as I understood from the source code). NOTE: IPv6 addresses are supported starting from versions 1.3.2 and 1.2.2. The version included in last ubuntu (12.10) is 1.1.19 and it is the version I was testing so it lacks of ipv6 support.
  • Colissions as it only uses the 3 first numbers of the IP for the hash. That means that all the ips of the same C-class network range will go to the same backend server.
  • All users behind a NAT will access to the same backend server.
  • If you add new backends, all the hashes will change and sessions will be lost.
Because of these problems there are some situations where the balacers will not be fair at all, overloading some nodes while others are idle. For this reason I wondered if it could be possible to balance among servers using other solution than ip_hash.

I found the project nginx-sticky-module which uses cookies for balance among backneds, but it is not integrated into nginx so I had to recompile it which I don't really like for production environments as I prefer to use official packages from the GNU/Linux distribution. So I wondered... would be NGINX so versatile that would let me implement something like this module only using configuration parameters? Guess the answer! There exist a way!

So, now that we are in context, let's start with a little test. Our objective in this example is to use nginx to balance between two backend nodes using cookies. The solution we will implement will work the way it is represented in the following diagram.

In the diagram you can see that the user USER0 when accessing to our application service is directed to balancer1 probably through a DNS round robin resolution. In step 1 the user accesses for the first time to the service so the balancer1 will choose one backend by round robin and, in this example, backend1 was chosen (for the luck of our draftsman). This backend will set the cookie backend=backend1 so the client have "registered" in the backend.

At step 2 the user will access the platform for the second time. This time, the DNS round robin will send our user to the balancer balancer3 and, thanks to the cookie named backend, the balancer will be able to know which backned should attend the request: backend1.

Now that we know how it works, how do we make it work with NGINX ?

Well, lets talk first about how the cookie is set in the backend server. If you can change some code on the web application you can force the application to set the cookie for you. If not, you can proxify the application with NGINX in the same server and you could even use that NGINX to implement the cache of the platform. I will choose to proxyfy the backend application with NGINX as it is a generic solution to our problem.

So a very simple configuration of the NGINX in the backend server backend1 would look like this:

server {
   listen 80;

   location / {
      add_header Set-Cookie backend=backend1;
      proxy_pass http://localhost:8000/;
      proxy_set_header X-Real-IP $remote_addr;

Each backend server will have to have an specific configuration with its backend identifier. I recommend you for testing to set up two backends like this one with ids backend1 and backend2. Of course in production environments we could use md5 hashes or uuids to avoid predictable identifiers.

We can now configure a balancer to balance between these two backends. The balancer configuration would look like this:

upstream rrbackend  {

map $cookie_backend $sticky_backend {
   default bad_gateway;

server {
    listen 80;

    location / {
       error_page 502 @rrfallback;
       proxy_pass http://$sticky_backend$uri;
   location @rrfallback {
      proxy_pass  http://rrbackend;

The good part is that the configuration will be the same for all balancers if you have more than one. But lets explain how this configuration works.

In first place we declare the backends for the round robin access.

Then we map the cookie backend variable to the $sticky_backend NGINX variable. If $cookie_backend is backend1, then $sticky_backend value will be and so on. If no match is found, the default value bad_gateway will be used. This part will be used to choos the sticky backend.

The next thing is to declare the server part, with its port (server_name and whatever) and its locations. Note that we have a @rrfallback location that resolves to a backend using round robin. When a user access the location /, two things can happen:
  1. The user accesses for the first time so no backend cookie is provided and $sticky_backend NGINX variable will have the value bad_gateway. In this case, the proxy_pass will try to access a server called bad_gateway which will return an error 502 (Bad Gateway) and, as declared in the line above the proxy_pass, this error will fallback to @rrfallback so a backend will be choosed by round robin.
  2. The users passes the backend cookie with a valid value so the $sticky_backend NGINX variable will have a valid IP from the map and redirected to the backend indicated in the cookie through the proxy_pass.
  3. The same than 2. but the backend is down. In this case, the proxy_pass will return a 502 error and the process will start from 1: new backend by round-robin.

And this is all! You can test it both with a browser to check that you'll stay on the same server or make requests with curl/wget without sending the cookie to check the initial round-robin.

As always, comments are more than welcome!