Thursday, November 9, 2017

PyWench is now dioivo

I decided to move the pywench tool to its own github repository. The tool has improved a lot over time: I used it many times and each time I ended up fixing something, adding features or improving existing ones. For this reason I want to manage it properly in its own repository.

Now it is also installable via pip.
pip install dioivo
As I said, many things changed, so some options may not be present and some others were added. In any case, it is under active development, so more features and improvements will come.
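Just to give an idea of how it is used, here is a rough example invocation. The flag names below are assumed to match the old pywench interface, so check dioivo --help for the current options:

dioivo -s https://www.example.com -u access.log -c 50 -n 500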

Monday, October 17, 2016

Ansible Inventory

Recently I've started using ansible after a couple of years using SaltStack. In general, ansible feels very easy to use and very versatile. However, one of the things I miss is an easier way to handle the hosts inventory: managing groups, managing variables, etc.

For that reason I started a small program that helps me do all the inventory management using a console interface. It also integrates directly with ansible as a dynamic inventory. Here are some of the features:

  • Add, edit and delete hosts and groups
  • Add, edit and delete variables for hosts and groups
  • List hosts and groups
  • Show the hierarchy tree of groups
  • Unique color per group, host and variable for visual identification
  • Use of regular expressions for bulk targeting
  • Import of an already existing inventory in the ansible JSON format
  • Direct use as a dynamic inventory with ansible (--list and --host); see the example output after this list
  • Different backends with concurrency: file (for local use) and redis (for network use)
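If you have not used dynamic inventories before: ansible runs the inventory script with --list and expects a JSON document describing groups, their hosts, variables and children, plus an optional _meta section with per-host variables. The output looks roughly like this (the hosts and groups below are made up, just to illustrate the shape of the format):

{
    "web": {
        "hosts": ["web1.example.com", "web2.example.com"],
        "vars": {"http_port": 80},
        "children": ["web-eu"]
    },
    "web-eu": {
        "hosts": ["web2.example.com"]
    },
    "_meta": {
        "hostvars": {
            "web1.example.com": {"ansible_host": "10.0.0.11"}
        }
    }
}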
Let me show you how it looks.



You can get more information and the tool itself on github: https://github.com/diego-treitos/ansible-inventory.

As always, all comments and suggestions are welcome, so please let me know what you think.

Thursday, September 29, 2016

Image compression tool for websites

Why compress images?

One of the most important tricks to make a web page load faster is to make its components smaller. Nowadays images are one of the most common assets in web pages and also the ones that tend to account for a big part of the page size. This means that if you reduce the size of the images, there will be a big impact on the total size of the web page and therefore it will load noticeably faster.

I've been compressing images on websites for a while, running benchmarks before and after the compression. On some websites I could double the performance with image compression alone.

How to compress images?

There are several very good tools to compress images, but each of them usually handles a single image type (jpeg, png, etc.) and operates on a single file. This makes it hard to apply compression to a whole website, where you would need to find all the image files and compress each of them with the tool that matches its type.

For this reason I created a script that finds all images, makes a backup of them and compresses them with the right tool depending on their type. It also has some additional features that come in very handy when you are dealing with image compression every day.

The image compression script

The compression script mainly uses 4 tools, and all the compression merit goes to them: jpegoptim (by Timo Kokkonen), pngquant (by Kornel Lesiński), pngcrush (by Andrew Chilton) and gifsicle (by Eddie Kohler). You will need to install these tools for the script to work, and the script itself will warn you if they are not installed. On a debian/ubuntu system you can install them with this command:
sudo apt-get install jpegoptim gifsicle pngquant pngcrush
Once you have these compression tools you can start using the compression script. Let's see the options we have.
 Use: ./dIb_compress_images <options> <path>

   Options:
      -r              Revert. Replace all processed images with their
                      backups.
      -j              Compress JPEG files.
      -p              Compress PNG files.
      -g              Compress GIF files.
      -L              Use lossless compression.
      -q              Quiet mode.
      -c              Continue an already started compression. This will
                      only convert files that have not been backed up yet.
      -t <percentage> Only use the converted file if the compression
                      saves more than <percentage> of its size. Files
                      below that threshold will appear as 0% in the
                      output.

 NOTE: If none of -j, -p or -g are specified all of them (-j -p -g)
       will be used.


So you can choose which types of images you want to compress (by default all of them) in a given path. The script will find those image types recursively inside that path and compress them.

You can compress them using lossy compression (default and recommended) or lossless compression (with the -L flag). Lossy compression usually makes no visual difference to the naked eye between the original and the compressed version of an image, while producing a much smaller file, so it is the recommended mode.

For each image, the script will make a backup, and this backup will never be overwritten, so the backup will always be the original image. In case you want to revert the changes, because you found some visual differences or for any other reason, you can use the -r flag over the same path, which will cause all the image backups to be restored.

By default, each time you execute the script on a given path, it will recompress all the images, whether they were already compressed or not. If you want to skip the already compressed images, you can use the -c flag. If you use it together with the quiet mode flag (-q), you can add the script to a cron task to periodically compress the images of your site.

With the -t parameter you can also specify a percentage threshold, so the compressed version of an image is only kept if it saves at least that percentage of its size.
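As a concrete illustration (the paths below are made up, adjust them to your document root), a first full run and a nightly cron entry to pick up newly uploaded images could look like this:

# One-off run: compress all JPEG, PNG and GIF images under the site root,
# keeping a compressed file only if it saves at least 10% of its size
./dIb_compress_images -t 10 /var/www/example.com/htdocs

# Cron entry: every night at 03:00, quietly compress only the images
# that have not been processed (and backed up) yet
0 3 * * * /usr/local/bin/dIb_compress_images -c -q /var/www/example.com/htdocs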

Here is an example output of the most basic execution. It is compressing images from a real website, although I changed the image names in the script output for security reasons.





In this case, the images used 35MB but after compression they only use 12MB, that is, about a third of their original size!

Again, these compression rates are due to the fantastic tools the script uses. What I did was gather information from many tests to find the options that give the best balance between size and quality (and also compression time).

As always, let me know your thoughts; suggestions to improve it are more than welcome!

The script is available here:  https://github.com/diego-treitos/dgtool/blob/master/compress_images/dIb_compress_images




Friday, July 1, 2016

PyWench: Huge improvements

I've recently been paying some attention to this tool. I've discovered some bugs and quite a lot of room for improvement.
Basically, I found that the requests-per-second metric was not very precise and that the performance of the application itself was not very good.
Regarding the requests-per-second issue, I changed the way they are calculated and now the results are quite accurate (compared to what I see in the log files of the benchmarked server).
I also found that the performance was quite bad, mainly because of the python threading system and the python GIL (threads cannot really run python code in parallel). So I decided to migrate the tool to python multiprocessing, and the performance improvement has been huge: almost 3 times faster.
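To make the idea concrete, here is a minimal sketch of the approach (this is not the actual pywench code, just an illustration of how a process pool sidesteps the GIL when issuing and timing requests):

# Minimal sketch, not the actual pywench code: each worker is a separate
# process with its own interpreter and GIL, so request handling and timing
# are no longer serialized by a single interpreter lock.
import multiprocessing
import time
import urllib3

def fetch(url):
    http = urllib3.PoolManager()
    start = time.time()
    response = http.request('GET', url)
    return response.status, url, time.time() - start

if __name__ == '__main__':
    urls = ['https://www.example.com/'] * 100
    with multiprocessing.Pool(processes=8) as pool:
        for status, url, elapsed in pool.imap_unordered(fetch, urls):
            print('%d %s %.3fs' % (status, url, elapsed))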
As I was already changing things, I also switched the plotting library from gnuplot to matplotlib, as gnuplot wasn't very stable. This implies that the plot cannot be viewed live anymore; the "-l" option now lets you interact with the plot once the benchmark is done. Also, the tool should now work properly on systems without an X server.

Please be aware that you need a big server (several cores, plenty of bandwidth, etc.) in order to benchmark a website properly, instead of ending up benchmarking the tool itself or your own connection. I will keep working on this tool in the future, and one of the things I would like to do is make it distributed, so it can run a benchmark from different servers and then gather all the results in a central node. This would help eliminate bottlenecks on the client side of the benchmark.

Please, if you use the tool, report any bug you find and, of course, let me know if you see ways to improve it. All comments are welcome :).

You can already find this new version at: https://github.com/diego-XA/dgtool/tree/master/pywench


Tuesday, March 11, 2014

PyWench: log driven web benchmarking tool

Well, it's been a while since the last post, but here I am again with a new tool I wrote. I sometimes have to look at how web sites are performing, so I need a tool to check whether the configuration changes I make are taking effect and to measure the performance differences. There are tools like apache benchmark that can be used to benchmark web performance, but they only accept a single URL to test, which is far from the normal usage pattern of a site.

So I wrote a tool that takes an access log file from apache or nginx and extracts the URL paths to test, so we can better measure how a "normal" usage of the site behaves with this or that configuration. It also reports some stats and plots a graph with the timings. To get an idea of what this tool can do, let's see its options.


root@dgtool:~# pywench --help
Usage: pywench [options] -s SERVER -u URLS_FILE -c CONCURRENCY -n NUMBER_OF_REQUESTS

Options:
  -h, --help            show this help message and exit
  -s SERVER, --server=SERVER
                        Server to benchmark. It must include the protocol and
                        lack of trailing slash. For example:
                        https://example.com
  -u URLS_FILE, --urls=URLS_FILE
                        File with url's to test. This file must be directly an
                        access.log file from nginx or apache.
  -c CONCURRENCY, --concurrency=CONCURRENCY
                        Number of concurrent requests
  -n TOTAL_REQUESTS, --number-of-requests=TOTAL_REQUESTS
                        Number of requests to send to the host.
  -m MODE, --mode=MODE  Mode can be 'random' or 'sequence'. It defines how the
                        urls will be chosen from the url's file.
  -R REPLACE_PARAMETER, --replace-parameter=REPLACE_PARAMETER
                        Replace parameter on the URLs that have such
                        parameter: p.e.: 'user=hackme' will set the parameter
                        'user' to 'hackme' on all url that have the 'user'
                        parameter. Can be called several times to make
                        multiple replacements.
  -A AUTH_RULE, --auth=AUTH_RULE
                        Adds rule for form authentication with cookies.
                        Syntax:
                        'METHOD::URL[::param1=value1[::param2=value2]...]'.
                        For example: POST::http://example.com/login.py::user=r
                        oot::pass=hackme
  -H HTTP_VERSION, --http-version=HTTP_VERSION
                        Defines which protocol version to use. Use '11' for
                        HTTP 1.1 and '10' for HTTP 1.0
  -l, --live            If you enable this flag, you'll be able to see the
                        realtime plot of the benchmark
Most of the parameters, like SERVER, CONCURRENCY and TOTAL_REQUESTS, are probably well known or self-explanatory, but there are other parameters that are not so common, so let me explain them:
  • URLS_FILE: This is the access log file from the nginx or apache server. So if you want to test your server, you only have to download the access log and use it as the input file. Please note that it takes the 7th column of the log file as the URL path (see the short sketch after this list).
  • MODE: The URLs can be extracted from URLS_FILE randomly or in the order they appear. You choose which with the MODE parameter.
  • REPLACE_PARAMETER: Sometimes you need to pass a specific parameter so the server answers as if you were a normal user (maybe a session string, a user name, etc.). If you use this option, every URL from URLS_FILE will be checked for the parameter you want to replace and, if the parameter exists, its value will be replaced with the value you specify.
  • AUTH_RULE: Sometimes you need to authenticate to a server before starting the benchmark to make sure you are treated as a normal user. With this option you choose how to authenticate and pass the authentication options; before the benchmark starts, PyWench will log in to the website and save the cookie, and this auth cookie will be used for all the requests.
  • HTTP_VERSION: You can use HTTP 1.0 or HTTP 1.1 for the benchmarks.
  • --live: This will show you a live plot of how well the server is doing while serving the PyWench requests during the benchmark.
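About the 7th column mentioned above: in the common/combined log formats used by apache and nginx, the 7th whitespace-separated field of each line is the request path. A minimal sketch of how the path can be extracted (this is not the actual pywench code, and the log line is made up):

# Illustrative only: extract the request path (7th field) from an access log line
line = '203.0.113.7 - - [11/Mar/2014:10:02:31 +0100] "GET /gallery/index.html HTTP/1.1" 200 5124'
path = line.split()[6]  # 7th column, 0-indexed here
print(path)             # -> /gallery/index.html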

Well, let's see how it works with an example! In this example we will test our favourite server, example.com, in a basic usage scenario.
To start the benchmark we need two things: the server URL (protocol and domain name) and the access.log file. In this example we will use https://www.example.com as the server URL (to point out that it also works with HTTPS). We will start with 500 requests and a concurrency of 50.


root@dgtool:~# pywench -s "https://www.example.com" -u access_log -c 50 -n 500

 Requests: 500        Concurrency: 50

[==================================================] 

Stopping...  DONE

Compiling stats... DONE

Stats (seconds)

            min            avg            max
ttfb        0.15178        7.46277        103.63504
ttlb        0.15299        7.58585        103.63520


Requests per second: 6.591


URL min ttfb: (0.15178) /gallery/now/thumbs/thumbs_img_0401-1024x690.jpg
URL max ttfb: (89.01898) /gallery/then/02_03_2011_0007-2.jpg
URL min ttlb: (0.15336) /gallery/now/thumbs/thumbs_img_0401-1024x690.jpg
URL max ttlb: (90.55633) /gallery/then/02_03_2011_0007-2.jpg
NOTE: These stats are based on the average time (ttfb or ttlb) for each url.


Protocol stats:
    HTTP 200:    499 requests (99.80%)
    HTTP 404:      1 requests ( 0.20%)

Press ENTER to save data and exit.
Well, as you can see, we already have some stats in the output: statistics regarding the timings for ttfb (time to first byte) and ttlb (time to last byte), the fastest and slowest URLs, and some status code stats.
Three files were also created in the working directory:
  • www.example.com_r500_c50_http11.log: Log file with all the gathered data. It is like a CSV with the start time, url, ttfb and ttlb of each request.
  • www.example.com_r500_c50_http11.stat: It contains a python dictionary with the stats reported in the command line.
  • www.example.com_r500_c50_http11.png: A plot of the benchmark. This plot is also what you see if you use the --live option. See the following image.

 REQUIREMENTS

  • python-urllib3: I had to use urllib3 due to threading problems in urllib2 (thread safety and python...)
  • python-gnuplot
  • gnuplot-x11: It is important to have the x11 package installed so live plotting works.
So, if you are on a Debian/Ubuntu system: apt-get install python-urllib3 python-gnuplot gnuplot-x11

UPDATE:  Check the last post with the latest modifications: http://dgtool.treitos.com/2016/07/pywench-huge-improvements.html

You can find this tool at the following URL: https://github.com/diego-XA/dgtool/tree/master/pywench
As always, comments are welcome, so if you have any questions, find a bug or have a suggestion, please let me know.

Wednesday, March 13, 2013

Autoremove unreachable node from balanced DNS

When you are offering a high availability service, you often balance users among nodes using DNS. The problem with DNS is the propagation time, so in case of a node failure a quick response is very important. This is the reason why I developed a tiny script for the bind9 DNS server that checks the status of the balanced nodes.

The script needs a small modification in your bind9 zone files. This is the syntax:
[..]

ftp.example.com 3600 IN A 258.421.3.125

;; BALANCED_AUTOCHECK::80
balancer-www 120 IN A 258.421.3.182
balancer-www 120 IN A 258.421.3.183

balancer-http 3600 IN CNAME balancer-www.example.com.

[..]
Here we have the subdomain balancer-www balanced between two hosts with IPs 258.421.3.182 and 258.421.3.183. Above these A records we add the "code" that tells the script how to proceed. The syntax is simple: ;; BALANCED_AUTOCHECK::<service_port>. The BALANCED_AUTOCHECK part is just a matching pattern and <service_port> is the port of the service to check. In the example above we are checking a balancer for an HTTP service, so we are using port 80.

NOTE: For your interest, the rule matches the regular expression: ;;\s*BALANCED_AUTOCHECK::\d+$

Please bear in mind that no protocol check is made (i.e. HTTP 400 errors, etc.), only a plain socket connection. If the socket connection fails, the IP is marked as down by commenting out its A record, and if a recovery is detected, the A record is uncommented.

Here is the help output of the script:
Usage: bind9_check_balancer [options] [dns_files]

Options:
  -h, --help          show this help message and exit
  -c COMMAND, --command=COMMAND
                        Command to be executed if changes are made.
                        (example:'service bind9 restart')
  -t TIMEOUT, --timeout=TIMEOUT
                        Socket timeout for unreachable hosts in seconds.
                        Default: 5
I think it explains quite well how it works but, just in case, here are some examples:

# Check test.net.hosts and test.com.hosts. If a change is made
# (down/recover detected), exec the command: /etc/init.d/bind9 reload
bind9_check_balancer -c '/etc/init.d/bind9 reload' /etc/named/test.net.hosts /etc/bind/test.com.hosts

# Check all files in /etc/named/zones directory and set a timeout
# of 1 second for connection checks. Also exec: service bind9 reload
bind9_check_balancer -c 'service bind9 reload' /etc/named/zones/*



This script is intended to be executed as a cron job every minute or every 5 minutes (or however often you want). You can get the script from the dgtool github repository at: https://github.com/diego-XA/dgtool/blob/master/ha/bind9_check_balancer
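For reference, a cron entry for this could look roughly like the following (the install path and zone directory are just assumptions, adjust them to your setup):

# Hypothetical cron entry: check the balanced A records every 5 minutes
# and reload bind9 if any record was commented out or restored
*/5 * * * * /usr/local/sbin/bind9_check_balancer -c 'service bind9 reload' /etc/bind/zones/*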

Tuesday, February 5, 2013

NGINX as sticky balancer for HA using cookies

I recently needed to find a solution to implement several HTTP load balancers that balance among several backend servers, so that each balancer can proxy any of the backend servers. The abstract architecture of the deployment is represented in this image.


Yes, the backends have an NGINX, I'll explain that later...

Of course, one of the requirements was that the session had to be kept, but there was a problem: the backend servers did not synchronize sessions among themselves. This required the balancers to send each user to the same backend server every time the user makes a request. This way, as each user only uses one backend, there is no problem with session persistence.

I know that the upstream module from NGINX provides this feature through the ip_hash parameter: if all the balancer servers have the same list of backends in the same order, a user's IP will always map to the same backend server. You can check this by taking a look at the module source code at
nginx-X.X.XX/src/http/modules/ngx_http_upstream_ip_hash_module.c.
However, using this module has several cons:
  • No support for IPv6. In fact, all requests from IPv6 addresses will be directed to the first backend server (as far as I understood from the source code). NOTE: IPv6 addresses are supported starting from versions 1.3.2 and 1.2.2. The version included in the latest Ubuntu (12.10) is 1.1.19, which is the version I was testing, so it lacks IPv6 support.
  • Collisions, as it only uses the first 3 octets of the IP for the hash. That means that all the IPs in the same class C network range will go to the same backend server.
  • All users behind a NAT will go to the same backend server.
  • If you add new backends, all the hashes will change and sessions will be lost.
Because of these problems, there are situations where the balancers will not be fair at all, overloading some nodes while others are idle. For this reason I wondered whether it would be possible to balance among servers using a solution other than ip_hash.

I found the project nginx-sticky-module, which uses cookies to balance among backends, but it is not integrated into nginx, so I would have to recompile it, which I don't really like for production environments as I prefer to use official packages from the GNU/Linux distribution. So I wondered... is NGINX versatile enough to let me implement something like this module using only configuration parameters? Guess the answer! There is a way!

So, now that we are in context, let's start with a little test. Our objective in this example is to use nginx to balance between two backend nodes using cookies. The solution we will implement will work the way it is represented in the following diagram.






In the diagram you can see that when the user USER0 accesses our application service, they are directed to balancer1, probably through DNS round robin resolution. In step 1 the user accesses the service for the first time, so balancer1 chooses one backend by round robin and, in this example, backend1 is chosen (luckily for our draftsman). This backend sets the cookie backend=backend1, so the client is now "registered" with that backend.

In step 2 the user accesses the platform for the second time. This time, the DNS round robin sends our user to balancer3 and, thanks to the cookie named backend, the balancer knows which backend should handle the request: backend1.

Now that we know how it works, how do we make it work with NGINX ?


Well, let's talk first about how the cookie is set on the backend server. If you can change some code in the web application, you can make the application set the cookie for you. If not, you can proxy the application with NGINX on the same server, and you could even use that NGINX to implement caching for the platform. I will choose to proxy the backend application with NGINX, as it is a generic solution to our problem.

So a very simple configuration of the NGINX in the backend server backend1 would look like this:


server {
   listen 80;

   location / {
      add_header Set-Cookie backend=backend1;
      proxy_pass http://localhost:8000/;
      proxy_set_header X-Real-IP $remote_addr;
   }
}

Each backend server will have to have its own configuration with its backend identifier. For testing, I recommend setting up two backends like this one, with the ids backend1 and backend2. Of course, in production environments we could use md5 hashes or uuids to avoid predictable identifiers.

We can now configure a balancer to balance between these two backends. The balancer configuration would look like this:

upstream rrbackend  {
   server 192.168.122.201;
   server 192.168.122.202;
}

map $cookie_backend $sticky_backend {
   default bad_gateway;
   backend1 192.168.122.201;
   backend2 192.168.122.202;
}

server {
    listen 80;


    location / {
       error_page 502 @rrfallback;
       proxy_pass http://$sticky_backend$uri;
    }
  
   location @rrfallback {
      proxy_pass  http://rrbackend;
   }
}

The good part is that the configuration will be the same for all balancers if you have more than one. But let's explain how this configuration works.

First, we declare the upstream backends used for round robin access.

Then we map the value of the backend cookie ($cookie_backend) to the $sticky_backend NGINX variable. If $cookie_backend is backend1, then $sticky_backend will be 192.168.122.201, and so on. If no match is found, the default value bad_gateway will be used. This part is what chooses the sticky backend.

The next thing is to declare the server block, with its port (server_name and so on) and its locations. Note that we have a @rrfallback location that resolves to a backend using round robin. When a user accesses the location /, one of three things can happen:
  1. The user accesses for the first time, so no backend cookie is provided and the $sticky_backend NGINX variable gets the value bad_gateway. In this case, proxy_pass will try to reach a server called bad_gateway, which will produce a 502 error (Bad Gateway) and, as declared in the line above the proxy_pass, this error will fall back to @rrfallback, so a backend will be chosen by round robin.
  2. The user sends the backend cookie with a valid value, so the $sticky_backend NGINX variable gets a valid IP from the map and the request is proxied to the backend indicated in the cookie through the proxy_pass.
  3. The same as 2, but the backend is down. In this case, the proxy_pass will return a 502 error and the process starts again from 1: a new backend is chosen by round robin.

And this is all! You can test it with a browser, to check that you stay on the same backend, or make requests with curl/wget without sending the cookie, to check the initial round robin.
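For example, something along these lines can be used from the command line (the balancer address 192.168.122.100 below is made up; point it at one of your balancers):

# First request without a cookie: round robin picks a backend,
# which answers with a Set-Cookie: backend=... header
curl -s -D - -o /dev/null http://192.168.122.100/ | grep -i 'Set-Cookie'

# Requests that send the cookie are always proxied to that same backend
curl -s -D - -o /dev/null --cookie 'backend=backend1' http://192.168.122.100/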

As always, comments are more than welcome!