
Friday, December 6, 2013

GlusterFS performance on different frameworks

A couple of months ago, I did a comparison of different distributed filesystems. It turned out that GlusterFS was the easiest and most feature-rich, but it was slow. Since I would really like to use it, I decided to give it another chance. Instead of doing raw benchmarks using sysbench, I decided to use siege to stress test a basic installation of the three PHP frameworks/CMS I use the most.

My test environment:

  • MacBook Pro (Late 2013, Retina, i7 2.66 GHz)
  • PCIe-based Flash Storage
  • 2-4 virtual machines using VMware Fusion 4, each with 2 GB of RAM.
  • Ubuntu 13.10 server edition with PHP 5.5 and OPCache enabled
  • GlusterFS running on all VMs with a volume in replica mode
  • The volume was mounted with nodiratime,noatime using the GlusterFS native driver (NFS was slower)
The test:
  1. siege -c 20 -r 5 http://localhost/foo  # Cache warming
  2. siege -c 20 -r 100 http://localhost/foo  # Actual test
I then compared the local filesystem (inside the VM) vs the Gluster volume using these setups:
  • 2 nodes, 4 cores per node
  • 2 nodes, 2 cores per node
  • 4 nodes, 2 cores per node
The compared value is the total time to serve 20 x 100 requests (20 concurrent clients, 100 repetitions each).
All tests were run 2-3 times while my computer was doing nothing else, and the results were very consistent.

Setup             Test      Symfony Wordpress   Drupal  Average
2 nodes, 4 cores  Local      2.91 s    9.92 s   5.39 s   6.07 s
                  Gluster   10.84 s   23.94 s   7.81 s  14.20 s
2 nodes, 2 cores  Local      5.41 s   19.14 s   9.67 s  11.41 s
                  Gluster   25.05 s   31.91 s  15.17 s  24.04 s
4 nodes, 2 cores  Local      5.57 s    19.6 s   9.79 s  11.65 s
                  Gluster   30.56 s   35.92 s  18.36 s  28.28 s

Local vs Gluster            Symfony Wordpress   Drupal  Average
2 nodes, 4 cores              273 %     141 %     45 %    153 %
2 nodes, 2 cores              363 %      67 %     57 %    162 %
4 nodes, 2 cores              449 %      83 %     88 %    206 %
Average                       361 %      97 %     63 %    174 %

2 nodes vs 4 nodes          Symfony Wordpress   Drupal  Average
Local                           3 %       2 %      1 %      2 %
Gluster                        22 %      13 %     21 %     19 %

4 cores vs 2 cores          Symfony Wordpress   Drupal  Average
Local                          86 %      93 %     79 %     86 %
Gluster                       131 %      33 %     94 %     86 %
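
The percentages are not defined in the post, but they appear to be relative differences between two total times: (slower / faster - 1) x 100. Here is a minimal sketch that checks this reading against the "2 nodes, 4 cores" row. The function name is hypothetical, and the formula is an assumption that happens to reproduce the published numbers.

```javascript
// Relative slowdown in percent: how much longer `slow` takes than `fast`.
// Assumed formula; it reproduces the figures in the comparison table.
function slowdownPercent(fast, slow) {
  return Math.round((slow / fast - 1) * 100);
}

// Total times in seconds for the "2 nodes, 4 cores" setup, from the table.
const local = { symfony: 2.91, wordpress: 9.92, drupal: 5.39 };
const gluster = { symfony: 10.84, wordpress: 23.94, drupal: 7.81 };

const slowdown = {};
for (const app of Object.keys(local)) {
  slowdown[app] = slowdownPercent(local[app], gluster[app]);
}

console.log(slowdown); // { symfony: 273, wordpress: 141, drupal: 45 }
```

Read this way, 273 % means the Gluster run took roughly 3.7 times as long as the local run, not 2.7 times.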


Observations:
  1. Red (Local vs Gluster): Wordpress and Drupal have an acceptable loss in performance under Gluster, but Symfony is catastrophic.
  2. Blue (2 nodes vs 4 nodes, Local): The local tests are slightly slower when using 4 nodes vs 2 nodes. This is expected, since my computer had 4 VMs running.
  3. Green (2 nodes vs 4 nodes, Gluster): The Gluster tests are about 20% slower on a 4 node setup because there is more communication between the nodes to keep them all in sync. 20% overhead for double the nodes isn’t that bad.
  4. Purple (4 cores vs 2 cores, Local): The local tests are about 85% quicker using 4 cores vs 2 cores. A bit under 100% is normal, since there is always some overhead to parallel processing.
  5. Yellow (4 cores vs 2 cores, Gluster): For the Gluster tests, Symfony and Drupal scale very well with the number of cores, but Wordpress is stalling. I am not sure why.

I am still not sure why Symfony is so much slower on GlusterFS, but I can’t use GlusterFS in production for the moment because I/O is already the weak point of my infrastructure. I am in the process of looking for a different hosting solution; maybe it will be better then.

Organizing Javascript and CSS assets for optimal loading

On Reddit, I recently stumbled upon DynoSRC, which lets you serve only a differential Javascript file to your users. The concept is pretty amazing, but I find it a bit overkill. I believe this would only be useful for very high traffic websites with a lot of Javascript where small parts change often. For example: Facebook, Asana, WolframAlpha, Google Maps, etc.

For common websites, however, you will probably be deploying a bunch of files at the same time. If you modified 10% of 40% of your files, the overhead of computing all those diffs (server side and client side) and having this system to manage is probably not worth it. If you are already compressing and grouping your assets and you have a CDN with proper HTTP caching, you are already in pretty good shape. That can be hard though, especially the "proper" part.

Separate assets in 3 groups

When I have complete control over my assets, I usually like to split all the assets in 3 groups:
  1. Very common libraries (Bootstrap, jQuery, Modernizr). I serve them using a public CDN like cdnjs, because it is very likely that the user will already have them in cache.
  2. External assets specific to my project (Custom Bootstrap build, jQuery plugins, lightbox plugin, etc.). I bundle them all in a big libs.js/css.
  3. Global custom assets written for this project, bundled in a single global.js/css. It does not need to be all the custom assets, but stuff you will need on 80% of your requests. Specific code for specific pages can be included individually.

Cache busting

I mostly use Amazon CloudFront as a CDN, which handles query parameters, so I set the expiry date to one year and append a query parameter containing the last git commit hash (git rev-parse --short HEAD). That way, a fresh file is used each time there is any change in the project’s code. See Automatic cache busting using Git commit in Symfony2 for an example.
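
As a rough illustration of the idea (not the Symfony2 implementation referenced above), here is a minimal sketch of a helper that appends the commit hash to an asset URL. The function name, the `v` parameter name and the sample hash are all hypothetical; in practice the hash would come from running git rev-parse --short HEAD at deploy time.

```javascript
// Append a version parameter to an asset URL so that every deploy
// produces a new URL, forcing the CDN and browsers to fetch a fresh copy.
// `version` is assumed to be the short git commit hash.
function bustCache(url, version) {
  const separator = url.includes('?') ? '&' : '?';
  return url + separator + 'v=' + encodeURIComponent(version);
}

// 'a1b2c3d' is a made-up hash used only for the example.
console.log(bustCache('/js/global.js', 'a1b2c3d'));
// /js/global.js?v=a1b2c3d
```

Because the version changes with every commit, this invalidates all assets on each deploy, even the ones that did not change. That is the trade-off for keeping the setup this simple.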

About combining

People often talk about how combining saves HTTP requests, but consider that the files also compress a lot better when they are grouped together. You may be adding some overhead on the first page load, but the rest of the website will be blazing fast.
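
To get a feel for the compression gain, here is a small Node.js sketch that gzips three similar snippets separately and then as one bundle. The snippets and the function names inside them are invented for the example; real numbers will depend on your own files.

```javascript
const zlib = require('zlib');

// Three small, similar "files" (made up for this illustration).
const files = [
  "jQuery('#comments').on('click', '.reply', function () { openReplyForm(this); });\n",
  "jQuery('#comments').on('click', '.edit', function () { openEditForm(this); });\n",
  "jQuery('#comments').on('click', '.delete', function () { confirmDelete(this); });\n",
];

// Size in bytes of the gzipped text.
const gzipSize = (text) => zlib.gzipSync(Buffer.from(text)).length;

// Each file compressed on its own vs. all of them in one bundle.
const separateBytes = files.reduce((sum, text) => sum + gzipSize(text), 0);
const combinedBytes = gzipSize(files.join(''));

console.log({ separateBytes, combinedBytes });
```

The combined bundle should come out smaller because gzip can reuse repeated strings across files and only pays its fixed header cost once.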

However, be careful not to overload the browser. Keep in mind that the Javascript will be executed on each page load. For example, do not try to initialize every modal window just in case one might pop up. See Optimizing page loads by reducing the impact of the Javascript initialization for more details.


Optimizing page loads by reducing the impact of Javascript initialization

So you combined all your Javascript files in the hope that it will speed up page loads? Well, for sure the download will be faster, but the browser still needs to execute all this Javascript! There are simple tricks to help reduce the impact on page loads.

Reduce DOM queries and manipulations

With libraries like jQuery, it is really easy to bind all sorts of events on complicated selectors. The thing is, the browser has to query the DOM like a madman to find all the elements. Things get even worse when you add elements or query information like height or position, which triggers reflows and repaints. Try to be minimal.

Make initialization conditional

If you have a big block of code that needs to be executed only in specific cases, guard it with a cheap check:
  • Add a class to the body element and verify it.
    Ex: jQuery(document.body).hasClass('user-logged-in');
  • Check existence of important sections.
    Ex: document.getElementById('comments');
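
A minimal sketch of the second check follows. The initComments name is hypothetical, and the document is passed in as a parameter only so the function can be tried outside a browser; on a real page you would call initComments(document).

```javascript
// Run the expensive comments setup only when the section exists on the page.
// `doc` is the document object (passed in so the sketch is testable).
function initComments(doc) {
  const comments = doc.getElementById('comments');
  if (!comments) {
    return false; // No comments section on this page: skip the heavy setup.
  }
  // ... expensive setup (bind events, build widgets) would go here ...
  return true;
}
```

The check itself is a single lookup by id, which is about as cheap as a DOM query gets.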

Delay initialization of non-essential parts

  • Delay heavy libraries like Google Maps or Facebook Like.
    See this post about loading social libraries.
  • Use requestAnimationFrame for animations.
  • Use setTimeout(function(){}, 1); to push the execution to the async queue, delaying it until the current script has finished.
  • Use Web Workers to run a function in the background, without hanging the rest of the script. This also takes advantage of multiple threads.
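
Here is a small sketch of the setTimeout trick. The two init functions are placeholders; the point is only the order in which they run.

```javascript
const log = [];

// Placeholder init functions for the example.
function initEssential() { log.push('essential'); }
function initNonEssential() { log.push('non-essential'); }

// The non-essential work is queued first, but it only runs after the
// current script has finished, so the essential work still goes first.
setTimeout(initNonEssential, 1);
initEssential();

// At this point log is ['essential']; 'non-essential' is added on a later tick.
```

The deferred function still runs on the main thread, so this only moves the work out of the critical path. It does not make it cheaper.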

Use delegated event listeners

jQuery offers delegated event listeners where the listener is on an ancestor element. Your favorite library probably has it as well.

A good example is reply buttons in a comments thread: 
jQuery('#comments').on('click', '.reply', function(){});

How it works is that the click bubbles up to the comments element and there it verifies if the originally clicked element matches the selector.

This is extremely beneficial because you have far fewer DOM queries at load and fewer event listeners to attach.
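
To make the mechanism concrete, here is a simplified sketch of delegation written without jQuery. It is not jQuery's actual implementation; the delegate and findDelegateTarget names are hypothetical, and it relies only on the standard matches, parentNode and addEventListener DOM members.

```javascript
// Walk up from the clicked element towards the container and return the
// first element matching `selector`, or null if none is found.
function findDelegateTarget(target, container, selector) {
  let el = target;
  while (el && el !== container) {
    if (el.matches(selector)) return el;
    el = el.parentNode;
  }
  return null;
}

// Attach a single listener on the container that handles events for
// every descendant matching `selector`, present now or added later.
function delegate(container, eventName, selector, handler) {
  container.addEventListener(eventName, function (event) {
    const match = findDelegateTarget(event.target, container, selector);
    if (match) handler.call(match, event);
  });
}
```

In a browser, delegate(document.getElementById('comments'), 'click', '.reply', handler) would behave roughly like the jQuery call above, including for reply buttons added after page load.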

Initialize only on first use

Let’s say you have a complicated modal dialog that needs initialization, and this process may take about 50ms. This is not very noticeable, but if you have other things to do afterwards, you may well go over the 100ms rule, so you wouldn’t want to do it every time a modal is opened. Likewise, you wouldn’t want to initialize 2-3 of those things at page load. This is why you need setupOnce.

Inspired by the once function from Underscore, this utility groups two callbacks: one that is run only the first time it is called, and one that is called every time.
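
The original snippet is not reproduced here, so what follows is only a minimal sketch of what such a utility could look like. The signature setupOnce(setup, run) and the counters in the usage example are assumptions based on the description above.

```javascript
// Returns a function that calls `setup` on its first invocation only,
// and `run` on every invocation (including the first).
function setupOnce(setup, run) {
  let isSetUp = false;
  return function (...args) {
    if (!isSetUp) {
      isSetUp = true;
      setup.apply(this, args);
    }
    return run.apply(this, args);
  };
}

// Usage: the costly modal initialization happens on first open only.
let setups = 0;
let opens = 0;
const openModal = setupOnce(
  function () { setups += 1; }, // expensive one-time initialization
  function () { opens += 1; }   // cheap work done on every open
);

openModal();
openModal();
openModal();
console.log(setups, opens); // 1 3
```

Nothing is initialized at page load; the cost is paid the first time the user actually opens the modal.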



Mobile

Mobile is even more critical because 200-700ms is spent doing the initial HTTP connection. For an in-depth look, see this presentation by @igrigorik from Google.