Pages

Thursday, December 4, 2014

Preventing cache stampede when using Doctrine Cache

Wikipedia has short and clear article on the matter, cache stampede can be quite deadly, especially when you are rebooting your server, clearing your cache or having a midnight maintenance (cronjob).

Cache stampede, put simply, is when your process is trying to fetch an expensive cache entry, it is not there, and tries to build it, but a simultaneous process comes and does the same thing. This results in resource waste because of processes all trying to do the same thing, uselessly.

The trick is to put a lock on the cache entry and make the other processes wait. I use the trick of having a TTL for the cache calculation. This way, even if the server crashes or there is an exception, the lock will expire and another process will take over. This estimation is also used to calculate how much time to sleep while waiting.

This example uses Doctrine Cache as a global variable, but this should definitely be refactored (or change to another system).


Tuesday, December 2, 2014

Load Doctrine Fixtures when using Migrations in Symfony

Using app/console doctrine:migrations:migrate is really easy and trouble free for everybody, but those tables are often empty and they need some data to be of any use. A common example is a table of countries or user roles.

However, once a project is started, it is often hard to use fixtures because it messes with your data. The trick here is to load only the fixtures you need, only when needed by the migration. This way, a new developer starting with the project could simply run the migrations and have a database ready for testing.

Here is a simple way of loading those fixtures on postUp. You need to pass an array of initialized fixture classes, Doctrine will figure the way to order them by itself.

This is only a proof of concept, it would need some refactoring and testing to be production ready, but you can get the idea.

Monday, December 1, 2014

MySQL 5.7 and Wordpress problem

If you upgrade to MySQL 5.7, you may encounter bugs with legacy software. Wordpress, which I also consider some kind of legacy software, does not handle this very well with its default settings.

You may encounter the "Submit for review" bug where you cannot add new posts. It may be related to permissions, auto_increment and other stuff, but here is another case: bad date formats and invalid data altogether.

In MySQL <= 5.6, by default, invalid values are coalesced into valid ones when needed. For example, attempting to set a field NULL on a non-null string will result in empty string. Starting with MySQL 5.7, this is not permitted.

Hence, if you want to upgrade to 5.7 and use all the goodies, you should consider putting it in a more compatible mode, adding this to your /etc/my.cnf:

[mysqld]
# Default:
# sql_mode = ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION 
sql_mode = ALLOW_INVALID_DATES,NO_ENGINE_SUBSTITUTION

See official documentation for complete information

Wednesday, November 5, 2014

Starting a text editor from inside a Virtual Machine

Using Vagrant, Docker or other virtual development environment is becoming quite popular. However, a drawback of this is that you cannot start a visual text editor on your main machine because heh, the files are not there. Purist may tell you that you should be using Vim or Emacs, but I like Sublime, I want to keep using it.

A simple trick I came up with it to call my text editor by SSH using path translation. Of course, you will need to translate the paths and otherwise adapt it to your needs, but this should get you started:

In this example, /media/data/projects on the Guest exists as ~/projects on the Host and ~ on the Host exists as /media/home on the Guest.

Make sure you have ~/.ssh/authorized_keys and ForwardAgent properly setup to avoid password prompts each time.

You could also use this trick to open a webpage by using the “open” program on Mac. Equivalent exists on most Linux distros.

Wednesday, October 8, 2014

Creating a static private network on VMWare Fusion with Ubuntu

This tutorial is using VMWare Fusion 7, Ubuntu 14.04.1 server and OSX 10.9

The goal here is to create a private network shared with selected VMs and the host, while offering NATing to connect to the Internet. VMWare offers some documentation, which works great with DHCP, but I needed to specify everything static for custom needs.

Creating a private network

  1. Go to the Network tab of general settings (⌘,)
  2. Unlock the screen by clicking on the lock.
  3. Add a custom network by clicking on the +.
  4. Make sure all options are checked (see screenshot).
  5. Specify a subnet IP, I will be using 192.168.200.0.
    Activating the DHCP here is needed for the host to connect to it, even though our VMs will be using static IPs.

Configure the VM’s network adapter

  1. Make sure your VM is powered off.
  2. Go to the VM’s settings (⌘E)
  3. Click on Network Adapter.
  4. Select your newly created network (for me it was vmnet2).

Configure the OS’s network adapter

  1. Edit the network interfaces :
    $ sudo nano /etc/network/interfaces

    # The loopback
    network interface

    auto lo

    iface lo inet loopback


    # The primary network interface

    auto eth0

    iface eth0 inet static

        address 192.168.200.100

        netmask 255.255.255.0

        gateway 192.168.200.2

        dns-nameservers 192.168.200.2
  2. Reboot
  3. Check to see if Internet works :
    ping google.com
  4. Check to see if you can see your host :
    ping 192.168.200.1
  5. Try SSHing to your VM from your host :ssh 192.168.200.100

You can add more VMs to this private network, just remember to change the IP from 192.168.200.100 to something else from 192.168.200.3 to 192.168.200.253.




Friday, June 20, 2014

Adding newlines in a SQL mysqldump to split extended inserts

The official mysqldump supports more or less two output styles: separate INSERTs (one insert statement per row) or extended INSERTs (one insert per table). Extended INSERTs are much faster, but MySQL write them all in one line, the result being a SQL very hard to read. Can we get the best of both worlds ?

Separate INSERTs

INSERT INTO mytable (id) VALUES (1);
INSERT INTO mytable (id) VALUES (2);

Extended INSERTs

INSERT INTO mytable (id) VALUES (1),(2);

New-And-Improved INSERTs

INSERT INTO mytable (id) VALUES
(1),
(2);

Current solutions

Using sed

mysqldump --extended-insert | sed 's/),(/),\n(/g'

Only problem is, lines will be split, even in the middle of strings, altering your data.

Using net_buffer_length

mysqldump --extended-insert --net_buffer_length=5000

mysqldump will make sure lines are not longer than 5000 (or whatever), starting a new INSERT when needed. The problem is that the behaviour is kinda random, diffs are hard to analyze and it may break your data if you are storing columns longer than this.

Writing a parser

This question has been often asked without a proper reply, so I decided to write a simple parser. Precisely, we need to check for quotes, parenthesis, and escape characters.

I first wrote it in PHP:



But then I realized it was too slow, so I rewrote it in C, using strcspn to find string occurence:


The only flaw that I can think of is that the parser will fail if the 10001st character of a line is an escaped quote, it will see it as an unescaped quote.

Happy dumping !

Monday, June 16, 2014

Using PHP’s mb_detect_encoding to cleanup your data

Ever heard of iso-8859-1 ? Yeah… that nightmare… With it, my name ends up more often than not… Sébastien. The computers gurus came up one day with UTF-8 and all our problems should have been solved; one encoding to rule them all.

Sweet, let’s all switch to UTF-8 ! Oh wait… legacy projects… PHP internal encoding is still not UTF-8 and functions like strlen() are still not able to properly process multi-bytes strings. It is being said that UTF-8 should land in PHP 6, but in the mean time, we still have to do something.

I am currently working on a big project with a lot of spaghetti-legacy code with a lot of entry points to the database. Almost all the data is stored in one big table, but not encoded uniformly. When I started working on it, tables had fields with a combinaison of  ascii_bin, utf8_general_ci, latin_general_ci and latin_swedish_ci … and we are in Canada ! In all those fields, data was stored with absolutely no guaranty of its encoding. Data was retrieved and passed through a series of UTF-8 encode/decode and stuff like this:

if (strpos($string, 'é') !== false) {
   $string = utf8_decode($string);
}

I eventually managed to change every field, change all database connections and remove and traces of encode/decode. However, I still had the problem of having data not encoded properly in some rows/columns. You may or may not be familiar with mb_detect_encoding, here is a very simple trick:

Once you have detected the encoding, use iconv to convert it. This crunched through my 1GB database in no time and I was then sure that everything was in UTF-8.

Of course, this is an example with a database, it can work with any data. This script could also be faster if all updates where done at the same time for each row.

Please, save yourself some trouble, make sure all user content is in UTF-8.

Friday, May 23, 2014

Simple custom CSS checkboxes with Font Awesome

Browsers make it notoriously hard to modify the default form elements like dropdowns and checkboxes. For ages, the only way to add a custom style was to use images and CSS Checkbox gives a very good example of this. However, using images is more of a hack than anything else, developers usually prefer to avoid them for good reason: hard to modify, adds to the loading time, flicker when hovering unless you preload or you use sprites, and resolution issues when scaling.

Then came the Javascript solutions. I personally like Chosen for custom dropdowns. These solutions are usually well supported cross-browser and are highly customizable and reliable, but they have the downside of being… Javascript. This means additional libraries, loading time, etc.

The holy grail is to use native CSS: you can customize it as you like, it can be built on top of your site’s stylesheet, inheriting the colours, the fonts, etc. There are a lot of examples out there that will show you working solutions, but they are usually complex. Here is a quick tutorial:

The basics

The concept is to define a custom element that will have a different style based on the state of the input element. No javascript and barely any extra markup. We will hide the original input, but it will continue to work as intended.

Base HTML

<label>
  <input type="checkbox">
  <span class="icon"></span>
  My label
</label>

Base CSS

input {
  display: none;
}
.icon {
  visibility: hidden;
}
input:checked + .icon {
  visibility: visible;
}

This way, the input is always hidden and the .icon is hidden only when the input is not checked.

** The :checked selector is not available for IE8 and lower.

Working demo

For a basic demonstration: http://codepen.io/lavoiesl/pen/rjcIx. The CSS is divided in section to clearly see what is essential. This demo uses Font Awesome to have a nice and clean checkmark.

For more complex examples, see http://cssdeck.com/labs/css-checkbox-styles.

Wednesday, April 9, 2014

Secure your SSL/TLS server

Heartbleed

Recently the Heartbleed bug came to light. It is a bug in the OpenSSL library that causes information to leak from the server. It is an undetectable backdoor that allows to gain the private key of your server. Let’s just say it is VERY important to fix it. Most distros have been very quick to propagate the OpenSSL update, so running your favorite update manager should fix it in no time.

To verify if you have protected, run this command and check for built on to be greater or equal to April 7th, 2014:
$ openssl version -a

OpenSSL 1.0.1e 11 Feb 2013
built on: Mon Apr  7 20:33:19 UTC 2014
platform: debian-amd64

Disable weak ciphers

The way SSL/TLS works is that the client and the server must agree on a cipher to use for encryption. If you were to attack a server, you would obviously use the least secure cipher. To protect against this, simply disable ciphers to be known as weak or those which flaws have been discovered.

I am using this configuration for Apache:

SSLCipherSuite ALL:!ADH:!AECDH:RC4+RSA:+HIGH:+MEDIUM:!LOW:!SSLv2:!EXPORT

For Nginx, see their configuration reference. Since 1.0.5, they are using a sensible default. Otherwise, you can use the same as above.

Do not use a too weak or too strong private key

The private key must never be discovered. Otherwise, anyone could decrypt the content and could perpetrate a MITM attack. If the private key is too weak, it could eventually be guessed given enough data. However, SSL/TLS handshakes are very CPU intensive for both the server and the client. Using a key too long will considerably slow down your website. In most cases, 2048 is perfect.

Test your own server

SSL Labs provides a free test suite that will test your ciphers and for known attacks including BEAST and Heartbleed. This is a must: https://www.ssllabs.com/ssltest/

Further reading

I am not a security expert, I simply happen to have done hosting for quite a time. I suggest you do not take my word blindly and go check this very pertinent paper from SSL Labs.


Thursday, January 30, 2014

Is Bitcoin mining profitable ?

I thought I would go for a change and talk about some Bitcoin mining. There have been some big hype around it. Some large companies are making top dollar with it. One of them, CoinTerra, has gained a very serious reputation with a team of highly skilled people.

Their latest technology, the TerraMiner IV, offers an astonishing 2 TH/s for about 6k USD. If you go at the Bitcoin wisdom calculator and select it using default settings, it shows potential profit of about 10,000 USD in about 8 months, Awesome !

The catch ?

Why are they selling money-making machines ? Why don’t they run them themselves ? Well, the bulk of the profit is made in the first few weeks. Due to the increasing difficulty in mining, hardware quickly becomes less effective so they trick into pre-order 2-3 months before you will have your machine. That is pretty insane…

If you go at the the calculator and input real, plausible shipping dates, you will see you cannot make money.

The options

I did every calculations with 25% difficulty increase and 0.05 USD/kWh. This is optimistic, it is usually considered that the increase is more around 30-35 these days. This also excludes any taxes, shipping fees, customs, etc.

I selected those 4 because it covers a large variety of hardware people often point as the most cost-effective or appealing. I encourage you to try your own.

Profit is calculated until 2015-01-01 or end of lease.

Company Gear GHs Price Start Profit
CoinTerra TerraMiner IV 2,000 6,000 $ 2014-05-15 -2,500 $
ButterflyLabs Monarch 600 2,200 $ 2014-03-01 -1,300 $
Black Arrow Rent-Some-Minions 4 months 1000 2,600 $ 2014-05-01 -800 $
Black Arrow Rent-Some-Minions 6 months 100,000 260,000 $ 2014-05-01 -70,000 $

For the infamous Monarch, the date is highly speculative due to their poor reputation. For it to be profitable, it would need to ship, at my door, within 12 days from today. When is the last time you received a ~40 lbs package by mail in less than 2 weeks ? Always excluding all taxes and shipping fees.

Need I say more ?

A huge hardware seller, TerraHash, has closed, upsetting a lot of customers and ButterflyLabs keeps getting bad press, I do not think they value their customers much.

The increase in difficulty

Not only is the difficulty increasing too quickly for casual miners to follow, the increase is itself increasing ! You have to factor the reduction of days between each change.

I compiled the data from bitcoinwisdom as % difficulty increase per day . Moore’s law dictates a linear increase in computing technology of 100% per 2 years or 0.14% per day, but we have here an increase of 2-6% per day and it seems like it is increasing. It is most probably due to the increase in miners. The computing power for mining has not followed Moore’s law so far, but this because new technologies were used, not discovered


You can’t win, the amount of power required to follow is way more than the normal increase in technology.

Less expensive options, for fun

So you just want to have some fun ?

One of the most suggested hardware for beginners is the ASICMiner Block Erupter USB 300MH/s Sapphire. It is advertised you can get one for as low as 35 USD. Pretty cheap ? Well even at that price tag, you will never reclaim more than 2-3 dollars. More expensive options are similar.

Conclusion

If you are able to build awesome hardware or buy a lot and use it quickly, you can make 2-3 times your investment, but in probably 6 months, it will be worthless. It has been profitable in the past, but now it is just a crazy technological race. Maybe it will calm down and we will be able to mine a bit some time. In the mean time, grab yourself a starter kit and join the party !