Tips Category

Easy ICMP health checking for front-end load balanced web servers

Thursday, May 13th, 2010

Over the past several years, I’ve encountered a lot of growing pains while managing a SaaS infrastructure for my company. One of our big successes in transitioning to the Lighttpd web server was almost reverted because our hardware load balancer wasn’t able to health check our front-end web servers.

Under normal operation, the load balancer will health check its child servers using a basic HTTP HEAD request. Lighttpd has some stricter requirements than Apache, and the load balancer’s HEAD request contained an invalid Content-Length header, which caused it to be rejected. Long story short, a load balancer that can’t health check its children is about as useful as a cell phone on the moon.

Being wary of continuous HEAD requests filling up our log files and tying up web processes to answer them, I thought a much simpler solution would be to rely on the ICMP protocol, or more specifically, ping. If you ping a server and it comes back, that server should be online and ready to accept requests; if not, it’s dead. It’s easy to test on any terminal, DOS prompt on an aunt’s computer over Thanksgiving dinner, bash scripts, the load balancer, etc.

Next problem. Using FastCGI with Xcache in production, I’ve found a handful of ways a server can simply go zombie in my configuration which leads to the server answering pings and behaving normally but not answering (or corrupting) HTTP requests:

1) Too many connections at once overloads FastCGI available threads.
2) Xcache segfault
3) FastCGI craps out (keep PHP_FCGI_MAX_REQUESTS to 500!)
4) opcode overload (especially with stat)
5) Runaway CLI asynchronous daemon process(es) spawning off cron
6) Out of swap space (my favorite)
7) Too many open file descriptors
8) TCP network saturation
… to name a few I’ve experienced.

We rely on an external health checking service; but it can take time to get the phone call, get out of bed, restart the web server and fix the problem — the whole time your client is seeing a 500 error message or getting nothing at all. The best solution is to stop answering pings when any of the above situations happens or when the database goes down or when your app starts spewing errors. The quicker the server stops answering pings, the quicker the load balancer will redirect traffic to the next server, then the next, then eventually a nice, clean error page until we get the problem fixed.

I’ve written three scripts in bash which accomplish this on later versions of Ubuntu (8.04 or newer, which is when the Ubuntu Firewall, ufw, was introduced). The first two, block-pings and allow-pings, do just what they say. The third, www-check, attempts to connect to its own web server and scans the response for a string of your choice. If it finds it, it executes allow-pings. If it can’t connect, or it doesn’t find the string, it calls block-pings. Here are the scripts (the first two tested in Ubuntu 10.04):

block-pings:

#!/usr/bin/env bash

if [ "$(whoami)" != 'root' ]; then
    echo "This script must be run by root."
    exit 1
fi

UFW_BEFORE_RULES="/etc/ufw/before.rules"

/bin/grep "icmp-type echo-request -j ACCEPT" $UFW_BEFORE_RULES > /dev/null 2>&1
if [ $? -ne 0 ]; then
    exit 0
fi

/bin/sed -r -i "s/(icmp-type echo-request -j) ACCEPT\\s*$/\1 DROP/" $UFW_BEFORE_RULES 2>/dev/null
if [ $? -ne 0 ]; then
    echo "Failed to update $UFW_BEFORE_RULES."
    exit 1
fi

/etc/init.d/ufw restart > /dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "Failed to restart ufw."
    exit 1
fi

echo "ICMP ping requests are now being blocked."

exit 0

allow-pings:

#!/usr/bin/env bash

if [ "$(whoami)" != 'root' ]; then
    echo "This script must be run by root."
    exit 1
fi

UFW_BEFORE_RULES="/etc/ufw/before.rules"

/bin/grep "icmp-type echo-request -j DROP" $UFW_BEFORE_RULES > /dev/null 2>&1
if [ $? -ne 0 ]; then
    exit 0
fi

/bin/sed -r -i "s/(icmp-type echo-request -j) DROP\\s*$/\1 ACCEPT/" $UFW_BEFORE_RULES 2>/dev/null
if [ $? -ne 0 ]; then
    echo "Failed to update $UFW_BEFORE_RULES."
    exit 1
fi

/etc/init.d/ufw restart > /dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "Failed to restart ufw."
    exit 1
fi

echo "ICMP ping requests are now allowed."

exit 0
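
Both scripts above hinge on a single sed substitution, so it can be worth trying that substitution on a scratch file before pointing it at the live firewall rules. The following is a minimal sketch, not part of the original scripts: the function name set_ping_policy is made up for illustration, and the sample rule line is an assumption about what /etc/ufw/before.rules contains on a stock install.

```shell
#!/usr/bin/env bash

# set_ping_policy RULES_FILE ACCEPT|DROP   (hypothetical helper)
# Applies the same echo-request substitution that block-pings and
# allow-pings make, but to whatever file you point it at.
set_ping_policy() {
    local rules="$1" target="$2" tmp
    case "$target" in
        ACCEPT|DROP) ;;
        *) echo "Usage: set_ping_policy <rules file> ACCEPT|DROP" >&2; return 2 ;;
    esac
    tmp="$(mktemp)" || return 1
    # Basic regular expressions only, so neither -r nor -i is needed.
    sed -e "s/\(icmp-type echo-request -j\) ACCEPT[[:space:]]*\$/\1 $target/" \
        -e "s/\(icmp-type echo-request -j\) DROP[[:space:]]*\$/\1 $target/" \
        "$rules" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy the result back over the original so its permissions are kept.
    cat "$tmp" > "$rules"
    rm -f "$tmp"
}

# Dry run against a scratch file (the rule line below is an assumed sample).
sample="$(mktemp)"
printf '%s\n' '-A ufw-before-input -p icmp --icmp-type echo-request -j ACCEPT' > "$sample"
set_ping_policy "$sample" DROP
cat "$sample"
rm -f "$sample"
```

If the last line printed ends in DROP, the pattern matches your rule format; if it still ends in ACCEPT, the real scripts would silently do nothing on that file.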

www-check:

#!/usr/bin/env bash

TMP_FILE="/tmp/www-check.tmp"

if [ $# -ne 3 ]; then
    echo "Syntax: www-check [url] [string to check] (ping|yell|exit)"
    exit 1
fi

/bin/touch $TMP_FILE
/usr/bin/wget -T 4 -O - "$1" > $TMP_FILE 2>/dev/null
/bin/grep "$2" $TMP_FILE >/dev/null 2>&1

if [ $? -ne 0 ]; then
    case "$3" in
        ping)
            /usr/local/bin/block-pings;
            exit $?;
        ;;
        yell)
            /bin/echo "$1 is DOWN.";
            exit 1;
        ;;
        *)
            exit 1;
        ;;
    esac
else
    case "$3" in
        ping)
            /usr/local/bin/allow-pings;
            exit $?;
        ;;
        yell)
            /bin/echo "$1 is UP.";
            exit 0;
        ;;
        *)
            exit 0;
        ;;
    esac
fi

Now, either run www-check in a looping script or install it in the crontab like so:

* * * * * /usr/local/bin/www-check "http://www.mywebsite.com" "Welcome to my Site" ping

Every minute, the server will health check itself and block pings if the test string isn’t found, an HTTP connection can’t be made, or if the request times out. When the server comes back online, it answers the pings again!
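
Cron cannot schedule anything more often than once a minute, so the looping option is the one to reach for if a minute of failed requests is too long. Here is a rough sketch of such a loop; the check_loop function and its arguments are my own invention for illustration, and the ten-second interval in the usage comment is just an example value.

```shell
#!/usr/bin/env bash

# check_loop INTERVAL MAX_RUNS COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND every INTERVAL seconds. MAX_RUNS=0 means loop forever.
check_loop() {
    local interval="$1" max_runs="$2" runs=0
    shift 2
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        # The exit status is ignored on purpose: in "ping" mode www-check
        # has already blocked or allowed pings by the time it returns.
        "$@" || true
        runs=$((runs + 1))
        # Sleep between runs, but not after the final one.
        if [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; then
            sleep "$interval"
        fi
    done
}

# Example (as root): check every 10 seconds until the process is stopped.
# check_loop 10 0 /usr/local/bin/www-check "http://www.mywebsite.com" "Welcome to my Site" ping
```

Since www-check waits up to four seconds on a timeout, the real gap between checks can be a little longer than the interval you pass in.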

Posted in Linux, Tips

My Top Ten Internationalization Headaches and How I Fixed Them

Tuesday, April 27th, 2010

Working on the CATS project has taken me from developing primarily English software into a whole new realm of excitement: internationalization and localization (i18n / L10n). Suddenly I’ve got people from 120 countries (and not a handful, hundreds of paying customers!) wanting to see full support for their native tongues.

I could probably talk for hours on the enormous effort that it took to take CATS to the level of i18n support it has today; but, instead, I’m going to talk about the top 10 headaches I ran into.

Before I get started, if you’re looking at adopting i18n / L10n either pre-development or on an existing project, UTF-8 is the way to go. There are alternatives; but unless you have a very heavy non-Latin based user base, stop looking. UTF-8 is backwards compatible with ASCII, it supports just about everything (and is supported by just about everything) and it’s the best thing since sliced bread.

Now let’s get started!

1) My umlaut looks like a question mark in a fancy triangle!

This is the first step: change the encoding on all of your rendered HTML pages. Hopefully, you use a CMS or have a single header file where you can add this to the top of your pages in between the <head> tags:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

The handful of bytes you save by loading ISO-8859-1 isn’t worth it. Get on the bandwagon and start implementing UTF-8, even if you don’t need it yet. There’s a reason the IETF (read: Internet Police) requires all Internet protocols to support it.

Once you cover the HTML, don’t forget about other content types. Make sure that your Ajax, RSS feeds and XML responses all include the UTF-8 identifiers or there will be some jumbling going on.

2) My XML or HTML doesn’t validate, it says invalid entity but it’s using <?xml version="1.0" encoding="UTF-8"?> or includes the above <meta> tag and the entity is valid UTF-8!

Unless your DTD includes specifications for the UTF-8 entities, you’re going to get yelled at during validation. The whole point of the encoding="UTF-8" is so you don’t need entities. Luckily, this is an easy fix in PHP. Use the built-in html_entity_decode function to turn those entities into their actual characters:

$value = html_entity_decode('fancy &Uuml;', ENT_COMPAT, 'UTF-8');
// returns 'fancy Ü'

Just run your string data through it prior to exporting it to your XML writer. On a side note, if you haven’t noticed, my examples of Unicode data almost always include one of my favorite words: umlaut.

3) When I send data from JavaScript to my server using Ajax it gets scrambled!

Stop using the escape() function. It is not friendly to UTF-8. If you need to escape your strings for insertion into your Ajax request URI, use encodeURIComponent() instead. It takes the same single parameter (the text) and returns the escaped text. It works in all browsers and it’s UTF-8 friendly. Start up that sed script on all your JavaScript files!

4) When I export my data from MySQL using SELECT INTO OUTFILE, it corrupts my UTF-8 in the CSV it creates!

The output file MySQL creates is going to be in BINARY. It’s NOT going to be in your character set for the table. For this reason, do not edit the CSV files created by SELECT INTO OUTFILE using a text editor. UTF-8 is variable length, so a text editor like vim may show 2 or 3 byte combinations that represent a single character.  Mess with any of those characters and you’ll corrupt the encoding!

If you need to use LOAD DATA INFILE / SELECT INTO OUTFILE to transfer UTF-8 data, just make sure that:

1) The source and destination tables are using the same UTF-8 encoding
2) You don’t mess with the CSV file in a text editor.
3) You include “CHARACTER SET utf8” (utf8 is MySQL’s name for UTF-8) in the LOAD DATA INFILE statement right after the “INTO TABLE <name>” part.

5) In SQL, my “table.column_a = table.column_b” is throwing an incompatible character set error. Why does it hate me?

MySQL stores character sets and collations for each database, table and column. If no column settings exist, it falls back to the table and then the database. If you’re using a bad combination of character sets between two tables, string comparison may not be possible without alteration. Either alter your column/table/db to use compatible character sets/collations or alter your query like so: “CONVERT(table.column_a USING utf8) = CONVERT(table.column_b USING utf8)”. I don’t suggest this as a long-term solution as that conversion is slow and will disable the query cache.

Also, if you’re wondering what collation is and how it differs from the character set: the simplest answer is to think of collation as case sensitive/insensitive. It goes beyond that, but the simplest reason to pick one UTF-8 collation over another is whether you care about the case of a column (the “username” column of a login table is a great example of insensitive, where the password column would be sensitive). Collations that end in “_ci” are case insensitive and those that end in “_cs” are their case-sensitive counterparts. There is also “_bin”, which compares the raw byte values.

6) I have a column that allows 32 characters, but my i18n data is getting trimmed to much less, 24, etc.!

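UTF-8 is a variable-length format. In a basic English Latin character set, the word “four” uses 4 bytes. For fancy characters like umlauts, a single character can take several bytes. If the limit you are hitting is counted in bytes and NOT characters, 32 bytes can hold far fewer than 32 characters. That is how MySQL 4.0 and earlier treat a definition like name varchar(32), and it is effectively what happens when UTF-8 bytes land in a column with a single-byte character set such as latin1; from MySQL 4.1 on, the length of a column declared with a UTF-8 character set is counted in characters. Where the limit is in bytes, you should generally multiply your max size by 4 (UTF-8 takes 1-4 bytes per character), then use PHP to truncate to the character (not byte) limit before sending the query.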
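
To see the gap between bytes and characters for yourself, here is a small shell sketch. The helper names utf8_bytes and utf8_chars are made up for this example; the character count relies on the fact that every byte of a multi-byte UTF-8 character after the first falls in the range 0x80-0xBF, so dropping those bytes leaves exactly one byte per character.

```shell
#!/usr/bin/env bash

# Count the bytes in a string, regardless of the current locale.
utf8_bytes() {
    printf '%s' "$1" | LC_ALL=C wc -c | tr -d '[:space:]'
}

# Count UTF-8 characters: delete continuation bytes (0x80-0xBF, octal
# 200-277) so one byte per character remains, then count what is left.
utf8_chars() {
    printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | LC_ALL=C wc -c | tr -d '[:space:]'
}

# "fünf" with the ü spelled out as its two UTF-8 bytes (0xC3 0xBC).
word="$(printf 'f\303\274nf')"
echo "bytes: $(utf8_bytes "$word"), characters: $(utf8_chars "$word")"
# prints: bytes: 5, characters: 4
```

One umlaut costs one extra byte; a value made up entirely of three-byte characters would fill a 32-byte limit after only ten of them.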

7) I have the UTF-8 flags set everywhere I should, but my queries still contain scrambled characters!

I’ll assume when you say everywhere, you mean everywhere (in the MySQL client library, the HTML page, any Ajax URIs, etc.). This is usually because PHP 5 isn’t completely UTF-8 compliant yet (see PHP 6) so several of the string functions still work on bytes and not characters (this is similar to #6).

Be careful not to use things like substr() or left() to truncate data. Any string operation that works on bytes and not characters has the potential to chop up a 4 byte UTF-8 character midstream and corrupt the character. There are functions that start with mb_ (stands for multi-byte) which you should use instead. I will note, trim() is safe!

8) I don’t speak other languages and my only form of testing is copy and pasting umlauts. How can my users publish translations?

First, I recommend Gettext. There are numerous Gettext applications out there which allow Windows, Mac and Linux computers to write binary translation files. WordPress has a great implementation, copy it. There’s some nice documentation here: http://codex.wordpress.org/Translating_WordPress.

Another suggestion is that whenever you test your forms for UTF-8 compatibility, use Chinese text. If it works with Mandarin, it works with everything. Here’s where you can get some sample text: http://www.lorem-ipsum.info/generator3.

9) Is there an easy solution for images with text in them? What about JavaScript?

Images: no. Use CSS to position text over the images and stop putting text in your images themselves or accept the headache of using multiple images for each language.

For JavaScript, I recommend using a PHP script that can call your gettext functions and then outputs the .js file after setting the Content-Type. You can configure Apache/Lighttpd so that the file maintains the .js extension but parses it as PHP. While doing this, it’s a great time to add a combiner to reduce the number of JS files but maintain component-based scripts and also a minification script like JSMin.

10) What is the biggest headache you wish you had avoided when implementing i18n in CATS?

I allowed clients to translate individual strings themselves and change any piece of text on any page that they wanted. This means maintaining thousands of copies of translation files, which are difficult to cache, and reading from them when rendering nearly every single page. If you give a client a cookie, they’ll want a glass of  milk.  Draw a line, and support a handful of language translations only. You’ll thank me later.

Posted in CSS, HTML, JavaScript, PHP, SQL, Tips

Journaling Web Applications – Better Reliability and Responsiveness

Monday, February 15th, 2010

If you’re working on LAMP, you’re most likely already taking advantage of journaling on your file system and with your database. With journaling, every operation is first logged to disk; the log is then read back in order, like a circular buffer, and the operations are carried out.

On a modern file system such as reiserfs or ext3, journaling means that a sudden loss of power between the removal of a directory entry and the marking of its inode as free in the free space map won’t lead to an orphaned inode (a storage leak).

In a database, journaling is often referred to with ACID (atomicity, consistency, isolation, durability) or the bundling of a series of operations in a transaction. When banking for example, you wouldn’t want a process to fail after deducting the money from your account but before depositing it as a payment into another account.

So far, we’re pretty comfortable implementing journaling everywhere that we’re storing data; but what about the application layer? As the project triangle asks, we toss out simplicity (bye bye Ruby on Rails) for better reliability and performance. Here are a few situations where journaling can be beneficial:

  1. Eliminate latency between the front-end web server and the database server (such as when front-ends are deployed internationally to provide better local service).
  2. Eliminate downtime when databases need to be cycled offline for updates, backups, or during unexpected outages. When running a database on a cloud, maintain service during upgrades or when Amazon has its “hiccups”.
  3. Eliminate the waiting period after a user hits “submit” — even if it requires a response.

The Problem

There is a disconnect between the front-end web server serving content/running the application and the database server processing queries that store and retrieve data. That disconnect can temporarily fail or become slow.

The Solution

Each front-end manages its own “temporary” database locally. All writes go to the local database. All reads come from the master (or a locally replicated copy of it) and ALSO from the local database. All data in the local database is moved to the master database as fast as it can be.

Implementation

The key requirements of a front-end application journaling implementation are that it has to be fast, lightweight and use on-disk storage.

First, storage: The best tools for the job are Apache’s CouchDB, a local MySQL install, SQLite or a custom implementation of flat files (perhaps XML) in a structured directory tree. Whatever you use, it needs to be set up so that the oldest entries can be read first (FIFO: first in, first out). Each “node” stored represents a CRUD (minus the R) action: create, update or delete.

Second, lightweight. The goal is to keep it fast; we’re not aiming at permanent storage or trying to turn the front-end into a database server. The resource usage requirements should be very low. Things like indexes aren’t required as we’re reading and writing circularly.

Third, on-disk storage. We can’t rely on things like memcached or in-memory tables.

The next hurdle is key generation. When adding new items, we have to generate and retrieve new primary keys in the master database (which often requires a response, as after adding an item the client is often redirected to view it). Obviously we don’t have this key as we’re writing it to the local database. One solution is that each front-end has an algorithm which generates number patterns unique to it. The second solution is you use a separate numbering scheme (such as one prefixed with a ‘z’) for local items. You then maintain a table with relationships between the local ID and the master ID when it becomes available. That item then becomes accessible either by the local ID or the master ID from then on.
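
As a rough illustration of the second approach, the relationship table can be as small as a two-column file. Everything in this sketch is hypothetical: the function names, the tab-separated map file and the sample IDs are stand-ins, and a real front-end would more likely keep this table in the same local store as the nodes.

```shell
#!/usr/bin/env bash

# record_master_id MAP_FILE LOCAL_ID MASTER_ID   (hypothetical helper)
# Remembers which master ID a 'z'-prefixed local ID turned into.
record_master_id() {
    printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# resolve_id MAP_FILE ID   (hypothetical helper)
# Prints the master ID for a local ID once it is known; otherwise prints
# the ID unchanged, so callers can pass either kind of ID.
resolve_id() {
    local map="$1" id="$2" master=""
    if [ -f "$map" ]; then
        master=$(awk -F '\t' -v id="$id" \
            '$1 == id { m = $2 } END { if (m != "") print m }' "$map")
    fi
    printf '%s\n' "${master:-$id}"
}

# Example: z17 is shown to the user right away, then mapped once the
# master database has assigned the real key.
# record_master_id /var/tmp/id-map.tsv z17 90210
# resolve_id /var/tmp/id-map.tsv z17      # prints 90210
```

Because unknown IDs pass straight through, a page can keep linking to the local ID until the master key exists, and the same lookup keeps working afterwards.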

The format of the “nodes” is pretty simple. It should be a compilation of the field/value pairs required to insert the record, serialized, in whatever form you’re using to store it.

A cron job or background process should be set up which routinely reads out the oldest nodes, unserializes the data and attempts to connect to the master database and process the action. Only upon success is the local node deleted or archived.
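
Here is one way the flat-file variant of this could look, as a sketch under a few assumptions: the names journal_write and journal_flush are invented for this example, each node is a small file holding the action and an already-serialized payload, and there is no locking, so it assumes a single writer and a single flusher per directory.

```shell
#!/usr/bin/env bash

# journal_write DIR ACTION PAYLOAD   (hypothetical helper)
# Stores one node as a numbered file so that nodes sort oldest first.
journal_write() {
    local dir="$1" action="$2" payload="$3" seq name
    mkdir -p "$dir" || return 1
    # A counter file keeps numbers increasing even after nodes are deleted.
    seq=$(cat "$dir/.seq" 2>/dev/null || echo 0)
    seq=$((seq + 1))
    echo "$seq" > "$dir/.seq" || return 1
    name=$(printf '%012d.node' "$seq")
    # Write under a hidden temporary name, then rename, so the flusher
    # never picks up a half-written node.
    printf '%s\t%s\n' "$action" "$payload" > "$dir/.tmp.$$" || return 1
    mv "$dir/.tmp.$$" "$dir/$name"
}

# journal_flush DIR HANDLER [ARGS...]   (hypothetical helper)
# Passes each node file, oldest first, to HANDLER. A node is deleted only
# if HANDLER succeeds; the first failure stops the run to preserve order.
journal_flush() {
    local dir="$1" node
    shift
    for node in "$dir"/*.node; do
        [ -e "$node" ] || continue    # empty journal: the glob matched nothing
        if "$@" "$node"; then
            rm -f "$node"
        else
            return 1
        fi
    done
}

# Example: queue a write, then let cron call the flusher with a handler
# script (also hypothetical) that applies one node to the master database.
# journal_write /var/spool/app-journal create 'name=alice'
# journal_flush /var/spool/app-journal /usr/local/bin/apply-node
```

Stopping at the first failure means an update is never applied ahead of the create it depends on; the cost is that one bad node holds up everything behind it until someone looks at it.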

If you want complete break-away-ability from the master server, a local replicated database of the master should also be available to the front-end for all reads. You’ll also need to combine all read operations against the master or replicated database with the new local layer (look out for LIMIT clauses for pagination).

Implementing this process adds real complexity to your application, but the results can be quite amazing. Imagine taking your database offline for an hour to install a hardware upgrade without ever taking your application down. Perhaps your database cloud provider has a big “oops” and loses all your data. If you maintain a few days of nodes, just restore a recent database backup to a new server, and re-read the old nodes. Your application never goes offline and you’re back in business with a new master database server in a few hours.

Posted in Linux, PHP, SQL, Tips