Tuesday 23 September 2014

Blog Archived

Whilst Blogger has been a good platform so far, I've had great difficulty in dealing with code blocks and syntax highlighting.  I have therefore decided to move to a combination of GitHub Pages and Pelican for the new site.

The new site can be found at http://linuxjedi.co.uk/

Monday 22 September 2014

libAttachSQL now in Beta!

After three successful alpha releases I have pushed up the last major feature prior to the 1.0 release of libAttachSQL.  v0.4.0 has therefore been released today and is the first beta release.  I have been working on a few other things for HP's Advanced Technology Group, which is why this is a little more delayed than I would like, but I have made sure there are a lot of good things in this release.

The big new feature in this release is server-side prepared statement support.  You now have the ability to prepare, execute and fetch the results for a prepared statement, and as before the API for this is non-blocking.

Other things in this release are:

  • Compiler fixes for GCC 4.9, RedHat/CentOS 6.x and Python 3.x (Python is used for building the documentation)
  • Bug fixes for SSL and buffer issues
  • Much faster building on multi-core systems
In the next couple of weeks I'll write some blog posts on how to use this API as well as the rest of libAttachSQL.

Monday 15 September 2014

Speaking about libAttachSQL at Percona Live London

As many of you know I'm actively developing libAttachSQL and am rapidly heading towards the first beta release.  For those who don't, libAttachSQL is a lightweight C connector for MySQL servers with a non-blocking API.  I am developing it as part of my day job for HP's Advanced Technology Group.  It was in part born out of my frustration when dealing with MySQL and eventlet in Python back when I was working on various Openstack projects.  But there are many reasons why this is a good thing for C/C++ applications as well.

What you may not know is that I will be giving a talk at Percona Live London about libAttachSQL, the technology behind it and the decisions we made to get here.  The event is on the 3rd and 4th of November at the Millennium Gloucester Conference Centre.  I highly recommend attending if you wish to find out more about libAttachSQL or any of the new things going on in the MySQL world.

As for the project itself, I'm currently working on the prepared statement code, which I hope to have ready in the next few days.  0.4.0 will certainly be a big release in terms of changes.  There has been feedback from some big companies, which is awesome to hear, and I have fixed a few problems they found for 0.4.0.  Hopefully you will be hearing more about that in the future.

For anyone attending, I'll be in London from the 2nd to the 5th of November and am happy to meet up and chat about the work we are doing.

Monday 1 September 2014

New version of libAttachSQL, C connector for MySQL released!

It has been just over 2 weeks since the last libAttachSQL version was released.  I had a great vacation in the middle which for once meant that I didn't do any work for the week I was away :)

For those who don't know about it, libAttachSQL is a lightweight, non-blocking C connector for MySQL servers.  It is Apache 2.0 licensed so plays well with both Open Source and Commercially licensed applications.  I have been developing it for 2 months now as part of my work for HP's Advanced Technology Group.  It is hosted on GitHub and uses many freely available tools (such as Travis CI) to host and test various parts of the project.

Once again I thank everyone for the feedback I have received.  You all make it even more awesome to be working on this :)

So, on to the new version, the 0.3.0 alpha release.  This time round we have been focusing on zlib compression and SSL support.  Both of these features have been added and neither impacts the non-blocking aspect of the library.  The SSL part in particular was quite new to me: I've coded SSL into applications many times in the past, but I've never done it in a non-blocking way before.  It posed some interesting challenges, but it was fun and appears to be working great now.

The biggest changes in this release are:

  • Fixes to the test cases and improvements to the CI used
  • Documentation improvements
  • Many minor bug fixes
  • Protocol compression (zlib) support
  • SSL encryption (OpenSSL) support
  • 32bit compiling works
For more information see the Version History section of the docs.

On to the next release which should complete the biggest pre-release features.  From there we can head towards our first GA release.

If you have any questions, feedback, etc... please feel free to leave comments, email me or open a GitHub issue.

Friday 15 August 2014

libAttachSQL 0.2.0 alpha released!

Hot on the heels of last week's release we have released version 0.2.0 alpha of libAttachSQL.  For those who have missed my previous blog posts, libAttachSQL is a lightweight C connector for MySQL servers I'm creating with HP's Advanced Technology Group.  It has an Apache 2 license so is good for linking with software under most Open Source licenses as well as commercial software projects.

Changes in this release:

  • Added support for query result buffering
  • Passive connect on first query is now asynchronous
  • Improved memory handling
  • Many documentation changes, including API examples
  • Many other smaller fixes
For more information see the libAttachSQL documentation and the release itself can be found on the libAttachSQL website.

We have had some great feedback so far.  Thanks to everyone who has contacted me since the first release.  As always if you have any questions feel free to contact me or file an issue on GitHub.

Sunday 10 August 2014

libAttachSQL 0.1.0 alpha released!

As I briefly mentioned in my previous post, I have been working on a new project for HP's Advanced Technology Group called libAttachSQL.

libAttachSQL is a lightweight C connector for MySQL servers.  It is Apache 2 licensed (and therefore compatible with many open source licenses as well as commercial use) and has a new asynchronous API.  With the new API you send a command which returns immediately, and you then poll until the library tells you there are results ready.  This is very useful for applications that have many things going on that you do not want held up by waiting for the MySQL server to process a query.  In later posts I will give usage examples of this.
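
To give a feel for what that polling model looks like in C, here is a rough sketch of a simple query loop.  The function and constant names below are written from memory of the early documentation, so treat them as illustrative rather than definitive and check the docs for the exact API.

/* Illustrative sketch only: names, signatures and the header path may differ
 * from the released API -- see the documentation for the real calls. */
#include <string.h>
#include <libattachsql-1.0/attachsql.h>

int main(void)
{
    attachsql_error_t *error= NULL;
    attachsql_return_t ret;
    const char *query= "SELECT id, name FROM t1";
    attachsql_connect_t *con= attachsql_connect_create("localhost", 3306,
                                                       "user", "pass", "test", &error);

    attachsql_query(con, strlen(query), query, 0, NULL, &error);

    /* The query call returns immediately; keep polling until the library
     * reports end-of-results or an error. */
    do
    {
        ret= attachsql_connect_poll(con, &error);
        if (ret == ATTACHSQL_RETURN_ROW_READY)
        {
            /* Copy or process the row here, then ask for the next one. */
            attachsql_query_row_get(con, &error);
            attachsql_query_row_next(con);
        }
        /* Any other status just means "nothing ready yet"; the application is
         * free to go and do other work before polling again. */
    } while (ret != ATTACHSQL_RETURN_EOF && error == NULL);

    attachsql_query_close(con);
    attachsql_connect_destroy(con);
    return 0;
}

The key point is that the query call never blocks; the poll loop is where your application decides when to give the library time, which is what makes it easy to slot into an event loop.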

I am a great believer in release early/often, so on Friday, 5 weeks after I started writing code (and docs), I released the first alpha version of this connector.  The source of this release can be downloaded here.  For now this is a source-only release just to give a taste of the project so far.  At some point before GA, binary packages will be released too.  Documentation for the library can be found on Read The Docs.

What it can currently do:

  • Compiles with Clang and GCC on Linux and Mac
  • Cross-compiles for Windows using MinGW64 (in Fedora only)
  • Connects to MySQL servers using TCP or a Unix socket file
  • Sends basic MySQL queries and retrieves results
  • Automatically escapes and converts data for your queries using an API similar to prepared statements
  • Not a lot else
As the project progresses we will be adding many more features such as prepared statement support.


This project is completely open, using many freely available services as described in my previous blog post.  We welcome people to come and kick the tyres and contribute in as small or large a way as possible.  This can be simply filing a bug or feature request, contributing docs or code, etc...  One thing we could really use right now is someone with Debian/Ubuntu expertise to help us create the Debian package scripts (I'm not an expert at these and am struggling to make them work).  There is a GitHub issue open for this.

If you have any questions about the library feel free to contact me, comment on this blog post, open a GitHub issue or come chat on the #libAttachSQL channel on Freenode.

Friday 8 August 2014

How cloud hosted services are helping open source

One big project I'm working on for HP's Advanced Technology Group right now is an Apache 2.0 licensed C connector for MySQL servers called libAttachSQL.  The whole process, not just the code itself, is helping us learn about new and current techniques in Open Source development.  Whilst I will be writing many posts about libAttachSQL in the future, today's post is about the free hosted services we are using around it.

GitHub

Almost all of the Open Source projects I have worked on in the past have been hosted on Canonical's Launchpad platform.  Over the last couple of years there has been a shift to using GitHub, and almost everything I have worked on at HP has been hosted there.  Now there are many services that hook into GitHub, so this seemed like the perfect opportunity to try some of them out.

The libAttachSQL project has its own organisation on GitHub and a couple of trees under it.  The service is fantastic and has grown a lot over the years in features and reliability.  The only thing I don't quite agree with is that they prefer their own flavour of Markdown for documentation over other formats.  Some reStructuredText support is there but it isn't as good as I would hope yet.  This is a really minor issue though and not something they should be knocked down on.

GitHub Pages

GitHub Pages is a relatively new service created by GitHub.  Simply create a tree with a specific name, push some static content, and you are done!  There is also an easy method to get domains pointed to it, so we have a GitHub page as the site for libattachsql.org.

Read The Docs

Every Open Source project I have worked on from Drizzle onwards has had its documentation in reStructuredText format which compiles into HTML, PDF and many other formats using a Python based tool called Sphinx (not to be confused with the search server).  In my opinion it is more flexible than Markdown format, especially when documenting APIs.

libAttachSQL's documentation was again written in reStructuredText format and is automatically compiled into HTML and PDF documentation using the free service Read The Docs.  This is hooked into GitHub so on a new push/merge Read The Docs will automatically generate a new version of the documentation.  We have pointed a subdomain to the Read The Docs output so that it can be easily accessed, docs.libattachsql.org.

I am extremely pleased with this service: not only is it free for Open Source projects, but it also makes the documentation even more aesthetically pleasing than the basic Sphinx templates do.

Travis CI

Every source code project needs Continuous Integration.  There are many solutions to this, one of the most popular being Jenkins.  As with the RST documentation format, every project I have worked on from Drizzle onwards uses Jenkins to test every branch before and after merging.  I could have used Jenkins for this project but my goal is not to own the hosting of anything.  So, for libAttachSQL I set up Travis CI.  This is a hosted service that is free for Open Source projects and has a paid-for variant for private projects.

Our Travis setup tests compiling with Clang and GCC on Linux (Travis uses Ubuntu 12.04), running the test suite with each.  Every virtual test host comes with a MySQL server already running for you to use in your tests, and it was very simple to set this up.  We also get the Travis tests to build the documentation with nitpick mode and warnings as errors, so any minor documentation problems are picked up early.  All this is done with a very simple YAML script (although ours has got a little more complex with adding support for Coverity).

I also want our builds to run Valgrind checks and to use the provided OS X platform, but I will work on getting those running at a later date.

Travis is a fantastic service and a breeze to set up and use.  The interface shouldn't be too unfamiliar if you have used Jenkins before.  My wish is that it supported more platforms; I would really love a Fedora based builder, a more up-to-date Ubuntu and possibly Windows builders.  They do have OS X builders though, which is fantastic.

Coverity Scan

Coverity Scan is a static code analyser which is free to Open Source projects hosted on GitHub, and it hooks in nicely with Travis CI, with Travis providing the analysis data from the code and builds for Coverity to break down on its site.  This was the most complex of all of these services to set up but has given some fantastic results so far.  It has found 13 potential bugs in my code that Clang's lint and Valgrind didn't find.  This is really impressive: for starters there are incredibly strict flags set for building the project from git, and there was also only one false positive.  Unfortunately there is a quota limit for Open Source projects, so we only run this occasionally rather than on every merge.

Conclusion

We have managed to get all of the services that we would otherwise need to set up and manage ourselves hosted for us completely for free, with no hosting for us to manage.  They are all awesome services and most were very quick to set up.  I thank all of the companies providing these services; they have easily shaved a week off my time setting up machines to host our project and many more hours managing the services.

Over the next couple of weeks I will be talking a lot more about the libAttachSQL project, so look out for those posts.

Thursday 31 July 2014

Tripping up using MinGW

One of the projects I am currently working on for HP's Advanced Technology Group is written in C/C++ and is using MinGW to cross-compile from Linux to Windows.  I have done this in several other projects quite successfully but this week I was tripped up by something when using MinGW in Fedora 20.

The build system I am using is DDM4, which is a collection of m4 files that can be used as a template for any project.  Back when working on libdrizzle 5.x we added support for MinGW cross-compiling to this.  It is designed to enable as many good compiler warnings as possible (and makes them trigger as errors), and if the compiler supports PIE (Position Independent Executable) this will be enabled too.

MinGW's compiler is based on GCC and as such supports PIE to some extent.  Unfortunately I found that the '-pie' compiler option was causing the entry point to the executable to point away from the main() function.  It was hitting MinGW's dummy main() instead and the executable was returning immediately.

I have made a patch to disable PIE for MinGW and upstreamed it to DDM4.  I leave this as a warning for anyone that it hits in the future (including me).

As an extra side note, MinGW 64bit in Ubuntu seems to be completely broken.  The binary packages only contain a few text files and directories, no actual binaries.

Thursday 26 June 2014

CoreOS Review

I have spent a few days now playing with CoreOS and helping other members of HP's Advanced Technology Group get it running on their setups.

Today I thought I would write about the good and the bad about CoreOS so far.  Many components are in an alpha or beta state so things may change over the coming months.  Also as a disclaimer, views in this post are my own and not necessarily those of HP.

Installation


As stated in my blog post yesterday, I have been using CoreOS on my MacBook Pro using Vagrant and VirtualBox.  This made it extremely easy to set up the CoreOS cluster on my Mac.  I made a minor mistake to start with, and that was not configuring the unique URL required for Etcd correctly.  A colleague of mine made the same mistake on his first try, so it is likely to be a common one to make.

I initially had VirtualBox configured to use a Mac formatted USB drive I have hooked up.  Vagrant tried to create my CoreOS cluster there, and during the setup the Ruby in Vagrant kept spinning on some disk reading routine and never completed the setup.  Debug output didn't help find the cause, so I switched to using the internal SSD instead.

CoreOS


CoreOS itself appears to be derived from Chrome OS, which itself is a fork of Gentoo.  It is incredibly minimal; there isn't even a package manager that comes with it.  But that is the whole point.  It is designed so that Docker containers are run on top of it to provide the application support, almost completely isolating the underlying OS from the applications running on top of it.  This also provides excellent isolation between, say, MySQL and Apache in a LAMP stack.

It is a clean, fast OS using many modern concepts such as systemd and journald.  Some of these are only in the bleeding-edge distributions at the moment so many people may not be familiar with using them.  Luckily one of my machines is running Fedora 20 so I've had a play with these technologies before.

Etcd


CoreOS provides a clustered key/value store system called 'etcd'.  The name of this confused many people I spoke to before we tried it.  We all assumed it was a clustered file store for the /etc/ path on CoreOS.  We were wrong, although that is maybe the direction it will eventually take.  You actually communicate with it using a REST based interface.

Etcd has been pretty much created as a new project from the ground up by the CoreOS team.  The project is written in Go and can be found on GitHub.  Creating a reliable clustered key/value store is hard, really hard.  There are so many edge cases that can cause horrific problems.  I cannot understand why the CoreOS team decided to roll their own instead of using one of the many that have been well-tested.

Under the hood the nodes communicate to each other using what appears to be JSON (REST) for internal admin commands and Google Protobufs over HTTP for the Raft Consensus Algorithm library used.  Whilst I commend them for using Protobufs in a few places, HTTP and JSON are both bad ideas for what they are trying to achieve.  JSON will cause massive headaches for protocol upgrades/downgrades and HTTP really wasn't designed for this purpose.

At the moment this appears to be designed more for very small scale installations instead of hundreds to thousands of instances.  Hopefully at some point it will gain its own protocol based on Protobufs or similar and have code to work with the many edge cases of weird and wonderful machine and network configurations.

Fleet


Fleet is another service written in Go and created by the CoreOS team.  It is still a very new project aimed at being a scheduler for a CoreOS cluster.

To use Fleet you basically create a systemd configuration file with an optional extra section to tell Fleet what CoreOS instance types it can run on and what it would conflict with.  Fleet communicates with Etcd and, via some handshaking, figures out a CoreOS instance to run the service on.  A daemon on that instance handles the rest.  The general idea is that you use this to have a systemd file manage docker instances; there is also a kind-of hack so that it will notify/re-schedule when something has failed, using a separate systemd file per service.

Whilst it is quite simple in design it has many flaws and for me was the most disappointing part of CoreOS so far.  Fleet breaks, a lot.  I must have found half a dozen bugs in it in the last few days, mainly around it getting completely confused as to which service is running in which instance.

Also, the way that configurations are expressed to Fleet is totally wrong in my opinion.  Say, for example, you want ten MySQL docker containers across your CoreOS cluster.  To express this in Fleet you need to create ten separate systemd files and send them up, even though those files are likely identical.

This is how it should work in my opinion:  You create a YAML file which specifies what a MySQL docker container is and what an Apache/PHP container is.  In this YAML you group these and call them a LAMP stack.  Then in the YAML file you specify that your CoreOS cluster needs five LAMP stacks, and maybe two load balancers.

Not only would my method scale a lot better but you can then start to have a front-end interface which would be able to accept customers.

Conclusion


CoreOS is a very ambitious project that in some ways becomes the "hypervisor"/scheduler for a private Docker cloud.  It can easily sit on top of a cloud installation or on top of bare metal.  It requires a totally different way of thinking and I really think the ideas behind it are the way forward.  Unfortunately it is a little too early to be using it on anything more than a few machines in production, and even then it is more work to manage than it should be.

Wednesday 25 June 2014

Test drive of CoreOS from a Mac

Yes, I am still the LinuxJedi, and whilst I have Linux machines in my office for day-to-day work the main desktop I use is Mac OS.

I have recently been giving CoreOS a spin for HP's Advanced Technology Group and so for this blog post I'm going to run through how to setup a basic CoreOS cluster on a Mac.  These instructions should actually be very similar to how you would do it in Linux too.

Prerequisites


There are a few things you need to install before you can get things going:
  1. VirtualBox for the CoreOS virtual machines to sit on
  2. Vagrant to spin-up and configure the virtual machines
  3. Git to grab the vagrant files for CoreOS
  4. Homebrew to install 'fleetctl' (also generally really useful on a developer machine)
You then need to install 'fleetctl' which will be used to spin up services on the cluster:

$ brew update
$ brew install fleetctl

Configuring


Once everything is installed the git repository for the vagrant bootstrap is needed:

$ git clone git://github.com/coreos/coreos-vagrant/
$ cd coreos-vagrant/
$ cp config.rb.sample config.rb
$ cp user-data.sample user-data

The last two commands here copy the sample configuration files to configuration files we will use for Vagrant.

Edit the config.rb file and look for the line:

#$num_instances=1

Uncomment this and set the number to '3'.  When I was testing I also enabled the 'beta' channel in this file instead of the default 'alpha' channel.  This is entirely up to personal preference and both will work with these instructions at time of writing.

With your web browser go to https://discovery.etcd.io/new.  This will give you a token URL.  This URL will be used to help synchronise Etcd, which is a clustered key/value store in CoreOS.  In the user-data YAML file uncomment the 'discovery' line and set the URL to the one the link above gave you.  It is important that you generate a new URL for every cluster and every time you burn down / spin up a cluster on your machine.  Otherwise Etcd will not behave as expected.

Spinning Up


At this point everything is configured to spin-up the CoreOS cluster.  From the 'coreos-vagrant' directory run the following:

$ vagrant up

This may take a few minutes as it downloads the CoreOS base image and creates the virtual machines required.  It will also configure Etcd and setup SSH keys so that you can communicate with the virtual machines.

You can SSH into any of these virtual machines using the following and substituting '01' with the CoreOS instance you wish to connect to:

$ vagrant ssh core-01 -- -A

Using Fleet


Once you have everything up and running you can use 'fleet' to send systemd configurations to the cluster.  Fleet will schedule which CoreOS machine the configuration file will run on.  Fleet internally uses Etcd to discover all the nodes of your CoreOS cluster.

First of all we need to tell 'fleetctl' on the Mac how to talk to the CoreOS cluster.  This requires two things, the IP/port and the SSH key:

$ export FLEETCTL_TUNNEL=127.0.0.1:2222
$ ssh-add ~/.vagrant.d/insecure_private_key

For this example I'm going to spin up three memcached docker instances inside CoreOS.  I'll let Fleet figure out where to put them, only telling it that two memcached instances cannot run on the same CoreOS machine.  To do this create three identical files, 'memcached.1.service', 'memcached.2.service' and 'memcached.3.service', as follows:

[Unit]
Description=Memcached server
After=docker.service

[Service]
ExecStart=/usr/bin/docker run -p 11211:11211 fedora/memcached

[X-Fleet]
X-Conflicts=memcached*

This is a very simple systemd based script which will tell Docker in CoreOS to get a container of a Fedora based memcached and run it.  The 'X-Conflicts' option tells Fleet that two memcached systemd configurations cannot run on the same CoreOS instance.

To run these we need to do:

$ fleetctl start memcached.1.service
$ fleetctl start memcached.2.service
$ fleetctl start memcached.3.service

This will take a while and show as started whilst it downloads the docker container and executes it.  Progress can be seen with the following commands:

$ fleetctl list-units
$ fleetctl journal memcached.1.service

Once complete, your three CoreOS machines should all be listening on port 11211 with memcached running in Docker.

Thursday 5 June 2014

Live Kernel Patching Video Demo

Earlier today I posted my summary of testing the live kernel patching solutions so far as part of my work for HP's Advanced Technology Group.  I have created a video demo of one of these technologies, kpatch.


This demo goes over the basics of how easy kpatch is to use.  For now the toolset will only work with Fedora 20, but judging by the documentation Ubuntu support is coming very soon.

Live Kernel Patching Testing

At the end of March I gave a summary of some of the work I was doing for HP's Advanced Technology Group on evaluating kernel patching solutions.  Since then I have been trying out each one whilst also working on other projects and this blog post is to show my progress so far.

After my first blog post there were a few comments as to why such a thing would be required.  I'll be the first to admit that when Ksplice was still around I felt it better to have a redundant setup which you could perform rolling reboots on.  That is still true in many cases.  But with OpenStack based public clouds a reboot means that customers sitting on top of that hardware will have their virtual machines shut down or suspended.  This is not ideal when, with big iron machines, you can have 20 or more customers on one server.  A good live kernel patching solution means that in such a scenario the worst case would be a kernel pause of a few seconds.

In all cases my initial testing was to create a patch which would modify the output of /proc/meminfo.  This is very simple and doesn't go as far as testing what would happen when patching a busy function under load, but at least it gets the basics out of the way.

Ksplice


As previously discussed, Ksplice was really the only toolkit to do live kernel patching in Linux for a long time.  Unfortunately, since the Oracle acquisition there have been no Open Source updates to the Ksplice codebase.  In my research I found a Git repository for Ksplice which had recent updates to it.  I tried using this with Fedora 20, but my guess is either Fedora does something unique with its kernels or the code is just too outdated for modern 3.x Linux kernels.  My attempts to compile the patches hit several problems inside the toolset which could not easily be resolved.

kGraft


After Ksplice I moved on to kGraft.  This takes a similar approach to Ksplice but uses much newer techniques.  Of all the solutions I believe the technology in kGraft is the most advanced.  It is the only solution that is capable of patching a kernel without pausing it first.  To use this solution there is currently a requirement to use SUSE's kGraft kernel tree.  I tested this on an OpenSUSE installation so I could be close to the systems it was designed on.  Unfortunately, whilst the internals are quite well documented, the toolset is not.  After several attempts I could not find a way to get this solution to work end-to-end.  I do think this solution has massive potential, especially if it gets added to the mainline kernel.  I hope in the future the documentation is improved and I can revisit this.

kpatch


Finally I tried Red Hat's kpatch.  This solution was released around the same time as kGraft and likewise is at a development stage, with warnings that it probably shouldn't be used in production yet.  The technology is again similar to Ksplice with more modern approaches, but what really impressed me is the ease of use.  The kpatch source does not require compiling as part of the kernel; it can be compiled separately and is just a case of 'make && make install'.  The toolset will take a standard unified diff file and use it to compile the required parts of the kernel to create the patch module.  Whilst this sounds complex, it is a single command.  Applying the patch to the kernel is equally easy: running a single command will do all the required steps and apply the patch.

Conclusion


If public clouds with shared hardware are going to continue to grow, this should be a technology that is invested in.  I was very quickly drawn to Red Hat's solution because they have put a high priority on how the user interacts with the toolset.  I'm sure many Devops engineers will appreciate that.  What I would love to see is kGraft's technology combined with kpatch's toolset.  This will certainly be an interesting field to watch as both projects mature.

Monday 28 April 2014

Should voicemail be trusted?

As a member of HP's Advanced Technology Group I take security very seriously.  It is a primary thought in everything that engineers do at HP.

I will start this discussion with a disclaimer: Don't hack voicemail!  Not only is it a really nasty thing to do, it is illegal!

In the UK we have had a phone hacking scandal in our media for a long time.  The short story is that reporters for tabloid newspapers were accessing the voicemail of celebrities to find out gossip to sell papers.  Phone providers made this easy by having an easy-to-guess default PIN number, or no PIN number at all, for accessing voicemail remotely.

Whilst things have improved slightly in the wake of this, The Register recently proved you can still access the voicemail of others without a PIN number very simply.  By the time you read this I suspect both providers affected will have closed the loophole, but there are bound to be other loopholes just waiting to be exploited.  This still raises several questions in my mind about the security of voicemail.

Judging by the data from a recent Data Genetics article you could reasonably guess a PIN in three tries with around an 18% chance of getting it correct (the three most common PINs in that dataset, 1234, 1111 and 0000, account for roughly that share between them).  I've not tried to lock myself out of a voicemail system before but I would hope it locks out after three attempts (if not then we should really worry).  If you have some information such as memorable years/dates about the person owning the number you could probably increase your success rate even further.  So in theory a hacker wouldn't have to try too many phone numbers before he/she got in.

If your PIN number is not stored in an encrypted form it is likely that it would be vulnerable to some form of social engineering attack at the provider's end.  I also suspect that many would use their credit card PIN number as their voicemail PIN number to make it easy to remember, which adds a level of insecurity to the system.

I think it is very unlikely that the voicemail is stored in an encrypted form; it is much more likely that it is a bunch of MP3s on a disk array with a database table pointing to your messages (or just blob data in the DB).  This brings the security of it down in line with email (worse, in fact, because a PIN number is easier to guess than a password).  Even if the voicemail data is encrypted the provider holds the locks and the keys, rendering you powerless.

The general saying is that email should be considered public and you shouldn't send messages you wouldn't want the world reading without at least some form of encryption (such as PGP).  I would say that exactly the same is true of voicemail: don't use it for messages that you don't want the world to hear.  Voicemail doesn't have anything like PGP encryption built in.

Several of my friends and I have our voicemail greeting messages set to say that we don't listen to our voicemail ever and that messages should be left in a different form.  In this decade, where security is really under the magnifying glass, I think someone needs to start taking a serious look at a better way of doing voicemail.

Monday 31 March 2014

Live Kernel Patching Solutions

A large public cloud is quite a complex thing to manage, even with toolsets such as Openstack to deploy and manage it.  You often have many customers with compute instances on a single server so security is a high priority.  This also introduces an interesting problem, when you need to deploy a Linux Kernel security patch to the host (sometimes thousands of hosts) how do you do it with the least disruption?

One of the most accepted methods currently is to suspend the instances on a box, deploy the patch, reboot the host and resume the instances.  Whilst this works in many cases, you have customers who have had their instances down for X minutes (even if it is planned and the customer is notified it can be inconvenient) and instance kernels that suddenly think "my clock just jumped, WTF just happened?".  This in turn can cause problems for long running clients talking to servers as well as any time-sensitive applications.  Long running clients will often think they are still connected to the server when they really are not (there are ways around this with TCP Keepalive).

There are a couple of old solutions to this and a couple of new ones and as part of my work for HP's Advanced Technology Group I will be taking a deep dive into them in the coming weeks.  For now here is a quick summary of what is around:

kexec


This is probably the oldest technology on the list.  It doesn't quite fall under the "Live Kernel Patching" umbrella but is close enough that it warrants a mention.  It works by basically ejecting the current kernel and userspace and starting a new kernel.  It effectively reboots the machine without a POST and BIOS/EFI initialisation.  Today this only really shaves a few seconds off the boot time and can leave hardware in an inconsistent state.  Booting a machine using the BIOS/EFI sets the hardware up in an initialised state; with kexec the hardware could be in the middle of reading/writing data at the point the new kernel is loaded, causing all sorts of issues.

Whilst this solution is very interesting, I personally would not recommend using it as during a mass deployment you are likely to see failures.  More information on kexec can be found on the Wikipedia entry.

Ksplice


Ksplice was really the first toolset to implement Live Kernel Patching.  It was created by several MIT students who subsequently spun off a company from it which supplied patches on a subscription model.  In 2011 this company was acquired by Oracle and since then there have been no more official Open Source releases of the technology.  GitHub trees which have been updated to work with current kernels still exist.

The toolset works by taking a kernel patch and converting it into a module which will apply the changes to functions to the kernel without requiring a reboot.  It also supports changes to the data structures with some additional developer code.  It does temporarily pause the kernel whilst the patch is being applied (a very quick process), but this is far better than rebooting and should mean that the instances do not need suspending.

kpatch


Both Red Hat and SUSE realised that a more modern Open Source solution is needed for the problem, and whilst SUSE announced their solution (kGraft) first, Red Hat's kpatch was the first to show actual code for a solution.

Red Hat's kpatch solution gives you a toolset which creates a binary diff of a kernel object file before and after a patch has been applied.  It then turns this into a kernel module which can be applied to any machine with the same kernel (along with kpatch's loaded module).  Like Ksplice, it does need to pause the kernel whilst patching the functions.  It also does not yet support changes to data structures.

It is still very early days for this solution but development has been progressing rapidly.  I believe the intention is to create a toolset that will take a unified diff file and turn that into a kpatch module automatically.

For more information on this solution take a look at their blog post and GitHub repository.

kGraft


SUSE Labs announced kGraft earlier this year but only very recently produced code to show their solution.

From the documentation I've seen so far their solution appears to work in a similar way to Red Hat's, but they have a unique feature: the ability for the patch to be applied to the kernel without pausing it.  Both the old and the replacement functions can exist at the same time; old executions will finish using the old function and new executions will use the new function.

This solution seems to have gone down the route of bundling their code on top of a Linux kernel git tree which means it took an entire night for me to download the git history.  I'm looking forward to digging through the code of this solution to see how it works.

The git tree for this can be found in the kgraft kernel repository (make sure to checkout the origin/kgraft branch after you have cloned it) and SUSE's site on the technology can be found here.

Summary


All three of the above solutions are very interesting.  Combined with a deployment technology such as Salt or Ansible it could mean the end to maintenance downtime for cloud compute instances.  As soon as I have done more research on the technologies I will be writing more details and hopefully even contributing where possible.

Saturday 22 February 2014

Cloud Users: Don't put your eggs in one basket

In my previous blog post I briefly mentioned that a premature cloud account removal meant that the Drizzle project lost a lot of data, including backups.  I'll go into a bit more detail here so cloud users can learn from our mistakes.  I will try to avoid using cloud puns here as much as possible :)

As with everything in life, clouds fail.  No cloud I have seen so far can claim 100% uptime since hitting GA.  However much projects like Openstack simplify things for the operators of clouds, they are complex architectures and things can fail.

But there is one element of human failure I've seen several times from multiple cloud vendors which is the problem that crippled Drizzle.  That is account deletion.

Most of Drizzle's websites and testing framework were running from a cloud account using compute resources.  Backups were made and automatically uploaded to the cloud file storage on the same account for archiving.  This was our mistake (and we have definitely learnt from it).  We knew that at some point in the future the cloud accounts used would be migrated to a different cloud and the current cloud account terminated.  Unfortunately the cloud account used was terminated prematurely.  This meant that all compute instances and file storage were instantly flushed down the toilet.  All our sites and backups were instantly destroyed.

This is not the only time I have seen this happen.  There have been two other instances I know of in the last year where an accidental deletion of a cloud account has meant that all data including backups were destroyed.  Luckily in both those cases the damage was relatively minor.  I actually also lost a web server due to this problem around the same time as Drizzle was hit.

The Openstack CI team do something quite clever but relatively simple to mitigate against these problems and continue running.  They use multiple cloud vendors (last I checked it was HP Cloud and Rackspace).  When your commit is being tested in Jenkins it goes to a cloud compute instance in whatever cloud is available at the time.  So if a vendor goes down for any reason the CI can still continue.

I highly recommend a few things to any users of the cloud.  You should:

  1. Make regular offsite backups (and verify them)
  2. If uptime is important, use multiple cloud providers
  3. Use Salt, Ansible or similar technology so that you can quickly spin your cloud instances up again to your requirements at a moment's notice

Patrick Galbraith, a Principal Engineer who works with me at HP's Advanced Technology Group, is currently working on a way to enhance libcloud to work better with HP Cloud so that it is easy to seamlessly use multiple clouds.  We are also working on several enhancements to Salt and Ansible, both very promising technologies when it comes to cloud automation.

The way I see it, no one should be putting all their cloud eggs in one basket.

Is Drizzle dead?

Yesterday someone opened a Launchpad question asking "is Drizzle dead?".  I have answered that question on Launchpad but wanted to blog about it to give a bit of background context.

As I am sure most of the people who read this know, Drizzle is an Open Source database which was originally forked from the alpha version of MySQL 6.0.  At the time it was an extremely radical approach to Open Source development, many features were stripped out and re-written as plugins to turn it into a micro-kernel style architecture.  Every merge request was automatically thoroughly tested on several platforms for regressions, memory leaks and even positive/negative performance changes.

In fact Drizzle has influenced many Open Source projects today.  Openstack's Continuous Integration was born from the advanced testing we did on Drizzle.  MariaDB's Java connector was originally based on Drizzle's Java connector.  Even MySQL itself picked up a few things from it.

Development of Drizzle started off as a "What if?" R&D project at Sun Microsystems spearheaded by Brian Aker.  Once Oracle acquired Sun Microsystems, a new corporate sponsor was found for Drizzle: Rackspace.

Rackspace hired all the core developers (and that is the point where I joined) and development progressed through to the first GA release of Drizzle.  Unfortunately Rackspace decided to no longer sponsor the development of Drizzle and we had to disband.  I've heard many reasons for this decision, but I don't want to reflect on it; I just want to thank Rackspace for that time.

Where are we now?  Of the core team whilst I was at Rackspace:

So, back to the core question: "Is Drizzle dead?".  The core team all work long hours in our respective jobs to make some awesome Open Source products and in what little spare time we have we all work on many Open Source projects.  Unfortunately splitting our time to work on Drizzle is hard, so the pace has dramatically slowed.  But it isn't dead.  We have been part of Google Summer of Code, we still get commits in from all over the place and Drizzle is still part of the SPI.

Having said this, Drizzle no longer has a corporate sponsor.  Whilst Drizzle can live and go on without one, it is unlikely to thrive without one.

Another thing that is frequently asked is: "What happened to the docs and wiki?".  Drizzle, being a cloud database, had all of its development and public documentation servers hosted in the cloud.  Unfortunately the kill switch was accidentally hit prematurely on the cloud account used.  This means we not only lost the servers but also the storage space being used for backups.  This also affected other Open Source projects such as Gearman.  The old wiki is dead; we cannot recover that content.  The docs were auto-generated from the reStructuredText documentation in the source.  It was just automatically compiled and rendered for easy reading.

What I would personally like to see is the docs going to Read The Docs automatically (there is an attempt to do this, but it is currently failing to build) and the main site moved to DokuWiki similar to the new Gearman site.

As for Drizzle itself...  It was in my opinion pretty much exactly what an Open Source project should be and indeed was developing into what I think an Open Source database should be.  It just needs a little sponsorship and a core team that are paid to develop it and mentor others who wish to contribute.  Given that it was designed from the ground-up to be a multi-tenant in-cloud database (perfect for a DBaaS) I suspect that could still happen, especially now projects like Docker are emerging for it to sit on.

Thursday 13 February 2014

Caveats with Eventlet

The Stackforge Libra project, as with most Openstack based projects, is written in Python.  As anyone who has used Python before probably knows, Python has something called a GIL (Global Interpreter Lock).  The GIL basically causes Python to only execute one thread at a time, context switching between the threads.  This means you can't really use threads for performance reasons in Python.

One solution to get a little more performance is to use Eventlet.  Eventlet is a library which uses what are called "Green Threads" and hacks on top of the networking libraries to give a multi-threaded-like feel to an application.  As part of this blogging series for HP's Advanced Technology Group I'll write about some of the things I found out the hard way about Eventlet, so hopefully you don't hit them.

What are Green Threads?


Green Threads are basically a way of doing multi-tasking on a single real thread.  They use what is called "Cooperative Yielding" to allow each other to run rather than being explicitly scheduled.  This has the advantage of removing the need for locks in many cases and making asynchronous IO easier.  But they come with caveats which can hurt if you don't know about them.

Threading library patched


One of the first things you typically do with Eventlet is "Monkey Patch" the standard Python library functions so that they are compatible with cooperative yielding.  For example, you want the sleep() function to yield rather than hanging all the green threads up until it has finished.

The threading library is one of the libraries that is monkey patched, and its behaviour suddenly becomes slightly different.  When you try to spawn a thread, control will not return to the main thread until the child thread has finished execution.  So your loop that tries to spawn X threads will suddenly only spawn one thread and not spawn the next until that thread has finished.  It is recommended you use Eventlet's green thread calls instead (which will actually work as expected).

Application hangs


Cooperative yielding relies on the library functions being able to yield, which means that if you use functions that do not understand this, the yielding will not happen and all your green threads (including the main thread) will hang waiting.  Any unpatched system call (such as executing some C/C++ functions) falls into this category.

A common place you can see this is with the MySQLdb library which is a wrapper for the MySQL C connector (libmysqlclient).  If you execute some complex query that will take some time, all green threads will wait.  If your MySQL connection hangs for any reason... well, you are stuck.  I recommend using one of the native Python MySQL connectors instead.

Another place I have seen this is with any library that relies on epoll.  Python-gearman is an example of this.  It seems that Eventlet only patches the select() calls, so anything that uses epoll.poll() is actually blocking with Eventlet.

In summary there are cases where Eventlet can be useful.  But be careful where you are using it or things can grind to a halt really quickly.

Tuesday 11 February 2014

Why use double-fork to daemonize?

Something that I have been asked many times is why Linux / Unix daemons double-fork to start a daemon.  Now that I am blogging for HP's Advanced Technology Group I can try to explain this as simply as I can.

Daemons double-fork

For those who were not aware (and hopefully are now by the title) almost every Linux / Unix service that daemonizes does so by using a double-fork technique.  This means that the application performs the following:

  1. The application (parent) forks a child process
  2. The parent process terminates
  3. The child forks a grandchild process
  4. The child process terminates
  5. The grandchild process is now the daemon

The technical reason


First I'll give the technical documented reason for this and then I'll break this down into something a bit more consumable.  POSIX.1-2008 Section 11.1.3, "The Controlling Terminal" states the following:


The controlling terminal for a session is allocated by the session leader in an implementation-defined manner. If a session leader has no controlling terminal, and opens a terminal device file that is not already associated with a session without using the O_NOCTTY option (see open()), it is implementation-defined whether the terminal becomes the controlling terminal of the session leader. If a process which is not a session leader opens a terminal file, or the O_NOCTTY option is used on open(), then that terminal shall not become the controlling terminal of the calling process.

The breakdown

When you fork and kill the parent of the fork, the new child will become a child of "init" (the main process of the system, with a PID of 1).  You may find others state on the interweb that the double-fork is needed for this to happen, but that isn't true; a single fork will do this.

What we are actually doing is a kind of safety thing to make sure that the daemon is completely detached from the terminal.  The real steps behind the double-fork are as follows:
  1. The parent forks the child
  2. The parent exits
  3. The child calls setsid() to start a new session with no controlling terminals
  4. The child forks a grandchild
  5. The child exits
  6. The grandchild is now the daemon
The reason we do steps 4 and 5 is that the child, as a session leader, could still acquire a controlling terminal; the grandchild is not a session leader, so it cannot do this.

Put simply, it is a roundabout way of completely detaching itself from the terminal that started the daemon.  It isn't a strict requirement to do this at all; many modern init systems can daemonize a process that stays in the foreground quite easily.  But it is useful for systems that can't do this and for anything that at some point in time is expected to be run without an init script.
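
For anyone who prefers reading code, here is a minimal C sketch of that sequence (error handling and logging trimmed right down):

#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

static void daemonize(void)
{
    pid_t pid= fork();              /* 1. the parent forks a child */
    if (pid < 0)
        exit(EXIT_FAILURE);
    if (pid > 0)
        exit(EXIT_SUCCESS);         /* 2. the parent exits */

    if (setsid() < 0)               /* 3. the child starts a new session with no controlling terminal */
        exit(EXIT_FAILURE);

    pid= fork();                    /* 4. the child forks a grandchild */
    if (pid < 0)
        exit(EXIT_FAILURE);
    if (pid > 0)
        exit(EXIT_SUCCESS);         /* 5. the child exits */

    /* 6. the grandchild is now the daemon; tidy up its environment */
    umask(0);
    chdir("/");
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    close(STDERR_FILENO);
}

A real daemon would usually also redirect the standard file descriptors to /dev/null rather than just closing them, and often writes a PID file, but the fork/setsid/fork core above is the part that answers the question in the title.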

Why VLAIS is bad

At the beginning of the month I was at FOSDEM, interfacing and watching talks on behalf of HP's Advanced Technology Group.  Since my core passion is working on and debugging C code I went to several talks on Clang, Valgrind and other similar technologies.  Unfortunately there were several talks I couldn't get into due to their sheer popularity, but I hope to catch up with the videos in the future.

One talk I went to was on getting the Linux kernel to compile in Clang.  It appears that there are many changes required, which come down to Clang being at minimum C99 compliant and GCC supporting some non-standard language extensions.

The one language extension which stood out for me was called VLAIS, which stands for Variable Length Arrays In Structs.  Now, VLAs (Variable Length Arrays) are nothing new in C; they have been around a long time.  What we are talking about here are variable length arrays at any point in the struct.  For example:

void foo(int n) {
    struct {
        int x;
        char y[n];
        int z;
    } bar;
}

The char array 'y' in this struct is what we are talking about here.  This kind of code is used in several places in the kernel; for the most part it is used in encryption algorithms.

How did it get added to GCC?


It came about around 2004 in what appears to be a conversion of a standard from Ada to C.  There is a mailing list post on it here.  It has since been used in the kernel and I can understand the argument that the kernel was never intended to be compiled with anything other than GCC.  But the side of me that likes openness and portability is not so keen.  I suspect the problems that currently plague the kernel for Clang also affect compilers such as Intel's ICC and other native CPU manufacturers' compilers.

So why is it bad?


Well, to start with there is the portability issue.  Any code that uses this will not compile with other compilers.  If you turn on the pedantic C99 flags in GCC it won't compile either (if you aren't doing this then you really should; it shakes out lots of bugs).  Once Linus Torvalds found out about the usage of it in the kernel he called it an abomination and asked for it to be dropped.

Next there is debugging.  I'm not even sure if debuggers understand this, and if they do I can well imagine it being difficult to work with, especially if you need to track z in the example above.

There is the possibility of alignment issues.  Some architectures work much better when the structs are byte aligned by a certain width.  This will be difficult to do with a VLAIS in the middle of the struct.

In general I just don't think it is clean code and if it were me I would be using pointers and allocated memory instead.
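
As a rough sketch of what I mean, the earlier example could keep a pointer in the struct and allocate the buffer separately, at the cost of an explicit free():

#include <stdlib.h>

void foo(int n) {
    struct {
        int x;
        char *y;   /* a pointer instead of a variable length array member */
        int z;
    } bar;

    bar.y = malloc(n);
    if (bar.y == NULL)
        return;

    /* ... use bar.x, bar.y[0..n-1] and bar.z as before ... */

    free(bar.y);
}

The struct layout is now fixed-size and easy to align, debuggers have no trouble with it, and it compiles with any C compiler; the trade-off is that you now own the allocation.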

Of course this is all my opinion and I'm sure people have other views that I haven't thought of.  Please use the comments box to let me know what you think about VLAIS and its use in the kernel.  You can find out more in the Linux Plumbers Conference 2013 slides.

Monday 10 February 2014

HAProxy logs byte counts incorrectly

Continuing my LBaaS look-back series of blog posts for HP's Advanced Technology Group, I am today looking into an issue that tripped us up with HAProxy.

Whilst we were working with HAProxy we naturally had many automated tests going through a Jenkins server.  One such test checked that the byte count in the logs tallied with the bytes received, as this would be used for billing purposes.

Unfortunately we always found our byte counts a little off.  At first we found it was due to dropped log messages.  Even after this problem was solved we were still not getting an exact tally.

After some research and reading of code I found out that, despite what the manual says, the outgoing byte count is measured from the backend server to HAProxy, not as the bytes leaving HAProxy.  This means that injected headers are not in the byte count, and if HAProxy is doing HTTP compression for you the count will be way off.

My findings were backed by this post from the HAProxy developer.

On average every log entry for us was off by around 30 bytes due to injected headers and cookies.

Given the link above this appears to be something the developer is looking into, but I doubt it will be a trivial fix.

Sunday 9 February 2014

Working with Syslog

As part of my transition to HP's Advanced Technology Group I am winding down my contributions to HP Cloud's Load Balancer as a Service project (where I was technical lead).  I thought I would write a few blog posts on things we experienced on this project.

Whilst we were working on the Stackforge Libra project we added a feature that uploads a load balancer's log file to Swift.  To do this we have syslog store the HAProxy log in a separate file, and that syslog-generated file is what Libra uploads.

Our installation of Libra is on an Ubuntu LTS setup, but the instructions should be very similar for other Linux distributions.

Logging to a New File


Syslog has the facility to capture certain log types and split them into separate files.  You can actually see it doing this in various /var/log/ files.  Having syslog handle this makes it far easier to manage, rotate, etc... than having daemons write to files in their own unique way.

To do this for HAProxy we create the file /etc/rsyslog.d/10-haproxy.conf with the following contents:

$template Haproxy,"%TIMESTAMP% %msg%\n"
local0.* -/mnt/log/haproxy.log;Haproxy
# don't log anywhere else
local0.* ~

In this example we are using /mnt/log/haproxy.log as the log file; our servers have an extra partition there to hold the log files.  The log will be written in a format similar to this:

Dec 10 21:40:20 74.205.152.216:54256 [10/Dec/2013:21:40:19.743] tcp-in tcp-servers/server20 0/0/1075 4563 CD 17/17/17/2/0 0/0

From here you can add a logrotate script called /etc/logrotate.d/haproxy as follows:

/mnt/log/haproxy.log {
    weekly
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 640 syslog adm
    sharedscripts
    postrotate
        /etc/init.d/haproxy reload > /dev/null
    endscript
}

This will rotate weekly, compressing old logs and retaining up to 7 log files.

Syslog Flooding


We soon found a problem: the generated log file was not recording all log entries on a load balancer that was getting hammered.  When looking through the main syslog file for clues we discovered that the flood protection had kicked in and we were seeing log entries as follows:

Jul 3 08:50:16 localhost rsyslogd-2177: imuxsock lost 838 messages from pid 4713 due to rate-limiting

Thankfully this flood protection can be tuned relatively simply by editing the file /etc/rsyslog.conf and adding the following to the end of the file:

$SystemLogRateLimitInterval 2
$SystemLogRateLimitBurst 25000

These settings allow a burst of up to 25,000 messages in any 2 second interval before rate limiting kicks in.  Syslog then needs to be restarted (a reload doesn't seem to apply the change):

sudo service rsyslog restart

After this we found all log entries were being recorded.

Friday 7 February 2014

How I would do the "Year of Code"

As a member of HP's Advanced Technology Group I'm very interested in government initiatives to get more people into various forms of computer programming.  The current one being talked about in the media is the "Year of Code"; its director was attacked on Newsnight recently, and it felt a little like she had been hung out to dry.

I think it is really great that there is an initiative for this, but I worry that we will end up taking the wrong approach and the whole thing will collapse.  My head was spinning with ideas for how I would write a curriculum for this, so I thought I would note them down in this blog post.

Computer Fundamentals

First I would teach children computer fundamentals.  You can't even begin to program a computer properly without understanding these.  The three main components of a computer haven't really changed in many years; very much like a car engine, they have just evolved.  By this I mean the CPU, RAM and storage.  Children should be taught these and the roles they play in a computer.  You could even buy an old PC (an AMD K6 or early Pentium, for example) and get the children to put it together as an exercise.

CPUs run instructions; at a basic level they can only see a couple of numbers at a time and operate on them (yes, it is way more complex than that, but we are teaching basic fundamentals here).  Lots of fun animations could be used to illustrate this if it is done properly.  Programming languages are compiled or interpreted into this machine language, and children need to know this: CPUs don't understand English :)

They should be taught something relatable about computer storage (both GB and GiB).  Children will often tell me "my X has Y GB of memory", but they often have no idea what that means they can store, or even whether it is RAM or flash storage.  What is "Windows", or even an "Operating System"?  This should be taught too, along with the fact that alternatives such as "Mac OS" and "Linux" exist (maybe not how to use them, but at least that they exist).  Did you know that your Android phone/tablet is also a computer running a form of Linux?  When I gave a talk in a junior school not too long ago many didn't know this; I even showed them Fedora Linux running on my laptop.  The fascination was incredible.

What is dual/quad-core?  What is a MHz?  These are all questions that could be answered by the fundamentals section of the curriculum.

At this stage you could teach them IT.  Word processing and spreadsheets haven't changed much since I was in high school in the 90s, and I doubt they will change much in the next couple of decades.  We were taught Logo in school as an introduction to programming (I wouldn't call it a programming language, but it helps teach giving instructions in a logical order).  Today's equivalent is Scratch, which would be an acceptable replacement.

The Internet

The Internet can be a big, scary thing, and it will play a large part in their lives for some time to come.  Children should be taught how the Internet is connected together (they don't need to be taught the underlying protocols).  They need to be taught not only that it can be unsafe but why and how it can be unsafe.  I believe there are already initiatives to do this in schools, but with the base knowledge from the fundamentals they should find it easier to understand what goes on when they request a web page.

Computer Programming

This should not be taught to everyone (beyond Scratch, as indicated above).  Only those who show interest in the subject should be taught the more advanced material.  I wouldn't teach the clarinet to someone who isn't interested in music.  But by all means give children a taster to see if it is something they want to learn.

When I was around 7 years old I taught myself BASIC on a BBC Micro, which was also one of the first machines I learnt assembly language on (I'm not recommending teaching children that, but encourage them if they want to learn).  Although BASIC in that form is hard to find and probably not as relevant today, there are languages nowadays that could be taught instead.

Last month I was in our Seattle office, and during my downtime I spent some time with friends there.  A young lady called Sakura, who is only a year older than my eldest son, was telling me all about how she is learning Python.  In my opinion this would be a great starter language.

Combined with a development board such as a Raspberry Pi, a hardware kit and a good set of Python libraries, the programming could be extended to basic robotics (see the sketch below).  Seeing their work do tangible things in the real world would be a great way of keeping children interested.
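
As a purely illustrative example, this is roughly the level of Python I have in mind for a first hardware exercise: blinking an LED from a Raspberry Pi.  The pin number, the wiring and the use of the RPi.GPIO library are my assumptions here, not part of any curriculum.

# Blink an LED wired (with a resistor) to GPIO pin 18 on a Raspberry Pi.
import time
import RPi.GPIO as GPIO

LED_PIN = 18                        # BCM pin numbering; assumed wiring

GPIO.setmode(GPIO.BCM)
GPIO.setup(LED_PIN, GPIO.OUT)

try:
    for _ in range(10):             # blink ten times
        GPIO.output(LED_PIN, GPIO.HIGH)
        time.sleep(0.5)
        GPIO.output(LED_PIN, GPIO.LOW)
        time.sleep(0.5)
finally:
    GPIO.cleanup()                  # always release the pins when done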

HTML is not a programming language.  XML is not a programming language either; with any luck XML will be replaced with YAML in the future, but that is another blog post.  Neither should be taught in schools, but any inquisitive child should be shown things like the W3Schools site.

Summary

We need more computer programmers in the UK.  When I worked as a Technical Lead for HP Cloud's Platform-as-a-Service department I was the only person in the entire department who was UK based, and I would have loved to have hired more UK staff.  But not everyone will be or can be a programmer, and we should accept this.  We definitely need more IT professionals in the UK, from people who can use word processors and the Internet to people who can program.  The UK seems focused on games programming, overlooking the fact that there are many other fun things you can do in the computer industry.  When I was 18 I worked in a computer shop building and repairing PCs; that was a lot of fun and very hands-on.  Now I find debugging and writing Open Source C tools/libraries a lot of fun, and I incorporate that into my day job as much as I can.

I do hope the government gets this right and teaches appropriate things whilst keeping it fun and interesting for the children.

These are just my opinions and not necessarily those of HP, please comment with what you think the "Year of Code" initiative should look like.

Monday 20 January 2014

The importance of backup verification

I have recently moved to HP's Advanced Technology Group, a new group within HP, and as part of that I will be blogging a lot more about the Open Source things I and others in HP work on day to day.  I thought I would kick this off by talking about something a colleague of mine, Patrick Crews, worked on several months ago.

For those who don't know Patrick, he is a great Devops Engineer and QA.  He will find new automated ways of breaking things that will torture applications (and the Engineers who write them). I don't know if I am proud or ashamed to say he has found many bugs in code that I have written by doing the software equivalent of beating it with a sledgehammer.

Every Devops Engineer worth his salt knows that backups are important, but one thing that is regularly forgotten is checking whether the backups are actually good.  A colleague of mine from several years back, Ian Anderson, once told me about his hunt for a good tape archive vendor.  He tested them by getting them to retrieve a randomly selected tape from the archives and read it back, timing how long it took.  You would be surprised how many vendors couldn't perform this task; I'd hate to see what would happen in a real emergency.

There was also the case of CouchSurfing, which was crippled back in 2006 when, after a massive failure, they found their backups were bad.  The site was eventually rebuilt and is doing great today, but this kind of damage can cost even a small company many thousands of dollars.

The main thing I am trying to stress here is that it is important not just to make backups but also to make sure the backups are recoverable.  This should be done with verification and even fire drills.  There may come a time when you really need that backup in an emergency, and if it isn't there, well, you've just burnt your house down.

Before I was a member of the Advanced Technology Group I was the Technical Lead for the Load Balancer as a Service project for HP Cloud.  We had a very small team and needed a backup and verification solution that was mostly automated and reliable.  This thing would be hooked up to our paging system, and nobody on my team likes being woken at 2AM :)

The solution Patrick developed works as follows:
  1. Every X minutes Jenkins tells the database servers to make a backup
  2. The MySQL database servers encrypt that backup and push it up to the cloud file storage (Openstack Swift)
  3. Jenkins then triggers a compute instance (Openstack Nova) build and installs MySQL on it
  4. The new virtual machine grabs the backup, decrypts it, restores it and runs a bunch of tests on the data to check it is valid (a sketch of this kind of check follows below)
  5. If any of the above steps fail, a page is sent out
Most of the above uses Salt to communicate across the machines.  Have we ever been paged by this system?  Yes, but so far only because step 3 failed: either due to a Nova build failure or, on one occasion, a Salt version incompatibility.  We have since added some resilience to the system to stop this happening again.
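
To give a flavour of step 4, below is a hedged sketch of the kind of sanity check the new virtual machine could run against the restored data.  The table and column names, the credentials and the use of PyMySQL are purely illustrative assumptions; the real tests Patrick wrote were far more thorough.

# Run basic validity checks against the freshly restored MySQL instance.
# A non-zero exit code makes the surrounding Jenkins job fail and page us.
import sys
import pymysql

conn = pymysql.connect(host="127.0.0.1", user="verify",
                       password="secret", database="lbaas")
try:
    with conn.cursor() as cur:
        # The restored data should contain at least one load balancer record.
        cur.execute("SELECT COUNT(*) FROM loadbalancers")
        (count,) = cur.fetchone()
        if count == 0:
            print("backup verification failed: no load balancer rows")
            sys.exit(1)

        # Spot-check referential integrity between two related tables.
        cur.execute("SELECT COUNT(*) FROM nodes n "
                    "LEFT JOIN loadbalancers lb ON n.lbid = lb.id "
                    "WHERE lb.id IS NULL")
        (orphans,) = cur.fetchone()
        if orphans > 0:
            print("backup verification failed: %d orphaned nodes" % orphans)
            sys.exit(1)
finally:
    conn.close()

print("backup verification passed")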

As well as the above, we run monthly fire drills to manually test that we can restore from these backups.  We also regularly review the testing procedures to see if we can improve them.

This is going to sound a little strange, but sometimes the best Devops Engineers are lazy.  By that I don't mean they don't do any work (they are some of the hardest-working people I know), but they automate everything so they don't have to do a lot of boring manual labour, and they hate being woken by pagers at silly hours of the morning.  The best Devops Engineers I know think in exactly these terms :)