Autoscaling on AWS Without Bundling AMIs

At Zooniverse HQ we typically run between 15 and 20 AWS instances to support our current set of projects. It's a mix of fairly vanilla Apache/Rails webservers, AWS RDS MySQL instances, a couple of LAMP machines and a bit of MongoDB goodness for kicks. Over the past year we've migrated pretty much all of our infrastructure to make full use of the AWS toolset, including elastic load balancing and auto-scaling for our web and API tiers, SQS for asynchronous processing of requests and RDS for our database layer.

Overall I'm pretty happy with the setup, but one pain point in the deployment process has always been the bundling of AMIs for auto-scaling. I've described before the basic configuration required when setting up an auto-scaling group - the step that always takes the most time is saving out a machine image with the latest version of the production code so that when the group scales, the machine is ready to go. The problem is that for a typical deploy the changes to the code are minimal and really don't require a new machine image to be built; rather, we just need to be sure that when the machine boots it's serving the latest version of the application code.

Over the past few months I've considered a number of different options for streamlining this process, the best of which was an automated checkout of the latest code from version control when the machine boots. This is all very well, but we host our code at GitHub. Now don't get me wrong, I love their service, but I really don't want to build a dependency on GitHub into our auto-scaling.

» An alternative?

A couple of weeks ago we made a change to the way that we deploy our applications and I can honestly say it's been a revolution. The basic flow is this:

  1. Work on new feature, finish up and commit (don't forget to run your tests)
  2. Push code up to GitHub
  3. Git 'archive' the repo locally, then tar and gzip it up
  4. Push the git archive up to S3
  5. Reboot each of the production nodes in turn
  6. Done!
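Steps 3 and 4 above can be sketched in a few lines of Ruby. To be clear, the app name, bundle path and bucket below are hypothetical, and I'm assuming `s3cmd` is installed and configured on the deploy machine:

```ruby
#!/usr/bin/env ruby
# Sketch of steps 3 and 4: archive the repo and push the bundle to S3.
# APP, BUNDLE and BUCKET are hypothetical names, not the real setup.

APP    = "oldweather"
BUNDLE = "/tmp/#{APP}.tar.gz"
BUCKET = "my-deploy-bundles"

# Where the bundle ends up on S3.
def s3_destination(bucket, app)
  "s3://#{bucket}/#{app}.tar.gz"
end

# `git archive` gives a clean tarball of HEAD with no .git directory.
def bundle_repo(bundle_path)
  system("git archive --format=tar HEAD | gzip > #{bundle_path}") or
    raise "git archive failed"
end

def upload_bundle(bundle_path, bucket, app)
  system("s3cmd put #{bundle_path} #{s3_destination(bucket, app)}") or
    raise "upload to S3 failed"
end

# On the deploy machine:
#   bundle_repo(BUNDLE)
#   upload_bundle(BUNDLE, BUCKET, APP)
```

Step 5 is then just a rolling restart of the production nodes, for example via the EC2 console or API.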

» Eh?

So that might not look like a big deal, but the secret sauce is what happens when the machine reboots. There's a simple script that executes on boot to pull down the latest bundle of the production code from S3, put it in the correct place and voila, you're running the latest version of the code. We use Capistrano for on-the-fly deployment and so it's important that this script doesn't get in the way of that - upon downloading a new bundle of the code the script timestamps the new 'release' and symlinks the config files and 'current' directory. That way, if we need to, we can still cap deploy a hotfix to a running server.
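A minimal sketch of such a boot script, assuming Capistrano-style `releases`/`current`/`shared` directories (the app root, bucket and config file names here are my assumptions, not the actual setup):

```ruby
#!/usr/bin/env ruby
# Sketch of the boot-time script: fetch the latest bundle from S3 and
# activate it as a new Capistrano-style release. Paths are hypothetical.
require "fileutils"

APP_ROOT = "/var/www/oldweather"
BUCKET   = "my-deploy-bundles"

# Timestamped release directory, matching Capistrano's naming, so a
# normal `cap deploy` can still run against this machine later.
def release_dir(app_root, at = Time.now)
  File.join(app_root, "releases", at.strftime("%Y%m%d%H%M%S"))
end

def fetch_and_activate(app_root, bucket)
  release = release_dir(app_root)
  FileUtils.mkdir_p(release)

  # Pull the latest bundle down from S3 and unpack it into the release.
  system("s3cmd get s3://#{bucket}/oldweather.tar.gz /tmp/bundle.tar.gz") or
    raise "download from S3 failed"
  system("tar -xzf /tmp/bundle.tar.gz -C #{release}") or
    raise "unpack failed"

  # Link in shared config, then point `current` at the new release.
  FileUtils.ln_sf(File.join(app_root, "shared/config/database.yml"),
                  File.join(release, "config/database.yml"))
  FileUtils.rm_f(File.join(app_root, "current"))
  FileUtils.ln_s(release, File.join(app_root, "current"))
end

# Called from an init script at boot:
#   fetch_and_activate(APP_ROOT, BUCKET)
```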

» Show me the code already!

The Capistrano task used here is super-simple and can, I'm sure, be further improved. Below is an example for our latest project, Old Weather.
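A task along these lines would do the job - note that the task name, bucket and the use of `s3cmd` are my assumptions rather than the actual Old Weather recipe, and this sketch assumes Capistrano 2:

```ruby
# config/deploy.rb - sketch of a bundle-and-upload task (names hypothetical).

set :application, "oldweather"
set :s3_bucket,   "my-deploy-bundles"

namespace :deploy do
  desc "Archive HEAD and push the bundle to S3 for booting nodes"
  task :bundle_to_s3 do
    bundle = "/tmp/#{application}.tar.gz"
    # These run on the deploy machine, not the remote servers.
    run_locally "git archive --format=tar HEAD | gzip > #{bundle}"
    run_locally "s3cmd put #{bundle} s3://#{s3_bucket}/#{application}.tar.gz"
  end
end

# Push a fresh bundle after every deploy.
after "deploy", "deploy:bundle_to_s3"
```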

» A small but significant change

There's nothing like showing someone new your current deployment process to make you realise where the pain points are. Credit has to go to Zooniverse developer Robert Simpson for patiently watching me run through the old process and kicking me to make it easier.