Docker Solved a Key Problem
Since the inception of our service, we have been using a single container holding a set of language environments and binary packages: Ruby, Python, PHP, Java, .NET, and the other languages we support, as well as code libraries such as ImageMagick, SoX, and others.
This container (and the strategy behind it) was showing its age, with Ruby 1.9.1, Node 0.8, Mono 2, and other older language versions in the default stack. The problem only got worse over time: as people adopted newer language versions, they were forced to change their worker code to work with the older ones we provided.
Limited to a Single LXC Container...
IronWorker uses LXC containers to isolate resources and provide secure run-time task environments. LXC works great as a run-time component but was falling short when it came to creating and loading environments within the runners we use to process tasks. We were at an impasse when it came to creating the runtime environments. On the one hand, we couldn't just update versions in the existing container or else we'd risk breaking a fair number of the million-plus tasks that run each day. (We tried that once way back at the onset of the service and it wasn't pretty.)
On the other hand, we couldn't keep different LXC containers around with various language versions, as each contained a full copy of the operating system and libraries (which means they would clock in at ~2 GB per image). That might actually work fine in a typical PaaS environment like Heroku, where processes run indefinitely and you can fetch the right container once before starting the process. In that situation, you could even have large custom images for each customer without much worry, but IronWorker is a very different animal.
IronWorker is a large shared multi-tenant task processing system where users queue up jobs and those jobs run across thousands of processors. It can be used to offload tasks from the main response loop to run in the background, continually process transactions and event streams, run scheduled jobs, or perform concurrent processing across a large number of cores. The benefit is users get on-demand processing and very large concurrency without lifting a finger.
Under the hood, the service works by taking a task from a set of queues, installing a run-time environment within a particular VM, downloading the task code, and then running the process. The nature of the service means that all machines are used by all customers, all the time. We don’t devote a machine to particular applications or accounts for an extended period of time. The jobs are typically short-lived, some running for just a few seconds or minutes, with a maximum timeout of 60 minutes.
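To make that flow concrete, here is a minimal Python sketch of the loop described above: take a task off a queue, set up a fresh environment, fetch the code, and run it under a timeout. It is an illustration only, not how IronWorker is actually implemented. The names `Task`, `run_task`, and `drain` are hypothetical, a temporary directory stands in for the run-time environment, and writing a script to disk stands in for downloading the task code.

```python
import queue
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path

MAX_TASK_SECONDS = 60 * 60  # the 60-minute ceiling mentioned in the text


@dataclass
class Task:
    """Hypothetical task record; `code` stands in for the downloaded task code."""
    task_id: str
    code: str
    timeout: float = MAX_TASK_SECONDS


def run_task(task: Task) -> str:
    """Run one task in a throwaway directory and return its final status."""
    # Never let a task ask for more than the platform-wide maximum.
    timeout = min(task.timeout, MAX_TASK_SECONDS)
    with tempfile.TemporaryDirectory() as workdir:
        # Stand-in for "install the environment, download the code":
        # write the task's script into a fresh, empty directory.
        script = Path(workdir) / "task.py"
        script.write_text(task.code)
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,
                timeout=timeout,
                capture_output=True,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return "timeout"
    # The directory is removed here, so nothing leaks into the next task.
    return "done" if result.returncode == 0 else "error"


def drain(tasks: queue.Queue) -> dict:
    """Process every queued task in order and collect a status per task id."""
    statuses = {}
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return statuses
        statuses[task.task_id] = run_task(task)
```

Because every task gets a brand-new environment and machines are shared across all customers, the cost of preparing that environment is paid on each run. That is why image size and load time matter so much more here than in a long-running PaaS process.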
LXC did the job to a point, but we kept asking ourselves: how can we update or add to our existing container while keeping things backwards compatible and without using an insane amount of disk space? Our options seemed pretty limited, so we kept putting off a decision.
... And Then Came Docker
We had heard about Docker over a year ago. We help organize the GoSF meetup group and Solomon Hykes, the creator of Docker, came to a meetup in March 2013 and gave a demo of his new project, Docker, which happens to be written in Go. In fact, he released it to the public that day, so it was the first time anyone had really seen it.
The demo was great and the 100+ developers in the audience were impressed with what he and his team had built. (And in the same stroke, as evidenced by one of his comment threads, Solomon started a new development methodology called Shame Driven Development.)
Alas, it was too early back then – we’re talking Day 1 early – so the project wasn’t exactly production ready but it did get the juices flowing.
Solomon Hykes and Travis Reeder hacking at the OpenStack Summit in 2013.
I started playing with Docker, and Solomon helped me wrap my head around what it can do and how it works. You could tell right off that it was not only a cool project but was also addressing a difficult problem in a well-designed way. It didn't hurt, from my point of view at least, that it was new, written in Go, and didn't have a huge amount of technical debt.