A Closer Look at Docker

Part of my job at Lincoln Loop is to explore new technologies and make sure that, as a company, we stay current. I’ve been playing around with Docker for a while, but recently decided to redeploy some of our production services with Docker to really wrap my head around it. Here’s my takeaway from that experience:

The Bad

Docker shines in some places, but first let’s take a look at where it doesn’t.

Logging

I configured my containers to log to stdout, deferring log management to the Docker daemon (similar to what I would do with a process manager like upstart). You can access the logs via docker logs <container>. The first problem I hit was that, as the logs grow, tailing them becomes very slow because it scrolls through all the log data to get to the last few lines.
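
For comparison, here is a minimal sketch of how a tail can avoid scanning from the top when the logs live in a plain file on the host. This is not how docker logs works internally; the function name tail_lines and the block size are my own choices for illustration.

```python
import os


def tail_lines(path, count=10, block_size=4096):
    """Return the last `count` lines of a file without reading it from the start."""
    if count <= 0:
        return []
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Walk backwards in fixed-size blocks until more than `count`
        # newlines are buffered, so the first kept line is complete.
        while position > 0 and data.count(b"\n") <= count:
            step = min(block_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    lines = data.splitlines()[-count:]
    return [line.decode("utf-8", "replace") for line in lines]
```

The cost depends on the size of the lines requested rather than the size of the whole log, which is the property I was missing.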

I set up Heka’s DockerLogInput to pick up the logs and ship them to Elasticsearch/Kibana, but that doesn’t make local debugging any easier. The other option is to write the logs to a file on the host or in a data container. Something then needs to rotate those logs to prevent filling the disk, which means more things tied to the host or a different container.
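
One way to sidestep the external rotation, assuming the process in the container is a Python application that writes to a mounted volume, is to let the application rotate its own file with the standard library. This is only a sketch; the logger name and size limits are placeholders, and it gives up the simplicity of logging to stdout.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(path, max_bytes=10 * 1024 * 1024, backups=5):
    """Return a logger that caps disk usage at roughly max_bytes * (backups + 1)."""
    logger = logging.getLogger("app")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # When the file would exceed max_bytes it is renamed to path.1, path.2, ...
    # and the oldest backup beyond `backups` is deleted.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```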

This exercise was indicative of my experience as a whole. The initial setup was straightforward, but when you get into the details, you either end up with a tangled web of interconnected containers or leaking your container info/processes into the host system.

Debugging

As of Docker 1.3 it’s easy to get shell access on a running container with docker exec. It’s a big step forward for debugging, but I still found it painful. If you follow the common practice of making your containers as small as possible, you don’t have much to work with. I found myself needing to install common tools like curl, vim, or telnet before I could even get started debugging. Once you rebuild the container, you have to jump through all those hoops again (or sacrifice your small builds).

Stopping, Starting, and Reloading

Managing processes running in containers is not like managing services. Your container typically has a large set of arguments passed into it at runtime (volumes to mount, ports to expose, environment variables, etc.). If you want to stop a container briefly, you need to do this magic incantation again to start it up. Alternatively, you can use an external process manager like upstart to manage your containers. Again, this is spilling information about your containers onto the host system and reducing the benefits Docker intends to provide.
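
To give a feel for that incantation, here is a small sketch that assembles a docker run command from a description of the container. The function and its parameters are hypothetical; only the docker run flags themselves (-d, --name, -p, -v, -e) are real.

```python
def docker_run_command(image, name, ports=None, volumes=None, env=None):
    """Build the argument list for starting a detached, named container."""
    command = ["docker", "run", "-d", "--name", name]
    # Sort each mapping so the generated command is stable between runs.
    for host_port, container_port in sorted((ports or {}).items()):
        command += ["-p", f"{host_port}:{container_port}"]
    for host_path, container_path in sorted((volumes or {}).items()):
        command += ["-v", f"{host_path}:{container_path}"]
    for key, value in sorted((env or {}).items()):
        command += ["-e", f"{key}={value}"]
    command.append(image)
    return command
```

Every one of those arguments has to live somewhere outside the container, which is exactly the information that ends up in upstart jobs or configuration management.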

I used our configuration management tool (Salt) to handle spinning containers up and gluing them all together. During the initial setup and troubleshooting, this was slow, tedious, and error-prone.

Hot Code Updates

Pre-containers, a typical deploy involved updating code/configuration on our servers and then reloading the process serving it (uWSGI, Nginx, etc.). It is trivial to do this on-the-fly without dropping incoming connections. With Docker, you build a container then push/pull the container to your servers. The easy way to deploy a new container is to stop the old container and start the new one, but you’ll drop all traffic for a few seconds in the process. Instead, you need additional machinery to route traffic to the new container on a dynamic port. Building this out is certainly doable, but non-trivial at the moment.
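
One small piece of that machinery is finding out which host port Docker assigned to the new container. The docker port command prints the mapping as text such as 0.0.0.0:49153; the sketch below only parses that output. The function name is hypothetical, and feeding the result into a proxy configuration is left out.

```python
def parse_published_port(output):
    """Turn `docker port <container> <private_port>` output into (host, port)."""
    lines = output.strip().splitlines()
    if not lines:
        raise ValueError("no published port found")
    # Split on the last colon so IPv6-style hosts keep their own colons.
    host, _, port = lines[0].rpartition(":")
    return host, int(port)
```

Something still has to start the new container, wait until it answers, repoint the proxy, and stop the old container, which is where the non-trivial part lies.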

Building Images

Building images is slow. I used the Docker Hub and after committing a change, there would be a few minutes of waiting in the queue, building the image, then uploading it to the repo before I could deploy it. There are probably faster options, but it’s either another service I need to set up and manage in-house or an external commercial service.

Deployment

The concept that you can build an image, test it, then deploy it in that exact state is great. It greatly reduces the chance a failed build lands on your servers. Right now, that’s more concept than reality. You either need to build that pipeline yourself, pay a third party for it, or depend on an immature open source project to fill the gap. I’m not excited about having any of those as an extra layer in our stack.

Supporting Services

Docker is a new project and people are still figuring out how to plug everything together. Lots of competing services are being built to make things easier (consul, etcd, kubernetes, confd, etc.), but there isn’t a clear winner and they are still in their infancy. Using these as the foundation of our production infrastructure is risky at best.

My general feelings after a day or two were:

I wasn’t reducing complexity, but instead pushing it around. Worse, I was pushing it from mature stable systems to ones that were on the bleeding edge.
I wasn’t making our stack simpler. I would need to add more services/tooling to have similar functionality.
When things weren’t working, my job got harder. The additional layers only made it more challenging to find and diagnose problems.

But it wasn’t all bad…

The Good

Docker really is revolutionary in many ways. Here are a few places I think it shines:

Supporting Services for Local Development

It’s not uncommon for a website to require half a dozen backing services (or more) in production. Getting a new developer set up with a database, a cache, a search engine, a queue, etc. can be a slow and painful process. Using docker-compose (née fig) makes that trivial. It’s a huge win for getting your developers on a homogeneous environment quickly.

Isolated Test Environments

I’m setting up all our testing with docker-compose on Jenkins. We test multiple projects, some using different versions of the same backing services in production. Instead of trying to install all these services side-by-side, I can quickly spin them up in Docker with the same OS and versions in production and tear them down when I’m done. Tests don’t need to clean up after themselves and can assume they start with a clean slate every time.
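
The spin-up and tear-down cycle can be sketched as a context manager around docker-compose. This is an illustration rather than our actual Jenkins setup; the function name, the compose file name, and the project name in the usage note are assumptions, and the run parameter exists only so the commands can be swapped out.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def backing_services(compose_file, project, run=subprocess.check_call):
    """Start the services for one test run and always tear them down."""
    base = ["docker-compose", "-f", compose_file, "-p", project]
    try:
        run(base + ["up", "-d"])
        yield
    finally:
        # `down -v` also removes volumes, so the next run starts clean.
        run(base + ["down", "-v"])
```

A test job would wrap its test command in something like with backing_services("docker-compose.test.yml", "build_42"): and use a distinct project name per build so parallel jobs do not collide.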

Generating Builds

I’ve been very interested in Python wheels lately. Unfortunately the huge variety of libraries across different Linux distributions makes it hard to distribute pre-compiled binaries via PyPI. With Docker, I can build wheels that match my target architecture easily. Pre-generated wheels (hosted on your own PyPI) make it possible to install complex Python projects without the need for build tools on the server.
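
As a rough sketch of that workflow, the function below assembles a docker run command that builds wheels inside a container and writes them to a directory on the host. The function name, the mount points, and the default image are placeholders; in practice the image should match the OS of the servers the wheels are meant for.

```python
def wheel_build_command(source_dir, wheel_dir, image="python:3"):
    """Build the command that compiles wheels for source_dir inside a container."""
    return [
        "docker", "run", "--rm",
        # Mount the project and an output directory from the host.
        "-v", f"{source_dir}:/src",
        "-v", f"{wheel_dir}:/wheels",
        image,
        # `pip wheel` builds the project and its dependencies into /wheels.
        "pip", "wheel", "/src", "--wheel-dir", "/wheels",
    ]
```

The resulting files in wheel_dir can then be uploaded to a private index and installed on servers that have no compiler.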

Conclusion

You can probably guess that I’m not jumping to deploy more production services in Docker. In fact, I’ve already rolled back a few that I set up to a more “traditional” setup. I’m sure there are certain production environments where the benefits of Docker shine, but for our purposes, it’s not ready yet. In the meantime, I’ll stick to the boring technology.