Sunday, May 21, 2017

Good Reasons to Dockerize Builds

The Problem

On my current team, we use a common tool stack of Jenkins, Maven, Robot Framework, and Java. Our build system is centralized and serves a couple dozen projects. It has become a source of friction for both builds and developers, namely:
  • Our version of Jenkins is older than the Internet and cannot be easily upgraded
  • Upgrading a plugin/tool risks incompatibility with other plugins/tools
  • We have accumulated snippets of shell scripts embedded in POM files
  • Developers sometimes use our continuous integration (CI) to compile and test code
  • We sometimes have trouble with local builds passing and CI builds failing
These and more are the Usual Suspects on almost any build system that has grown over time. Having been on this team for around two years now, I have identified a direction we should take to modernize our build system.

A Solution

Docker and containerized builds have been around since at least 2013, and there are some compelling reasons to consider using them. This is my attempt to clearly enumerate the advantages and to convince my chain of command to make the jump. Hopefully this will help readers do the same.

Moving to containers means we can deploy onto a cloud with minimal work. This can address scaling issues effectively. Note that some builds will still depend on lab access to local devices, and these dependencies will not scale.

Containerizing the build pipeline means easier upgrading. For example, running a component in its own container isolates it so other containers that depend on it are forced through a standard, explicit interface.

Containerizing the build means better partitioning. Instead of making environments that contain all possible tools and libraries, a container can include only those needed for its specific purpose. This has the side effect of reducing defects caused by interacting third-party components.

Containerizing the build means a clean reset state. Instead of writing reset scripts, the container is discarded and resurrected from a fixed image. This is a phoenix (burn-it-down), immutable view of containers, and it forces all build steps and configuration to be explicit rather than accumulating in snowflake instances.
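As a rough sketch of the discard-and-resurrect idea, the reset can be reduced to two Docker commands. The container name `test-db` and image tag `test-db-image:1.0` are hypothetical placeholders, not names from our build.

```python
from typing import List


def phoenix_reset(container: str, image: str) -> List[List[str]]:
    """Return the commands that replace a container with a fresh copy.

    Both arguments are illustrative; substitute your own names.
    """
    return [
        # Discard the old instance, whether or not it is still running.
        ["docker", "rm", "--force", container],
        # Start a new instance from the fixed image, under the same name.
        ["docker", "run", "--detach", "--name", container, image],
    ]


if __name__ == "__main__":
    for command in phoenix_reset("test-db", "test-db-image:1.0"):
        print(" ".join(command))
```

Each command is a plain argument list, so it can be handed to `subprocess.run` as is. No reset script is involved: whatever state the old container accumulated is simply gone.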

Containerizing means the same image runs on developer machines and on the build system. That consistency should eliminate the but-it-works-on-my-machine problem.
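One way to picture this is a single entry point that both developers and Jenkins call. In the sketch below, the image tag `build-env:1.0` and the Maven goal are assumptions for illustration. The only thing that varies is the source directory, taken from the `WORKSPACE` variable that Jenkins sets for a job, or from the current directory on a laptop.

```python
import os
from typing import List, Mapping

IMAGE = "build-env:1.0"           # hypothetical pinned build image
BUILD = ["mvn", "-B", "verify"]   # assumed build command


def build_command(env: Mapping[str, str], cwd: str) -> List[str]:
    """Return the docker invocation used both locally and on CI."""
    # Jenkins exports WORKSPACE; a local shell usually does not.
    source_dir = env.get("WORKSPACE", cwd)
    return [
        "docker", "run", "--rm",
        "-v", f"{source_dir}:/src",  # mount the source tree
        "-w", "/src",                # build from the source root
        IMAGE,
        *BUILD,
    ]


if __name__ == "__main__":
    print(" ".join(build_command(os.environ, os.getcwd())))
```

Because the image and the build command are fixed in one place, a local run and a CI run differ only in where the source is mounted from.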

Containerizing the build means effective postmortems for build failures, potentially leaving failed runs in the exact state of failure, rather than relying solely on extracted log files.
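A minimal sketch of how that could work, assuming the build container was started with a name and without `--rm`, so it still exists after the build exits. The names are hypothetical, and the snapshot tag is derived from the container name, so the container name is assumed to be lowercase. The image is also assumed to contain `bash`.

```python
from typing import List


def after_build(container: str, exit_code: int) -> List[List[str]]:
    """Return follow-up commands for a finished build container."""
    if exit_code == 0:
        # Nothing to investigate, so throw the container away.
        return [["docker", "rm", container]]
    snapshot = f"{container}-failed"
    return [
        # Freeze the failed container's filesystem as an image.
        ["docker", "commit", container, snapshot],
        # Open an interactive shell inside a copy of the failed state.
        ["docker", "run", "--rm", "-it", snapshot, "bash"],
    ]


if __name__ == "__main__":
    for command in after_build("job-42", 1):
        print(" ".join(command))
```

The snapshot image can also be saved or pushed to a registry so that someone else can inspect the failure later.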

Containerizing the build means building and installing an intermediate, shared artifact once instead of 52 times, which can speed up the build cycle.

Containerizing the build means that some tests can make temporary checkpoints via a temporary image/container and roll back to that state rather than tearing down and rebuilding, affording a quicker build.
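The checkpoint idea might look roughly like this, using `docker commit` to save a prepared container as a temporary image and `docker run` to start fresh copies of it. All container and image names here are made up for illustration.

```python
from typing import List


def checkpoint(container: str, tag: str) -> List[str]:
    """Save the container's current filesystem as a temporary image.

    Note: docker commit does not capture data in mounted volumes
    or the memory of running processes.
    """
    return ["docker", "commit", container, tag]


def restore(tag: str, name: str) -> List[str]:
    """Start a new container from the checkpoint image."""
    return ["docker", "run", "--detach", "--name", name, tag]


if __name__ == "__main__":
    # Prepare the expensive state once, then checkpoint it.
    print(" ".join(checkpoint("suite-setup", "suite-checkpoint:tmp")))
    # Each test group starts from the checkpoint instead of rebuilding.
    print(" ".join(restore("suite-checkpoint:tmp", "suite-run-1")))
```

Rolling back is then a matter of discarding the test container and calling `restore` again, which is usually quicker than repeating the setup.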

Judicious use of containers might help with diagnosing hard-to-reproduce issues in the field. I have seen instances of technical support sending VM images to customers and receiving them back. Containers would be simpler and could be a lot smaller.

Containerizing the build is considered a modern best practice and affords access to all kinds of build workflow and management tools.

That's it! Good luck convincing your management when the time comes to modernize your build systems.