Docker and Kubernetes are two of the hottest technologies in the world of software. Most software architectures use them, or are at least considering them. A question often asked is: Docker vs Kubernetes – which is better? Which one should we be using? As it turns out, this question misrepresents the two. These technologies don’t actually do the same thing! They do complement each other nicely, however. In this post, we will explore the “Docker vs Kubernetes” question. We will dig into the backgrounds and details of both, and show how they differ. With this information, you can better decide how Docker and Kubernetes fit into your architecture. First, some background…
How Did We Get Here?
Before diving into the topic, let’s walk through a brief history of how we got here.
In the Beginning…
In the REALLY early days of computing (like, the 1960s), there was time sharing on mainframes. On the surface, this looked nothing like its modern day counterparts. A room full of big iron, and perhaps a primitive text-based terminal. Lots of little lights. Very limited functionality. Yet, the concept is the same – one machine serving many users at once, each isolated from the others. While not practical for today’s needs, this technology planted the seed for the future. Around the 1980s and 1990s, computer workstations began to grow in prominence. Computers no longer required a room full of mainframe hardware. Instead, a server could fit on your desk. One in every home! In the software industry, these workstations became the main workhorses of web serving. This didn’t scale well to a large number of users and services, due to the expensive hardware. For most users, a beefy workstation offered far more capacity than one person required.
Virtual Machines (VM) offered a solution to this problem. Full Virtualization allowed one physical server to host several “VM instances”. Each instance featured its own copy of the Operating System. This allowed “machines” to be rapidly created and deployed. Instead of deploying a physical server each time you needed a computer, a VM could take its place. These VMs were usually not as powerful as a full workstation, but they didn’t need to be. This advance made it much easier to add new machines to a computing environment. It was inefficient and costly though. Each VM instance required a full operating system. Lots of duplicate code and processes would run on a single VM server. Lots of OS licenses needed purchasing. The industry kept working on better alternatives.
Containers (also known as Operating-System-Level Virtualization) provide a solution to this waste. A single container environment provides the “core” Operating System processes. Each container running in this environment is an isolated “user-space” instance. In other words, the instances share the common functionality (file system, networking, etc.). This eliminates the duplicate OS-level processes. As a result, a single physical server can support a much larger volume of containers. Additionally, the cloud computing landscape lends itself very well to container architecture. Customers generally don’t want (or need) to worry about individual machines. It’s all “in the cloud.” Developers can code, test, and deploy containers to the cloud, never worrying about the hardware they run on. Containers have exploded in popularity with the growth of cloud computing.
Docker (both the company and the product) is a big name in containerization. Docker began as an internal project at dotCloud, a Platform as a Service company. It soon outgrew its creator and debuted to the public in 2013. It is an open source project, and has rapidly become a leader in the Container space. “Google” is synonymous with “Search” – you might say, “google it.” The same has almost become true for Docker: “use Docker” has come to mean “use containers.” Docker is available on all major cloud platforms, with rapid growth since its release. Here are some key concepts from the world of Docker:
- Image – the Docker Image is the file that holds everything necessary to run a Container. This includes:
- the actual application code
- a run-time environment, with all the OS services the application needs.
- any libraries needed for your application
- environment variables and config files, such as connection strings and other settings.
- Container – a Container is a “copy” of an Image, either running or ready to run in Docker. More than one Container can be created from the same Image.
- Networking – Docker allows different Containers to speak to each other (and the outside world). The code running in the Container isn’t “aware” that it’s running within Docker. It simply makes network requests (REST, etc), and Docker routes the calls.
- Volumes – Docker offers Volumes to allow for shared storage between Containers.
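To make the Image concept concrete, here is a minimal Dockerfile – the recipe Docker uses to build an Image. The base image, file names, and settings below are illustrative assumptions, not taken from any real project:

```dockerfile
# A run-time environment with the OS services the application needs
FROM python:3.11-slim

# The actual application code (app.py is a hypothetical example)
COPY app.py /app/app.py

# Any libraries the application needs
RUN pip install flask

# Environment variables and config, such as a connection string (placeholder value)
ENV DB_CONNECTION="postgres://db.example.internal:5432/appdb"

# The command to run when a Container is started from this Image
CMD ["python", "/app/app.py"]
```

Building this file with `docker build` produces an Image; every Container started from it carries an identical copy of this whole stack.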
The Docker “ecosystem” consists of a few main software components:
Docker’s main platform is the Docker Engine. It is the software that hosts and runs the Containers. It runs on the physical host machine, and is the “sandbox” all the containers will live within. The Docker Engine consists of the following components:
- The Server, or Daemon – the Daemon is the “brains” of the whole operation. This is the main process that manages all the other Docker pieces. Those pieces include Images, Containers, Networks, and Volumes.
- REST API – The REST API allows programs to communicate with the Daemon for all their needs. This includes adding/removing Images, stopping/starting Containers, adjusting configuration, etc.
- Command Line Interface (CLI) – allows command line interaction with the Docker Daemon. This is how end users interact with Docker. It uses the Docker REST API under the covers.
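As a sketch of how these pieces fit together, the commands below travel from the CLI, through the REST API, to the Daemon, which does the actual work (the container name is a hypothetical example):

```shell
# Pull an Image from a registry (the Daemon performs the download)
docker pull nginx

# Start a Container from that Image; the CLI issues a REST call
# that the Daemon services
docker run -d --name web-1 nginx

# List running Containers, then stop and remove ours
docker ps
docker stop web-1
docker rm web-1
```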
Docker Hub is an enormous online library containing vast quantities of “pre-made” images. It is like GitHub, except instead of hosting Git repositories, it hosts Docker images. For almost any software need, there is an image on Docker Hub that provides it. For example, you might need:
- a Rails environment for web services
- connected with a MySQL database
- with Redis available for caching.
Docker Hub contains “Official” images for these types of things. “Pull” the required images to your local environment, and use them to build Containers. Complex, production-ready environments can be ready within minutes. Companies can also pay for private repositories to host their internal Docker images. Docker Hub offers a centralized location to track and share images. Version history, tagging, and so on. Like GitHub, except for Docker.
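As an illustration, the stack above could be assembled from Official Docker Hub images with a Compose file along these lines (the image choices and the password are placeholder assumptions):

```yaml
version: "3"
services:
  web:
    image: rails          # Rails environment, pulled from Docker Hub
    ports:
      - "3000:3000"
  db:
    image: mysql          # Official MySQL image
    environment:
      MYSQL_ROOT_PASSWORD: change-me   # placeholder value
  cache:
    image: redis          # Official Redis image for caching
```

A single `docker-compose up` pulls all three images and starts the Containers together.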
Docker Swarm is Docker’s open source Container Orchestration platform. Container Orchestration becomes important in large-scale deployments – environments with tens, hundreds, or thousands of Containers. At that volume, manually tracking and deploying Containers becomes cost prohibitive. An Orchestration platform provides a “command center” that monitors and deploys all the various Containers in an environment. Docker Swarm provides some of the same functionality as Kubernetes. It is simpler and less powerful, but easier to get started with. It uses the same CLI as Docker, making its usage familiar to a typical Docker user. We’ll get more into Container Orchestration below.
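For a taste of the Swarm CLI, a hypothetical three-replica web service might be run like this (the service name and image are illustrative):

```shell
# Turn the current Docker Engine into a Swarm manager
docker swarm init

# Ask Swarm to keep 3 Containers of this service running
docker service create --name web --replicas 3 nginx

# Scale up later with a single command
docker service scale web=5
```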
While Docker is the industry leader, there are alternatives. These include:
- CoreOS’s rkt (Rocket) – the “pod-native” container engine. Developed by CoreOS, a team deeply involved in the Kubernetes ecosystem, it is a direct competitor to Docker.
- Cloud Foundry – adds a layer of abstraction on top of Containers. Allows you to provide the application, and not worry about the layering beneath. With this service, you’re not really focused on the Container layer.
- DigitalOcean – a cloud provider best known for its “droplets”, which are actually virtual machines rather than containers. Like Cloud Foundry, it abstracts away some complexity, and its control panel also offers managed Kubernetes options.
- “Serverless” services – major cloud providers like AWS and Azure offer “serverless” services. These allow companies to create simple webservices on the fly. No hardware, or hardware virtualization. No worries about the underlying platform. Not technically Containers, but offer support for many of the same use cases.
Kubernetes is the industry leader in Container Orchestration. First, here’s an overview of what that is…
Containers are a very powerful tool, but in large environments, they can get out of hand. Different deployment schedules into different environment types. Tracking uptime, and knowing when things fall down. Networking spaghetti. Capacity planning. Tracking all that complexity requires more tools. As the technology has matured, Container Orchestration platforms have grown in importance. These orchestration engines offer some of the following benefits:
- “Dashboard” for all the Containers. One place to watch and manage them all.
- Automatic provisioning and deployment. Rather than individually spinning up Containers, the orchestration engine manages them for you. Push a button, adjust a value, and more Containers spring to life.
- Redundancy – if a Container fails in the wild, an orchestration engine will notice the failure and put a new one in its place.
- Scaling – as your workload grows, you may outgrow what you have. An orchestration engine detects capacity shortages. It adds new Containers to spread the load.
- Resource Allocation – under all those Containers, you’re still dealing with real-life computers. Orchestration engines can manage and optimize those physical resources.
While there are several options available, Kubernetes has become the market leader.
Rise of Kubernetes
Kubernetes (Greek for “helmsman” or “pilot”) began at Google in 2014. It was heavily influenced by Google’s internal “Borg” system, the tool Google used to manage all its own environments. Google open-sourced Kubernetes in 2014 and released version 1.0 in 2015. It has since grown to become one of the largest open source projects on the planet. All the major cloud providers offer Kubernetes solutions. Kubernetes is now the de facto Container Orchestration platform.
At a very high level, Kubernetes helps manage large numbers of Containers. Simple enough, right? At a more granular level, Kubernetes consists of a Cluster managing lots of Nodes. It has one Master Node, and one-to-many Worker Nodes. These Nodes use Pods to deploy Containers to environments. As requirements scale, Kubernetes can deploy more Containers, Pods, and Nodes. Kubernetes tracks all of the above, adding and removing as needed. Here’s a closer look at the concepts described above:
- Cluster – A Cluster is an instance of a Kubernetes environment. It has a Master node and several Worker nodes.
- Node – A Kubernetes Node is a process that runs on a server (physical or virtual). A node is either a Master node, or a Worker node. Together, Master and Workers manage all the distributed resources, both physical and virtual.
- Master – the Master node is the control center for Kubernetes. It hosts an API server exposing a REST interface used to communicate with Worker nodes. The Master runs the Scheduler, which creates Containers on the various Worker Nodes. It contains the Controller Manager, which manages the current state of the cluster. If the cluster doesn’t match the desired state, the Controller Manager corrects it. For example, if Containers fail, it creates new Containers to take their place.
- Worker – the Worker Nodes carry out the wishes of the Master Node. This includes starting Containers, and reporting back their status. As an environment needs to scale to more machines, Kubernetes adds more Worker Nodes.
- Pods – A Pod is the smallest deployable unit in the Kubernetes object model. It consists of one or more Containers, storage resources, networking glue, and configuration. Kubernetes deploys Pods to Nodes. Docker is the main Container technology Kubernetes uses, but others are available.
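Tying these concepts together, here is a minimal Deployment manifest of the kind you might hand to the Master: it declares a desired state (three replicas of a Pod), and the Controller Manager and Scheduler place the Pods on Worker Nodes. The names and image below are illustrative assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                # hypothetical name
spec:
  replicas: 3              # desired state: keep 3 Pods running
  selector:
    matchLabels:
      app: web
  template:                # the Pod template
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx     # the Container image each Pod runs
```

Applying this with `kubectl apply -f deployment.yaml` hands the desired state to the Master; if a Pod dies, Kubernetes notices the mismatch and replaces it automatically.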
While Kubernetes is the front runner, there are alternative options for Container Orchestration. These include:
- Docker Swarm – already mentioned above, this is Docker’s Container Orchestration offering. This has the advantage of coming from the same team that maintains Docker. It is also considered easier to use by some, and faster to get started. Additionally, Swarm uses the same CLI as Docker. This makes it easy to use for those already familiar with Docker.
- Nomad – a lightweight orchestration platform. It doesn’t feature all the bells and whistles of more advanced systems, but it is simpler, which may appeal to some.
With all this solid background in place, we are better poised to make a decision: how should we containerize everything? For starters, Docker is a must. While alternatives exist, Docker is the clear front runner. It has become the industry standard, and features extensive tooling and documentation. It is open source, and free to get started with. You can’t go wrong using Docker as your container technology. Once things get big enough to need orchestration, you must make a decision. The two best choices seem to be:
- Docker Swarm – an easy stepping stone from simple Docker, Swarm is worth exploring first. Using the same CLI, you can grow your Docker environment to multiple Containers on several Machines. If you are able to manage everything this way, you might just stop there.
- Kubernetes – if Swarm doesn’t seem up to the task, it’s probably worth the leap to Kubernetes. As the leader in the orchestration space, it offers the same documentation and support advantages. It will grow as big as you need it to, and supports the complications that arise in large-scale systems.
If your organization is looking to use Containers in the Cloud, Iron.io can help you get there. Iron.io supports Docker, Kubernetes, and other alternatives. Iron.io’s expert staff will help you intelligently scale your business on any of the major cloud platforms. Iron.io is trusted by brands such as Zenefits, Google, and Untappd. Allow them to help your business containerize in the cloud!