Charting the Waters: Exploring Kubernetes Basics Through a Security Researcher’s Lens
Introduction
Driven by the motivation detailed in my previous post, I’ve set out to deconstruct Kubernetes into its core components. In this article, I’ll outline Kubernetes’ fundamental concepts and share my initial impressions, laying a solid foundation before we move on to a more hands-on approach.
What is Kubernetes?
Kubernetes is a powerful, open-source platform designed to manage and scale containerised applications. To put it simply, it helps you run and control your software without worrying about the specific hardware it’s running on.
Here’s how it works: Kubernetes abstracts away the details of the physical machines (like servers) and offers a consistent environment for your applications. This means you can deploy and scale your apps across various types of infrastructure — be it physical servers, virtual machines, or cloud services — without changing how you manage them.
Think of Kubernetes as a “reverse virtual machine.” Instead of splitting a single server into multiple virtual systems, it combines multiple servers into a unified system for running your applications. After setting it up, you don’t need to manage the hardware directly. If you need more resources, you can simply add more servers to your cluster. Kubernetes handles the underlying complexity and lets you focus on directing what services you need and how they should run, without worrying about the physical infrastructure.
Core Concepts of Kubernetes
So we already know that Kubernetes hides the complexity of physical infrastructure to make our lives easier. The way I see it, Kubernetes is a unified operating system for a distributed computer. Let’s investigate what components it exposes to us, so we can make good use of its ecosystem:
- Namespaces: In Kubernetes, a Namespace is a way to organise and isolate resources within a cluster. It allows you to divide a single cluster into multiple virtual environments, helping with resource management and access control. For example, you might use different namespaces for development, testing, and production to keep their resources separate and manage them more effectively.
- Pods: The smallest and simplest Kubernetes objects. A pod represents a single instance of a running process in your cluster. Pods contain one or more containers, with Docker containers being the obvious example, though in truth Kubernetes works with any runtime that implements its CRI (Container Runtime Interface). Such a container could be, for example, a REST API, a database server, or any other application one might want to run inside the cluster.
- Nodes: These are the physical or virtual machines that make up your cluster. Each node is managed by the control plane and contains the services necessary to run pods, including a container runtime and the kubelet. As mentioned above, nodes help with abstracting the hardware away, so we don’t need to worry about them too much on a day-to-day basis. Still, they are a very important part of the system, and at some point we’ll follow this rabbit hole; after all, we want to REALLY understand this system.
- Services: An abstract way to expose an application running on a set of Pods as a network service. With Kubernetes, you don’t need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them. That means that once we are navigating multi-pod environments, we don’t need to track individual pods: Kubernetes groups them logically for us and load-balances incoming traffic between them, regardless of which underlying machine a pod is running on at any given moment. In summary, a Service in Kubernetes ensures that your application is accessible and scalable by providing a consistent way to reach your pods, regardless of changes or updates to those pods.
- Deployments: Manage the deployment and scaling of a set of Pods, and provide declarative updates to applications. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. Suppose you have a web application you want to run with 3 instances (pods). You create a Deployment specifying that 3 replicas should be running. If one pod crashes, the Deployment will automatically create a new pod to replace it. If you need to update the application to a new version, you can update the Deployment, and it will gradually roll out the new version without downtime. In summary, a Deployment in Kubernetes manages the lifecycle of your pods, ensuring they are deployed, scaled, and updated according to your specifications.
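To make the concepts above concrete, here is a minimal sketch of how they fit together in manifest form. All of the names (the `demo` namespace, the `web-app` labels, the `nginx` image) are hypothetical placeholders I chose for illustration, not anything from a real project:

```yaml
# A Namespace to isolate our resources (hypothetical name).
apiVersion: v1
kind: Namespace
metadata:
  name: demo
---
# A Deployment declaring the desired state: 3 replicas of a web app.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: nginx:1.27   # placeholder image
          ports:
            - containerPort: 80
---
# A Service giving the pods a stable address and DNS name,
# load-balancing traffic across whichever pods match the selector.
apiVersion: v1
kind: Service
metadata:
  name: web-app
  namespace: demo
spec:
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 80
```

Applying a file like this with `kubectl apply -f` only describes the desired state; the controllers and the scheduler then do the rest, including replacing any pod that crashes.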
Summing up the above, we can start to see a pattern: Kubernetes takes all the “hard” configuration, such as managing servers and networks, and abstracts it away, leaving us with just one job: to properly describe the expected state and provide the right containers. Once we do that, we can leave it to work its magic, managing all of the physical infrastructure for us. All of this sounds like a fairytale… Let’s investigate the underlying architecture that makes it all possible.
Kubernetes Architecture
Call me old, but I stopped believing in fairytales a long time ago. Sad as it is, it’s time to learn what complexity behind the scenes makes the magic happen:
1. The Control Plane
The heart of Kubernetes’ operations is the Control Plane. Its primary function is to manage the cluster: it makes global decisions (e.g., scheduling) and detects and responds to cluster events (e.g., starting up a new pod when a Deployment’s replicas field is unsatisfied). Let’s learn about a few core services backing it up:
- kube-apiserver: The API server is the component of the Kubernetes control plane that exposes the Kubernetes API. It is the front-end for the control plane, exposing an HTTP API that lets end users, different parts of your cluster, and external components communicate with one another. The Kubernetes API lets you query and manipulate the state of API objects in Kubernetes (for example: Pods, Namespaces, ConfigMaps, and Events). So among all this wizardry, this is the eldest and wisest mage we are talking to. This is where we, humble users, communicate what we want to achieve and how, before going for a coffee while the cluster works its magic, converging on the described state.
- etcd: A reliable, distributed key-value store used as Kubernetes’ backing store for all cluster data. This is one of the crucial services. After reading the docs, it’s pretty easy to conclude that if it is not backed up, disaster is just a question of “when”, not “if”.
- kube-scheduler: Watches for newly created Pods with no assigned node, and selects a node for them to run on. The scheduler is what lets us not worry about which workload ends up on which physical server.
- kube-controller-manager: Runs controller processes, which are background threads that handle routine tasks in the cluster.
2. The Node Components
Nodes are the workers that contain the necessary tools to manage networking between containers, communicate with the Control Plane, and assign resources.
- kubelet: An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod.
- kube-proxy: Maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
- Container Runtime (e.g. containerd): The software that is responsible for running the containers.
With those three, it all finally comes together: they execute the instructions from the Control Plane while managing local operations on each node, sending all relevant information about local events back to the Control Plane, so that it can continue making informed decisions about the cluster.
Security Tooling in Kubernetes
Finally, the parts that excite me the most, which we’ll definitely investigate in more depth in the near future: the mechanisms that allow us to better manage the security of our cluster.
- Network Policies: Define rules which dictate allowed network communications within your Kubernetes cluster, helping to isolate and secure pods from unauthorised access. My initial thoughts: most importantly, we need to ensure that communication with the Control Plane is managed properly, so that in case of malicious activity, misbehaving pods can’t escalate the damage to the entire cluster.
- Pod Security Standards: Specify security settings for a pod, enforcing controls such as preventing privileged execution and controlling kernel capabilities. (The older Pod Security Policies were deprecated and removed in Kubernetes 1.25 in favour of the built-in Pod Security Admission controller.) This seems like another useful tool, allowing us to enforce a first layer of defence against unexpected actions happening inside our pods.
- Secrets Management: Kubernetes supports secrets management natively, allowing you to store and manage sensitive information such as passwords, OAuth tokens, and SSH keys. Secret stores are pure gold when it comes to security research. We all know they are there. We all know they should be secure. But first of all: are they? And if so… why do so many secrets end up stored in a git repo in secrets.txt? I might be overdramatising, but you wouldn’t believe how many times I’ve seen secrets being part of CI/CD pipeline definitions, files in a repo, or simply hardcoded inside the application. We will definitely go in depth on this one.
- Security Audits: There is a plethora of automatic scanners and other security tools aiming to ensure the cluster is properly configured. From simple misconfigurations to supply chain issues, coverage seems high, but I hope we’ll get a chance to see how complete and in-depth those solutions are. Maybe once we understand more, we’ll embark on another journey: trying to build one…
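As a small taste of the mechanisms above, here is a hedged sketch of a default-deny NetworkPolicy and a Secret. The `demo` namespace and all names are hypothetical placeholders. Two caveats worth keeping in mind: NetworkPolicies are only enforced if the cluster’s network plugin supports them (a cluster without such a plugin silently ignores them), and Secret data is merely base64-encoded, not encrypted, unless encryption at rest is configured for etcd:

```yaml
# Deny all ingress traffic to every pod in the namespace by default;
# specific allow-rules can then be layered on top as separate policies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: demo          # hypothetical namespace
spec:
  podSelector: {}          # empty selector = applies to all pods in the namespace
  policyTypes:
    - Ingress              # no ingress rules listed, so no ingress is allowed
---
# A Secret holding a database password (placeholder value).
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: demo
type: Opaque
stringData:                # stringData accepts plain text;
  password: "change-me"    # it is stored base64-encoded in etcd, not encrypted
```

The empty `podSelector` combined with an `Ingress` policy type and no rules is the standard idiom for “block everything by default”, which is exactly the kind of posture a security-minded cluster should start from.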
Conclusion
Kubernetes is more than just a tool for automating deployment and scaling; it’s a robust ecosystem that manages a wide array of containerised applications and services. By understanding its architecture and core components, we’re better equipped to tackle security challenges and explore advanced features.
This overview is designed to provide a strong foundation for our journey into Kubernetes. As we continue to delve deeper into its capabilities, these insights will serve as essential building blocks. If you’re excited about what lies ahead and eager to expand your Kubernetes knowledge, make sure to follow along for future updates and discoveries!