CIS Hardened Kubernetes

Hardening a Kubernetes cluster against the CIS Benchmark and the U.S. DOD STIG

It took a long time, but I have installed and hardened a Kubernetes cluster that meets the CIS Benchmark and the U.S. Department of Defense STIG. I did this with Ansible because of the complexity of the work and the need for repeatability across many clusters.

You can check out the repository with the roles and playbook here.

GitHub - EdwardCooke/ansible-kubernetes: Creates a hardened Kubernetes cluster that meets the CIS benchmark 1.9

It has a lot of features and is flexible and extensible. I added the concept of hooks, which let you inject tasks at different steps of the playbook: for example, after installing the Kubernetes components, after first initializing the control plane, or after installing all the Kubernetes control plane nodes.
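To give a feel for how a hook point can work in Ansible in general, here is a minimal sketch. It is not taken from the repository: the variable name `hook_after_control_plane_init` is hypothetical, and the readme documents the real hook names and how to set them.

```yaml
# Generic hook pattern (sketch, not the repository's actual implementation).
# hook_after_control_plane_init is a hypothetical variable that would hold
# the path to a user-supplied tasks file.
- name: Run the user-supplied hook after the control plane is initialized
  ansible.builtin.include_tasks:
    file: "{{ hook_after_control_plane_init }}"
  # Skip the step entirely when no hook was configured.
  when: hook_after_control_plane_init is defined
```

With a pattern like this, the playbook itself stays unchanged and each cluster supplies its own tasks file through inventory or extra variables.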

It requires OIDC to authenticate end users to the cluster. This comes from the CIS Benchmark, which requires that end users connect without a long-lived token or service account.
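As a rough illustration of what OIDC on the API server involves, the sketch below passes the standard `kube-apiserver` OIDC flags through a kubeadm `ClusterConfiguration`. It assumes a kubeadm-based install using the `v1beta4` config format; the issuer URL, client ID, and claim names are placeholders, and the playbook may wire these values in differently.

```yaml
# Sketch only: kubeadm v1beta4 ClusterConfiguration fragment.
# The issuer URL, client ID, and claim names below are placeholders.
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Identity provider that issues the ID tokens users present.
    - name: oidc-issuer-url
      value: https://idp.example.com
    # Client ID the tokens must have been issued for.
    - name: oidc-client-id
      value: kubernetes
    # Token claim used as the Kubernetes user name.
    - name: oidc-username-claim
      value: email
    # Token claim used for group membership (for RBAC bindings).
    - name: oidc-groups-claim
      value: groups
```

Users then authenticate with short-lived tokens from the identity provider, and access is granted through RBAC bindings on the user name or groups carried in the token.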

The trickiest part of the hardening was getting the etcd components to run as a separate etcd user. That required some custom work and patches to the etcd manifest.
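The general shape of that work might look like the sketch below. It is not the repository's actual code: the UID/GID of 2000 is arbitrary, `/var/lib/etcd` is the kubeadm default data directory, and it assumes the static pod manifest is adjusted with a kubeadm patch file. In practice the etcd certificates and keys also have to be readable by the new user.

```yaml
# Sketch only: Ansible tasks that prepare a dedicated etcd user.
# The UID/GID value 2000 is an arbitrary, hypothetical choice.
- name: Create the etcd group
  ansible.builtin.group:
    name: etcd
    gid: 2000
    system: true

- name: Create the etcd system user without a login shell or home directory
  ansible.builtin.user:
    name: etcd
    uid: 2000
    group: etcd
    system: true
    shell: /usr/sbin/nologin
    create_home: false

- name: Make the etcd data directory private to the etcd user
  ansible.builtin.file:
    path: /var/lib/etcd
    state: directory
    owner: etcd
    group: etcd
    mode: "0700"
```

A kubeadm patch file named `etcd+strategic.yaml`, placed in the directory passed to kubeadm's patches option, could then set the pod's security context:

```yaml
# Sketch only: strategic merge patch for the etcd static pod.
# The IDs must match the user and group created above.
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsUser: 2000
    runAsGroup: 2000
    runAsNonRoot: true
```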

To get it fully set up and working, it requires a minimum of two servers:

  • A proxy server that sits in front of the control plane
  • A control plane server
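A two-server layout like this could be described in an Ansible YAML inventory along the following lines. The group names and host names here are hypothetical; the readme in the repository defines the names the playbook actually expects.

```yaml
# Sketch only: minimal two-host inventory.
# Group names (proxy, control_plane) and host names are hypothetical.
all:
  children:
    proxy:
      hosts:
        proxy1.example.internal:
    control_plane:
      hosts:
        cp1.example.internal:
```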

You can use a hook to remove the control plane taints from that single control plane node so that workloads can run on it. This is not recommended, but it is great for development environments.
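A hook task for this could look roughly like the sketch below. It assumes the node name matches `inventory_hostname`, that `kubectl` is available on the host, and that the admin kubeconfig is at the kubeadm default path; none of these are confirmed by the repository. The taint key is the one kubeadm applies to control plane nodes by default.

```yaml
# Sketch only: remove the default control plane taint (development use).
# Assumes node name == inventory_hostname and a kubeadm-style admin kubeconfig.
- name: Allow workloads on the single control plane node
  ansible.builtin.command:
    argv:
      - kubectl
      - taint
      - nodes
      - "{{ inventory_hostname }}"
      # The trailing "-" tells kubectl to remove the taint.
      - node-role.kubernetes.io/control-plane:NoSchedule-
  environment:
    KUBECONFIG: /etc/kubernetes/admin.conf
  register: untaint
  # Report a change only when the taint was actually removed.
  changed_when: untaint.rc == 0
  # Removing a taint that is already gone is not treated as a failure,
  # so the task stays safe to run repeatedly.
  failed_when: untaint.rc != 0 and 'not found' not in untaint.stderr
```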

The playbook is written for Ubuntu and has been tested on Ubuntu 22.04 and 24.04. It works on x64 processors (Ubuntu 22.04 and 24.04) and on ARM processors (tested with Ubuntu 24.04 on a Raspberry Pi). It has successfully installed a working Kubernetes cluster at version 1.32 that is CIS/STIG compliant wherever it can be.

It is idempotent, meaning you can run it multiple times and it will only do the work that is necessary. It can also add control plane or worker nodes on subsequent runs.
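Idempotency in Ansible mostly comes from describing the desired state and letting modules decide whether anything needs to change. The sketch below shows the idea with kernel parameters the kubelet checks when `protectKernelDefaults` is enabled. The file name is hypothetical, and the exact list of sysctls the playbook manages may differ.

```yaml
# Sketch only: declarative sysctl settings that are safe to apply repeatedly.
# The sysctl_file name is hypothetical.
- name: Set kernel parameters the kubelet expects
  ansible.posix.sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    sysctl_file: /etc/sysctl.d/90-kubelet.conf
    state: present
    reload: true
  loop:
    - { name: vm.overcommit_memory, value: "1" }
    - { name: kernel.panic, value: "10" }
    - { name: kernel.panic_on_oops, value: "1" }
```

On a second run the module finds the values already in place and reports no change. The same module with `state: absent` removes the entries again, which is the kind of inverse step a reset needs.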

There is also a reset playbook that, when passed the same hosts, undoes what the default install playbook did. It removes files, users, sysctls, everything. The servers should be back in the state they were in before you ran the playbook. If you added customizations or hooks that run during the install, you'll need to undo those yourself.

For detailed information on how to use the playbook, check out the readme file in the repository linked above.