Certified Kubernetes Security Specialist (CKS) Preparation based on CNCF CKS curriculum.
Approach to this preparation guide
This guide is being authored before the actual certification has gone live. The best way to prepare for this certification is to know all of the required domains and competencies really well. I’ve based this preparation guide on the available curriculum. I do not aim to author 100% of the content for the CKS preparation; instead, I will rely on existing resources where possible and only fill in the remaining gaps myself.
Domains & Competencies
Cluster Setup – 10%
Use Network security policies to restrict cluster level access
- Network Policies - Official documentation - kubernetes.io
- Get started with Kubernetes network policy - Calico is a leading implementation of Network Policy in Kubernetes and comes as a default option on managed Kubernetes like GKE. There are 3 tutorials to practice your Network Policy skills. - projectcalico.org
- kubernetes-network-policy-recipes - excellent collection of Network Policy examples by ahmetb - github.com
- Kubernetes Network Policies Best Practices - article on Network Policies best practices - alcide.io
- Exploring Network Policies in Kubernetes - thorough exploration of Network Policies - banzaicloud.com
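As a starting point for practicing with the resources above, here is a minimal sketch of the common "default deny, then allow" pattern. The namespace `demo`, the labels `app: web` and `app: frontend`, and port 8080 are hypothetical placeholders, and the policies only take effect if the cluster's network plugin enforces NetworkPolicy.

```yaml
# Deny all ingress and egress for every pod in the (hypothetical) "demo" namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: demo
spec:
  podSelector: {}          # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
---
# Re-allow one specific flow: frontend pods may reach web pods on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-web
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: web             # hypothetical label on the protected pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # hypothetical label on the allowed clients
      ports:
        - protocol: TCP
          port: 8080
```

Because the first policy also denies egress, the frontend pods would need a matching egress rule (and usually DNS egress) before this flow works end to end.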
Use CIS benchmark to review the security configuration of Kubernetes components (etcd, kubelet, kubedns, kubeapi)
- CIS Benchmarks can be found on the official CIS website and unfortunately are not available in bite-size form - cisecurity.org
- It is unreasonable to expect an examinee to know all of the available (and always growing) CIS Benchmarks. It is likely that the exam will expect you to perform CIS benchmark analysis using open source tools. One that comes to mind is kube-bench. It is even mentioned on the CNCF Cloud Native Landscape - cncf.io
- Alcide Advisor is another tool that can be run against the cluster to evaluate CIS Benchmarks - github.com
Properly set up Ingress objects with security control
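One likely reading of this item is terminating TLS at the Ingress. The sketch below assumes a cluster that serves the `networking.k8s.io/v1` Ingress API (Kubernetes 1.19 or later) and an installed ingress controller. The host `app.example.com`, the Secret `app-example-tls`, the namespace `demo`, and the Service `web` are hypothetical names.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secure-ingress
  namespace: demo
spec:
  tls:
    - hosts:
        - app.example.com
      # Hypothetical Secret of type kubernetes.io/tls holding tls.crt and tls.key
      secretName: app-example-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web        # hypothetical backend Service
                port:
                  number: 8080
```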
Protect node metadata and endpoints
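One way to approach this, sketched under assumptions, is an egress NetworkPolicy that blocks pods from reaching the cloud metadata service. The address 169.254.169.254 is the link-local metadata endpoint used by the major cloud providers, which is assumed here rather than taken from the curriculum. The namespace `demo` is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-metadata-egress
  namespace: demo
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.169.254/32   # assumed metadata endpoint address
```

Pods that legitimately need metadata access can be exempted with a second, more specific policy selecting them by label.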
Minimize use of, and access to, GUI elements
Verify platform binaries before deploying
Cluster Hardening – 15%
- Restrict access to Kubernetes API
- Use Role Based Access Controls to minimize exposure
- Using RBAC Authorization - Official documentation - kubernetes.io
- Two types of scopes - ClusterRole and Role
- ClusterRole is a powerful, cluster-wide resource. It is normally matched with a ClusterRoleBinding (a RoleBinding can also reference a ClusterRole to grant its permissions within a single namespace). Always prefer a regular Role, unless you really need uniform access across all namespaces.
- Role is a namespaced resource, meaning that it can only impact permissions within the namespace where it exists. It has to be matched with a RoleBinding.
- It is likely that you will be tasked to configure the least privilege RBAC. Things to be mindful of:
- Exercise caution in using service accounts e.g. disable defaults, minimize permissions on newly created ones
You can disable default token auto-mounting at the service account or pod level. While a namespace's default service account has almost no permissions to begin with, this technique prevents over-granting of permissions when additional roles are bound to the default service account. It prevents the bearer token from being placed in /var/run/secrets/kubernetes.io/serviceaccount.

Service account level:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: build-robot
    automountServiceAccountToken: false

Pod level:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
    spec:
      serviceAccountName: build-robot
      automountServiceAccountToken: false
- Update Kubernetes frequently
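To make the Role and RoleBinding pairing described above concrete, here is a minimal least-privilege sketch. It reuses the `build-robot` service account from the earlier example. The namespace `demo` and the names `pod-reader` and `read-pods` are hypothetical.

```yaml
# Namespaced Role: read-only access to pods, nothing else.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: demo
rules:
  - apiGroups: [""]            # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
# RoleBinding: grants the Role to a single service account in the same namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: demo
subjects:
  - kind: ServiceAccount
    name: build-robot
    namespace: demo
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

The result can be checked with `kubectl auth can-i list pods --as=system:serviceaccount:demo:build-robot -n demo`, which should answer yes, while the same query for `delete pods` should answer no.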
System Hardening – 15%
- Minimize host OS footprint (reduce attack surface)
- Minimize IAM roles
- Minimize external access to the network
- Appropriately use kernel hardening tools such as AppArmor, seccomp
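As a rough illustration of the last item, the Pod below applies the container runtime's default seccomp and AppArmor profiles. It assumes Kubernetes 1.19 or later for the `seccompProfile` field and nodes with AppArmor enabled. The annotation form shown is the long-standing beta syntax, and newer Kubernetes releases also expose an `appArmorProfile` field in the security context. The pod name and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
  annotations:
    # Key suffix must match the container name ("app" below).
    container.apparmor.security.beta.kubernetes.io/app: runtime/default
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault     # use the container runtime's default seccomp filter
  containers:
    - name: app
      image: nginx:1.19        # placeholder image
```

A custom profile loaded on the node would be referenced as `localhost/<profile-name>` in the annotation, or with `type: Localhost` plus `localhostProfile` for seccomp.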
Minimize Microservice Vulnerabilities – 20%
- Setup appropriate OS level security domains e.g. using PSP, OPA, security contexts
- Manage Kubernetes secrets
- Practical lab around Secrets - katacoda.com
- Use container runtime sandboxes in multi-tenant environments (e.g. gvisor, kata containers)
- Implement pod to pod encryption by use of mTLS
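Two of the items above, security contexts and runtime sandboxes, can be sketched together in a single manifest. This assumes the nodes' container runtime is already configured with a gVisor handler named `runsc` and that the cluster serves `node.k8s.io/v1` (Kubernetes 1.20 or later; older clusters use `v1beta1`). The image, names, and UID are hypothetical.

```yaml
# RuntimeClass mapping a cluster-level name to the node's sandbox handler.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc                 # assumed handler name configured in the container runtime
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-app
spec:
  runtimeClassName: gvisor     # run this pod inside the sandboxed runtime
  securityContext:
    runAsNonRoot: true         # refuse to start if the container would run as UID 0
    runAsUser: 10001           # hypothetical unprivileged UID
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```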
Supply Chain Security – 20%
- Minimize base image footprint
- We want to minimize attack surface area
- Do not include build utilities in the final image by using multi-stage builds
    FROM golang:1.15.0-alpine3.12

    # Set the working directory
    WORKDIR /go/src/github.com/hugomd/go-example

    # Copy our main file
    COPY main.go .

    # Build the Golang app
    RUN CGO_ENABLED=0 GOOS=linux go build -o app .

    # Create the second stage of our build
    FROM alpine
    WORKDIR /app

    # Copy from the first stage (--from=0)
    COPY --from=0 /go/src/github.com/hugomd/go-example/app .

    CMD ["./app"]
- Aim for smaller distributions with the least amount of utility bloat (think of using Alpine vs full-sized distributions)
- scratch images are the most secure, as they don’t add anything but the application itself. Below is a modified example of the Go app, which now runs on scratch in the spirit of distroless images. Learn more about distroless.
    FROM golang:1.15.0-alpine3.12

    # Set the working directory
    WORKDIR /go/src/github.com/hugomd/go-example

    # Copy our main file
    COPY main.go .

    # Build the Golang app
    RUN CGO_ENABLED=0 GOOS=linux go build -o app .

    # Create the second stage of our build
    FROM scratch
    WORKDIR /app

    # Copy from the first stage (--from=0)
    COPY --from=0 /go/src/github.com/hugomd/go-example/app .

    CMD ["./app"]
- Use .dockerignore to avoid pulling in unnecessary files and secrets
- Don’t run the image as the root user. Add a least-privilege user and group.
- ADD on URLs can result in MITM attacks. ADD is also susceptible to the Zip Slip vulnerability.
- 3 simple tricks for smaller Docker images Well written explanation on how to minimise size and security risks of your Docker images. - learnk8s.io
- Docker Image Security Best Practices - concise PDF with best practices to follow - snyk.io
- Secure your supply chain: whitelist allowed registries, sign and validate images
- Sign and validate
- We can use a combination of Kritis Signer to sign images in CI/CD and Kritis to validate that images are signed on admission. Kritis uses a Validating Admission Webhook as the final arbiter.
- In GKE, you can use the Binary Authorization feature, which is a lot simpler to set up and run than Kritis. Shameless plug - video from Cloud Next 2019 showcasing Binary Authorization.
- OPA Gatekeeper can be used to build policies for allowing or denying images into the cluster. Conftest can be used to run Gatekeeper policies in CI/CD and validate registries before images reach a running cluster.
- ImagePolicyWebhook can be used to implement validation mechanisms. kube-image-bouncer is an example of such implementation. Read more about Admission Controllers here.
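For the ImagePolicyWebhook option, the API server needs an admission configuration file that points at the webhook backend. The sketch below follows the documented shape of that file. The kubeconfig path is hypothetical, and the TTL and backoff values are illustrative only.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: ImagePolicyWebhook
    configuration:
      imagePolicy:
        # Hypothetical path; this kubeconfig describes how to reach the webhook backend.
        kubeConfigFile: /etc/kubernetes/admission/image-policy-kubeconfig.yaml
        allowTTL: 50           # seconds to cache an "allow" decision
        denyTTL: 50            # seconds to cache a "deny" decision
        retryBackoff: 500      # milliseconds between retries
        defaultAllow: false    # fail closed if the webhook is unreachable
```

The file is passed to kube-apiserver with `--admission-control-config-file`, and the plugin is switched on by adding `ImagePolicyWebhook` to `--enable-admission-plugins`. Setting `defaultAllow: false` is the stricter choice, at the cost of blocking pod creation whenever the backend is down.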
- Use static analysis of user workloads (e.g. Kubernetes resources, Docker files)
- Scan images for known vulnerabilities
- The Anchore Engine is an open-source project that provides a centralized service for inspection, analysis, and certification of container images. The Anchore Engine is provided as a Docker container image that can be run standalone or within an orchestration platform such as Kubernetes, Docker Swarm, Rancher, Amazon ECS, and other container orchestration platforms.
- Clair is an open source project for the static analysis of vulnerabilities in application containers (currently including appc and docker).
- Vuls Vulnerability scanner for Linux/FreeBSD, agentless, written in golang. Can scan containers as well as hosts.
- OpenSCAP Open Source Security Compliance Solution - The oscap program is a command line tool that allows users to load, scan, validate, edit, and export SCAP documents.
Monitoring, Logging and Runtime Security – 20%
- Perform behavioral analytics of syscall process and file activities at the host and container level to detect malicious activities
- Detect threats within physical infrastructure, apps, networks, data, users and workloads
- Detect all phases of attack regardless where it occurs and how it spreads
- Perform deep analytical investigation and identification of bad actors within environment
- Ensure immutability of containers at runtime
- Principles of Container-based Application Design Official guiding principles for containerised applications - kubernetes.io
- What does this mean and why does it matter? Immutable means that a container won’t be modified during its life: no updates, no patches, no configuration changes. If you must update the application code or apply a patch, you build a new image and redeploy it. Immutability makes deployments safer and more repeatable. If you need to roll back, you simply redeploy the old image. This approach allows you to deploy the same container image in every one of your environments, making them as identical as possible. Source
- Immutability of Volumes (Secrets, ConfigMaps, VolumeMounts) can be achieved with the readOnly: true field on the mount.

      volumeMounts:
        - name: cloudsql-instance-credentials
          mountPath: /secrets/cloudsql
          readOnly: true
- Ensure that your containers are stateless and immutable Good explanation of why immutability is important - cloud.google.com
- Webinar: Immutable Infrastructure in the Age of Kubernetes CNCF webinar - youtube.com
- Best Practices for Designing and Building Containers for Kubernetes Explains operational risks of mutability - weave.works
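Beyond read-only volume mounts, a common way to enforce the immutability described above is to make the container's root filesystem read-only and provide an explicit scratch volume for anything the application must write. This is a minimal sketch. The image, names, and the `/tmp` path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: immutable-app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      securityContext:
        readOnlyRootFilesystem: true        # writes to the image filesystem fail at runtime
        allowPrivilegeEscalation: false
      volumeMounts:
        - name: tmp
          mountPath: /tmp                   # the only writable location
  volumes:
    - name: tmp
      emptyDir: {}                          # ephemeral; discarded when the pod is deleted
```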
- Use Audit Logs to monitor access
- Auditing Official Kubernetes documentation - kubernetes.io
- Kubernetes Audit: Making Log Auditing a Viable Practice Again - explanation of Kubernetes audit logging - alcide.io
- How to monitor Kubernetes audit logs - Datadog’s view of Kubernetes Audit Logging. While part of the narrative is there to showcase the product, it still covers interesting and valuable signals that a CKS should be aware of - datadoghq.com
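Audit logging is driven by a policy file given to the API server. The following is a small sketch of such a policy. The choice of which resources to log, and at which level, is illustrative rather than a recommendation from the sources above.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the RequestReceived stage to reduce log volume.
omitStages:
  - RequestReceived
rules:
  # Log access to Secrets and ConfigMaps at Metadata level only,
  # so that sensitive payloads never end up in the audit log.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Log full request and response bodies for changes to RBAC objects.
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
  # Catch-all: everything else at Metadata level.
  - level: Metadata
```

Rules are evaluated in order and the first match wins, so the catch-all has to stay last. The policy is enabled with the kube-apiserver flags `--audit-policy-file` and `--audit-log-path`.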