We had been postponing this for a long time. The deprecation messages kept reminding us, but we kept using our “optical filters” to ignore them. However, Kubernetes 1.25 kept creeping closer and closer, and finally it was time to bite the bullet.
What’s an admission controller anyway?
Simplified, an admission controller is something like a gatekeeper that controls what gets let into your cluster and what does not. And that’s pretty much what PSA does too.
PSA vs PSS?
Yeah, confusing. PSA, Pod Security Admission, tracks and optionally enforces PSS (Pod Security Standards) rules. The PSS level determines the strictness of the rules being applied (restricted, baseline, privileged).
PSP and PSA are not mutually exclusive
This is good news if you think about it. You can play around with PSS rules in audit mode while keeping your old PSP rules. This also means that you can enforce PSS rules while keeping your old PSP rules in place.
With one exception in my case: I had to allow pods to set the RuntimeDefault seccomp profile in Rancher's PSP profiles by annotating them with seccomp.security.alpha.kubernetes.io/allowedProfileNames="*".
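The change looked roughly like this; this is a trimmed-down sketch, the PSP name and the rest of the spec are placeholders rather than the actual Rancher policy, and the real change is just the annotation on your existing PSP:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  # placeholder name, apply the annotation to your existing Rancher PSP
  name: restricted-psp
  annotations:
    # allow pods to request any seccomp profile, which covers RuntimeDefault
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
spec:
  # minimal placeholder spec, keep whatever your existing PSP already has
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny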
Being able to run them side by side is great because it makes the transition so much easier!
Don’t use the native PSA controller
It’s there, you might be tempted to use it, but trust me, you don’t want to. Pretty much any other project is better than the native controller, and you have a bunch of options out there, for example:
- Kyverno
- OPA Gatekeeper
- Neuvector
Why you don’t want to use the native PSA:
- It’s not even a little bit flexible, so you will very likely run into cases that are not supported
- It can’t log events anywhere other than the audit log, a log which might not be accessible to users
- It does not have any interface to show the current Policy Violations of a cluster
Just… don’t… use… it.
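For comparison, the native controller is driven entirely by namespace labels, roughly like this (the namespace name is just a placeholder); labels are about all the configuration surface you get:

apiVersion: v1
kind: Namespace
metadata:
  name: my-app  # placeholder namespace
  labels:
    # the built-in PSA controller is configured purely through these labels
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest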
Kyverno
We went with Kyverno because it was recommended to me by two separate friends. If you have time to evaluate more than one option, go ahead, but this post will cover how we solved it using Kyverno. The post will not cover how to install Kyverno as there’s already good documentation on how to do that on their website.
We had a few needs:
- We needed to be able to exclude Istio sidecars from the Seccomp check (at least I have not found a way to change the Seccomp profile on Istio sidecars)
- We needed to exclude certain checks from certain namespaces
- Those namespaces still needed to comply with e.g. the PSS baseline level
Creating a restricted Policy
Since Kyverno Cluster Policies are evaluated one by one, we can start with the restricted policy. Please note that this is not the same as the kyverno-policies package that generates generic validation rules. Those are great too, but they did not work for sidecar exclusion, which we needed for Istio.
Let’s break apart a Cluster Policy and look at each section.
Disabling autogen rules
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: psa-restricted
  annotations:
    kyverno.io/autogen-controllers: none
The annotation kyverno.io/autogen-controllers: none tells Kyverno not to automatically generate policy rules and results for Pod controllers such as ReplicaSets and Deployments. The reason for choosing this is that PSS only controls Pods, and all Deployments and ReplicaSets really do is spawn Pods.
If you choose to omit this annotation you might be blasted with thousands of errors depending on the size of your cluster, as Deployments can have a number of revisions and each revision keeps a ReplicaSet around. So if you have 100 Deployments with 10 revisions each you’ll see 1,000 errors, even if the ReplicaSets have 0 replicas configured. Also, you’ll get PolicyViolations for all the old ReplicaSets even though you’ve patched the latest Deployment to be compliant with PSA.
To enforce or not to enforce
spec:
  background: true
  validationFailureAction: Audit
Here you can select what Kyverno should do if it detects a PolicyViolation. Since you’re reading this article you probably want to specify Audit. If you want Kyverno to block pods violating the policy you can pick Enforce. The background: true setting also makes Kyverno evaluate already existing resources in the background and report on them, not just resources being admitted.
What to look for (and what not to look for)
- name: restricted
  match:
    any:
    - resources:
        kinds:
        - Pod
  exclude:
    any:
    - resources:
        namespaces:
        - kube-system
        - istio-system
Here we tell Kyverno to look for Pods in all namespaces except the ones where we can’t use the restricted policy. In the example above that means every namespace except kube-system and istio-system.
The validation rules
validate:
  podSecurity:
    level: restricted
    version: latest
    exclude:
    - controlName: Seccomp
      images:
      - repo.org.com/istio/proxyv2:*
Here you specify which PSS level to evaluate and which version. You can pick latest as the version, or a valid Kubernetes minor version, e.g. v1.23. This is also the place where you can make a general exception for the Istio containers. Just make sure the image pattern is as exact as possible, so that people can’t abuse it by naming their images something that matches the pattern and thereby circumvent the policy.
Putting it all together
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: psa-restricted
  annotations:
    kyverno.io/autogen-controllers: none
spec:
  background: true
  validationFailureAction: Audit
  rules:
  - name: restricted
    match:
      any:
      - resources:
          kinds:
          - Pod
    exclude:
      any:
      - resources:
          namespaces:
          - kube-system
          - istio-system
    validate:
      podSecurity:
        level: restricted
        version: latest
        exclude:
        - controlName: Seccomp
          images:
          - repo.org.com/istio/proxyv2:*
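For reference, a Pod that passes the restricted level needs to look roughly like this; the names and the image are placeholders, not something from our cluster:

apiVersion: v1
kind: Pod
metadata:
  name: restricted-example  # placeholder name
spec:
  securityContext:
    # restricted requires non-root and an explicit seccomp profile
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: repo.org.com/my-team/my-app:1.0.0  # placeholder image
    securityContext:
      # restricted also forbids privilege escalation and requires dropping all capabilities
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL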
Creating a baseline policy
Just because restricted does not fit some namespaces doesn’t mean they should go unchecked; it’s a good idea to create a baseline policy for the ones that can run it. I won’t go into detail about each section below since they’re pretty much the same, except for these things:
- Instead of matching all except a few namespaces I match against a specific namespace, in this case istio-system
- Instead of the PSS level restricted I picked baseline
Note that you can choose to match against multiple namespaces here, or create one policy per namespace excluded by the restricted policy, but do note that any exceptions you make will apply to all namespaces matched by the policy (see the sketch after the manifest below).
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: psa-baseline-istio-system
  annotations:
    kyverno.io/autogen-controllers: none
spec:
  background: true
  validationFailureAction: Audit
  rules:
  - name: baseline
    match:
      any:
      - resources:
          kinds:
          - Pod
          namespaces:
          - istio-system
    validate:
      podSecurity:
        level: baseline
        version: latest
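If you’d rather cover several excluded namespaces with a single baseline policy, the rules section could look roughly like this instead; the namespace list is just an example:

  rules:
  - name: baseline
    match:
      any:
      - resources:
          kinds:
          - Pod
          # every namespace listed here shares the same level and exceptions
          namespaces:
          - istio-system
          - kube-system
    validate:
      podSecurity:
        level: baseline
        version: latest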
So what did we get?
- Kyverno logs any PolicyViolations as Kubernetes Events instead of Audit events in the audit log.
- If you ship events to e.g. Elasticsearch or Loki you can search them and build graphs and alerts based on them. This creates transparency for the users during the migration to PSA.
- Kyverno allows for a much more flexible approach and lets you also pick what to not evaluate.
- Unlike native PSA, Kyverno does not depend on namespace labels to know what to enforce. One less thing for you and your users to think about.
- If you choose to do so, you can also install Kyverno’s web UI, which gives users a real-time look into the policy compliance state of their deployments. This also makes it easier for you as an admin to follow up and track progress.
- Kyverno has ready-made Prometheus scraping endpoints and Grafana dashboards if you prefer to use those for keeping track of compliance violations.
- When we have reached a point where we don’t have any policy violations, we can set the validationFailureAction property to Enforce with high confidence, as sketched below.
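That final switch is just a one-line change on the policies above, roughly:

spec:
  background: true
  # flipped from Audit once the cluster reports zero violations
  validationFailureAction: Enforce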
Kudos to the Kyverno people on their Slack channel, who gave me a lot of help and suggestions on how to overcome some of the problems I faced. ❤️
Got any comments or suggestions? Please leave a comment below!