In this article, we explore how to implement Seccomp profiles in Kubernetes to bolster container security by filtering system calls (syscalls). We start by reviewing how Docker and Kubernetes apply Seccomp by default and then demonstrate how to enable and customize Seccomp profiles within Kubernetes pods.
By restricting syscalls, Seccomp helps reduce the attack surface in containerized environments. This guide outlines both default behaviors and custom configurations for running secure Kubernetes workloads.
Docker uses a built-in Seccomp profile to block approximately 60 syscalls by default. To inspect Docker’s default Seccomp configuration, we can use amicontained, an open-source container introspection tool.

Run the following command to launch amicontained as a Docker container:
docker run r.j3ss.co/amicontained amicontained
The command output indicates that 64 syscalls are blocked by Docker’s default Seccomp profile. It also reports the Seccomp mode as filtering (mode 2).
Deploying the same image as a Kubernetes pod yields a different outcome. In Kubernetes (version 1.20 at the time of recording), Seccomp is not enabled by default.

Create a pod with the following command:
kubectl run amicontained --image=r.j3ss.co/amicontained amicontained -- amicontained
You should see:
pod/amicontained created
Next, inspect the pod logs:
kubectl logs amicontained
The log output shows Seccomp as disabled, with only 21 syscalls blocked.
To enable Seccomp filtering in a pod, specify a Seccomp profile in the pod (or container) manifest. Setting the type of the seccompProfile field to RuntimeDefault under the pod-level security context applies the container runtime’s default profile, which in this setup is Docker’s default profile.
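A minimal sketch of such a manifest, written to pod-definition.yaml (the file name used by the apply command that follows). The pod name, label, and args entry are assumptions carried over from the earlier kubectl run example, so treat this as an illustration rather than the definitive manifest; the parts that matter are type: RuntimeDefault and allowPrivilegeEscalation: false.

```shell
# Write a pod manifest that opts in to the container runtime's default
# seccomp profile. Names are assumptions based on the earlier example.
cat > pod-definition.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: amicontained
  labels:
    run: amicontained
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault   # use the runtime's built-in profile
  containers:
  - name: amicontained
    image: r.j3ss.co/amicontained
    args: ["amicontained"]
    securityContext:
      allowPrivilegeEscalation: false
EOF
```

If the pod created earlier with kubectl run still exists, delete it first (kubectl delete pod amicontained), because a running pod's security context cannot be changed in place.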
Setting allowPrivilegeEscalation: false prevents the container process from gaining more privileges than its parent process.

Apply the pod definition:
kubectl apply -f pod-definition.yaml
Then verify the pod logs:
kubectl logs amicontained
The logs should now show that Seccomp filtering is active, with additional syscalls blocked.
If you prefer to run a pod without any Seccomp restrictions (which is the default behavior), explicitly set the Seccomp profile type to Unconfined in your pod manifest.
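A minimal sketch of an unconfined pod, assuming the same amicontained image as before. The file name pod-unconfined.yaml and the pod name amicontained-unconfined are hypothetical and chosen only for illustration.

```shell
# Write a pod manifest that explicitly disables seccomp filtering.
# File name and pod name are hypothetical.
cat > pod-unconfined.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: amicontained-unconfined
spec:
  securityContext:
    seccompProfile:
      type: Unconfined   # no seccomp filtering is applied
  containers:
  - name: amicontained
    image: r.j3ss.co/amicontained
    args: ["amicontained"]
EOF
```

Stating Unconfined explicitly documents the intent in the manifest, which is easier to audit than relying on a cluster default.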
Custom Seccomp profiles offer granular security control based on your application’s syscall needs. The following sections detail how to create a pod with a custom Seccomp profile, enforce strict policies, and tailor profiles for your specific requirements.
In this example, we create a pod using the Ubuntu image. The container prints a message and then sleeps for 100 seconds. The pod manifest below applies a custom Seccomp profile from a file on the node.
apiVersion: v1
kind: Pod
metadata:
  name: test-audit
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: <path to the custom JSON file>
  containers:
  - command: ["bash", "-c", "echo 'I just made some syscalls' && sleep 100"]
    image: ubuntu
    name: ubuntu
    securityContext:
      allowPrivilegeEscalation: false
The localhostProfile path is relative to the default Seccomp profile directory (typically /var/lib/kubelet/seccomp). For example, if you place your custom profile in /var/lib/kubelet/seccomp/profiles/, the path might be profiles/audit.json.
Inside your custom profile (e.g., audit.json), you could set the default action to log syscalls:
{ "defaultAction": "SCMP_ACT_LOG"}
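The profile only takes effect if the file exists on the node where the pod is scheduled. The sketch below copies a profile into the kubelet's Seccomp directory. install_profile and SECCOMP_ROOT are hypothetical names introduced for illustration, and the default path is the typical location mentioned above, which may differ on your cluster.

```shell
# Root of the kubelet's seccomp profiles; override SECCOMP_ROOT if your
# kubelet uses a different directory (the default here is an assumption).
SECCOMP_ROOT="${SECCOMP_ROOT:-/var/lib/kubelet/seccomp}"

# install_profile SRC REL
# Copies SRC to $SECCOMP_ROOT/REL, creating parent directories as needed,
# and prints the destination path. REL is the value to use in localhostProfile.
install_profile() {
  dest="$SECCOMP_ROOT/$2"
  mkdir -p "$(dirname "$dest")" && cp "$1" "$dest" && printf '%s\n' "$dest"
}
```

Run as root on the node, a call such as install_profile audit.json profiles/audit.json would place the file so that the pod can reference it as profiles/audit.json.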
Once the pod is running, all container syscalls are logged to the node’s syslog (commonly /var/log/syslog), where you can inspect them.
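One way to inspect these entries is to search the log for syscall records. The sketch below assumes that the kernel's seccomp audit lines contain a syscall=<number> field and that they reach the file passed as the first argument; seccomp_syscall_counts is a hypothetical helper name.

```shell
# Count how often each syscall number appears in a log file.
# Defaults to /var/log/syslog, the location named in the text; pass another
# path if your node logs elsewhere.
seccomp_syscall_counts() {
  logfile="${1:-/var/log/syslog}"
  grep -o 'syscall=[0-9][0-9]*' "$logfile" | sort | uniq -c | sort -rn
}
```

Calling seccomp_syscall_counts on the node prints one line per syscall number, most frequent first. The numbers are architecture-specific, so they still need to be mapped to syscall names before they can be used in a profile.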
To enforce a stricter security posture, you can create a profile that denies any syscall by default. Create a JSON file (e.g., violation.json) with the following content:
{ "defaultAction": "SCMP_ACT_ERRNO"}
Apply this profile in your pod’s security context. Once created, the pod reports the status ContainerCannotRun because even essential syscalls are blocked.

Apply the pod definition:
kubectl apply -f test-violation.yaml
Then check the pod status:
kubectl get pods
Expected output:
NAME             READY   STATUS               RESTARTS   AGE
test-violation   0/1     ContainerCannotRun   0          2m2s
While a strict Seccomp profile can improve security, it may render the pod non-functional if critical syscalls are blocked.
After analyzing your application’s syscall requirements (by inspecting audit logs or using tools like Tracee), you can craft a custom Seccomp profile that allows only the required syscalls.

Create a custom profile JSON file (for example, custom.json) with specific rules and place it in the node’s Seccomp profile directory (typically in a profiles folder under /var/lib/kubelet/seccomp).

Then reference your custom profile in a pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: test-custom
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/custom.json
  containers:
  - command: ["bash", "-c", "echo 'I just made some syscalls' && sleep 100"]
    image: ubuntu
    name: ubuntu
  restartPolicy: Never
Verify that the pod starts successfully:
kubectl get pods
Expected output:
NAME          READY   STATUS    RESTARTS   AGE
test-custom   1/1     Running   0          2m2s
Once your custom profile is applied, only the required syscalls will be allowed, enhancing container isolation and security.
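For orientation, an allow-list profile generally has the shape sketched below: a restrictive defaultAction plus a syscalls entry that allows named syscalls. The syscall names and the single architecture listed here are illustrative assumptions and are almost certainly not sufficient for the Ubuntu container above; derive the real list from your own audit logs.

```shell
# Write an illustrative allow-list profile to custom.json.
# The "names" list is a placeholder, not a working set for any real image.
cat > custom.json <<'EOF'
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": ["read", "write", "close", "execve", "exit_group", "nanosleep"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
EOF
```

Starting from a logging profile and moving to an allow-list only after the observed syscalls are stable tends to avoid the ContainerCannotRun failure described earlier.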
Implementing and customizing Seccomp profiles in Kubernetes enhances container security by limiting unnecessary syscalls. Although creating a custom profile can be time-consuming, mastering existing profiles and tailoring them to your application’s needs is essential for a secure container environment.

For further details, refer to the official Kubernetes Seccomp Documentation.

Now is the time to experiment with Seccomp profiles in your environment. By leveraging these security measures, you can achieve a more robust and secure Kubernetes deployment.