Default Kubernetes configurations prioritize ease of use over security. While this helps with initial adoption, production clusters require significant hardening. After securing clusters handling sensitive workloads and passing multiple security audits, I’ve learned which hardening measures provide the most value and how to implement them without breaking functionality.
The Security Challenge
A default Kubernetes installation has numerous security gaps:
- Overly permissive RBAC defaults
- No network segmentation
- Containers running as root
- No admission control
- Secrets stored unencrypted in etcd
- Wide-open API server access
- Minimal audit logging
Let’s address each systematically.
RBAC: Principle of Least Privilege
Kubernetes ships with RBAC (Role-Based Access Control), but default bindings and casually granted cluster-admin roles are often far broader than workloads need.
Disable Anonymous Access:
# API server flags
--anonymous-auth=false
Audit Default Service Accounts:
# Check what default service account can do
kubectl auth can-i --list --as=system:serviceaccount:default:default
# Should show minimal permissions
Create Purpose-Specific Service Accounts:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-reader
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-configmaps
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-reader-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: app-reader
  namespace: production
roleRef:
  kind: Role
  name: read-configmaps
  apiGroup: rbac.authorization.k8s.io
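By default, every pod mounts a service account API token; workloads that never call the API don't need one. A minimal sketch that turns automounting off (the account name is illustrative):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: no-api-access   # hypothetical account for workloads that never call the API
  namespace: production
automountServiceAccountToken: false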
Audit RBAC Permissions:
// Tool to audit RBAC permissions
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func auditRBAC(clientset *kubernetes.Clientset) error {
	// List all ClusterRoleBindings
	bindings, err := clientset.RbacV1().ClusterRoleBindings().List(
		context.TODO(),
		metav1.ListOptions{},
	)
	if err != nil {
		return err
	}
	for _, binding := range bindings.Items {
		// Check for cluster-admin bindings
		if binding.RoleRef.Name == "cluster-admin" {
			fmt.Printf("WARNING: cluster-admin binding found: %s\n", binding.Name)
			for _, subject := range binding.Subjects {
				fmt.Printf("  Subject: %s/%s\n", subject.Kind, subject.Name)
			}
		}
		// Flag bindings to the all-powerful system:masters group
		if binding.RoleRef.Name == "system:masters" {
			fmt.Printf("CRITICAL: system:masters binding found: %s\n", binding.Name)
		}
	}
	return nil
}
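To run the audit from a workstation, a small main can build the clientset from a kubeconfig. A minimal sketch that lives in the same package as auditRBAC above (the kubeconfig path is an assumption):
package main

import (
	"log"
	"os"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes the default kubeconfig location; adjust for your environment
	home, err := os.UserHomeDir()
	if err != nil {
		log.Fatal(err)
	}
	config, err := clientcmd.BuildConfigFromFlags("", filepath.Join(home, ".kube", "config"))
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}
	if err := auditRBAC(clientset); err != nil {
		log.Fatal(err)
	}
}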
Network Policy: Zero Trust Networking
Default Kubernetes allows all pod-to-pod communication. Implement network segmentation:
Default Deny All:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
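Network policies are per-namespace, so default deny has to be rolled out everywhere. One way to do that (assuming the policy is saved as default-deny.yaml with the metadata.namespace field removed, so -n can set it):
# Apply the default-deny policy to every namespace except kube-system
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  [ "$ns" = "kube-system" ] && continue
  kubectl apply -n "$ns" -f default-deny.yaml
done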
Allow Specific Communication:
# Frontend can talk to API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Egress
  egress:
  # Allow to API service
  - to:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - protocol: TCP
      port: 8080
  # Allow DNS (namespaceSelector and podSelector combined in one entry,
  # so only kube-dns pods in kube-system match)
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
---
# API can talk to database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  # Allow DNS
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
Test Network Policies:
# Deploy a test pod in the namespace the policies apply to
kubectl run test -n production --rm -it --image=alpine -- sh
# From inside the pod, try to reach different services
wget -O- http://frontend-service:80   # Should fail if not allowed
wget -O- http://api-service:8080      # Test based on policy
Pod Security Policies
PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25; on current clusters, enforce the same standards with Pod Security Admission (see the sketch after these manifests). For clusters still running PSP:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
  - ALL
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: restricted-psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['restricted']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: restricted-psp-all-serviceaccounts
roleRef:
  kind: ClusterRole
  name: restricted-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io
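On Kubernetes 1.25+, the same posture comes from the built-in Pod Security Admission controller, driven by namespace labels rather than cluster-scoped resources:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Enforce the "restricted" Pod Security Standard; also warn and audit
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted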
Admission Controllers
Enable and configure admission controllers:
# API server flags
--enable-admission-plugins=\
NodeRestriction,\
PodSecurityPolicy,\
LimitRanger,\
ResourceQuota,\
ServiceAccount,\
DefaultStorageClass,\
MutatingAdmissionWebhook,\
ValidatingAdmissionWebhook
Custom Validation Webhook:
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"

	admissionv1 "k8s.io/api/admission/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func validatePod(ar *admissionv1.AdmissionRequest) *admissionv1.AdmissionResponse {
	var pod corev1.Pod
	if err := json.Unmarshal(ar.Object.Raw, &pod); err != nil {
		return &admissionv1.AdmissionResponse{
			UID:     ar.UID,
			Allowed: false,
			Result: &metav1.Status{
				Message: err.Error(),
			},
		}
	}
	// Validation rules
	var violations []string
	for _, container := range pod.Spec.Containers {
		// Reject the mutable :latest tag (an untagged image also resolves to :latest)
		if container.Image == "" || strings.HasSuffix(container.Image, ":latest") {
			violations = append(violations,
				fmt.Sprintf("Container %s uses :latest tag", container.Name))
		}
		// Check for resource limits
		if container.Resources.Limits.Cpu().IsZero() {
			violations = append(violations,
				fmt.Sprintf("Container %s missing CPU limit", container.Name))
		}
		// Check running as root
		if container.SecurityContext == nil ||
			container.SecurityContext.RunAsNonRoot == nil ||
			!*container.SecurityContext.RunAsNonRoot {
			violations = append(violations,
				fmt.Sprintf("Container %s must run as non-root", container.Name))
		}
	}
	if len(violations) > 0 {
		return &admissionv1.AdmissionResponse{
			UID:     ar.UID,
			Allowed: false,
			Result: &metav1.Status{
				Message: fmt.Sprintf("Pod validation failed: %v", violations),
			},
		}
	}
	return &admissionv1.AdmissionResponse{
		UID:     ar.UID,
		Allowed: true,
	}
}

func handleAdmission(w http.ResponseWriter, r *http.Request) {
	var admissionReview admissionv1.AdmissionReview
	if err := json.NewDecoder(r.Body).Decode(&admissionReview); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	if admissionReview.Request == nil {
		http.Error(w, "empty admission request", http.StatusBadRequest)
		return
	}
	admissionReview.Response = validatePod(admissionReview.Request)
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(admissionReview)
}
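For the API server to call this handler, the webhook has to be registered. A hedged sketch of the registration, assuming the handler is served at /validate behind a Service named pod-policy-webhook in a webhooks namespace (all names are illustrative):
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy-webhook
webhooks:
- name: pod-policy.example.com           # hypothetical webhook name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail                    # reject pods if the webhook is unreachable
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  clientConfig:
    service:
      name: pod-policy-webhook           # hypothetical Service fronting the handler
      namespace: webhooks
      path: /validate
    caBundle: <BASE64_CA_CERT>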
Secrets Encryption at Rest
Encrypt secrets in etcd:
# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <BASE64_ENCODED_32_BYTE_KEY>
  - identity: {}
# Generate encryption key
head -c 32 /dev/urandom | base64
# Configure API server
--encryption-provider-config=/etc/kubernetes/encryption-config.yaml
# Re-encrypt existing secrets
kubectl get secrets --all-namespaces -o json | kubectl replace -f -
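To confirm the rewrite worked, read a secret straight from etcd; encrypted values start with a k8s:enc: prefix instead of plaintext. Paths, certificate locations, and the secret name below assume a kubeadm layout and are illustrative:
# Read a secret directly from etcd; the value should start with "k8s:enc:aescbc:v1:"
ETCDCTL_API=3 etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/secrets/default/my-secret | hexdump -C | head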
API Server Security
Harden API server access:
# API server flags
--anonymous-auth=false
--insecure-port=0   # no-op since v1.20; the flag itself was removed in v1.24
--authorization-mode=Node,RBAC
--enable-admission-plugins=...
--audit-log-path=/var/log/kubernetes/audit.log
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--tls-cert-file=/etc/kubernetes/pki/apiserver.crt
--tls-private-key-file=/etc/kubernetes/pki/apiserver.key
--client-ca-file=/etc/kubernetes/pki/ca.crt
Audit Policy:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log admin actions in full
- level: RequestResponse
  users: ["kubernetes-admin"]
# Log secret access at the metadata level (never log secret payloads)
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# Log all write operations
- level: Metadata
  omitStages:
  - RequestReceived
  verbs: ["create", "update", "patch", "delete"]
# Don't log read-only requests
- level: None
  verbs: ["get", "list", "watch"]
Kubelet Security
Harden kubelet configuration:
# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
readOnlyPort: 0
tlsCertFile: /var/lib/kubelet/pki/kubelet.crt
tlsPrivateKeyFile: /var/lib/kubelet/pki/kubelet.key
rotateCertificates: true
serverTLSBootstrap: true
protectKernelDefaults: true
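A quick way to verify the settings took effect is to probe the kubelet API anonymously; with anonymous auth disabled it should refuse the request (the node address is illustrative):
# Anonymous requests to the kubelet API should now be rejected
curl -sk https://10.0.0.5:10250/pods    # expect: 401 Unauthorized
# The read-only port should be closed entirely
curl -s http://10.0.0.5:10255/pods      # expect: connection refused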
Runtime Security with Falco
Deploy Falco for runtime threat detection:
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-rules
  namespace: falco
data:
  custom_rules.yaml: |
    # Define the list referenced by the rule below; adjust to your operators
    - list: allowed_users
      items: [deploy-bot]   # hypothetical allowed user

    - rule: Unauthorized Container Access
      desc: Detect exec into containers
      condition: >
        spawned_process and
        container and
        proc.name in (bash, sh) and
        not user.name in (allowed_users)
      output: >
        Unauthorized access detected
        (user=%user.name command=%proc.cmdline container=%container.name)
      priority: WARNING

    - rule: Sensitive File Access
      desc: Detect access to sensitive files
      condition: >
        open_read and
        container and
        fd.name in (/etc/shadow, /etc/passwd, /root/.ssh/id_rsa)
      output: >
        Sensitive file accessed
        (file=%fd.name container=%container.name)
      priority: CRITICAL
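One common way to deploy Falco is the official Helm chart; release and namespace names below are illustrative, and the chart can also take custom rules inline through its customRules value instead of a hand-made ConfigMap:
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco --namespace falco --create-namespace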
Certificate Management
Automate certificate rotation:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls
  namespace: production
spec:
  secretName: api-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - api.example.com
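Issuance is asynchronous; check progress and diagnose stalls with:
# Watch the certificate until READY becomes True
kubectl get certificate api-tls -n production -w
# If it stalls, the events usually explain why
kubectl describe certificate api-tls -n production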
Security Scanning
Automated security scanning:
# GitHub Actions
name: Security Scan
on: [push, pull_request]
jobs:
  scan-manifests:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: Run Kubesec
      run: |
        # Scan each manifest individually; the host shell cannot expand
        # a glob against paths inside the container
        for f in k8s/*.yaml; do
          docker run --rm -v $(pwd):/work kubesec/kubesec:v2 scan "/work/$f"
        done
    - name: Run Trivy
      uses: aquasecurity/trivy-action@master
      with:
        scan-type: 'config'
        scan-ref: '.'
        format: 'sarif'
        output: 'trivy-results.sarif'
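Manifests aside, the checklist below also calls for image scanning in CI/CD. A minimal gate with Trivy (the image name is a placeholder):
# Fail the build if the image carries HIGH or CRITICAL vulnerabilities
trivy image --severity HIGH,CRITICAL --exit-code 1 registry.example.com/myapp:1.2.3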
Hardening Checklist
- RBAC enabled with least privilege
- Default deny network policies
- Pod security standards enforced (Pod Security Admission, or PSP on older clusters)
- Admission controllers enabled
- Secrets encrypted at rest
- API server authentication required
- Audit logging enabled
- Kubelet authentication required
- Image scanning in CI/CD
- Runtime security monitoring
- Certificate rotation automated
- Security context configured for all pods
- Resource limits set
- Read-only root filesystem where possible
- Regular security audits
Conclusion
Kubernetes security requires defense in depth. No single measure is sufficient. Layer multiple controls:
- Access Control: RBAC, authentication, authorization
- Network Security: Network policies, service mesh
- Pod Security: pod security standards, security contexts, admission control
- Data Security: Encryption at rest and in transit
- Runtime Security: Monitoring and threat detection
- Audit: Comprehensive logging and alerting
Start with the basics (RBAC, network policies) and progressively add more sophisticated controls. Test each measure in non-production before enforcing in production. Security is a journey, not a destination—continue hardening as new threats emerge and best practices evolve.