Default Kubernetes configurations prioritize ease of use over security. While this helps with initial adoption, production clusters require significant hardening. After securing clusters handling sensitive workloads and passing multiple security audits, I’ve learned which hardening measures provide the most value and how to implement them without breaking functionality.

The Security Challenge

A default Kubernetes installation has numerous security gaps:

  • Overly permissive RBAC defaults
  • No network segmentation
  • Containers running as root
  • No admission control
  • Secrets stored unencrypted in etcd
  • Wide-open API server access
  • Minimal audit logging

Let’s address each systematically.

RBAC: Principle of Least Privilege

Kubernetes includes RBAC (Role-Based Access Control), but default roles are often too permissive.

Disable Anonymous Access:

# API server flags
--anonymous-auth=false

Audit Default Service Accounts:

# Check what default service account can do
kubectl auth can-i --list --as=system:serviceaccount:default:default

# Should show minimal permissions
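
If no workloads in a namespace actually use the default service account, you can also stop automounting its API token so a compromised pod does not get cluster credentials for free. A minimal sketch (the namespace is just an example):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: production
automountServiceAccountToken: false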

Create Purpose-Specific Service Accounts:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-reader
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-configmaps
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-reader-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: app-reader
    namespace: production
roleRef:
  kind: Role
  name: read-configmaps
  apiGroup: rbac.authorization.k8s.io
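
A workload then opts into this narrow permission set by naming the service account in its pod template. An illustrative snippet (the deployment name and image are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: production
spec:
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      # Use the purpose-specific account instead of the namespace default
      serviceAccountName: app-reader
      containers:
        - name: app
          image: registry.example.com/app:1.0.0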

Audit RBAC Permissions:

// Tool to audit RBAC permissions
package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

func auditRBAC(clientset *kubernetes.Clientset) error {
    // List all ClusterRoleBindings
    bindings, err := clientset.RbacV1().ClusterRoleBindings().List(
        context.TODO(),
        metav1.ListOptions{},
    )
    if err != nil {
        return err
    }

    for _, binding := range bindings.Items {
        // Check for cluster-admin bindings
        if binding.RoleRef.Name == "cluster-admin" {
            fmt.Printf("WARNING: cluster-admin binding found: %s\n", binding.Name)
            for _, subject := range binding.Subjects {
                fmt.Printf("  Subject: %s/%s\n", subject.Kind, subject.Name)
            }
        }

        // Flag bindings that include the system:masters group, whose
        // members bypass RBAC authorization entirely
        for _, subject := range binding.Subjects {
            if subject.Kind == "Group" && subject.Name == "system:masters" {
                fmt.Printf("CRITICAL: system:masters subject in binding: %s\n", binding.Name)
            }
        }
    }

    return nil
}

Network Policy: Zero Trust Networking

By default, Kubernetes allows all pod-to-pod communication. Implement network segmentation with NetworkPolicy resources; note that they are only enforced if the cluster's CNI plugin supports them (Calico and Cilium do, for example):

Default Deny All:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

Allow Specific Communication:

# Frontend can talk to API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Egress
  egress:
    # Allow to API service
    - to:
        - podSelector:
            matchLabels:
              app: api
      ports:
        - protocol: TCP
          port: 8080
    # Allow DNS
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
        - podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
---
# API can talk to database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
        - podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53

Test Network Policies:

# Deploy a throwaway test pod with an interactive shell
kubectl run test --rm -it --image=alpine -- sh

# From inside the pod shell, try to reach other services
wget -O- http://frontend-service:80  # Should fail unless a policy allows it
wget -O- http://api-service:8080     # Test based on policy

Pod Security Policies

Enforce pod security standards. Note that PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25; on newer clusters, use the built-in Pod Security Admission controller instead (a sketch follows the manifests below). For clusters that still support PSP:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: restricted-psp
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['restricted']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: restricted-psp-all-serviceaccounts
roleRef:
  kind: ClusterRole
  name: restricted-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: Group
    name: system:serviceaccounts
    apiGroup: rbac.authorization.k8s.io
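
On Kubernetes 1.25 and later, where PodSecurityPolicy no longer exists, the equivalent controls come from the built-in Pod Security Admission controller, which enforces the predefined Pod Security Standards per namespace via labels. A minimal sketch applying the restricted profile:

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted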

Admission Controllers

Enable and configure admission controllers (on clusters where PodSecurityPolicy has been removed, drop that plugin from the list):

# API server flags
--enable-admission-plugins=\
NodeRestriction,\
PodSecurityPolicy,\
LimitRanger,\
ResourceQuota,\
ServiceAccount,\
DefaultStorageClass,\
MutatingAdmissionWebhook,\
ValidatingAdmissionWebhook
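
LimitRanger and ResourceQuota only act on limits you actually define per namespace. An illustrative quota and set of default container limits for the production namespace (the numbers are placeholders, not recommendations):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      # Applied to containers that don't set their own requests/limits
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi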

Custom Validation Webhook:

package main

import (
    "encoding/json"
    "fmt"
    "net/http"

    admissionv1 "k8s.io/api/admission/v1"
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func validatePod(ar *admissionv1.AdmissionRequest) *admissionv1.AdmissionResponse {
    var pod corev1.Pod
    if err := json.Unmarshal(ar.Object.Raw, &pod); err != nil {
        return &admissionv1.AdmissionResponse{
            UID:     ar.UID,
            Allowed: false,
            Result: &metav1.Status{
                Message: err.Error(),
            },
        }
    }

    // Validation rules
    var violations []string

    // Check for mutable image tags: an explicit :latest tag or no tag at all
    // (which defaults to latest); registry ports and digests are not handled
    // in this simplified check
    for _, container := range pod.Spec.Containers {
        if strings.HasSuffix(container.Image, ":latest") ||
            !strings.Contains(container.Image, ":") {
            violations = append(violations,
                fmt.Sprintf("Container %s uses a mutable image tag", container.Name))
        }

        // Check for resource limits
        if container.Resources.Limits.Cpu().IsZero() {
            violations = append(violations,
                fmt.Sprintf("Container %s missing CPU limit", container.Name))
        }

        // Check running as root
        if container.SecurityContext == nil ||
           container.SecurityContext.RunAsNonRoot == nil ||
           !*container.SecurityContext.RunAsNonRoot {
            violations = append(violations,
                fmt.Sprintf("Container %s must run as non-root", container.Name))
        }
    }

    if len(violations) > 0 {
        return &admissionv1.AdmissionResponse{
            UID:     ar.UID,
            Allowed: false,
            Result: &metav1.Status{
                Message: fmt.Sprintf("Pod validation failed: %v", violations),
            },
        }
    }

    return &admissionv1.AdmissionResponse{
        UID:     ar.UID,
        Allowed: true,
    }
}

func handleAdmission(w http.ResponseWriter, r *http.Request) {
    var admissionReview admissionv1.AdmissionReview

    if err := json.NewDecoder(r.Body).Decode(&admissionReview); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }

    if admissionReview.Request == nil {
        http.Error(w, "empty admission request", http.StatusBadRequest)
        return
    }

    admissionReview.Response = validatePod(admissionReview.Request)
    admissionReview.Response.UID = admissionReview.Request.UID

    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(admissionReview)
}
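
The handler still has to be registered with the API server through a ValidatingWebhookConfiguration. A sketch of that registration, assuming the handler is served over TLS by a Service named pod-policy-webhook in a webhooks namespace on the /validate path (all of these names are assumptions):

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy
webhooks:
  - name: pod-policy.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail
    clientConfig:
      service:
        name: pod-policy-webhook    # assumed Service fronting the handler above
        namespace: webhooks         # assumed namespace
        path: /validate             # assumed path the handler listens on
      caBundle: <BASE64_CA_CERT>
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]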

Secrets Encryption at Rest

Encrypt secrets in etcd:

# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <BASE64_ENCODED_32_BYTE_KEY>
      - identity: {}

# Generate encryption key
head -c 32 /dev/urandom | base64

# Configure API server
--encryption-provider-config=/etc/kubernetes/encryption-config.yaml

# Re-encrypt existing secrets
kubectl get secrets --all-namespaces -o json | kubectl replace -f -

API Server Security

Harden API server access:

# API server flags
--anonymous-auth=false
--insecure-port=0
--authorization-mode=Node,RBAC
--enable-admission-plugins=...
--audit-log-path=/var/log/kubernetes/audit.log
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--tls-cert-file=/etc/kubernetes/pki/apiserver.crt
--tls-private-key-file=/etc/kubernetes/pki/apiserver.key
--client-ca-file=/etc/kubernetes/pki/ca.crt

Audit Policy:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log admin actions
  - level: RequestResponse
    users: ["kubernetes-admin"]

  # Log secret access
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # Log write operations (create, update, patch, delete) at the metadata level
  - level: Metadata
    omitStages:
      - RequestReceived
    verbs: ["create", "update", "patch", "delete"]

  # Don't log read-only requests
  - level: None
    verbs: ["get", "list", "watch"]

Kubelet Security

Harden kubelet configuration:

# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
readOnlyPort: 0
tlsCertFile: /var/lib/kubelet/pki/kubelet.crt
tlsPrivateKeyFile: /var/lib/kubelet/pki/kubelet.key
rotateCertificates: true
serverTLSBootstrap: true
protectKernelDefaults: true

Runtime Security with Falco

Deploy Falco for runtime threat detection:

apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-rules
  namespace: falco
data:
  custom_rules.yaml: |
    # Placeholder list of users allowed to open shells in containers;
    # adjust for your environment
    - list: allowed_users
      items: [break-glass-admin]

    - rule: Unauthorized Container Access
      desc: Detect exec into containers
      condition: >
        spawned_process and
        container and
        proc.name in (bash, sh) and
        not user.name in (allowed_users)
      output: >
        Unauthorized access detected
        (user=%user.name command=%proc.cmdline container=%container.name)
      priority: WARNING

    - rule: Sensitive File Access
      desc: Detect access to sensitive files
      condition: >
        open_read and
        container and
        fd.name in (/etc/shadow, /etc/passwd, /root/.ssh/id_rsa)
      output: >
        Sensitive file accessed
        (file=%fd.name container=%container.name)
      priority: CRITICAL

Certificate Management

Automate certificate rotation:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls
  namespace: production
spec:
  secretName: api-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - api.example.com
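
The issued certificate lands in the api-tls secret, which an Ingress can reference for TLS termination. A sketch, assuming an NGINX ingress controller and a backing Service named api on port 8080 (both assumptions):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api
  namespace: production
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api        # assumed Service name
                port:
                  number: 8080   # assumed Service port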

Security Scanning

Automated security scanning:

# GitHub Actions
name: Security Scan
on: [push, pull_request]

jobs:
  scan-manifests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Run Kubesec
        run: |
          # Expand the glob on the runner, not inside the container
          for f in k8s/*.yaml; do
            docker run --rm -v "$(pwd)":/work kubesec/kubesec:v2 scan "/work/$f"
          done

      - name: Run Trivy
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'config'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'

Hardening Checklist

  • RBAC enabled with least privilege
  • Default deny network policies
  • Pod Security Policies enforced
  • Admission controllers enabled
  • Secrets encrypted at rest
  • API server authentication required
  • Audit logging enabled
  • Kubelet authentication required
  • Image scanning in CI/CD
  • Runtime security monitoring
  • Certificate rotation automated
  • Security context configured for all pods (see the sketch after this list)
  • Resource limits set
  • Read-only root filesystem where possible
  • Regular security audits
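
For the security context item above, a restricted pod- and container-level configuration might look like the following sketch (names, image, and resource values are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.0.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 500m
          memory: 512Mi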

Conclusion

Kubernetes security requires defense in depth. No single measure is sufficient. Layer multiple controls:

  1. Access Control: RBAC, authentication, authorization
  2. Network Security: Network policies, service mesh
  3. Pod Security: Pod Security Standards (PSP or Pod Security Admission), security contexts, admission control
  4. Data Security: Encryption at rest and in transit
  5. Runtime Security: Monitoring and threat detection
  6. Audit: Comprehensive logging and alerting

Start with the basics (RBAC, network policies) and progressively add more sophisticated controls. Test each measure in non-production before enforcing in production. Security is a journey, not a destination—continue hardening as new threats emerge and best practices evolve.