Running containers in production is very different from running them on your laptop. I’ve spent the last year operating containerized workloads at scale, and the learning curve has been steep. Container orchestration platforms like Kubernetes provide powerful primitives, but using them effectively requires understanding both the platform and distributed systems principles.
Today, I want to share the hard-won lessons I’ve learned about running production container workloads—the practices that make the difference between a stable system and one that keeps you up at night.
Resource Management
The most common mistake I see: not setting resource requests and limits properly.
Resource Requests and Limits
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: order-service
        image: order-service:v1.2.0
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
Requests: What the container is guaranteed; the scheduler uses requests to decide where pods fit.
Limits: The maximum the container can use. Exceeding the memory limit gets the container OOM-killed; exceeding the CPU limit only throttles it.
How to set them:
- Run load tests and monitor resource usage
- Set requests to typical usage
- Set limits to peak usage with some headroom
- Monitor and adjust based on actual usage
// Example: Monitor and recommend resource settings
type ResourceMonitor struct {
    metricsClient MetricsClient
}

func (m *ResourceMonitor) AnalyzeUsage(namespace, deployment string, days int) (*ResourceRecommendation, error) {
    // Query actual usage over time
    cpuUsage, err := m.metricsClient.QueryRange(fmt.Sprintf(
        `avg(rate(container_cpu_usage_seconds_total{namespace="%s",pod=~"%s-.*"}[5m]))`,
        namespace, deployment,
    ), time.Now().Add(-time.Duration(days)*24*time.Hour), time.Now())
    if err != nil {
        return nil, err
    }

    memUsage, err := m.metricsClient.QueryRange(fmt.Sprintf(
        `avg(container_memory_working_set_bytes{namespace="%s",pod=~"%s-.*"})`,
        namespace, deployment,
    ), time.Now().Add(-time.Duration(days)*24*time.Hour), time.Now())
    if err != nil {
        return nil, err
    }

    // Calculate usage percentiles over the observation window
    cpuP50 := percentile(cpuUsage, 0.50)
    cpuP95 := percentile(cpuUsage, 0.95)
    memP50 := percentile(memUsage, 0.50)
    memP95 := percentile(memUsage, 0.95)

    return &ResourceRecommendation{
        CPURequest:    fmt.Sprintf("%.0fm", cpuP50*1000),
        CPULimit:      fmt.Sprintf("%.0fm", cpuP95*1.2*1000), // 20% headroom
        MemoryRequest: fmt.Sprintf("%.0fMi", memP50/(1024*1024)),
        MemoryLimit:   fmt.Sprintf("%.0fMi", memP95*1.2/(1024*1024)),
    }, nil
}
Quality of Service Classes
Kubernetes assigns QoS classes based on resource configuration:
Guaranteed: Requests == Limits for all containers. Highest priority.
Burstable: Requests < Limits or only requests set. Medium priority.
BestEffort: No requests or limits. Lowest priority, killed first under pressure.
For production workloads, use Guaranteed or Burstable. Never BestEffort.
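For example, setting requests equal to limits for every container in the pod yields the Guaranteed class; a minimal tweak to the earlier spec:
      containers:
      - name: order-service
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "512Mi"
            cpu: "500m"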
Health Checks
Kubernetes needs to know if your application is healthy.
Liveness Probes
Restart containers that are deadlocked or hung:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
Implement a health endpoint (this example checks dependencies, so treat it as a general status endpoint; keep the liveness path itself simpler, per the note below):
// HealthStatus is the response body returned by the health endpoint.
type HealthStatus struct {
    Status string            `json:"status"`
    Checks map[string]string `json:"checks"`
}

type HealthChecker struct {
    db    *sql.DB
    cache *redis.Client
}

func (h *HealthChecker) Check(w http.ResponseWriter, r *http.Request) {
    ctx, cancel := context.WithTimeout(r.Context(), 3*time.Second)
    defer cancel()

    status := &HealthStatus{
        Status: "healthy",
        Checks: make(map[string]string),
    }

    // Check database connection
    if err := h.db.PingContext(ctx); err != nil {
        status.Status = "unhealthy"
        status.Checks["database"] = fmt.Sprintf("error: %v", err)
    } else {
        status.Checks["database"] = "ok"
    }

    // Check cache connection
    if err := h.cache.Ping(ctx).Err(); err != nil {
        status.Status = "unhealthy"
        status.Checks["cache"] = fmt.Sprintf("error: %v", err)
    } else {
        status.Checks["cache"] = "ok"
    }

    if status.Status == "unhealthy" {
        w.WriteHeader(http.StatusServiceUnavailable)
    } else {
        w.WriteHeader(http.StatusOK)
    }
    json.NewEncoder(w).Encode(status)
}
Important: Liveness probes should check internal health, not dependency health. If your database is down, you don’t want all your pods restarting—that makes things worse.
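A liveness handler can therefore stay almost trivial. A minimal sketch, assuming the worker loop updates a heartbeat timestamp (lastHeartbeat is illustrative, not part of the code above):
// lastHeartbeat is updated by the main worker loop; illustrative only.
var lastHeartbeat atomic.Int64 // unix seconds

func liveHandler(w http.ResponseWriter, r *http.Request) {
    // Only check that the process itself is making progress,
    // never external dependencies.
    if time.Since(time.Unix(lastHeartbeat.Load(), 0)) > 2*time.Minute {
        w.WriteHeader(http.StatusServiceUnavailable) // let Kubernetes restart us
        return
    }
    w.WriteHeader(http.StatusOK)
}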
Readiness Probes
Remove pods from service when they can’t handle traffic:
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 2
Unlike liveness, readiness should check dependencies:
func (h *HealthChecker) Ready(w http.ResponseWriter, r *http.Request) {
    ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
    defer cancel()

    // Check if we can reach critical dependencies
    if err := h.db.PingContext(ctx); err != nil {
        w.WriteHeader(http.StatusServiceUnavailable)
        json.NewEncoder(w).Encode(map[string]string{
            "status": "not ready",
            "reason": "database unavailable",
        })
        return
    }

    // Check if local caches are warmed up
    if !h.isCacheWarmed() {
        w.WriteHeader(http.StatusServiceUnavailable)
        json.NewEncoder(w).Encode(map[string]string{
            "status": "not ready",
            "reason": "cache warming",
        })
        return
    }

    w.WriteHeader(http.StatusOK)
    json.NewEncoder(w).Encode(map[string]string{"status": "ready"})
}
Graceful Shutdown
Handle SIGTERM properly to avoid dropping requests during deployment:
func main() {
    server := &http.Server{
        Addr:    ":8080",
        Handler: router,
    }

    // Start server in goroutine
    go func() {
        if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            log.Fatalf("Server failed: %v", err)
        }
    }()

    // Wait for interrupt signal
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGTERM, syscall.SIGINT)
    <-quit

    log.Println("Shutting down server...")

    // Give outstanding requests time to complete
    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    if err := server.Shutdown(ctx); err != nil {
        log.Printf("Server forced to shutdown: %v", err)
    }

    log.Println("Server exited")
}
Configure Kubernetes to wait during termination:
spec:
  containers:
  - name: order-service
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 15"]
  terminationGracePeriodSeconds: 60
The sleep gives time for:
- Load balancer to remove pod from rotation
- In-flight requests to complete
- Service to clean up resources
Configuration Management
ConfigMaps for Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
data:
  app.properties: |
    database.host=postgres.default.svc.cluster.local
    database.port=5432
    cache.ttl=300
    feature.newCheckout=true
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  template:
    spec:
      containers:
      - name: order-service
        volumeMounts:
        - name: config
          mountPath: /etc/config
      volumes:
      - name: config
        configMap:
          name: order-service-config
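Inside the container the ConfigMap appears as plain files under /etc/config. A minimal sketch of reading the properties file (the parsing is deliberately simple; loadProperties is not part of the manifests above):
import (
    "os"
    "strings"
)

// loadProperties reads a key=value file such as /etc/config/app.properties.
func loadProperties(path string) (map[string]string, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    props := make(map[string]string)
    for _, line := range strings.Split(string(data), "\n") {
        line = strings.TrimSpace(line)
        if line == "" || strings.HasPrefix(line, "#") {
            continue // skip blanks and comments
        }
        if key, value, ok := strings.Cut(line, "="); ok {
            props[key] = value
        }
    }
    return props, nil
}
One nice property of file mounts: when the ConfigMap changes, the kubelet eventually updates the mounted files (unless subPath is used), so the application can re-read configuration without a restart.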
Secrets for Sensitive Data
apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
type: Opaque
stringData:
  database-password: "super-secret-password"
  api-key: "abc123def456"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  template:
    spec:
      containers:
      - name: order-service
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: order-service-secrets
              key: database-password
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: order-service-secrets
              key: api-key
Never commit secrets to Git. Use external secret management:
# External Secrets example
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: order-service-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: order-service-secrets
  data:
  - secretKey: database-password
    remoteRef:
      key: secret/order-service/database
      property: password
Deployment Strategies
Rolling Updates
Default strategy—replace pods gradually:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # how many extra pods during the update
      maxUnavailable: 0     # how many pods can be down
With maxUnavailable: 0, old pods are only removed once their replacements report ready, which is what makes zero-downtime deployments possible (provided your readiness probes are accurate).
Canary Deployments
Deploy to a small subset first:
# Stable version - 90% of traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-stable
  labels:
    version: stable
spec:
  replicas: 9
  template:
    metadata:
      labels:
        app: order-service
        version: stable
---
# Canary version - 10% of traffic
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-canary
  labels:
    version: canary
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: order-service
        version: canary
    spec:
      containers:
      - name: order-service
        image: order-service:v1.3.0-canary
Monitor canary metrics. If good, gradually shift traffic.
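The rough 90/10 split comes from a single Service that selects both Deployments by the shared app label, so requests are distributed roughly in proportion to replica counts; a sketch of that Service:
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  selector:
    app: order-service    # matches both stable and canary pods
  ports:
  - port: 80
    targetPort: 8080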
Autoscaling
Horizontal Pod Autoscaler
Scale based on metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
Key configurations:
- stabilizationWindowSeconds: Wait before scaling to avoid flapping
- scaleUp policies: How aggressively to scale up (fast is good for traffic spikes)
- scaleDown policies: How aggressively to scale down (slow prevents premature scale-down)
Custom Metrics
Scale based on application metrics:
metrics:
- type: Pods
  pods:
    metric:
      name: http_requests_per_second
    target:
      type: AverageValue
      averageValue: "1000"
Expose the metric from your application; the HPA can only consume it through a metrics adapter (for example, the Prometheus Adapter):
func (m *MetricsHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // Expose metrics in Prometheus text format
    metrics := []string{
        fmt.Sprintf("http_requests_per_second %f", m.getRequestRate()),
        fmt.Sprintf("queue_depth %d", m.getQueueDepth()),
        fmt.Sprintf("active_connections %d", m.getActiveConnections()),
    }

    w.Header().Set("Content-Type", "text/plain")
    for _, metric := range metrics {
        fmt.Fprintln(w, metric)
    }
}
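Hand-rolling the text format works for illustration, but in practice I reach for the Prometheus client library, which handles HELP/TYPE lines and escaping. A sketch, assuming the same accessor methods as above:
import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func registerMetrics(m *MetricsHandler) {
    // GaugeFunc evaluates the callback on every scrape.
    prometheus.MustRegister(prometheus.NewGaugeFunc(
        prometheus.GaugeOpts{Name: "http_requests_per_second"},
        func() float64 { return m.getRequestRate() },
    ))
    prometheus.MustRegister(prometheus.NewGaugeFunc(
        prometheus.GaugeOpts{Name: "queue_depth"},
        func() float64 { return float64(m.getQueueDepth()) },
    ))

    // promhttp serves /metrics with the full exposition format.
    http.Handle("/metrics", promhttp.Handler())
}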
Pod Disruption Budgets
Ensure availability during maintenance:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: order-service-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: order-service
This prevents Kubernetes from evicting too many pods during node maintenance or upgrades.
Resource Quotas and Limit Ranges
Prevent resource exhaustion:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    limits.cpu: "200"
    limits.memory: 400Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
  - max:
      cpu: "4"
      memory: 8Gi
    min:
      cpu: "100m"
      memory: 128Mi
    default:
      cpu: "500m"
      memory: 512Mi
    defaultRequest:
      cpu: "250m"
      memory: 256Mi
    type: Container
Networking
Network Policies
Control traffic between pods:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: order-service-netpol
spec:
  podSelector:
    matchLabels:
      app: order-service
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api-gateway
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - podSelector:
        matchLabels:
          app: redis
    ports:
    - protocol: TCP
      port: 6379
This implements zero-trust networking at the pod level.
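One gotcha: with egress locked down like this, DNS is blocked too, so the pod can no longer resolve postgres or redis by name. A common fix is an extra egress rule for cluster DNS; a sketch, assuming the standard kube-dns labels in kube-system:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53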
Service Mesh Integration
For advanced traffic management, observability, and security, integrate a service mesh:
apiVersion: v1
kind: Service
metadata:
  name: order-service
  annotations:
    service.beta.kubernetes.io/inject-proxy: "true"
spec:
  selector:
    app: order-service
  ports:
  - port: 80
    targetPort: 8080
The service mesh provides:
- Automatic mTLS between services
- Traffic splitting for canaries
- Advanced routing rules
- Distributed tracing
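For example, with Istio (one popular mesh) the canary split from earlier becomes an explicit weight instead of a replica ratio; a sketch, assuming stable and canary subsets are defined in a DestinationRule:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
  - order-service
  http:
  - route:
    - destination:
        host: order-service
        subset: stable
      weight: 90
    - destination:
        host: order-service
        subset: canary
      weight: 10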
Logging and Monitoring
Structured Logging
var logger = log.New(os.Stdout, "", 0)

func logStructured(level, message string, fields map[string]interface{}) {
    entry := map[string]interface{}{
        "timestamp": time.Now().UTC().Format(time.RFC3339),
        "level":     level,
        "message":   message,
        "pod":       os.Getenv("HOSTNAME"),
        "namespace": os.Getenv("POD_NAMESPACE"),
    }
    for k, v := range fields {
        entry[k] = v
    }
    data, _ := json.Marshal(entry)
    logger.Println(string(data))
}
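Called like this, every line written to stdout is a single JSON object, which is exactly what the collector below ships (values are illustrative):
logStructured("info", "order created", map[string]interface{}{
    "order_id":    "ord-8472",
    "customer_id": "cus-1193",
    "duration_ms": 42,
})
// {"timestamp":"2024-01-01T00:00:00Z","level":"info","message":"order created","order_id":"ord-8472",...}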
Centralized Logging
Ship logs to a central system:
# Fluentd DaemonSet for log collection
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:v1.14
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers
Best Practices Summary
Resource management: Always set requests and limits. Use monitoring to tune them.
Health checks: Implement proper liveness and readiness probes. Test them.
Graceful shutdown: Handle SIGTERM and drain connections before exiting.
Configuration: Use ConfigMaps and Secrets. Never hardcode configuration.
Deployments: Use rolling updates. Implement canary deployments for risky changes.
Autoscaling: Configure HPA for variable load. Set conservative scale-down policies.
Resilience: Use Pod Disruption Budgets to maintain availability.
Security: Implement Network Policies. Use service mesh for zero-trust networking.
Observability: Structured logging, metrics, and distributed tracing are essential.
Looking Forward
Container orchestration is maturing rapidly. Kubernetes has won the orchestration wars, and the ecosystem is building powerful abstractions on top of it. Service meshes are becoming standard. GitOps is automating deployments. Serverless containers are emerging.
But the fundamentals remain: understand resource management, implement proper health checks, handle failures gracefully, and maintain comprehensive observability.
The practices I’ve shared come from real production experience—late-night incidents, post-mortems, and gradual improvements. Start with these patterns, adapt them to your needs, and keep learning from your incidents.
Your production workloads will thank you.