Progressive delivery makes deployments safer by rolling changes out gradually and measuring their impact before they reach every user. After running canary deployments and feature flags in production, here is what I've learned about shipping with confidence.

Canary Deployments

Gradually shift traffic. Flagger raises the canary weight by stepWeight every interval up to maxWeight, and keeps progressing only while the metric checks pass; once failures exceed the threshold it rolls back automatically:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  progressDeadlineSeconds: 60
  service:
    port: 8080
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m

Feature Flags

Control features independently:

type FeatureFlags struct {
    store *redis.Client
}

func (ff *FeatureFlags) IsEnabled(feature, userID string) bool {
    // Check user-specific override; a Redis error or missing key means no override (fail closed)
    key := fmt.Sprintf("feature:%s:user:%s", feature, userID)
    if val, _ := ff.store.Get(key).Result(); val == "true" {
        return true
    }
    
    // Check percentage rollout (0-100); on error pct stays 0 and the feature stays off
    key = fmt.Sprintf("feature:%s:percentage", feature)
    pct, _ := ff.store.Get(key).Int()
    
    // Consistent hashing for user
    hash := crc32.ChecksumIEEE([]byte(userID))
    return int(hash%100) < pct
}
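
A quick usage sketch; the Handler type, the renderNewDashboard/renderLegacyDashboard methods, the new-dashboard flag name, and the X-User-ID header are illustrative assumptions, not part of the snippet above:

// Hypothetical handler branching on the flag; names are illustrative.
func (h *Handler) Dashboard(w http.ResponseWriter, r *http.Request) {
    userID := r.Header.Get("X-User-ID")
    if h.flags.IsEnabled("new-dashboard", userID) {
        h.renderNewDashboard(w, r) // new code path, behind the flag
        return
    }
    h.renderLegacyDashboard(w, r) // existing behavior
}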

Blue-Green Deployments

Switch traffic instantly:

# Blue (current)
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue

# Deploy green
# Test green
# Switch selector to green
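
The cutover itself is a one-line selector patch. A minimal sketch with client-go; the switchToGreen helper, the default namespace, and an already-configured clientset are assumptions rather than something shown above:

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/kubernetes"
)

// switchToGreen flips the Service selector from blue to green with a
// strategic merge patch; new connections are routed to the green pods.
func switchToGreen(ctx context.Context, clientset kubernetes.Interface) error {
    patch := []byte(`{"spec":{"selector":{"app":"myapp","version":"green"}}}`)
    _, err := clientset.CoreV1().Services("default").Patch(
        ctx, "myapp", types.StrategicMergePatchType, patch, metav1.PatchOptions{})
    return err
}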

Monitoring Rollouts

Track deployment health:

# Error rate during canary
sum(rate(http_requests_total{version="canary",code=~"5.."}[5m]))
/
sum(rate(http_requests_total{version="canary"}[5m]))

# Latency comparison
histogram_quantile(0.99,
  rate(http_request_duration_seconds_bucket{version="canary"}[5m]))
/
histogram_quantile(0.99,
  rate(http_request_duration_seconds_bucket{version="stable"}[5m]))

Rollback Strategy

Rollbacks are automatic: once the metric checks fail more than the configured threshold, Flagger aborts the analysis and shifts all traffic back to the primary. Webhooks supply the signal, running an acceptance test before any traffic shifts and a load test while the canary ramps up:

analysis:
  webhooks:
    - name: acceptance-test
      type: pre-rollout
      url: http://flagger-loadtester/
    - name: load-test
      url: http://flagger-loadtester/
      timeout: 5s
      metadata:
        type: cmd
        cmd: "hey -z 1m -q 10 -c 2 http://myapp-canary:8080/"

Traffic Mirroring

Test the new version against a copy of production traffic without affecting user-facing responses:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: api-service
spec:
  hosts:
    - api-service
  http:
    - match:
        - headers:
            x-test-version:
              exact: "v2"
      route:
        - destination:
            host: api-service
            subset: v2
          weight: 100

    - route:
        - destination:
            host: api-service
            subset: v1
          weight: 100
      mirror:
        host: api-service
        subset: v2
      mirrorPercentage:
        value: 10.0  # Mirror 10% of traffic to v2

Compare results:

type TrafficMirroring struct {
    primaryClient Client
    shadowClient  Client
    comparator    ResponseComparator
}

func (tm *TrafficMirroring) Call(ctx context.Context, req *Request) (*Response, error) {
    // Call primary version
    primaryResp, err := tm.primaryClient.Call(ctx, req)

    // Mirror to shadow version (async, don't block)
    go func() {
        shadowResp, shadowErr := tm.shadowClient.Call(context.Background(), req)

        // Compare responses
        diffs := tm.comparator.Compare(primaryResp, shadowResp)
        if len(diffs) > 0 {
            tm.logDifferences(req, diffs)
        }
    }()

    return primaryResp, err
}
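
The comparator can start out simple. A sketch assuming Response exposes a StatusCode int and a Body byte slice (neither field is defined above), using only bytes and fmt from the standard library:

type ResponseComparator struct{}

// Compare returns a human-readable list of differences between the
// primary and shadow responses; an empty slice means they matched.
func (ResponseComparator) Compare(primary, shadow *Response) []string {
    var diffs []string
    if primary == nil || shadow == nil {
        if primary != shadow {
            diffs = append(diffs, "one response is missing")
        }
        return diffs
    }
    if primary.StatusCode != shadow.StatusCode {
        diffs = append(diffs, fmt.Sprintf("status: %d vs %d", primary.StatusCode, shadow.StatusCode))
    }
    if !bytes.Equal(primary.Body, shadow.Body) {
        diffs = append(diffs, "body mismatch")
    }
    return diffs
}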

A/B Testing

Run controlled experiments:

type ABTest struct {
    name           string
    variantA       string
    variantB       string
    trafficSplit   float64  // Fraction of traffic sent to variant B (0.0-1.0)
    metrics        MetricsCollector
}

func (ab *ABTest) SelectVariant(userID string) string {
    // Consistent hashing for user
    hash := crc32.ChecksumIEEE([]byte(userID))
    pct := float64(hash%100) / 100.0

    if pct < ab.trafficSplit {
        ab.metrics.RecordAssignment(ab.name, ab.variantB)
        return ab.variantB
    }

    ab.metrics.RecordAssignment(ab.name, ab.variantA)
    return ab.variantA
}

// Middleware for A/B testing
func ABTestMiddleware(test *ABTest) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            userID := r.Header.Get("X-User-ID")
            variant := test.SelectVariant(userID)

            // Add variant to context
            ctx := context.WithValue(r.Context(), "variant", variant)

            // Add variant header for routing
            r.Header.Set("X-Variant", variant)

            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}

Statistical significance testing:

type ExperimentAnalysis struct {
    variantA *Metrics
    variantB *Metrics
}

type Metrics struct {
    Users       int
    Conversions int
    Revenue     float64
}

func (ea *ExperimentAnalysis) CalculateSignificance() (*SignificanceResult, error) {
    // Calculate conversion rates
    rateA := float64(ea.variantA.Conversions) / float64(ea.variantA.Users)
    rateB := float64(ea.variantB.Conversions) / float64(ea.variantB.Users)

    // Z-test for proportions
    pooledRate := float64(ea.variantA.Conversions+ea.variantB.Conversions) /
        float64(ea.variantA.Users+ea.variantB.Users)

    se := math.Sqrt(pooledRate * (1 - pooledRate) *
        (1/float64(ea.variantA.Users) + 1/float64(ea.variantB.Users)))

    zScore := (rateB - rateA) / se

    // P-value (two-tailed)
    pValue := 2 * (1 - normalCDF(math.Abs(zScore)))

    return &SignificanceResult{
        ZScore:     zScore,
        PValue:     pValue,
        Significant: pValue < 0.05,  // 95% confidence
        RateA:      rateA,
        RateB:      rateB,
        Lift:       (rateB - rateA) / rateA * 100,
    }, nil
}
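
normalCDF isn't defined above; a one-liner over the standard library's complementary error function fills the gap:

// normalCDF returns Φ(z), the standard normal cumulative distribution,
// via the identity Φ(z) = erfc(-z/√2) / 2.
func normalCDF(z float64) float64 {
    return 0.5 * math.Erfc(-z/math.Sqrt2)
}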

Ring Deployments

Gradually expand deployment scope:

# Ring 0: Internal testing (1% of infrastructure)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-ring0
  labels:
    ring: "0"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
      version: v2
      ring: "0"
  template:
    metadata:
      labels:
        app: api
        version: v2
        ring: "0"
    spec:
      containers:
      - name: api
        image: api:v2

---
# Ring 1: Friendly users (10%)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-ring1
  labels:
    ring: "1"
spec:
  replicas: 10

---
# Ring 2: General availability (100%)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-ring2
  labels:
    ring: "2"
spec:
  replicas: 100
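
Something still has to decide which users land in which ring. A sketch reusing the consistent-hashing idea from the feature-flag rollout; the 1% / 10% boundaries mirror the manifests above and are illustrative:

// ringFor deterministically assigns a user to a ring.
// Buckets are cumulative: ring 0 gets ~1% of users, ring 1 the next ~9%,
// and ring 2 (general availability) everyone else.
func ringFor(userID string) int {
    bucket := crc32.ChecksumIEEE([]byte(userID)) % 100
    switch {
    case bucket < 1:
        return 0
    case bucket < 10:
        return 1
    default:
        return 2
    }
}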

Feature Flag Architecture

Implement scalable feature flags:

type FeatureFlagService struct {
    store     FlagStore
    cache     *sync.Map
    evaluator RuleEvaluator
}

type FeatureFlag struct {
    Name        string
    Enabled     bool
    Rules       []Rule
    Rollout     *RolloutStrategy
}

type Rule struct {
    Attribute string      // e.g., "user_id", "email_domain"
    Operator  string      // e.g., "equals", "in", "contains"
    Value     interface{}
}

type RolloutStrategy struct {
    Percentage int
    UserIDs    []string  // Specific users
    Groups     []string  // User groups
}

func (ffs *FeatureFlagService) IsEnabled(flagName string, ctx *EvaluationContext) bool {
    // Check cache first
    if cached, ok := ffs.cache.Load(flagName); ok {
        flag := cached.(*FeatureFlag)
        return ffs.evaluate(flag, ctx)
    }

    // Load from store
    flag, err := ffs.store.Get(flagName)
    if err != nil {
        return false // Fail closed
    }

    ffs.cache.Store(flagName, flag)
    return ffs.evaluate(flag, ctx)
}

func (ffs *FeatureFlagService) evaluate(flag *FeatureFlag, ctx *EvaluationContext) bool {
    if !flag.Enabled {
        return false
    }

    // Check specific user overrides (Rollout is optional)
    if flag.Rollout != nil {
        for _, userID := range flag.Rollout.UserIDs {
            if userID == ctx.UserID {
                return true
            }
        }
    }

    // Evaluate rules
    for _, rule := range flag.Rules {
        if !ffs.evaluator.Evaluate(rule, ctx) {
            return false
        }
    }

    // Check percentage rollout
    if flag.Rollout != nil && flag.Rollout.Percentage > 0 {
        hash := crc32.ChecksumIEEE([]byte(ctx.UserID))
        userPct := int(hash % 100)
        return userPct < flag.Rollout.Percentage
    }

    return true
}

// Usage in application
func (s *Service) ProcessOrder(ctx context.Context, order *Order) error {
    evalCtx := &EvaluationContext{
        UserID: order.UserID,
        Attributes: map[string]string{
            "plan":   order.Plan,
            "region": order.Region,
        },
    }

    if s.flags.IsEnabled("new-checkout-flow", evalCtx) {
        return s.newCheckoutFlow(ctx, order)
    }

    return s.legacyCheckoutFlow(ctx, order)
}
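
The Rule struct only names its operators; one way to flesh out RuleEvaluator is a small switch over them. A sketch assuming attribute values come from EvaluationContext.Attributes (with user_id taken from the context itself), using strings from the standard library and treating anything unknown as a non-match:

type RuleEvaluator struct{}

// Evaluate checks a single rule against the evaluation context.
// Unknown attributes, operators, or value types count as a non-match.
func (RuleEvaluator) Evaluate(rule Rule, ctx *EvaluationContext) bool {
    actual := ctx.Attributes[rule.Attribute]
    if rule.Attribute == "user_id" {
        actual = ctx.UserID
    }

    switch rule.Operator {
    case "equals":
        expected, ok := rule.Value.(string)
        return ok && actual == expected
    case "contains":
        substr, ok := rule.Value.(string)
        return ok && strings.Contains(actual, substr)
    case "in":
        values, ok := rule.Value.([]string)
        if !ok {
            return false
        }
        for _, v := range values {
            if v == actual {
                return true
            }
        }
        return false
    default:
        return false
    }
}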

Shadow Mode Deployment

Run the new version alongside production without routing any user traffic to it:

type ShadowDeployment struct {
    primary Client
    shadow  Client
    logger  Logger
}

func (sd *ShadowDeployment) Call(ctx context.Context, req *Request) (*Response, error) {
    // Primary call (production)
    resp, err := sd.primary.Call(ctx, req)

    // Shadow call (async, no impact on response)
    go func() {
        start := time.Now()
        shadowResp, shadowErr := sd.shadow.Call(context.Background(), req)
        duration := time.Since(start)

        // Log comparison
        sd.logger.Log(map[string]interface{}{
            "request_id":      req.ID,
            "primary_success": err == nil,
            "shadow_success":  shadowErr == nil,
            "shadow_duration": duration,
            "responses_match": sd.responsesMatch(resp, shadowResp),
        })
    }()

    return resp, err
}

Deployment Automation

Automate progressive rollouts:

type DeploymentController struct {
    k8s        kubernetes.Interface
    metrics    MetricsClient
    config     *RolloutConfig
}

type RolloutConfig struct {
    InitialWeight    int           // Start with 10%
    MaxWeight        int           // Max 50%
    StepWeight       int           // Increase by 10%
    Interval         time.Duration // Every 5 minutes
    SuccessThreshold float64       // e.g. 0.995 (99.5% success rate)
}

func (dc *DeploymentController) ProgressiveRollout(deployment string) error {
    currentWeight := dc.config.InitialWeight

    for currentWeight <= dc.config.MaxWeight {
        // Update traffic weight
        if err := dc.updateTrafficWeight(deployment, currentWeight); err != nil {
            return err
        }

        // Wait for interval
        time.Sleep(dc.config.Interval)

        // Check metrics
        metrics, err := dc.metrics.GetDeploymentMetrics(deployment)
        if err != nil {
            return err
        }

        successRate := metrics.SuccessRate()
        if successRate < dc.config.SuccessThreshold {
            // Rollback
            dc.rollback(deployment)
            return fmt.Errorf("rollout failed: success rate %.2f%% < %.2f%%",
                successRate*100, dc.config.SuccessThreshold*100)
        }

        // Progress to next step
        currentWeight += dc.config.StepWeight
    }

    // Promote to 100%
    return dc.promoteToProduction(deployment)
}

Observability for Progressive Delivery

Monitor deployment progress:

# Success rate by version
sum(rate(http_requests_total{code!~"5..", version="v2"}[5m]))
/
sum(rate(http_requests_total{version="v2"}[5m]))

# Latency comparison
histogram_quantile(0.99,
  rate(http_request_duration_seconds_bucket{version="v2"}[5m]))
/
histogram_quantile(0.99,
  rate(http_request_duration_seconds_bucket{version="v1"}[5m]))

# Error rate diff
(
  sum(rate(http_requests_total{code=~"5..",version="v2"}[5m]))
  /
  sum(rate(http_requests_total{version="v2"}[5m]))
)
-
(
  sum(rate(http_requests_total{code=~"5..",version="v1"}[5m]))
  /
  sum(rate(http_requests_total{version="v1"}[5m]))
)

Dashboard for deployment tracking:

# Grafana dashboard snippet
panels:
  - title: "Deployment Progress"
    targets:
      - expr: "flagger_canary_weight{name='api'}"
        legendFormat: "Traffic Weight %"

  - title: "Success Rate by Version"
    targets:
      - expr: "sum(rate(http_requests_total{code!~'5..',version='v1'}[5m])) / sum(rate(http_requests_total{version='v1'}[5m]))"
        legendFormat: "v1"
      - expr: "sum(rate(http_requests_total{code!~'5..',version='v2'}[5m])) / sum(rate(http_requests_total{version='v2'}[5m]))"
        legendFormat: "v2"

  - title: "Request Latency p99"
    targets:
      - expr: "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket{version='v1'}[5m]))"
        legendFormat: "v1"
      - expr: "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket{version='v2'}[5m]))"
        legendFormat: "v2"

Progressive Delivery Best Practices

  1. Start small: Begin with 1-5% traffic
  2. Monitor closely: Track error rates, latency, and business metrics
  3. Automate decisions: Use metrics to drive progression or rollback
  4. Test in production: Shadow mode and traffic mirroring reduce risk
  5. Feature flags: Decouple deployment from release
  6. Gradual rollout: Increase traffic in small increments
  7. Quick rollback: Automated rollback on metric degradation
  8. Observability: Comprehensive monitoring and alerting

Conclusion

Progressive delivery enables safe, confident deployments through:

  1. Canary deployments for gradual traffic shifting
  2. Feature flags to decouple deployment from feature release
  3. Blue-green deployments for instant switchover capability
  4. A/B testing for data-driven product decisions
  5. Traffic mirroring to test with production load
  6. Ring deployments for gradual infrastructure rollout
  7. Automated rollbacks based on metric thresholds
  8. Comprehensive monitoring to detect issues quickly

The key is to reduce deployment risk through automation, observability, and gradual rollout strategies. Start with canary deployments for infrastructure changes, use feature flags for business logic, and invest in metrics-driven decision-making.

Progressive delivery transforms deployments from risky, manual processes into automated, data-driven operations. This enables teams to deploy more frequently with confidence, accelerating feature delivery while maintaining system reliability.