I’ve been involved in several cloud migrations over the past few years, ranging from small applications to large legacy systems. The hardest lesson I’ve learned: there’s no one-size-fits-all migration strategy. Every system has different constraints, and choosing the wrong approach can turn a migration into a multi-year death march.

Today, I want to share the migration patterns I’ve found most effective, along with the trade-offs of each. Whether you’re moving a monolith or modernizing existing cloud workloads, understanding these patterns will help you choose the right path.

The Six R’s of Migration

The industry has converged on six basic migration strategies. I think of them as a spectrum from low-effort/low-benefit to high-effort/high-benefit:

1. Rehost (Lift and Shift)

Move your application to the cloud without changes. Take your VMs and recreate them in the cloud.

When to use: Legacy systems where you need quick migration and don’t have resources for re-architecture.

Benefits:

  • Fast migration (weeks to months)
  • Minimal risk
  • Immediate infrastructure cost savings (no more data-center spend)

Drawbacks:

  • Miss cloud-native benefits
  • Often more expensive than optimized cloud architecture
  • Technical debt follows you to the cloud

I’ve used this for legacy systems where the business value of quick migration outweighs long-term optimization. For example, getting out of an expiring data center lease.

2. Replatform (Lift, Tinker, and Shift)

Make minimal cloud-specific optimizations without changing core architecture.

Example: Move from self-managed MySQL to managed RDS, but keep application code the same.

// Before: Managing database ourselves
db, err := sql.Open("mysql", "user:password@tcp(10.0.1.50:3306)/dbname")

// After: Use managed RDS, but code stays the same
db, err := sql.Open("mysql", "user:password@tcp(mydb.us-east-1.rds.amazonaws.com:3306)/dbname")

When to use: When you can get significant operational benefits with minimal code changes.

Benefits:

  • Reduced operational burden (no database patching, backups handled by platform)
  • Better than pure rehost
  • Still relatively quick

Drawbacks:

  • Doesn’t fully leverage cloud capabilities
  • May need further optimization later

3. Repurchase (Replace with SaaS)

Replace custom applications with SaaS offerings.

When to use: For commodity functionality that doesn’t differentiate your business.

Example: Migrating from a self-hosted email server to a managed service, or replacing a custom CRM with Salesforce.

Benefits:

  • Eliminate maintenance burden
  • Get features and updates automatically
  • Often cheaper than running it yourself

Drawbacks:

  • Less control and customization
  • Vendor lock-in
  • Integration challenges

4. Refactor (Re-architect)

Re-architect applications to be cloud-native. This is the most interesting and challenging approach.

When to use: When you need to scale significantly, improve agility, or the current architecture is limiting.

(The remaining two R’s, Retire and Retain, round out the six: decommission what nobody uses, and leave in place what isn’t worth moving yet.) Refactoring is where most of my effort goes, so let’s dive deeper.

Cloud-Native Refactoring Patterns

Breaking Down the Monolith

The typical first step in refactoring is decomposing monoliths into microservices. Here’s my approach:

Start with clear boundaries: Don’t decompose arbitrarily. Find natural seams in your domain.

// Monolith: Everything in one service
type OrderService struct {
    db *sql.DB
}

func (s *OrderService) CreateOrder(userID string, items []Item) error {
    // Validate user
    user, err := s.db.Query("SELECT * FROM users WHERE id = ?", userID)
    // ...

    // Check inventory
    for _, item := range items {
        inventory, err := s.db.Query("SELECT stock FROM inventory WHERE item_id = ?", item.ID)
        // ...
    }

    // Process payment
    // ...

    // Create order
    // ...

    // Send notification
    // ...
}

// Microservices: Split by domain
type OrderService struct {
    userClient      *UserServiceClient
    inventoryClient *InventoryServiceClient
    paymentClient   *PaymentServiceClient
    notifyClient    *NotificationServiceClient
}

func (s *OrderService) CreateOrder(ctx context.Context, userID string, items []Item) error {
    // Validate user
    if _, err := s.userClient.GetUser(ctx, userID); err != nil {
        return err
    }

    // Check inventory
    available, err := s.inventoryClient.CheckAvailability(ctx, items)
    if err != nil {
        return err
    }
    if !available {
        return ErrInsufficientStock // assumed sentinel, declared elsewhere
    }

    // Process payment
    if err := s.paymentClient.ProcessPayment(ctx, userID, calculateTotal(items)); err != nil {
        return err
    }

    // Create order
    order, err := s.createOrder(userID, items)
    if err != nil {
        // Compensate: refund the payment we already captured
        s.paymentClient.RefundPayment(ctx, userID, calculateTotal(items))
        return err
    }

    // Send notification (async, don't wait); detach from the request
    // context so its cancellation doesn't kill the notification
    go s.notifyClient.SendOrderConfirmation(context.Background(), userID, order.ID)

    return nil
}

This introduces new challenges: distributed transactions, network failures, and orchestration. But it enables independent scaling and deployment.
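
Network failures in particular deserve first-class handling. Here’s a minimal retry helper I reach for, with exponential backoff; a sketch under the assumption that the wrapped call is safe to repeat (callWithRetry is my own convention, not a library API):

// callWithRetry retries a cross-service call with exponential backoff,
// respecting context cancellation between attempts.
func callWithRetry(ctx context.Context, attempts int, fn func(context.Context) error) error {
    backoff := 100 * time.Millisecond
    var err error
    for i := 0; i < attempts; i++ {
        if err = fn(ctx); err == nil {
            return nil
        }
        select {
        case <-time.After(backoff):
            backoff *= 2 // 100ms, 200ms, 400ms, ...
        case <-ctx.Done():
            return ctx.Err()
        }
    }
    return err
}

Wrapping the inventory check then looks like callWithRetry(ctx, 3, func(ctx context.Context) error { _, err := s.inventoryClient.CheckAvailability(ctx, items); return err }). Only retry calls that are idempotent; payment capture usually isn’t.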

Strangler Fig Pattern

Don’t rewrite everything at once. Use the strangler fig pattern to gradually replace functionality:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Legacy Monolith             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚     Order Processing        β”‚   β”‚ ──┐
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚   β”‚  Phase 1: Extract one service
β”‚  β”‚   User Management (legacy)  β”‚   β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚   β”‚
β”‚  β”‚    Payment (legacy)         β”‚   β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
                                          β–Ό
                               β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                               β”‚  Order Service     β”‚
                               β”‚  (microservice)    β”‚
                               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Phase 2: Route new order requests to microservice, legacy to monolith
Phase 3: Migrate all order data and decommission the monolith's order code
Phase 4: Repeat for other domains

Use a routing layer to direct traffic:

type Router struct {
    legacyBackend http.Handler
    newBackend    http.Handler
    migrationPct  float64 // fraction of traffic routed to the new backend (0.0 to 1.0)
}

func (r *Router) HandleRequest(w http.ResponseWriter, req *http.Request) {
    // Route based on feature flags, user cohorts, or percentage
    if r.shouldUseMicroservice(req) {
        r.newBackend.ServeHTTP(w, req)
    } else {
        r.legacyBackend.ServeHTTP(w, req)
    }
}

func (r *Router) shouldUseMicroservice(req *http.Request) bool {
    // Gradual rollout: start with 1%, increase over time
    userID := getUserID(req)
    hash := hashUserID(userID)
    return (float64(hash%100) / 100.0) < r.migrationPct
}

This allows you to migrate incrementally, validate each step, and roll back if needed.

Database Migration Strategies

Database migration is often the hardest part. Here are patterns I’ve used:

Pattern 1: Replicate and Sync

Keep legacy and new databases in sync during migration:

type DualWriteDAO struct {
    legacyDB *sql.DB
    newDB    *sql.DB
    logger   Logger
}

func (d *DualWriteDAO) CreateOrder(order *Order) error {
    // Write to legacy DB (source of truth during migration)
    if _, err := d.legacyDB.Exec("INSERT INTO orders ...", order); err != nil {
        return err
    }

    // Also write to new DB (asynchronously to avoid blocking)
    go func() {
        if _, err := d.newDB.Exec("INSERT INTO orders ...", order); err != nil {
            // Log discrepancy for later reconciliation
            d.logger.Error("Dual write to new DB failed", "order_id", order.ID, "error", err)
        }
    }()

    return nil
}

Run a reconciliation job to catch any missed writes:

func (d *DualWriteDAO) ReconcileData() error {
    // Find orders in legacy DB not in new DB
    rows, err := d.legacyDB.Query(`
        SELECT id FROM legacy.orders
        WHERE id NOT IN (SELECT id FROM new.orders)
        AND created_at > NOW() - INTERVAL 24 HOUR
    `)
    if err != nil {
        return err
    }
    defer rows.Close()

    for rows.Next() {
        var orderID string
        if err := rows.Scan(&orderID); err != nil {
            return err
        }

        // Copy missing order
        order, err := d.getLegacyOrder(orderID)
        if err != nil {
            log.Printf("Failed to get order %s: %v", orderID, err)
            continue
        }

        if err := d.writeToNewDB(order); err != nil {
            log.Printf("Failed to copy order %s: %v", orderID, err)
        }
    }

    // Surface any iteration error from the cursor itself
    return rows.Err()
}

Once the new DB is in sync and tested, switch reads to it. Finally, stop writing to the legacy DB.
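
The read switch itself can be guarded the same way as the writes. A minimal sketch, assuming a hypothetical readFromNew flag and scanOrder helper on the same DAO: reads prefer the new DB once reconciliation shows zero drift, with legacy as a fallback.

// GetOrder reads from the new DB once the cutover flag is set,
// falling back to legacy if the new DB misbehaves.
// readFromNew and scanOrder are assumed helpers for this sketch.
func (d *DualWriteDAO) GetOrder(id string) (*Order, error) {
    if d.readFromNew {
        order, err := d.scanOrder(d.newDB.QueryRow("SELECT ... FROM orders WHERE id = ?", id))
        if err == nil {
            return order, nil
        }
        d.logger.Error("Read from new DB failed, falling back", "order_id", id, "error", err)
    }
    return d.scanOrder(d.legacyDB.QueryRow("SELECT ... FROM orders WHERE id = ?", id))
}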

Pattern 2: Event Sourcing for Migration

Capture all changes as events, which can be replayed to build new data models:

type OrderEvent struct {
    EventID   string
    EventType string
    OrderID   string
    Timestamp time.Time
    Data      json.RawMessage
}

// Publish events on all changes
func (s *OrderService) CreateOrder(order *Order) error {
    // Write to legacy DB
    if err := s.db.Insert(order); err != nil {
        return err
    }

    // Publish event
    event := OrderEvent{
        EventID:   generateID(),
        EventType: "order.created",
        OrderID:   order.ID,
        Timestamp: time.Now(),
        Data:      marshalOrder(order),
    }

    s.eventBus.Publish(event)
    return nil
}

// New service consumes events to build its data model
func (s *NewOrderService) ConsumeEvents() {
    s.eventBus.Subscribe("order.*", func(event OrderEvent) {
        switch event.EventType {
        case "order.created":
            s.handleOrderCreated(event)
        case "order.updated":
            s.handleOrderUpdated(event)
        case "order.cancelled":
            s.handleOrderCancelled(event)
        }
    })
}

This decouples the new system from the legacy database schema.
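
One caveat worth a sketch: most event buses deliver at-least-once, so the consumer’s handlers must be idempotent. A minimal approach, assuming the new service keeps a processed_events dedup table in its own DB, keys each apply on EventID:

// handleOrderCreated applies an event at most once by recording its
// EventID first. processed_events is an assumed dedup table, and
// INSERT IGNORE is MySQL syntax.
func (s *NewOrderService) handleOrderCreated(event OrderEvent) {
    res, err := s.db.Exec(
        "INSERT IGNORE INTO processed_events (event_id) VALUES (?)", event.EventID)
    if err != nil {
        log.Printf("dedup insert failed for %s: %v", event.EventID, err)
        return
    }
    if n, _ := res.RowsAffected(); n == 0 {
        return // already applied this event
    }

    var order Order
    if err := json.Unmarshal(event.Data, &order); err != nil {
        log.Printf("bad payload for %s: %v", event.EventID, err)
        return
    }
    // ... project the order into the new data model
}

Relatedly, if the legacy write and the event publish can fail independently, a transactional outbox (write the event to an outbox table in the same transaction, then publish from there) closes that gap.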

Managing Migration Risk

Feature Flags

Control rollout with feature flags:

type FeatureFlags struct {
    flags map[string]bool
}

func (f *FeatureFlags) IsEnabled(flag string, userID string) bool {
    // Simplified global toggle; userID is the hook for the per-user
    // cohort and percentage logic sketched further below
    if enabled, ok := f.flags[flag]; ok {
        return enabled
    }

    // Default to old behavior
    return false
}

func handleCheckout(w http.ResponseWriter, r *http.Request) {
    userID := getUserID(r)

    if featureFlags.IsEnabled("new_checkout_flow", userID) {
        newCheckoutHandler(w, r)
    } else {
        legacyCheckoutHandler(w, r)
    }
}

This lets you:

  • Enable features for internal users first
  • Gradually roll out to customers
  • Quickly disable if problems arise
  • Run A/B tests to compare performance
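
For the first two bullets, a cohort-aware variant layers per-user logic on top of the flag map. A sketch, reusing hashUserID from the router earlier (isInternalUser is an assumed employee-allowlist helper):

// IsEnabledFor extends the boolean flag with cohorts: internal users
// get the feature first, then a growing percentage of everyone else.
func (f *FeatureFlags) IsEnabledFor(flag, userID string, rolloutPct float64) bool {
    if !f.flags[flag] {
        return false
    }
    if isInternalUser(userID) { // assumed helper: checks an employee allowlist
        return true
    }
    return float64(hashUserID(userID)%100)/100.0 < rolloutPct
}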

Dark Launching

Run new code in production without affecting users:

func handleRequest(w http.ResponseWriter, r *http.Request) {
    // Process request with legacy code
    result := legacyHandler(r)

    // Also process with new code, but discard the result. Clone the
    // request first: it must not be reused after this handler returns.
    // (Requests with bodies would also need r.Body buffered and duplicated.)
    shadowReq := r.Clone(context.Background())
    go func() {
        start := time.Now()
        newResult := newHandler(shadowReq)

        // Compare results and log discrepancies
        if !resultsMatch(result, newResult) {
            logger.Warn("Result mismatch",
                "legacy", result,
                "new", newResult,
            )
        }

        // Track performance of the new path
        metrics.RecordLatency("new_handler", time.Since(start))
    }()

    // Return legacy result to user
    w.Write(result)
}

This validates new code with production traffic before switching over.

Canary Deployments

Deploy to a small percentage of infrastructure first:

# Kubernetes canary deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-v2
spec:
  replicas: 2  # Start with just 2 pods
  selector:
    matchLabels:
      app: order-service
      version: v2
  template:
    metadata:
      labels:
        app: order-service
        version: v2
    spec:
      containers:
      - name: order-service
        image: order-service:v2.0.0
---
# Keep most traffic on v1
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-v1
spec:
  replicas: 18  # 90% of traffic
  selector:
    matchLabels:
      app: order-service
      version: v1

Because the Service selects both versions on app: order-service, traffic splits roughly in proportion to replica counts: 2 of 20 pods puts about 10% of requests on the canary. Monitor error rates and latency. If v2 looks good, gradually increase its replica count and decrease v1’s.

Migration Sequencing

Phase 1: Assessment

  • Inventory all applications and dependencies
  • Identify migration candidates (start with stateless services)
  • Estimate effort and risk

Phase 2: Foundation

  • Set up cloud accounts and networking
  • Establish security baseline (IAM, encryption, logging)
  • Create CI/CD pipelines
  • Deploy monitoring and alerting

Phase 3: Pilot Migration

  • Choose a low-risk application
  • Migrate using chosen strategy
  • Learn and adjust processes
  • Document patterns and pitfalls

Phase 4: Progressive Migration

  • Migrate applications in waves
  • Start with least dependencies
  • Tackle complex integrations
  • Keep legacy and cloud systems in sync

Phase 5: Optimization

  • Refactor for cloud-native patterns
  • Optimize costs
  • Improve observability
  • Decommission legacy infrastructure

Cost Management

Cloud can be cheaper, but only if you optimize:

// Auto-scaling based on load
type AutoScaler struct {
    minInstances int
    maxInstances int
    targetCPU    float64
}

func (a *AutoScaler) Scale(currentCPU float64, currentInstances int) int {
    if currentCPU > a.targetCPU {
        // Scale up
        return min(currentInstances+1, a.maxInstances)
    } else if currentCPU < a.targetCPU*0.5 {
        // Scale down
        return max(currentInstances-1, a.minInstances)
    }
    return currentInstances
}

Use spot instances for batch workloads:

type BatchProcessor struct {
    spotInstances  []*Instance
    onDemandBackup *Instance
}

func (b *BatchProcessor) ProcessJobs(jobs []Job) error {
    // Try spot instances first (much cheaper)
    for _, instance := range b.spotInstances {
        if instance.Available() {
            return instance.Process(jobs)
        }
    }

    // Fall back to on-demand if spots unavailable
    log.Println("Spot instances unavailable, using on-demand")
    return b.onDemandBackup.Process(jobs)
}

Common Pitfalls

Underestimating data transfer costs: Moving terabytes between regions or out of the cloud can be expensive. At a typical internet egress rate of roughly $0.09/GB, moving 50 TB out costs on the order of $4,500. Plan transfers carefully.

Ignoring latency: Cloud resources in different regions add latency. Measure impact on user experience.

Assuming infinite scale: Cloud scales well, but not infinitely. You still need to architect for scale.

Forgetting about egress: Bandwidth out of cloud providers costs money. Consider CDNs and caching.

Security shortcuts: Don’t compromise security for speed. Set up proper IAM, encryption, and network isolation from the start.

Looking Forward

Cloud migration is becoming table stakes. The question isn’t whether to migrate, but how to do it effectively. The patterns I’ve sharedβ€”strangler fig, dual writes, feature flags, canary deploymentsβ€”reduce risk and enable incremental progress.

Start with clear goals: cost reduction, scalability, agility? Choose your migration strategy based on those goals, not what’s trendy. And remember: migration is a journey, not a destination. Even after you’re β€œin the cloud,” continuous optimization is key.

The cloud rewards systems designed for it. Take the time to refactor thoughtfully, and you’ll reap benefits for years to come.