I've been involved in several cloud migrations over the past few years, ranging from small applications to large legacy systems. The hardest lesson I've learned: there's no one-size-fits-all migration strategy. Every system has different constraints, and choosing the wrong approach can turn a migration into a multi-year death march.
Today, I want to share the migration patterns I've found most effective, along with the trade-offs of each. Whether you're moving a monolith or modernizing existing cloud workloads, understanding these patterns will help you choose the right path.
The Six R's of Migration
The industry has converged on six basic migration strategies. This post focuses on the four that involve actually moving or changing workloads; the other two, Retire (decommission what you no longer need) and Retain (leave it in place for now), are decisions rather than migrations. I think of the four as a spectrum from low-effort/low-benefit to high-effort/high-benefit:
1. Rehost (Lift and Shift)
Move your application to the cloud without changes. Take your VMs and recreate them in the cloud.
When to use: Legacy systems where you need quick migration and don't have resources for re-architecture.
Benefits:
- Fast migration (weeks to months)
- Minimal risk
- Immediate infrastructure cost benefits
Drawbacks:
- Miss cloud-native benefits
- Often more expensive than optimized cloud architecture
- Technical debt follows you to the cloud
I've used this for legacy systems where the business value of quick migration outweighs long-term optimization. For example, getting out of an expiring data center lease.
2. Replatform (Lift, Tinker, and Shift)
Make minimal cloud-specific optimizations without changing core architecture.
Example: Move from self-managed MySQL to managed RDS, but keep application code the same.
// Before: Managing database ourselves
db, err := sql.Open("mysql", "user:password@tcp(10.0.1.50:3306)/dbname")
// After: Use managed RDS, but code stays the same
db, err := sql.Open("mysql", "user:password@tcp(mydb.us-east-1.rds.amazonaws.com:3306)/dbname")
When to use: When you can get significant operational benefits with minimal code changes.
Benefits:
- Reduced operational burden (no database patching, backups handled by platform)
- Better than pure rehost
- Still relatively quick
Drawbacks:
- Doesn't fully leverage cloud capabilities
- May need further optimization later
3. Repurchase (Replace with SaaS)
Replace custom applications with SaaS offerings.
When to use: For commodity functionality that doesnβt differentiate your business.
Example: Migrating from a self-hosted email server to a managed service, or replacing custom CRM with Salesforce.
Benefits:
- Eliminate maintenance burden
- Get features and updates automatically
- Often cheaper than running yourself
Drawbacks:
- Less control and customization
- Vendor lock-in
- Integration challenges
4. Refactor (Re-architect)
Re-architect applications to be cloud-native. This is the most interesting and challenging approach.
When to use: When you need to scale significantly, improve agility, or the current architecture is limiting.
This is where most of my effort goes, so let's dive deeper.
Cloud-Native Refactoring Patterns
Breaking Down the Monolith
The typical first step in refactoring is decomposing monoliths into microservices. Here's my approach:
Start with clear boundaries: Don't decompose arbitrarily. Find natural seams in your domain.
// Monolith: Everything in one service
type OrderService struct {
	db *sql.DB
}

func (s *OrderService) CreateOrder(userID string, items []Item) error {
	// Validate user
	user, err := s.db.Query("SELECT * FROM users WHERE id = ?", userID)
	// ...

	// Check inventory
	for _, item := range items {
		inventory, err := s.db.Query("SELECT stock FROM inventory WHERE item_id = ?", item.ID)
		// ...
	}

	// Process payment
	// ...
	// Create order
	// ...
	// Send notification
	// ...
}
// Microservices: Split by domain
type OrderService struct {
	userClient      *UserServiceClient
	inventoryClient *InventoryServiceClient
	paymentClient   *PaymentServiceClient
	notifyClient    *NotificationServiceClient
}

func (s *OrderService) CreateOrder(ctx context.Context, userID string, items []Item) error {
	// Validate user
	if _, err := s.userClient.GetUser(ctx, userID); err != nil {
		return err
	}

	// Check inventory
	available, err := s.inventoryClient.CheckAvailability(ctx, items)
	if err != nil {
		return err
	}
	if !available {
		return ErrItemsUnavailable // sentinel error defined elsewhere
	}

	// Process payment
	if err := s.paymentClient.ProcessPayment(ctx, userID, calculateTotal(items)); err != nil {
		return err
	}

	// Create order
	order, err := s.createOrder(userID, items)
	if err != nil {
		// Compensate: roll back the payment we already took
		s.paymentClient.RefundPayment(ctx, userID, calculateTotal(items))
		return err
	}

	// Send notification (async, don't wait). Use a detached context so the
	// notification isn't cancelled when the request context ends.
	go s.notifyClient.SendOrderConfirmation(context.Background(), userID, order.ID)
	return nil
}
This introduces new challenges: distributed transactions, network failures, and orchestration. But it enables independent scaling and deployment.
Strangler Fig Pattern
Don't rewrite everything at once. Use the strangler fig pattern to gradually replace functionality:
┌─────────────────────────────────────┐
│           Legacy Monolith           │
│  ┌───────────────────────────────┐  │
│  │       Order Processing        │──┼───┐
│  └───────────────────────────────┘  │   │
│  ┌───────────────────────────────┐  │   │  Phase 1: Extract one service
│  │   User Management (legacy)    │  │   │
│  └───────────────────────────────┘  │   │
│  ┌───────────────────────────────┐  │   │
│  │       Payment (legacy)        │  │   │
│  └───────────────────────────────┘  │   │
└─────────────────────────────────────┘   │
                                          ▼
                               ┌────────────────────┐
                               │   Order Service    │
                               │   (microservice)   │
                               └────────────────────┘
Phase 2: Route new order requests to microservice, legacy to monolith
Phase 3: Migrate all order data and decommission monolith's order code
Phase 4: Repeat for other domains
Use a routing layer to direct traffic:
type Router struct {
	legacyBackend *LegacyService
	newBackend    *MicroserviceBackend
	migrationPct  float64
}

func (r *Router) HandleRequest(w http.ResponseWriter, req *http.Request) {
	// Route based on feature flags, user cohorts, or percentage
	if r.shouldUseMicroservice(req) {
		r.newBackend.ServeHTTP(w, req)
	} else {
		r.legacyBackend.ServeHTTP(w, req)
	}
}

func (r *Router) shouldUseMicroservice(req *http.Request) bool {
	// Gradual rollout: start with 1%, increase over time. Hashing the
	// user ID keeps each user's routing stable across requests.
	userID := getUserID(req)
	hash := hashUserID(userID)
	return (float64(hash%100) / 100.0) < r.migrationPct
}
This allows you to migrate incrementally, validate each step, and roll back if needed.
Database Migration Strategies
Database migration is often the hardest part. Here are patterns I've used:
Pattern 1: Replicate and Sync
Keep legacy and new databases in sync during migration:
type DualWriteDAO struct {
	legacyDB *sql.DB
	newDB    *sql.DB
	logger   Logger
}

func (d *DualWriteDAO) CreateOrder(order *Order) error {
	// Write to legacy DB (source of truth during migration)
	if _, err := d.legacyDB.Exec("INSERT INTO orders ...", order); err != nil {
		return err
	}

	// Also write to new DB (asynchronously to avoid blocking)
	go func() {
		if _, err := d.newDB.Exec("INSERT INTO orders ...", order); err != nil {
			// Log discrepancy for later reconciliation
			d.logger.Error("Dual write to new DB failed", "order_id", order.ID, "error", err)
		}
	}()
	return nil
}
Run a reconciliation job to catch any missed writes:
func (d *DualWriteDAO) ReconcileData() error {
	// Find recent orders present in the legacy DB but missing from the new DB
	rows, err := d.legacyDB.Query(`
		SELECT id FROM legacy.orders
		WHERE id NOT IN (SELECT id FROM new.orders)
		AND created_at > NOW() - INTERVAL 24 HOUR
	`)
	if err != nil {
		return err
	}
	defer rows.Close()

	for rows.Next() {
		var orderID string
		if err := rows.Scan(&orderID); err != nil {
			continue
		}
		// Copy the missing order
		order, err := d.getLegacyOrder(orderID)
		if err != nil {
			log.Printf("Failed to get order %s: %v", orderID, err)
			continue
		}
		if err := d.writeToNewDB(order); err != nil {
			log.Printf("Failed to copy order %s: %v", orderID, err)
		}
	}
	return rows.Err()
}
Once the new DB is in sync and tested, switch reads to it. Finally, stop writing to the legacy DB.
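The read-side cutover can mirror the write side: flip a flag to read from the new DB, with the legacy DB as a fallback so rows the reconciler hasn't copied yet don't surface as user-facing errors. A simplified in-memory sketch (the DualReadDAO name and store type are illustrative):

```go
package main

import "fmt"

type Order struct{ ID string }

// store stands in for a database table.
type store struct{ orders map[string]Order }

func (s store) Get(id string) (Order, bool) { o, ok := s.orders[id]; return o, ok }

// DualReadDAO reads from the new DB once readFromNew is flipped, falling
// back to legacy so a missed row doesn't become an error. Rollback is a
// config change: flip the flag back.
type DualReadDAO struct {
	legacy      store
	next        store
	readFromNew bool
}

func (d *DualReadDAO) GetOrder(id string) (Order, bool) {
	if d.readFromNew {
		if o, ok := d.next.Get(id); ok {
			return o, true
		}
		// Fall back for rows reconciliation hasn't copied yet
	}
	return d.legacy.Get(id)
}

func main() {
	d := &DualReadDAO{
		legacy:      store{orders: map[string]Order{"a1": {ID: "a1"}, "b2": {ID: "b2"}}},
		next:        store{orders: map[string]Order{"a1": {ID: "a1"}}},
		readFromNew: true,
	}
	o, ok := d.GetOrder("b2") // only in legacy: exercises the fallback
	fmt.Println(o.ID, ok)
}
```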
Pattern 2: Event Sourcing for Migration
Capture all changes as events, which can be replayed to build new data models:
type OrderEvent struct {
	EventID   string
	EventType string
	OrderID   string
	Timestamp time.Time
	Data      json.RawMessage
}

// Publish events on all changes
func (s *OrderService) CreateOrder(order *Order) error {
	// Write to legacy DB
	if err := s.db.Insert(order); err != nil {
		return err
	}

	// Publish event
	event := OrderEvent{
		EventID:   generateID(),
		EventType: "order.created",
		OrderID:   order.ID,
		Timestamp: time.Now(),
		Data:      marshalOrder(order),
	}
	s.eventBus.Publish(event)
	return nil
}

// New service consumes events to build its data model
func (s *NewOrderService) ConsumeEvents() {
	s.eventBus.Subscribe("order.*", func(event OrderEvent) {
		switch event.EventType {
		case "order.created":
			s.handleOrderCreated(event)
		case "order.updated":
			s.handleOrderUpdated(event)
		case "order.cancelled":
			s.handleOrderCancelled(event)
		}
	})
}
This decouples the new system from the legacy database schema.
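The payoff is replay: the new service can bootstrap its read model from the event stream without ever touching the legacy schema. A stripped-down sketch of rebuilding state from events (the Event shape is simplified from OrderEvent above):

```go
package main

import "fmt"

// Event is a simplified order event: just a type and the order it affects.
type Event struct {
	Type    string
	OrderID string
}

// rebuild folds an ordered event stream into a current-state view, the
// same way the new service would populate its own data model on first
// deployment or after a schema change.
func rebuild(events []Event) map[string]string {
	state := map[string]string{}
	for _, e := range events {
		switch e.Type {
		case "order.created":
			state[e.OrderID] = "created"
		case "order.cancelled":
			state[e.OrderID] = "cancelled"
		}
	}
	return state
}

func main() {
	state := rebuild([]Event{
		{"order.created", "o1"},
		{"order.created", "o2"},
		{"order.cancelled", "o1"},
	})
	fmt.Println(state["o1"], state["o2"])
}
```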
Managing Migration Risk
Feature Flags
Control rollout with feature flags:
type FeatureFlags struct {
	flags map[string]bool
}

// IsEnabled checks a global on/off flag. Real flag systems also target
// individual users or cohorts, which is why userID is in the signature.
func (f *FeatureFlags) IsEnabled(flag string, userID string) bool {
	if enabled, ok := f.flags[flag]; ok {
		return enabled
	}
	// Default to old behavior
	return false
}

func handleCheckout(w http.ResponseWriter, r *http.Request) {
	userID := getUserID(r)
	if featureFlags.IsEnabled("new_checkout_flow", userID) {
		newCheckoutHandler(w, r)
	} else {
		legacyCheckoutHandler(w, r)
	}
}
This lets you:
- Enable features for internal users first
- Gradually roll out to customers
- Quickly disable if problems arise
- Run A/B tests to compare performance
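The first two bullets can be one rule: an internal allowlist plus stable percentage bucketing, so a given user's experience doesn't flip between requests as the rollout percentage grows. A sketch (the Rollout type and its field names are illustrative):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Rollout enables a feature for internal users unconditionally, and for
// everyone else by hashing the user ID into a stable 0-99 bucket.
type Rollout struct {
	Internal map[string]bool
	Percent  uint32 // 0-100
}

func (r Rollout) Enabled(userID string) bool {
	if r.Internal[userID] {
		return true
	}
	h := fnv.New32a()
	h.Write([]byte(userID))
	// Same user always lands in the same bucket, so raising Percent
	// only ever adds users to the feature.
	return h.Sum32()%100 < r.Percent
}

func main() {
	r := Rollout{Internal: map[string]bool{"alice@corp": true}, Percent: 0}
	fmt.Println(r.Enabled("alice@corp"), r.Enabled("user-123"))
}
```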
Dark Launching
Run new code in production without affecting users:
func handleRequest(w http.ResponseWriter, r *http.Request) {
	// Process request with legacy code
	result := legacyHandler(r)

	// Also process with new code, but discard the result. Clone the request
	// with a detached context: the original is tied to the response
	// lifecycle, and a real implementation would also buffer the body so
	// both handlers can read it.
	shadow := r.Clone(context.Background())
	go func() {
		newResult := newHandler(shadow)
		// Compare results and log discrepancies
		if !resultsMatch(result, newResult) {
			logger.Warn("Result mismatch",
				"legacy", result,
				"new", newResult,
			)
		}
		// Track performance metrics
		metrics.RecordLatency("new_handler", newResult.Latency)
	}()

	// Return the legacy result to the user
	w.Write(result)
}
This validates new code with production traffic before switching over.
Canary Deployments
Deploy to a small percentage of infrastructure first:
# Kubernetes canary deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-v2
spec:
  replicas: 2 # Start with just 2 pods
  selector:
    matchLabels:
      app: order-service
      version: v2
  template:
    metadata:
      labels:
        app: order-service
        version: v2
    spec:
      containers:
        - name: order-service
          image: order-service:v2.0.0
---
# Keep most traffic on v1
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-v1
spec:
  replicas: 18 # 90% of traffic
  selector:
    matchLabels:
      app: order-service
      version: v1
Monitor error rates and latency. If v2 looks good, gradually increase its replica count and decrease v1.
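The promotion decision itself can be a simple guard: scale up v2 only if its error rate is no worse than v1's by more than some tolerance. Tools like Flagger or Argo Rollouts automate this against a metrics backend; here is the core check as a sketch (thresholds invented):

```go
package main

import "fmt"

// promoteCanary reports whether the canary's error rate is acceptable
// relative to the stable version. The tolerance absorbs normal noise so
// a single unlucky request doesn't halt the rollout.
func promoteCanary(stableErrRate, canaryErrRate, tolerance float64) bool {
	return canaryErrRate <= stableErrRate+tolerance
}

func main() {
	fmt.Println(promoteCanary(0.01, 0.012, 0.005)) // within tolerance
	fmt.Println(promoteCanary(0.01, 0.05, 0.005))  // canary is failing
}
```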
Migration Sequencing
Phase 1: Assessment
- Inventory all applications and dependencies
- Identify migration candidates (start with stateless services)
- Estimate effort and risk
Phase 2: Foundation
- Set up cloud accounts and networking
- Establish security baseline (IAM, encryption, logging)
- Create CI/CD pipelines
- Deploy monitoring and alerting
Phase 3: Pilot Migration
- Choose a low-risk application
- Migrate using chosen strategy
- Learn and adjust processes
- Document patterns and pitfalls
Phase 4: Progressive Migration
- Migrate applications in waves
- Start with least dependencies
- Tackle complex integrations
- Keep legacy and cloud systems in sync
Phase 5: Optimization
- Refactor for cloud-native patterns
- Optimize costs
- Improve observability
- Decommission legacy infrastructure
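Ordering the Phase 4 waves by dependencies is a topological-sort problem: each wave contains the apps whose dependencies have all been migrated in earlier waves. A minimal sketch (deps maps each app to the apps it depends on; names illustrative):

```go
package main

import "fmt"

// waves groups apps into migration waves: wave N holds every app whose
// dependencies were all migrated in waves before N. An empty wave means
// a dependency cycle that needs manual untangling first.
func waves(deps map[string][]string) [][]string {
	done := map[string]bool{}
	var out [][]string
	for len(done) < len(deps) {
		var wave []string
		for app, reqs := range deps {
			if done[app] {
				continue
			}
			ready := true
			for _, r := range reqs {
				if !done[r] {
					ready = false
					break
				}
			}
			if ready {
				wave = append(wave, app)
			}
		}
		if len(wave) == 0 {
			return out // cycle detected: stop rather than loop forever
		}
		for _, app := range wave {
			done[app] = true
		}
		out = append(out, wave)
	}
	return out
}

func main() {
	w := waves(map[string][]string{
		"auth":      {},
		"inventory": {},
		"orders":    {"auth", "inventory"},
	})
	fmt.Println(len(w)) // number of migration waves
}
```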
Cost Management
Cloud can be cheaper, but only if you optimize:
// Auto-scaling based on load
type AutoScaler struct {
	minInstances int
	maxInstances int
	targetCPU    float64
}

// Scale returns the desired instance count: add one when CPU is above
// target, remove one when well below it. The 0.5 factor adds hysteresis
// so the fleet doesn't flap around the threshold. min/max are the
// Go 1.21+ builtins.
func (a *AutoScaler) Scale(currentCPU float64, currentInstances int) int {
	if currentCPU > a.targetCPU {
		// Scale up
		return min(currentInstances+1, a.maxInstances)
	} else if currentCPU < a.targetCPU*0.5 {
		// Scale down
		return max(currentInstances-1, a.minInstances)
	}
	return currentInstances
}
Use spot instances for batch workloads:
type BatchProcessor struct {
	spotInstances  []*Instance
	onDemandBackup *Instance
}

func (b *BatchProcessor) ProcessJobs(jobs []Job) error {
	// Try spot instances first (much cheaper)
	for _, instance := range b.spotInstances {
		if instance.Available() {
			return instance.Process(jobs)
		}
	}
	// Fall back to on-demand if no spot capacity is available
	log.Println("Spot instances unavailable, using on-demand")
	return b.onDemandBackup.Process(jobs)
}
Common Pitfalls
Underestimating data transfer costs: Moving terabytes between regions or out of cloud can be expensive. Plan carefully.
Ignoring latency: Cloud resources in different regions add latency. Measure impact on user experience.
Assuming infinite scale: Cloud scales well, but not infinitely. You still need to architect for scale.
Forgetting about egress: Bandwidth out of cloud providers costs money. Consider CDNs and caching.
Security shortcuts: Don't compromise security for speed. Set up proper IAM, encryption, and network isolation from the start.
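The data-transfer and egress pitfalls are easy to put rough numbers on before you commit. A back-of-envelope sketch (the $0.09/GB internet egress rate is a commonly quoted ballpark, not a quote; check your provider's current pricing):

```go
package main

import "fmt"

// egressCost is trivial multiplication, but doing it before the migration
// is the point: a one-time 50 TB pull out of the cloud at ballpark rates
// is thousands of dollars, not noise in the budget.
func egressCost(gb, perGB float64) float64 {
	return gb * perGB
}

func main() {
	// 50 TB = 50 * 1024 GB at an assumed $0.09/GB
	fmt.Printf("50 TB egress: $%.0f\n", egressCost(50*1024, 0.09))
}
```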
Looking Forward
Cloud migration is becoming table stakes. The question isn't whether to migrate, but how to do it effectively. The patterns I've shared (strangler fig, dual writes, feature flags, canary deployments) reduce risk and enable incremental progress.
Start with clear goals: cost reduction, scalability, agility? Choose your migration strategy based on those goals, not what's trendy. And remember: migration is a journey, not a destination. Even after you're "in the cloud," continuous optimization is key.
The cloud rewards systems designed for it. Take the time to refactor thoughtfully, and youβll reap benefits for years to come.