Happy New Year! As enterprises increasingly adopt multi-cloud strategies, we’re seeing growing demand for key management that works consistently across AWS, Azure, and on-premises infrastructure. Customers don’t want to learn different key management APIs for each environment. They want unified key management with consistent policies, audit trails, and operational procedures. Here’s how we’re building this at Thales.
The Multi-Cloud Reality
A few years ago, multi-cloud was mostly theoretical. Organizations picked one cloud provider and stuck with it. Today, multi-cloud is increasingly common:
Strategic diversity: Not putting all eggs in one basket. Using multiple clouds reduces vendor lock-in and provides leverage in negotiations.
Best-of-breed services: AWS excels at some services, Azure at others. Organizations use each cloud where it’s strongest.
Merger and acquisition: When companies merge, they bring different cloud strategies. Suddenly you’re managing both AWS and Azure.
Geographic requirements: Different clouds have different global footprints. Data residency requirements might mandate different clouds for different regions.
Hybrid cloud: Many workloads remain on-premises for cost, compliance, or latency reasons. A true “cloud” strategy includes on-premises infrastructure.
The challenge for key management is providing a consistent experience across these heterogeneous environments.
Abstraction Layer Architecture
Our approach is to build an abstraction layer over provider-specific key management services:
Unified API: Applications use a single API regardless of where keys are stored. The same decrypt(keyId, ciphertext) call works whether the key is in AWS KMS, Azure Key Vault, or an on-premises HSM.
Provider adapters: Under the hood, we have adapters that translate our unified API to provider-specific APIs. The AWS adapter calls KMS, the Azure adapter calls Key Vault, the HSM adapter uses PKCS#11.
Routing layer: Based on key ID and policy, requests are routed to the appropriate backend. The application doesn’t know or care where the key lives.
This architecture provides flexibility. We can move keys between providers without changing applications. We can implement hybrid approaches where some operations use cloud KMS for convenience while others use on-premises HSMs for security.
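The three pieces above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the names KeyBackend, InMemoryBackend, KeyLocation, and KeyRouter are hypothetical, and the in-memory backend only strips a tag prefix so the routing logic can run without cloud credentials. A real adapter would call the provider SDK or PKCS#11 library instead.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class KeyBackend(Protocol):
    """Hypothetical adapter interface that each provider adapter implements."""

    def decrypt(self, backend_key_ref: str, ciphertext: bytes) -> bytes: ...


class InMemoryBackend:
    """Stand-in for a real adapter (AWS KMS, Azure Key Vault, PKCS#11 HSM).

    It "decrypts" by stripping a tag prefix. No real cryptography happens here.
    """

    def __init__(self, name: str) -> None:
        self.name = name

    def decrypt(self, backend_key_ref: str, ciphertext: bytes) -> bytes:
        prefix = f"{self.name}:{backend_key_ref}:".encode()
        if not ciphertext.startswith(prefix):
            raise ValueError("ciphertext was not produced by this key")
        return ciphertext[len(prefix):]


@dataclass(frozen=True)
class KeyLocation:
    backend: str          # e.g. "aws", "azure", "hsm"
    backend_key_ref: str  # provider-specific identifier


class KeyRouter:
    """Maps unified key IDs to a backend and forwards the call."""

    def __init__(self, backends: Dict[str, KeyBackend]) -> None:
        self._backends = backends
        self._locations: Dict[str, KeyLocation] = {}

    def register(self, key_id: str, location: KeyLocation) -> None:
        self._locations[key_id] = location

    def decrypt(self, key_id: str, ciphertext: bytes) -> bytes:
        location = self._locations[key_id]
        backend = self._backends[location.backend]
        return backend.decrypt(location.backend_key_ref, ciphertext)


router = KeyRouter({"aws": InMemoryBackend("aws"), "azure": InMemoryBackend("azure")})
router.register("payments-key", KeyLocation("aws", "alias/payments"))
plaintext = router.decrypt("payments-key", b"aws:alias/payments:hello")
```

The application only ever sees the unified key ID. Re-registering "payments-key" against a different backend changes where the call goes without touching the calling code.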
Provider Capability Differences
AWS KMS and Azure Key Vault have similar capabilities but different details:
Key hierarchy: AWS KMS has Customer Master Keys (CMKs) and data keys. Azure Key Vault has keys and secrets. The concepts map but terminology differs.
Key import: AWS recently added support for importing your own key material. Azure has supported this longer. Implementation details differ significantly.
Access control: AWS uses IAM policies. Azure uses Azure Active Directory and Key Vault access policies. Both work but are structured differently.
Regions: AWS and Azure have different geographic regions. A key in AWS us-east-1 doesn’t map cleanly to Azure East US; they’re different data centers.
Pricing: AWS charges per key per month plus per-request fees. Azure has different pricing tiers. Cost optimization requires understanding both models.
Our abstraction layer papers over these differences where possible and exposes them where necessary. For example, access control is abstracted through our policy engine, but region selection is exposed because applications need to choose based on latency requirements.
Cross-Cloud Key Replication
Some use cases require the same key to be available in multiple clouds. For example, data encrypted in AWS might need to be decrypted in Azure during a migration.
Replicating keys across clouds is tricky:
Security: Keys must be protected during replication. We can’t just copy them over the Internet in plaintext.
Consistency: Key material must be identical in both clouds. If they differ, data encrypted in one cloud can’t be decrypted in another.
Synchronization: When a key is rotated in AWS, we need to propagate the new version to Azure.
Our approach uses envelope encryption patterns:
- Master keys remain in HSMs (either on-premises or CloudHSM).
- Data keys are generated and encrypted by master keys.
- Encrypted data keys can be stored in both AWS KMS and Azure Key Vault.
- When decryption is needed, the encrypted data key is retrieved from the local cloud KMS and decrypted by the master key in the HSM.
This allows “logical” key replication: only encrypted data keys cross cloud boundaries, and plaintext key material is never copied between clouds.
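The flow above can be sketched as follows. Everything here is illustrative: MasterKeyHsm is a hypothetical stand-in for an HSM, the two dictionaries stand in for AWS KMS and Azure Key Vault storage, and _toy_cipher is a deliberately simple XOR keystream used only so the example runs with the standard library. It is not secure and is not the cipher we use; a real deployment would rely on the HSM's key-wrap operation or an authenticated cipher such as AES-GCM.

```python
import hashlib
import os
from typing import Dict


def _toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (same call encrypts and decrypts).

    Illustration only: unauthenticated and NOT for production use.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class MasterKeyHsm:
    """Stand-in for the HSM: the master key never leaves this object."""

    def __init__(self) -> None:
        self._master_key = os.urandom(32)

    def wrap(self, data_key: bytes) -> bytes:
        nonce = os.urandom(16)
        return nonce + _toy_cipher(self._master_key, nonce, data_key)

    def unwrap(self, wrapped: bytes) -> bytes:
        nonce, body = wrapped[:16], wrapped[16:]
        return _toy_cipher(self._master_key, nonce, body)


hsm = MasterKeyHsm()
data_key = os.urandom(32)
wrapped_key = hsm.wrap(data_key)

# The same wrapped blob is stored in both clouds; only ciphertext is copied.
aws_store: Dict[str, bytes] = {"orders-dk": wrapped_key}
azure_store: Dict[str, bytes] = {"orders-dk": wrapped_key}

# Encrypt "in AWS" with the data key.
nonce = os.urandom(16)
ciphertext = _toy_cipher(data_key, nonce, b"card=4111")

# Decrypt "in Azure": fetch the local wrapped copy, then unwrap via the HSM.
recovered_key = hsm.unwrap(azure_store["orders-dk"])
plaintext = _toy_cipher(recovered_key, nonce, ciphertext)
```

Because both clouds hold the same wrapped blob, rotation reduces to wrapping a new data key once and publishing the new blob to each store.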
Network Connectivity Challenges
Multi-cloud introduces network complexity. Our key management service needs to communicate with:
- AWS KMS endpoints in multiple regions
- Azure Key Vault endpoints in multiple regions
- On-premises HSMs in corporate data centers
- Customer applications in various networks
Each connection has different characteristics:
Latency: Calls to AWS KMS from our AWS-hosted services are fast. Calls from on-premises are slower. Cross-cloud calls (our AWS service calling Azure Key Vault) are slowest.
Reliability: Internet connections are less reliable than intra-cloud connections. We need retry logic and circuit breakers.
Security: Connections to cloud providers use TLS. Connections to on-premises HSMs might use VPNs or dedicated circuits.
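As a rough sketch of the retry logic and circuit breakers mentioned under Reliability, assuming a simple consecutive-failure threshold and exponential backoff (the class and function names, thresholds, and delays are illustrative, not our production values):

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are being rejected."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the breaker is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("backend temporarily disabled")
            self._opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.1,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry fn with exponential backoff and re-raise the last error."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not hammer a backend the breaker has disabled
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

A caller would wrap each backend call as with_retries(lambda: breaker.call(do_request)), with one breaker per backend so that a failing Azure endpoint does not block calls to AWS or the HSM.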
We optimize by:
- Caching decrypted data keys to minimize backend calls
- Using regional deployments to minimize cross-region latency
- Implementing smart routing that prefers nearby key stores
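The first optimization, caching decrypted data keys, might look like the sketch below. DataKeyCache and its fetch callback are hypothetical names, and the five-minute TTL is an assumed default chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class DataKeyCache:
    """Caches unwrapped data keys for a short TTL to avoid backend round trips."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # hypothetical call into the KMS or HSM backend
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}  # key_id -> (expiry, key)

    def get(self, key_id: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(key_id)
        if entry is not None and now < entry[0]:
            return entry[1]                      # cache hit, no backend call
        key = self._fetch(key_id)                # miss or expired: go to the backend
        self._entries[key_id] = (now + self._ttl, key)
        return key

    def invalidate(self, key_id: str) -> None:
        """Drop a key immediately, for example after rotation or revocation."""
        self._entries.pop(key_id, None)
```

The TTL is a security tradeoff as much as a performance knob: a cached plaintext key stays usable until it expires, so revocation only takes full effect after the TTL or an explicit invalidate.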
Identity and Access Management Across Clouds
Each cloud has its own identity system. AWS has IAM. Azure has Azure Active Directory. On-premises environments have LDAP, Active Directory, or other systems.
Unified key management requires unified identity:
Federation: We federate identities from customer systems. If a user is authenticated to Azure AD, they can access keys across all clouds, subject to policy.
Service accounts: For application authentication, we issue our own service account credentials that work across all backends.
Token translation: Internally, we translate cloud-specific identity tokens (AWS STS tokens, Azure AD tokens) to our own format that represents identity independent of cloud.
This allows us to enforce consistent access policies regardless of which cloud a request originates from.
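Token translation can be pictured as mapping each provider's identifier onto one internal principal. The sketch below assumes the tokens have already been validated upstream and only shows the mapping step. Principal, IDENTITY_MAP, and the sample account, tenant, and object values are all hypothetical. The ARN layout for assumed roles and the "tid" and "oid" claim names follow the providers' documented formats.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Principal:
    """Cloud-independent identity used by the policy layer (hypothetical)."""
    subject: str
    groups: FrozenSet[str]


PAYMENTS_SERVICE = Principal("payments-service", frozenset({"finance"}))

# Hypothetical mapping from (provider, external id) to an internal principal.
IDENTITY_MAP: Dict[Tuple[str, str], Principal] = {
    ("aws", "123456789012/payments-role"): PAYMENTS_SERVICE,
    ("azure", "tenant-1/object-42"): PAYMENTS_SERVICE,
}


def from_aws(caller_arn: str) -> Principal:
    # Assumed-role ARNs look like arn:aws:sts::<account>:assumed-role/<role>/<session>.
    # The session name changes per login, so map on account and role only.
    parts = caller_arn.split(":")
    account, resource = parts[4], parts[5]
    role = resource.split("/")[1]
    return IDENTITY_MAP[("aws", f"{account}/{role}")]


def from_azure(claims: Dict[str, str]) -> Principal:
    # "tid" (tenant id) and "oid" (object id) are claims in Azure AD tokens.
    return IDENTITY_MAP[("azure", f"{claims['tid']}/{claims['oid']}")]


aws_caller = from_aws("arn:aws:sts::123456789012:assumed-role/payments-role/session-1")
azure_caller = from_azure({"tid": "tenant-1", "oid": "object-42"})
```

Both callers resolve to the same Principal, so downstream policy code never needs to know which cloud issued the original token.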
Policy Enforcement Across Clouds
Policies must work consistently across clouds. If a policy says “finance users can decrypt payment keys only during business hours,” it must apply the same in AWS, Azure, and on-premises.
We centralize policy evaluation in our platform, not in cloud-specific systems. This ensures:
Consistency: The same policy code evaluates all requests regardless of backend.
Auditability: One audit log shows all policy decisions across all clouds.
Flexibility: We can implement richer policies than any single cloud provider supports.
The tradeoff is that we become a central choke point. If our policy service is down, key operations fail even if the backend cloud services are healthy. We mitigate this through high availability and caching of policy decisions.
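To make the “business hours” example concrete, here is a minimal sketch of a centrally evaluated rule with a single audit log. The Rule and Request types are hypothetical, business hours are assumed to be 09:00 to 17:00 on weekdays, and timestamps are assumed to be in the organization's local time.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Request:
    groups: FrozenSet[str]
    key_tag: str
    operation: str
    when: datetime


@dataclass(frozen=True)
class Rule:
    group: str
    key_tag: str
    operation: str
    start: time
    end: time

    def allows(self, req: Request) -> bool:
        return (
            self.group in req.groups
            and self.key_tag == req.key_tag
            and self.operation == req.operation
            and req.when.weekday() < 5                    # Monday to Friday
            and self.start <= req.when.time() < self.end
        )


def evaluate(rules: List[Rule], req: Request,
             audit_log: List[Tuple[str, str, str, bool]]) -> bool:
    """Default deny: allow only if some rule matches, and log every decision."""
    allowed = any(rule.allows(req) for rule in rules)
    audit_log.append((req.when.isoformat(), req.operation, req.key_tag, allowed))
    return allowed


rules = [Rule("finance", "payment", "decrypt", time(9, 0), time(17, 0))]
audit_log: List[Tuple[str, str, str, bool]] = []
finance = frozenset({"finance"})
daytime = evaluate(rules, Request(finance, "payment", "decrypt",
                                  datetime(2017, 1, 10, 10, 0)), audit_log)
night = evaluate(rules, Request(finance, "payment", "decrypt",
                                datetime(2017, 1, 10, 22, 0)), audit_log)
```

The request carries no information about which backend holds the key, which is what lets the same rule apply to AWS, Azure, and on-premises keys alike.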
Cost Optimization
Multi-cloud key management incurs costs across multiple providers. Optimizing costs requires understanding each provider’s pricing:
AWS KMS: $1/key/month + $0.03/10k requests. For low-volume keys the per-key fee dominates, so consolidating keys helps; at high request volumes the per-request fee becomes the larger cost.
Azure Key Vault: Different pricing for standard vs premium tiers. Premium provides HSM-backing. Per-operation costs vary by operation type.
On-premises HSMs: Upfront capital cost plus maintenance. No per-operation fees but finite capacity.
We help customers optimize by:
- Using on-premises HSMs for high-volume operations to avoid per-request fees
- Using cloud KMS for low-volume keys to avoid HSM capacity costs
- Implementing aggressive caching to minimize billable operations
- Providing cost analytics showing spend by cloud and key
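A quick worked example shows why request volume decides which backend is cheaper. It uses only the AWS KMS figures quoted above ($1 per key per month, $0.03 per 10,000 requests) and ignores any free tier. The HSM monthly cost passed to breakeven_requests is a made-up input for illustration, not a real quote.

```python
AWS_KEY_FEE = 1.00                   # dollars per key per month
AWS_FEE_PER_REQUEST = 0.03 / 10_000  # dollars per request


def aws_kms_monthly_cost(num_keys: int, requests: int) -> float:
    """Monthly AWS KMS cost in dollars for a given key count and request volume."""
    return num_keys * AWS_KEY_FEE + requests * AWS_FEE_PER_REQUEST


def breakeven_requests(hsm_monthly_cost: float, num_keys: int) -> float:
    """Monthly request volume above which a fixed-cost HSM is cheaper than AWS KMS."""
    return max(0.0, (hsm_monthly_cost - num_keys * AWS_KEY_FEE) / AWS_FEE_PER_REQUEST)


low_volume = aws_kms_monthly_cost(num_keys=100, requests=1_000_000)
high_volume = aws_kms_monthly_cost(num_keys=10, requests=1_000_000_000)
threshold = breakeven_requests(hsm_monthly_cost=2_000.0, num_keys=10)
```

With 100 keys and one million requests the bill is about $103, almost all of it per-key fees. With 10 keys and one billion requests it is about $3,010, almost all of it request fees. Under the assumed $2,000 monthly HSM cost, the break-even point for 10 keys is roughly 663 million requests per month.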
Compliance Across Cloud Boundaries
Many compliance requirements specify data residency - European data must stay in Europe, for instance. Multi-cloud complicates this.
We enforce data residency through key management:
Regional master keys: Master keys for European data are stored in European data centers only.
Policy enforcement: Policies prevent decryption of European keys from non-European regions.
Audit logging: Logs track where data was decrypted, enabling compliance verification.
This is more reliable than trusting applications to implement data residency correctly. Even if an application tries to move European data to US clouds, it can’t decrypt it because the keys aren’t available outside Europe.
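A residency check of this kind can be sketched as a lookup that fails closed. The mapping tables and the key name below are hypothetical; the region identifiers are real AWS and Azure region names used only as examples.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from provider region names to a residency jurisdiction.
REGION_JURISDICTION: Dict[str, str] = {
    "eu-west-1": "EU",    # AWS, Ireland
    "westeurope": "EU",   # Azure, Netherlands
    "us-east-1": "US",    # AWS, N. Virginia
    "eastus": "US",       # Azure, Virginia
}

# Hypothetical key metadata: the jurisdiction each master key is bound to.
KEY_JURISDICTION: Dict[str, str] = {"eu-customer-master": "EU"}

residency_log: List[Tuple[str, str, bool]] = []


def may_decrypt(key_id: str, caller_region: str) -> bool:
    """Allow decryption only when the caller is inside the key's jurisdiction."""
    required = KEY_JURISDICTION.get(key_id)
    actual = REGION_JURISDICTION.get(caller_region)
    # Fail closed: unknown keys and unknown regions are denied.
    allowed = required is not None and required == actual
    residency_log.append((key_id, caller_region, allowed))
    return allowed
```

Every decision, allowed or denied, lands in the log, which is what makes later compliance verification possible.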
Disaster Recovery Across Clouds
Multi-cloud provides disaster recovery benefits. If one cloud has an outage, we can fail over to another.
Our key management platform is deployed across multiple clouds:
Active-active: Services run in both AWS and Azure simultaneously. Requests can be served from either cloud.
Shared backend: Master keys in on-premises HSMs are accessible from both cloud deployments.
DNS failover: If one cloud is unavailable, DNS routes traffic to the other cloud.
This provides resilience against cloud provider outages, though it introduces complexity in keeping state synchronized across clouds.
Migration Strategies
We’re helping customers migrate between clouds, which requires careful key management:
Phased migration: Some workloads move to the new cloud while others remain. During the transition, keys need to be accessible from both clouds.
Re-encryption: Data encrypted with old cloud keys is decrypted and re-encrypted with new cloud keys. This must happen without exposing plaintext unnecessarily.
Dual operation: For a transition period, data is accessible from both clouds using different keys.
Our platform supports all these patterns, providing tools for safe migration without data exposure or downtime.
Observability Across Clouds
Monitoring multi-cloud key management is challenging. We need visibility into:
- Key operations in each cloud
- Latency by cloud and region
- Error rates by backend
- Cost by cloud provider
- Policy decisions across all clouds
We aggregate all this into a unified dashboard. Operations teams see one view regardless of how many clouds we’re spanning.
The underlying implementation is complex - collecting metrics from AWS CloudWatch, Azure Monitor, on-premises systems, and our own services, then normalizing and aggregating them.
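The normalization step can be illustrated with a small sketch. The input dictionaries are simplified, hypothetical record shapes loosely modeled on what CloudWatch and Azure Monitor return, and MetricPoint is an invented common format. The sketch assumes Azure values are already in milliseconds.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class MetricPoint:
    """Hypothetical provider-neutral latency sample, always in milliseconds."""
    backend: str
    region: str
    value_ms: float


def from_cloudwatch(region: str, datapoint: Dict) -> MetricPoint:
    # Convert to milliseconds when the sample is reported in seconds.
    scale = 1000.0 if datapoint["Unit"] == "Seconds" else 1.0
    return MetricPoint("aws-kms", region, datapoint["Average"] * scale)


def from_azure_monitor(region: str, datapoint: Dict) -> MetricPoint:
    # Assumed to be in milliseconds already.
    return MetricPoint("azure-key-vault", region, datapoint["average"])


def mean_latency_by_backend(points: Iterable[MetricPoint]) -> Dict[str, float]:
    grouped: Dict[str, List[float]] = defaultdict(list)
    for point in points:
        grouped[point.backend].append(point.value_ms)
    return {backend: mean(values) for backend, values in grouped.items()}


points = [
    from_cloudwatch("us-east-1", {"Average": 8.0, "Unit": "Milliseconds"}),
    from_cloudwatch("eu-west-1", {"Average": 0.012, "Unit": "Seconds"}),
    from_azure_monitor("eastus", {"average": 15.0}),
]
summary = mean_latency_by_backend(points)
```

Converting units at the edge, inside each provider-specific function, keeps the aggregation code free of provider quirks.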
Lessons Learned
After six months of production multi-cloud deployment:
Abstraction has costs: Our abstraction layer adds latency (usually 5-10ms). For most use cases this is acceptable, but high-performance applications notice.
Testing is harder: We need to test behavior across all cloud combinations, such as AWS KMS with Azure compute, Azure Key Vault with AWS compute, and on-premises HSMs with cloud compute.
Cloud-specific issues: Each cloud has its own quirks. AWS KMS throttling behaves differently than Azure Key Vault rate limiting. We need provider-specific handling.
Network reliability varies: Internet connectivity between clouds is less reliable than we expected. Aggressive retry logic is essential.
Identity federation is complex: Mapping identities across different systems is one of the hardest parts.
Looking Forward
Multi-cloud is becoming the default, not the exception. We’re investing in:
Better cost analytics: More sophisticated tools for understanding and optimizing multi-cloud costs.
Performance optimization: Reducing abstraction layer latency through protocol improvements and caching.
Additional cloud support: Adding GCP (Google Cloud Platform) support based on customer demand.
Cloud-native features: Leveraging unique features of each cloud (AWS CloudHSM, Azure Dedicated HSM) through our platform.
Key Takeaways
For teams building multi-cloud key management:
- Build abstraction layers for consistent application experience across clouds
- Understand and accommodate provider-specific differences in capabilities and pricing
- Use envelope encryption for logical key replication across clouds
- Centralize policy enforcement rather than relying on cloud-specific systems
- Plan for network latency and reliability challenges
- Implement comprehensive observability across all clouds
- Test extensively across all cloud combinations
- Design for disaster recovery using multi-cloud redundancy
Multi-cloud key management is complex but increasingly necessary. The benefits - avoiding lock-in, leveraging best-of-breed services, improving resilience - justify the investment. As cloud adoption continues, unified key management across clouds will become a critical capability for enterprise security.