As 2014 comes to a close, I’m reflecting on a year of significant evolution for FC-Redirect and my own growth as an engineer. While 2013 was about raw performance and scale, 2014 was about architectural maturity, platform expansion, and cloud integration. Here’s my year in review.
The Numbers
Let me start with the quantitative achievements:
Platform Expansion:
- Platforms supported: 3 → 5 (MDS 9250i, N7000, MDS 9700 preview, OpenStack, AWS integration)
- Customer deployments: 50 → 150 (3x growth)
- Largest deployment: 768 ports, 500K concurrent flows
Performance:
- Throughput: 2.9M → 4.2M packets/sec (+45%)
- Latency: 2.0μs → 1.4μs P99 (-30%)
- Flow capacity: 12K → 500K (about 42x)
Reliability:
- Uptime: 99.999% maintained
- Zero data loss incidents
- Zero security breaches
Technology Adoption:
- Container support: Docker volume plugin
- Cloud integration: OpenStack Cinder driver, AWS hybrid storage
- Big data: Spark analytics on 6TB/month flow data
- Consensus: Raft implementation for distributed coordination
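As a quick sanity check, the ratios quoted in the lists above can be reproduced with a few lines of Python. The helper names are mine, chosen for illustration; the inputs are the figures from this post. The second helper converts an availability percentage into a yearly downtime budget, which makes “five nines” concrete: a little over five minutes per year.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime allowed by an availability target (365-day year)."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    print(f"throughput:    {percent_change(2.9, 4.2):+.0f}%")
    print(f"P99 latency:   {percent_change(2.0, 1.4):+.0f}%")
    print(f"flow capacity: {500_000 / 12_000:.1f}x")
    print(f"99.999% uptime allows {downtime_minutes_per_year(99.999):.2f} min/year")
```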
These numbers tell a story of maturation, but the real story is in the architectural evolution.
Major Technical Achievements
1. Multi-Platform Architecture
The biggest architectural change was true multi-platform support. FC-Redirect is no longer a single-platform solution. We now run on:
MDS 9250i (original platform):
- Optimized for ASIC-based processing
- Best performance/watt ratio
- Mature, stable
Nexus 7000 (converged platform):
- Distributed line card processing
- Multi-tenancy (VDC support)
- FC + Ethernet convergence
MDS 9700 (next-gen platform):
- 768 ports, 96 cores
- Hardware telemetry (INT)
- Massive scale (500K flows)
OpenStack (cloud platform):
- RESTful API
- Dynamic provisioning
- Multi-tenant isolation
AWS (hybrid cloud):
- Tiered storage (FC → EBS → S3 → Glacier)
- Disaster recovery
- Cloud bursting
Each platform required deep architectural work. The N7000’s distributed caching, the 9700’s NUMA awareness, OpenStack’s dynamic zoning, and AWS hybrid storage all pushed us to build more flexible, adaptable architectures.
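One way to picture a “flexible, adaptable” architecture is a shared control plane that talks to per-platform adapters. The sketch below is an assumption about the shape of such a design, not FC-Redirect’s actual code: `PlatformBackend`, `install_redirect`, `backends_with`, and the feature tags are all hypothetical names, and each backend advertises only features mentioned in the platform lists above.

```python
from abc import ABC, abstractmethod


class PlatformBackend(ABC):
    """Hypothetical per-platform adapter behind a shared control plane."""

    name: str
    features: frozenset  # feature tags this backend advertises

    @abstractmethod
    def install_redirect(self, flow_id: str, target: str) -> str:
        """Program one redirect rule; returns a description of what was done."""


class Mds9700Backend(PlatformBackend):
    name = "mds-9700"
    features = frozenset({"hardware-telemetry"})

    def install_redirect(self, flow_id, target):
        return f"{self.name}: redirect {flow_id} -> {target}"


class Nexus7000Backend(PlatformBackend):
    name = "n7000"
    features = frozenset({"multi-tenancy", "fc-ethernet-convergence"})

    def install_redirect(self, flow_id, target):
        return f"{self.name}: line cards redirect {flow_id} -> {target}"


def backends_with(backends, required):
    """Return the backends that advertise every feature in `required`."""
    required = frozenset(required)
    return [b for b in backends if required <= b.features]
```

A feature missing from a backend’s set here only means the post did not list it for that platform, not that the platform lacks it.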
2. Cloud-Native Features
The cloud revolution forced us to evolve from hardware-centric to API-centric thinking:
APIs First:
- RESTful API for all operations
- Python SDK
- OpenStack integration
- AWS integration
Dynamic Provisioning:
- Volume creation: 5 seconds → 100ms (50x faster)
- Storage pools (pre-provisioned volumes)
- On-demand zoning
Multi-Tenancy:
- Per-tenant quotas
- Resource isolation
- Usage metering
Automation:
- Docker volume plugin
- OpenStack Cinder driver
- Terraform provider
This transformation from “hardware appliance” to “cloud-native service” was profound.
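A plausible reading of the Dynamic Provisioning list is that pre-provisioned pools are what make fast volume creation possible: the slow work happens ahead of time, and a request just takes a ready volume off the shelf. Here is a minimal sketch of that idea. `VolumePool` and the `create_volume` callback are hypothetical names, and a real pool would refill in the background and handle locking.

```python
from collections import deque


class VolumePool:
    """Keeps `target_size` volumes created in advance (illustrative only)."""

    def __init__(self, create_volume, target_size):
        self._create = create_volume   # slow call that builds a real volume
        self._target = target_size
        self._ready = deque()
        self.refill()

    def refill(self):
        """Top the pool back up; in practice this would run off the request path."""
        while len(self._ready) < self._target:
            self._ready.append(self._create())

    def allocate(self):
        """Fast path: hand out a ready volume. Slow path: create one on demand."""
        if self._ready:
            return self._ready.popleft()
        return self._create()

    def available(self):
        return len(self._ready)
```

The design choice is a trade: the pool spends capacity on volumes nobody has asked for yet in exchange for taking creation latency off the request path.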
3. Distributed Systems Maturity
We replaced our homegrown consensus protocol with Raft. This sounds simple, but it fundamentally changed how we think about distributed systems:
Before (Homegrown):
- Complex, hard to understand
- Subtle bugs
- Difficult to reason about correctness
After (Raft):
- Well-specified algorithm
- Understandable by entire team
- Formally specified, with published safety proofs
- Zero consensus bugs in 6 months
This taught me an important lesson: use proven algorithms and protocols. Don’t reinvent the wheel, especially for critical functionality like distributed consensus.
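To make “well-specified” concrete: one of Raft’s rules is that a leader only advances its commit index to an entry that is stored on a majority of servers and was created in the leader’s current term. The sketch below implements just that one rule. The function and parameter names are illustrative and are not taken from FC-Redirect’s implementation.

```python
def advance_commit_index(commit_index, match_index, log_terms, current_term):
    """Return the leader's new commit index under Raft's commit rule.

    commit_index: highest log index currently known to be committed
    match_index:  highest replicated index per server, leader included
    log_terms:    log_terms[i] is the term of log entry i + 1 (1-based log)
    current_term: the leader's current term
    """
    majority = len(match_index) // 2 + 1
    # Try the highest candidate first so we commit as far as is safe.
    for n in range(max(match_index), commit_index, -1):
        replicated = sum(1 for m in match_index if m >= n)
        # Entries from older terms are never committed by counting replicas;
        # they become committed indirectly once a current-term entry commits.
        if replicated >= majority and log_terms[n - 1] == current_term:
            return n
    return commit_index
```

The current-term check is exactly the sort of subtlety a homegrown protocol tends to miss, and having it written down in the algorithm’s specification is a large part of why the whole team can reason about it.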
4. Observability and Analytics
We built comprehensive observability:
Metrics:
- Prometheus integration
- Grafana dashboards
- 1000+ metrics tracked
Logging:
- Structured logging (JSON)
- Centralized log aggregation
- Search and analysis
Tracing:
- Distributed tracing
- Request flow visualization
- Performance profiling
Analytics:
- Apache Spark for big data
- 6TB/month of flow data
- ML-based anomaly detection
- Capacity forecasting
This observability has been transformative. We detect issues in seconds instead of hours, and we understand system behavior at unprecedented depth.
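For the logging piece, structured output is what makes centralized search and analysis practical: every event is one JSON object with predictable keys. A minimal sketch using only Python’s standard `logging` module follows. The `JsonFormatter` class, the `fields` convention, and the example field names are my own illustrations, not FC-Redirect’s schema.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured context is passed as extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name, stream):
    """Build a logger that writes JSON lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = make_logger("fc_redirect.demo", buffer)
    log.info("flow redirected", extra={"fields": {"flow_id": "f-1", "port": 12}})
    print(buffer.getvalue(), end="")
```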
5. Container Support
Docker adoption exploded in 2014. We built first-class container support:
Docker Volume Plugin:
- FC-backed persistent volumes
- Sub-second provisioning
- Container-aware QoS
Kubernetes Integration:
- FlexVolume driver
- Dynamic provisioning
- StatefulSets support
Container Optimizations:
- Fast volume operations (<100ms)
- High operation throughput (1000 ops/sec)
- Container-specific QoS policies
Containers are fundamentally different from VMs. Building proper container support required rethinking storage lifecycle, I/O patterns, and QoS.
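This post doesn’t describe how per-container QoS is enforced. A token bucket is one common way to cap an operation rate while still allowing short bursts, so here is a sketch under that assumption. The class name, parameters, and the idea of passing the clock in explicitly (which keeps the example deterministic) are all illustrative.

```python
class TokenBucket:
    """Rate limiter: `rate_per_sec` sustained, up to `burst` operations at once."""

    def __init__(self, rate_per_sec, burst, now=0.0):
        self.rate = float(rate_per_sec)
        self.capacity = float(burst)
        self.tokens = float(burst)   # start full so an idle container can burst
        self.last = now

    def allow(self, now, cost=1.0):
        """Return True and spend `cost` tokens if the operation may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, never beyond the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

One bucket per container (or per tenant) would give each its own limit; short-lived containers make it important that creating and discarding a bucket is cheap.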
Lessons from Production
Some of my best learning came from production experiences:
The Importance of Profiling
I optimized code that wasn’t the bottleneck more times than I’d like to admit. The lesson: always profile first. Intuition about bottlenecks is often wrong.
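“Profile first” can be as cheap as wrapping a call in Python’s built-in `cProfile` and reading the top few entries by cumulative time. The wrapper below is a small sketch: `profile_top`, `parse_flows`, and `checksum` are hypothetical names, and the workload is a stand-in for whatever code path you suspect.

```python
import cProfile
import io
import pstats


def profile_top(func, *args, limit=5, **kwargs):
    """Run `func` under cProfile; return (result, text report of top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, out.getvalue()


def checksum(values):
    """Stand-in for the code you assumed was slow."""
    return sum(values) % 65_521


def parse_flows(n):
    """Stand-in workload: build records, then checksum them."""
    records = [int(str(i)[::-1]) for i in range(n)]
    return checksum(records)


if __name__ == "__main__":
    _, report = profile_top(parse_flows, 200_000)
    print(report)
```

Sorting by cumulative time shows which call tree dominates, and that is often not the function you were about to rewrite.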
Lock-Free Isn’t Always Better
I spent weeks building lock-free data structures that were slower than simple locks in practice. The lesson: measure, don’t assume. Lock-free helps with high contention, but most code doesn’t have high contention.
Abstractions Have Costs
Beautiful abstractions sometimes hide performance problems. The lesson: abstractions must be zero-cost in critical paths. Use them in control planes, minimize them in data planes.
Testing at Scale Matters
Bugs that don’t appear at 1K flows emerge catastrophically at 100K flows. The lesson: test beyond your target scale. If customers want 100K flows, test at 1M flows.
Documentation Prevents Bugs
Well-documented code has fewer bugs. When I explain why code does something, I often discover it shouldn’t do that. The lesson: documentation is a debugging tool, not just communication.
Personal Growth
Beyond technical skills, I’ve grown professionally:
From Individual Contributor to Technical Leader
I’ve started mentoring junior engineers and leading architecture discussions. This shift from “doing everything myself” to “enabling others to do great work” has been challenging but rewarding.
From Code-First to Design-First
Early in my career, I’d jump straight to coding. Now I spend more time on design, architecture, and thinking through edge cases. The best code is code you don’t write because you designed away the problem.
From Local to Systems Thinking
I’ve become better at seeing systems as wholes, not collections of components. Understanding emergent behavior, feedback loops, and system-level properties makes me more effective at designing complex systems.
From Reactive to Proactive
Instead of fixing problems as they arise, I’ve gotten better at preventing them through careful design, comprehensive testing, and good observability.
Technology Trends
2014 was a fascinating year in the broader technology landscape:
Containers Everywhere:
- Docker reached 1.0
- Kubernetes emerged from Google
- Container ecosystem exploded
Cloud Maturity:
- AWS continued to dominate
- OpenStack adoption grew
- Hybrid cloud became mainstream
Big Data Evolution:
- Spark emerging as Hadoop successor
- Real-time analytics growing
- ML becoming accessible
Microservices Architecture:
- Moving from monoliths to services
- API-first design
- DevOps culture
These trends influenced how we built FC-Redirect. We didn’t just build storage networking software; we built cloud-native, container-ready, API-driven infrastructure.
What Didn’t Go Well
Not everything was successful:
Over-Engineering
I built several features that customers didn’t need. The lesson: validate demand before building. “If you build it, they will come” doesn’t apply to enterprise software.
Technical Debt
We accumulated technical debt in pursuit of features. In 2015, we’ll need to pay it down through refactoring and modernization.
Communication Gaps
Better communication with product management and customers would have prevented several missteps. The lesson: technical excellence isn’t enough. You must understand and communicate with stakeholders.
Looking Ahead to 2015
Several exciting projects are on the horizon:
NVMe over Fabrics
NVMe is revolutionizing storage. We’re working on NVMe over Fibre Channel support, which will require significant architectural changes for sub-microsecond latencies.
Kubernetes Native Storage
Kubernetes is becoming the container orchestration standard. We’re building native Kubernetes integration (CSI driver, storage classes, dynamic provisioning).
Machine Learning for Optimization
We’re exploring using ML to automatically optimize flow placement, predict capacity needs, and detect anomalies.
Global Scale
Some customers want to span FC-Redirect across data centers globally. This requires solving interesting problems around WAN latency, consistency, and failure domains.
Performance Targets
- 10M packets/sec (up from 4.2M)
- Sub-microsecond latency (down from 1.4μs)
- 1M concurrent flows (up from 500K)
Gratitude
This year’s achievements were team efforts:
- Colleagues at Cisco for collaboration and support
- Customers for pushing us to scale and sharing their use cases
- Open source community for projects like Raft, Spark, and Docker
- Family for supporting long hours and travel
Reflections
2014 was a year of maturation. We moved from “making it work” to “making it production-ready” to “making it cloud-native.” Each transition required new skills and perspectives.
The most important lesson: sustainable systems require discipline. Fast code is useless if it’s unmaintainable. Scalable architecture is pointless if it’s unreliable. Performance matters, but so do correctness, observability, and operability.
Building production distributed systems is endlessly challenging. Just when you think you understand something, you encounter a new scale, a new failure mode, or a new requirement that challenges your assumptions.
But that’s what makes it exciting. Every day brings new problems to solve, new technologies to learn, and new systems to build.
Goals for 2015
As I look ahead:
- NVMe over Fabrics support with sub-microsecond latencies
- 10M packets/sec throughput while maintaining efficiency
- Native Kubernetes integration (CSI driver, storage classes)
- Machine learning for automatic optimization
- Global scale (multi-datacenter deployments)
- Technical debt reduction through systematic refactoring
- Mentorship of junior engineers
- Industry engagement (conference talks, open source contributions)
Conclusion
2014 was transformative. We expanded from a single-platform performance optimization project to a multi-platform, cloud-native, container-ready storage infrastructure solution.
The technology industry is evolving rapidly: containers, cloud, big data, microservices, machine learning. Storage networking must evolve with it. We can’t just be fast; we must be flexible, programmable, and cloud-native.
FC-Redirect is no longer just about redirecting flows. It’s about providing intelligent, automated, cloud-native storage networking that works seamlessly across on-premises and cloud environments.
As we head into 2015, I’m excited about the challenges ahead. NVMe, Kubernetes, machine learning, and global scale will push us to new limits.
But I’m confident we’ll meet these challenges. We’ve built strong foundations: solid architecture, comprehensive observability, proven algorithms, and a deep understanding of distributed systems.
Here’s to 2015 and the adventures it will bring!
Thanks for reading, and happy holidays!