Continuous integration and deployment seem at odds with security. Security teams want careful, controlled releases with multiple approval gates. DevOps teams want to deploy multiple times per day with minimal friction. We’ve built CI/CD pipelines for our key management platform that satisfy both requirements. Here’s how.

The Challenge

Traditional security software releases were slow and manual:

  • Developer makes changes, creates release branch
  • Security team reviews code
  • QA team performs manual testing
  • Operations team schedules maintenance window
  • Manual deployment to production
  • Extensive validation before declaring success

This process took weeks. Meanwhile, critical security patches sat waiting for the next release window.

We needed faster deployment cycles without sacrificing security rigor. CI/CD provides the automation, but security requirements shape how we implement it.

Pipeline Architecture

Our CI/CD pipeline has several stages:

Code commit → Build → Unit tests → Security scans → Integration tests → Deploy to staging → Automated testing → Manual approval → Deploy to production → Validation

Each stage has pass/fail criteria. Failures halt the pipeline. This ensures every deployment meets quality and security standards.
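
A minimal sketch of this halting behavior, in Python for illustration. The function name `run_pipeline` and the stage callables are hypothetical and are not taken from the Jenkins setup described here; each stage is modeled as a function that returns True on pass.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that returns True on pass (hypothetical model).
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first one that fails.

    Returns (succeeded, names of the stages that actually ran).
    """
    executed: List[str] = []
    for name, check in stages:
        executed.append(name)
        if not check():
            # A failing stage halts the pipeline; later stages never run.
            return False, executed
    return True, executed


if __name__ == "__main__":
    demo = [
        ("build", lambda: True),
        ("unit tests", lambda: True),
        ("security scans", lambda: False),  # simulated failure
        ("deploy to staging", lambda: True),
    ]
    print(run_pipeline(demo))
```

In this model a failed security scan means the staging deploy is never attempted, which is the property the pipeline relies on.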

Source Control and Branching

We use Git with a specific branching strategy:

Feature branches: All development happens in feature branches, never directly on main.

Pull requests: Feature branches merge to main via pull requests requiring code review.

Main branch: Always deployable. Every commit to main triggers the CI/CD pipeline.

Release tags: Production deployments are tagged with semantic versions (v1.2.3).

This provides an audit trail: we can trace every line of production code to a specific commit, pull request, and reviewer.
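
As an illustration of the tag format, the sketch below checks that a release tag has the v1.2.3 shape before it is accepted. The names `RELEASE_TAG` and `parse_release_tag` are hypothetical, and the pattern covers only the core MAJOR.MINOR.PATCH form; whether pre-release suffixes are used is not stated here, so they are rejected by assumption.

```python
import re
from typing import Optional, Tuple

# Core MAJOR.MINOR.PATCH with a leading "v". Pre-release and build-metadata
# suffixes are deliberately not accepted in this sketch (an assumption).
RELEASE_TAG = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_release_tag(tag: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) for a valid tag, otherwise None."""
    match = RELEASE_TAG.fullmatch(tag)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch
```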

Automated Building

When code is committed to main, the CI server (we use Jenkins) automatically:

  1. Checks out the code
  2. Runs build (compiles Go code, creates binaries)
  3. Builds Docker container images
  4. Tags images with git commit SHA and timestamp
  5. Pushes images to private container registry

Builds are reproducible: the same commit always produces the same binary. We verify this by comparing checksums.
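
One way to express the checksum comparison is sketched below. SHA-256 is an assumption (the hash algorithm is not specified here), and the helper names `sha256_of` and `builds_match` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when two build outputs are byte-for-byte identical."""
    return sha256_of(first) == sha256_of(second)
```

Building the same commit twice on clean workers and calling `builds_match` on the two binaries would flag any nondeterminism, such as embedded timestamps.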

Unit and Integration Testing

Automated testing is extensive:

Unit tests: Run for every commit. Fast tests of individual functions and modules. Coverage requirement: 80%+.

Integration tests: Test interactions between components. Spin up test containers with mocked dependencies.

Contract tests: Verify that services satisfy API contracts their clients expect.

Tests must pass before code can merge to main. No exceptions - even urgent security patches go through the test suite.

For our Go services, tests run in seconds. For Java services, minutes. Fast tests enable rapid iteration.

Security Scanning

Security scans run automatically on every build:

Dependency scanning: Check for known vulnerabilities in third-party libraries. We use Snyk, which maintains a vulnerability database.

Container scanning: Scan Docker images for vulnerabilities in base images and installed packages. Critical vulnerabilities fail the build.

Static code analysis: Automated code review looking for security issues (SQL injection, command injection, hardcoded secrets, etc.).

License compliance: Verify all dependencies use acceptable licenses.

Scans that find critical issues fail the build. High-severity issues generate tickets for review. Medium/low issues are tracked but don’t block deployment.
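
The severity policy above can be written down as a small lookup. This is a sketch only: the names `Action`, `POLICY`, `action_for`, and `build_passes` are hypothetical, and treating an unrecognized severity as blocking is an assumption, not something the scanners themselves dictate.

```python
from enum import Enum
from typing import Iterable


class Action(Enum):
    FAIL_BUILD = "fail_build"
    OPEN_TICKET = "open_ticket"
    TRACK_ONLY = "track_only"


# Mapping taken from the policy described in the text.
POLICY = {
    "critical": Action.FAIL_BUILD,
    "high": Action.OPEN_TICKET,
    "medium": Action.TRACK_ONLY,
    "low": Action.TRACK_ONLY,
}


def action_for(severity: str) -> Action:
    # Assumption: unknown severities fail closed and block the build.
    return POLICY.get(severity.lower(), Action.FAIL_BUILD)


def build_passes(severities: Iterable[str]) -> bool:
    """True when no finding maps to FAIL_BUILD."""
    return all(action_for(s) is not Action.FAIL_BUILD for s in severities)
```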

This automated security review catches issues earlier than manual security review alone.

Staging Deployment

After passing all automated checks, code deploys automatically to staging:

Staging environment: Kubernetes cluster mirroring production configuration.

Automated deployment: Helm charts define Kubernetes resources. CI server deploys via helm upgrade.

Health checks: Deployment waits for health checks to pass before declaring success.

Smoke tests: Automated smoke tests verify basic functionality after deployment.

Staging deployment is fully automated - no human intervention. This happens dozens of times per day.

Production Deployment Gating

Production deployment requires additional gates:

Manual approval: Senior engineer or team lead must approve production deployment.

Change window: Production deployments are allowed only during business hours, when the full team is available, except in emergencies.

Deployment freeze: Freeze deployments during critical business periods (month-end processing, for example).

Progressive rollout: New versions deploy gradually - first to 10% of traffic, then 50%, then 100%.

These gates slow deployment but prevent problematic releases from impacting all users immediately.
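
A rough sketch of how these gates combine. Several details are assumptions made for illustration: business hours are taken as 09:00 to 17:00 on weekdays, an emergency is assumed to bypass the change window but not a freeze, and all names (`may_deploy`, `next_rollout_step`, and so on) are hypothetical.

```python
from datetime import date, datetime, time
from typing import FrozenSet, Optional

ROLLOUT_STEPS = (10, 50, 100)  # percent of traffic, as described above
WINDOW_START = time(9, 0)      # assumed business hours
WINDOW_END = time(17, 0)


def in_change_window(now: datetime) -> bool:
    """Weekday and within the assumed business hours."""
    return now.weekday() < 5 and WINDOW_START <= now.time() < WINDOW_END


def may_deploy(
    now: datetime,
    approved: bool,
    freeze_days: FrozenSet[date] = frozenset(),
    emergency: bool = False,
) -> bool:
    if not approved:
        return False  # manual approval is always required
    if now.date() in freeze_days:
        return False  # assumption: a freeze holds even for emergencies
    return emergency or in_change_window(now)


def next_rollout_step(current_percent: int) -> Optional[int]:
    """Next traffic percentage, or None once fully rolled out."""
    for step in ROLLOUT_STEPS:
        if step > current_percent:
            return step
    return None
```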

Deployment Automation

Production deployment uses the same Helm charts as staging:

  1. Engineer approves deployment in CI system
  2. CI server deploys to production Kubernetes cluster using Helm
  3. Kubernetes performs rolling update, gradually replacing old pods
  4. Health checks ensure new pods are healthy before continuing
  5. If health checks fail, rollback occurs automatically
  6. Once all pods updated, run validation tests

The deployment itself is automated and consistent. Human judgment is exercised at the approval gate, not during deployment execution.
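
The update-then-check loop in steps 3 to 5 can be modeled in a few lines. This is a toy model of the flow only, not how Kubernetes or Helm implement it; `rolling_update` and the `is_healthy` callback are hypothetical names.

```python
from typing import Callable, List


def rolling_update(
    pods: List[str],
    new_version: str,
    is_healthy: Callable[[int, str], bool],
) -> List[str]:
    """Replace pods one at a time, checking health after each replacement.

    Returns the new pod versions, or the original versions if any
    replacement fails its health check (automatic rollback).
    """
    original = list(pods)
    updated = list(pods)
    for index in range(len(updated)):
        updated[index] = new_version
        if not is_healthy(index, new_version):
            # Stop immediately and restore every pod to its old version.
            return original
    return updated
```

Because the health check runs after each replacement, a bad release stops at the first unhealthy pod instead of replacing the whole fleet.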

Rollback Procedures

Despite careful testing, issues sometimes reach production. Fast rollback is essential:

Automated rollback: If health checks or validation tests fail, automatic rollback to previous version.

Manual rollback: Engineers can trigger manual rollback via CI system.

Helm rollback: Helm tracks release history and can roll back with one command.

Database migrations: Backwards-compatible database changes allow code rollback without data migration rollback.

We practice rollbacks regularly in staging to ensure they work when needed.

Database Migrations

Database schema changes are tricky in CI/CD. Our approach:

Backwards compatibility: Schema changes must be compatible with previous code version. This allows deploying schema before code.

Separate migration deployment: Database migrations deploy before application code, giving time to verify success.

Incremental changes: Large schema changes broken into small, backwards-compatible increments.

Automated testing: Integration tests verify both old and new code work with migrated schema.

This prevents scenarios where code and schema are mismatched.
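
To make the backwards-compatibility rule concrete, the sketch below uses an in-memory SQLite database. The table `key_metadata` and its columns are hypothetical, and SQLite stands in for whatever database is actually used. The migration only adds a nullable column, so inserts written by the previous code version keep working after the schema has changed.

```python
import sqlite3
from typing import List, Optional, Tuple


def create_v1_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE key_metadata (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


def migrate_to_v2(conn: sqlite3.Connection) -> None:
    # Additive change: a nullable column, so old INSERTs that omit it still work.
    conn.execute("ALTER TABLE key_metadata ADD COLUMN owner TEXT")


def old_code_insert(conn: sqlite3.Connection, name: str) -> None:
    # Previous code version: unaware of the new column.
    conn.execute("INSERT INTO key_metadata (name) VALUES (?)", (name,))


def new_code_insert(conn: sqlite3.Connection, name: str, owner: str) -> None:
    conn.execute(
        "INSERT INTO key_metadata (name, owner) VALUES (?, ?)", (name, owner)
    )


def rows(conn: sqlite3.Connection) -> List[Tuple[str, Optional[str]]]:
    return conn.execute(
        "SELECT name, owner FROM key_metadata ORDER BY id"
    ).fetchall()
```

Running `old_code_insert` both before and after `migrate_to_v2` succeeds, which is the situation during the gap between the schema deploy and the code deploy, and again after a code rollback.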

Secrets Management in CI/CD

CI/CD pipelines need access to secrets (container registry credentials, Kubernetes cluster credentials, etc.). We handle this carefully:

Secret store: Secrets stored in HashiCorp Vault, not in CI configuration or environment variables.

Temporary credentials: CI server gets temporary credentials for each build, valid only for duration of pipeline.

Minimal permissions: CI credentials carry only the permissions they need. They can deploy to staging automatically, but production requires elevated approval.

Audit logging: All secret access logged for security review.

No secrets in code: Automated scans prevent secrets from being committed to repositories.
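
As an illustration of the last point, a secret scan can be as simple as matching each line against known patterns. The two patterns below are examples chosen for this sketch, not the rule set of any particular scanner; real tools use far larger rule sets and entropy checks. `PATTERNS` and `find_secrets` are hypothetical names.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; a real scanner would carry many more rules.
PATTERNS = {
    "private key header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hardcoded password": re.compile(
        r"\bpassword\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE
    ),
}


def find_secrets(text: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern label) for every suspicious line."""
    findings: List[Tuple[int, str]] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings
```

A pre-merge check would fail whenever `find_secrets` returns a non-empty list for a changed file.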

Compliance and Audit

CI/CD must support compliance requirements:

Immutable pipeline logs: Every pipeline execution fully logged, logs retained for compliance period.

Approval audit trail: Who approved what deployment and when.

Change tracking: Every deployment linked to specific code commits, tickets, and approvals.

Rollback history: Every rollback recorded with reason and approver.

Compliance reporting: Automated reports showing all production deployments, who approved, and whether they passed security scans.

This provides auditors with a complete picture of how code reaches production.

Environment Parity

Staging must accurately represent production:

Same Kubernetes version: Staging and production use identical Kubernetes versions.

Same infrastructure: Staging infrastructure matches production (smaller scale but same configuration).

Same data volume: Staging has production-like data volumes (anonymized production data).

Same dependencies: Staging connects to same types of dependencies (HSMs, databases, etc.).

Without parity, issues appear in production that weren’t caught in staging.
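
Parity can be checked mechanically by diffing environment descriptions. In this sketch each environment is a flat dictionary of settings; the keys and the `parity_drift` helper are hypothetical, and scale-related keys such as `replicas` are ignored because staging is intentionally smaller.

```python
from typing import Any, Dict, Iterable, List

_MISSING = object()  # marks a key that one environment lacks entirely


def parity_drift(
    staging: Dict[str, Any],
    production: Dict[str, Any],
    ignore: Iterable[str] = ("replicas",),
) -> List[str]:
    """Return the sorted setting names that differ between environments."""
    skipped = set(ignore)
    keys = (set(staging) | set(production)) - skipped
    return sorted(
        key
        for key in keys
        if staging.get(key, _MISSING) != production.get(key, _MISSING)
    )
```

An empty result means the two environments agree on everything except scale.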

Metrics and Monitoring

The CI/CD pipeline itself is monitored:

Pipeline duration: How long from commit to production? Track trends over time.

Failure rates: What percentage of builds fail? Which stages fail most often?

Test coverage: Code coverage trends over time.

Deployment frequency: How often are we deploying to production?

Mean time to recovery: When issues occur, how quickly do we recover?

These metrics guide process improvements.
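
Most of these metrics are simple aggregates over pipeline run records. The sketch below assumes a hypothetical `PipelineRun` record and a list of (detected, resolved) incident timestamps; the actual data source and field names will differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class PipelineRun:
    started: datetime
    finished: datetime
    succeeded: bool
    deployed_to_production: bool = False


def failure_rate(runs: List[PipelineRun]) -> float:
    """Fraction of runs that failed (0.0 for an empty list)."""
    if not runs:
        return 0.0
    return sum(1 for run in runs if not run.succeeded) / len(runs)


def mean_duration(runs: List[PipelineRun]) -> Optional[timedelta]:
    if not runs:
        return None
    total = sum((run.finished - run.started for run in runs), timedelta())
    return total / len(runs)


def production_deploys(runs: List[PipelineRun]) -> int:
    return sum(1 for run in runs if run.succeeded and run.deployed_to_production)


def mean_time_to_recovery(
    incidents: List[Tuple[datetime, datetime]],
) -> Optional[timedelta]:
    """Average of (resolved - detected) across incidents."""
    if not incidents:
        return None
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)
```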

Automated Testing in Production

Even with staging testing, production is different. We run automated tests in production:

Synthetic monitoring: Automated tests simulating user workflows run continuously in production.

Canary analysis: Metrics from canary instances compared to baseline. Anomalies trigger rollback.

Feature flags: New features deployed behind flags, enabled for small user percentage first.

This catches issues that only manifest under production load or with production data.
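
A minimal sketch of the canary comparison, using error rate as the only metric. The tolerance of one percentage point and the names `error_rate` and `canary_is_healthy` are assumptions for illustration; production canary analysis usually weighs several metrics and accounts for sample size.

```python
def error_rate(errors: int, requests: int) -> float:
    """Errors per request, or 0.0 when there was no traffic."""
    return errors / requests if requests else 0.0


def canary_is_healthy(
    baseline_errors: int,
    baseline_requests: int,
    canary_errors: int,
    canary_requests: int,
    max_increase: float = 0.01,  # assumed tolerance: one percentage point
) -> bool:
    """True when the canary error rate stays within tolerance of baseline."""
    baseline = error_rate(baseline_errors, baseline_requests)
    canary = error_rate(canary_errors, canary_requests)
    return canary <= baseline + max_increase
```

When `canary_is_healthy` returns False, the rollout would stop and the rollback path would be triggered.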

Handling Urgent Security Patches

Security vulnerabilities require fast deployment:

Expedited pipeline: Security patches can bypass certain gates (but not all - tests still run).

Out-of-band deployments: Critical patches can deploy outside normal change windows.

Immediate rollout: Progressive rollout compressed - deployed to all instances rapidly.

Post-deployment review: A retrospective covers what was deployed and why, including cases where approval was recorded after the fact.

We’ve deployed critical security patches from discovery to production in under 2 hours using this process.

Infrastructure as Code

Our infrastructure is code-managed for consistency:

Terraform: Manages cloud infrastructure (Kubernetes clusters, networks, databases).

Helm: Manages Kubernetes resources (deployments, services, config maps).

Version controlled: All infrastructure code in Git.

CI/CD for infrastructure: Infrastructure changes go through a pipeline similar to the one for application code.

Immutable infrastructure: Never modify infrastructure in place. Deploy new version, cut over, delete old.

This ensures infrastructure deployments are repeatable and auditable.

Team Culture Changes

CI/CD requires cultural shifts:

Shared responsibility: Developers own their code in production, not just in development.

Automation over documentation: Automated tests over manual test plans.

Fail fast: Surface issues quickly rather than hiding them.

Blameless postmortems: When issues occur, focus on process improvement not blame.

Continuous improvement: Regular retrospectives on pipeline and process.

The culture change was harder than the technical implementation but essential for success.

Looking Forward

We’re continuing to evolve CI/CD:

Faster pipelines: Current full pipeline takes 20 minutes. Goal is under 10 minutes.

More automation: Reducing remaining manual steps.

Better testing: Expanding test coverage, especially for failure scenarios.

Progressive delivery: More sophisticated canary analysis and automated rollback.

GitOps: Moving toward a GitOps model where Git is the source of truth for desired state.

Key Takeaways

For teams building CI/CD for security software:

  1. Automate security scans in the pipeline; don’t rely only on manual review
  2. Staging must accurately mirror production to catch issues
  3. Build fast rollback capabilities before you need them
  4. Balance automation with security gates - automate the mechanics, gate the approval
  5. Make database migrations backwards compatible to enable rollback
  6. Treat infrastructure as code with same rigor as application code
  7. Monitor the pipeline itself - it’s critical infrastructure
  8. Start with aggressive testing, relax only when confident
  9. Use feature flags for gradual rollout of risky changes
  10. Invest in culture change, not just tooling

CI/CD for security-critical software is possible without compromising security. The key is automating the mechanics while maintaining rigorous standards. Every commit is tested, scanned, and validated before reaching production. Deployments are consistent and repeatable. Rollbacks are fast when needed.

This has transformed our ability to respond to security vulnerabilities and customer needs. We deploy to production daily, not monthly. Security patches reach customers in hours, not weeks. And we do this while maintaining stronger security controls than our previous manual process.

If you’re building security software and think CI/CD is incompatible with security requirements, I encourage you to reconsider. With thoughtful implementation, you can have both speed and security.