DevOps transformed how we build and deploy software. Fast iterations, continuous deployment, infrastructure as code—development velocity has never been higher.
Then security teams show up and everything grinds to a halt.
I’ve been on both sides of this divide. As a security engineer embedded in product teams, I’ve seen the friction firsthand. Security is often seen as the “no” department, the group that slows down releases with last-minute reviews and compliance checklists.
But it doesn’t have to be this way.
Over the past year, we’ve integrated security into our DevOps culture. We call it DevSecOps, though I don’t love the buzzword. What matters is the outcome: security that enables development velocity rather than hindering it.
Here’s how we did it.
The Traditional Security Problem
Traditional security operates in phases:
```
Develop → Test → Security Review → Deploy
                        ↑
                       GATE
```
Security is a gate at the end. If the security team finds issues, the release is blocked. Developers scramble to fix vulnerabilities they introduced weeks ago. Everyone is frustrated.
This model worked when releases happened quarterly. It doesn’t work when you deploy multiple times per day.
The problems:
- Late feedback: Security issues found right before deployment
- Context switching: Developers have moved on to new features
- Adversarial relationship: Security blocks releases, developers resent security
- Manual processes: Security reviews don’t scale with deployment frequency
- Knowledge silos: Security knowledge stays with security team
The DevSecOps Mindset
DevSecOps shifts security left—integrating it into the development process from the beginning:
```
Design  →  Develop  →  Test  →  Deploy
   ↓          ↓          ↓         ↓
Security   Security   Security  Security
```
Security is continuous, automated, and embedded in every phase.
The principles:
- Security as code: Automate security controls
- Shift left: Find issues early when they’re cheap to fix
- Shared responsibility: Developers own security of their code
- Fail fast: Block insecure code in CI/CD, not at deployment
- Measure everything: Security metrics drive improvement
This requires cultural change, not just tools.
Cultural Change: Making Security Everyone’s Job
The biggest challenge isn’t technical—it’s cultural.
Embed Security Engineers in Product Teams
We embedded security engineers directly into product teams. They attend standups, sprint planning, and retrospectives. They’re part of the team, not external reviewers.
This changes the dynamic:
- Security engineers understand product context
- Developers get immediate security guidance
- Trust builds between security and development
- Security trade-offs are discussed, not mandated
Our security team is still 5 engineers, but they went from one centralized group to being embedded across product teams.
Security Champions Program
Not every team can have a dedicated security engineer. We created a security champions program—developers with security interest who receive extra training and act as security liaisons.
Security champions:
- Review pull requests for security issues
- Bring security concerns to team discussions
- Stay updated on security best practices
- Escalate complex issues to security team
We have one champion per 5-10 developers. They meet monthly with the security team to share knowledge and discuss emerging threats.
Blameless Post-Mortems
When security incidents happen, we conduct blameless post-mortems. The goal is learning, not blame.
Post-mortem template:
```markdown
## Incident Summary
Brief description of what happened

## Timeline
- 14:23: Event A occurred
- 14:45: Team noticed B
- 15:10: Fix deployed

## Root Cause
What allowed this to happen?

## What Went Well
Things that worked during incident response

## What Went Wrong
Things that didn't work

## Action Items
- [ ] Fix immediate issue (Owner: Alice)
- [ ] Prevent recurrence (Owner: Bob)
- [ ] Improve detection (Owner: Charlie)
```
This creates a learning culture. Developers aren’t afraid to report security issues because they won’t be blamed.
Automated Security Testing
Manual security reviews don’t scale. Automate everything possible.
Static Application Security Testing (SAST)
Analyze code for vulnerabilities before it runs:
```groovy
// Jenkins pipeline
stage('Security Scan') {
    steps {
        // Run static analysis; -no-fail defers pass/fail to the parsing step below
        sh 'go vet ./...'
        sh 'gosec -no-fail -fmt=json -out=gosec-report.json ./...'

        // Parse results and fail the build on any finding
        script {
            def report = readJSON file: 'gosec-report.json'
            if (report.Stats.found > 0) {
                error("Security vulnerabilities found: ${report.Stats.found}")
            }
        }
    }
}
```
We run multiple SAST tools:
- Language-specific: `gosec` for Go, `bandit` for Python, `brakeman` for Ruby
- General purpose: SonarQube for broad coverage
- Secrets detection: git-secrets, truffleHog
All integrated into CI/CD. Vulnerabilities block merges automatically.
Dynamic Application Security Testing (DAST)
Test running applications for vulnerabilities:
```yaml
# GitLab CI configuration
security:
  stage: test
  image: owasp/zap2docker-stable
  script:
    # The job already runs in the ZAP image, so invoke the scan directly
    - zap-baseline.py -t https://staging.myapp.com -J zap-report.json
  artifacts:
    reports:
      dast: zap-report.json
```
DAST catches issues SAST misses:
- Authentication/authorization bugs
- Server misconfigurations
- Exposed endpoints
- Injection vulnerabilities
We run DAST against staging environments before production deployment.
Dependency Scanning
Third-party dependencies are a major vulnerability source:
```bash
# Scan for known vulnerabilities
npm audit --json > npm-audit.json
safety check --json > safety-report.json

# Fail build if critical vulnerabilities found
# (npm puts severity counts under .metadata.vulnerabilities)
if [ "$(jq '.metadata.vulnerabilities.critical' npm-audit.json)" -gt 0 ]; then
  echo "Critical vulnerabilities found"
  exit 1
fi
```
Automated dependency updates:
```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "daily"
    open-pull-requests-limit: 10
    reviewers:
      - "security-team"
```
Dependabot creates PRs for vulnerable dependencies. Security champions review and merge.
Container Scanning
Every container image scanned before deployment:
```groovy
stage('Container Security') {
    steps {
        sh 'docker build -t myapp:${BUILD_TAG} .'

        // Scan the image; fail on High severity or above
        sh '''
            clair-scanner \
                --ip ${LOCAL_IP} \
                --threshold High \
                myapp:${BUILD_TAG}
        '''

        // Push the image only if the scan passed
        sh 'docker push myapp:${BUILD_TAG}'
    }
}
```
Images with high-severity vulnerabilities are automatically rejected.
Infrastructure Security as Code
Security policies defined as code, versioned, and tested like application code.
Policy as Code with Open Policy Agent
Define security policies declaratively:
```rego
# Kubernetes admission policy
package kubernetes.admission

deny[msg] {
    input.request.kind.kind == "Pod"
    input.request.object.spec.containers[_].securityContext.privileged
    msg = "Privileged containers are not allowed"
}

deny[msg] {
    input.request.kind.kind == "Pod"
    not input.request.object.spec.securityContext.runAsNonRoot
    msg = "Containers must run as non-root"
}

deny[msg] {
    input.request.kind.kind == "Pod"
    # Iterate containers explicitly; a wildcard under negation is unsafe in Rego
    container := input.request.object.spec.containers[_]
    not container.resources.limits.memory
    msg = "Memory limits must be set"
}
```
Policies are code:
- Stored in git
- Code reviewed
- Tested with unit tests
- Versioned and auditable
Test policies before deployment. OPA ships its own test runner (`opa test`), so the tests are Rego too:

```rego
# kubernetes_admission_test.rego -- run with `opa test .`
package kubernetes.admission

test_privileged_container_denied {
    deny["Privileged containers are not allowed"] with input as {
        "request": {
            "kind": {"kind": "Pod"},
            "object": {"spec": {"containers": [{
                "name": "test",
                "securityContext": {"privileged": true}
            }]}}
        }
    }
}
```
Infrastructure as Code Security
Scan Terraform/CloudFormation templates for misconfigurations:
```bash
# Scan Terraform with tfsec
tfsec .
```

Example output:

```
Problem 1

  [AWS001][WARNING] Resource 'aws_s3_bucket.data' has an ACL which allows public access.
  See https://tfsec.dev/docs/aws/AWS001/

  /infrastructure/s3.tf:12-16

      12 | resource "aws_s3_bucket" "data" {
      13 |   bucket = "my-data-bucket"
      14 |   acl    = "public-read"
      15 |   ...
      16 | }
```
Integrated into CI/CD:
```yaml
# GitHub Actions
name: Infrastructure Security
on: [pull_request]

jobs:
  terraform-security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run tfsec
        uses: tfsec/tfsec-action@v1
        with:
          soft_fail: false
```
Insecure infrastructure code is blocked before merge.
Secret Management
Never commit secrets. Use secret management tools:
```go
// Application code
import (
	"fmt"

	"github.com/hashicorp/vault/api"
)

type SecretStore struct {
	client *api.Client
}

func (s *SecretStore) GetDatabaseCredentials() (string, string, error) {
	secret, err := s.client.Logical().Read("database/creds/myapp")
	if err != nil {
		return "", "", err
	}
	if secret == nil {
		return "", "", fmt.Errorf("no credentials at database/creds/myapp")
	}

	// Guard the type assertions so a malformed secret can't panic
	username, ok := secret.Data["username"].(string)
	if !ok {
		return "", "", fmt.Errorf("username missing from secret")
	}
	password, ok := secret.Data["password"].(string)
	if !ok {
		return "", "", fmt.Errorf("password missing from secret")
	}
	return username, password, nil
}
```
Secrets are:
- Never in code or configuration files
- Dynamically generated and short-lived
- Audited (who accessed what secret when)
- Automatically rotated
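Because dynamic credentials expire, the application has to renew its lease before the TTL runs out. Here's a minimal sketch, assuming the `SecretStore` above (plus `log` and `time` imports); in practice Vault's `api.LifetimeWatcher` helper handles this for you:

```go
// RenewLoop keeps a dynamic credential's lease alive by renewing it at
// two-thirds of its TTL. Minimal sketch only; production code would use
// Vault's api.LifetimeWatcher instead of hand-rolling the loop.
func (s *SecretStore) RenewLoop(secret *api.Secret, stop <-chan struct{}) {
	for {
		wait := time.Duration(secret.LeaseDuration) * time.Second * 2 / 3
		select {
		case <-stop:
			return
		case <-time.After(wait):
			renewed, err := s.client.Sys().Renew(secret.LeaseID, 0)
			if err != nil {
				// Lease expired or was revoked: caller should fetch fresh credentials
				log.Printf("lease renewal failed: %v", err)
				return
			}
			secret = renewed
		}
	}
}
```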
Pre-commit hooks prevent accidental commits:
```bash
#!/bin/bash
# .git/hooks/pre-commit
# Check staged files for high-entropy strings (potential secrets)

status=0
# Process substitution keeps the loop in the current shell,
# so setting status here actually affects the exit code
# (a pipe into `while` would run in a subshell and lose it).
while read -r file; do
    if [ -f "$file" ] && \
       grep -E '(password|secret|key|token).*=.*[A-Za-z0-9]{20,}' "$file"; then
        echo "Potential secret detected in $file"
        status=1
    fi
done < <(git diff --cached --name-only)

exit $status
```
Security in the Development Lifecycle
Security integrated at every stage.
Threat Modeling During Design
Before writing code, identify threats:
```text
Feature: User authentication

Assets:
- User credentials
- Session tokens
- Personal information

Threats (STRIDE):
- Spoofing: Attacker impersonates user
- Tampering: Session token modified
- Repudiation: User denies action
- Information disclosure: Credentials leaked
- Denial of service: Account lockout
- Elevation of privilege: Normal user gains admin access

Mitigations:
- Spoofing: Multi-factor authentication
- Tampering: Sign session tokens with HMAC
- Repudiation: Audit log of authenticated actions
- Information disclosure: Hash passwords with bcrypt
- DoS: Rate limiting on login attempts
- Elevation: Role-based access control
```
Security engineers facilitate threat modeling sessions with product teams. Identified risks are addressed in the design phase, not discovered during security review.
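To make one mitigation concrete, here's a sketch of per-client rate limiting on login attempts using `golang.org/x/time/rate`. The middleware shape and the 5-attempt burst are illustrative choices, not our production values:

```go
package main

import (
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

// loginLimiter hands out one token bucket per client: 5 attempts up
// front, then 1 per second. The map grows without bound here; real
// code would evict idle entries and key on the actual client IP
// rather than RemoteAddr (which includes the port).
type loginLimiter struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
}

func newLoginLimiter() *loginLimiter {
	return &loginLimiter{limiters: make(map[string]*rate.Limiter)}
}

func (l *loginLimiter) allow(key string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	lim, ok := l.limiters[key]
	if !ok {
		lim = rate.NewLimiter(rate.Limit(1), 5)
		l.limiters[key] = lim
	}
	return lim.Allow()
}

func (l *loginLimiter) middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !l.allow(r.RemoteAddr) {
			http.Error(w, "too many login attempts", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```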
Secure Coding Guidelines
Documented guidelines for common security patterns:
# Secure Coding Guidelines
## Input Validation
Always validate and sanitize user input:
```go
// Bad: string concatenation invites SQL injection
func getUser(userID string) (*User, error) {
	query := fmt.Sprintf("SELECT * FROM users WHERE id = '%s'", userID)
	return db.Query(query)
}

// Good: parameterized queries keep user data out of the SQL text
func getUser(userID string) (*User, error) {
	query := "SELECT * FROM users WHERE id = ?"
	return db.Query(query, userID)
}
```

## Authentication

Use established libraries; don't roll your own crypto:

```go
// Bad: MD5 is fast, unsalted, and broken for password storage
func hashPassword(password string) string {
	sum := md5.Sum([]byte(password))
	return hex.EncodeToString(sum[:])
}

// Good: bcrypt is deliberately slow and salts automatically
import "golang.org/x/crypto/bcrypt"

func hashPassword(password string) (string, error) {
	hash, err := bcrypt.GenerateFromPassword([]byte(password), bcrypt.DefaultCost)
	return string(hash), err
}
```
Guidelines are living documents. Updated based on incidents and new vulnerabilities.
Secure Code Review
Security checks in every code review:
```markdown
# Code Review Checklist

Security:
- [ ] Input validation on all user input
- [ ] Parameterized queries (no SQL injection)
- [ ] Proper authentication/authorization checks
- [ ] No secrets in code
- [ ] Error messages don't leak sensitive information
- [ ] Logging doesn't include PII or credentials
```
Security champions lead security reviews; complex changes are escalated to the security team.
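The "error messages don't leak" item trips people up most often. A hedged illustration (the handler names are made up for the example):

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// Bad: internal details (driver errors, table names) go straight to the client.
func handleLoginError(w http.ResponseWriter, err error) {
	http.Error(w, fmt.Sprintf("db error: %v", err), http.StatusInternalServerError)
}

// Good: full detail stays in server-side logs; the client sees a generic message.
func handleLoginErrorSafe(w http.ResponseWriter, err error) {
	log.Printf("login failed: %v", err)
	http.Error(w, "authentication failed", http.StatusUnauthorized)
}
```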
Automated Security Gates
CI/CD pipeline enforces security:
```groovy
pipeline {
    agent any

    stages {
        stage('Code Quality') {
            steps {
                sh 'golangci-lint run'
            }
        }
        stage('Security Scan') {
            steps {
                sh 'gosec ./...'
                sh 'safety check'
                sh 'npm audit'
            }
        }
        stage('Build') {
            steps {
                sh 'docker build -t myapp:${BUILD_TAG} .'
            }
        }
        stage('Container Scan') {
            steps {
                sh 'clair-scanner myapp:${BUILD_TAG}'
            }
        }
        stage('Deploy to Staging') {
            steps {
                sh 'kubectl apply -f k8s/staging/'
            }
        }
        stage('DAST') {
            steps {
                sh 'zap-baseline.py -t https://staging.myapp.com'
            }
        }
        stage('Deploy to Production') {
            when {
                branch 'main'
            }
            steps {
                sh 'kubectl apply -f k8s/production/'
            }
        }
    }

    post {
        failure {
            slackSend(
                color: 'danger',
                message: "Security checks failed: ${env.JOB_NAME} ${env.BUILD_NUMBER}"
            )
        }
    }
}
```
Security failures stop deployment automatically. No manual gates required.
Measuring Security
You can’t improve what you don’t measure.
Security Metrics
Track key metrics:
```go
type SecurityMetrics struct {
	VulnerabilitiesFound int
	VulnerabilitiesFixed int
	MeanTimeToFix        time.Duration
	SecurityTestCoverage float64
	FailedSecurityBuilds int
	IncidentsDetected    int
	IncidentResponseTime time.Duration
}

func (m *SecurityMetrics) Record() {
	metrics.Gauge("security.vulnerabilities.found", m.VulnerabilitiesFound)
	metrics.Gauge("security.vulnerabilities.fixed", m.VulnerabilitiesFixed)
	metrics.Histogram("security.time_to_fix", m.MeanTimeToFix.Seconds())
	metrics.Gauge("security.test_coverage", m.SecurityTestCoverage)
}
```
Dashboard visualizes trends:
- Vulnerability count over time (should trend down)
- Time to fix vulnerabilities (should trend down)
- Security test coverage (should trend up)
- Failed builds due to security (context-dependent)
Security SLOs
Define service level objectives for security:
```yaml
security_slos:
  - metric: critical_vulnerabilities
    slo: 0
    description: "Zero critical vulnerabilities in production"

  - metric: high_vulnerabilities
    slo: "< 5"
    description: "Fewer than 5 high-severity vulnerabilities in production"

  - metric: time_to_fix_critical
    slo: "< 24h"
    description: "Critical vulnerabilities fixed within 24 hours"

  - metric: time_to_fix_high
    slo: "< 7d"
    description: "High vulnerabilities fixed within 7 days"

  - metric: security_test_coverage
    slo: "> 80%"
    description: "80% of code covered by security tests"
```
SLOs drive prioritization. Missing an SLO triggers incident response.
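A sketch of how the check might hang together. The thresholds mirror the YAML above, but the snapshot struct is an illustrative assumption; in our setup the numbers come from the metrics pipeline:

```go
package security

import "time"

// sloSnapshot is an illustrative view of current production numbers.
type sloSnapshot struct {
	CriticalVulns  int
	HighVulns      int
	OldestCritical time.Duration // age of the oldest unfixed critical
	OldestHigh     time.Duration
	TestCoverage   float64 // 0.0 - 1.0
}

// violations returns the SLOs currently being missed; a non-empty
// result is what triggers the incident process.
func (s sloSnapshot) violations() []string {
	var v []string
	if s.CriticalVulns > 0 {
		v = append(v, "critical vulnerabilities in production")
	}
	if s.HighVulns >= 5 {
		v = append(v, "5+ high-severity vulnerabilities in production")
	}
	if s.OldestCritical > 24*time.Hour {
		v = append(v, "critical vulnerability open for more than 24h")
	}
	if s.OldestHigh > 7*24*time.Hour {
		v = append(v, "high vulnerability open for more than 7 days")
	}
	if s.TestCoverage < 0.80 {
		v = append(v, "security test coverage below 80%")
	}
	return v
}
```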
Real-World Challenges
Challenge 1: Security vs. Speed
Problem: Developers complained security checks slow down CI/CD.
Solution:
- Optimized security scans (parallel execution, caching)
- Moved slow tests to nightly builds
- Focused on high-signal checks in PR pipeline
Result: CI/CD time went from 45 minutes to 12 minutes, with better security coverage.
Challenge 2: Alert Fatigue
Problem: Too many false positives from security tools. Developers ignored alerts.
Solution:
- Tuned security tools to reduce noise
- Prioritized findings (only critical/high block builds; see the sketch below)
- Regular review of suppressed findings
Result: Alert volume down 70%, but fix rate up 200%.
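As an example of that severity gating, here's a small Go helper run in CI that fails the build only on high-severity gosec findings. It assumes the JSON shape that `gosec -fmt=json` emits (an `Issues` array with a per-finding `severity` field); treat it as a sketch, not our exact tooling:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// gosecReport models just the fields we need from the gosec JSON report.
type gosecReport struct {
	Issues []struct {
		Severity string `json:"severity"`
		Details  string `json:"details"`
		File     string `json:"file"`
	} `json:"Issues"`
}

func main() {
	data, err := os.ReadFile("gosec-report.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	var report gosecReport
	if err := json.Unmarshal(data, &report); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}

	high := 0
	for _, issue := range report.Issues {
		if issue.Severity == "HIGH" {
			fmt.Printf("HIGH: %s (%s)\n", issue.Details, issue.File)
			high++
		}
		// MEDIUM/LOW findings are surfaced in dashboards, not build-blocking
	}
	if high > 0 {
		os.Exit(1) // block the merge
	}
}
```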
Challenge 3: Security Knowledge Gap
Problem: Developers lacked security expertise to fix vulnerabilities.
Solution:
- Security training for all engineers (quarterly)
- Security champions program
- Pairing sessions with security team
- Documented secure coding patterns
Result: Developer-fixed vulnerabilities increased from 30% to 80%.
Challenge 4: Legacy Code
Problem: Legacy systems didn’t fit new security standards.
Solution:
- Exemption process for legacy systems
- Compensating controls (extra monitoring, network segmentation)
- Gradual remediation roadmap
Result: New code meets security standards. Legacy code has mitigation plans and improved monitoring.
Tools We Use
SAST:
- Gosec (Go)
- Bandit (Python)
- Brakeman (Ruby)
- SonarQube (multi-language)
DAST:
- OWASP ZAP
- Burp Suite
Dependency Scanning:
- npm audit
- safety (Python)
- bundler-audit (Ruby)
- Snyk
Container Scanning:
- Clair
- Trivy
- Anchore
Policy as Code:
- Open Policy Agent
- HashiCorp Sentinel
Secret Management:
- HashiCorp Vault
- AWS Secrets Manager
Infrastructure Scanning:
- tfsec (Terraform)
- Checkov (multi-IaC)
Implementation Roadmap
Start small, iterate, scale.
Phase 1: Foundation (Month 1-2)
- Set up basic SAST in CI/CD
- Implement dependency scanning
- Create security champion program
- Document secure coding guidelines
Phase 2: Automation (Month 3-4)
- Add container scanning
- Implement policy as code
- Set up secret management
- Create security metrics dashboard
Phase 3: Culture (Month 5-6)
- Embed security engineers in teams
- Conduct security training
- Run threat modeling workshops
- Establish blameless post-mortems
Phase 4: Optimization (Month 7+)
- Tune tools to reduce false positives
- Expand DAST coverage
- Implement security SLOs
- Continuous improvement based on metrics
Practical Recommendations
- Start with quick wins: Dependency scanning and secret detection are easy to implement and high value.
- Automate ruthlessly: Manual security reviews don't scale. Automate everything possible.
- Fail fast: Catch issues in development, not production.
- Measure and improve: Track metrics, set SLOs, iterate.
- Build security culture: Make security everyone's responsibility, not just the security team's.
- Provide self-service: Give developers tools to answer their own security questions.
- Celebrate security wins: Recognize teams that improve security metrics.
- Keep it simple: Complex security processes will be bypassed. Simple, automated security is followed.
Conclusion
DevSecOps isn’t about adding a security team to DevOps. It’s about fundamentally changing how security integrates with development.
The key shifts:
- Security as enabler, not blocker
- Automation over manual review
- Early detection over late-stage gates
- Shared responsibility over security silos
- Continuous improvement over compliance checkbox
This requires cultural change, not just tooling. Invest in people, training, and collaboration.
Security doesn’t have to slow down development. With the right culture and automation, it can actually accelerate it by reducing production incidents and building trust with customers.
In my next post, I'll dive into observability in distributed systems: how to understand what's happening when a single request spans multiple services and clouds.
Security and speed aren’t opposites. Done right, they’re complementary.