Summary
TL;DR: Automates deployments, manages infrastructure, and builds CI/CD pipelines so you can ship faster with fewer manual steps.
DevOps is an OpenClaw skill that automates deployments, manages infrastructure, and builds reliable CI/CD pipelines.
Created by Iván, this skill has been downloaded 3k+ times on ClawHub. Install it with one command and your AI agent gains these capabilities right away.
Use cases
- Setting up automatic deployments triggered by merges to the main branch
- Writing a Dockerfile and docker-compose config for a Node.js app with a Postgres database
- Creating environment-specific configuration files for staging and production deployments
- Building a rollback strategy for a blue-green deployment on a cloud platform
Installation
Run this command to install the skill on your OpenClaw agent:
npx clawhub@latest install devops
Security scan
This is an instruction-only skill that provides DevOps best-practice guidance; it requests no credentials, installs nothing, and its content aligns with its stated purpose.
SKILL.md
---
name: DevOps
description: Automate deployments, manage infrastructure, and build reliable CI/CD pipelines.
metadata: {"clawdbot":{"emoji":"🔧","os":["linux","darwin","win32"]}}
---
# DevOps Rules
## CI/CD Pipelines
- Fail fast: run linting and unit tests before expensive integration tests — saves time and compute
- Cache dependencies between runs — `npm install` on every build wastes minutes
- Pin action versions with SHA, not tags — `actions/checkout@v3` can change, SHA is immutable
- Secrets in environment variables, never in code or logs — mask them in CI output
- Parallel jobs for independent steps — test, lint, and build can run simultaneously
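The SHA-pinning rule above can be checked mechanically. The following is a minimal sketch, assuming workflow files use the usual `uses: owner/repo@ref` form; the function name `unpinned_actions` and the regular expressions are illustrative, not part of any official tool, and the 40-character SHA in the usage example is a placeholder.

```python
import re

# A full git commit SHA is 40 lowercase hexadecimal characters.
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")
# Matches "uses: ref" with an optional leading list dash (assumed workflow layout).
USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `uses:` reference that is not pinned to a full commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES_LINE.match(line)
        if not match:
            continue
        ref = match.group(1).strip("'\"")
        # Local actions and docker:// references carry no git ref to pin.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA.match(version):
            unpinned.append(ref)
    return unpinned


if __name__ == "__main__":
    example = """
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@0123456789abcdef0123456789abcdef01234567
    """
    print(unpinned_actions(example))
```

A check like this could run as one of the early fail-fast jobs, so a tag-pinned action is caught before the expensive stages start.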
## Deployment Strategies
- Blue-green: run new version alongside old, switch traffic atomically — instant rollback by switching back
- Canary: route percentage of traffic to new version — catch issues before full rollout
- Rolling: update instances incrementally — balance between speed and risk
- Always have rollback plan before deploying — know exactly how to revert
- Deploy the same artifact to all environments — build once, promote through stages
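To illustrate the canary rule above, here is a minimal sketch of percentage-based routing. It assumes traffic is split per user by hashing a stable identifier; in practice a load balancer or service mesh usually does the split, and the function name `routes_to_canary` is hypothetical.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically decide whether a user is served by the canary version."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the identifier so the same user always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket is in 0..99
    return bucket < canary_percent
```

Because the bucket depends only on the identifier, raising the percentage keeps existing canary users on the canary, and setting it to 0 acts as an immediate rollback.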
## Infrastructure as Code
- Version control all infrastructure — terraform, ansible, cloudformation in git
- Never apply changes without plan/diff review — `terraform plan` before `apply`
- State files contain secrets — store remotely with encryption, never in git
- Modules for reusable components — don't copy-paste infrastructure definitions
- Separate environments with workspaces or directories — dev changes shouldn't affect prod
## Containers
- One process per container — containers are not VMs
- Health checks are mandatory — orchestrators need them for routing and restarts
- Don't run as root — use non-root USER in Dockerfile
- Immutable images: config via environment, not baked in — same image in all environments
- Tag images with git SHA, not just `latest` — know exactly what's deployed
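The tagging rule above can be enforced with a small helper. This is a sketch under assumptions: the repository name is a made-up example, the 12-character short form is an arbitrary choice, and `image_tag` is a hypothetical function rather than part of Docker or any registry client.

```python
import re


def image_tag(repository: str, git_sha: str, short: int = 12) -> str:
    """Build an image reference tagged with a git commit SHA instead of `latest`."""
    # Reject anything that is not a full 40-character commit SHA.
    if not re.fullmatch(r"[0-9a-f]{40}", git_sha):
        raise ValueError("git_sha must be a full 40-character commit SHA")
    return f"{repository}:{git_sha[:short]}"
```

Rejecting non-SHA input means a build script cannot silently fall back to a mutable tag.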
## Secrets Management
- Never store secrets in environment files committed to git — use vault, sealed secrets, or CI secret storage
- Rotate secrets regularly — automation makes rotation painless
- Different secrets per environment — dev leak shouldn't compromise prod
- Audit secret access — know who accessed what and when
- Secrets in memory, not disk when possible — temp files persist longer than expected
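A minimal sketch of how an application might follow these rules: the secret is read from the process environment into memory rather than from a file, and it is masked before any text reaches a log. The helper names `require_secret` and `redact` and the variable name `DEMO_TOKEN` are assumptions made for illustration.

```python
import os


def require_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly without echoing values."""
    value = os.environ.get(name)
    if not value:
        # The error names the variable only, never its contents.
        raise RuntimeError(f"missing required secret: {name}")
    return value


def redact(text: str, secrets: list[str]) -> str:
    """Replace every known secret value in `text` with a fixed mask."""
    for secret in secrets:
        if secret:  # skip empty strings so the text is not corrupted
            text = text.replace(secret, "***")
    return text
```

Passing every log line through `redact` is a last line of defense; it does not replace the masking offered by the CI system's own secret storage.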
## Monitoring & Alerting
- Four golden signals: latency, traffic, errors, saturation — start here
- Alert on symptoms, not causes — "users seeing errors" not "CPU high"
- Every alert must be actionable — if you can't do anything, it's noise
- Dashboard per service with key metrics — one glance shows health
- Structured logs (JSON) for machine parsing — grep works, but queries are better
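The structured-logging rule above could look like this with Python's standard `logging` module. The field names (`timestamp`, `level`, `logger`, `message`) are an assumed schema, not a standard; pick whatever your log pipeline expects.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback as a string so it stays queryable.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With one JSON object per line, a log backend can filter on `level` or `logger` directly instead of relying on text matching.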
## Reliability
- Define SLOs before building alerting — what does "healthy" mean for this service?
- Error budgets: some failures are acceptable; a 99.9% availability SLO allows about 8.76 hours of downtime per year
- Chaos engineering in staging — break things intentionally before prod breaks accidentally
- Runbooks for common incidents — 3am is not the time to figure out recovery steps
- Post-mortems without blame — focus on systems, not people
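The error-budget arithmetic can be made explicit. This sketch assumes the SLO is an availability percentage measured over a fixed window, with a 365-day year as the default; `allowed_downtime_hours` is a hypothetical helper name.

```python
def allowed_downtime_hours(slo_percent: float, period_hours: float = 365 * 24) -> float:
    """Return the downtime budget, in hours, implied by an availability SLO."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    # The budget is the fraction of the period that may be unavailable.
    return period_hours * (1 - slo_percent / 100)
```

For a 99.9% target this gives roughly 8.76 hours per year, and for 99.99% roughly 53 minutes, which shows how quickly each extra nine shrinks the budget.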
## Common Mistakes
- SSH into prod to fix things — all changes through automation, or you'll forget what you did
- No staging environment — "works on my machine" doesn't mean works in prod
- Ignoring flaky tests — they erode trust in CI, either fix or delete
- Manual steps in deployment — if it's not automated, it'll be done wrong eventually
- Monitoring only happy paths — check error rates and edge cases too
## Networking
- Internal services don't need public IPs — use private subnets, expose only load balancers
- TLS everywhere, including internal traffic — zero trust, even behind firewall
- DNS for service discovery — hardcoded IPs break when things move
- Load balancer health checks separate from app health — LB needs fast response, app health can be thorough
- Firewall default deny — explicitly allow what's needed, block everything else
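The default-deny rule above amounts to "reject unless an explicit allow rule matches". The sketch below models that logic with Python's `ipaddress` module; the `AllowRule` shape (source CIDR plus port) is a simplification assumed for illustration, and real firewalls also consider protocol, direction, and connection state.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowRule:
    source_cidr: str  # for example "10.0.0.0/16"
    port: int


def is_allowed(source_ip: str, port: int, rules: list[AllowRule]) -> bool:
    """Allow traffic only when an explicit rule matches; deny everything else."""
    address = ip_address(source_ip)
    for rule in rules:
        if port == rule.port and address in ip_network(rule.source_cidr):
            return True
    return False  # default deny: no matching rule means the traffic is blocked
```

In this model an empty rule list blocks all traffic, which is the safe starting point the rule describes.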
Version history
Initial release
Frequently asked questions
Yes. It can help configure pipelines that detect which packages changed and only build and deploy the affected services.
Installation method
Send this prompt to your agent to install the skill
npx clawhub@latest install devops
Skill data sourced from ClawHub