This post is inverted: each section is a pattern that looks like DevOps but isn’t, with the concrete dodge this project uses.
Smell 1: cleartext passwords in the README
A common tutorial:
airflow users create \
Readers copy → deploy → public 8080 → crypto miners by morning.
This project’s README deliberately omits --password:
airflow users create \
Airflow prompts interactively — the password never lands in shell history, screenshots, or docs.
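A small docs check in CI is one way to keep this smell from creeping back. The sketch below is not part of the project; the function name `find_cleartext_passwords` and the regular expression are assumptions made for illustration. It flags any line that hands a literal value to `--password`.

```python
import re

# Hypothetical pattern: "--password" followed by "=" or whitespace, then a value.
PASSWORD_FLAG = re.compile(r"--password(?:=|\s+)(\S+)")


def find_cleartext_passwords(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs where --password is given a value."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = PASSWORD_FLAG.search(line)
        # A lone backslash is a shell line continuation, not a password.
        if match and match.group(1) != "\\":
            hits.append((number, line.strip()))
    return hits
```

Run it over README and tutorial files and fail the build when the list is non-empty. It is a blunt heuristic: a value read from an environment variable, or prose that happens to follow the flag with a word, is flagged too, so treat hits as prompts for review.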
Smell 2: 0.0.0.0/0 left on a “temporary” rule
Many projects set source_ranges = ["0.0.0.0/0"] while debugging and forget to roll it back. An Airflow webserver on port 8080 that is open to the internet is effectively a time bomb.
This project makes the CIDR a required variable with no default:
variable "admin_cidr" {
Every apply must supply it explicitly, for example with -var="admin_cidr=...". With no default to fall back on, “forgot to flip it back” is ruled out at the language level.
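A required variable removes the forgotten default, but it still accepts whatever value is passed in, including 0.0.0.0/0. As a complementary guard, a wrapper script or CI step could reject a world-open range before Terraform ever sees it. This is a minimal sketch; the function names `is_open_to_world` and `check_admin_cidr` are hypothetical and not part of the project.

```python
import ipaddress


def is_open_to_world(cidr: str) -> bool:
    """True when the CIDR covers every address (prefix length 0)."""
    network = ipaddress.ip_network(cidr, strict=False)
    return network.prefixlen == 0


def check_admin_cidr(cidr: str) -> str:
    """Return the CIDR unchanged, or raise ValueError if it is world-open."""
    if is_open_to_world(cidr):
        raise ValueError(f"admin_cidr {cidr!r} is open to the whole internet")
    return cidr
```

`ipaddress.ip_network` also raises `ValueError` on malformed input, so a typo in the CIDR fails at the same place.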
Smell 3: terraform apply without a plan
A trendy but dangerous CI pattern: terraform apply -auto-approve. This project’s README keeps plan and apply as separate steps:
terraform plan -var="project_id=$GCP_PROJECT_ID" -var="admin_cidr=$ADMIN_CIDR"
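Combined with prevent_destroy, a plan that would destroy a protected resource fails with an error, and any other unexpected change shows up in the plan output before apply runs. Two layers of safety net beat one.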
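One way to make the separation stricter is to save the plan to a file and apply exactly that file. The sketch below assumes this variant: the `-out` flag and the `tfplan` file name do not appear in the README excerpt above, and the helper names are hypothetical.

```python
import subprocess


def plan_command(variables: dict[str, str], plan_file: str = "tfplan") -> list[str]:
    """Build a `terraform plan` argv that writes the plan to a file."""
    args = ["terraform", "plan", f"-out={plan_file}"]
    for name, value in variables.items():
        args.append(f"-var={name}={value}")
    return args


def apply_command(plan_file: str = "tfplan") -> list[str]:
    """Build a `terraform apply` argv that applies the saved plan only."""
    return ["terraform", "apply", plan_file]


def plan_then_apply(variables: dict[str, str]) -> None:
    """Run plan, pause for a human review, then apply the saved plan."""
    subprocess.run(plan_command(variables), check=True)
    input("Review the plan above, then press Enter to apply (Ctrl+C aborts): ")
    subprocess.run(apply_command(), check=True)
```

Applying a saved plan file runs what was reviewed and does not ask for confirmation again, which is why the pause sits between the two steps.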
Smell 4: shipping the SA key to the VM
A common workflow: generate key.json locally → scp to the VM → set GOOGLE_APPLICATION_CREDENTIALS.
Problems:
- A key sits on disk for the VM’s lifetime; rotation is painful
- VM compromise → key leaks → whole project blast radius
This project uses a VM-attached service account (main.tf):
resource "google_compute_instance" "airflow_vm" {
The VM fetches short-lived access tokens from the metadata server, so there is no key file to leak and no long-lived secret to rotate.
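To make the metadata-server flow concrete, here is a minimal sketch of the token request using only the standard library. Google's client libraries normally do this behind the scenes; the helper names are hypothetical, and `fetch_token` only works on a Compute Engine VM that has a service account attached.

```python
import json
import urllib.request

# Metadata endpoint that serves a token for the VM's attached service account.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    """Build the request; the metadata server requires this header."""
    return urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})


def parse_token(body: bytes) -> tuple[str, int]:
    """Extract (access_token, expires_in_seconds) from the JSON response."""
    payload = json.loads(body)
    return payload["access_token"], int(payload["expires_in"])


def fetch_token(timeout: float = 2.0) -> tuple[str, int]:
    """Fetch a token from the metadata server (GCE VMs only)."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as response:
        return parse_token(response.read())
```

The token expires on its own after `expires_in` seconds and the next call returns a fresh one, so nothing secret has to be written to disk.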
Smell 5: “validation” means terraform validate
On its own, terraform validate only checks syntax and internal consistency. This project’s CI runs three checks per stack:
- run: terraform -chdir=nyc_taxi_pipeline/terraform fmt -check
fmt -check catches formatting drift, init -backend=false pulls providers, validate checks resource schema. Three steps, under 30 seconds, and 80% of PR-level mistakes die there.
Python side:
- run: ruff check nyc_taxi_pipeline/airflow
dbt side:
- run: dbt parse --no-version-check
Each layer gets the cheapest possible static check — far more realistic than a big e2e suite.
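The same cheap checks can be run locally before pushing. The runner below is a sketch, not the project's CI: the `fmt`, `ruff`, and `dbt` commands are the ones quoted above, while reusing the same `-chdir` path for `init` and `validate` is an assumption, as are the names `CHECKS` and `run_checks`.

```python
import subprocess
import sys
import time

# fmt is quoted from the CI excerpt; the -chdir path on init and validate
# is assumed to match it.
TF = ["terraform", "-chdir=nyc_taxi_pipeline/terraform"]
CHECKS = [
    TF + ["fmt", "-check"],
    TF + ["init", "-backend=false"],
    TF + ["validate"],
    ["ruff", "check", "nyc_taxi_pipeline/airflow"],
    ["dbt", "parse", "--no-version-check"],
]


def run_checks(commands: list[list[str]]) -> list[tuple[str, bool, float]]:
    """Run commands in order, time each one, and stop at the first failure."""
    results = []
    for argv in commands:
        name = " ".join(argv)
        started = time.monotonic()
        ok = subprocess.run(argv).returncode == 0
        results.append((name, ok, time.monotonic() - started))
        if not ok:
            print(f"FAIL: {name}", file=sys.stderr)
            break
    return results
```

Calling `run_checks(CHECKS)` from the repository root returns one `(command, passed, seconds)` tuple per check that ran, which also makes it easy to see whether each step stays inside its time budget.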
Smell 6: terraform.tfstate committed to Git
Search GitHub and you’ll find plenty of public repos with terraform.tfstate in them. tfstate contains resource IDs, public IPs, SA emails — a recon goldmine.
This project’s .gitignore excludes them up front:
*.tfstate
And mandates remote state:
backend "gcs" { prefix = "tfstate/cabstream" }
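A .gitignore entry does nothing for a state file that was committed before the rule existed, so a check over the tracked files is a useful companion. This is a sketch under assumptions: the `*.tfstate` pattern comes from the project's .gitignore, while the `*.tfstate.*` pattern (for backup files) and the function name `tracked_state_files` are additions for illustration.

```python
import fnmatch

# "*.tfstate" is from the .gitignore above; "*.tfstate.*" is an assumed
# extra pattern covering backup files.
STATE_PATTERNS = ("*.tfstate", "*.tfstate.*")


def tracked_state_files(paths: list[str]) -> list[str]:
    """Return the paths whose file name looks like Terraform state."""
    hits = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in STATE_PATTERNS):
            hits.append(path)
    return hits
```

Feed it the lines printed by `git ls-files` and fail the build if the result is non-empty.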
Checklist
Before calling a new project “DevOps-done”:
- No cleartext passwords or tokens in README/tutorials
- Open-port CIDR is a required variable with no default
- terraform plan and apply are separate steps
- VMs use attached SAs, no key files
- CI runs fmt / validate / lint / unit tests, each under 30s
- tfstate lives in a remote backend; local versions are gitignored
Six checks ticked, and you’re past 80% of projects that “look DevOps”.
Files: .github/workflows/ci.yml, nyc_taxi_pipeline/terraform/variables.tf