Config, secrets & security
The fastest way to turn a working service into an incident is a hardcoded API key in a
public repo or an unprotected /predict endpoint. This chapter covers the
unglamorous-but-essential basics: load configuration the right way, keep secrets out
of your code, and put a lock on your API. None of it is hard — and skipping it is how
breaches and surprise bills happen.
Setup: pip install pydantic-settings; the auth demo uses FastAPI's TestClient. Output below is real. Code in code/config/.
Configuration: never hardcode
Anything that changes between environments — a database URL, the model path, a log level — is configuration, and it belongs in the environment, not the code. The 12-factor rule: config lives in environment variables, so the same image runs in dev, staging, and prod with different settings and no code change.
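To make the rule concrete before reaching for a library, here is a minimal standard-library sketch. The load_config helper, the variable names, and the defaults are hypothetical, chosen only for illustration. It takes the environment as a plain mapping so the same function can be run against two different environments.

```python
import os

def load_config(env=None):
    """Build a config dict from environment variables, with dev-friendly defaults.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    env = os.environ if env is None else env
    return {
        "database_url": env.get("DATABASE_URL", "sqlite:///dev.db"),
        "model_path": env.get("MODEL_PATH", "model.json"),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
    }

# Same code, two environments: only the variables differ.
dev = load_config({})
prod = load_config({"DATABASE_URL": "postgresql://db.internal/app", "LOG_LEVEL": "warning"})

print(dev["log_level"], dev["database_url"])
print(prod["log_level"], prod["database_url"])
```

What this sketch lacks is type coercion and validation: every value is a string, and a misspelled variable name silently falls back to the default.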
Pydantic Settings does this cleanly: declare your config as a typed class, and it
reads from environment variables (and a .env file), validates types, and fails fast.
From code/config/settings.py:
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    model_path: str = "model.json"
    redis_url: str = "redis://localhost:6379/0"
    max_batch_size: int = Field(default=128, ge=1, le=10_000)  # validated range
    api_key: str = Field(default="", repr=False)               # secret: hidden from logs
$ MAX_BATCH_SIZE=256 API_KEY=secret123 python config/settings.py
Output:
model_path : model.json
redis_url : redis://localhost:6379/0
max_batch_size: 256
api_key set? : True (value never printed)
MAX_BATCH_SIZE=256 was read and type-coerced to an int; an out-of-range or
non-numeric value would fail at startup, not mid-request. And repr=False keeps the
secret out of logs and tracebacks. Validated config that fails fast beats a typo
discovered in production.
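The repr=False mechanism is worth seeing in isolation. The sketch below uses the standard library's dataclasses, whose field(repr=False) option has the same effect on repr() output. The AppConfig class is a hypothetical stand-in, not the chapter's Settings class.

```python
from dataclasses import dataclass, field

@dataclass
class AppConfig:
    """Hypothetical stand-in for a settings object (not the chapter's Settings)."""
    model_path: str = "model.json"
    api_key: str = field(default="", repr=False)  # excluded from repr()

cfg = AppConfig(api_key="secret123")

shown = repr(cfg)          # what a log line or traceback would typically contain
print(shown)               # the api_key field is absent
print(bool(cfg.api_key))   # the value is still usable by the code
```

Hiding a field from repr only protects against accidentally printing the whole object. Printing the attribute directly, or serializing the object, still exposes the value. Pydantic also provides a SecretStr type that masks the value until get_secret_value() is called, which may be worth considering for stricter cases.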
Secrets: keep them out of the code
Don't be confused: configuration vs. secrets. Both come from the environment, but a secret (API key, DB password, token) is sensitive: it must never be committed, logged, or printed. Config like a log level can live in plain docker-compose.yml; a secret cannot.
The rules, in order of importance:
- Never commit secrets. No keys in source. Add .env to .gitignore (the project does). One leaked key in git history is a breach, and git remembers forever.
- Load from the environment (or a .env file locally that's gitignored).
- In production, use a secrets manager: AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault, or Kubernetes Secrets. These store, rotate, and audit access to secrets; your app fetches them at startup.
- Rotate on exposure. If a key leaks, revoke and reissue it immediately — which is easy only if it was never hardcoded.
# .env (gitignored — never committed)
API_KEY=sk-your-real-key
REDIS_URL=redis://prod-redis:6379/0
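For intuition about what happens to a file like this, here is a simplified sketch of a .env loader. It is not how pydantic-settings or python-dotenv is implemented. The parse_dotenv and apply_dotenv names are hypothetical, and quoting, escapes, and multi-line values are deliberately ignored. It illustrates one behavior that matters in practice: variables already set in the real environment take priority over the file, which is also how pydantic-settings treats a dotenv file.

```python
def parse_dotenv(text):
    """Parse KEY=VALUE lines; skip blanks and '#' comments (simplified)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

def apply_dotenv(text, env):
    """Fill in missing variables only; existing ones are left untouched."""
    for key, value in parse_dotenv(text).items():
        env.setdefault(key, value)
    return env

sample = """
# .env (gitignored)
API_KEY=sk-local-placeholder
REDIS_URL=redis://localhost:6379/0
"""

# Pretend the deployment already exported REDIS_URL.
env = apply_dotenv(sample, {"REDIS_URL": "redis://prod-redis:6379/0"})
print(env["REDIS_URL"])   # the pre-existing value is kept
print("API_KEY" in env)   # filled in from the file
```

Because the environment wins, a local .env can hold developer defaults without any risk of overriding what production injects.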
The classic mistake: committing a .env with real keys, or pasting a key into a notebook you later push. Scan your repo with tools like gitleaks or trufflehog; many CI pipelines (Chapter 18) run a secret-scan step to block leaks automatically.
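To show roughly what such a scanner does, here is a toy sketch: a handful of regular expressions run over each line of text. It is not a replacement for gitleaks or trufflehog, which ship far larger rule sets and also walk git history. The scan_text function and the three patterns are illustrative assumptions.

```python
import re

# Illustrative patterns only; real scanners use many more, carefully tuned rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "sk_style_key": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def scan_text(text):
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings

# Fake values, split in two so this sample does not trip a real scanner.
sample = "\n".join([
    "LOG_LEVEL=info",
    "API_KEY=sk-" + "a1B2c3D4e5F6g7H8i9J0",
    "AWS_ACCESS_KEY_ID=AKIA" + "ABCDEFGHIJKLMNOP",
])

findings = scan_text(sample)
print(findings)
```

A CI step built on this idea would exit non-zero whenever findings is non-empty, which is what blocks the merge.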
Securing the API
Your model endpoint is on the internet — someone will find it. At minimum, it needs authentication (who are you?) and rate limiting (how often can you call?).
Authentication: require an API key
A FastAPI dependency checks a header on every protected route — one function, applied everywhere:
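from fastapi import FastAPI, Depends, HTTPException, Header

app = FastAPI()
settings = Settings()  # Settings as defined in code/config/settings.py

def require_key(x_api_key: str = Header(default="")):
    # Fail closed: with no key configured, an empty header must not match the empty default.
    if not settings.api_key or x_api_key != settings.api_key:
        raise HTTPException(status_code=401, detail="invalid or missing API key")

@app.get("/secure")
def secure(_=Depends(require_key)):
    return {"ok": True}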
Verified end to end (via TestClient):
no key -> 401
bad key -> 401
good key -> 200 {'ok': True}
No key or a wrong key → 401 Unauthorized; the right key → 200. The endpoint is now locked. (For real user systems you'd graduate to OAuth2 / JWT tokens — FastAPI has first-class support — but a checked API key is the right baseline for service-to-service calls.)
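Two hardening details a key check like this usually wants are a constant-time comparison and a refusal to authenticate at all when no key is configured. The sketch below shows both using only the standard library. The key_is_valid helper is hypothetical and not part of the chapter's code.

```python
import hmac

def key_is_valid(presented: str, expected: str) -> bool:
    """Constant-time API key check that fails closed when no key is configured."""
    if not expected:
        # No key configured: refuse everything rather than accept an empty header.
        return False
    # compare_digest is designed not to reveal where the inputs first differ;
    # encoding to bytes avoids its ASCII-only restriction on str arguments.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))

good = key_is_valid("secret123", "secret123")
bad = key_is_valid("wrong", "secret123")
unconfigured = key_is_valid("", "")

print(good, bad, unconfigured)
```

Inside the dependency, the condition would then read: raise the 401 whenever key_is_valid(x_api_key, settings.api_key) is false.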
Rate limiting: cap abuse
Without a limit, one client (or one bug, or one attacker) can flood your service or run up a huge LLM bill. The Redis atomic-counter pattern from Chapter 7 caps requests per client per window:
# r is assumed to be a connected redis.Redis client
def allow(user_id, limit=100, window=60):
    n = r.incr(f"rate:{user_id}")              # atomic increment, shared across workers
    if n == 1:
        r.expire(f"rate:{user_id}", window)    # first hit starts the 60s window
    return n <= limit
Reject with 429 Too Many Requests once the limit is hit. (Libraries like slowapi wire this into FastAPI for you.)
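The fixed-window logic is easier to see when it runs without a Redis server. The sketch below is an in-memory stand-in with the same counting rule. The FixedWindowLimiter class and its injectable clock are assumptions made for illustration, and because its state lives in one process it is not a substitute for the shared Redis counter.

```python
import time

class FixedWindowLimiter:
    """In-memory stand-in for the Redis counter: one counter per client per window."""

    def __init__(self, limit=100, window=60, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._counters = {}  # user_id -> (window_start, count)

    def allow(self, user_id):
        now = self.clock()
        start, count = self._counters.get(user_id, (now, 0))
        if now - start >= self.window:
            start, count = now, 0      # window expired: start a fresh one
        count += 1
        self._counters[user_id] = (start, count)
        return count <= self.limit

# A fake clock makes the window behavior visible without sleeping.
fake_now = [0.0]
limiter = FixedWindowLimiter(limit=3, window=60, clock=lambda: fake_now[0])

first_window = [limiter.allow("alice") for _ in range(5)]
fake_now[0] = 61.0                     # move past the window
after_reset = limiter.allow("alice")

print(first_window)
print(after_reset)
```

One known trade-off of a fixed window: a client can send a full quota at the end of one window and another at the start of the next, so short bursts of up to twice the limit are possible.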
The security baseline checklist
For any service that goes live:
- No secrets in code or git history (gitignore .env, scan in CI)
- Config & secrets from the environment / a secrets manager, validated at startup
- Authentication on every non-public endpoint (API key → OAuth2/JWT)
- Rate limiting to cap abuse and runaway cost
- Input validation (Pydantic, Chapter 4, rejects junk before it runs)
- HTTPS only (terminate TLS at the load balancer / gateway)
- Least privilege: the service's credentials can do only what it needs
- Dependency scanning (pip-audit, Dependabot) for known CVEs
- Don't log secrets or full payloads (PII, keys)
You don't need all of it on day one, but you need this list in your head before exposing a model to the internet.
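The last checklist item, not logging secrets, can be partly enforced in code. Below is a best-effort sketch of a logging filter that scrubs key-like strings before they reach a handler. The RedactSecrets class, the redact helper, and both patterns are illustrative assumptions that would need adjusting to the key formats a real service uses.

```python
import io
import logging
import re

# Illustrative patterns; adjust to the formats the service actually handles.
KEY_VALUE = re.compile(r"\b(api[_-]?key|token|password)(\s*[=:]\s*)\S+", re.IGNORECASE)
SK_TOKEN = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")

def redact(text):
    """Mask key=value style secrets first, then any bare sk-... tokens."""
    text = KEY_VALUE.sub(r"\1\2[REDACTED]", text)
    return SK_TOKEN.sub("[REDACTED]", text)

class RedactSecrets(logging.Filter):
    def filter(self, record):
        record.msg = redact(record.getMessage())  # format first, then scrub
        record.args = None                        # args are already merged into msg
        return True

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets())

log = logging.getLogger("redaction_demo")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

log.info("calling upstream with api_key=%s", "sk-abcdef1234567890")
log.info("retrying with header sk-abcdef1234567890")

output = stream.getvalue()
print(output)
```

Pattern-based redaction is a safety net, not a guarantee: a secret in an unexpected format passes straight through, so the primary defense remains never handing secrets to a log call.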
The takeaway
Load config and secrets from the environment with typed, fail-fast validation
(Pydantic Settings); never commit secrets — gitignore .env, use a secrets manager
in prod, rotate on leak. Lock your API with authentication (a checked key → 401/200)
and rate limiting (429), on top of input validation and HTTPS. None of it is hard;
all of it is the difference between a demo and a service you can trust in production.
Now let's assemble every tool in this book into one running system. 👉