Config, secrets & security

The fastest way to turn a working service into an incident is a hardcoded API key in a public repo or an unprotected /predict endpoint. This chapter covers the unglamorous-but-essential basics: load configuration the right way, keep secrets out of your code, and put a lock on your API. None of it is hard — and skipping it is how breaches and surprise bills happen.

Setup: pip install pydantic-settings; the auth demo uses FastAPI's TestClient. Output below is real. Code in code/config/.

Configuration: never hardcode

Anything that changes between environments — a database URL, the model path, a log level — is configuration, and it belongs in the environment, not the code. The 12-factor rule: config lives in environment variables, so the same image runs in dev, staging, and prod with different settings and no code change.
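As a baseline before reaching for a library, the 12-factor idea can be sketched with nothing but the standard library. This is an illustrative sketch, not the chapter's code: `load_config` and the `LOG_LEVEL` variable are hypothetical names, while `MODEL_PATH` and `REDIS_URL` mirror the settings used later in this chapter.

```python
import os
from typing import Mapping

def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read settings from environment variables, falling back to dev defaults."""
    return {
        "model_path": env.get("MODEL_PATH", "model.json"),
        "redis_url": env.get("REDIS_URL", "redis://localhost:6379/0"),
        "log_level": env.get("LOG_LEVEL", "INFO"),   # hypothetical setting
    }

# Same code, two environments: only the variables differ.
dev = load_config({})
prod = load_config({"REDIS_URL": "redis://prod-redis:6379/0", "LOG_LEVEL": "WARNING"})

print(dev["redis_url"])
print(prod["redis_url"])
```

Note what this version lacks: every value is an unvalidated string, so a bad number or a typo in a variable name goes unnoticed until something breaks at runtime.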

Pydantic Settings does this cleanly: declare your config as a typed class, and it reads from environment variables (and a .env file), validates types, and fails fast. From code/config/settings.py:

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")
    model_path: str = "model.json"
    redis_url: str = "redis://localhost:6379/0"
    max_batch_size: int = Field(default=128, ge=1, le=10_000)   # validated range
    api_key: str = Field(default="", repr=False)                # secret: hidden from logs

$ MAX_BATCH_SIZE=256 API_KEY=secret123 python config/settings.py

Output:

model_path    : model.json
redis_url     : redis://localhost:6379/0
max_batch_size: 256
api_key set?  : True (value never printed)

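MAX_BATCH_SIZE=256 was read and type-coerced to an int; an out-of-range or non-numeric value would fail at startup, not mid-request. And repr=False keeps the secret out of the Settings object's repr, so printing or logging the settings object (or seeing it in a traceback) does not reveal the key. It does not protect you if you log settings.api_key directly. Validated config that fails fast beats a typo discovered in production.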

Secrets: keep them out of the code

Don't be confused: configuration vs. secrets. Both come from the environment, but a secret (API key, DB password, token) is sensitive — it must never be committed, logged, or printed. Config like a log level can live in plain docker-compose.yml; a secret cannot.
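One way to make "never logged or printed" harder to violate by accident is Pydantic's SecretStr type, which masks the value whenever the object is printed. This is an alternative to the repr=False approach used in settings.py, not what the chapter's code does; the `Credentials` model and the key value are made up for the example.

```python
from pydantic import BaseModel, SecretStr

class Credentials(BaseModel):
    # Hypothetical model for illustration; not part of the chapter's code.
    api_key: SecretStr

creds = Credentials(api_key="sk-demo-not-a-real-key")

print(creds)                               # the value is masked in str/repr
print(creds.api_key.get_secret_value())    # reading it takes an explicit call
```

The explicit get_secret_value() call is the point of the design: every place the raw secret is used becomes easy to find in a code review.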

The rules, in order of importance:

1. Never commit secrets. No keys in source. Add .env to .gitignore (the project does). One leaked key in git history is a breach, and git remembers forever.
2. Load from the environment (or a .env file locally that's gitignored).
3. In production, use a secrets manager: AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault, or Kubernetes Secrets. These store, rotate, and audit access to secrets; your app fetches them at startup.
4. Rotate on exposure. If a key leaks, revoke and reissue it immediately, which is easy only if it was never hardcoded.

# .env  (gitignored — never committed)
API_KEY=sk-your-real-key
REDIS_URL=redis://prod-redis:6379/0

The classic mistake: committing a .env with real keys, or pasting a key into a notebook you later push. Scan your repo with tools like gitleaks or trufflehog; many CI pipelines (Chapter 18) run a secret-scan step to block leaks automatically.
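To make the idea of a secret scan concrete, here is a toy version of what such tools do at their core: match text against known key shapes. It is a deliberately simplified sketch with three illustrative patterns, not a substitute for gitleaks or trufflehog, which ship far larger rule sets and additional checks. The `scan` function and the `PATTERNS` table are hypothetical names.

```python
import re

# Illustrative patterns only; real scanners cover many more key formats.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "sk_style_key": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
    "assigned_secret": re.compile(
        r"(?i)\b(api_key|password|token)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}

def scan(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every pattern match, line by line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits

sample = 'LOG_LEVEL = "info"\napi_key = "sk-abcdefghijklmnop1234"\n'
print(scan(sample))
```

Run as a pre-commit hook or CI step, a non-empty result would fail the build before the key ever reaches the remote.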

Securing the API

Your model endpoint is on the internet — someone will find it. At minimum, it needs authentication (who are you?) and rate limiting (how often can you call?).

Authentication: require an API key

A FastAPI dependency checks a header on every protected route — one function, applied everywhere:

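import secrets

from fastapi import Depends, FastAPI, Header, HTTPException
from settings import Settings   # assumes config/settings.py is importable as "settings"

app = FastAPI()
settings = Settings()

def require_key(x_api_key: str = Header(default="")):
    # If API_KEY is unset (""), a missing header would otherwise match it, so reject outright.
    expected = settings.api_key.encode()
    # compare_digest is a constant-time comparison, so response timing does not leak the key.
    if not expected or not secrets.compare_digest(x_api_key.encode(), expected):
        raise HTTPException(status_code=401, detail="invalid or missing API key")

@app.get("/secure")
def secure(_=Depends(require_key)):
    return {"ok": True}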

Verified end to end (via TestClient):

no key   -> 401
bad key  -> 401
good key -> 200  {'ok': True}

No key or a wrong key → 401 Unauthorized; the right key → 200. The endpoint is now locked. (For real user systems you'd graduate to OAuth2 / JWT tokens — FastAPI has first-class support — but a checked API key is the right baseline for service-to-service calls.)

Rate limiting: cap abuse

Without a limit, one client (or one bug, or one attacker) can flood your service or run up a huge LLM bill. The Redis atomic-counter pattern from Chapter 7 caps requests per client per window:

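import redis

r = redis.Redis.from_url("redis://localhost:6379/0")   # in the app, pass settings.redis_url

def allow(user_id, limit=100, window=60):
    key = f"rate:{user_id}"
    n = r.incr(key)                # atomic: concurrent requests never double-count
    if n == 1:
        r.expire(key, window)      # first hit starts the 60s window
    return n <= limit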

Reject with 429 Too Many Requests once the limit is hit. (Libraries like slowapi wire this into FastAPI for you.)
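To watch the fixed-window behavior without a Redis server, the same counting logic can be sketched in memory. This is a single-process stand-in for experimentation, not a replacement for the Redis version (it shares nothing across workers or machines). `FixedWindowLimiter` and the injectable `clock` parameter are hypothetical names; the fake clock is there so the demo does not need to sleep for a minute.

```python
import time

class FixedWindowLimiter:
    """In-memory analogue of the INCR + EXPIRE pattern (single process only)."""

    def __init__(self, limit=100, window=60, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window, clock
        self._counters = {}   # user_id -> (window_start, count)

    def allow(self, user_id):
        now = self.clock()
        start, count = self._counters.get(user_id, (now, 0))
        if now - start >= self.window:   # window expired: start a fresh one
            start, count = now, 0
        count += 1
        self._counters[user_id] = (start, count)
        return count <= self.limit

# Fake clock: a one-element list the lambda reads, so time can be advanced by hand.
fake_now = [0.0]
limiter = FixedWindowLimiter(limit=3, window=60, clock=lambda: fake_now[0])

first_minute = [limiter.allow("alice") for _ in range(5)]
fake_now[0] = 61.0                       # jump past the end of the window
after_reset = limiter.allow("alice")

print(first_minute, after_reset)
```

In a request handler, a False result is where you would return 429 Too Many Requests.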

The security baseline checklist

For any service that goes live:

No secrets in code or git history (gitignore .env, scan in CI)
Config & secrets from the environment / a secrets manager, validated at startup
Authentication on every non-public endpoint (API key → OAuth2/JWT)
Rate limiting to cap abuse and runaway cost
Input validation (Pydantic — Chapter 4 — rejects junk before it runs)
HTTPS only (terminate TLS at the load balancer / gateway)
Least privilege — the service's credentials can do only what it needs
Dependency scanning (pip-audit, Dependabot) for known CVEs
Don't log secrets or full payloads (PII, keys)

You don't need all of it on day one, but you need this list in your head before exposing a model to the internet.
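The last checklist item is the easiest to break by accident, since one careless log line is enough. As a safety net, a logging filter can scrub known secret values before a record is written. This is a minimal sketch using only the standard library; `RedactSecrets` is a hypothetical name, and it only catches exact matches of the values you give it, so it complements careful logging and does not replace it.

```python
import io
import logging

class RedactSecrets(logging.Filter):
    """Replace known secret values in log messages with a placeholder."""

    def __init__(self, secrets_to_hide):
        super().__init__()
        # Skip empty strings: replacing "" would corrupt every message.
        self._secrets = [s for s in secrets_to_hide if s]

    def filter(self, record):
        message = record.getMessage()    # message with its %-args already applied
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg, record.args = message, ()
        return True                      # keep the record, now sanitized

# Demo: capture log output in memory so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets(["sk-demo-not-a-real-key"]))

log = logging.getLogger("redaction-demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)

log.info("calling upstream with key=%s", "sk-demo-not-a-real-key")
print(stream.getvalue().strip())
```

Attaching the filter to the handler rather than to a single logger means it applies to every record that handler writes, whichever logger produced it.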

The takeaway

Load config and secrets from the environment with typed, fail-fast validation (Pydantic Settings); never commit secrets: gitignore .env, use a secrets manager in prod, rotate on leak. Lock your API with authentication (a checked key → 401/200) and rate limiting (429), on top of input validation and HTTPS. None of it is hard; all of it is the difference between a demo and a service you can trust in production. Now let's assemble every tool in this book into one running system.

Production ML & AI Tools: A Hands-On Field Guide