The full stack with docker-compose

You've wrapped one model in a dozen tools, one chapter at a time. In production they run together: the API, the worker, Redis, and the tracking server, all at once, talking to each other. Starting four services by hand in four terminals is painful and fragile. docker-compose defines the whole system in one file and launches it with one command.

Setup: Docker with Compose (bundled in Docker Desktop). Follow-along.

The whole system, declared

docker-compose.yml describes every service and how they connect:

services:
  redis:                                    # broker + cache
    image: redis:7-alpine
    ports: ["6379:6379"]

  api:                                      # the FastAPI model service
    build: .
    ports: ["8000:8000"]
    environment: [REDIS_URL=redis://redis:6379/0]
    depends_on: [redis]

  worker:                                   # Celery background worker
    build: .
    command: celery -A tasks.celery_app worker --loglevel=info
    environment: [REDIS_URL=redis://redis:6379/0]
    depends_on: [redis]

  mlflow:                                   # experiment tracking / registry UI
    image: ghcr.io/mlflow/mlflow:latest
    command: mlflow server --host 0.0.0.0 --port 5000 --backend-store-uri sqlite:////mlflow/mlflow.db
    ports: ["5000:5000"]
    volumes: [mlflow-data:/mlflow]

volumes:
  mlflow-data:

Four services, each from earlier chapters, now wired into one system. Read it top to bottom and you can see the whole architecture at a glance — which is itself a benefit.

One command to rule them all

cd code
docker compose up --build

Expected output:

[+] Running 5/5
 ✔ Network code_default      Created
 ✔ Container code-redis-1    Started
 ✔ Container code-mlflow-1   Started
 ✔ Container code-api-1      Started
 ✔ Container code-worker-1   Started
api-1     | INFO:     Uvicorn running on http://0.0.0.0:8000
worker-1  | celery@... ready.
mlflow-1  | Listening at: http://0.0.0.0:5000

The entire stack is live:

http://localhost:8000/docs — the prediction API
http://localhost:5000 — the MLflow UI
the worker consuming background jobs from Redis
Redis brokering and caching

Stop it all with one command:

docker compose down

How the services find each other

Notice REDIS_URL=redis://redis:6379/0: the API reaches Redis by the service name redis, not an IP. Compose creates a private network for the project, and its built-in DNS resolves each service name to that service's container. This is the key idea: services find each other by name, not by IP address, so the same compose file works on any machine without editing IPs.

Don't be confused: depends_on waits for start, not ready. depends_on: [redis] makes Compose start the Redis container before the API container, but it doesn't wait for Redis to be accepting connections. A service that crashes because its dependency isn't ready yet needs two things: a real healthcheck on the dependency (like the one in our Dockerfile), which Compose waits on only when depends_on uses the long form with condition: service_healthy, and retry-on-connect logic in the client. "Started" ≠ "ready" is a classic compose gotcha.
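
One way to make Compose wait for readiness rather than just startup is sketched below. It is a fragment to merge into the file above, not a complete replacement. The interval, timeout, and retries values are illustrative assumptions, and the check relies on redis-cli being present in the redis:7-alpine image.

```yaml
services:
  redis:
    image: redis:7-alpine
    healthcheck:
      # Healthy once Redis answers PING; timing values are assumptions, tune as needed
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  api:
    build: .
    environment: [REDIS_URL=redis://redis:6379/0]
    depends_on:
      redis:
        # Long form: wait until the healthcheck passes, not just until the container starts
        condition: service_healthy
```

The worker can take the same depends_on block. Retry-on-connect logic is still worth keeping in the application, because Redis can restart long after the initial startup ordering has been satisfied.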

docker-compose vs. Kubernetes

Don't be confused: compose vs. Kubernetes. docker-compose runs multiple containers on one machine — perfect for local development, CI, and small deployments. Kubernetes (K8s) runs containers across a cluster of machines with auto-scaling, self-healing, rolling updates, and load balancing — the standard for production at scale. The good news: a compose file maps conceptually onto K8s manifests, so what you learn here transfers. Start with compose; graduate to K8s when one machine isn't enough.
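
To make the "maps conceptually" claim concrete, here is a rough sketch of how the api service might look as Kubernetes manifests. This is an illustration under assumptions, not a tested deployment: the image name registry.example.com/model-api:1.0.0 is hypothetical (Kubernetes pulls a prebuilt image rather than running build: .), the replica count is arbitrary, and it presumes a Service named redis exists in the same namespace.

```yaml
apiVersion: apps/v1
kind: Deployment                 # roughly the compose "api" service, plus replica management
metadata:
  name: api
spec:
  replicas: 2                    # assumption: two copies; compose ran one
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/model-api:1.0.0   # hypothetical image name
          ports:
            - containerPort: 8000
          env:
            - name: REDIS_URL
              value: redis://redis:6379/0   # resolves only if a Service named "redis" exists
---
apiVersion: v1
kind: Service                    # gives the pods a stable name, like a compose service name
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
    - port: 8000
      targetPort: 8000
```

The addressing idea carries over unchanged: other workloads in the namespace reach the API at the name api, just as compose services reach each other by service name.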

The complete lifecycle, assembled

Step back and look at what you've built across the book — the entire production loop from Chapter 0, now real:

  data ─► train ─► track (MLflow) ─► register ─► serve (FastAPI) ─► package (Docker)
       ─► scale (Celery+Redis) ─► version data (DVC) ─► orchestrate (Prefect)
       ─► optimize (ONNX) ─► demo (Streamlit) ─► monitor ─┐
       ▲                                                   │
       └──────────────── retrain on drift ◄────────────────┘

  GenAI stack:   vector DB ─► RAG service ─► LLM serving ─► LLM observability
  Engineering:   testing & CI/CD  ·  config, secrets & security
  … and run it all with one command (docker-compose)

Every box is a tool you can now use. Swap our tiny model for a real one and nothing about the tooling changes — that was the whole point of keeping the model trivial.

A production-readiness checklist

Before any model goes live, walk this list (each item maps to a chapter):

Experiments tracked and reproducible (MLflow, DVC)
Model versioned in a registry with a @production alias
Served behind a validated API with a /health check
Containerized; image in a registry; runs as non-root
Heavy work offloaded to a queue; hot paths cached
Retraining orchestrated and gated on quality
Inference optimized (ONNX/quantization) if latency matters
Monitoring for drift and operational metrics, with alerts
Tests passing in CI; deploys gated on green (Chapter 18)
Secrets out of code; API authenticated & rate-limited (Chapter 19)
For LLM features: cost/latency tracked, eval gate, grounded RAG (Chapter 17)
A rollback plan (move the alias back)

The takeaway

docker-compose declares your whole multi-service system (API, worker, Redis, MLflow) in one file and launches it with docker compose up. Services find each other by name on a private network, and you graduate to Kubernetes when one machine isn't enough. You've now assembled the complete production loop: track, register, serve, package, scale, version, orchestrate, optimize, demo, monitor, and retrain. That's MLOps, and you can do it. Go ship something.

Production ML & AI Tools: A Hands-On Field Guide