Docker: package it to run anywhere
Your FastAPI service runs on your laptop, with your Python version and your installed packages. Ship it to a server and it breaks — wrong Python, missing library, different OS. Docker ends "works on my machine" by packaging your code and its entire environment into a portable image that runs identically everywhere: your laptop, a teammate's, a server, the cloud.
Setup: install Docker Desktop (Mac/Windows) or Docker Engine (Linux). This chapter is follow-along — Docker needs a daemon, so commands and expected output are shown for you to run locally.
The three words you must know
- Dockerfile: a recipe, the steps to build your environment (base OS, install deps, copy code, run command).
- Image: the built, frozen result of that recipe. Immutable, shareable, tagged (sentiment-api:1.0).
- Container: a running instance of an image. You can run many containers from one image.
Analogy: the Dockerfile is source code, the image is a compiled program, the container is a running process.
The Dockerfile, line by line
Here's code/Dockerfile, which packages our API. Every line is
a real best practice:
# 1. Small base image (fewer CVEs, less weight)
FROM python:3.11-slim
WORKDIR /app
# 2. Deps FIRST, so Docker caches this layer
COPY requirements.txt .
RUN pip install --no-cache-dir fastapi "uvicorn[standard]" pydantic numpy
# 3. Then the code (changes often)
COPY sentiment/ ./sentiment/
COPY api/ ./api/
# 4. Don't run as root
RUN useradd --create-home appuser
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]
Note that Dockerfile comments must sit on their own line: a # after an instruction is passed to that instruction as an argument, not ignored.
The four ideas worth internalizing:
- Slim base image: python:3.11-slim is ~5× smaller than the full image, which means faster pulls and a smaller attack surface.
- Copy requirements.txt before the code. Docker builds in cached layers; if you copy code first, every code change re-installs all dependencies. Deps-first means pip install is cached and rebuilds are seconds, not minutes.
- Run as a non-root user. If the container is compromised, the attacker isn't root. Basic, essential hygiene.
- HEALTHCHECK: hits our /health endpoint (Chapter 4) so orchestrators know when the container is actually ready, not just started.
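To make the layer-caching argument concrete, here is a toy model in Python. It is a sketch of the idea only, not Docker's actual cache algorithm: it assumes each step's cache key depends on the previous step's key, the instruction text, and the contents of any copied files. The function names and the file contents are hypothetical.

```python
import hashlib

def layer_keys(steps, file_contents):
    """Return one cache key per build step.

    Each key hashes the previous key, the instruction text, and the
    contents of the files that step copies, so a change invalidates
    that step and every step after it.
    """
    keys = []
    parent = ""
    for instruction, copied_files in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in copied_files:
            h.update(file_contents[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys

def rebuilt_steps(steps, before, after):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(steps, before)
    new = layer_keys(steps, after)
    return [instr for (instr, _), a, b in zip(steps, old, new) if a != b]

# Two orderings of the same three steps (simplified from the Dockerfile).
DEPS_FIRST = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install", []),
    ("COPY api/ ./api/", ["api/main.py"]),
]
CODE_FIRST = [
    ("COPY api/ ./api/", ["api/main.py"]),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install", []),
]

# Only the application code changes between the two builds.
before = {"requirements.txt": "fastapi\n", "api/main.py": "v1"}
after = {"requirements.txt": "fastapi\n", "api/main.py": "v2"}

print(rebuilt_steps(DEPS_FIRST, before, after))
# ['COPY api/ ./api/']
print(rebuilt_steps(CODE_FIRST, before, after))
# ['COPY api/ ./api/', 'COPY requirements.txt .', 'RUN pip install']
```

With deps first, a code change only redoes the final copy. With code first, the changed key propagates downward and the pip install step is rebuilt too.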
Don't be confused: EXPOSE vs. -p. EXPOSE 8000 only documents that the app uses port 8000; it doesn't open anything. You actually publish the port at run time with -p 8000:8000 (host port : container port). Forgetting -p is the #1 "why can't I reach my container?" gotcha.
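If you are unsure whether the port was actually published, a quick TCP check from the host can tell you. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper name, and it only tests that something accepts connections on the port, not that it is your API.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling port_is_open("localhost", 8000) on the host should return False for a container started without -p, and True once the same image is run with -p 8000:8000.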
Build and run
cd code
docker build -t sentiment-api . # build the image from the Dockerfile
docker run -p 8000:8000 sentiment-api # run a container, publish the port
Expected output:
[+] Building 12.3s (12/12) FINISHED
=> naming to docker.io/library/sentiment-api 0.0s
...
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Now curl http://localhost:8000/health works exactly as in Chapter 4 — but it's
running inside an isolated container, with its own Python and dependencies, that
will behave identically on any machine with Docker.
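In a script, the container may need a moment after docker run before it answers, so a one-shot request can fail spuriously. The sketch below polls the health endpoint until it responds. It assumes /health returns HTTP 200 when the service is ready, as the HEALTHCHECK in the Dockerfile implies; wait_for_health and its retry defaults are hypothetical.

```python
import time
import urllib.request

def wait_for_health(url: str, retries: int = 10, delay: float = 0.5) -> bool:
    """Poll url until it answers with HTTP 200 or the retries run out."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=3) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError, HTTPError, and refused connections are all
            # OSError subclasses: treat them as "not ready yet".
            pass
        time.sleep(delay)
    return False
```

For the container started above, wait_for_health("http://localhost:8000/health") would be the scripted equivalent of retrying the curl command until it succeeds.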
The commands you'll use daily
docker images # list built images
docker ps # list running containers
docker logs <container> # see a container's output
docker exec -it <container> bash # open a shell inside a running container
docker stop <container> # stop it
docker build -t name:tag . # build with a version tag
docker push registry/name:tag # push to a registry (Docker Hub, ECR, GCR)
How images travel to production
You don't copy code to servers anymore — you build an image, push it to a container registry (Docker Hub, AWS ECR, Google Artifact Registry), and servers pull and run it:
build image ─► docker push ─► registry ─► docker pull ─► run on server / K8s
This is the foundation of modern deployment. Kubernetes (the full-stack chapter and the foundations book) orchestrates thousands of these containers; cloud "serverless container" services (AWS Fargate, Cloud Run) run them without you managing servers at all.
Don't be confused: image vs. container (again, because it matters). You build an image once and run many containers from it. Stopping a container doesn't delete the image. A container is ephemeral: anything written inside it (like a model.json created at runtime) is lost when the container is removed, and every new container starts fresh from the image, unless you mount a volume. Bake the model into the image, or load it from a registry or object storage at startup; never rely on files written inside a running container surviving.
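One way to follow this advice in the application code is to load the model at startup from a path that exists before the container runs, and to fail loudly if it is missing. The sketch below is an illustration under assumptions: the MODEL_PATH environment variable, the default path, and the JSON format of model.json are all hypothetical, not part of the chapter's code.

```python
import json
import os
from pathlib import Path

# Hypothetical default: a file copied into the image at build time.
DEFAULT_MODEL_PATH = "/app/sentiment/model.json"

def load_model(path=None) -> dict:
    """Load model parameters from a file that exists before the app starts.

    The path comes from the argument, else the MODEL_PATH environment
    variable (e.g. pointing at a mounted volume), else the baked-in default.
    """
    model_path = Path(path or os.environ.get("MODEL_PATH", DEFAULT_MODEL_PATH))
    if not model_path.is_file():
        raise FileNotFoundError(
            f"{model_path} not found: bake it into the image or mount a volume"
        )
    with model_path.open() as f:
        return json.load(f)
```

Failing at startup is deliberate here: a container that cannot find its model should never report itself healthy.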
A note on size & GPUs
- Keep images small: slim bases, .dockerignore, multi-stage builds (build in a fat image, copy only the result into a slim one).
- For GPU inference, use NVIDIA's CUDA base images and the NVIDIA container runtime: the same Dockerfile idea, heavier base.
The takeaway
Docker packages your code and its environment into a portable image — killing "works
on my machine." Write a Dockerfile (slim base, deps before code for caching, non-root,
healthcheck), build an image, run containers from it, and push to a registry so
any server can pull and run it identically. Containers are ephemeral; don't rely on
files written inside them. Our service is now portable — next, let's handle work that's
too slow to do inside a request, with a task queue.