References

The official docs are the best source for every tool here — they're well-written and current. This page points you to them, plus the books that go deeper.

Tool documentation

  • MLflow — mlflow.org/docs. Tracking, the Model Registry, pyfunc, autolog (Ch 2, Ch 3).
  • FastAPI — fastapi.tiangolo.com. The tutorial is outstanding (Ch 4).
  • Pydantic — docs.pydantic.dev. Validation and settings.
  • Uvicorn — uvicorn.org. The ASGI server.
  • Docker — docs.docker.com; Get Started + the Dockerfile best-practices guide (Ch 5).
  • Celery — docs.celeryq.dev (Ch 6).
  • Redis — redis.io/docs; the commands reference (Ch 7).
  • DVC — dvc.org/doc (Ch 8).
  • Prefect — docs.prefect.io (Ch 9).
  • ONNX / onnxruntime — onnx.ai and onnxruntime.ai (Ch 10).
  • Streamlit — docs.streamlit.io (Ch 11).
  • Gradio — gradio.app/docs (Ch 11).
  • Evidently — docs.evidentlyai.com; drift and quality reports (Ch 12).
  • Vector databases — Chroma (trychroma.com), Qdrant (qdrant.tech), pgvector, Pinecone (Ch 14).
  • Anthropic Claude API — docs.claude.com; the Messages API and Python SDK used in the RAG, serving, and observability chapters (Ch 15–Ch 17).
  • Ollama / vLLM — ollama.com, docs.vllm.ai; running open-weight models (Ch 16).
  • Langfuse / Ragas — langfuse.com, docs.ragas.io; LLM tracing & evaluation (Ch 17).
  • pytest / GitHub Actions — docs.pytest.org, docs.github.com/actions (Ch 18).
  • Pydantic Settings — docs.pydantic.dev/latest/concepts/pydantic_settings (Ch 19).

Going deeper

  • Chip Huyen. Designing Machine Learning Systems. O'Reilly, 2022. — The best single book on the production ML lifecycle; complements every chapter here.
  • Noah Gift et al. Practical MLOps. O'Reilly, 2021. — Hands-on cloud MLOps.
  • Google Cloud. MLOps: Continuous delivery and automation pipelines in machine learning. — The widely cited MLOps maturity-levels white paper.
  • Martin Kleppmann. Designing Data-Intensive Applications. O'Reilly, 2017. — The systems foundations (queues, caching, storage) under all of this.

The alternatives, by category

So you recognize them on a job description:

| Category | This book | Common alternatives |
|---|---|---|
| Experiment tracking | MLflow | Weights & Biases, Neptune, Comet |
| Model serving | FastAPI | BentoML, TorchServe, Triton, KServe |
| Task queue | Celery | RQ, Dramatiq, Arq, AWS SQS |
| Orchestration | Prefect | Airflow, Dagster, Kubeflow |
| Data versioning | DVC | lakeFS, Delta Lake, Git LFS |
| Inference runtime | ONNX | TensorRT, OpenVINO, TorchScript |
| Demo UI | Streamlit / Gradio | Dash, Panel |
| Monitoring | Evidently | WhyLabs, Arize, Fiddler |
| Vector database | Chroma / Qdrant | pgvector, Pinecone, Weaviate, Milvus |
| LLM API | Claude | GPT (OpenAI), Gemini (Google) |
| Self-hosted LLM serving | vLLM / Ollama | TGI, Triton, TensorRT-LLM, llama.cpp |
| LLM observability / eval | Langfuse / Ragas | LangSmith, Helicone, DeepEval |
| CI/CD | GitHub Actions | GitLab CI, Jenkins, CircleCI |
| Secrets management | env + Pydantic Settings | Vault, AWS/GCP Secrets Manager |
| Container orchestration | docker-compose | Kubernetes, Nomad, ECS |

Sister books in this series

  • AI Foundations in Depth — the concepts behind the models you're deploying (and a cloud/MLOps overview chapter that maps the wider landscape).
  • HNSW and IVF & Product Quantization — the vector-search engines inside vector databases and feature stores.
  • Recommendation Systems from Scratch — a production capstone using MLflow, FastAPI, and a React frontend end to end.

This book's code

Everything lives in code/ and runs standalone: sentiment/ (the model + MLflow), api/ (FastAPI), tasks/ (Celery), pipeline/ (Prefect), serving/ (ONNX + drift), streamlit_app.py, plus the Dockerfile, docker-compose.yml, and Makefile. Only NumPy is required for the core model; each chapter installs its own tool.