References

The official docs are the best source for every tool here: they are well written and kept current. This page points you to them and to the books that go deeper.

Tool documentation

MLflow — mlflow.org/docs. Tracking, the Model Registry, pyfunc, autolog (Ch 2–Ch 3).
FastAPI — fastapi.tiangolo.com. The tutorial is outstanding (Ch 4).
Pydantic — docs.pydantic.dev. Validation and settings.
Uvicorn — uvicorn.org. The ASGI server.
Docker — docs.docker.com; Get Started + the Dockerfile best-practices guide (Ch 5).
Celery — docs.celeryq.dev (Ch 6).
Redis — redis.io/docs; the commands reference (Ch 7).
DVC — dvc.org/doc (Ch 8).
Prefect — docs.prefect.io (Ch 9).
ONNX / onnxruntime — onnx.ai and onnxruntime.ai (Ch 10).
Streamlit — docs.streamlit.io (Ch 11).
Gradio — gradio.app/docs (Ch 11).
Evidently — docs.evidentlyai.com; drift and quality reports (Ch 12).
Vector databases — Chroma (trychroma.com), Qdrant (qdrant.tech), pgvector, Pinecone (Ch 14).
Anthropic Claude API — docs.claude.com; the Messages API and Python SDK used in the RAG, serving, and observability chapters (Ch 15–Ch 17).
Ollama / vLLM — ollama.com, docs.vllm.ai; running open-weight models (Ch 16).
Langfuse / Ragas — langfuse.com, docs.ragas.io; LLM tracing & evaluation (Ch 17).
pytest / GitHub Actions — docs.pytest.org, docs.github.com/actions (Ch 18).
Pydantic Settings — docs.pydantic.dev/latest/concepts/pydantic_settings (Ch 19).

Going deeper

Chip Huyen. Designing Machine Learning Systems. O'Reilly, 2022. — The best single book on the production ML lifecycle; complements every chapter here.
Noah Gift et al. Practical MLOps. O'Reilly, 2021. — Hands-on cloud MLOps.
Google Cloud. MLOps: Continuous delivery and automation pipelines in machine learning. — The widely-cited MLOps maturity-levels white paper.
Martin Kleppmann. Designing Data-Intensive Applications. O'Reilly, 2017. — The systems foundations (queues, caching, storage) under all of this.

The alternatives, by category

Listed so that you recognize them in a job description:

Category	This book	Common alternatives
Experiment tracking	MLflow	Weights & Biases, Neptune, Comet
Model serving	FastAPI	BentoML, TorchServe, Triton, KServe
Task queue	Celery	RQ, Dramatiq, Arq, AWS SQS
Orchestration	Prefect	Airflow, Dagster, Kubeflow
Data versioning	DVC	lakeFS, Delta Lake, Git LFS
Inference runtime	ONNX Runtime	TensorRT, OpenVINO, TorchScript
Demo UI	Streamlit / Gradio	Dash, Panel
Monitoring	Evidently	WhyLabs, Arize, Fiddler
Vector database	Chroma / Qdrant	pgvector, Pinecone, Weaviate, Milvus
LLM API	Claude	GPT (OpenAI), Gemini (Google)
Self-hosted LLM serving	vLLM / Ollama	TGI, Triton, TensorRT-LLM, llama.cpp
LLM observability / eval	Langfuse / Ragas	LangSmith, Helicone, DeepEval
CI/CD	GitHub Actions	GitLab CI, Jenkins, CircleCI
Secrets management	env + Pydantic Settings	Vault, AWS/GCP Secrets Manager
Container orchestration	docker-compose	Kubernetes, Nomad, ECS

Sister books in this series

AI Foundations in Depth — the concepts behind the models you're deploying (and a cloud/MLOps overview chapter that maps the wider landscape).
HNSW and IVF & Product Quantization — the vector-search engines inside vector databases and feature stores.
Recommendation Systems from Scratch — a production capstone using MLflow, FastAPI, and a React frontend end to end.

This book's code

Everything lives in code/, and each piece runs standalone: sentiment/ (the model + MLflow), api/ (FastAPI), tasks/ (Celery), pipeline/ (Prefect), serving/ (ONNX + drift), streamlit_app.py, plus the Dockerfile, docker-compose.yml, and Makefile. Only NumPy is required for the core model; each chapter installs its own tool.
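Because each chapter installs its own tool, it can help to check which parts of code/ are ready to run in the current environment. The sketch below is one way to do that and is not part of the book's code. The mapping from directory to import name is an assumption inferred from the layout described above (for example, that serving/ needs onnxruntime); adjust it to match what each chapter actually installs.

```python
from importlib.util import find_spec

# ASSUMPTION: this mapping is illustrative, inferred from the directory
# descriptions above. It is not taken from the book's requirements files.
OPTIONAL_TOOLS = {
    "sentiment/": ["numpy", "mlflow"],
    "api/": ["fastapi"],
    "tasks/": ["celery"],
    "pipeline/": ["prefect"],
    "serving/": ["onnxruntime"],
    "streamlit_app.py": ["streamlit"],
}


def missing_packages(packages):
    """Return the top-level packages in `packages` that cannot be imported."""
    return [name for name in packages if find_spec(name) is None]


def report(tools=OPTIONAL_TOOLS):
    """Map each component to the list of packages it still needs."""
    return {component: missing_packages(pkgs) for component, pkgs in tools.items()}


if __name__ == "__main__":
    for component, missing in report().items():
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"{component:<18} {status}")
```

Running the script prints one line per component. Anything listed as missing can be installed when you reach the chapter that uses it.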

Production ML & AI Tools: A Hands-On Field Guide