References
The official docs are the best source for every tool here: they're well written and current. This page points you to them, plus the books that go deeper.
Tool documentation
- MLflow — mlflow.org/docs. Tracking, the Model Registry, pyfunc, autolog (Ch 2–Ch 3).
- FastAPI — fastapi.tiangolo.com. The tutorial is outstanding (Ch 4).
- Pydantic — docs.pydantic.dev. Validation and settings.
- Uvicorn — uvicorn.org. The ASGI server.
- Docker — docs.docker.com; Get Started + the Dockerfile best-practices guide (Ch 5).
- Celery — docs.celeryq.dev (Ch 6).
- Redis — redis.io/docs; the commands reference (Ch 7).
- DVC — dvc.org/doc (Ch 8).
- Prefect — docs.prefect.io (Ch 9).
- ONNX / onnxruntime — onnx.ai and onnxruntime.ai (Ch 10).
- Streamlit — docs.streamlit.io (Ch 11).
- Gradio — gradio.app/docs (Ch 11).
- Evidently — docs.evidentlyai.com; drift and quality reports (Ch 12).
- Vector databases — Chroma (trychroma.com), Qdrant (qdrant.tech), pgvector, Pinecone (Ch 14).
- Anthropic Claude API — docs.claude.com; the Messages API and Python SDK used in the RAG, serving, and observability chapters (Ch 15–Ch 17).
- Ollama / vLLM — ollama.com, docs.vllm.ai; running open-weight models (Ch 16).
- Langfuse / Ragas — langfuse.com, docs.ragas.io; LLM tracing & evaluation (Ch 17).
- pytest / GitHub Actions — docs.pytest.org, docs.github.com/actions (Ch 18).
- Pydantic Settings — docs.pydantic.dev/latest/concepts/pydantic_settings (Ch 19).
Going deeper
- Chip Huyen. Designing Machine Learning Systems. O'Reilly, 2022. — The best single book on the production ML lifecycle; complements every chapter here.
- Noah Gift and Alfredo Deza. Practical MLOps. O'Reilly, 2021. — Hands-on cloud MLOps.
- Google Cloud. MLOps: Continuous delivery and automation pipelines in machine learning. — The widely cited MLOps maturity-levels white paper.
- Martin Kleppmann. Designing Data-Intensive Applications. O'Reilly, 2017. — The systems foundations (queues, caching, storage) under all of this.
The alternatives, by category
So you recognize them on a job description:
| Category | This book | Common alternatives |
|---|---|---|
| Experiment tracking | MLflow | Weights & Biases, Neptune, Comet |
| Model serving | FastAPI | BentoML, TorchServe, Triton, KServe |
| Task queue | Celery | RQ, Dramatiq, Arq, AWS SQS |
| Orchestration | Prefect | Airflow, Dagster, Kubeflow |
| Data versioning | DVC | lakeFS, Delta Lake, Git LFS |
| Inference runtime | ONNX Runtime | TensorRT, OpenVINO, TorchScript |
| Demo UI | Streamlit / Gradio | Dash, Panel |
| Monitoring | Evidently | WhyLabs, Arize, Fiddler |
| Vector database | Chroma / Qdrant | pgvector, Pinecone, Weaviate, Milvus |
| LLM API | Claude | GPT (OpenAI), Gemini (Google) |
| Self-hosted LLM serving | vLLM / Ollama | TGI, Triton, TensorRT-LLM, llama.cpp |
| LLM observability / eval | Langfuse / Ragas | LangSmith, Helicone, DeepEval |
| CI/CD | GitHub Actions | GitLab CI, Jenkins, CircleCI |
| Secrets management | env + Pydantic Settings | Vault, AWS/GCP Secrets Manager |
| Container orchestration | docker-compose | Kubernetes, Nomad, ECS |
Sister books in this series
- AI Foundations in Depth — the concepts behind the models you're deploying (and a cloud/MLOps overview chapter that maps the wider landscape).
- HNSW and IVF & Product Quantization — the vector-search engines inside vector databases and feature stores.
- Recommendation Systems from Scratch — a production capstone using MLflow, FastAPI, and a React frontend end to end.
This book's code
Everything lives in code/ and runs standalone: sentiment/ (the model + MLflow), api/ (FastAPI), tasks/ (Celery), pipeline/ (Prefect), serving/ (ONNX + drift), streamlit_app.py, plus the Dockerfile, docker-compose.yml, and Makefile. Only NumPy is required for the core model; each chapter installs its own tool.