Capstone projects: build the real thing
You've learned the concepts, the math, the tooling, and the interview answers. This final part is where it becomes real: five complete, runnable projects that build the systems modern AI engineers are actually hired to build — each one trains or runs end to end on your laptop's CPU in seconds to minutes, with no GPU and no downloads.
These aren't toy snippets. Each is the real architecture in miniature: the same transformer design behind models like Claude, the same LoRA technique used to fine-tune 70-billion-parameter models, the same diffusion process behind Stable Diffusion, shrunk until it runs in seconds so you can read every line, run it, and modify it. Scale the numbers up and you have the core of what runs in production.
The five projects
| # | Project | What you build | Why it matters |
|---|---|---|---|
| 1 | GPT from scratch | a decoder-only transformer that generates text | the architecture behind every LLM |
| 2 | Fine-tuning & LoRA | adapt a pretrained model efficiently | how you customize LLMs on a budget |
| 3 | An LLM agent | a ReAct loop that calls tools | the pattern behind agentic AI |
| 4 | CNN image classifier | train a vision model end to end | the workhorse of computer vision |
| 5 | Diffusion model | generate data by reversing noise | how modern image generators work |
Together they cover the four things you told an interviewer you could do (Chapter 29): train a model from scratch (1, 4, 5), work with transformers and LLMs (1, 2, 3), fine-tune (2), and build modern generative and agentic AI (3, 5).
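To make the "ReAct loop that calls tools" from row 3 concrete before the full chapter, here is a minimal sketch of the pattern in pure Python. It is not the book's agent.py: the `scripted_model` function stands in for a real LLM, and the tool names, the `Action: name[arg]` format, and the `react` helper are all assumptions made for illustration.

```python
import re

# Hypothetical tools: plain functions from a string argument to a string result.
TOOLS = {
    "add": lambda arg: str(sum(float(x) for x in arg.split(","))),
    "upper": lambda arg: arg.upper(),
}

# Assumed action format for this sketch: "Action: tool_name[argument]".
ACTION_RE = re.compile(r"Action: (\w+)\[(.*?)\]")


def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM: chooses the next step from the transcript so far."""
    if "Observation: " not in transcript:
        return "Thought: I should add the two numbers.\nAction: add[2,3]"
    last_observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: The tool gave me the sum.\nFinal Answer: {last_observation}"


def react(question, model, tools, max_steps=5):
    """Alternate model steps and tool calls until a final answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip(), transcript
        match = ACTION_RE.search(step)
        if match is None:
            transcript += "\nObservation: could not parse an action"
            continue
        name, arg = match.groups()
        tool = tools.get(name)
        result = tool(arg) if tool else f"unknown tool {name!r}"
        # Feed the tool result back so the next model step can use it.
        transcript += f"\nObservation: {result}"
    return None, transcript  # step budget exhausted without an answer


answer, trace = react("What is 2 + 3?", scripted_model, TOOLS)
print(trace)
```

The loop itself never changes when a real model replaces the scripted one: the model proposes an action, the program runs the tool, and the observation is appended to the transcript for the next step.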
How to run them
Every project lives in code/projects/ and needs only PyTorch (the CPU build is fine), except the agent, which is pure Python:
cd code/projects
python gpt.py # train a transformer, generate text (~12s)
python finetune_lora.py # full fine-tune vs LoRA, compared (~5s)
python agent.py # a tool-using agent's reasoning trace (instant)
python cnn.py # train an image classifier to 100% accuracy (~20s)
python diffusion.py # generate points by reversing noise (~15s)
Every output shown in these chapters is real, produced by running exactly this code. Seeds are fixed, so your runs should match, though tiny numerical differences are possible across PyTorch versions and hardware.
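As a rough illustration of what a fixed seed buys you, the sketch below uses only Python's standard `random` module. The `noisy_run` function is a hypothetical stand-in for a training run. In a PyTorch script the analogous call is `torch.manual_seed(seed)`; whether the project scripts seed in exactly this way is an assumption here.

```python
import random


def noisy_run(seed: int, n: int = 3) -> list:
    """Hypothetical stand-in for a training run: draws n Gaussian 'results'."""
    rng = random.Random(seed)  # private generator, so runs don't affect each other
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


first = noisy_run(seed=0)
second = noisy_run(seed=0)
other = noisy_run(seed=1)

print(first == second)  # same seed, same sequence of draws
print(first == other)   # different seed, different draws
```

The same idea applies to weight initialization and data shuffling: if every source of randomness starts from a known seed, a rerun follows the same path.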
What "production-ready" means here
These projects teach the architecture and training correctly — the part that doesn't change with scale. But a model that runs in a notebook isn't deployed. Each chapter ends with a "Make it production" section that connects the project to the rest of this book and its sister volumes:
- Serve it — wrap it in an API, containerize it, monitor it (the Production ML & AI Tools book).
- Track it — log experiments and version the model (Ch 26, the MLflow chapters of the tools book).
- Scale it — the same code, bigger model, more data, a GPU, mixed precision (Ch 16).
- Retrieve for it — ground an LLM in your data with RAG (Ch 14, the HNSW/IVF-PQ books).
The gap between "I trained a tiny GPT" and "I shipped an LLM feature" is engineering, not understanding — and you now have both halves.
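To give a feel for the "Serve it" step, here is a minimal sketch of wrapping a model in a JSON-over-HTTP endpoint using only Python's standard library. The `predict` function is a hypothetical placeholder for a trained model, and the `{"features": [...]}` request shape, the port, and the helper names are assumptions for illustration. A real deployment would normally use a web framework, a container, and monitoring, as the tools book describes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of its inputs."""
    return sum(features) / len(features)


def handle_predict(body: bytes):
    """Turn a JSON request body into a (status code, JSON response body) pair."""
    try:
        features = json.loads(body)["features"]
        score = predict([float(x) for x in features])
    except (ValueError, KeyError, TypeError, ZeroDivisionError):
        error = {"error": "expected a JSON object with a non-empty 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"score": score}).encode()


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client says it sent.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000):
    """Call serve() to start a blocking local server on the assumed port."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Keeping the request logic in `handle_predict`, separate from the HTTP plumbing, means it can be tested without starting a server, for example `handle_predict(b'{"features": [1, 2, 3]}')`.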
A note on scaling up
The main difference between these projects and the real systems is scale: more parameters, more data, more compute, plus a few production techniques (mixed precision, distributed training, learning-rate schedules). The core architecture stays the same. When you read that GPT-4 is "a transformer," you'll know precisely what that means, because you built one. When you read that a team "fine-tuned with LoRA," you'll know exactly what they did, because you did it too.
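The point about size can be checked with back-of-the-envelope arithmetic. The sketch below estimates the parameter count of a decoder-only transformer from a handful of hyperparameters. It assumes a 4x MLP expansion, learned positional embeddings, and an output layer that shares weights with the token embedding, and it ignores biases and layer norms. The "tiny" configuration is hypothetical and is not taken from gpt.py; the second configuration uses the published GPT-2 small hyperparameters.

```python
def approx_transformer_params(n_layers, d_model, vocab_size, context_len):
    """Rough parameter count for a decoder-only transformer (no biases or norms)."""
    attention = 4 * d_model * d_model  # Q, K, V and output projections
    mlp = 8 * d_model * d_model        # two linear layers with a 4x hidden width
    blocks = n_layers * (attention + mlp)
    # Token embedding plus learned positional embedding; output head assumed tied.
    embeddings = vocab_size * d_model + context_len * d_model
    return blocks + embeddings


# Hypothetical laptop-scale configuration, chosen only for illustration.
tiny = approx_transformer_params(n_layers=2, d_model=64, vocab_size=65, context_len=32)

# Published GPT-2 small hyperparameters.
gpt2_small = approx_transformer_params(
    n_layers=12, d_model=768, vocab_size=50257, context_len=1024
)

print(f"tiny:       {tiny:,} parameters")
print(f"GPT-2 small: {gpt2_small:,} parameters")
```

The same function covers both models; only the four numbers change. The GPT-2 small estimate lands at roughly 124 million, in line with that model's commonly quoted size.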
The takeaway
These five projects turn every concept in this book into working code you can run, read, and extend. Train a transformer, fine-tune it with LoRA, give it tools, train a vision model, and generate with diffusion — the actual modern AI stack, on your CPU, today. Pick any one and run it. Let's start with the project at the center of the AI universe: a GPT. 👉