Capstone projects: build the real thing

You've learned the concepts, the math, the tooling, and the interview answers. This final part is where it becomes real: five complete, runnable projects that build the systems modern AI engineers are actually hired to build — each one trains or runs end to end on your laptop's CPU in seconds to minutes, with no GPU and no downloads.

These aren't toy snippets. Each is the real architecture in miniature — the same transformer that powers Claude, the same LoRA that fine-tunes 70-billion-parameter models, the same diffusion process behind Stable Diffusion — shrunk until it runs in a few seconds so you can read every line, run it, and modify it. Scale the numbers up and the code is what runs in production.

The five projects

#	Project	What you build	Why it matters
1	GPT from scratch	a decoder-only transformer that generates text	the architecture behind every LLM
2	Fine-tuning & LoRA	adapt a pretrained model efficiently	how you customize LLMs on a budget
3	An LLM agent	a ReAct loop that calls tools	the pattern behind agentic AI
4	CNN image classifier	train a vision model end to end	the workhorse of computer vision
5	Diffusion model	generate data by reversing noise	how modern image generators work

Together they cover the four things you told an interviewer you could do (Chapter 29): train a model from scratch (1, 4, 5), work with transformers and LLMs (1, 2, 3), fine-tune (2), and build modern generative and agentic AI (3, 5).

How to run them

Every project lives in code/projects/ and needs only PyTorch (CPU build is fine) — except the agent, which is pure Python:

cd code/projects
python gpt.py            # train a transformer, generate text          (~12s)
python finetune_lora.py  # full fine-tune vs LoRA, compared            (~5s)
python agent.py          # a tool-using agent's reasoning trace        (instant)
python cnn.py            # train an image classifier to 100% accuracy  (~20s)
python diffusion.py      # generate points by reversing noise          (~15s)

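Every output shown in these chapters is real, produced by running exactly this code. Seeds are fixed, so your runs should match closely; exact digits can still differ slightly across PyTorch versions and hardware, and the timings above depend on your machine.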
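If you would rather run all five in one go and see how the timings compare on your own machine, a small runner like the one below is enough. This is a sketch, not part of the book's code: the script names come from the listing above, while `run_timed`, `run_all`, and the assumption that the file is saved inside code/projects are illustrative choices.

```python
import subprocess
import sys
import time

# Script names are taken from the listing above; the runner itself is a
# hypothetical helper, assumed to be saved inside code/projects.
SCRIPTS = ["gpt.py", "finetune_lora.py", "agent.py", "cnn.py", "diffusion.py"]


def run_timed(cmd, cwd=None):
    """Run one command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, cwd=cwd)
    return result.returncode, time.perf_counter() - start


def run_all(scripts=SCRIPTS, cwd="."):
    """Run each script with the current interpreter and collect the timings."""
    report = []
    for name in scripts:
        code, seconds = run_timed([sys.executable, name], cwd=cwd)
        report.append((name, code, seconds))
    return report


if __name__ == "__main__":
    for name, code, seconds in run_all():
        status = "ok" if code == 0 else f"failed (exit {code})"
        print(f"{name:<20} {status:<18} {seconds:6.1f}s")
```

Using `sys.executable` keeps every script on the same interpreter (and therefore the same PyTorch install) that launched the runner, which avoids surprises when several Python environments are on your PATH.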

What "production-ready" means here

These projects teach the architecture and training correctly, which is the part that doesn't change with scale. But a model that runs as a script on your laptop isn't deployed. Each chapter ends with a "Make it production" section that connects the project to the rest of this book and its sister volumes:

Serve it — wrap it in an API, containerize it, monitor it (the Production ML & AI Tools book).
Track it — log experiments and version the model (Ch 26, the MLflow chapters of the tools book).
Scale it — the same code, bigger model, more data, a GPU, mixed precision (Ch 16).
Retrieve for it — ground an LLM in your data with RAG (Ch 14, the HNSW/IVF-PQ books).

The gap between "I trained a tiny GPT" and "I shipped an LLM feature" is engineering, not understanding — and you now have both halves.

A note on scaling up

The main difference between these projects and the real systems is size: more parameters, more data, more compute, plus a few production techniques (mixed precision, distributed training, learning-rate schedules). The core architecture does not change. When you read that GPT-4 is "a transformer," you'll know precisely what that means, because you built one. When you read that a team "fine-tuned with LoRA," you'll know exactly what they did, because you did it too.
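To make "size" concrete, the back-of-the-envelope arithmetic below counts parameters with the common approximation of 12 · n_layers · d_model² for the non-embedding weights of a standard decoder-only transformer, and rank · (d_in + d_out) for the weights LoRA adds to one matrix. It is a rough sketch: the "tiny" configuration is a made-up laptop-sized example, not the numbers used in gpt.py, and biases, LayerNorm, and embeddings are ignored.

```python
def transformer_block_params(n_layers, d_model):
    """Approximate non-embedding parameter count of a decoder-only transformer.

    Per layer: 4*d^2 for attention (Q, K, V, and output projections) plus
    8*d^2 for an MLP with a 4x hidden width. Biases and LayerNorm are ignored.
    """
    return 12 * n_layers * d_model ** 2


def lora_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    The update is B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Hypothetical laptop-sized configuration (an assumption, not read from gpt.py).
tiny = transformer_block_params(n_layers=4, d_model=128)

# Published GPT-2 small shape: 12 layers, width 768.
gpt2_small = transformer_block_params(n_layers=12, d_model=768)

# LoRA at rank 8 on a single 768 x 768 projection.
full = 768 * 768
lora = lora_params(d_in=768, d_out=768, rank=8)

print(f"tiny model:   {tiny:,} parameters")
print(f"GPT-2 small:  {gpt2_small:,} parameters (non-embedding)")
print(f"LoRA, rank 8: {lora:,} of {full:,} weights ({lora / full:.1%})")
```

The same two formulas cover both ends of the scale: roughly 0.8 million parameters for the tiny configuration, about 85 million for GPT-2 small before embeddings, and about 2% of a single matrix's weights trained under rank-8 LoRA. Only the numbers passed in change.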

The takeaway

These five projects turn every concept in this book into working code you can run, read, and extend. Train a transformer, fine-tune it with LoRA, give it tools, train a vision model, and generate with diffusion — the actual modern AI stack, on your CPU, today. Pick any one and run it. Let's start with the project at the center of the AI universe: a GPT. 👉