The interview playbook

You now hold the whole foundation. This final chapter turns it into interview success: the question types you'll face, a rapid-fire concept bank with crisp answers, the coding drills that recur, and a prep plan. Treat it as the checklist before you walk in.

The five rounds of an ML interview

Round	What it tests	Where this book prepared you
Coding (DS&A)	general programming	(LeetCode — outside this book)
ML coding	implement an algorithm from scratch	Chapters 1, 8, 11, 18, 20–23
ML concepts	breadth & depth of fundamentals	the whole book
ML system design	end-to-end system thinking	Chapter 28
Behavioral	collaboration, impact, judgment	(your stories)

Most candidates over-prepare DS&A and under-prepare ML concepts and system design — the rounds that actually differentiate. Invest where this book points.

Rapid-fire concept bank

Practice saying each answer out loud in 30–60 seconds. If any feels shaky, reread the linked chapter.

Fundamentals

Bias–variance tradeoff? Underfitting (high bias) vs. overfitting (high variance); total error balances both. Ch 9
Overfitting — detect and fix? Train ≪ validation error; fix with more data, regularization, dropout, early stopping, simpler model. Ch 9
L1 vs. L2? L1 → sparse (feature selection); L2 → smooth shrinkage (weight decay). Ch 9
Generative vs. discriminative? Model P(x,y) vs. P(y|x). Naive Bayes vs. logistic regression.
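
The L1 vs. L2 answer can be made concrete with a one-weight toy problem. The sketch below assumes a simplified objective (a squared distance to an unregularized weight `w0` plus a penalty of strength `lam`), which has a known closed-form minimizer for each penalty. The function names and the sample weights are illustrative, not taken from Chapter 9.

```python
def l2_shrink(w0, lam):
    # Minimizer of 0.5*(w - w0)**2 + 0.5*lam*w**2: scale toward zero.
    return w0 / (1.0 + lam)


def l1_soft_threshold(w0, lam):
    # Minimizer of 0.5*(w - w0)**2 + lam*abs(w): shift toward zero,
    # and clamp to exactly zero when |w0| <= lam.
    if w0 > lam:
        return w0 - lam
    if w0 < -lam:
        return w0 + lam
    return 0.0


weights = [3.0, 0.4, -0.2, -2.5]  # hypothetical unregularized weights
lam = 0.5

l2_weights = [l2_shrink(w, lam) for w in weights]
l1_weights = [l1_soft_threshold(w, lam) for w in weights]

print("L2:", l2_weights)  # every weight shrinks, none reaches zero
print("L1:", l1_weights)  # → [2.5, 0.0, 0.0, -2.0]
```

Under this setup L2 scales every weight by the same factor and leaves all of them non-zero, while L1 sets the two small weights to exactly zero. That is the "sparse vs. smooth shrinkage" contrast in the answer above.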

Algorithms

Bagging vs. boosting? Parallel independent trees averaged (↓variance) vs. sequential error-correcting trees (↓bias). Ch 20
Why do gradient-boosted trees often beat neural nets on tabular data? They handle mixed feature types, need little tuning, capture feature interactions, and are robust to unscaled features. Ch 20
k-NN vs. k-means? Supervised classification (k voters) vs. unsupervised clustering (k groups). Ch 20, Ch 21
How does a decision tree choose splits? Maximize impurity reduction (Gini/entropy). Ch 20
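
The split-selection answer can be sketched for a single numeric feature. This is a minimal illustration assuming Gini impurity and candidate thresholds at midpoints between sorted unique values. The helper names (`gini`, `best_split`) and the toy data are hypothetical, not the book's Chapter 20 code.

```python
def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(xs, ys):
    # Try each midpoint threshold and keep the one with the largest
    # impurity reduction (parent impurity minus weighted child impurity).
    parent = gini(ys)
    n = len(ys)
    best_thr, best_gain = None, 0.0
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
        gain = parent - weighted
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain


xs = [1, 2, 3, 10, 11, 12]  # toy feature values
ys = [0, 0, 0, 1, 1, 1]     # toy class labels
threshold, gain = best_split(xs, ys)
print(threshold, gain)  # → 6.5 0.5
```

A full tree would repeat this search over every feature and recurse on the two child subsets. Swapping `gini` for an entropy function changes the criterion but not the structure.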

Deep learning

Why activation functions? Without non-linearity, stacked layers collapse to one linear layer. Ch 11
Vanishing gradients — cause and fix? Sigmoid/tanh saturate; fix with ReLU, residual connections, normalization, good init. Ch 11
What is attention? softmax(QKᵀ/√d)V, where d is the key dimension: each token weights every token (itself included) by query–key similarity. Ch 13
Adam vs. SGD? SGD applies one global learning rate; Adam adds momentum and a per-parameter adaptive step size, which makes it a robust default. Ch 8
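
The attention formula above can be written out directly. The sketch below is a single-head, unmasked version in plain Python lists, with no learned projections. The toy `Q`, `K`, and `V` values are made up for illustration.

```python
import math


def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    # softmax(Q K^T / sqrt(d)) V, one query row at a time.
    d = len(K[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))]
        )
    return outputs, weights


# Toy inputs: each query strongly matches exactly one key.
Q = [[10.0, 0.0], [0.0, 10.0]]
K = [[10.0, 0.0], [0.0, 10.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

out, attn = attention(Q, K, V)
print(attn)  # each row sums to 1
print(out)   # each output is close to the value of the matching key
```

Because each query here aligns with one key, nearly all of the attention weight lands on that key and the output rows are approximately `[1, 2]` and `[3, 4]`. With less peaked scores the output becomes a blend of the value rows.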

Stats & evaluation

Explain a p-value. P(data at least this extreme | null true); not P(null true), and not an effect size. Ch 22
Precision vs. recall — which when? Costly false positives → precision (spam); costly false negatives → recall (cancer). Ch 10
Why is accuracy bad for imbalanced data? "Always predict majority" scores high yet is useless; use F1/AUC. Ch 10
What is ROC-AUC? Threshold-free ranking quality: the probability that a random positive scores above a random negative, P(score(pos) > score(neg)). Ch 10
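The base-rate fallacy? With a rare condition, even a "99% accurate" test can yield mostly false positives, because the few true positives are outnumbered by false alarms from the much larger negative group. Ch 22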
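
The base-rate answer is easiest to defend with the arithmetic in hand. The sketch below applies Bayes' rule, reading "99% accurate" as 99% sensitivity and 99% specificity and assuming a 0.1% prevalence. Both the reading and the numbers are assumptions chosen for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    # Bayes' rule: P(condition | positive test).
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


ppv = positive_predictive_value(
    prevalence=0.001,   # assumed: 1 in 1,000 people has the condition
    sensitivity=0.99,   # assumed: P(test positive | condition)
    specificity=0.99,   # assumed: P(test negative | no condition)
)
print(f"P(condition | positive test) = {ppv:.3f}")  # → 0.090
```

Under these assumptions only about 9% of positive results are true positives. Raising the prevalence to 0.5 in the same function gives 0.99, which shows that the base rate drives the result rather than the test itself.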

LLMs & modern

RAG vs. fine-tuning? RAG adds knowledge (retrieved at query time); fine-tuning adds behavior (baked into weights). Ch 15
What is LoRA? Low-rank weight updates: freeze the base weights and train small low-rank matrices, so the trainable parameters are often under 1% of the model's. Ch 27
What does temperature do? Divides the logits before the softmax, which scales the randomness of token sampling; low = focused, high = creative. Ch 15
Why do LLMs hallucinate? They optimize plausible next tokens, not truth; mitigate with RAG and verification. Ch 15
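
The temperature answer can be shown with a few lines of arithmetic. The sketch below assumes the common formulation in which logits are divided by the temperature before the softmax. The three-token logits are made up.

```python
import math


def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature, then apply a stable softmax.
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical next-token logits

cold = softmax_with_temperature(logits, 0.5)
neutral = softmax_with_temperature(logits, 1.0)
hot = softmax_with_temperature(logits, 2.0)

for name, probs in [("T=0.5", cold), ("T=1.0", neutral), ("T=2.0", hot)]:
    print(name, [round(p, 3) for p in probs])
```

The top token's probability falls as the temperature rises, so sampling at low temperature is closer to always picking the most likely token and sampling at high temperature spreads choices across more tokens.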

Coding drills (implement from scratch, no libraries)

These come up in ML-coding rounds. You've already built most of them in this book — redo them on a blank page until fluent:

Gradient descent for linear/logistic regression. Ch 1, Ch 6
k-means clustering. Ch 21
k-NN classifier. Ch 20
Backprop for a 2-layer net. Ch 11
Softmax / sigmoid / cross-entropy (numerically stable). Ch 17, Ch 18
Precision/recall/F1/AUC from predictions. Ch 10
Cosine similarity / top-k retrieval. Ch 4, Ch 18
Train/test split & a CV loop. Ch 18
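
As one example from this list, the numerically stable sigmoid and cross-entropy drill might look like the sketch below. It uses only the standard library and the usual log-sum-exp trick. The function names are illustrative and not necessarily those used in Chapters 17 and 18.

```python
import math


def sigmoid(x):
    # Branch on the sign so exp() never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def log_softmax(logits):
    # log-sum-exp trick: factor out the max before exponentiating.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def cross_entropy(logits, target_index):
    # Negative log-probability of the target class, computed from logits.
    return -log_softmax(logits)[target_index]


print(sigmoid(-1000.0))                     # no OverflowError
print(cross_entropy([1000.0, 0.0], 0))      # near 0: confident and correct
print(cross_entropy([0.0, 0.0], 0))         # log(2) for a uniform guess
```

A naive `1 / (1 + math.exp(-x))` raises `OverflowError` for `x = -1000`, and a naive softmax overflows on a logit of 1000. Interviewers often probe exactly these edge cases.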

The recipe book is your cheat sheet — but practice writing them without it.
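
A blank-page version of the metrics drill could look like the following sketch. It assumes binary labels in {0, 1} and computes AUC by brute-force pairwise comparison, which is quadratic but easy to get right under interview pressure. The function names and toy data are illustrative.

```python
def precision_recall_f1(y_true, y_pred):
    # Count the confusion-matrix cells needed for the three metrics.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(precision_recall_f1(y_true, y_pred))  # → (0.5, 0.5, 0.5)
print(roc_auc(y_true, scores))              # → 0.75
```

Returning 0.0 when a denominator is empty is one convention among several. It is worth stating out loud which one you chose and why.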

How to answer well (meta-skills)

Think out loud. Interviewers grade your reasoning, not just the answer. Narrate trade-offs.
Start simple, then iterate. Baseline first ("I'd start with logistic regression to establish a number"), then add complexity with justification.
Say "it depends" — then say on what. Almost every real answer is conditional; naming the conditions is the signal.
Admit unknowns gracefully. "I haven't used X, but it's like Y because…" beats bluffing. Reasoning from fundamentals is the whole point of this book.
Tie back to business impact. "This raises NDCG, which should lift engagement, which we'd confirm with an A/B test."

A 4-week prep plan

Week 1 — Fundamentals. Re-read Parts I–II and the concept bank; explain each aloud. Redo the from-scratch gradient descent and metrics.
Week 2 — Algorithms & math. Chapters 20–24; implement k-means, k-NN, a decision-tree split; drill probability and the p-value/Bayes questions.
Week 3 — Deep learning & modern. Chapters 11–15, 27; be able to whiteboard a training loop and explain attention, RAG, and LoRA.
Week 4 — System design & mocks. Chapter 28; practice 5–6 design prompts out loud under time; do mock interviews; prepare behavioral stories (impact, conflict, failure).

The takeaway

Interviews reward structured fundamentals over memorized trivia. The differentiating rounds are ML concepts and system design — exactly what this book built. Rehearse the concept bank aloud, re-implement the core algorithms on a blank page, walk the system-design framework, think out loud, start simple, and tie everything to impact. You can now define and code every term an ML engineer will throw at you — and reason from first principles when you meet a new one.

That was the goal of this whole book. Go get the job — and to prove you can do it (to yourself and to them), the final part is five complete projects you build end to end: a GPT, a LoRA fine-tune, an agent, a CNN, and a diffusion model. Let's go build. 👉

AI Foundations in Depth