Introduction
No background required. This book assumes you know nothing about machine learning and only a little Python. Every concept, symbol, and line of code is explained, and every code example is followed by the exact output it produces. If a word looks scary, it's defined the first time it appears.
The problem this book solves
You can sit down with an ML engineer, follow 80% of what they say, and then hit a single word that stops you cold — logit, kernel, embedding, broadcasting, autograd, cross-entropy, feature, tensor — and suddenly the rest of the sentence is noise. None of these words are hard. They're just rarely defined from the ground up, because every book assumes the other books taught them to you.
This is the book that teaches them. It's the missing prerequisite — a hitchhiker's guide to the vocabulary, math, and tools that sit underneath every AI project, whether that project is a recommender, a search engine, a fraud model, or a large language model.
How this book is different
Three rules, on every page:
- Show the output. Every snippet is followed by a text block with exactly what it prints. You never have to wonder "what does this actually do?"
- Build it from scratch, then name it. We compute cosine similarity with np.dot before telling you it's called cosine similarity. Understanding first, jargon second.
- "Don't be confused" boxes. The field is full of near-synonyms and collisions: normalization vs. standardization, loss vs. metric, parameter vs. hyperparameter, the five different things called "kernel." These boxes pull them apart explicitly.
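As a taste of the second rule, here is a minimal sketch of that computation. The two vectors are made-up illustration values, and the variable names are our own choices, not ones taken from a later chapter.

```python
import numpy as np

# Two small example vectors (hypothetical values, chosen so the arithmetic is easy).
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

# Step 1: multiply matching entries and add them up: 3*4 + 4*3 = 24.
dot = np.dot(a, b)

# Step 2: the length of each vector is the square root of its dot product with itself.
len_a = np.sqrt(np.dot(a, a))  # sqrt(9 + 16) = 5
len_b = np.sqrt(np.dot(b, b))  # sqrt(16 + 9) = 5

# Step 3: divide the dot product by the two lengths multiplied together.
cos = dot / (len_a * len_b)    # 24 / 25

print(round(float(cos), 2))
```

Run as written, this prints 0.96. The number computed in step 3 is the quantity called cosine similarity: a dot product divided by the lengths of the two vectors.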
A map of the journey
- Part I — The data & the vocabulary. What a model is; tensors, shapes and broadcasting; features and feature extraction; the similarity/distance recipes (cosine, dot, Euclidean, RBF…); and the word kernel in all five of its meanings.
- Part II — How learning actually works. Linear and logistic models; loss functions; gradient descent and backpropagation; overfitting and regularization; evaluation metrics.
- Part III — Neural networks & PyTorch. A neural net built by hand in NumPy, then the same network in PyTorch; the deep-learning zoo (CNN, RNN, Transformer); and embeddings.
- Part IV — The modern AI stack. Tokens, transformers and LLMs; the training pipeline in practice; and a compendium of numerical gotchas.
- Part V — The classical ML toolkit. The algorithms that rule tabular data and interviews — k-NN, Naive Bayes, decision trees, random forests, gradient boosting (XGBoost), SVMs — plus clustering and PCA.
- Part VI — The math & stats interviews assume. Probability and statistics (Bayes, MLE, hypothesis testing), the linear algebra you actually need (eigen, SVD), and A/B testing.
- Part VII — Shipping it & the landscape. The practitioner's toolkit (Python, SQL, Pandas, scikit-learn, tuning), cloud & MLOps, and the modern frontier (LoRA, RAG, agents, diffusion, quantization, vector DBs).
- Part VIII — Interview success. A repeatable ML system-design framework and a full interview playbook (concept bank, coding drills, prep plan).
- Part IX — Reference. A copy-paste recipe book and a glossary of the words interviewers assume you know.
The first four parts are the deep-learning spine and are meant to be read in order. Parts V–VIII broaden you into a complete, hands-on, interview-ready engineer; they can be read any time after Part II, in any order.
What you'll be able to do by the end
Define and code — in a few lines of NumPy — every term in the glossary. Read a PyTorch training loop and say what each line does. Look at a similarity score and know whether it's a distance or a similarity, whether it's normalized, and what would change if you swapped the metric. Walk into any AI conversation without flinching.
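To make the "swap the metric" question concrete, here is a small sketch under our own assumptions: the vectors are invented for illustration, and b is deliberately built to point in the same direction as a while being twice as long.

```python
import numpy as np

# Hypothetical example vectors: b is exactly 2 * a.
a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 4.0, 4.0])

# Dot product: a similarity score that is NOT normalized (it grows with vector length).
dot = float(np.dot(a, b))

# Cosine similarity: the dot product divided by both lengths, so it is normalized.
cosine = dot / float(np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance: the length of the difference vector (smaller means closer).
euclid = float(np.linalg.norm(a - b))

print("dot product:", dot)
print("cosine similarity:", cosine)
print("Euclidean distance:", euclid)
```

This prints a dot product of 18.0, a cosine similarity of 1.0, and a Euclidean distance of 3.0. The same pair of vectors is "identical" under cosine similarity, which looks only at direction, and "3 apart" under Euclidean distance, which also sees the difference in length. That gap is what changes when you swap the metric.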
What you need
Python 3, NumPy, and (for one chapter) PyTorch. The CPU build of PyTorch is plenty. Every runnable program lives in code/ alongside the book. If you've never written Python, read the 5-minute primer next; otherwise skip straight to "What is a model, really?" 👉