A 5-minute primer: Python, NumPy & the mental model
This page gives you just enough to read every example in the book. Skip it if you're already comfortable with NumPy arrays.
Reading the code boxes
Grey boxes contain Python; the box right after shows what it prints:
print("hello")
print(2 + 3)
Output:
hello
5
Variables, lists, functions
x = 10 # x now refers to the integer 10
words = ["cat", "dog"] # a list: an ordered collection
def add(a, b): # define a reusable function
    return a + b # "return" hands a result back
print(add(2, 3))
Output:
5
NumPy: the array is everything
NumPy is the standard library for fast number-crunching in Python; most of the
math in AI runs on it or on a GPU-capable relative such as PyTorch. By
convention it is imported under the nickname np. Its one big idea is the array:
a grid of numbers you operate on all at once, instead of looping.
import numpy as np
v = np.array([2.0, 0.5, 1.0]) # a 1-D array = a vector
print(v)
print("shape:", v.shape) # how big it is, per dimension
print("v * 2:", v * 2) # operations apply to every element
Output:
[2.  0.5 1. ]
shape: (3,)
v * 2: [4. 1. 2.]
That last line is the whole point: v * 2 multiplied every element without a
for loop in your code. This is called vectorization. The loop still happens, but
inside NumPy's compiled code rather than in Python, and that is why NumPy is fast.
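To convince yourself that the vectorized form computes the same thing as a hand-written loop, here is a minimal sketch. The names doubled_loop and doubled_vec are just illustrative and do not appear elsewhere in the book.

```python
import numpy as np

v = np.array([2.0, 0.5, 1.0])

# Explicit Python loop: visit each element one at a time.
doubled_loop = np.empty_like(v)
for i in range(len(v)):
    doubled_loop[i] = v[i] * 2

# Vectorized: one expression; the looping happens inside NumPy.
doubled_vec = v * 2

# Both approaches produce the same values.
print(np.array_equal(doubled_loop, doubled_vec))
```

Both versions give the same array, so this prints True. On three numbers the speed difference is invisible; it tends to matter once arrays hold thousands or millions of elements.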
Vectors, matrices, and the word "shape"
- A vector is a 1-D array — a single row of numbers, a point in space.
- A matrix is a 2-D array — a grid with rows and columns.
- .shape tells you the size along each dimension. (3,) is a length-3 vector; (2, 3) is a 2-row, 3-column matrix.
M = np.array([[1, 2, 3],
              [4, 5, 6]])
print("shape:", M.shape) # (rows, columns)
print("M.T:\n", M.T) # transpose: rows become columns
Output:
shape: (2, 3)
M.T:
 [[1 4]
 [2 5]
 [3 6]]
The two operations you'll see constantly
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print("dot product:", np.dot(a, b)) # 1*4 + 2*5 + 3*6 = 32
print("elementwise :", a * b) # [4, 10, 18] — NOT a dot product
Output:
dot product: 32.0
elementwise : [ 4. 10. 18.]
Don't be confused:
* vs @ / np.dot. a * b multiplies element by element and keeps the same shape. a @ b (matrix multiply) and np.dot(a, b) sum those products into a single number (for vectors). Mixing these up is one of the most common NumPy bugs. The dot product is the engine of nearly every similarity measure in this book.
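The difference is easiest to see on matrices, where the two operators give visibly different results. The sketch below uses small made-up arrays (A, B, M2 and w are illustrative names, not from the book).

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

elementwise = A * B  # multiply matching positions; shape stays (2, 2)
matmul = A @ B       # each row of A dotted with each column of B

print(elementwise)   # [[ 10  40] [ 90 160]]
print(matmul)        # [[ 70 100] [150 220]]

# A matrix times a vector contracts the shared dimension: (2, 3) @ (3,) -> (2,)
M2 = np.array([[1, 2, 3],
               [4, 5, 6]])
w = np.array([1, 0, 1])
print((M2 @ w).shape)  # (2,)
```

If a result has an unexpected shape, checking whether a * should have been an @ (or the reverse) is a reasonable first step.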
The mental model of "learning"
Here is the entire field in five words: adjust numbers to reduce error.
A model is a box of adjustable numbers (called parameters or weights). You show it examples, measure how wrong it is (the loss), and nudge the numbers in the direction that makes the loss smaller. Repeat, often millions of times. That's it: linear regression and a 100-billion-parameter language model differ in scale and architecture, not in this core loop. We make each part of that phrase concrete in Chapter 1.
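As a rough sketch of that loop, here is a model with a single adjustable number. It assumes mean squared error as the loss and plain gradient descent as the nudging rule, which are common choices but not the only ones; the toy data, the learning rate, and the step count are made up for illustration.

```python
import numpy as np

# Toy data generated by y = 3 * x, so the "right" weight is 3.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

w = 0.0               # the single adjustable number (the parameter)
learning_rate = 0.01  # size of each nudge (assumed value)

for step in range(200):
    prediction = w * x
    error = prediction - y
    loss = np.mean(error ** 2)         # how wrong we are right now
    gradient = np.mean(2 * error * x)  # slope of the loss with respect to w
    w -= learning_rate * gradient      # nudge w in the downhill direction

print("learned w:", w)
print("final loss:", loss)
```

After 200 nudges w ends up very close to 3 and the loss is close to zero. Larger models follow the same pattern with many more numbers in place of w.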
NumPy bits we use throughout
| You'll see | Meaning |
|---|---|
| np.array([...]) | make a vector / matrix |
| a * b | elementwise multiply (same shape out) |
| a @ b, np.dot(a, b) | dot / matrix product (contracts a dimension) |
| np.linalg.norm(v) | length of a vector |
| X.shape, X.T | size per dimension; transpose |
| X.mean(axis=0) | average down each column |
| np.exp, np.log | $e^x$ and natural log, elementwise |
| np.argsort(d) | indices that would sort d |
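The last few rows of the table have not appeared in an example yet, so here is a short sketch of each one on made-up values (the variable names are illustrative).

```python
import numpy as np

v = np.array([3.0, 4.0])
length = np.linalg.norm(v)     # sqrt(3**2 + 4**2) = 5.0

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
col_means = X.mean(axis=0)     # average down each column: [2. 3.]

roundtrip = np.log(np.exp(v))  # log undoes exp, element by element

d = np.array([0.9, 0.1, 0.5])
order = np.argsort(d)          # positions from smallest to largest: [1 2 0]

print(length, col_means, roundtrip, order)
```

Note that np.argsort returns positions, not values: d[order] would give the sorted values themselves.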
That's everything you need. Next: what a "model" actually is. 👉