A 5-minute primer: vectors, clusters & recall

Just enough to read every example. Skip if you're comfortable with Python, NumPy, and nearest-neighbor basics (the HNSW primer covers the same ground in more detail).

Code boxes show their output

import numpy as np
print(np.array([1.0, 2.0, 3.0]).mean())

Output:

2.0

Vectors and distance

A vector is a list of numbers — a point in space. The squared Euclidean distance between two vectors is the sum of squared differences (we skip the square root because it doesn't change which point is nearest):

import numpy as np
a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
print(np.sum((a - b) ** 2))     # 9 + 16

Output:

25.0

In this book vectors are usually embeddings — numeric fingerprints of text, images, products, or users, arranged so similar items sit close together. We just assume we're handed them.

Clusters and centroids

A cluster is a group of nearby vectors; its centroid is the group's average — a single vector that "represents" the group. The whole book leans on one idea: replace many vectors by a few representative centroids.

  • IVF uses centroids to define regions (search only the nearby region).
  • Quantization uses centroids as a palette (store the nearest centroid's id instead of the full vector).

The algorithm that finds good centroids is k-means, built from scratch in Chapter 2.

Recall: measuring an approximate answer

These methods are approximate — fast and small, but they occasionally miss a true neighbor. We measure quality with recall:

recall@k = (how many of the true top-k neighbors we returned) ÷ k.

Recall 1.0 = perfect; 0.9 = we found 9 of the true 10. Throughout, we compare against exact brute-force search to compute recall honestly.

true_top10  = {3, 8, 12, 19, 25, 31, 44, 50, 61, 77}
approx_top10 = {3, 8, 12, 19, 25, 31, 44, 50, 61, 99}   # missed 77, added 99
recall = len(true_top10 & approx_top10) / 10
print(recall)

Output:

0.9

NumPy bits we use

You'll seeMeaning
np.argsort(d)[:k]indices of the k smallest values in d
(X - q) ** 2).sum(1)squared distance from q to every row of X
X @ C.Tall pairwise dot products between rows of X and C
np.where(a == c)[0]positions where assignment array a equals cell c
.astype(np.uint8)store small integers (0–255) in one byte each

That last one is the heart of quantization: turning floats into single bytes. On to why exact search doesn't fit in memory. 👉