A 5-minute primer: users, items & vectors

Just enough Python, NumPy, and vocabulary to read every example. Skip this section if you're already comfortable with all three (the HNSW primer covers vectors in more depth).

The core vocabulary

Term                 Meaning
user                 the person we recommend to
item                 the thing we recommend (article, product, movie, song)
interaction          a user doing something with an item (click, view, buy, rate)
catalog              all the items we could recommend
embedding / vector   a list of numbers representing an item or user
top-k                the k items we actually show (e.g. top-10)

The whole field is about one question: given who the user is and what they've done, which items should be in their top-k?

Code boxes show their output

print("recommended:", [42, 7, 13])

Output:

recommended: [42, 7, 13]

A tiny bit of Python

history = [5, 12, 3]              # a list: item ids the user interacted with
scores = {7: 0.9, 2: 0.4}        # a dict: item id -> predicted score

def top_k(scores, k):            # a function
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(top_k({7: 0.9, 2: 0.4, 5: 0.7}, 2))

Output:

[7, 5]

sorted(..., reverse=True)[:k] is the heart of every recommender: score the items, sort high-to-low, keep the top k.
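As a small sketch of the score, sort, keep-top-k pattern in use, the snippet below also drops items the user has already interacted with before ranking. That filtering step is an assumption added for illustration, not something the text above prescribes, and the name top_k_unseen is hypothetical.

```python
history = [5, 12, 3]                      # item ids the user already interacted with
scores = {7: 0.9, 2: 0.4, 5: 0.7}         # item id -> predicted score (toy numbers)

def top_k_unseen(scores, history, k):
    """Rank items by score, skipping anything already in the user's history."""
    seen = set(history)                   # set lookup is fast
    candidates = [item for item in scores if item not in seen]
    return sorted(candidates, key=scores.get, reverse=True)[:k]

print(top_k_unseen(scores, history, 2))   # -> [7, 2]  (item 5 is skipped: already seen)
```

Whether to hide already-seen items depends on the product: re-recommending a song is often fine, re-recommending a purchased fridge usually is not.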

NumPy and vectors

NumPy (imported as np by convention) does fast math on arrays of numbers. A vector is a list of numbers, which you can picture as a point in space. Similar items get nearby vectors.

import numpy as np
a = np.array([1.0, 0.0, 0.0])    # an "action movie" vector
b = np.array([0.9, 0.1, 0.0])    # similar
c = np.array([0.0, 0.0, 1.0])    # a "documentary" vector
print("a·b =", np.dot(a, b))     # high -> similar
print("a·c =", np.dot(a, c))     # low  -> different

Output:

a·b = 0.9
a·c = 0.0

The dot product measures similarity: it is large when two vectors point the same way, but it also grows with their lengths. Cosine similarity is the dot product after scaling both vectors to length 1, so it measures direction rather than size. It is the workhorse similarity in this book.
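To make the difference between dot product and cosine concrete, here is a minimal sketch. The helper name cosine and the vector big_a are made up for this example, and the function assumes neither vector is all zeros (otherwise it would divide by zero).

```python
import numpy as np

def cosine(u, v):
    """Dot product of u and v after scaling each to length 1."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 0.0, 0.0])      # the "action movie" vector from above
big_a = 10 * a                     # same direction, 10x the length
c = np.array([0.0, 0.0, 1.0])      # the "documentary" vector

print("dot(a, big_a)    =", float(np.dot(a, big_a)))   # -> 10.0 (inflated by length)
print("cosine(a, big_a) =", cosine(a, big_a))          # -> 1.0  (same direction)
print("cosine(a, c)     =", cosine(a, c))              # -> 0.0  (unrelated)
```

The dot product jumps to 10.0 just because big_a is longer, while cosine stays at 1.0 because the direction has not changed.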

The user-item matrix

Most recommenders start from a big table R: one row per user, one column per item, with a mark in each cell where that user interacted with that item. It's mostly empty, because a user touches only a tiny fraction of the catalog; we call that sparse.

          item0  item1  item2  item3
 user0      1      .      1      .
 user1      .      1      .      1
 user2      1      1      .      .

A 1 = "interacted"; . = "no data" (not "disliked" — an important distinction we'll return to). Recommending = filling in the blanks: predicting which empty cells the user would light up next.
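The small table above can be built in NumPy as a sketch. Storing 0 for "no data" in a dense array is a simplification assumed here for readability; at real catalog sizes a sparse format that stores only the filled cells would likely be used instead.

```python
import numpy as np

# (user, item) pairs read off the table above
interactions = [(0, 0), (0, 2), (1, 1), (1, 3), (2, 0), (2, 1)]

R = np.zeros((3, 4), dtype=int)    # 3 users x 4 items; 0 stands in for "."
for user, item in interactions:
    R[user, item] = 1              # mark "interacted"

print(R)
print("fraction filled:", R.mean())                     # -> 0.5
print("blanks for user0:", np.flatnonzero(R[0] == 0))   # -> [1 3]
```

The blanks for user0 (item1 and item3) are exactly the cells a recommender would try to score for that user.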

NumPy bits we use

You'll see              Meaning
np.argsort(-scores)     indices ordered by highest score first (these are item ids when position i holds item i's score)
X @ Y.T                 all pairwise dot products between rows of X and rows of Y
np.linalg.norm(v)       length of a vector (for cosine)
np.exp(-x)              the decay curve used for recency weighting
np.linalg.solve(A, b)   solve A x = b (used in matrix factorization)
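Each of these calls can be tried on toy inputs. The arrays below are invented for illustration, and treating x as "age in days" for the decay curve is an assumption, not a convention taken from the text.

```python
import numpy as np

scores = np.array([0.4, 0.9, 0.7])           # position i = score of item i
order = np.argsort(-scores)                  # -> [1 2 0], best item first

X = np.array([[1.0, 0.0], [0.0, 1.0]])               # 2 vectors
Y = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])   # 3 vectors
S = X @ Y.T                                  # shape (2, 3); S[i, j] = X[i] . Y[j]
                                             # -> [[1. 1. 0.], [0. 1. 2.]]

length = np.linalg.norm(np.array([3.0, 4.0]))   # -> 5.0 (a 3-4-5 triangle)

age_days = np.array([0.0, 1.0, 2.0])         # hypothetical interaction ages
weights = np.exp(-age_days)                  # 1.0 for today, shrinking with age

A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)                    # -> [1. 2.], since A @ x equals b

print(order, S.shape, length, weights.round(3), x)
```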

That's the toolkit. Now, the problem itself and the landscape of solutions. 👉