The papers: understand & reproduce
Like the other books, we built these methods as if from thin air. They come from a line of research papers. This chapter tells that story: what the key paper says, its lineage, how to read it, and an honest reflection on what we reproduced versus what FAISS adds.
The key paper
Hervé Jégou, Matthijs Douze, Cordelia Schmid. Product Quantization for Nearest Neighbor Search. IEEE TPAMI, 2011.
This paper introduced both halves of what we built:
- Product Quantization (Section 3) — split vectors, quantize each sub-space, store the codes, and compute distances by table lookup (ADC).
- IVFADC (Section 4) — combine an inverted file (coarse quantizer) with PQ
on residuals, which is exactly our
IVFPQ. The "ADC" is the asymmetric distance computation we implemented.
The four moving parts of the paper map onto our code:
| Paper concept | Our code |
|---|---|
| Coarse quantizer (k-means) | kmeans, IVF.train |
| Inverted lists | IVF.add / IVFPQ.add |
| Product quantizer + codebooks | PQ.train, PQ.encode |
| Asymmetric Distance Computation | PQ.distance_table, PQ.adc_distances |
| Residual encoding (IVFADC) | IVFPQ.train / IVFPQ.search |
Lineage and descendants
- Vector quantization comes from classic signal compression (Lloyd's algorithm, 1957/1982 — the same k-means we use).
- Inverted files come from text retrieval, repurposed for vectors.
- Descendants: Optimized PQ (OPQ, Ge et al. 2013) learns a rotation before PQ; ScaNN (Google, 2020) adds anisotropic quantization tuned for inner product; FAISS packages all of these.
Recognizing this lineage helps: when the paper is terse about k-means or distance algebra, the older quantization literature fills the gap.
How to read a paper like this
The same three-pass approach as the other books:
- Skim (5 min). Abstract, Figure 2 (the PQ split), and the IVFADC system diagram. Decide what you need: here, PQ + IVFADC.
- Method, carefully. Read Sections 3–4. Write down, in your own words: how a
code is formed, how ADC computes distance from codes, why residuals are
encoded, and the parameters (
m,k*=ksub,nlist/coarsek',w=nprobe). - Reproduce. Implement and measure recall vs. brute force — the demo is the proof you understood it.
Tip: the paper reports recall@{1,10,100} and memory per vector. Reproducing those axes (not necessarily the exact numbers, which depend on the dataset) is the real test — and it surfaces the re-ranking insight, since raw recall@10 looks alarming until you shortlist-and-rescore.
From paper to code: the process
- Build the primitive first. k-means underlies everything, so we wrote and tested it before IVF or PQ.
- Translate the data structures. Codebooks → a
(m, ksub, dsub)array; inverted lists →np.where(assign == c); codes → auint8matrix. - Get ADC right. The table-lookup distance (
adc_distances) is the subtle part; a wrong axis or sum and recall collapses. - Add residual encoding for IVFPQ. Encoding
x - centroidinstead ofxis a one-line change with a big recall effect — exactly as the paper claims. - Validate against ground truth, and discover the re-ranking technique when raw recall looks low.
Reflection: what we built vs. production
Our implementation is algorithmically faithful — PQ, ADC, IVFADC, residuals, correct recall behavior. What we left out is performance engineering, which is most of FAISS:
| Aspect | This book | FAISS / ScaNN |
|---|---|---|
| Language | pure Python + NumPy | C++/CUDA, SIMD, PQFastScan |
| Training | full data, simple Lloyd | sampled, k-means++, many iters |
| Distance | NumPy per cell | register-resident table lookups, GPU |
| Extras | — | OPQ rotation, on-disk, sharding, deletes |
| Re-rank | manual top-100 rescoring | IndexRefineFlat |
Same lesson as the KTS and HNSW books: the algorithm is compact and learnable; the performance is careful systems work layered on top. A faithful from-scratch version is the right way to understand it — and a baseline to validate the fast version against.
See the References for the papers and their roots.