A worked demo
Let's confirm KTS actually works. We build a synthetic "video" whose true structure we know, run KTS, and check it recovers the planted boundaries.
The full script is code/demo.py; here's how it's built.
A fake video with known change points
To test KTS we need data where we already know the right answer. So we manufacture a fake video: we pick a few random "centers" (each stands for a shot), and for each shot we generate a block of frames that are that center plus a little random noise. Frames in the same block look alike; frames in different blocks look different. The boundaries between blocks are the true change points we hope KTS will rediscover.
def make_synthetic_sequence(seed=0):
"""Return (X, true_change_points)."""
rng = np.random.default_rng(seed)
d = 8
block_lengths = [40, 25, 35, 20, 30] # 5 segments
centers = rng.normal(scale=3.0, size=(len(block_lengths), d))
chunks, true_cps, cursor = [], [], 0
for i, L in enumerate(block_lengths):
chunks.append(centers[i] + rng.normal(scale=0.4, size=(L, d)))
cursor += L
if i < len(block_lengths) - 1:
true_cps.append(cursor) # boundary = start of next block
X = np.vstack(chunks)
return X, np.array(true_cps)
A couple of unfamiliar bits:
np.random.default_rng(seed)creates a reproducible source of random numbers — using the sameseedgives the same "random" data every run, so the results in this book are repeatable.rng.normal(scale=0.4, size=(L, d))drawsL × dsmall random numbers (the per-frame noise). A biggerscale= noisier frames = harder problem.np.vstack([...])stacks the blocks on top of each other into one big(n_frames, d)array.
We use 5 blocks of lengths [40, 25, 35, 20, 30], so the true change points are
at frames [40, 65, 100, 120] (4 cuts → 5 segments).
Running KTS
from kts import build_kernel, cpd_auto, cpd_nonlin
K = build_kernel(X, kind="cosine")
# automatic number of segments
cps_auto, scores = cpd_auto(K, ncp_max=12, vmax=1.0, lmin=5)
# or, if we already knew there were 5 segments:
cps_fixed, cost = cpd_nonlin(K, ncp=4, lmin=5)
What you get
Running python demo.py produces:
sequence length n = 150
true change points = [40, 65, 100, 120] (5 segments)
[cpd_auto] change points = [40, 65, 100, 120] (5 segments)
[cpd_nonlin] change points = [40, 65, 100, 120] (fixed m=4, total scatter=4.04)
fixed-m boundary errors (frames): [0, 0, 0, 0] (mean 0.00)
🎯 Both modes recover all four true change points exactly — and cpd_auto
independently figures out that there are 5 segments, without being told.
If Matplotlib is installed, the demo also saves kts_demo.png showing the feature
heatmap with true boundaries (white dashed) and detected boundaries (red)
overlaid.
Things to try next
- Make it harder. Increase the per-cluster noise (
scale=0.4) until segmentation starts to fail — that's the signal-to-noise limit. - Change the kernel. Swap
"cosine"for"rbf"and watch how the bandwidth $\sigma$ changes sensitivity. - Tune
vmax. Lower it toward 0 to over-segment; raise it to merge. - Plot the scores.
scoresfromcpd_autovs $m$ reveals the elbow that the penalty is exploiting. - Real features. Replace
Xwith CNN features extracted one-per-second from a real video, then map change points back to timestamps.
That's KTS, from the math all the way to a working segmenter. 🎉
Finally, let's step back and look at the research paper this all came from — how to read it, how to reproduce a paper like it, and where KTS is used. 👉