The MLflow model registry
Why this chapter matters: Chapter 19 used MLflow to track experiments. But once you've found a good model, how do you get it into production safely — and roll back if it misbehaves? That's the job of the model registry: a versioned catalog of models with a pointer to "the one that's live."
The problem it solves
Without a registry, "deploy a new model" means editing code or copying files onto servers — error-prone and hard to undo. With a registry:
- every trained model becomes a numbered version,
- one version is marked production (via an alias),
- the serving code always asks for "the production version,"
so deploying a better model is a one-line registry operation, not a code change — and rolling back is just pointing the alias at the previous version.
train run ──► register ──► version 1 ┐
train run ──► register ──► version 2 ├─ alias "production" ──► API loads this
train run ──► register ──► version 3 ┘ (move the alias to deploy/rollback)
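To make the diagram concrete, here is a minimal in-memory sketch of the idea. It is not MLflow's implementation, and the names (ToyRegistry, register, set_alias, resolve) are hypothetical; it only shows that versions are append-only and that an alias is a named pointer you can move.

```python
class ToyRegistry:
    """In-memory stand-in for a model registry (illustration only)."""

    def __init__(self):
        self._versions = []   # index i holds the model for version i + 1
        self._aliases = {}    # alias name -> version number

    def register(self, model):
        """Store a model and return its new, 1-based version number."""
        self._versions.append(model)
        return len(self._versions)

    def set_alias(self, alias, version):
        """Point an alias at an existing version (deploy or rollback)."""
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        self._aliases[alias] = version

    def resolve(self, alias):
        """Return the model the alias currently points at."""
        return self._versions[self._aliases[alias] - 1]


registry = ToyRegistry()
v1 = registry.register("model-A")
v2 = registry.register("model-B")

registry.set_alias("production", v2)      # deploy version 2
deployed = registry.resolve("production")

registry.set_alias("production", v1)      # roll back to version 1
rolled_back = registry.resolve("production")
```

The serving side only ever calls resolve("production"), so neither the deploy nor the rollback touches serving code.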
A prerequisite: a database backend
The registry needs a database-backed tracking store — the plain file store
(file:./mlruns) can't do it. Use SQLite locally (or Postgres/MySQL in
production):
export MLFLOW_TRACKING_URI=sqlite:///mlflow.db
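If you want a script to fail fast when the URI points at the file store, a small check like the following can help. This is a sketch under assumptions: uses_database_store is a hypothetical helper, not part of MLflow, and the scheme list reflects the database engines MLflow's SQL store is generally documented to support.

```python
from urllib.parse import urlparse

# Assumption: SQLAlchemy-style schemes that indicate a database-backed store.
DATABASE_SCHEMES = {"sqlite", "postgresql", "mysql", "mssql"}


def uses_database_store(tracking_uri: str) -> bool:
    """Return True if the URI scheme names a database engine.

    A driver suffix such as 'postgresql+psycopg2' is reduced to its dialect.
    An http(s) URI returns False: it points at a tracking server, and the
    server's own backend cannot be determined from the URI alone.
    """
    scheme = urlparse(tracking_uri).scheme
    return scheme.split("+")[0] in DATABASE_SCHEMES


sqlite_ok = uses_database_store("sqlite:///mlflow.db")
file_ok = uses_database_store("file:./mlruns")
```

In Python code, mlflow.set_tracking_uri("sqlite:///mlflow.db") has the same effect as exporting the environment variable.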
The helper
registry.py wraps the three operations (register, promote, load) using the alias API
that MLflow 3 recommends in place of stages (the alias here is named production):
"""MLflow Model Registry helpers: register a trained model, mark a version as
'production', and load whatever the current production version is.

The registry is how you decouple *deploying a better model* from *changing code*:
the API always asks for the production version; promoting a new model is a registry
operation, not a redeploy.

NOTE: the registry needs a database-backed tracking store (e.g. sqlite or a
tracking server); the plain file store does not support it. Set:
    export MLFLOW_TRACKING_URI=sqlite:///mlflow.db

This module uses registry *aliases* ('production'), the approach MLflow 3
recommends. Older MLflow 2 code used stages instead
(transition_model_version_stage(..., stage='Production')), which are deprecated.
"""
from __future__ import annotations

import pickle

import mlflow
from mlflow.exceptions import MlflowException
from mlflow.tracking import MlflowClient

REGISTERED_NAME = "news-recommender"
ALIAS = "production"


def register(run_id, artifact_file="newsreco.pkl", name=REGISTERED_NAME):
    """Register a run's logged model artifact as a new model version."""
    client = MlflowClient()
    try:
        client.create_registered_model(name)
    except MlflowException as exc:
        # Ignore only "already exists"; let real failures surface.
        if exc.error_code != "RESOURCE_ALREADY_EXISTS":
            raise
    mv = client.create_model_version(
        name=name, source=f"runs:/{run_id}/{artifact_file}", run_id=run_id)
    return int(mv.version)


def promote(version, name=REGISTERED_NAME, alias=ALIAS):
    """Point the 'production' alias at a specific version."""
    MlflowClient().set_registered_model_alias(name, alias, version)


def load_production(name=REGISTERED_NAME, alias=ALIAS):
    """Download + unpickle whatever version currently holds the 'production' alias.

    Returns the {'recommender', 'ranker'} bundle, or None if nothing is promoted.
    """
    client = MlflowClient()
    try:
        mv = client.get_model_version_by_alias(name, alias)
    except MlflowException:
        # MLflow raises when the model or the alias does not exist yet.
        return None
    local = mlflow.artifacts.download_artifacts(mv.source)
    # pickle.load can run arbitrary code: only load from a registry you trust.
    with open(local, "rb") as f:
        return pickle.load(f)
The full flow, verified
Train → register → promote → load back the production model:
from newsreco import registry
from newsreco.train import run
from newsreco.config import Config
import mlflow
cfg = Config() # MLFLOW_TRACKING_URI=sqlite:///mlflow.db
run(cfg) # trains + logs a run
client = mlflow.tracking.MlflowClient()
exp = client.get_experiment_by_name(cfg.experiment)
run_id = client.search_runs(
    [exp.experiment_id],
    order_by=["attributes.start_time DESC"],
)[0].info.run_id
version = registry.register(run_id) # -> new model version
registry.promote(version) # point 'production' alias at it
bundle = registry.load_production() # load whatever is in production
Output (the print calls that produce these lines are omitted from the listing above):
latest run: 0e3f01f13906473c90733c96ca30e343
registered version: 1
promoted version 1 to alias 'production'
loaded production model -> 300 articles, ranker: True
sample rec for U106: ['N106', 'N98', 'N100']
The model was registered, promoted, then loaded straight back from the registry and used to recommend — the exact loop a deployment pipeline runs.
Wiring it into the API
When NEWSRECO_USE_REGISTRY=1 is set, the API tries the registry's production model first. If that variable is unset or nothing has been promoted, it falls back to the local artifact, and finally to training from scratch:
# newsreco/api.py — _load_or_train()
if os.environ.get("NEWSRECO_USE_REGISTRY") == "1":
    from .registry import load_production
    bundle = load_production()
    if bundle:
        return bundle["recommender"], bundle.get("ranker")
# ... else local pickle ... else train fresh
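The fallback order described above (registry, then local pickle, then a fresh train) is a "first source that answers wins" chain. The sketch below shows that pattern in isolation; it is not the code in newsreco/api.py, and first_available plus the three stand-in loaders are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_available(loaders: Iterable[Callable[[], Optional[T]]]) -> Optional[T]:
    """Call each loader in order and return the first non-None result."""
    for load in loaders:
        result = load()
        if result is not None:
            return result
    return None


# Hypothetical stand-ins for the three model sources.
def from_registry():
    return None  # e.g. nothing has been promoted yet


def from_local_pickle():
    return {"recommender": "local-model"}


def train_fresh():
    return {"recommender": "fresh-model"}


bundle = first_available([from_registry, from_local_pickle, train_fresh])
```

Because the registry loader returns None here, the chain stops at the local pickle and never reaches the training step.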
So deploying a newly trained, better model is:
# 1. train a candidate (logs a run, saves artifact)
python -m newsreco.train
# 2. register + promote it (after checking its MLflow metrics look good)
python -c "from newsreco import registry, ...; v=registry.register(run_id); registry.promote(v)"
# 3. restart the API with NEWSRECO_USE_REGISTRY=1 -> it serves the new model
No code change, and rollback is registry.promote(previous_version).
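A rollback can also be scripted so nobody has to look up the previous version number by hand. The helper below is a sketch, not part of registry.py: rollback is a hypothetical name, and it assumes that "previous" means the highest version number below the current one, which is not necessarily the version that was in production before. The client argument is expected to behave like mlflow.tracking.MlflowClient.

```python
def rollback(client, name, alias="production"):
    """Move `alias` to the highest version below the one it points at now.

    Returns the version number that now holds the alias.
    """
    current = int(client.get_model_version_by_alias(name, alias).version)
    versions = client.search_model_versions(f"name='{name}'")
    older = [int(mv.version) for mv in versions if int(mv.version) < current]
    if not older:
        raise RuntimeError(f"no version older than {current} to roll back to")
    target = max(older)
    client.set_registered_model_alias(name, alias, str(target))
    return target


# Usage with a real registry (requires mlflow and a database-backed store):
#   from mlflow.tracking import MlflowClient
#   rollback(MlflowClient(), "news-recommender")
```

If you need to return to the version that was actually live before, record that version number at promotion time (for example in a tag) instead of relying on numeric order.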
MLflow versions note
This uses registry aliases (set_registered_model_alias /
get_model_version_by_alias), the mechanism MLflow 3 recommends; later MLflow 2.x
releases support aliases too. Older MLflow 2.x code used stages instead:
client.transition_model_version_stage(name, version, stage="Production") and
client.get_latest_versions(name, stages=["Production"]). Stages have been deprecated
since MLflow 2.9, so prefer aliases in new code.
Production registry hygiene
- Gate promotion on metrics — only promote if the new run beats production on your offline metrics (and ideally an A/B test).
- Keep history — never delete old versions; they're your rollback path.
- Tag versions — record the dataset snapshot, code commit, and owner.
- Automate — a CI job that trains, evaluates, registers, and (if it clears a bar) promotes is the backbone of continuous delivery for ML.
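The first and third points can be combined into one guarded promotion step. This is a minimal sketch under assumptions: promote_if_better is a hypothetical helper, higher scores are assumed to be better, and the scores are whatever offline metric you already log. The client argument is expected to behave like mlflow.tracking.MlflowClient.

```python
def promote_if_better(client, name, version, candidate_score, production_score,
                      tags=None, alias="production", min_gain=0.0):
    """Promote `version` only if it beats production by more than `min_gain`.

    `production_score` may be None when nothing has been promoted yet.
    Returns True if the alias was moved, False otherwise.
    """
    if production_score is not None:
        if candidate_score - production_score <= min_gain:
            return False  # not enough improvement: leave production alone
    # Record provenance (dataset snapshot, code commit, owner) on the version.
    for key, value in (tags or {}).items():
        client.set_model_version_tag(
            name=name, version=str(version), key=key, value=str(value))
    client.set_registered_model_alias(name, alias, str(version))
    return True


# Usage with a real registry (requires mlflow and a database-backed store):
#   from mlflow.tracking import MlflowClient
#   promote_if_better(MlflowClient(), "news-recommender", 4,
#                     candidate_score=0.41, production_score=0.38,
#                     tags={"git_commit": "<sha>", "owner": "<team>"})
```

A CI job can call a function like this after evaluation, so a model that fails the bar is still registered and inspectable but never becomes production.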
Next, a more advanced model: predicting the next article from reading order. 👉