Prefect: orchestrate retraining

A production model must retrain as new data arrives — but "retraining" isn't one step. It's pull data → train → evaluate → promote if better → maybe alert, with retries when a step fails and a schedule so it runs every night unattended. Stringing that together with cron and shell scripts is fragile. Prefect turns the pipeline into observable, retrying, schedulable Python — a workflow orchestrator.

Setup: pip install prefect. Follow-along — runs locally as shown; a schedule needs a Prefect server/cloud.

Tasks and flows

Prefect has two decorators:

  • @task — one step (pull data, train, evaluate). Prefect tracks it: logs, retries, caching, timing.
  • @flow — the function that wires tasks into a pipeline (a DAG). It's the unit you run and schedule.

From pipeline/retrain_flow.py:

from prefect import flow, task

@task(retries=2, retry_delay_seconds=5)        # auto-retry flaky steps
def extract():
    return load_dataset()

@task
def train(data):
    return SentimentModel().fit(*data)

@task
def evaluate(model, data):
    return model.accuracy(*data)

@task
def promote(model, accuracy, threshold=0.9):
    if accuracy >= threshold:                  # an automated quality gate
        model.save("model.json")
        return f"promoted (acc={accuracy:.3f})"
    return f"rejected (acc={accuracy:.3f} < {threshold})"

@flow(name="retrain-sentiment", log_prints=True)
def retrain_pipeline():
    data = extract()
    model = train(data)
    acc = evaluate(model, data)
    print(promote(model, acc))

Notice the quality gate in promote: the new model only ships if it clears a bar. This is how you retrain automatically without risking a bad model reaching production — a critical safety valve in any automated pipeline.

Running it

cd code
python pipeline/retrain_flow.py

Expected output (Prefect narrates each task as it runs):

14:32:01.245 | INFO | prefect.engine - Created flow run 'splendid-otter' for flow 'retrain-sentiment'
14:32:01.310 | INFO | Task run 'extract-0' - Finished in state Completed()
14:32:01.402 | INFO | Task run 'train-0' - Finished in state Completed()
14:32:01.455 | INFO | Task run 'evaluate-0' - Finished in state Completed()
14:32:01.501 | INFO | Task run 'promote-0' - Finished in state Completed()
14:32:01.503 | INFO | Flow run 'splendid-otter' - promoted (acc=1.000)
14:32:01.540 | INFO | Flow run 'splendid-otter' - Finished in state Completed()

The model scored 1.000 ≥ 0.9, so the gate promoted it. Each task shows its own state — if train had thrown, Prefect would mark it Failed, retry extract per its policy, and you'd see exactly which step broke and why.

Why an orchestrator beats cron + scripts

Don't be confused: Celery vs. Prefect — tasks vs. workflows. Celery runs independent tasks off a queue (great for "score this batch"). Prefect runs workflows — multi-step pipelines with dependencies between steps, where step B needs step A's output, with retries, scheduling, and a UI showing the whole DAG. Use Celery for fire-and-forget jobs; use Prefect for "the nightly retraining pipeline."

A cron job that calls a shell script gives you none of: retries, visibility into which step failed, passing data between steps, backfills, or alerting. Prefect gives you all of it, in Python you already know.

Scheduling it

To run unattended every night at 2 AM, deploy it with a schedule:

prefect deploy pipeline/retrain_flow.py:retrain_pipeline --cron "0 2 * * *"
prefect worker start --pool default               # a worker executes scheduled runs

Now the pipeline runs nightly, the Prefect UI shows every run's status and logs, and failures can page you. That's the "retrain when it drifts" arrow of the lifecycle loop (Chapter 0) made real — often triggered by the drift check from Chapter 12.

The orchestrator landscape

  • Prefect — Pythonic, modern, gentle learning curve (this chapter).
  • Airflow — the incumbent; powerful, ubiquitous in data engineering, heavier.
  • Dagster — asset-centric, strong typing and data-awareness.
  • Kubeflow Pipelines — Kubernetes-native ML pipelines.

They all express the same idea — a DAG of steps with scheduling and observability — so the concept transfers. Airflow is the one you'll most often see on job descriptions; Prefect is the friendliest to learn it on.

The takeaway

Prefect turns retraining into an orchestrated workflow: @task steps wired by an @flow, with automatic retries, a quality gate that only promotes good models, per-step visibility, and cron scheduling for unattended nightly runs. It's cron with a brain — and the engine of the lifecycle's retraining loop. Our pipeline produces models; next, let's make those models fast and portable for inference with ONNX. 👉