ML Pipeline Orchestration with Kubeflow on Kubernetes
Your ML team has 47 Jupyter notebooks. 12 should run in order. Nobody remembers which 12. Kubeflow Pipelines fixes this on your existing K8s cluster.
Your ML team has 47 Jupyter notebooks. 12 of them “should run in order.” Nobody remembers which 12.
One fetches data. Another cleans it. A third trains. A fourth evaluates. A fifth deploys. Different repos. Hardcoded paths. Two only work on Sarah’s laptop.
This is not a pipeline. This is a disaster waiting for a deadline.

Why ML Pipelines Are Different
Data pipelines move data from A to B. ETL. Airflow handles this well.
ML pipelines are different. They produce artifacts: trained models, evaluation reports, feature transformers. They need experiment tracking. They need reproducibility.
Data pipelines care about data. ML pipelines care about data AND the model that data produces.
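To make "artifacts plus reproducibility" concrete, here is a minimal sketch in plain Python. It is not the Kubeflow SDK; the `RunRecord` schema, the `fingerprint` helper, and the sample values are all hypothetical. The idea it illustrates: a run is reproducible when you can tie a model back to the exact data and parameters that produced it.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunRecord:
    """What one pipeline run consumed and produced (hypothetical schema)."""
    data_hash: str   # fingerprint of the training data
    params: dict     # hyperparameters used for this run
    model_hash: str  # fingerprint of the serialized model artifact
    metrics: dict    # evaluation results


def fingerprint(payload: bytes) -> str:
    """Content hash: identical bytes always map to the same ID."""
    return hashlib.sha256(payload).hexdigest()[:12]


def record_run(data: bytes, params: dict, model: bytes, metrics: dict) -> RunRecord:
    """Bundle inputs and outputs of a run into one traceable record."""
    return RunRecord(fingerprint(data), params, fingerprint(model), metrics)


# Made-up sample values, purely for illustration.
run = record_run(
    data=b"age,income\n34,52000\n",
    params={"learning_rate": 0.1, "epochs": 5},
    model=b"serialized-model-bytes",
    metrics={"accuracy": 0.91},
)
print(json.dumps(asdict(run), indent=2))
```

A plain data pipeline only needs the first field. An ML pipeline needs all four, because the question "which data and settings produced the model in production?" has to stay answerable.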
The 5 Core Components
| Step | What It Does | Failure Condition |
|---|---|---|
| 1. Data Prep | Fetch, validate, transform | Schema mismatch? Stop |
| 2. Training | Train model, log to tracker | Training error? Stop |
| 3. Evaluation | Score against holdout set | Below threshold? Stop |
| 4. Registration | Push to model registry | Registry unavailable? Stop |
| 5. Deployment | Canary rollout to production | Health check fails? Rollback |
Each step runs as its own container, with declared inputs, outputs, and a failure condition.
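The table above can be sketched as five functions and a fail-fast runner. This is a plain-Python illustration under loud assumptions, not Kubeflow code: in a real pipeline each function would be a container, and every name here (`StepFailed`, `run_pipeline`, the context keys, the toy "model") is hypothetical.

```python
class StepFailed(Exception):
    """Raised when a step hits its failure condition."""


def data_prep(ctx):
    # Failure condition: schema mismatch.
    if any(set(row) != {"x", "y"} for row in ctx["rows"]):
        raise StepFailed("schema mismatch")
    return {"clean_rows": ctx["rows"]}


def train(ctx):
    # Toy "model": the mean target. A real step would fit and log a model.
    ys = [row["y"] for row in ctx["clean_rows"]]
    return {"model": sum(ys) / len(ys)}


def evaluate(ctx):
    # Failure condition: holdout score below threshold.
    score = 1.0 - abs(ctx["model"] - ctx["holdout_y"])
    if score < ctx["min_score"]:
        raise StepFailed("score below threshold")
    return {"score": score}


def register(ctx):
    # Failure condition: registry unavailable.
    if not ctx["registry_up"]:
        raise StepFailed("registry unavailable")
    return {"candidate_version": ctx["live_version"] + 1}


def deploy(ctx):
    # Failure condition: health check fails.
    if not ctx["healthy"]:
        raise StepFailed("health check failed")
    return {"live_version": ctx["candidate_version"]}


STEPS = [data_prep, train, evaluate, register, deploy]


def run_pipeline(initial):
    """Run steps in order; stop at the first failure, roll back on deploy."""
    ctx = dict(initial)
    for step in STEPS:
        try:
            ctx.update(step(ctx))  # a step's outputs feed the next step
        except StepFailed as err:
            ctx["failed_step"] = step.__name__
            ctx["reason"] = str(err)
            # Rollback here means the previous live_version stays in place.
            ctx["rolled_back"] = step is deploy
            break
    return ctx


base = {
    "rows": [{"x": 1.0, "y": 1.0}, {"x": 2.0, "y": 3.0}],
    "holdout_y": 2.0,
    "min_score": 0.8,
    "registry_up": True,
    "live_version": 3,
    "healthy": True,
}
result = run_pipeline(base)
print(result["live_version"], result.get("failed_step"))
```

The point of the design is that a failure anywhere leaves production untouched: steps 1 to 4 stop before anything ships, and step 5 keeps the previous version live.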
The DevOps Parallel
GitHub Actions: code change > build > test > deploy
Kubeflow: data change > train > evaluate > deploy
Same concept. Different trigger. CI/CD produces deployable software. ML pipelines produce deployable models.
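"Different trigger" is worth a sketch. CI fires on a new commit hash; one way an ML pipeline could fire on a data change is by comparing a content fingerprint of the dataset. The code below is a plain-Python illustration, not a Kubeflow feature; `DataChangeTrigger`, `dataset_fingerprint`, and the `start_run` callback are hypothetical names.

```python
import hashlib


def dataset_fingerprint(rows):
    """Hash the dataset contents. Order-sensitive: reordered rows count as a change."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


class DataChangeTrigger:
    """Start a pipeline run only when the data differs from the last run's data."""

    def __init__(self, start_run):
        self._start_run = start_run  # callback that kicks off the pipeline
        self._last_seen = None       # fingerprint of the data we last ran on

    def check(self, rows):
        current = dataset_fingerprint(rows)
        if current == self._last_seen:
            return False             # same data: nothing to retrain
        self._last_seen = current
        self._start_run(current)
        return True


started = []  # stands in for "submit a pipeline run"
trigger = DataChangeTrigger(start_run=started.append)

trigger.check([("a", 1)])            # first sighting: starts a run
trigger.check([("a", 1)])            # unchanged: no run
trigger.check([("a", 1), ("b", 2)])  # new row: starts a run
print(len(started))
```

The fingerprint plays the role a commit SHA plays in CI: it identifies exactly which input a given run was built from.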
Kubeflow vs Airflow
| Feature | Kubeflow | Airflow |
|---|---|---|
| Built for | ML pipelines | Data pipelines |
| Runs on | Kubernetes only | VMs, containers, or Kubernetes |
| GPU scheduling | Per-step Kubernetes resource requests | Via KubernetesPodOperator or executor config |
| Model artifacts | First-class | Not built-in |
| Experiment tracking | Integrated | External only |
You already run Kubernetes. Kubeflow installs onto that cluster, so it reuses the nodes, monitoring, and RBAC you already operate. No separate platform to stand up.
This is Part 14 of the MLOps for DevOps Engineers series.