KServe + MLflow + JupyterHub: Iris Classification Example
This example demonstrates an end-to-end machine learning workflow using:
- JupyterHub: Interactive development, model training, and testing
- MLflow: Model tracking and registry
- MinIO: Artifact storage (S3-compatible)
- KServe: Model serving and inference
Workflow Overview
- 📓 Train & Register (01-train-and-register.ipynb) - Train model in JupyterHub, register to MLflow
- 🚀 Deploy (02-deploy-model.yaml) - Deploy model with KServe InferenceService
- 🧪 Test from Notebook (03-test-inference.ipynb) - Test inference from JupyterHub (Recommended)
- 🔧 Test from Pod (04-test-inference-job.yaml) - Automated testing from Kubernetes Job
Architecture
┌─────────────┐ ┌─────────┐ ┌────────┐ ┌─────────────────┐
│ JupyterHub │────>│ MLflow │────>│ MinIO │<────│ KServe │
│ │ │ │ │ (S3) │ │ InferenceService│
│ 1. Train │ │ Register│ │ Store │ │ 2. Deploy │
│ Model │ │ │ │ Model │ │ & Serve │
└──────┬──────┘ └─────────┘ └────────┘ └──────────┬──────┘
│ │
│ 3. Test from Notebook (Recommended) │
└──────────────────────────────────────────────────────┘
│
│
4. Test from Pod │
(Alternative) │
v
┌──────────────┐
│ Kubernetes │
│ Test Job │
└──────────────┘
Prerequisites
Ensure the following components are installed:
# Check installations
kubectl get pods -n jupyterhub
kubectl get pods -n mlflow
kubectl get pods -n minio
kubectl get pods -n kserve
Step 1: Train and Register Model in JupyterHub
- Access JupyterHub: Open JupyterHub at the configured JUPYTERHUB_HOST.
- Upload the Notebook: Upload 01-train-and-register.ipynb to your JupyterHub workspace.
- Set Environment Variables (in the notebook or terminal):
  # MLflow authentication (required if MLflow has authentication enabled)
  export MLFLOW_TRACKING_USERNAME=your-username
  export MLFLOW_TRACKING_PASSWORD=your-password
  Note: MLFLOW_TRACKING_URI uses the default cluster-internal URL and does not need to be set.
- Run the Notebook: Execute all cells in 01-train-and-register.ipynb. The model is automatically registered to the MLflow Model Registry (a rough sketch of what the notebook does follows this list).
- Verify in MLflow UI:
  - Access the MLflow UI at the configured MLFLOW_HOST
  - Navigate to "Models" → "iris-classifier"
  - Click on the model version (e.g., "Version 1")
  - Note the artifact_path displayed (e.g., mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts)
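For reference, the core of the train-and-register flow looks roughly like the sketch below. This is an illustration only, not the notebook itself: it assumes scikit-learn's LogisticRegression and that the tracking URI and credentials are already set as described above.

# Illustrative sketch of what 01-train-and-register.ipynb does (assumptions:
# scikit-learn LogisticRegression; MLflow tracking URI/credentials already set).
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X_train, y_train)
    mlflow.log_metric("accuracy", model.score(X_test, y_test))
    # Registering under "iris-classifier" creates the Model Registry entry
    # that Step 2 deploys with KServe.
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        registered_model_name="iris-classifier",
    )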
Step 2: Deploy Model with KServe
- Get the Model Registry Path:
  In the MLflow UI, navigate to Models → iris-classifier → Version 1 and copy the artifact_path from the model details.
  Example: mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts
  Important: Use the artifact_path from the Model Registry (it contains /models/), NOT the run-based path from the experiment runs.
- Update the InferenceService YAML:
  Use the helper command to convert the MLflow artifact path to a KServe storageUri (a hypothetical Python equivalent of the conversion is sketched after this list):
  just kserve::storage-uri "mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts"
  # Output: s3://mlflow/2/models/m-28620b840353444385fa8e62335decf5/artifacts
  Edit 02-deploy-model.yaml and replace the storageUri with the output:
  storageUri: s3://mlflow/2/models/m-28620b840353444385fa8e62335decf5/artifacts
  Note: The default configuration uses the mlflow format, which automatically installs dependencies from requirements.txt. This ensures compatibility but may take longer to start (the initial container startup installs packages).
- Deploy the InferenceService:
  kubectl apply -f 02-deploy-model.yaml
- Verify Deployment:
  # Check InferenceService status
  kubectl get inferenceservice iris-classifier -n kserve

  # Wait for it to be ready (STATUS should show "Ready")
  # Note: First deployment may take 5-10 minutes due to dependency installation
  kubectl wait --for=condition=Ready inferenceservice/iris-classifier -n kserve --timeout=600s

  # Check the pods
  kubectl get pods -l serving.kserve.io/inferenceservice=iris-classifier -n kserve

  # Check logs if needed
  kubectl logs -l serving.kserve.io/inferenceservice=iris-classifier -n kserve -c kserve-container
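If the just recipe is not available in your environment, the conversion it performs appears to be a simple prefix rewrite into the mlflow bucket. A hypothetical Python equivalent (the function name is made up for illustration), matching the example input and output above:

def mlflow_artifacts_to_storage_uri(artifact_path, bucket="mlflow"):
    # Rewrites e.g. "mlflow-artifacts:/2/models/m-.../artifacts"
    # into "s3://mlflow/2/models/m-.../artifacts" for use as a KServe storageUri.
    prefix = "mlflow-artifacts:/"
    if not artifact_path.startswith(prefix):
        raise ValueError(f"unexpected artifact path: {artifact_path}")
    return f"s3://{bucket}/{artifact_path[len(prefix):]}"

print(mlflow_artifacts_to_storage_uri(
    "mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts"
))
# -> s3://mlflow/2/models/m-28620b840353444385fa8e62335decf5/artifacts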
Step 3: Test from JupyterHub (Recommended)
- Upload the Test Notebook: Upload 03-test-inference.ipynb to your JupyterHub workspace.
- Run the Notebook: Execute all cells in 03-test-inference.ipynb (the core request it sends is sketched after this list). The notebook will:
  - Send prediction requests to the KServe endpoint
  - Test single and batch predictions
  - Display results with expected vs actual comparisons
  - Allow you to try custom inputs
- Expected Results:
  Test Case 1: Typical Setosa
  Features: [5.1, 3.5, 1.4, 0.2]
  Expected: Iris Setosa
  Predicted: Iris Setosa
  Status: ✓ PASS
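Under the hood, the test notebook issues V2 inference protocol requests. A minimal sketch of one such request, assuming the requests library and the in-cluster predictor URL also used in Step 4:

import requests

# In-cluster predictor endpoint (same URL as the curl example in Step 4).
URL = ("http://iris-classifier-predictor.kserve.svc.cluster.local"
       "/v2/models/iris-classifier/infer")
CLASSES = ["setosa", "versicolor", "virginica"]  # class index -> species

payload = {
    "inputs": [{
        "name": "input-0",
        "shape": [1, 4],
        "datatype": "FP64",
        "data": [[5.1, 3.5, 1.4, 0.2]],  # typical setosa measurements
    }]
}

resp = requests.post(URL, json=payload, timeout=30)
resp.raise_for_status()
class_id = resp.json()["outputs"][0]["data"][0]
print(f"Predicted: {CLASSES[class_id]} (class {class_id})")  # expected: setosa (class 0)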
Step 4: Test from Kubernetes Pod (Alternative)
After testing in JupyterHub, you can also test from Kubernetes Pods for automated testing or CI/CD integration.
Option 1: Automated Test with Python (Recommended)
# Run the test job
kubectl apply -f 04-test-inference-job.yaml
# Check logs
kubectl logs job/test-iris-inference -n kserve
# Expected output:
# Test Case 1:
# Input: [5.1, 3.5, 1.4, 0.2]
# Expected: setosa
# Predicted: setosa (class 0)
# Status: ✓ PASS
Option 2: Manual Test from a Pod
# Start a test pod
kubectl run test-pod --image=curlimages/curl --rm -it --restart=Never -- sh
# Inside the pod, run:
curl -X POST \
http://iris-classifier-predictor.kserve.svc.cluster.local/v2/models/iris-classifier/infer \
-H "Content-Type: application/json" \
-d '{"inputs": [{"name": "input-0", "shape": [1, 4], "datatype": "FP64", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
Model Prediction Examples
Single Prediction (v2 Protocol)
// Request
{
"inputs": [
{
"name": "input-0",
"shape": [1, 4],
"datatype": "FP64",
"data": [[5.1, 3.5, 1.4, 0.2]] // Sepal length, Sepal width, Petal length, Petal width
}
]
}
// Response
{
"outputs": [
{
"name": "output-0",
"shape": [1],
"datatype": "INT64",
"data": [0] // 0=setosa, 1=versicolor, 2=virginica
}
]
}
Batch Prediction (v2 Protocol)
// Request
{
"inputs": [
{
"name": "input-0",
"shape": [3, 4],
"datatype": "FP64",
"data": [
[5.1, 3.5, 1.4, 0.2], // Setosa
[6.7, 3.0, 5.2, 2.3], // Virginica
[5.9, 3.0, 4.2, 1.5] // Versicolor
]
}
]
}
// Response
{
"outputs": [
{
"name": "output-0",
"shape": [3],
"datatype": "INT64",
"data": [0, 2, 1]
}
]
}
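For batch use, the request body can be built directly from a list of feature rows and the response decoded back to species names. A small sketch with illustrative helper names, based on the payloads shown above:

CLASSES = ["setosa", "versicolor", "virginica"]  # class index -> species

def build_v2_request(rows, name="input-0"):
    # Batch all rows into one V2 inference request; shape is derived from the data.
    return {"inputs": [{"name": name,
                        "shape": [len(rows), len(rows[0])],
                        "datatype": "FP64",
                        "data": rows}]}

def decode_v2_response(response):
    # Map the predicted class indices back to species names.
    return [CLASSES[i] for i in response["outputs"][0]["data"]]

# Example using the batch response shown above:
print(decode_v2_response({"outputs": [{"name": "output-0", "shape": [3],
                                       "datatype": "INT64", "data": [0, 2, 1]}]}))
# -> ['setosa', 'virginica', 'versicolor']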
Troubleshooting
InferenceService Not Ready
# Check events
kubectl describe inferenceservice iris-classifier -n kserve
# Check pod logs
kubectl logs -l serving.kserve.io/inferenceservice=iris-classifier -n kserve -c kserve-container
S3/MinIO Connection Issues
# Verify S3 credentials secret
kubectl get secret kserve-s3-credentials -n kserve -o yaml
# Test MinIO access from a pod
kubectl run minio-test --image=amazon/aws-cli --rm -it --restart=Never -- \
sh -c "AWS_ACCESS_KEY_ID=minioadmin AWS_SECRET_ACCESS_KEY=minioadmin aws --endpoint-url=http://minio.minio.svc.cluster.local:9000 s3 ls s3://mlflow/"
Model Not Found
# Verify the model exists in MinIO Console
# Access MinIO Console at the configured MINIO_HOST
# Navigate to mlflow bucket and verify the model path
# The path should be: EXPERIMENT_ID/models/MODEL_ID/artifacts/
# Example: 2/models/m-28620b840353444385fa8e62335decf5/artifacts/
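The same check can be done from Python (for example, in a JupyterHub terminal) with boto3, using the same endpoint and demo credentials as in the MinIO test above; adjust them if your deployment differs:

import boto3

# Assumes the demo MinIO credentials used earlier in this guide; change as needed.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.minio.svc.cluster.local:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
)

# List objects under the model path (EXPERIMENT_ID/models/MODEL_ID/artifacts/).
resp = s3.list_objects_v2(
    Bucket="mlflow",
    Prefix="2/models/m-28620b840353444385fa8e62335decf5/artifacts/",
)
for obj in resp.get("Contents", []):
    print(obj["Key"])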
Prediction Errors
# Check model format and KServe runtime compatibility
kubectl logs -l serving.kserve.io/inferenceservice=iris-classifier -n kserve
Cleanup
# Delete InferenceService
kubectl delete inferenceservice iris-classifier -n kserve
# Delete test job
kubectl delete job test-iris-inference -n kserve
Next Steps
- Try different models (XGBoost, TensorFlow, PyTorch)
- Add model versioning and A/B testing
- Implement canary deployments
- Add monitoring and observability
- Scale the InferenceService based on load