# KServe + MLflow + JupyterHub: Iris Classification Example

This example demonstrates an end-to-end machine learning workflow using:

- **JupyterHub**: Interactive development, model training, and testing
- **MLflow**: Model tracking and registry
- **MinIO**: Artifact storage (S3-compatible)
- **KServe**: Model serving and inference

## Workflow Overview

1. **πŸ““ Train & Register** (`01-train-and-register.ipynb`) - Train the model in JupyterHub and register it to MLflow
2. **πŸš€ Deploy** (`02-deploy-model.yaml`) - Deploy the model with a KServe InferenceService
3. **πŸ§ͺ Test from Notebook** (`03-test-inference.ipynb`) - Test inference from JupyterHub (Recommended)
4. **πŸ”§ Test from Pod** (`04-test-inference-job.yaml`) - Automated testing from a Kubernetes Job

## Architecture

```plain
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ JupyterHub  │────>β”‚ MLflow  │────>β”‚ MinIO  β”‚<────│ KServe          β”‚
β”‚             β”‚     β”‚         β”‚     β”‚ (S3)   β”‚     β”‚ InferenceServiceβ”‚
β”‚ 1. Train    β”‚     β”‚ Registerβ”‚     β”‚ Store  β”‚     β”‚ 2. Deploy       β”‚
β”‚    Model    β”‚     β”‚         β”‚     β”‚ Model  β”‚     β”‚    & Serve      β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
       β”‚                                                      β”‚
       β”‚ 3. Test from Notebook (Recommended)                  β”‚
       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                                              β”‚
                                                              β”‚ 4. Test from Pod
                                                              β”‚    (Alternative)
                                                              β”‚
                                                              v
                                                      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                                                      β”‚  Kubernetes  β”‚
                                                      β”‚  Test Job    β”‚
                                                      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## Prerequisites

Ensure the following components are installed:

```bash
# Check installations
kubectl get pods -n jupyterhub
kubectl get pods -n mlflow
kubectl get pods -n minio
kubectl get pods -n kserve
```

## Step 1: Train and Register Model in JupyterHub

1. **Access JupyterHub**:
   - Open JupyterHub at the configured JUPYTERHUB_HOST

2. **Upload the Notebook**:
   - Upload `01-train-and-register.ipynb` to your JupyterHub workspace

3. **Set Environment Variables** (in the notebook or terminal):

   ```bash
   # MLflow authentication (required if MLflow has authentication enabled)
   export MLFLOW_TRACKING_USERNAME=your-username
   export MLFLOW_TRACKING_PASSWORD=your-password
   ```

   Note: `MLFLOW_TRACKING_URI` uses the default cluster-internal URL and does not need to be set.

4. **Run the Notebook**:
   - Execute all cells in `01-train-and-register.ipynb`
   - The model will be automatically registered to the MLflow Model Registry

5. **Verify in MLflow UI**:
   - Access the MLflow UI at the configured MLFLOW_HOST
   - Navigate to "Models" β†’ "iris-classifier"
   - Click on the model version (e.g., "Version 1")
   - Note the **artifact_path** displayed (e.g., `mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts`), or look it up programmatically as sketched below
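
If you prefer to confirm the registered model from code rather than the UI, here is a minimal sketch that lists the registered versions of `iris-classifier` and their source paths with the MLflow client. It is not part of the provided notebooks; the fallback tracking URL below is an assumption, so substitute your cluster's MLflow service address, and keep the authentication variables from step 3 set if MLflow auth is enabled.

```python
# Minimal sketch: look up the registered model's artifact path from code
# instead of the MLflow UI. The fallback tracking URL is an assumption --
# substitute the cluster-internal MLflow URL used in your environment.
import os

import mlflow
from mlflow import MlflowClient

mlflow.set_tracking_uri(
    os.environ.get("MLFLOW_TRACKING_URI",
                   "http://mlflow.mlflow.svc.cluster.local:5000")  # hypothetical URL
)

client = MlflowClient()

# List all registered versions of the model and print their source paths.
for mv in client.search_model_versions("name='iris-classifier'"):
    print(f"version {mv.version}: {mv.source}")
```

The `source` printed for the version you plan to serve should correspond to the artifact_path shown in the UI; that is the value you convert to a `storageUri` in Step 2.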
## Step 2: Deploy Model with KServe

1. **Get the Model Registry Path**:

   In the MLflow UI, navigate to:
   - **Models** β†’ **iris-classifier** β†’ **Version 1**
   - Copy the **artifact_path** from the model details
   - Example: `mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts`

   **Important**: Use the artifact_path from the **Model Registry** (contains `/models/`), NOT the run-based path from the experiment runs.

2. **Update the InferenceService YAML**:

   Use the helper command to convert the MLflow artifact path to a KServe `storageUri`:

   ```bash
   just kserve::storage-uri "mlflow-artifacts:/2/models/m-28620b840353444385fa8e62335decf5/artifacts"
   # Output: s3://mlflow/2/models/m-28620b840353444385fa8e62335decf5/artifacts
   ```

   Edit `02-deploy-model.yaml` and replace the `storageUri` with the output:

   ```yaml
   storageUri: s3://mlflow/2/models/m-28620b840353444385fa8e62335decf5/artifacts
   ```

   **Note**: The default configuration uses the `mlflow` model format, which automatically installs dependencies from `requirements.txt`. This ensures compatibility but slows the first startup, since the container installs packages before serving.

3. **Deploy the InferenceService**:

   ```bash
   kubectl apply -f 02-deploy-model.yaml
   ```

4. **Verify Deployment**:

   ```bash
   # Check InferenceService status
   kubectl get inferenceservice iris-classifier -n kserve

   # Wait for it to be ready (STATUS should show "Ready")
   # Note: First deployment may take 5-10 minutes due to dependency installation
   kubectl wait --for=condition=Ready inferenceservice/iris-classifier -n kserve --timeout=600s

   # Check the pods
   kubectl get pods -l serving.kserve.io/inferenceservice=iris-classifier -n kserve

   # Check logs if needed
   kubectl logs -l serving.kserve.io/inferenceservice=iris-classifier -n kserve -c kserve-container
   ```

## Step 3: Test from JupyterHub (Recommended)

1. **Upload the Test Notebook**:
   - Upload `03-test-inference.ipynb` to your JupyterHub workspace

2. **Run the Notebook**:
   - Execute all cells in `03-test-inference.ipynb`
   - The notebook will:
     - Send prediction requests to the KServe endpoint (a standalone version of such a request is sketched after this list)
     - Test single and batch predictions
     - Display results with expected vs. actual comparisons
     - Allow you to try custom inputs

3. **Expected Results**:

   ```plain
   Test Case 1: Typical Setosa
   Features: [5.1, 3.5, 1.4, 0.2]
   Expected: Iris Setosa
   Predicted: Iris Setosa
   Status: βœ“ PASS
   ```
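
To see what such a request looks like in plain Python outside the provided notebook, here is a minimal sketch that sends one V2 inference call with the `requests` library. It assumes the cluster-internal predictor URL used in Step 4 below and the payload format shown in the prediction examples; adjust the URL if your service name or namespace differs.

```python
# Minimal sketch of a V2 inference request from inside the cluster
# (e.g. a JupyterHub notebook or pod). Assumes the in-cluster predictor
# URL shown in Step 4; only resolvable from within the cluster.
import requests

URL = ("http://iris-classifier-predictor.kserve.svc.cluster.local"
       "/v2/models/iris-classifier/infer")

payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 4],
            "datatype": "FP64",
            # Sepal length, sepal width, petal length, petal width
            "data": [[5.1, 3.5, 1.4, 0.2]],
        }
    ]
}

resp = requests.post(URL, json=payload, timeout=30)
resp.raise_for_status()

# Expected classes: 0=setosa, 1=versicolor, 2=virginica
prediction = resp.json()["outputs"][0]["data"][0]
print(f"Predicted class: {prediction}")
```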
## Step 4: Test from Kubernetes Pod (Alternative)

After testing in JupyterHub, you can also test from Kubernetes Pods for automated testing or CI/CD integration.

### Option 1: Automated Test with Python (Recommended)

```bash
# Run the test job
kubectl apply -f 04-test-inference-job.yaml

# Check logs
kubectl logs job/test-iris-inference -n kserve

# Expected output:
# Test Case 1:
#   Input: [5.1, 3.5, 1.4, 0.2]
#   Expected: setosa
#   Predicted: setosa (class 0)
#   Status: βœ“ PASS
```

### Option 2: Manual Test from a Pod

```bash
# Start a test pod
kubectl run test-pod --image=curlimages/curl --rm -it --restart=Never -- sh

# Inside the pod, run:
curl -X POST \
  http://iris-classifier-predictor.kserve.svc.cluster.local/v2/models/iris-classifier/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [{"name": "input-0", "shape": [1, 4], "datatype": "FP64", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
```

## Model Prediction Examples

### Single Prediction (v2 Protocol)

```json
// Request
{
  "inputs": [
    {
      "name": "input-0",
      "shape": [1, 4],
      "datatype": "FP64",
      "data": [[5.1, 3.5, 1.4, 0.2]]  // Sepal length, Sepal width, Petal length, Petal width
    }
  ]
}

// Response
{
  "outputs": [
    {
      "name": "output-0",
      "shape": [1],
      "datatype": "INT64",
      "data": [0]  // 0=setosa, 1=versicolor, 2=virginica
    }
  ]
}
```

### Batch Prediction (v2 Protocol)

```json
// Request
{
  "inputs": [
    {
      "name": "input-0",
      "shape": [3, 4],
      "datatype": "FP64",
      "data": [
        [5.1, 3.5, 1.4, 0.2],  // Setosa
        [6.7, 3.0, 5.2, 2.3],  // Virginica
        [5.9, 3.0, 4.2, 1.5]   // Versicolor
      ]
    }
  ]
}

// Response
{
  "outputs": [
    {
      "name": "output-0",
      "shape": [3],
      "datatype": "INT64",
      "data": [0, 2, 1]
    }
  ]
}
```

## Troubleshooting

### InferenceService Not Ready

```bash
# Check events
kubectl describe inferenceservice iris-classifier -n kserve

# Check pod logs
kubectl logs -l serving.kserve.io/inferenceservice=iris-classifier -n kserve -c kserve-container
```

### S3/MinIO Connection Issues

```bash
# Verify S3 credentials secret
kubectl get secret kserve-s3-credentials -n kserve -o yaml

# Test MinIO access from a pod
kubectl run minio-test --image=amazon/aws-cli --rm -it --restart=Never -- \
  sh -c "AWS_ACCESS_KEY_ID=minioadmin AWS_SECRET_ACCESS_KEY=minioadmin aws --endpoint-url=http://minio.minio.svc.cluster.local:9000 s3 ls s3://mlflow/"
```

### Model Not Found

```bash
# Verify the model exists in MinIO Console
# Access MinIO Console at the configured MINIO_HOST
# Navigate to the mlflow bucket and verify the model path
# The path should be: EXPERIMENT_ID/models/MODEL_ID/artifacts/
# Example: 2/models/m-28620b840353444385fa8e62335decf5/artifacts/
```

### Prediction Errors

```bash
# Check model format and KServe runtime compatibility
kubectl logs -l serving.kserve.io/inferenceservice=iris-classifier -n kserve
```

## Cleanup

```bash
# Delete InferenceService
kubectl delete inferenceservice iris-classifier -n kserve

# Delete test job
kubectl delete job test-iris-inference -n kserve
```

## Next Steps

- Try different models (XGBoost, TensorFlow, PyTorch)
- Add model versioning and A/B testing
- Implement canary deployments
- Add monitoring and observability
- Scale the InferenceService based on load

## References

- [KServe Documentation](https://kserve.github.io/website/)
- [MLflow Documentation](https://mlflow.org/docs/latest/index.html)
- [KServe Model Serving](https://kserve.github.io/website/latest/modelserving/v1beta1/sklearn/v2/)