feat(kserve): install KServe
1
kserve/.gitignore
vendored
Normal file
@@ -0,0 +1 @@
values.yaml
300
kserve/README.md
Normal file
@@ -0,0 +1,300 @@
# KServe

KServe is a model inference platform on Kubernetes for machine learning and generative AI workloads. It provides a standardized way to deploy, serve, and manage ML models across different frameworks.

## Features

- **Multi-Framework Support**: TensorFlow, PyTorch, scikit-learn, XGBoost, Hugging Face, Triton, and more
- **Deployment Modes**:
  - **RawDeployment (Standard)**: Uses native Kubernetes Deployments without Knative
  - **Serverless (Knative)**: Auto-scaling with scale-to-zero capability
- **Model Storage**: Support for S3, GCS, Azure Blob, PVC, and more
- **Inference Protocols**: REST and gRPC
- **Advanced Features**: Canary deployments, traffic splitting, explainability, outlier detection

## Prerequisites

- Kubernetes cluster (installed via `just k8s::install`)
- Longhorn storage (installed via `just longhorn::install`)
- **cert-manager** (required, installed via `just cert-manager::install`)
- MinIO (optional, for S3-compatible model storage via `just minio::install`)
- Prometheus (optional, for monitoring via `just prometheus::install`)

## Installation

### Basic Installation

```bash
# Install cert-manager (required)
just cert-manager::install

# Install KServe with default settings (RawDeployment mode)
just kserve::install
```

During installation, you will be prompted for:

- **Prometheus Monitoring**: Whether to enable ServiceMonitor (if Prometheus is installed)

The domain for inference endpoints is configured via the `KSERVE_DOMAIN` environment variable (default: `cluster.local`).

### Environment Variables

Key environment variables (set via `.env.local` or environment):

```bash
KSERVE_NAMESPACE=kserve               # Namespace for KServe
KSERVE_CHART_VERSION=v0.16.0          # KServe Helm chart version
KSERVE_DEPLOYMENT_MODE=RawDeployment  # Deployment mode (RawDeployment or Knative)
KSERVE_DOMAIN=cluster.local           # Base domain for inference endpoints
MONITORING_ENABLED=true               # Enable Prometheus monitoring (prompted during install if unset)
MINIO_NAMESPACE=minio                 # MinIO namespace (if using MinIO)
```
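
The defaults above can also be applied in a plain shell session, for example before calling `gomplate` by hand. The following is a minimal sketch that assumes the default values declared in the justfile of this commit; the function name `kserve_env_defaults` is hypothetical, and `MONITORING_ENABLED` is deliberately left alone because the install recipe prompts for it when it is empty.

```bash
# Sketch: apply the justfile defaults to any variable that is unset or empty.
# kserve_env_defaults is a hypothetical helper, not part of this repository.
kserve_env_defaults() {
    : "${KSERVE_NAMESPACE:=kserve}"
    : "${KSERVE_CHART_VERSION:=v0.16.0}"
    : "${KSERVE_DEPLOYMENT_MODE:=RawDeployment}"
    : "${KSERVE_DOMAIN:=cluster.local}"
    : "${MINIO_NAMESPACE:=minio}"
    export KSERVE_NAMESPACE KSERVE_CHART_VERSION KSERVE_DEPLOYMENT_MODE \
        KSERVE_DOMAIN MINIO_NAMESPACE
}

kserve_env_defaults
echo "namespace=${KSERVE_NAMESPACE} chart=${KSERVE_CHART_VERSION} domain=${KSERVE_DOMAIN}"
```

Values that are already set, for instance from `.env.local`, are kept as they are.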

### Domain Configuration

KServe uses the `KSERVE_DOMAIN` to construct URLs for inference endpoints.

**Internal Access Only (Default):**

```bash
KSERVE_DOMAIN=cluster.local
```

- InferenceServices are accessible only within the cluster
- URLs: `http://<service-name>.<namespace>.svc.cluster.local`
- No external Ingress configuration needed
- Recommended for development and testing

**External Access:**

```bash
KSERVE_DOMAIN=example.com
```

- InferenceServices are accessible from outside the cluster
- URLs: `https://<service-name>.<namespace>.example.com`
- Requires Traefik Ingress configuration
- DNS records must point to your cluster
- Recommended for production deployments
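
The two URL patterns listed above can be expressed as a small helper. This is only a sketch of the naming scheme described in this section; the function name `isvc_url` is hypothetical, and the URL that KServe actually assigns should be read from the InferenceService status.

```bash
# Sketch: build the endpoint URL for a service name, namespace, and domain,
# following the two patterns described in this README.
# isvc_url is a hypothetical helper, not part of this repository.
isvc_url() {
    name=$1
    namespace=$2
    domain=${3:-cluster.local}
    if [ "$domain" = "cluster.local" ]; then
        printf 'http://%s.%s.svc.cluster.local\n' "$name" "$namespace"
    else
        printf 'https://%s.%s.%s\n' "$name" "$namespace" "$domain"
    fi
}

isvc_url sklearn-iris default cluster.local
isvc_url sklearn-iris default example.com
```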

## Usage

### Check Status

```bash
# View status of KServe components
just kserve::status

# View controller logs
just kserve::logs
```

### Deploy a Model

Create an `InferenceService` resource:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: default
spec:
  predictor:
    sklearn:
      storageUri: s3://models/sklearn/iris
```

Apply the resource:

```bash
kubectl apply -f inferenceservice.yaml
```
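
When several models share the same shape, the manifest can be generated rather than copied. A minimal sketch, assuming a scikit-learn predictor like the one above; `render_isvc` is a hypothetical helper and the argument values are examples only.

```bash
# Sketch: print an InferenceService manifest for a scikit-learn model.
# render_isvc is a hypothetical helper; arguments are name, namespace, storageUri.
render_isvc() {
    name=$1
    namespace=$2
    storage_uri=$3
    cat <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  predictor:
    sklearn:
      storageUri: ${storage_uri}
EOF
}

render_isvc sklearn-iris default s3://models/sklearn/iris
```

The output can be redirected to `inferenceservice.yaml` or piped to `kubectl apply -f -`.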

### Access Inference Endpoint

```bash
# Get inference service URL
kubectl get inferenceservice sklearn-iris
```

**For cluster.local (internal access):**

```bash
# From within the cluster
curl -X POST http://sklearn-iris.default.svc.cluster.local/v1/models/sklearn-iris:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[6.8, 2.8, 4.8, 1.4]]}'
```

**For external domain:**

```bash
# From anywhere (requires DNS and Ingress configuration)
curl -X POST https://sklearn-iris.default.example.com/v1/models/sklearn-iris:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[6.8, 2.8, 4.8, 1.4]]}'
```

## Storage Configuration

### Using MinIO (S3-compatible)

If MinIO is installed, KServe will automatically configure S3 credentials:

```bash
# Storage secret is created automatically during installation
kubectl get secret kserve-s3-credentials -n kserve
```

**External Secrets Integration:**

- When External Secrets Operator is available:
  - Credentials are retrieved directly from Vault at `minio/admin`
  - ExternalSecret resource syncs credentials to Kubernetes Secret
  - Secret includes KServe-specific annotations for S3 endpoint configuration
  - No duplicate storage needed - references existing MinIO credentials
- When External Secrets Operator is not available:
  - Credentials are retrieved from MinIO Secret
  - Kubernetes Secret is created directly with annotations
  - Credentials are also backed up to Vault at `kserve/storage` if available
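
For reference, the Secret that results from the direct (non-External-Secrets) path looks roughly like the manifest printed below. This is a sketch that mirrors the annotations set by the `setup-storage` recipe in this commit; the helper name `render_s3_secret` and the `S3_ACCESS_KEY` / `S3_SECRET_KEY` variable names are hypothetical, and the credential values shown are placeholders.

```bash
# Sketch: print a Secret manifest equivalent to the one `setup-storage` creates
# when External Secrets Operator is not available.
# render_s3_secret, S3_ACCESS_KEY and S3_SECRET_KEY are hypothetical names.
render_s3_secret() {
    kserve_ns=${1:-kserve}
    minio_ns=${2:-minio}
    cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: kserve-s3-credentials
  namespace: ${kserve_ns}
  annotations:
    serving.kserve.io/s3-endpoint: "minio.${minio_ns}.svc.cluster.local:9000"
    serving.kserve.io/s3-usehttps: "0"
    serving.kserve.io/s3-region: "us-east-1"
    serving.kserve.io/s3-useanoncredential: "false"
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "${S3_ACCESS_KEY:-}"
  AWS_SECRET_ACCESS_KEY: "${S3_SECRET_KEY:-}"
EOF
}

# Placeholder credentials for illustration only.
S3_ACCESS_KEY=example-user
S3_SECRET_KEY=example-pass
render_s3_secret kserve minio
```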

Models can be stored in MinIO buckets:

```bash
# Create a bucket for models
just minio::create-bucket models

# Upload model files to MinIO
# Then reference in InferenceService: s3://models/path/to/model
```

### Using Other Storage

KServe supports various storage backends:

- **S3**: AWS S3 or compatible services
- **GCS**: Google Cloud Storage
- **Azure**: Azure Blob Storage
- **PVC**: Kubernetes Persistent Volume Claims
- **HTTP/HTTPS**: Direct URLs
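
Which backend applies is generally determined by the scheme of `storageUri`. A minimal sketch of that mapping, assuming the commonly documented prefixes (`s3://`, `gs://`, `pvc://`, and Azure Blob URLs under `blob.core.windows.net`); these prefixes are assumptions rather than something defined in this repository, and `storage_backend` is a hypothetical helper.

```bash
# Sketch: classify a storageUri by its prefix.
# The prefix list is an assumption based on common KServe usage.
storage_backend() {
    case "$1" in
        s3://*)  echo "S3" ;;
        gs://*)  echo "GCS" ;;
        pvc://*) echo "PVC" ;;
        https://*.blob.core.windows.net/*) echo "Azure" ;;
        http://*|https://*) echo "HTTP/HTTPS" ;;
        *) echo "unknown" ;;
    esac
}

storage_backend s3://models/sklearn/iris
```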

## Supported Frameworks

The following serving runtimes are enabled by default:

- **scikit-learn**: sklearn models
- **XGBoost**: XGBoost models
- **MLServer**: Multi-framework server (sklearn, XGBoost, etc.)
- **Triton**: NVIDIA Triton Inference Server
- **TensorFlow**: TensorFlow models
- **PyTorch**: PyTorch models via TorchServe
- **Hugging Face**: Transformer models

## Advanced Configuration

### Custom Serving Runtimes

You can create custom `ClusterServingRuntime` or `ServingRuntime` resources for specialized model servers.

### Prometheus Monitoring

When monitoring is enabled, KServe controller metrics are exposed and scraped by Prometheus:

```bash
# View metrics in Grafana
# Metrics include: inference request rates, latencies, error rates
```

## Deployment Modes

### RawDeployment (Standard)

- Uses standard Kubernetes Deployments, Services, and Ingress
- No Knative dependency
- Simpler setup, more control over resources
- Manual scaling configuration required

### Serverless (Knative)

- Requires Knative Serving installation
- Auto-scaling with scale-to-zero
- Advanced traffic management
- Better resource utilization for sporadic workloads
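
The values template in this commit describes the modes as `Standard` and `Knative`, while the justfile defaults `KSERVE_DEPLOYMENT_MODE` to `RawDeployment`. A minimal sketch of normalizing the names before rendering values, assuming `RawDeployment` corresponds to `Standard` and `Serverless` to `Knative` as the headings above suggest; `normalize_mode` is a hypothetical helper.

```bash
# Sketch: map either naming of a deployment mode to the name used in
# values.gomplate.yaml. normalize_mode is a hypothetical helper.
normalize_mode() {
    case "$1" in
        RawDeployment|Standard) echo "Standard" ;;
        Serverless|Knative)     echo "Knative" ;;
        *) echo "unknown deployment mode: $1" >&2; return 1 ;;
    esac
}

normalize_mode RawDeployment
```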

## Examples

### Iris Classification with MLflow

A complete end-to-end example demonstrating model serving with KServe:

- Train an Iris classification model in JupyterHub
- Register the model to MLflow Model Registry
- Deploy the registered model with KServe InferenceService
- Test inference using v2 protocol from JupyterHub notebooks and Kubernetes Jobs

This example demonstrates:

- Converting MLflow artifact paths to KServe storageUri
- Using MLflow format runtime (with automatic dependency installation)
- Testing with both single and batch predictions
- Using v2 Open Inference Protocol

See: [`examples/kserve-mlflow-iris`](../examples/kserve-mlflow-iris/README.md)
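
Two of these steps can be sketched in plain shell. The first function approximates the conversion done by `just kserve::storage-uri` in this commit (`mlflow-artifacts:/...` becomes `s3://mlflow/...`, trimmed to the `artifacts` directory); the second builds a v2 Open Inference Protocol request body for `/v2/models/<name>/infer`. Both helper names are hypothetical, and the input name `input-0`, the `FP64` datatype, and the four-feature shape are assumptions that depend on how the model was logged.

```bash
# Sketch: convert an MLflow artifact path to a KServe storageUri.
# to_storage_uri is a hypothetical helper approximating the justfile recipe.
to_storage_uri() {
    path=$1
    case "$path" in
        mlflow-artifacts:/*) path="s3://mlflow/${path#mlflow-artifacts:/}" ;;
    esac
    # Drop anything after the artifacts directory (for example MLmodel).
    case "$path" in
        */artifacts/*) path="${path%%/artifacts/*}/artifacts" ;;
    esac
    echo "$path"
}

# Sketch: build a v2 inference request body.
# $1 is a JSON array of rows, $2 is the number of rows.
# The input name, datatype, and 4-column shape are assumptions.
v2_infer_payload() {
    printf '{"inputs": [{"name": "input-0", "shape": [%s, 4], "datatype": "FP64", "data": %s}]}\n' "$2" "$1"
}

to_storage_uri "mlflow-artifacts:/2/models/MODEL_ID/artifacts/MLmodel"
v2_infer_payload '[[6.8, 2.8, 4.8, 1.4]]' 1
```

The payload can then be posted with `curl -X POST <url>/v2/models/<name>/infer -H "Content-Type: application/json" -d "$(v2_infer_payload ...)"`.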

## Uninstallation

```bash
# Remove KServe, including its CRDs
just kserve::uninstall
```

This will:

- Uninstall the KServe Helm release
- Uninstall KServe CRDs
- Delete storage secrets
- Delete namespace

**Warning**: Uninstalling will remove all InferenceService resources.

## Troubleshooting

### Check Controller Logs

```bash
just kserve::logs
```

### View InferenceService Status

```bash
kubectl get inferenceservice -A
kubectl describe inferenceservice <name> -n <namespace>
```

### Check Predictor Pods

```bash
kubectl get pods -l serving.kserve.io/inferenceservice=<name>
kubectl logs <pod-name>
```

### Storage Issues

If models fail to download:

```bash
# Check storage initializer logs
kubectl logs <pod-name> -c storage-initializer

# Verify S3 credentials
kubectl get secret kserve-s3-credentials -n kserve -o yaml
```

## References

- [KServe Documentation](https://kserve.github.io/website/)
- [KServe GitHub](https://github.com/kserve/kserve)
- [KServe Examples](https://github.com/kserve/kserve/tree/master/docs/samples)
- [Supported ML Frameworks](https://kserve.github.io/website/latest/modelserving/v1beta1/serving_runtime/)
264
kserve/justfile
Normal file
@@ -0,0 +1,264 @@
set fallback := true

export KSERVE_NAMESPACE := env("KSERVE_NAMESPACE", "kserve")
export KSERVE_CHART_VERSION := env("KSERVE_CHART_VERSION", "v0.16.0")
export KSERVE_DEPLOYMENT_MODE := env("KSERVE_DEPLOYMENT_MODE", "RawDeployment")
export KSERVE_DOMAIN := env("KSERVE_DOMAIN", "cluster.local")
export MONITORING_ENABLED := env("MONITORING_ENABLED", "")
export PROMETHEUS_NAMESPACE := env("PROMETHEUS_NAMESPACE", "monitoring")
export MINIO_NAMESPACE := env("MINIO_NAMESPACE", "minio")
export EXTERNAL_SECRETS_NAMESPACE := env("EXTERNAL_SECRETS_NAMESPACE", "external-secrets")
export K8S_VAULT_NAMESPACE := env("K8S_VAULT_NAMESPACE", "vault")

[private]
default:
    @just --list --unsorted --list-submodules

# Create namespace
# Uses POSIX redirection because non-shebang recipes run under sh by default,
# where the bash-only `&>` would background the command instead.
create-namespace:
    @kubectl get namespace ${KSERVE_NAMESPACE} >/dev/null 2>&1 || \
    kubectl create namespace ${KSERVE_NAMESPACE}

# Delete namespace
delete-namespace:
    @kubectl delete namespace ${KSERVE_NAMESPACE} --ignore-not-found

# Install KServe CRDs
install-crds:
    #!/bin/bash
    set -euo pipefail
    echo "Installing KServe CRDs..."
    helm upgrade --cleanup-on-fail --install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd \
        --version ${KSERVE_CHART_VERSION} -n ${KSERVE_NAMESPACE} --create-namespace --wait
    echo "KServe CRDs installed successfully"

# Uninstall KServe CRDs
uninstall-crds:
    #!/bin/bash
    set -euo pipefail
    echo "Uninstalling KServe CRDs..."
    helm uninstall kserve-crd -n ${KSERVE_NAMESPACE} --ignore-not-found
    echo "KServe CRDs uninstalled"

# Setup S3 storage secret for model storage
setup-storage:
    #!/bin/bash
    set -euo pipefail
    echo "Setting up S3 storage secret for KServe..."
    just create-namespace

    if helm status external-secrets -n ${EXTERNAL_SECRETS_NAMESPACE} &>/dev/null; then
        echo "External Secrets Operator detected. Creating ExternalSecret..."
        echo "Using MinIO credentials from Vault (minio/admin)..."

        kubectl delete secret kserve-s3-credentials -n ${KSERVE_NAMESPACE} --ignore-not-found
        kubectl delete externalsecret kserve-s3-external-secret -n ${KSERVE_NAMESPACE} --ignore-not-found

        gomplate -f storage-external-secret.gomplate.yaml | kubectl apply -f -

        echo "Waiting for ExternalSecret to sync..."
        kubectl wait --for=condition=Ready externalsecret/kserve-s3-external-secret \
            -n ${KSERVE_NAMESPACE} --timeout=60s
        echo "ExternalSecret synced successfully"
    else
        echo "External Secrets not available. Creating Kubernetes Secret directly..."

        if ! kubectl get secret minio -n ${MINIO_NAMESPACE} &>/dev/null; then
            echo "Error: MinIO root credentials not found"
            echo "Please install MinIO first with 'just minio::install'"
            exit 1
        fi

        accesskey=$(kubectl get secret minio -n ${MINIO_NAMESPACE} \
            -o jsonpath='{.data.rootUser}' | base64 --decode)
        secretkey=$(kubectl get secret minio -n ${MINIO_NAMESPACE} \
            -o jsonpath='{.data.rootPassword}' | base64 --decode)

        kubectl delete secret kserve-s3-credentials -n ${KSERVE_NAMESPACE} --ignore-not-found

        kubectl create secret generic kserve-s3-credentials -n ${KSERVE_NAMESPACE} \
            --from-literal=AWS_ACCESS_KEY_ID="${accesskey}" \
            --from-literal=AWS_SECRET_ACCESS_KEY="${secretkey}"

        kubectl annotate secret kserve-s3-credentials -n ${KSERVE_NAMESPACE} \
            serving.kserve.io/s3-endpoint="minio.${MINIO_NAMESPACE}.svc.cluster.local:9000" \
            serving.kserve.io/s3-usehttps="0" \
            serving.kserve.io/s3-region="us-east-1" \
            serving.kserve.io/s3-useanoncredential="false" \
            --overwrite
        echo "Kubernetes Secret created"

        if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null; then
            just vault::put kserve/storage accesskey="${accesskey}" secretkey="${secretkey}"
            echo "Storage credentials also stored in Vault for backup"
        fi
    fi

    echo "S3 storage secret created successfully"

# Delete storage secret
delete-storage:
    @kubectl delete secret kserve-s3-credentials -n ${KSERVE_NAMESPACE} --ignore-not-found
    @kubectl delete externalsecret kserve-s3-external-secret -n ${KSERVE_NAMESPACE} --ignore-not-found

# Install KServe
install:
    #!/bin/bash
    set -euo pipefail
    echo "Installing KServe..."
    just create-namespace

    # Check cert-manager prerequisite
    if ! kubectl get namespace cert-manager &>/dev/null; then
        echo "Error: cert-manager is not installed"
        echo "Please install cert-manager first with 'just cert-manager::install'"
        exit 1
    fi

    echo "Waiting for cert-manager webhook to be ready..."
    kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=webhook \
        -n cert-manager --timeout=300s
    echo "cert-manager webhook is ready"

    if helm status kube-prometheus-stack -n ${PROMETHEUS_NAMESPACE} &>/dev/null; then
        if [ -z "${MONITORING_ENABLED}" ]; then
            if gum confirm "Enable Prometheus monitoring (ServiceMonitor)?"; then
                MONITORING_ENABLED="true"
            else
                MONITORING_ENABLED="false"
            fi
        fi
    else
        MONITORING_ENABLED="false"
    fi

    just install-crds

    if kubectl get service minio -n ${MINIO_NAMESPACE} &>/dev/null; then
        echo "MinIO detected. Setting up S3 storage..."
        just setup-storage
    else
        echo "MinIO not found. Skipping S3 storage setup."
        echo "Models will need to use other storage options."
    fi

    echo "Generating Helm values..."
    gomplate -f values.gomplate.yaml -o values.yaml

    echo "Installing KServe controller..."
    helm upgrade --cleanup-on-fail --install kserve \
        oci://ghcr.io/kserve/charts/kserve --version ${KSERVE_CHART_VERSION} \
        -n ${KSERVE_NAMESPACE} --wait --timeout=10m -f values.yaml

    if [ "${MONITORING_ENABLED}" = "true" ]; then
        echo "Enabling Prometheus monitoring for namespace ${KSERVE_NAMESPACE}..."
        kubectl label namespace ${KSERVE_NAMESPACE} buun.channel/enable-monitoring=true --overwrite
        echo "✓ Monitoring enabled"
    fi

    echo ""
    echo "=== KServe installed ==="
    echo "Namespace: ${KSERVE_NAMESPACE}"
    echo "Deployment mode: ${KSERVE_DEPLOYMENT_MODE}"
    echo "Domain: ${KSERVE_DOMAIN}"
    echo ""
    echo "To deploy an inference service, create an InferenceService resource"
    echo "See: https://kserve.github.io/website/latest/get_started/first_isvc/"

# Upgrade KServe
upgrade:
    #!/bin/bash
    set -euo pipefail
    echo "Upgrading KServe..."

    if helm status kube-prometheus-stack -n ${PROMETHEUS_NAMESPACE} &>/dev/null; then
        if [ -z "${MONITORING_ENABLED}" ]; then
            if gum confirm "Enable Prometheus monitoring (ServiceMonitor)?"; then
                MONITORING_ENABLED="true"
            else
                MONITORING_ENABLED="false"
            fi
        fi
    else
        MONITORING_ENABLED="false"
    fi

    echo "Upgrading KServe CRDs..."
    just install-crds

    echo "Generating Helm values..."
    gomplate -f values.gomplate.yaml -o values.yaml

    echo "Upgrading KServe controller..."
    helm upgrade kserve oci://ghcr.io/kserve/charts/kserve \
        --version ${KSERVE_CHART_VERSION} -n ${KSERVE_NAMESPACE} --wait --timeout=10m \
        -f values.yaml

    echo "KServe upgraded successfully"

# Uninstall KServe
uninstall:
    #!/bin/bash
    set -euo pipefail
    echo "Uninstalling KServe..."
    helm uninstall kserve -n ${KSERVE_NAMESPACE} --ignore-not-found
    just uninstall-crds
    just delete-storage
    just delete-namespace
    echo "KServe uninstalled"

# Get KServe controller logs
logs:
    @kubectl logs -n ${KSERVE_NAMESPACE} -l control-plane=kserve-controller-manager --tail=100 -f

# Get status of KServe components
status:
    #!/bin/bash
    set -euo pipefail
    echo "=== KServe Components Status ==="
    echo ""
    echo "Namespace: ${KSERVE_NAMESPACE}"
    echo ""
    echo "Pods:"
    kubectl get pods -n ${KSERVE_NAMESPACE}
    echo ""
    echo "Services:"
    kubectl get services -n ${KSERVE_NAMESPACE}
    echo ""
    echo "InferenceServices:"
    kubectl get inferenceservices -A

# Convert MLflow artifact path to KServe storageUri
storage-uri artifact_path='':
    #!/bin/bash
    set -euo pipefail

    if [ -z "{{ artifact_path }}" ]; then
        # -r keeps backslashes in the entered path literal
        read -r -p "Enter MLflow artifact path from Model Registry (e.g., mlflow-artifacts:/2/models/MODEL_ID/artifacts): " artifact_path
    else
        artifact_path="{{ artifact_path }}"
    fi

    # Convert mlflow-artifacts:/ to s3://mlflow/
    storage_uri="${artifact_path/mlflow-artifacts:/s3://mlflow}"

    # Remove trailing filename if present (e.g., MLmodel, model.pkl)
    if [[ "$storage_uri" == */artifacts/* ]] && [[ "$storage_uri" != */artifacts ]]; then
        # Remove filename after /artifacts/
        storage_uri=$(echo "$storage_uri" | sed 's|/artifacts/.*|/artifacts|')
    fi

    # Check if this is a run-based path (not model registry path)
    if [[ "$storage_uri" =~ s3://mlflow/[0-9]+/[a-f0-9]{32}/artifacts ]]; then
        echo "Warning: This appears to be a run-based path, not a model registry path."
        echo "KServe requires the model registry path which can be found in:"
        echo "  MLflow UI → Models → [Model Name] → [Version] → artifact_path"
        echo ""
        echo "Expected format: mlflow-artifacts:/EXPERIMENT_ID/models/MODEL_ID/artifacts"
        echo "Your input: $artifact_path"
        echo ""
        echo "Output (may not work): $storage_uri"
        exit 1
    fi

    echo "$storage_uri"
33
kserve/storage-external-secret.gomplate.yaml
Normal file
@@ -0,0 +1,33 @@
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: kserve-s3-external-secret
  namespace: {{ .Env.KSERVE_NAMESPACE }}
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-secret-store
    kind: ClusterSecretStore
  target:
    name: kserve-s3-credentials
    creationPolicy: Owner
    template:
      type: Opaque
      metadata:
        annotations:
          serving.kserve.io/s3-endpoint: "minio.{{ .Env.MINIO_NAMESPACE }}.svc.cluster.local:9000"
          serving.kserve.io/s3-usehttps: "0"
          serving.kserve.io/s3-region: "us-east-1"
          serving.kserve.io/s3-useanoncredential: "false"
      data:
        AWS_ACCESS_KEY_ID: "{{ `{{ .accesskey }}` }}"
        AWS_SECRET_ACCESS_KEY: "{{ `{{ .secretkey }}` }}"
  data:
    - secretKey: accesskey
      remoteRef:
        key: minio/admin
        property: username
    - secretKey: secretkey
      remoteRef:
        key: minio/admin
        property: password
84
kserve/values.gomplate.yaml
Normal file
@@ -0,0 +1,84 @@
# KServe Helm Chart Values
# Generated using gomplate

kserve:
  version: v0.16.0

  controller:
    # Deployment mode: "Standard" for RawDeployment (no Knative), "Knative" for Serverless
    deploymentMode: {{ .Env.KSERVE_DEPLOYMENT_MODE }}

    gateway:
      domain: {{ .Env.KSERVE_DOMAIN }}
      # Match both names so the justfile default ("RawDeployment") also selects Traefik
      {{- if or (eq .Env.KSERVE_DEPLOYMENT_MODE "Standard") (eq .Env.KSERVE_DEPLOYMENT_MODE "RawDeployment") }}
      ingressGateway:
        className: traefik
      {{- end }}

    # Enable Prometheus metrics
    {{- if eq .Env.MONITORING_ENABLED "true" }}
    metrics:
      port: 8080
    podAnnotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "8080"
      prometheus.io/path: "/metrics"
    {{- end }}

  # Storage initializer configuration
  storage:
    s3:
      enabled: true
      {{- if ne .Env.MINIO_NAMESPACE "" }}
      endpoint: "minio.{{ .Env.MINIO_NAMESPACE }}.svc.cluster.local:9000"
      useHttps: false
      region: "us-east-1"
      verifySSL: false
      useVirtualBucket: false
      useAnonymousCredential: false
      {{- end }}
    storageInitializer:
      resources:
        requests:
          memory: "100Mi"
          cpu: "100m"
        limits:
          memory: "1Gi"
          cpu: "1"

  # Model agent configuration
  agent:
    image: kserve/agent
    tag: v0.16.0

  # Router configuration
  router:
    image: kserve/router
    tag: v0.16.0

  # Serving runtimes - enable commonly used ones
  servingRuntimes:
    sklearn:
      enabled: true
    xgboost:
      enabled: true
    mlserver:
      enabled: true
    triton:
      enabled: true
    tensorflow:
      enabled: true
    pytorch:
      enabled: true
    huggingfaceserver:
      enabled: true

{{- if eq .Env.MONITORING_ENABLED "true" }}
# ServiceMonitor for Prometheus metrics collection
serviceMonitor:
  enabled: true
  namespace: {{ .Env.KSERVE_NAMESPACE }}
  labels:
    release: kube-prometheus-stack
  interval: 30s
{{- end }}