LiteLLM
Unified LLM gateway and proxy for accessing multiple LLM providers through a single OpenAI-compatible API:
- Multi-Provider Support: Anthropic, OpenAI, Ollama, Mistral, Groq, Cohere, Azure, Bedrock, Vertex AI
- OpenAI-Compatible API: Drop-in replacement for OpenAI SDK
- Load Balancing & Fallback: Automatic failover between providers
- Virtual Keys: Generate API keys for users with usage tracking
- Cost Tracking: Monitor spending across providers
- Rate Limiting: Control usage per key/user
Prerequisites
- Kubernetes cluster (k3s)
- External Secrets Operator (required)
- PostgreSQL cluster (CloudNativePG)
- Vault for secrets management
Configuration Overview
LiteLLM requires two types of configuration:
- Environment variables (.env.local): Host, namespace, chart version
- Model definitions (models.yaml): LLM providers and models to expose
This separation allows flexible model configuration without modifying environment files.
Installation
Step 1: Create Model Configuration
Copy the example configuration and customize:
cp litellm/models.example.yaml litellm/models.yaml
Edit litellm/models.yaml to configure your models:
# Anthropic Claude
- model_name: claude-sonnet
  litellm_params:
    model: anthropic/claude-3-7-sonnet-latest
    api_key: os.environ/ANTHROPIC_API_KEY

# OpenAI
- model_name: gpt-4o
  litellm_params:
    model: openai/gpt-4o
    api_key: os.environ/OPENAI_API_KEY

# Ollama (local models - no API key required)
- model_name: llama3
  litellm_params:
    model: ollama/llama3.2
    api_base: http://ollama.ollama:11434
Step 2: Set API Keys
For each provider that requires an API key:
just litellm::set-api-key anthropic
just litellm::set-api-key openai
Or interactively select the provider:
just litellm::set-api-key
API keys are stored in Vault and synced to Kubernetes via External Secrets Operator.
Step 3: Install LiteLLM
just litellm::install
You will be prompted for:
- LiteLLM host (FQDN): e.g., litellm.example.com
- Enable Prometheus monitoring: if kube-prometheus-stack is installed
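Once the install completes and the pods are ready, you can sanity-check the proxy from outside the cluster. A minimal sketch in Python, assuming the litellm.example.com hostname from the prompt and LiteLLM's standard unauthenticated health probes:

# Smoke test for a fresh install; the hostname is a placeholder for your LITELLM_HOST.
import requests

BASE_URL = "https://litellm.example.com"

# LiteLLM serves liveness/readiness probes without authentication.
for path in ("/health/liveliness", "/health/readiness"):
    resp = requests.get(f"{BASE_URL}{path}", timeout=10)
    print(path, resp.status_code, resp.text[:200])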
Model Management
Add a Model Interactively
just litellm::add-model
This guides you through:
- Selecting a provider
- Choosing a model
- Setting a model alias
Remove a Model
just litellm::remove-model
List Configured Models
just litellm::list-models
Example Output
Configured models:
- claude-sonnet: anthropic/claude-3-7-sonnet-latest
- claude-haiku: anthropic/claude-3-5-haiku-latest
- llama3: ollama/llama3.2
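The same aliases should also be visible through the OpenAI-compatible API once the proxy is running; a quick check with the OpenAI SDK (hostname and key are placeholders):

from openai import OpenAI

client = OpenAI(
    base_url="https://litellm.example.com",
    api_key="sk-...",  # master key or a virtual key
)

# The proxy's /models endpoint returns the aliases defined in models.yaml
for model in client.models.list():
    print(model.id)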
API Key Management
Set API Key for a Provider
just litellm::set-api-key anthropic
Get API Key (from Vault)
just litellm::get-api-key anthropic
Verify All Required Keys
just litellm::verify-api-keys
Environment Variables
| Variable | Default | Description |
|---|---|---|
| LITELLM_NAMESPACE | litellm | Kubernetes namespace |
| LITELLM_CHART_VERSION | 0.1.825 | Helm chart version |
| LITELLM_HOST | (prompt) | External hostname (FQDN) |
| OLLAMA_NAMESPACE | ollama | Ollama namespace for local models |
| MONITORING_ENABLED | (prompt) | Enable Prometheus ServiceMonitor |
Authentication
LiteLLM has two types of authentication:
- API Access: Uses Master Key or Virtual Keys for programmatic access
- Admin UI: Uses Keycloak SSO for browser-based access
Enable SSO for Admin UI
After installing LiteLLM, enable Keycloak authentication for the Admin UI:
just litellm::setup-oidc
This will:
- Create a Keycloak client for LiteLLM
- Store the client secret in Vault
- Configure LiteLLM with OIDC environment variables
- Upgrade the deployment with SSO enabled
Disable SSO
To disable SSO and return to unauthenticated Admin UI access:
just litellm::disable-oidc
SSO Configuration Details
| Setting | Value |
|---|---|
| Callback URL | https://<litellm-host>/sso/callback |
| Authorization Endpoint | https://<keycloak-host>/realms/<realm>/protocol/openid-connect/auth |
| Token Endpoint | https://<keycloak-host>/realms/<realm>/protocol/openid-connect/token |
| Userinfo Endpoint | https://<keycloak-host>/realms/<realm>/protocol/openid-connect/userinfo |
| Scope | openid email profile |
User Management
SSO users are automatically created in LiteLLM when they first log in. By default, new users are assigned the internal_user_viewer role (read-only access).
List Users
just litellm::list-users
Assign Role to User
Interactively select user and role:
just litellm::assign-role
Or specify directly:
just litellm::assign-role buun proxy_admin
User Roles
| Role | Description |
|---|---|
| proxy_admin | Full admin access (manage keys, users, models, settings) |
| proxy_admin_viewer | Admin read-only access |
| internal_user | Can create and manage own API keys |
| internal_user_viewer | Read-only access (default for SSO users) |
Note: To manage API keys in the Admin UI, users need at least the internal_user or proxy_admin role.
API Usage
LiteLLM exposes an OpenAI-compatible API at https://your-litellm-host/.
Get Master Key
just litellm::master-key
Generate Virtual Key for a User
just litellm::generate-virtual-key buun
This will prompt for a model selection and generate an API key for the specified user. Select all to grant access to all models.
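The recipe wraps LiteLLM's key management API. If you would rather script key creation directly, a sketch against the documented /key/generate endpoint (all field values below are examples):

import requests

BASE_URL = "https://litellm.example.com"
MASTER_KEY = "sk-..."  # from just litellm::master-key

resp = requests.post(
    f"{BASE_URL}/key/generate",
    headers={"Authorization": f"Bearer {MASTER_KEY}"},
    json={
        "user_id": "buun",
        "models": ["claude-sonnet", "llama3"],  # omit to allow all models
        "duration": "30d",  # optional expiry
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["key"])  # the new virtual key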
OpenAI SDK Example
from openai import OpenAI

client = OpenAI(
    base_url="https://litellm.example.com",
    api_key="sk-...",  # virtual key or master key
)

response = client.chat.completions.create(
    model="claude-sonnet",  # use your model alias
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
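Streaming works the same way as with the upstream OpenAI API; the proxy relays chunks from whichever provider backs the alias:

from openai import OpenAI

client = OpenAI(base_url="https://litellm.example.com", api_key="sk-...")

stream = client.chat.completions.create(
    model="claude-sonnet",
    messages=[{"role": "user", "content": "Write a haiku about proxies."}],
    stream=True,
)
for chunk in stream:
    # Some chunks carry no content delta (e.g., the final usage chunk)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()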
curl Example
curl https://litellm.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Team Management
Teams allow you to group users and configure team-specific settings such as Langfuse projects for observability.
Create a Team
just litellm::create-team
Or with a name directly:
just litellm::create-team name="project-alpha"
List Teams
just litellm::list-teams
Get Team Info
just litellm::get-team team_id=<team-id>
Delete a Team
just litellm::delete-team team_id=<team-id>
Generate Virtual Key for a Team
just litellm::generate-team-key
This will prompt for team selection and username. The generated key inherits the team's settings (including Langfuse project configuration).
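For automation, the same documented /key/generate endpoint shown earlier accepts a team_id; a sketch with placeholder values:

import requests

resp = requests.post(
    "https://litellm.example.com/key/generate",
    headers={"Authorization": "Bearer sk-..."},  # master key
    json={
        "team_id": "<team-id>",  # from just litellm::list-teams
        "user_id": "buun",
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["key"])  # inherits the team's settings, including Langfuse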
Langfuse Integration
Langfuse provides LLM observability with tracing, monitoring, and analytics. LiteLLM can send traces to Langfuse for every API call.
Enable Langfuse Integration
During installation (just litellm::install) or upgrade (just litellm::upgrade), you will be prompted to enable Langfuse integration. Alternatively:
just litellm::setup-langfuse
You will need Langfuse API keys (Public Key and Secret Key) from the Langfuse UI: Settings > API Keys.
Set Langfuse API Keys
just litellm::set-langfuse-keys
Disable Langfuse Integration
just litellm::disable-langfuse
Per-Team Langfuse Projects
Each team can have its own Langfuse project for isolated observability. This is useful when different projects or departments need separate trace data.
Setup Flow
- Create a team: just litellm::create-team name="project-alpha"
- Create a Langfuse project for the team and get API keys from the Langfuse UI
- Configure the team's Langfuse project: just litellm::set-team-langfuse-project (prompts for team selection and Langfuse API keys)
- Generate a key for the team: just litellm::generate-team-key
- Use the team key for API calls; traces will be sent to the team's Langfuse project
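With the team key in place, ordinary API calls are traced into that team's Langfuse project. As a sketch, you can also attach optional per-request metadata that LiteLLM forwards to Langfuse (field names follow LiteLLM's Langfuse metadata passthrough; the key and tag values here are placeholders):

from openai import OpenAI

# A team-scoped virtual key; traces land in the team's Langfuse project.
client = OpenAI(base_url="https://litellm.example.com", api_key="sk-...")

response = client.chat.completions.create(
    model="claude-sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    # Optional Langfuse metadata forwarded by the proxy
    extra_body={"metadata": {"generation_name": "smoke-test", "tags": ["project-alpha"]}},
)
print(response.choices[0].message.content)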
Architecture
LiteLLM Proxy
|
+-- Default Langfuse Project (for keys without team)
|
+-- Team A --> Langfuse Project A
|
+-- Team B --> Langfuse Project B
Environment Variables
| Variable | Default | Description |
|---|---|---|
| LITELLM_LANGFUSE_INTEGRATION_ENABLED | (prompt) | Enable Langfuse integration |
| LANGFUSE_HOST | (prompt) | Langfuse instance hostname |
Supported Providers
| Provider | Model Prefix | API Key Required |
|---|---|---|
| Anthropic | anthropic/ | Yes |
| OpenAI | openai/ | Yes |
| Ollama | ollama/ | No (uses api_base) |
| Mistral | mistral/ | Yes |
| Groq | groq/ | Yes |
| Cohere | cohere/ | Yes |
| Azure OpenAI | azure/ | Yes |
| AWS Bedrock | bedrock/ | Yes |
| Google Vertex AI | vertex_ai/ | Yes |
Architecture
External Users/Applications
|
Cloudflare Tunnel (HTTPS)
|
Traefik Ingress (HTTPS)
|
LiteLLM Proxy (HTTP inside cluster)
|-- PostgreSQL (usage tracking, virtual keys)
|-- Redis (caching, rate limiting)
|-- External Secrets (API keys from Vault)
|
+-- Anthropic API
+-- OpenAI API
+-- Ollama (local)
+-- Other providers...
Upgrade
After modifying models.yaml or updating API keys:
just litellm::upgrade
Uninstall
just litellm::uninstall
This removes:
- Helm release and all Kubernetes resources
- Namespace
- External Secrets
Note: The following resources are NOT deleted:
- PostgreSQL database (use just postgres::delete-db litellm to remove it)
- API keys in Vault
Full Cleanup
To remove everything including database and Vault secrets:
just litellm::cleanup
Troubleshooting
Check Pod Status
kubectl get pods -n litellm
Expected pods:
- litellm-*: LiteLLM proxy
- litellm-redis-master-0: Redis instance
View Logs
kubectl logs -n litellm deployment/litellm --tail=100
API Key Not Working
Verify the ExternalSecret is synced:
kubectl get externalsecret -n litellm
kubectl get secret apikey -n litellm -o yaml
Model Not Found
Ensure the model is configured in models.yaml and the deployment is updated:
just litellm::list-models
just litellm::upgrade
Provider API Errors
Check if the API key is set correctly:
just litellm::get-api-key anthropic
If empty, set the API key:
just litellm::set-api-key anthropic
Database Connection Issues
Check PostgreSQL connectivity:
kubectl exec -n litellm deployment/litellm -- \
psql -h postgres-cluster-rw.postgres -U litellm -d litellm -c "SELECT 1"
Configuration Files
| File | Description |
|---|---|
| models.yaml | Model definitions (user-created, gitignored) |
| models.example.yaml | Example model configuration |
| litellm-values.gomplate.yaml | Helm values template |
| apikey-external-secret.gomplate.yaml | ExternalSecret for API keys |
| keycloak-auth-external-secret.gomplate.yaml | ExternalSecret for Keycloak OIDC |
| langfuse-auth-external-secret.gomplate.yaml | ExternalSecret for Langfuse API keys |
Security Considerations
- Pod Security Standards: Namespace configured with baseline enforcement (LiteLLM's Prisma client requires write access to /.cache, which prevents the restricted level)
- Secrets Management: API keys stored in Vault, synced via External Secrets Operator
- Virtual Keys: Generate scoped API keys for users instead of sharing master key
- TLS/HTTPS: All external traffic encrypted via Traefik Ingress
- Database Credentials: Unique PostgreSQL user with minimal privileges