# LiteLLM

Unified LLM gateway and proxy for accessing multiple LLM providers through a single OpenAI-compatible API:

- **Multi-Provider Support**: Anthropic, OpenAI, Ollama, Mistral, Groq, Cohere, Azure, Bedrock, Vertex AI
- **OpenAI-Compatible API**: Drop-in replacement for the OpenAI SDK
- **Load Balancing & Fallback**: Automatic failover between providers
- **Virtual Keys**: Generate API keys for users with usage tracking
- **Cost Tracking**: Monitor spending across providers
- **Rate Limiting**: Control usage per key/user
## Prerequisites

- Kubernetes cluster (k3s)
- External Secrets Operator (required)
- PostgreSQL cluster (CloudNativePG)
- Vault for secrets management
## Configuration Overview

LiteLLM requires two types of configuration:

1. **Environment variables** (`.env.local`): Host, namespace, chart version
2. **Model definitions** (`models.yaml`): LLM providers and models to expose

This separation allows flexible model configuration without modifying environment files.
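
To make the split concrete, a minimal `.env.local` might carry only deployment-level settings (variable names from the Environment Variables table below; values illustrative), while every model definition lives in `models.yaml`:

```bash
# .env.local: deployment settings only; model definitions never live here
LITELLM_NAMESPACE=litellm
LITELLM_CHART_VERSION=0.1.825
LITELLM_HOST=litellm.example.com
OLLAMA_NAMESPACE=ollama
```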
## Installation

### Step 1: Create Model Configuration

Copy the example configuration and customize:

```bash
cp litellm/models.example.yaml litellm/models.yaml
```

Edit `litellm/models.yaml` to configure your models:

```yaml
# Anthropic Claude
- model_name: claude-sonnet
  litellm_params:
    model: anthropic/claude-3-7-sonnet-latest
    api_key: os.environ/ANTHROPIC_API_KEY

# OpenAI
- model_name: gpt-4o
  litellm_params:
    model: openai/gpt-4o
    api_key: os.environ/OPENAI_API_KEY

# Ollama (local models - no API key required)
- model_name: llama3
  litellm_params:
    model: ollama/llama3.2
    api_base: http://ollama.ollama:11434
```

### Step 2: Set API Keys

For each provider that requires an API key:

```bash
just litellm::set-api-key anthropic
just litellm::set-api-key openai
```

Or interactively select the provider:

```bash
just litellm::set-api-key
```

API keys are stored in Vault and synced to Kubernetes via External Secrets Operator.
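
For orientation, the synced resource looks roughly like the following sketch; the real manifest is rendered from `apikey-external-secret.gomplate.yaml`, and the store name and Vault path shown here are assumptions:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: apikey
  namespace: litellm
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # assumed SecretStore name
    kind: ClusterSecretStore
  target:
    name: apikey                 # Kubernetes Secret the LiteLLM pod consumes
  data:
    - secretKey: ANTHROPIC_API_KEY
      remoteRef:
        key: litellm/api-keys    # assumed Vault path
        property: anthropic
```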

### Step 3: Install LiteLLM

```bash
just litellm::install
```

You will be prompted for:

- **LiteLLM host (FQDN)**: e.g., `litellm.example.com`
- **Enable Prometheus monitoring**: If kube-prometheus-stack is installed

## Model Management

### Add a Model Interactively

```bash
just litellm::add-model
```

This guides you through:

1. Selecting a provider
2. Choosing a model
3. Setting a model alias

### Remove a Model

```bash
just litellm::remove-model
```

### List Configured Models

```bash
just litellm::list-models
```

### Example Output

```text
Configured models:
- claude-sonnet: anthropic/claude-3-7-sonnet-latest
- claude-haiku: anthropic/claude-3-5-haiku-latest
- llama3: ollama/llama3.2
```

## API Key Management

### Set API Key for a Provider

```bash
just litellm::set-api-key anthropic
```

### Get API Key (from Vault)

```bash
just litellm::get-api-key anthropic
```

### Verify All Required Keys

```bash
just litellm::verify-api-keys
```

## Environment Variables

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `LITELLM_NAMESPACE` | `litellm` | Kubernetes namespace |
| `LITELLM_CHART_VERSION` | `0.1.825` | Helm chart version |
| `LITELLM_HOST` | (prompt) | External hostname (FQDN) |
| `OLLAMA_NAMESPACE` | `ollama` | Ollama namespace for local models |
| `MONITORING_ENABLED` | (prompt) | Enable Prometheus ServiceMonitor |

## Authentication

LiteLLM has two types of authentication:

1. **API Access**: Uses the master key or virtual keys for programmatic access
2. **Admin UI**: Uses Keycloak SSO for browser-based access

### Enable SSO for Admin UI

After installing LiteLLM, enable Keycloak authentication for the Admin UI:

```bash
just litellm::setup-oidc
```

This will:

- Create a Keycloak client for LiteLLM
- Store the client secret in Vault
- Configure LiteLLM with OIDC environment variables
- Upgrade the deployment with SSO enabled

### Disable SSO

To disable SSO and return to unauthenticated Admin UI access:

```bash
just litellm::disable-oidc
```

### SSO Configuration Details

| Setting | Value |
| ------- | ----- |
| Callback URL | `https://<litellm-host>/sso/callback` |
| Authorization Endpoint | `https://<keycloak-host>/realms/<realm>/protocol/openid-connect/auth` |
| Token Endpoint | `https://<keycloak-host>/realms/<realm>/protocol/openid-connect/token` |
| Userinfo Endpoint | `https://<keycloak-host>/realms/<realm>/protocol/openid-connect/userinfo` |
| Scope | `openid email profile` |
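
These settings typically surface on the deployment as LiteLLM's generic SSO environment variables; a hedged sketch (variable names per LiteLLM's generic OIDC support, values illustrative):

```bash
GENERIC_CLIENT_ID=litellm                  # Keycloak client created by setup-oidc
GENERIC_CLIENT_SECRET=...                  # synced from Vault via External Secrets
GENERIC_AUTHORIZATION_ENDPOINT=https://keycloak.example.com/realms/main/protocol/openid-connect/auth
GENERIC_TOKEN_ENDPOINT=https://keycloak.example.com/realms/main/protocol/openid-connect/token
GENERIC_USERINFO_ENDPOINT=https://keycloak.example.com/realms/main/protocol/openid-connect/userinfo
GENERIC_SCOPE="openid email profile"
PROXY_BASE_URL=https://litellm.example.com # used to derive the /sso/callback URL
```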

## User Management

SSO users are automatically created in LiteLLM when they first log in. By default, new users are assigned the `internal_user_viewer` role (read-only access).

### List Users

```bash
just litellm::list-users
```

### Assign Role to User

Interactively select the user and role:

```bash
just litellm::assign-role
```

Or specify them directly:

```bash
just litellm::assign-role buun proxy_admin
```

### User Roles

| Role | Description |
| ---- | ----------- |
| `proxy_admin` | Full admin access (manage keys, users, models, settings) |
| `proxy_admin_viewer` | Admin read-only access |
| `internal_user` | Can create and manage own API keys |
| `internal_user_viewer` | Read-only access (default for SSO users) |

**Note**: To manage API keys in the Admin UI, users need at least the `internal_user` or `proxy_admin` role.
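
The `assign-role` recipe presumably wraps the proxy's user-update endpoint; an equivalent direct call might look like this (endpoint and fields per LiteLLM's user management API; treat as a sketch):

```bash
curl -X POST https://litellm.example.com/user/update \
  -H "Authorization: Bearer $MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "buun", "user_role": "proxy_admin"}'
```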

## API Usage

LiteLLM exposes an OpenAI-compatible API at `https://your-litellm-host/`.

### Get Master Key

```bash
just litellm::master-key
```

### Generate Virtual Key for a User

```bash
just litellm::generate-virtual-key buun
```

This will prompt for a model selection and generate an API key for the specified user. Select `all` to grant access to all models.
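
Under the hood this corresponds to LiteLLM's key-generation endpoint; a hedged equivalent using the master key directly (values illustrative):

```bash
curl -X POST https://litellm.example.com/key/generate \
  -H "Authorization: Bearer $MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "buun", "models": ["claude-sonnet"]}'
```

The response returns the new virtual key (`sk-...`) in its `key` field.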

### OpenAI SDK Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://litellm.example.com",
    api_key="sk-...",  # Virtual key or master key
)

response = client.chat.completions.create(
    model="claude-sonnet",  # Use your model alias
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

### curl Example

```bash
curl https://litellm.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

## Team Management

Teams allow you to group users and configure team-specific settings such as Langfuse projects for observability.

### Create a Team

```bash
just litellm::create-team
```

Or with a name directly:

```bash
just litellm::create-team name="project-alpha"
```

### List Teams

```bash
just litellm::list-teams
```

### Get Team Info

```bash
just litellm::get-team team_id=<team-id>
```

### Delete a Team

```bash
just litellm::delete-team team_id=<team-id>
```

### Generate Virtual Key for a Team

```bash
just litellm::generate-team-key
```

This will prompt for team selection and username. The generated key inherits the team's settings (including Langfuse project configuration).

## Langfuse Integration

[Langfuse](https://langfuse.com/) provides LLM observability with tracing, monitoring, and analytics. LiteLLM can send traces to Langfuse for every API call.
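
Mechanically this is LiteLLM's logging-callback mechanism: the proxy enables the `langfuse` success callback and reads `LANGFUSE_HOST`, `LANGFUSE_PUBLIC_KEY`, and `LANGFUSE_SECRET_KEY` from its environment (here synced from Vault). A minimal sketch of the relevant proxy setting, rendered by the Helm values template rather than edited by hand:

```yaml
litellm_settings:
  success_callback: ["langfuse"]   # emit a Langfuse trace for every successful call
```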

### Enable Langfuse Integration

During installation (`just litellm::install`) or upgrade (`just litellm::upgrade`), you will be prompted to enable Langfuse integration. Alternatively:

```bash
just litellm::setup-langfuse
```

You will need Langfuse API keys (Public Key and Secret Key) from the Langfuse UI: **Settings > API Keys**.

### Set Langfuse API Keys

```bash
just litellm::set-langfuse-keys
```

### Disable Langfuse Integration

```bash
just litellm::disable-langfuse
```

### Per-Team Langfuse Projects

Each team can have its own Langfuse project for isolated observability. This is useful when different projects or departments need separate trace data.

#### Setup Flow

1. Create a team:

   ```bash
   just litellm::create-team name="project-alpha"
   ```

2. Create a Langfuse project for the team and get API keys from the Langfuse UI.

3. Configure the team's Langfuse project (see the API sketch after this list):

   ```bash
   just litellm::set-team-langfuse-project
   ```

   This will prompt for team selection and Langfuse API keys.

4. Generate a key for the team:

   ```bash
   just litellm::generate-team-key
   ```

5. Use the team key for API calls; traces will be sent to the team's Langfuse project.
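
For reference, the per-team configuration in step 3 maps onto LiteLLM's team-based logging API; a hedged sketch of the equivalent direct call (endpoint and field names per LiteLLM's team logging docs, values illustrative):

```bash
curl -X POST "https://litellm.example.com/team/<team-id>/callback" \
  -H "Authorization: Bearer $MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "callback_name": "langfuse",
    "callback_type": "success",
    "callback_vars": {
      "langfuse_public_key": "pk-lf-...",
      "langfuse_secret_key": "sk-lf-...",
      "langfuse_host": "https://langfuse.example.com"
    }
  }'
```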

#### Architecture

```plain
LiteLLM Proxy
  |
  +-- Default Langfuse Project (for keys without team)
  |
  +-- Team A --> Langfuse Project A
  |
  +-- Team B --> Langfuse Project B
```

### Environment Variables

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `LITELLM_LANGFUSE_INTEGRATION_ENABLED` | (prompt) | Enable Langfuse integration |
| `LANGFUSE_HOST` | (prompt) | Langfuse instance hostname |

## Supported Providers

| Provider | Model Prefix | API Key Required |
| -------- | ------------ | ---------------- |
| Anthropic | `anthropic/` | Yes |
| OpenAI | `openai/` | Yes |
| Ollama | `ollama/` | No (uses `api_base`) |
| Mistral | `mistral/` | Yes |
| Groq | `groq/` | Yes |
| Cohere | `cohere/` | Yes |
| Azure OpenAI | `azure/` | Yes |
| AWS Bedrock | `bedrock/` | Yes |
| Google Vertex AI | `vertexai/` | Yes |
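
For instance, adding a Mistral model to `models.yaml` follows the same pattern as the earlier examples (the model name is illustrative):

```yaml
- model_name: mistral-large
  litellm_params:
    model: mistral/mistral-large-latest
    api_key: os.environ/MISTRAL_API_KEY
```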

## Architecture

```plain
External Users/Applications
        |
        v
Cloudflare Tunnel (HTTPS)
        |
        v
Traefik Ingress (HTTPS)
        |
        v
LiteLLM Proxy (HTTP inside cluster)
  |-- PostgreSQL (usage tracking, virtual keys)
  |-- Redis (caching, rate limiting)
  |-- External Secrets (API keys from Vault)
  |
  +-- Anthropic API
  +-- OpenAI API
  +-- Ollama (local)
  +-- Other providers...
```

## Upgrade

After modifying `models.yaml` or updating API keys:

```bash
just litellm::upgrade
```

## Uninstall

```bash
just litellm::uninstall
```

This removes:

- Helm release and all Kubernetes resources
- Namespace
- External Secrets

**Note**: The following resources are NOT deleted:

- PostgreSQL database (use `just postgres::delete-db litellm`)
- API keys in Vault

### Full Cleanup

To remove everything including database and Vault secrets:

```bash
just litellm::cleanup
```

## Troubleshooting

### Check Pod Status

```bash
kubectl get pods -n litellm
```

Expected pods:

- `litellm-*` - LiteLLM proxy
- `litellm-redis-master-0` - Redis instance

### View Logs

```bash
kubectl logs -n litellm deployment/litellm --tail=100
```

### API Key Not Working

Verify the ExternalSecret is synced:

```bash
kubectl get externalsecret -n litellm
kubectl get secret apikey -n litellm -o yaml
```

### Model Not Found

Ensure the model is configured in `models.yaml` and the deployment is updated:

```bash
just litellm::list-models
just litellm::upgrade
```

### Provider API Errors

Check if the API key is set correctly:

```bash
just litellm::get-api-key anthropic
```

If empty, set the API key:

```bash
just litellm::set-api-key anthropic
```

### Database Connection Issues

Check PostgreSQL connectivity:

```bash
kubectl exec -n litellm deployment/litellm -- \
  psql -h postgres-cluster-rw.postgres -U litellm -d litellm -c "SELECT 1"
```

## Configuration Files

| File | Description |
| ---- | ----------- |
| `models.yaml` | Model definitions (user-created, gitignored) |
| `models.example.yaml` | Example model configuration |
| `litellm-values.gomplate.yaml` | Helm values template |
| `apikey-external-secret.gomplate.yaml` | ExternalSecret for API keys |
| `keycloak-auth-external-secret.gomplate.yaml` | ExternalSecret for Keycloak OIDC |
| `langfuse-auth-external-secret.gomplate.yaml` | ExternalSecret for Langfuse API keys |

## Security Considerations

- **Pod Security Standards**: Namespace configured with **baseline** enforcement
  (LiteLLM's Prisma requires write access to `/.cache`, which prevents the `restricted` level)
- **Secrets Management**: API keys stored in Vault, synced via External Secrets Operator
- **Virtual Keys**: Generate scoped API keys for users instead of sharing the master key
- **TLS/HTTPS**: All external traffic encrypted via Traefik Ingress
- **Database Credentials**: Unique PostgreSQL user with minimal privileges

## References

- [LiteLLM Documentation](https://docs.litellm.ai/)
- [LiteLLM GitHub](https://github.com/BerriAI/litellm)
- [LiteLLM Helm Chart](https://github.com/BerriAI/litellm/tree/main/deploy/charts/litellm-helm)
- [Supported Models](https://docs.litellm.ai/docs/providers)
- [Virtual Keys](https://docs.litellm.ai/docs/proxy/virtual_keys)
- [Langfuse Integration](https://docs.litellm.ai/docs/proxy/logging#langfuse)
- [Team-based Logging](https://docs.litellm.ai/docs/proxy/team_logging)