feat(jupyterhub): unlimited max TTL for admin vault token

Masaki Yatsu
2025-09-08 15:52:20 +09:00
parent 2bf82c7f38
commit c82c6aa22b
9 changed files with 367 additions and 455 deletions


@@ -15,13 +15,13 @@ This will prompt for:
- JupyterHub host (FQDN)
- NFS PV usage (if Longhorn is installed)
- NFS server details (if NFS is enabled)
- Vault integration setup
- Vault integration setup (requires root token for initial setup)
### Prerequisites
- Keycloak must be installed and configured
- For NFS storage: Longhorn must be installed
- For Vault integration: Vault must be installed and configured
- For Vault integration: Vault and External Secrets Operator must be installed
- Helm repository must be accessible
## Kernel Images
@@ -52,13 +52,13 @@ Enable/disable profiles using environment variables:
```bash
# Enable buun-stack profile (CPU version)
export JUPYTER_PROFILE_BUUN_STACK_ENABLED=true
JUPYTER_PROFILE_BUUN_STACK_ENABLED=true
# Enable buun-stack CUDA profile (GPU version)
export JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED=true
JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED=true
# Disable default datascience profile
export JUPYTER_PROFILE_DATASCIENCE_ENABLED=false
JUPYTER_PROFILE_DATASCIENCE_ENABLED=false
```
Available profile variables:
@@ -122,35 +122,66 @@ JUPYTER_PYTHON_KERNEL_TAG=python-3.12-28
### Overview
Vault integration enables secure secrets management directly from Jupyter notebooks using user-specific Vault tokens. Each user receives their own isolated Vault token during notebook spawn, ensuring complete separation of secrets between users. Users can store and retrieve API keys, database credentials, and other sensitive data securely with automatic token renewal.
Vault integration enables secure secrets management directly from Jupyter notebooks. The system uses:
- **ExternalSecret** to fetch the admin token from Vault
- **Renewable tokens** with unlimited Max TTL to avoid 30-day system limitations
- **Token renewal script** that automatically renews tokens every 12 hours
- **User-specific tokens** created during notebook spawn with isolated access
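The relationship between the 24h TTL and the 12-hour renewal interval can be checked with a small simulation. This is an illustrative sketch, not code from the repository: `simulate_renewals` and its parameters are hypothetical names, and the model assumes (as the overview states) that no Max TTL caps renewal and that every renewal succeeds and resets the remaining lifetime to the full TTL.

```python
from datetime import timedelta


def simulate_renewals(ttl: timedelta, interval: timedelta, cycles: int) -> list[timedelta]:
    """Return the remaining lifetime observed just before each renewal.

    Hypothetical model: each renewal succeeds and resets the remaining
    lifetime to the full TTL (no Max TTL cap, as the overview assumes).
    """
    remaining = ttl
    observed = []
    for _ in range(cycles):
        remaining -= interval  # time passes until the next scheduled renewal
        if remaining <= timedelta(0):
            raise RuntimeError("token expired before it could be renewed")
        observed.append(remaining)
        remaining = ttl  # a successful renewal resets the lifetime
    return observed


# 60 cycles of 12 hours cover 30 days of operation.
margins = simulate_renewals(timedelta(hours=24), timedelta(hours=12), cycles=60)
print(min(margins))  # smallest safety margin seen at renewal time
```

Under these assumptions the token always has half its TTL left when the renewer runs, so a single failed renewal leaves about 12 hours to recover before the token expires.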
### Architecture
```plain
┌──────────────────────────────────────────────────────────────────┐
│ JupyterHub Hub Pod │
│ │
│ ┌──────────────┐ ┌────────────────┐ ┌────────────────────┐ │
│ │ Hub │ │ Token Renewer │ │ ExternalSecret │ │
│ │ Container │◄─┤ Sidecar │◄─┤ (mounted as │ │
│ │ │ │ │ │ Secret) │ │
│ └──────────────┘ └────────────────┘ └────────────────────┘ │
│ │ │ ▲ │
│ │ │ │ │
│ ▼ ▼ │ │
│ ┌──────────────────────────────────┐ │ │
│ │ /vault/secrets/vault-token │ │ │
│ │ (Admin token for user creation) │ │ │
│ └──────────────────────────────────┘ │ │
└────────────────────────────────────────────────────┼────────────┘
┌───────────▼──────────┐
│ Vault │
│ secret/jupyterhub/ │
│ vault-token │
└──────────────────────┘
```
### Prerequisites
Vault integration requires:
- Vault server installed and configured
- Keycloak OIDC authentication configured
- External Secrets Operator installed
- ClusterSecretStore configured for Vault
- **Buun-stack kernel images** (standard images don't include Vault integration)
### Setup
Vault integration is configured during JupyterHub installation. You have two options:
#### Option 1: Interactive setup (recommended)
Vault integration is configured during JupyterHub installation:
```bash
just jupyterhub::install
# Answer "yes" when prompted about Vault integration
# Provide Vault root token when prompted
```
#### Option 2: Pre-configured setup
The setup process:
```bash
export JUPYTERHUB_VAULT_INTEGRATION_ENABLED=true
just jupyterhub::install
```
**Note**: The `just jupyterhub::setup-vault-integration` command is called automatically during installation if Vault integration is enabled. This configures Vault Agent for automatic token renewal and user-specific token management.
1. Creates `jupyterhub-admin` policy with necessary permissions including `sudo` for orphan token creation
2. Creates renewable admin token with 24h TTL and unlimited Max TTL
3. Stores token in Vault at `secret/jupyterhub/vault-token`
4. Creates ExternalSecret to fetch token from Vault
5. Deploys token renewal sidecar for automatic renewal
### Usage in Notebooks
@@ -182,11 +213,31 @@ secrets.delete('api-keys', field='github') # Delete only github field
### Security Features
- **User isolation**: Each user receives a unique Vault token with access only to their own secrets
- **Automatic token renewal**: Both admin and user tokens are automatically renewed by Vault Agent
- **Vault Agent integration**: JupyterHub admin token is automatically renewed using Kubernetes authentication
- **User isolation**: Each user receives an orphan token with access only to their namespace
- **Automatic renewal**: Token renewal script renews admin token every 12 hours
- **ExternalSecret integration**: Admin token fetched securely from Vault
- **Orphan tokens**: User tokens have no parent token, so they are not restricted to the policies of the admin token that created them
- **Audit trail**: All secret access is logged in Vault
- **Individual policies**: Each user has their own Vault policy restricting access to their namespace
### Token Management
#### Admin Token
The admin token is managed through:
1. **Creation**: `just jupyterhub::create-jupyterhub-vault-token` creates renewable token
2. **Storage**: Stored in Vault at `secret/jupyterhub/vault-token`
3. **Retrieval**: ExternalSecret fetches and mounts as Kubernetes Secret
4. **Renewal**: `vault-token-renewer.sh` script renews every 12 hours
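The contents of `vault-token-renewer.sh` are not shown here, so the following is only a Python sketch of the two building blocks such a renewer would likely need: copying the token from the ExternalSecret mount to the shared volume, and preparing a `renew-self` call against Vault's HTTP API. The function names are hypothetical, and the paths are assumptions derived from the Helm values in this commit (the Secret mounted at `/vault/admin-token` with key `token`, and the hub reading `/vault/secrets/vault-token`).

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# Assumed paths, derived from the volume mounts in the Helm values.
MOUNTED_TOKEN = Path("/vault/admin-token/token")   # Secret key "token"
SHARED_TOKEN = Path("/vault/secrets/vault-token")  # file the hub container reads


def publish_token(src: Path, dst: Path) -> str:
    """Copy the admin token from the Secret mount to the shared volume."""
    token = src.read_text().strip()
    if not token:
        raise ValueError(f"empty token file: {src}")
    # Write to a temp file in the same directory, then rename, so readers
    # never observe a half-written token.
    fd, tmp_name = tempfile.mkstemp(dir=dst.parent)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(token)
    os.replace(tmp_name, dst)
    return token


def build_renew_request(vault_addr: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a renew-self request for the given token."""
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/token/renew-self",
        data=b"{}",
        method="POST",
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
    )
```

A renewer loop built on this would call `publish_token(MOUNTED_TOKEN, SHARED_TOKEN)`, send the request with `urllib.request.urlopen` every 12 hours, and re-read the mounted Secret if renewal fails.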
#### User Tokens
User tokens are created dynamically:
1. **Pre-spawn hook** reads admin token from `/vault/secrets/vault-token`
2. **Creates user policy** `jupyter-user-{username}` with restricted access
3. **Creates orphan token** with user policy (requires `sudo` permission)
4. **Sets environment variable** `NOTEBOOK_VAULT_TOKEN` in notebook container
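The steps above can be sketched as a single helper. This is a simplified illustration rather than the hook's actual code: `create_user_token` is a hypothetical name, the `{username}` placeholder in the policy template is an assumption about how `user_policy.hcl` is parameterized, and `client` is expected to behave like an authenticated `hvac.Client` (only `sys.create_or_update_policy` and `auth.token.create_orphan` are used).

```python
from typing import Any


def create_user_token(
    client: Any,
    username: str,
    policy_template: str,
    ttl: str = "24h",
    max_ttl: str = "168h",
) -> str:
    """Create a per-user policy and an orphan token bound to it.

    `client` is assumed to be an hvac-style client authenticated with the
    admin token read from /vault/secrets/vault-token.
    """
    policy_name = f"jupyter-user-{username}"
    # Assumption: the template marks the user's path with "{username}".
    policy_hcl = policy_template.replace("{username}", username)
    client.sys.create_or_update_policy(name=policy_name, policy=policy_hcl)

    # The orphan endpoint needs `sudo` on auth/token/create-orphan.
    response = client.auth.token.create_orphan(
        policies=[policy_name],
        ttl=ttl,
        renewable=True,
        explicit_max_ttl=max_ttl,
    )
    return response["auth"]["client_token"]
```

In a pre-spawn hook the result would then be assigned to `spawner.environment["NOTEBOOK_VAULT_TOKEN"]`, with the TTL values taken from `NOTEBOOK_VAULT_TOKEN_TTL` and `NOTEBOOK_VAULT_TOKEN_MAX_TTL`.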
## Storage Options
@@ -199,9 +250,9 @@ Uses Kubernetes PersistentVolumes for user home directories.
For shared storage across nodes, configure NFS:
```bash
export JUPYTERHUB_NFS_PV_ENABLED=true
export JUPYTER_NFS_IP=192.168.10.1
export JUPYTER_NFS_PATH=/volume1/drive1/jupyter
JUPYTERHUB_NFS_PV_ENABLED=true
JUPYTER_NFS_IP=192.168.10.1
JUPYTER_NFS_PATH=/volume1/drive1/jupyter
```
NFS storage requires:
@@ -230,22 +281,18 @@ JUPYTERHUB_NFS_PV_ENABLED=false
# Vault integration
JUPYTERHUB_VAULT_INTEGRATION_ENABLED=false
VAULT_ADDR=http://vault.vault.svc:8200
VAULT_ADDR=https://vault.example.com
# Image settings
JUPYTER_PYTHON_KERNEL_TAG=python-3.12-28
IMAGE_REGISTRY=localhost:30500
# Vault token TTL settings
JUPYTERHUB_VAULT_TOKEN_TTL=24h # Admin token: 1 day (auto-renewed by Vault Agent)
JUPYTERHUB_VAULT_TOKEN_MAX_TTL=720h # Admin token: 30 days (max renewal limit)
NOTEBOOK_VAULT_TOKEN_TTL=24h # User token: 1 day (auto-renewed)
NOTEBOOK_VAULT_TOKEN_MAX_TTL=168h # User token: 7 days (max renewal limit)
JUPYTERHUB_VAULT_TOKEN_TTL=24h # Admin token: renewed every 12h
NOTEBOOK_VAULT_TOKEN_TTL=24h # User token: 1 day
NOTEBOOK_VAULT_TOKEN_MAX_TTL=168h # User token: 7 days max
# Vault Agent logging
VAULT_AGENT_LOG_LEVEL=info # Options: trace, debug, info, warn, error
# Application logging
# Logging
JUPYTER_BUUNSTACK_LOG_LEVEL=warning # Options: debug, info, warning, error
```
@@ -261,6 +308,13 @@ Customize JupyterHub behavior by editing `jupyterhub-values.gomplate.yaml` templ
just jupyterhub::uninstall
```
This removes:
- JupyterHub deployment
- User pods
- PVCs
- ExternalSecret
### Update
Upgrade to newer versions:
@@ -277,6 +331,18 @@ just jupyterhub::push-kernel-images
just jupyterhub::install
```
### Manual Token Refresh
If needed, manually refresh the admin token:
```bash
# Create new renewable token
just jupyterhub::create-jupyterhub-vault-token
# Restart JupyterHub to pick up new token
kubectl rollout restart deployment/hub -n jupyter
```
## Troubleshooting
### Image Pull Issues
@@ -291,28 +357,33 @@ kubectl get pods -n jupyter
kubectl describe pod <pod-name> -n jupyter
# Increase timeout if needed
helm upgrade jupyterhub jupyterhub/jupyterhub \
--timeout=30m -f jupyterhub-values.yaml
helm upgrade jupyterhub jupyterhub/jupyterhub --timeout=30m -f jupyterhub-values.yaml
```
### Vault Integration Issues
Check Vault connectivity and authentication:
Check token and authentication:
```python
# In a notebook
import os
print("Vault Address:", os.getenv('VAULT_ADDR'))
print("JWT Token:", bool(os.getenv('NOTEBOOK_VAULT_JWT')))
print("Vault Token:", bool(os.getenv('NOTEBOOK_VAULT_TOKEN')))
```bash
# Check ExternalSecret status
kubectl get externalsecret -n jupyter jupyterhub-vault-token
# Test SecretStore
from buunstack import SecretStore
secrets = SecretStore()
status = secrets.get_status()
print(status)
# Check if Secret was created
kubectl get secret -n jupyter jupyterhub-vault-token
# Check token renewal logs
kubectl logs -n jupyter -l app.kubernetes.io/component=hub -c vault-agent
# In a notebook, verify environment
%env NOTEBOOK_VAULT_TOKEN
```
Common issues:
1. **"child policies must be subset of parent"**: Admin policy needs `sudo` permission for orphan tokens
2. **Token not found**: Check ExternalSecret and ClusterSecretStore configuration
3. **Permission denied**: Verify `jupyterhub-admin` policy has all required permissions
### Authentication Issues
Verify Keycloak client configuration:
@@ -336,177 +407,38 @@ JupyterHub uses the official Zero to JupyterHub (Z2JH) Helm chart:
- Version: `4.2.0` (configurable via `JUPYTERHUB_CHART_VERSION`)
- Documentation: https://z2jh.jupyter.org/
### User-Specific Vault Token System
### Token System Architecture
The `buunstack` SecretStore uses pre-created user-specific Vault tokens that are generated during notebook spawn, ensuring complete user isolation and secure access to individual secret namespaces.
The system uses a three-tier token approach:
#### Architecture Overview
1. **Renewable Admin Token**:
- Created with `explicit-max-ttl=0` (unlimited Max TTL)
- Renewed automatically every 12 hours
- Stored in Vault and fetched via ExternalSecret
```plain
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
JupyterHub │ │ Notebook │ │ Vault │
│ │ │ │ │
│ ┌───────────┐ │ │ ┌────────────┐ │ │ ┌───────────┐ │
│ │Pre-spawn │ │───►│ │SecretStore │ ├───►│ │User Token │ │
│ │ Hook │ │ │ │ │ │ │ │ + Policy │ │
│ └───────────┘ │ │ └────────────┘ │ │ └───────────┘ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
```
2. **Orphan User Tokens**:
- Created with `create_orphan()` API call
- Not limited by parent token policies
- Individual TTL and Max TTL settings
**Key Components**:
3. **Token Renewal Script**:
- Runs as sidecar container
- Reads token from ExternalSecret mount
- Handles renewal and re-retrieval on failure
- **JupyterHub Admin Token**: Automatically renewed by Vault Agent, read from file at `/vault/secrets/vault-token`
- **User-Specific Tokens**: Created dynamically during notebook spawn, available as `NOTEBOOK_VAULT_TOKEN` environment variable
- **User Policies**: Restrict access to `secret/data/jupyter/users/{username}/*`
- **Vault Agent**: Sidecar container that handles automatic token renewal using Kubernetes authentication
### Key Files
#### Token Lifecycle
1. **Pre-spawn Hook Setup**
- JupyterHub uses admin Vault token to access Vault API
- Creates user-specific Vault policy with restricted path access
- Generates new user-specific Vault token with the created policy
- Passes user token to notebook environment via `NOTEBOOK_VAULT_TOKEN`
2. **SecretStore Initialization**
- Reads user-specific token from environment variable:
- `NOTEBOOK_VAULT_TOKEN` (User-specific Vault token)
- Uses token for all Vault operations within user's namespace
3. **Token Validation**
- Before operations, checks token validity using `lookup_self`
- Verifies token TTL and renewable status
4. **Automatic Token Renewal**
- If token TTL is low (< 10 minutes) and renewable, renews token
- Uses `renew_self` capability granted by user policy
- Logs renewal success for monitoring
#### Code Flow
```python
def _ensure_authenticated(self):
# Check if current Vault token is valid
try:
if self.client.is_authenticated():
# Check if token needs renewal
token_info = self.client.auth.token.lookup_self()
ttl = token_info.get("data", {}).get("ttl", 0)
renewable = token_info.get("data", {}).get("renewable", False)
# Renew if TTL < 10 minutes and renewable
if renewable and ttl > 0 and ttl < 600:
self.client.auth.token.renew_self()
logger.info("✅ Vault token renewed successfully")
return
except Exception:
pass
# Token expired and cannot be refreshed
raise Exception("User-specific Vault token expired and cannot be refreshed. Please restart your notebook server.")
```
#### Key Design Decisions
##### 1. User-Specific Token Creation
- Each user receives a unique Vault token during notebook spawn
- Individual policies ensure complete user isolation
- Admin token used only during pre-spawn hook for token creation
##### 2. Policy-Based Access Control
- User policies restrict access to `secret/data/jupyter/users/{username}/*`
- Each user can only access their own secret namespace
- Token management capabilities (`lookup_self`, `renew_self`) included
##### 3. Singleton Pattern
- Single SecretStore instance per notebook session
- Prevents multiple simultaneous authentications
- Maintains consistent token state
##### 4. Pre-created User Tokens
- Tokens are created during notebook spawn via pre-spawn hook
- Reduces initialization overhead in notebooks
- Provides immediate access to user's secret namespace
#### Error Handling
```python
# Primary error scenarios and responses:
1. User token unavailable
Token stored in NOTEBOOK_VAULT_TOKEN env var
Prompt to restart notebook server if missing
2. Vault token expired
Automatic renewal using renew_self if renewable
Restart notebook server required if not renewable
3. Vault authentication failure
Log detailed error information
Check user policy and token configuration
4. Network connectivity issues
Built-in retry in hvac client
Provide actionable error messages
```
#### Configuration
Environment variables passed to notebooks:
```yaml
# JupyterHub pre_spawn_hook sets:
spawner.environment:
# Core services
POSTGRES_HOST: 'postgres-cluster-rw.postgres'
POSTGRES_PORT: '5432'
JUPYTERHUB_API_URL: 'http://hub:8081/hub/api'
BUUNSTACK_LOG_LEVEL: 'info' # or 'debug' for detailed logging
# Vault integration
NOTEBOOK_VAULT_TOKEN: '<User-specific Vault token>'
VAULT_ADDR: 'http://vault.vault.svc:8200'
```
#### Monitoring and Debugging
Enable detailed logging for troubleshooting:
```python
# In notebook
import os
os.environ['BUUNSTACK_LOG_LEVEL'] = 'DEBUG'
# Restart kernel and check logs
from buunstack import SecretStore
secrets = SecretStore()
# Check authentication status
status = secrets.get_status()
print("Username:", status['username'])
print("Vault Address:", status['vault_addr'])
print("Authentication Method:", status['authentication_method'])
print("Vault Authenticated:", status['vault_authenticated'])
```
#### Performance Characteristics
- **Token renewal overhead**: ~10-50ms for renew_self call
- **Memory usage**: Minimal (single token stored as string)
- **Network traffic**: Only during token renewal (when TTL < 10 minutes)
- **Vault impact**: Standard token operations (lookup_self, renew_self)
- `jupyterhub-admin-policy.hcl`: Vault policy with admin permissions
- `user_policy.hcl`: Template for user-specific policies
- `vault-token-renewer.sh`: Token renewal script
- `jupyterhub-vault-token-external-secret.gomplate.yaml`: ExternalSecret configuration
## Performance Considerations
- **Image Size**: Buun-stack images are ~13GB, plan storage accordingly
- **Pull Time**: Initial pulls take 5-15 minutes depending on network
- **Resource Usage**: Data science workloads require adequate CPU/memory
- **Storage**: NFS provides better performance for shared datasets
- **Token Renewal**: User token renewal adds minimal overhead
- **Token Renewal**: Minimal overhead (renewal every 12 hours)
For production deployments, consider:
@@ -514,87 +446,13 @@ For production deployments, consider:
- Using faster storage backends
- Configuring resource limits per user
- Setting up monitoring and alerts
- Monitoring Vault token expiration and renewal patterns
## Vault Agent Integration
### Overview
JupyterHub now uses Vault Agent for automatic token renewal, eliminating the need for manual token management. Vault Agent runs as a sidecar container in the JupyterHub hub pod and automatically renews the admin token using Kubernetes authentication.
### Architecture
```plain
┌─────────────────────────────────────────────────────────────┐
│ JupyterHub Hub Pod │
│ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ Hub Container │ │ Vault Agent Sidecar │ │
│ │ │ │ │ │
│ │ Reads token │◄─────────────┤ Writes token │ │
│ │ from file │ │ to shared volume │ │
│ │ │ │ │ │
│ └─────────────────┘ └─────────────────────┘ │
│ │ │ │
│ │ │ │
│ │ │ │
│ ┌─────────▼─────────┐ ┌─────────▼─────────┐ │
│ │ /vault/secrets/ │ │ Kubernetes Auth │ │
│ │ vault-token │ │ with Vault │ │
│ └───────────────────┘ └───────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Key Features
- **Automatic Renewal**: Admin token is automatically renewed every TTL/2 interval
- **Kubernetes Authentication**: Uses Kubernetes ServiceAccount for secure token acquisition
- **File-based Token Sharing**: Vault Agent writes tokens to shared volume, Hub reads from file
- **Zero Downtime**: Token renewal happens in background without service interruption
- **Configurable Logging**: Vault Agent log level can be configured via `VAULT_AGENT_LOG_LEVEL`
### Monitoring Token Renewal
Check Vault Agent status and token renewal:
```bash
# Monitor Vault Agent logs
kubectl logs -n jupyter -l app.kubernetes.io/component=hub -c vault-agent -f
# Use monitoring script
cd jupyterhub
./monitor-vault-token.sh
# Check token details
kubectl exec -n jupyter <hub-pod> -c hub -- curl -s -H "X-Vault-Token: $(cat /vault/secrets/vault-token)" $VAULT_ADDR/v1/auth/token/lookup-self
```
### Testing Token Renewal
For testing purposes, you can use shorter TTL values to observe rapid token renewal:
```bash
# Test with 1-minute TTL (renews every 30 seconds)
JUPYTERHUB_VAULT_TOKEN_TTL=1m VAULT_AGENT_LOG_LEVEL=debug just jupyterhub::install
# Monitor renewal activity
kubectl logs -n jupyter -l app.kubernetes.io/component=hub -c vault-agent -f | grep "renewed auth token"
```
### Configuration Files
The Vault Agent integration uses several configuration files:
- `vault-agent-config.gomplate.hcl`: Vault Agent configuration template
- `token-monitor.tpl`: Template for logging token information
- `monitor-vault-token.sh`: Monitoring script for token status
## Known Limitations
1. **Token Max TTL**: Even with Vault Agent auto-renewal, tokens cannot be renewed beyond `JUPYTERHUB_VAULT_TOKEN_MAX_TTL` (default: 720h/30 days). After this period, JupyterHub must be redeployed to acquire a new token from Vault. With the default 30-day limit, this requires monthly maintenance.
1. **Annual Token Recreation**: While tokens have unlimited Max TTL, best practice suggests recreating them annually
2. **Cull Settings**: Server idle timeout is set to 2 hours by default. Adjust `cull.timeout` and `cull.every` in the Helm values for different requirements.
2. **Cull Settings**: Server idle timeout is set to 2 hours by default. Adjust `cull.timeout` and `cull.every` in the Helm values for different requirements
3. **NFS Storage**: When using NFS storage, ensure proper permissions are set on the NFS server. The default `JUPYTER_FSGID` is 100.
3. **NFS Storage**: When using NFS storage, ensure proper permissions are set on the NFS server. The default `JUPYTER_FSGID` is 100
4. **Vault Agent Resource Usage**: The Vault Agent sidecar uses minimal resources (50m CPU, 64Mi memory) but adds slight overhead to the hub pod.
4. **ExternalSecret Dependency**: Requires External Secrets Operator to be installed and configured


@@ -0,0 +1,50 @@
# JupyterHub Minimal Admin Policy
# Provides only necessary permissions for JupyterHub operations
# Read Keycloak credentials for OIDC authentication
path "secret/data/keycloak/admin" {
capabilities = ["read"]
}
# Full access to user secrets namespace for notebook users
path "secret/data/jupyter/*" {
capabilities = ["create", "read", "update", "delete", "list", "patch"]
}
# List secrets for user management
path "secret/metadata/jupyter/*" {
capabilities = ["list"]
}
# Token creation and management for user-specific tokens
path "auth/token/create" {
capabilities = ["create", "update"]
}
# Create orphan tokens (requires sudo for policy override)
path "auth/token/create-orphan" {
capabilities = ["create", "update", "sudo"]
}
path "auth/token/lookup-self" {
capabilities = ["read"]
}
path "auth/token/renew-self" {
capabilities = ["update"]
}
# Create user-specific policies dynamically
path "sys/policies/acl/jupyter-user-*" {
capabilities = ["create", "read", "update", "delete"]
}
# Read user policies to allow token creation with these policies
path "sys/policies/acl/*" {
capabilities = ["read", "list"]
}
# System capabilities check
path "sys/capabilities-self" {
capabilities = ["read"]
}


@@ -5,11 +5,8 @@ hub:
NOTEBOOK_VAULT_TOKEN_TTL: {{ .Env.NOTEBOOK_VAULT_TOKEN_TTL | quote }}
NOTEBOOK_VAULT_TOKEN_MAX_TTL: {{ .Env.NOTEBOOK_VAULT_TOKEN_MAX_TTL | quote }}
{{- if eq .Env.JUPYTERHUB_VAULT_INTEGRATION_ENABLED "true" }}
# Vault Agent will provide token via file
# Vault Agent provides renewable token via file (unlimited max TTL)
VAULT_TOKEN_FILE: "/vault/secrets/vault-token"
{{- else }}
# Traditional token via environment variable
JUPYTERHUB_VAULT_TOKEN: {{ .Env.JUPYTERHUB_VAULT_TOKEN | quote }}
{{- end }}
# Install packages at container startup
@@ -63,24 +60,24 @@ hub:
# Set environment variables for spawned containers
import hvac
{{- if eq .Env.JUPYTERHUB_VAULT_INTEGRATION_ENABLED "true" }}
def get_vault_token():
"""Read Vault token from file written by Vault Agent"""
"""Read Vault token from file"""
import os
token_file = os.environ.get('VAULT_TOKEN_FILE', '/vault/secrets/vault-token')
token_file = '/vault/secrets/vault-token'
try:
with open(token_file, 'r') as f:
token = f.read().strip()
if token:
return token
else:
raise Exception(f"Empty token file: {token_file}")
except FileNotFoundError:
# Fallback to environment variable for backward compatibility
return os.environ.get("JUPYTERHUB_VAULT_TOKEN")
print(f"Token file not found: {token_file}")
except Exception as e:
# Log error but attempt fallback
print(f"Error reading token file {token_file}: {e}")
return os.environ.get("JUPYTERHUB_VAULT_TOKEN")
return None
{{- end }}
async def pre_spawn_hook(spawner):
"""Set essential environment variables for spawned containers"""
@@ -94,6 +91,7 @@ hub:
# Logging configuration
spawner.environment["BUUNSTACK_LOG_LEVEL"] = "{{ .Env.JUPYTER_BUUNSTACK_LOG_LEVEL }}"
{{- if eq .Env.JUPYTERHUB_VAULT_INTEGRATION_ENABLED "true" }}
# Create user-specific Vault token directly
try:
username = spawner.user.name
@@ -105,7 +103,7 @@ hub:
spawner.log.info(f"pre_spawn_hook starting for {username}")
spawner.log.info(f"Vault address: {vault_addr}")
spawner.log.info(f"Vault token source: {'file' if os.path.exists(os.environ.get('VAULT_TOKEN_FILE', '/vault/secrets/vault-token')) else 'env'}")
spawner.log.info(f"Vault token source: {'file' if os.path.exists('/vault/secrets/vault-token') else 'env'}")
spawner.log.info(f"Vault token present: {bool(vault_token)}, length: {len(vault_token) if vault_token else 0}")
if not vault_token:
@@ -141,7 +139,7 @@ hub:
user_token_ttl = os.environ.get("NOTEBOOK_VAULT_TOKEN_TTL", "24h")
user_token_max_ttl = os.environ.get("NOTEBOOK_VAULT_TOKEN_MAX_TTL", "168h")
token_response = vault_client.auth.token.create(
token_response = vault_client.auth.token.create_orphan(
policies=[user_policy_name],
ttl=user_token_ttl,
renewable=True,
@@ -161,6 +159,7 @@ hub:
spawner.log.error("Failed to create user-specific Vault token for {}: {}".format(spawner.user.name, e))
import traceback
spawner.log.error("Full traceback: {}".format(traceback.format_exc()))
{{- end }}
c.KubeSpawner.pre_spawn_hook = pre_spawn_hook
@@ -172,12 +171,18 @@ hub:
- name: vault-config
configMap:
name: vault-agent-config
- name: vault-admin-token
secret:
secretName: jupyterhub-vault-token
extraVolumeMounts:
- name: vault-secrets
mountPath: /vault/secrets
- name: vault-config
mountPath: /vault/config
- name: vault-admin-token
mountPath: /vault/admin-token
readOnly: true
extraContainers:
- name: vault-agent
@@ -195,8 +200,8 @@ hub:
- /bin/sh
- -c
- |
# Start Vault Agent
vault agent -config=/vault/config/agent.hcl
# Start token renewal script (handles both retrieval and renewal)
exec sh /vault/config/vault-token-renewer.sh
env:
- name: VAULT_ADDR
value: {{ .Env.VAULT_ADDR | quote }}
@@ -205,6 +210,9 @@ hub:
mountPath: /vault/secrets
- name: vault-config
mountPath: /vault/config
- name: vault-admin-token
mountPath: /vault/admin-token
readOnly: true
resources:
requests:
cpu: 50m


@@ -0,0 +1,18 @@
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
name: jupyterhub-vault-token
namespace: {{ .Env.JUPYTERHUB_NAMESPACE }}
spec:
refreshInterval: 1h
secretStoreRef:
name: vault-secret-store
kind: ClusterSecretStore
target:
name: jupyterhub-vault-token
creationPolicy: Owner
data:
- secretKey: token
remoteRef:
key: jupyterhub/vault-token
property: token


@@ -20,7 +20,6 @@ export JUPYTER_PROFILE_TENSORFLOW_ENABLED := env("JUPYTER_PROFILE_TENSORFLOW_ENA
export JUPYTER_PROFILE_BUUN_STACK_ENABLED := env("JUPYTER_PROFILE_BUUN_STACK_ENABLED", "false")
export JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED := env("JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED", "false")
export JUPYTERHUB_VAULT_TOKEN_TTL := env("JUPYTERHUB_VAULT_TOKEN_TTL", "24h")
export JUPYTERHUB_VAULT_TOKEN_MAX_TTL := env("JUPYTERHUB_VAULT_TOKEN_MAX_TTL", "720h")
export NOTEBOOK_VAULT_TOKEN_TTL := env("NOTEBOOK_VAULT_TOKEN_TTL", "24h")
export NOTEBOOK_VAULT_TOKEN_MAX_TTL := env("NOTEBOOK_VAULT_TOKEN_MAX_TTL", "168h")
export VAULT_AGENT_LOG_LEVEL := env("VAULT_AGENT_LOG_LEVEL", "info")
@@ -54,7 +53,7 @@ delete-namespace:
kubectl delete namespace ${JUPYTERHUB_NAMESPACE} --ignore-not-found
# Install JupyterHub
install:
install root_token='':
#!/bin/bash
set -euo pipefail
export JUPYTERHUB_HOST=${JUPYTERHUB_HOST:-}
@@ -129,8 +128,16 @@ install:
if [ "${JUPYTERHUB_VAULT_INTEGRATION_ENABLED}" = "true" ]; then
echo "Setting up Vault Agent for automatic token management..."
echo " Token TTL: ${JUPYTERHUB_VAULT_TOKEN_TTL}"
echo " Token Max TTL: ${JUPYTERHUB_VAULT_TOKEN_MAX_TTL}"
just setup-vault-integration
export VAULT_TOKEN="{{ root_token }}"
while [ -z "${VAULT_TOKEN}" ]; do
VAULT_TOKEN=$(gum input --prompt="Vault root token: " --password --width=100)
done
just setup-vault-integration ${VAULT_TOKEN}
just create-jupyterhub-vault-token ${VAULT_TOKEN}
# Create ExternalSecret for admin vault token
echo "Creating ExternalSecret for admin vault token..."
gomplate -f jupyterhub-vault-token-external-secret.gomplate.yaml | kubectl apply -f -
# Read user policy template for Vault
export USER_POLICY_HCL=$(cat user_policy.hcl)
@@ -155,6 +162,7 @@ uninstall:
helm uninstall jupyterhub -n ${JUPYTERHUB_NAMESPACE} --wait --ignore-not-found
kubectl delete pods -n ${JUPYTERHUB_NAMESPACE} -l app.kubernetes.io/component=singleuser-server
kubectl delete -n ${JUPYTERHUB_NAMESPACE} pvc jupyter-nfs-pvc --ignore-not-found
kubectl delete -n ${JUPYTERHUB_NAMESPACE} externalsecret jupyterhub-vault-token --ignore-not-found
if kubectl get pv jupyter-nfs-pv &>/dev/null; then
kubectl patch pv jupyter-nfs-pv -p '{"spec":{"claimRef":null}}'
fi
@@ -213,39 +221,43 @@ push-kernel-images:
setup-vault-integration root_token='':
#!/bin/bash
set -euo pipefail
echo "Setting up Vault integration for JupyterHub..."
# Create Kubernetes role for JupyterHub in Vault
echo "Creating Kubernetes authentication role for JupyterHub..."
echo " Service Account: hub"
echo " Namespace: jupyter"
echo " Policies: admin"
echo " TTL: ${JUPYTERHUB_VAULT_TOKEN_TTL}"
echo " Max TTL: ${JUPYTERHUB_VAULT_TOKEN_MAX_TTL}"
export VAULT_TOKEN="{{ root_token }}"
while [ -z "${VAULT_TOKEN}" ]; do
VAULT_TOKEN=$(gum input --prompt="Vault root token: " --password --width=100)
done
vault write auth/kubernetes/role/jupyterhub \
echo "Setting up Vault integration for JupyterHub..."
# Create JupyterHub-specific policy and Kubernetes role in Vault
echo "Creating JupyterHub-specific Vault policy and Kubernetes role..."
echo " Service Account: hub"
echo " Namespace: jupyter"
echo " Policy: jupyterhub-admin (custom policy with extended max TTL)"
echo " TTL: ${JUPYTERHUB_VAULT_TOKEN_TTL}"
# Create JupyterHub-specific policy
echo "Creating jupyterhub-admin policy..."
vault policy write jupyterhub-admin jupyterhub-admin-policy.hcl
# Create Kubernetes role (use system-safe max_ttl to avoid warnings)
echo "Creating Kubernetes role..."
vault write auth/kubernetes/role/jupyterhub-admin \
bound_service_account_names=hub \
bound_service_account_namespaces=jupyter \
policies=admin \
policies=jupyterhub-admin \
ttl=${JUPYTERHUB_VAULT_TOKEN_TTL} \
max_ttl=${JUPYTERHUB_VAULT_TOKEN_MAX_TTL}
max_ttl=720h
# Create Vault Agent configuration with gomplate
echo "Creating Vault Agent configuration..."
gomplate -f vault-agent-config.gomplate.hcl -o vault-agent-config.hcl
# Create ConfigMap with token renewal script
echo "Creating ConfigMap with token renewal script..."
kubectl create configmap vault-agent-config -n ${JUPYTERHUB_NAMESPACE} \
--from-file=agent.hcl=vault-agent-config.hcl \
--from-file=token-monitor.tpl=token-monitor.tpl \
--from-file=vault-token-renewer.sh=vault-token-renewer.sh \
--dry-run=client -o yaml | kubectl apply -f -
echo "✓ Vault integration configured (user-specific tokens + auto-renewal)"
echo ""
echo "Configuration Summary:"
echo " JupyterHub Token TTL: ${JUPYTERHUB_VAULT_TOKEN_TTL}"
echo " JupyterHub Token Max TTL: ${JUPYTERHUB_VAULT_TOKEN_MAX_TTL}"
echo " User Token TTL: ${NOTEBOOK_VAULT_TOKEN_TTL}"
echo " User Token Max TTL: ${NOTEBOOK_VAULT_TOKEN_MAX_TTL}"
echo " Vault Agent Log Level: ${VAULT_AGENT_LOG_LEVEL}"
@@ -257,19 +269,60 @@ setup-vault-integration root_token='':
echo " # Each user gets their own isolated Vault token and policy"
echo " # Admin token is automatically renewed by Vault Agent"
# Create JupyterHub Vault token (uses admin policy for JWT operations)
create-jupyterhub-vault-token:
# Create JupyterHub Vault token (renewable with unlimited Max TTL)
create-jupyterhub-vault-token root_token='':
#!/bin/bash
set -euo pipefail
echo "Creating JupyterHub Vault token with admin policy..."
echo " TTL: ${JUPYTERHUB_VAULT_TOKEN_TTL}"
echo " Max TTL: ${JUPYTERHUB_VAULT_TOKEN_MAX_TTL}"
export VAULT_TOKEN="{{ root_token }}"
while [ -z "${VAULT_TOKEN}" ]; do
VAULT_TOKEN=$(gum input --prompt="Vault root token: " --password --width=100)
done
# JupyterHub needs admin privileges to read Keycloak credentials from Vault
# Create token and store in Vault
just vault::create-token-and-store admin jupyterhub/vault-token ${JUPYTERHUB_VAULT_TOKEN_TTL} ${JUPYTERHUB_VAULT_TOKEN_MAX_TTL}
echo "Creating JupyterHub admin Vault token"
echo "✓ JupyterHub Vault token created and stored"
# jupyterhub-admin policy should exist (created by setup-vault-integration)
# Check if token already exists
if vault kv get secret/jupyterhub/vault-token >/dev/null 2>&1; then
echo "Existing admin token found at secret/jupyterhub/vault-token"
if gum confirm "Replace existing token with new one?"; then
echo "Creating new admin token..."
else
echo "Using existing token"
exit 0
fi
fi
# Create admin vault token with unlimited max TTL
echo ""
echo "To use in JupyterHub deployment:"
echo " JUPYTERHUB_VAULT_TOKEN=\$(just vault::get jupyterhub/vault-token token)"
echo "Creating admin token (TTL: 24h, Max TTL: unlimited)..."
TOKEN_RESPONSE=$(vault token create \
-policy=jupyterhub-admin \
-ttl=24h \
-explicit-max-ttl=0 \
-display-name="jupyterhub-admin" \
-renewable=true \
-format=json)
# Extract token
ADMIN_TOKEN=$(echo "$TOKEN_RESPONSE" | jq -r .auth.client_token)
if [ -z "$ADMIN_TOKEN" ] || [ "$ADMIN_TOKEN" = "null" ]; then
echo "❌ Failed to create admin token"
exit 1
fi
# Store token in Vault for JupyterHub to retrieve
echo "Storing admin token in Vault..."
vault kv put secret/jupyterhub/vault-token token="$ADMIN_TOKEN"
echo ""
echo "✅ Admin token created and stored successfully!"
echo ""
echo "Token behavior:"
echo " - TTL: 24 hours (will expire in 24h without renewal)"
echo " - Max TTL: Unlimited (can be renewed forever)"
echo " - Token renewal script will renew every 12 hours"
echo " - No more 30-day limitation!"
echo ""
echo "Token stored at: secret/jupyterhub/vault-token"
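
The relationship between the 24-hour TTL and the 12-hour renewal interval printed above can be checked with a little arithmetic. The sketch below is illustrative only: `attempts_before_expiry` is a hypothetical helper that is not part of this justfile, and the 6-hour interval is an assumed alternative included for comparison. It counts how many renewal attempts fall strictly before the token expires, measured from the last successful renewal.

```shell
#!/bin/sh
# attempts_before_expiry TTL_SECONDS INTERVAL_SECONDS
# Hypothetical helper: prints how many renewal attempts happen strictly
# before the token expires, counting from the last successful renewal.
attempts_before_expiry() {
    ttl=$1
    interval=$2
    # Attempts occur at interval, 2*interval, ...; count those below ttl.
    echo $(( (ttl - 1) / interval ))
}

attempts_before_expiry 86400 43200   # 24h TTL, 12h interval (values from this commit): prints 1
attempts_before_expiry 86400 21600   # 24h TTL, assumed 6h interval: prints 3
```

With the values used in this commit only one attempt falls before expiry, so a single failed renewal leaves the next attempt at roughly the moment the token expires.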


@@ -1,73 +0,0 @@
#!/bin/bash
# JupyterHub Vault Token Monitor Script
# Usage: ./monitor-vault-token.sh [pod-name]
set -euo pipefail
NAMESPACE="jupyter"
POD_NAME=${1:-$(kubectl get pods -n ${NAMESPACE} -l app.kubernetes.io/component=hub -o jsonpath='{.items[0].metadata.name}')}
echo "🔍 Monitoring Vault Agent for JupyterHub Pod: ${POD_NAME}"
echo "=================================================="
# Check if pod exists and is running
if ! kubectl get pod ${POD_NAME} -n ${NAMESPACE} >/dev/null 2>&1; then
echo "❌ Pod ${POD_NAME} not found in namespace ${NAMESPACE}"
exit 1
fi
echo "📊 Pod Status:"
kubectl get pod ${POD_NAME} -n ${NAMESPACE}
echo ""
echo "📄 Vault Secrets Directory:"
kubectl exec -n ${NAMESPACE} ${POD_NAME} -c hub -- ls -la /vault/secrets/ 2>/dev/null || echo "❌ Cannot access /vault/secrets/"
echo ""
echo "🔐 Current Token Info:"
kubectl exec -n ${NAMESPACE} ${POD_NAME} -c hub -- sh -c '
if [ -f /vault/secrets/vault-token ]; then
echo "Token file exists ($(wc -c < /vault/secrets/vault-token) bytes)"
echo "Last modified: $(stat -c %y /vault/secrets/vault-token 2>/dev/null || stat -f %Sm /vault/secrets/vault-token)"
# Test token validity
if command -v curl >/dev/null 2>&1; then
echo ""
echo "Token validation:"
RESPONSE=$(curl -s -w "%{http_code}" -H "X-Vault-Token: $(cat /vault/secrets/vault-token)" $VAULT_ADDR/v1/auth/token/lookup-self)
HTTP_CODE="${RESPONSE: -3}"
if [ "$HTTP_CODE" = "200" ]; then
echo "✅ Token is valid"
echo "$RESPONSE" | head -c -3 | grep -E "(ttl|expire_time|renewable)" | head -3
else
echo "❌ Token validation failed (HTTP $HTTP_CODE)"
fi
fi
else
echo "❌ Token file not found"
fi
' 2>/dev/null || echo "❌ Cannot check token info"
echo ""
echo "📋 Recent Vault Agent Logs:"
kubectl logs -n ${NAMESPACE} ${POD_NAME} -c vault-agent --tail=10 2>/dev/null || echo "❌ Cannot access vault-agent logs"
echo ""
echo "📋 Token Renewal Log (if exists):"
kubectl exec -n ${NAMESPACE} ${POD_NAME} -c hub -- sh -c '
if [ -f /vault/secrets/renewal.log ]; then
echo "Recent renewal events:"
tail -10 /vault/secrets/renewal.log
else
echo "No renewal log file found yet"
fi
' 2>/dev/null || echo "❌ Cannot check renewal logs"
echo ""
echo "🔄 To monitor token renewals in real-time, run:"
echo " kubectl logs -n ${NAMESPACE} ${POD_NAME} -c vault-agent -f | grep 'renewed auth token'"
echo ""
echo "🔍 To check token info periodically, run:"
echo " watch -n 30 \"kubectl exec -n ${NAMESPACE} ${POD_NAME} -c hub -- sh -c 'curl -s -H \\\"X-Vault-Token: \\\$(cat /vault/secrets/vault-token)\\\" \\\$VAULT_ADDR/v1/auth/token/lookup-self | grep -E \\\"(ttl|expire_time)\\\"'\""


@@ -1,11 +0,0 @@
{{- with secret "auth/token/lookup-self" -}}
=== Vault Token Status ===
TTL: {{ .Data.ttl }} seconds
Renewable: {{ .Data.renewable }}
Expire Time: {{ .Data.expire_time }}
Policies: {{ range .Data.policies }}{{ . }} {{ end }}
Display Name: {{ .Data.display_name }}
Entity ID: {{ .Data.entity_id }}
Token Type: {{ .Data.type }}
===========================
{{- end -}}


@@ -1,38 +0,0 @@
vault {
address = "{{ .Env.VAULT_ADDR }}"
}
# Enable detailed logging
log_level = "{{ .Env.VAULT_AGENT_LOG_LEVEL }}"
log_format = "standard"
auto_auth {
method "kubernetes" {
mount_path = "auth/kubernetes"
config = {
role = "jupyterhub"
}
}
sink "file" {
config = {
path = "/vault/secrets/vault-token"
}
}
}
cache {
use_auto_auth_token = true
}
listener "tcp" {
address = "127.0.0.1:8100"
tls_disable = true
}
# Add template for token monitoring
template {
source = "/vault/config/token-monitor.tpl"
destination = "/vault/secrets/token-info.log"
perms = 0644
}


@@ -0,0 +1,47 @@
#!/bin/sh
# Script to handle admin token retrieval and renewal
set -e
echo "Starting Vault token management..."
export VAULT_ADDR="${VAULT_ADDR}"
# Wait for ExternalSecret to create the secret
echo "Waiting for admin token from ExternalSecret..."
while [ ! -f /vault/admin-token/token ]; do
echo "Waiting for /vault/admin-token/token..."
sleep 5
done
# Read admin token from mounted secret
ADMIN_TOKEN=$(cat /vault/admin-token/token)
if [ -z "$ADMIN_TOKEN" ]; then
echo "ERROR: No admin token found in mounted secret"
exit 1
fi
echo "Admin token retrieved from ExternalSecret"
echo "$ADMIN_TOKEN" > /vault/secrets/vault-token
# Start token renewal loop
export VAULT_TOKEN="$ADMIN_TOKEN"
while true; do
echo "$(date): Renewing admin token..."
if vault token renew >/dev/null 2>&1; then
echo "$(date): Token renewed successfully"
else
echo "$(date): Token renewal failed - trying to retrieve token again from ExternalSecret"
# Re-read token from mounted secret
ADMIN_TOKEN=$(cat /vault/admin-token/token 2>/dev/null || echo "")
if [ -n "$ADMIN_TOKEN" ]; then
echo "$ADMIN_TOKEN" > /vault/secrets/vault-token
export VAULT_TOKEN="$ADMIN_TOKEN"
echo "$(date): Token re-retrieved successfully from ExternalSecret"
else
echo "$(date): Failed to re-retrieve token from ExternalSecret"
fi
fi
sleep 43200 # 12 hours
done
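
The loop above sleeps the full 12 hours even when a renewal fails. One possible variation, sketched here under stated assumptions rather than taken from the commit, waits a shorter time after a failure so the next attempt lands well before the 24-hour TTL runs out. `RETRY_INTERVAL`, `next_sleep`, `renew_once`, `renewal_loop` and the `RENEW_CMD` override are hypothetical names introduced for illustration; only the 43200-second interval and the `vault token renew` command come from the script.

```shell
#!/bin/sh
RENEW_INTERVAL=43200   # 12 hours, as in the script above
RETRY_INTERVAL=300     # assumed 5-minute wait after a failed renewal

# next_sleep STATUS: prints how many seconds to wait before the next attempt.
next_sleep() {
    if [ "$1" -eq 0 ]; then
        echo "$RENEW_INTERVAL"
    else
        echo "$RETRY_INTERVAL"
    fi
}

# renew_once: runs a single renewal attempt and returns its exit status.
# RENEW_CMD is a hypothetical override so the function can be exercised
# without a Vault server; it defaults to the command used by the script.
renew_once() {
    ${RENEW_CMD:-vault token renew} >/dev/null 2>&1
}

# renewal_loop: same shape as the original loop, but the sleep depends on
# whether the last attempt succeeded.
renewal_loop() {
    while true; do
        if renew_once; then
            status=0
            echo "$(date): Token renewed successfully"
        else
            status=$?
            echo "$(date): Token renewal failed, retrying sooner"
        fi
        sleep "$(next_sleep "$status")"
    done
}

# A real script would end by calling renewal_loop; it is left uncalled here.
```

The re-read of `/vault/admin-token/token` from the original failure branch would still belong inside the `else` arm; it is omitted here to keep the sketch short.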