feat(jupyterhub): vault token w/o keycloak auth

Masaki Yatsu
2025-09-03 10:11:06 +09:00
parent 02ec5eb1e2
commit d233373219
15 changed files with 583 additions and 612 deletions


@@ -110,7 +110,7 @@ JUPYTER_PYTHON_KERNEL_TAG=python-3.12-1
### Overview
Vault integration enables secure secrets management directly from Jupyter notebooks without re-authentication. Users can store and retrieve API keys, database credentials, and other sensitive data securely.
Vault integration enables secure secrets management directly from Jupyter notebooks using user-specific Vault tokens. Each user receives their own isolated Vault token during notebook spawn, ensuring complete separation of secrets between users. Users can store and retrieve API keys, database credentials, and other sensitive data securely with automatic token renewal.
### Prerequisites
@@ -133,7 +133,7 @@ just jupyterhub::install
Or configure manually:
```bash
# Setup Vault JWT authentication for JupyterHub
# Setup Vault integration (creates user-specific tokens)
just jupyterhub::setup-vault-jwt-auth
```
@@ -144,7 +144,7 @@ With Vault integration enabled, use the `buunstack` package in notebooks:
```python
from buunstack import SecretStore
# Initialize (uses JupyterHub session authentication)
# Initialize (uses pre-acquired user-specific token)
secrets = SecretStore()
# Store secrets
@@ -160,16 +160,17 @@ openai_key = secrets.get('api-keys', field='openai')
# List all secrets
secret_names = secrets.list()
# Delete secrets
secrets.delete('old-api-key')
# Delete secrets or specific fields
secrets.delete('old-api-key') # Delete entire secret
secrets.delete('api-keys', field='github') # Delete only github field
```
### Security Features
- **User isolation**: Each user can only access their own secrets
- **Automatic token refresh**: Background token management prevents authentication failures
- **User isolation**: Each user receives a unique Vault token with access only to their own secrets
- **Automatic token renewal**: Tokens can be renewed to extend session lifetime
- **Audit trail**: All secret access is logged in Vault
- **No re-authentication**: Uses existing JupyterHub OIDC session
- **Individual policies**: Each user has their own Vault policy restricting access to their namespace
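The isolation model above comes down to a path convention: every secret lives under a per-user prefix, and each user's policy only matches their own prefix. A minimal sketch (both helpers are hypothetical illustrations, not part of the buunstack API):

```python
# Hypothetical sketch of the per-user namespace described above.
# The path template mirrors the policy paths (secret/data/jupyter/users/<user>/*);
# neither function is exported by buunstack.

def user_secret_path(username: str, secret_name: str) -> str:
    """Return the KV v2 data path for one user's secret."""
    return f"secret/data/jupyter/users/{username}/{secret_name}"

def can_access(token_owner: str, path: str) -> bool:
    """A user's policy only matches paths under their own namespace."""
    return path.startswith(f"secret/data/jupyter/users/{token_owner}/")

# alice's token can reach her own secret but not act as bob
alice_path = user_secret_path("alice", "api-keys")
assert can_access("alice", alice_path)
assert not can_access("bob", alice_path)
```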
## Storage Options
@@ -273,7 +274,8 @@ Check Vault connectivity and authentication:
# In a notebook
import os
print("Vault Address:", os.getenv('VAULT_ADDR'))
print("Access Token:", bool(os.getenv('JUPYTERHUB_OIDC_ACCESS_TOKEN')))
print("JWT Token:", bool(os.getenv('NOTEBOOK_VAULT_JWT')))
print("Vault Token:", bool(os.getenv('NOTEBOOK_VAULT_TOKEN')))
# Test SecretStore
from buunstack import SecretStore
@@ -295,12 +297,172 @@ just keycloak::update-client buunstack jupyterhub \
"https://your-jupyter-host/hub/oauth_callback"
```
## Implementation
### User-Specific Vault Token System
The `buunstack` SecretStore uses user-specific Vault tokens created during notebook spawn, ensuring complete user isolation and secure access to individual secret namespaces.
#### Architecture Overview
```plain
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ JupyterHub │ │ Notebook │ │ Vault │
│ │ │ │ │ │
│ ┌───────────┐ │ │ ┌────────────┐ │ │ ┌───────────┐ │
│ │Pre-spawn │ │───►│ │SecretStore │ ├───►│ │User Token │ │
│ │ Hook │ │ │ │ │ │ │ │ + Policy │ │
│ └───────────┘ │ │ └────────────┘ │ │ └───────────┘ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
```
#### Token Lifecycle
1. **Pre-spawn Hook Setup**
- JupyterHub uses admin Vault token to access Vault API
- Creates user-specific Vault policy with restricted path access
- Generates new user-specific Vault token with the created policy
- Passes user token to notebook environment via `NOTEBOOK_VAULT_TOKEN`
2. **SecretStore Initialization**
- Reads user-specific token from environment variable:
- `NOTEBOOK_VAULT_TOKEN` (User-specific Vault token)
- Uses token for all Vault operations within user's namespace
3. **Token Validation**
- Before operations, checks token validity using `lookup_self`
- Verifies token TTL and renewable status
4. **Automatic Token Renewal**
- If token TTL is low (< 10 minutes) and renewable, renews token
- Uses `renew_self` capability granted by user policy
- Logs renewal success for monitoring
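Step 2 above reduces to reading two environment variables and handing them to an hvac client. A minimal sketch, assuming the spawner has populated the environment; `config_from_env` is a hypothetical helper added so the logic can be shown without a live Vault (the real SecretStore reads `os.environ` directly):

```python
# Sketch of SecretStore's client setup from the spawner-provided environment.
# config_from_env is hypothetical; it only packages the two values the
# real code passes to hvac.Client.

def config_from_env(environ: dict) -> dict:
    addr = environ.get("VAULT_ADDR")
    token = environ.get("NOTEBOOK_VAULT_TOKEN")
    if not addr or not token:
        raise RuntimeError(
            "VAULT_ADDR/NOTEBOOK_VAULT_TOKEN missing - restart your notebook server"
        )
    return {"url": addr, "token": token}

cfg = config_from_env({
    "VAULT_ADDR": "http://vault.vault.svc:8200",
    "NOTEBOOK_VAULT_TOKEN": "hvs.example",
})
# With hvac installed, this becomes:
#   client = hvac.Client(url=cfg["url"], token=cfg["token"], verify=False)
```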
#### Code Flow
```python
def _ensure_authenticated(self):
# Check if current Vault token is valid
try:
if self.client.is_authenticated():
# Check if token needs renewal
token_info = self.client.auth.token.lookup_self()
ttl = token_info.get("data", {}).get("ttl", 0)
renewable = token_info.get("data", {}).get("renewable", False)
# Renew if TTL < 10 minutes and renewable
if renewable and ttl > 0 and ttl < 600:
self.client.auth.token.renew_self()
logger.info("✅ Vault token renewed successfully")
return
except Exception:
pass
# Token expired and cannot be refreshed
raise Exception("User-specific Vault token expired and cannot be refreshed. Please restart your notebook server.")
```
#### Key Design Decisions
##### 1. User-Specific Token Creation
- Each user receives a unique Vault token during notebook spawn
- Individual policies ensure complete user isolation
- Admin token used only during pre-spawn hook for token creation
##### 2. Policy-Based Access Control
- User policies restrict access to `secret/data/jupyter/users/{username}/*`
- Each user can only access their own secret namespace
- Token management capabilities (`lookup_self`, `renew_self`) included
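The policy these bullets describe can be sketched as a template; `build_user_policy` is a hypothetical rendering of the HCL the pre-spawn hook writes, shortened to the data path and the token-management stanzas:

```python
def build_user_policy(username: str) -> str:
    """Render an abbreviated per-user Vault policy (HCL) as a string."""
    # Doubled braces render as literal { } in the f-string.
    return f'''
# User-specific policy for {username}
path "secret/data/jupyter/users/{username}/*" {{
  capabilities = ["create", "update", "read", "delete", "list"]
}}

path "secret/metadata/jupyter/users/{username}/*" {{
  capabilities = ["list", "read", "delete", "update"]
}}

# Token management capabilities
path "auth/token/lookup-self" {{
  capabilities = ["read"]
}}

path "auth/token/renew-self" {{
  capabilities = ["update"]
}}
'''

policy = build_user_policy("alice")
```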
##### 3. Singleton Pattern
- Single SecretStore instance per notebook session
- Prevents multiple simultaneous authentications
- Maintains consistent token state
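The singleton behavior can be sketched as follows (class and parameter names are illustrative; the real class lives in `buunstack`). Note that constructor arguments on later calls are ignored, matching the "parameters only apply on first instantiation" caveat:

```python
class SecretStoreSketch:
    """Minimal singleton sketch mirroring the pattern described above."""
    _instance = None
    _initialized = False

    def __new__(cls, *args, **kwargs):
        # Create the instance once; every later call returns the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, token: str = "t0"):
        if self._initialized:
            return  # later calls keep the first configuration
        self.token = token
        type(self)._initialized = True

a = SecretStoreSketch(token="first")
b = SecretStoreSketch(token="second")  # same instance, config unchanged
assert a is b
assert b.token == "first"
```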
##### 4. Pre-created User Tokens
- Tokens are created during notebook spawn via pre-spawn hook
- Reduces initialization overhead in notebooks
- Provides immediate access to user's secret namespace
#### Error Handling
```plain
Primary error scenarios and responses:

1. User token unavailable
   - Token is expected in the NOTEBOOK_VAULT_TOKEN env var
   - Prompt the user to restart the notebook server if missing

2. Vault token expired
   - Automatic renewal using renew_self if renewable
   - Notebook server restart required if not renewable

3. Vault authentication failure
   - Log detailed error information
   - Check user policy and token configuration

4. Network connectivity issues
   - Built-in retry in the hvac client
   - Provide actionable error messages
```
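The branching above can be condensed into a pure helper; `recovery_action` is a hypothetical function, shown only to make the decision tree explicit (the 600-second threshold matches the renewal rule described earlier):

```python
def recovery_action(has_token: bool, authenticated: bool,
                    renewable: bool, ttl: int) -> str:
    """Map token state onto the user-facing guidance above (illustrative)."""
    if not has_token:
        return "restart notebook server (NOTEBOOK_VAULT_TOKEN missing)"
    if authenticated and renewable and 0 < ttl < 600:
        return "renew via renew_self"
    if not authenticated:
        return "restart notebook server (token expired, not renewable)"
    return "ok"

assert recovery_action(True, True, True, 120) == "renew via renew_self"
```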
#### Configuration
Environment variables passed to notebooks:
```yaml
# JupyterHub pre_spawn_hook sets:
spawner.environment:
# Core services
POSTGRES_HOST: 'postgres-cluster-rw.postgres'
POSTGRES_PORT: '5432'
JUPYTERHUB_API_URL: 'http://hub:8081/hub/api'
BUUNSTACK_LOG_LEVEL: 'info' # or 'debug' for detailed logging
# Vault integration
NOTEBOOK_VAULT_TOKEN: '<User-specific Vault token>'
VAULT_ADDR: 'http://vault.vault.svc:8200'
```
#### Monitoring and Debugging
Enable detailed logging for troubleshooting:
```python
# In notebook
import os
os.environ['BUUNSTACK_LOG_LEVEL'] = 'DEBUG'
# Restart kernel and check logs
from buunstack import SecretStore
secrets = SecretStore()
# Check authentication status
status = secrets.get_status()
print("Username:", status['username'])
print("Vault Address:", status['vault_addr'])
print("Authentication Method:", status['authentication_method'])
print("Vault Authenticated:", status['vault_authenticated'])
```
#### Performance Characteristics
- **Token renewal overhead**: ~10-50ms for renew_self call
- **Memory usage**: Minimal (single token stored as string)
- **Network traffic**: Only during token renewal (when TTL < 10 minutes)
- **Vault impact**: Standard token operations (lookup_self, renew_self)
## Performance Considerations
- **Image Size**: Buun-stack images are ~13GB; plan storage accordingly
- **Pull Time**: Initial pulls take 5-15 minutes depending on network
- **Resource Usage**: Data science workloads require adequate CPU/memory
- **Storage**: NFS provides better performance for shared datasets
- **Token Renewal**: User token renewal adds minimal overhead
For production deployments, consider:
@@ -308,3 +470,4 @@ For production deployments, consider:
- Using faster storage backends
- Configuring resource limits per user
- Setting up monitoring and alerts
- Monitoring Vault token expiration and renewal patterns

env/justfile vendored

@@ -78,6 +78,7 @@ setup:
gomplate -f env.local.gomplate -o ../.env.local
npm i
pip install build
# Set a specific key in .env.local
[working-directory("..")]


@@ -146,12 +146,6 @@ RUN pip install \
tavily-python \
tweet-preprocessor
# Install buunstack package
COPY *.whl /opt/
RUN pip install /opt/*.whl && \
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
# Install PyTorch with pip (https://pytorch.org/get-started/locally/)
# langchain-openai must be updated to avoid pydantic v2 error
# https://github.com/run-llama/llama_index/issues/16540
@@ -164,6 +158,11 @@ RUN pip install --no-cache-dir --extra-index-url=https://pypi.nvidia.com --index
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
# Install buunstack package
COPY *.whl /opt/
RUN pip install /opt/*.whl && \
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
WORKDIR "${HOME}"
EXPOSE 4040


@@ -146,12 +146,6 @@ RUN pip install \
tavily-python \
tweet-preprocessor
# Install buunstack package
COPY *.whl /opt/
RUN pip install /opt/*.whl && \
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
# Install PyTorch with pip (https://pytorch.org/get-started/locally/)
# langchain-openai must be updated to avoid pydantic v2 error
# https://github.com/run-llama/llama_index/issues/16540
@@ -164,5 +158,11 @@ RUN pip install --no-cache-dir --index-url 'https://download.pytorch.org/whl/cpu
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
# Install buunstack package
COPY *.whl /opt/
RUN pip install /opt/*.whl && \
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
WORKDIR "${HOME}"
EXPOSE 4040


@@ -1,4 +1,21 @@
hub:
extraEnv:
JUPYTERHUB_CRYPT_KEY: {{ .Env.JUPYTERHUB_CRYPT_KEY | quote }}
# Install packages at container startup
extraFiles:
startup.sh:
mountPath: /usr/local/bin/startup.sh
mode: 0755
stringData: |
#!/bin/bash
pip install --no-cache-dir hvac==2.3.0
exec jupyterhub --config /usr/local/etc/jupyterhub/jupyterhub_config.py --upgrade-db
# Override the default command to run our startup script first
command:
- /usr/local/bin/startup.sh
config:
JupyterHub:
authenticator_class: generic-oauth
@@ -24,48 +41,97 @@ hub:
- profile
- email
{{- if eq .Env.JUPYTERHUB_VAULT_INTEGRATION_ENABLED "true" }}
extraConfig:
01-vault-integration: |
import os
pre-spawn-hook: |
# Set environment variables for spawned containers
import hvac
async def pre_spawn_hook(spawner):
"""Pass OIDC tokens and Vault config to notebook environment"""
auth_state = await spawner.user.get_auth_state()
if auth_state:
if 'access_token' in auth_state:
spawner.environment['JUPYTERHUB_OIDC_ACCESS_TOKEN'] = auth_state['access_token']
if 'refresh_token' in auth_state:
spawner.environment['JUPYTERHUB_OIDC_REFRESH_TOKEN'] = auth_state['refresh_token']
if 'id_token' in auth_state:
spawner.environment['JUPYTERHUB_OIDC_ID_TOKEN'] = auth_state['id_token']
if 'expires_at' in auth_state:
spawner.environment['JUPYTERHUB_OIDC_TOKEN_EXPIRES_AT'] = str(auth_state['expires_at'])
"""Set essential environment variables for spawned containers"""
# PostgreSQL configuration
spawner.environment["POSTGRES_HOST"] = "postgres-cluster-rw.postgres"
spawner.environment["POSTGRES_PORT"] = "5432"
# Add Keycloak configuration for token refresh
spawner.environment['KEYCLOAK_HOST'] = '{{ .Env.KEYCLOAK_HOST }}'
spawner.environment['KEYCLOAK_REALM'] = '{{ .Env.KEYCLOAK_REALM }}'
spawner.environment['KEYCLOAK_CLIENT_ID'] = 'jupyterhub'
# JupyterHub API configuration
spawner.environment["JUPYTERHUB_API_URL"] = "http://hub:8081/hub/api"
# Logging configuration
spawner.environment["BUUNSTACK_LOG_LEVEL"] = "{{ .Env.JUPYTER_BUUNSTACK_LOG_LEVEL }}"
# Create user-specific Vault token directly
try:
username = spawner.user.name
# Step 1: Initialize admin Vault client
vault_client = hvac.Client(url="{{ .Env.VAULT_ADDR }}", verify=False)
vault_client.token = "{{ .Env.JUPYTERHUB_VAULT_TOKEN }}"
if not vault_client.is_authenticated():
raise Exception("Admin token is not authenticated")
# Step 2: Create user-specific policy
user_policy_name = "jupyter-user-{}".format(username)
user_path = "secret/data/jupyter/users/{}/*".format(username)
user_metadata_path = "secret/metadata/jupyter/users/{}/*".format(username)
user_base_path = "secret/metadata/jupyter/users/{}".format(username)
user_policy = (
"# User-specific policy for {}\n".format(username) +
"path \"{}\" ".format(user_path) + "{\n" +
" capabilities = [\"create\", \"update\", \"read\", \"delete\", \"list\"]\n" +
"}\n\n" +
"path \"{}\" ".format(user_metadata_path) + "{\n" +
" capabilities = [\"list\", \"read\", \"delete\", \"update\"]\n" +
"}\n\n" +
"path \"{}\" ".format(user_base_path) + "{\n" +
" capabilities = [\"list\"]\n" +
"}\n\n" +
"# Read access to shared resources\n" +
"path \"secret/data/jupyter/shared/*\" {\n" +
" capabilities = [\"read\", \"list\"]\n" +
"}\n\n" +
"path \"secret/metadata/jupyter/shared\" {\n" +
" capabilities = [\"list\"]\n" +
"}\n\n" +
"# Token management capabilities\n" +
"path \"auth/token/lookup-self\" {\n" +
" capabilities = [\"read\"]\n" +
"}\n\n" +
"path \"auth/token/renew-self\" {\n" +
" capabilities = [\"update\"]\n" +
"}"
)
# Write user-specific policy
try:
vault_client.sys.create_or_update_policy(user_policy_name, user_policy)
spawner.log.info("✅ Created policy: {}".format(user_policy_name))
except Exception as policy_e:
spawner.log.warning("Policy creation failed (may already exist): {}".format(policy_e))
# Step 3: Create user-specific token
token_response = vault_client.auth.token.create(
policies=[user_policy_name],
ttl="1h",
renewable=True,
display_name="notebook-{}".format(username)
)
user_vault_token = token_response["auth"]["client_token"]
lease_duration = token_response["auth"].get("lease_duration", 3600)
# Set user-specific Vault token as environment variable
spawner.environment["NOTEBOOK_VAULT_TOKEN"] = user_vault_token
spawner.log.info("✅ User-specific Vault token created for {} (expires in {}s, renewable)".format(username, lease_duration))
except Exception as e:
spawner.log.error("Failed to create user-specific Vault token for {}: {}".format(spawner.user.name, e))
import traceback
spawner.log.error("Full traceback: {}".format(traceback.format_exc()))
c.Spawner.pre_spawn_hook = pre_spawn_hook
{{- end }}
02-postgres-integration: |
from functools import wraps
# Store the original pre_spawn_hook if it exists
original_hook = c.Spawner.pre_spawn_hook if hasattr(c.Spawner, 'pre_spawn_hook') else None
async def postgres_pre_spawn_hook(spawner):
"""Add PostgreSQL connection information to notebook environment"""
# Call the original hook first if it exists
if original_hook:
await original_hook(spawner)
# Add PostgreSQL configuration
spawner.environment['POSTGRES_HOST'] = 'postgres-cluster-rw.postgres'
spawner.environment['POSTGRES_PORT'] = '5432'
c.Spawner.pre_spawn_hook = postgres_pre_spawn_hook
podSecurityContext:
fsGroup: {{ .Env.JUPYTER_FSGID }}
@@ -85,23 +151,8 @@ singleuser:
{{ end -}}
capacity: 10Gi
{{- if eq .Env.JUPYTERHUB_VAULT_INTEGRATION_ENABLED "true" }}
extraEnv:
VAULT_ADDR: "{{ .Env.VAULT_ADDR }}"
KEYCLOAK_HOST: "{{ .Env.KEYCLOAK_HOST }}"
KEYCLOAK_REALM: "{{ .Env.KEYCLOAK_REALM }}"
# lifecycleHooks:
# postStart:
# exec:
# command:
# - /bin/bash
# - -c
# - |
# # Install hvac for Vault integration
# mamba install hvac requests
# echo "Vault integration ready"
{{- end }}
networkPolicy:
egress:
- to:
@@ -129,7 +180,6 @@ singleuser:
ports:
- port: 4000
protocol: TCP
{{- if eq .Env.JUPYTERHUB_VAULT_INTEGRATION_ENABLED "true" }}
- to:
- namespaceSelector:
matchLabels:
@@ -137,9 +187,6 @@ singleuser:
ports:
- port: 8200
protocol: TCP
- port: 8201
protocol: TCP
{{- end }}
- to:
- ipBlock:
cidr: 0.0.0.0/0


@@ -5,7 +5,7 @@ export JUPYTERHUB_CHART_VERSION := env("JUPYTERHUB_CHART_VERSION", "4.2.0")
export JUPYTERHUB_OIDC_CLIENT_ID := env("JUPYTERHUB_OIDC_CLIENT_ID", "jupyterhub")
export JUPYTERHUB_NFS_PV_ENABLED := env("JUPYTERHUB_NFS_PV_ENABLED", "")
export JUPYTERHUB_VAULT_INTEGRATION_ENABLED := env("JUPYTERHUB_VAULT_INTEGRATION_ENABLED", "")
export JUPYTER_PYTHON_KERNEL_TAG := env("JUPYTER_PYTHON_KERNEL_TAG", "python-3.12-8")
export JUPYTER_PYTHON_KERNEL_TAG := env("JUPYTER_PYTHON_KERNEL_TAG", "python-3.12-24")
export KERNEL_IMAGE_BUUN_STACK_REPOSITORY := env("KERNEL_IMAGE_BUUN_STACK_REPOSITORY", "buun-stack-notebook")
export KERNEL_IMAGE_BUUN_STACK_CUDA_REPOSITORY := env("KERNEL_IMAGE_BUUN_STACK_CUDA_REPOSITORY", "buun-stack-cuda-notebook")
export JUPYTER_PROFILE_MINIMAL_ENABLED := env("JUPYTER_PROFILE_MINIMAL_ENABLED", "false")
@@ -20,6 +20,7 @@ export IMAGE_REGISTRY := env("IMAGE_REGISTRY", "localhost:30500")
export KEYCLOAK_REALM := env("KEYCLOAK_REALM", "buunstack")
export LONGHORN_NAMESPACE := env("LONGHORN_NAMESPACE", "longhorn")
export VAULT_ADDR := env("VAULT_ADDR", "http://vault.vault.svc:8200")
export JUPYTER_BUUNSTACK_LOG_LEVEL := env("JUPYTER_BUUNSTACK_LOG_LEVEL", "info")
[private]
default:
@@ -54,6 +55,15 @@ install:
--placeholder="e.g., jupyter.example.com"
)
done
# Generate JUPYTERHUB_CRYPT_KEY if not exists
if [ -z "${JUPYTERHUB_CRYPT_KEY:-}" ]; then
echo "Generating JUPYTERHUB_CRYPT_KEY..."
export JUPYTERHUB_CRYPT_KEY=$(just utils::random-password)
echo "JUPYTERHUB_CRYPT_KEY=${JUPYTERHUB_CRYPT_KEY}" >> ../../.env.local
echo "✓ JUPYTERHUB_CRYPT_KEY generated and saved to .env.local"
fi
just create-namespace
# just k8s::copy-regcred ${JUPYTERHUB_NAMESPACE}
just keycloak::create-client ${KEYCLOAK_REALM} ${JUPYTERHUB_OIDC_CLIENT_ID} \
@@ -96,8 +106,17 @@ install:
fi
kubectl apply -n ${JUPYTERHUB_NAMESPACE} -f nfs-pvc.yaml
fi
# Create or get JupyterHub Vault token before gomplate
if ! just vault::exist jupyterhub/vault-token &>/dev/null; then
echo "Creating JupyterHub Vault token..."
just create-jupyterhub-vault-token
fi
export JUPYTERHUB_VAULT_TOKEN=$(just vault::get jupyterhub/vault-token token)
# https://z2jh.jupyter.org/en/stable/
gomplate -f jupyterhub-values.gomplate.yaml -o jupyterhub-values.yaml
helm upgrade --cleanup-on-fail --install jupyterhub jupyterhub/jupyterhub \
--version ${JUPYTERHUB_CHART_VERSION} -n ${JUPYTERHUB_NAMESPACE} \
--timeout=20m -f jupyterhub-values.yaml
@@ -138,62 +157,68 @@ delete-pv:
# Build Jupyter notebook kernel images
build-kernel-images:
#!/bin/bash
set -euo pipefail
# Build python package wheel
set -euxo pipefail
(
cd ../python-package
rm -rf dist/ build/ *.egg-info/
SETUPTOOLS_SCM_PRETEND_VERSION_FOR_BUUNSTACK=0.1.0 python -m build --wheel
cd ../jupyterhub
# Copy built wheel to image directories
cp ../python-package/dist/*.whl ./images/datastack-notebook/
cp ../python-package/dist/*.whl ./images/datastack-cuda-notebook/
)
(
cd ./images/datastack-notebook
cp ../../../python-package/dist/*.whl ./
docker build -t \
${IMAGE_REGISTRY}/${KERNEL_IMAGE_BUUN_STACK_REPOSITORY}:${JUPYTER_PYTHON_KERNEL_TAG} \
--build-arg spark_version="3.5.4" \
--build-arg spark_download_url="https://archive.apache.org/dist/spark/" \
.
)
rm -f ./images/datastack-notebook/*.whl
if [ "${JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED}" = "true" ]; then
(
cd ./images/datastack-cuda-notebook
cp ../../../python-package/dist/*.whl ./
docker build -t \
${IMAGE_REGISTRY}/${KERNEL_IMAGE_BUUN_STACK_CUDA_REPOSITORY}:${JUPYTER_PYTHON_KERNEL_TAG} \
--build-arg spark_version="3.5.4" \
--build-arg spark_download_url="https://archive.apache.org/dist/spark/" \
.
)
# Clean up copied wheel files
rm -f ./images/datastack-notebook/*.whl
rm -f ./images/datastack-cuda-notebook/*.whl
fi
# Push Jupyter notebook kernel images
push-kernel-images: build-kernel-images
docker push ${IMAGE_REGISTRY}/${KERNEL_IMAGE_BUUN_STACK_REPOSITORY}:${JUPYTER_PYTHON_KERNEL_TAG}
docker push ${IMAGE_REGISTRY}/${KERNEL_IMAGE_BUUN_STACK_CUDA_REPOSITORY}:${JUPYTER_PYTHON_KERNEL_TAG}
# Configure Vault for JupyterHub integration
setup-vault-integration:
#!/bin/bash
set -euo pipefail
echo "Creating JupyterHub Vault policy..."
just vault::write-policy jupyter-user $(pwd)/vault-policy.hcl
echo "✓ JupyterHub policy created"
docker push ${IMAGE_REGISTRY}/${KERNEL_IMAGE_BUUN_STACK_REPOSITORY}:${JUPYTER_PYTHON_KERNEL_TAG}
if [ "${JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED}" = "true" ]; then
docker push ${IMAGE_REGISTRY}/${KERNEL_IMAGE_BUUN_STACK_CUDA_REPOSITORY}:${JUPYTER_PYTHON_KERNEL_TAG}
fi
# Setup JWT auth for JupyterHub tokens (no re-authentication needed)
# Setup Vault integration for JupyterHub (user-specific tokens)
setup-vault-jwt-auth:
#!/bin/bash
set -euo pipefail
echo "Setting up Vault integration for JupyterHub..."
just setup-vault-integration
just vault::setup-jwt-auth "jupyterhub" "jupyter-token" "jupyter-user"
echo "✓ Vault integration configured"
echo "✓ Vault integration configured (user-specific tokens)"
echo ""
echo "Users can now access Vault from notebooks using:"
echo " import os, hvac"
echo " client = hvac.Client(url=os.getenv('VAULT_ADDR'), verify=False)"
echo " client.auth.jwt.jwt_login("
echo " role='jupyter-token',"
echo " jwt=os.getenv('JUPYTERHUB_OIDC_ACCESS_TOKEN'),"
echo " path='jwt'"
echo " )"
echo " from buunstack import SecretStore"
echo " secrets = SecretStore()"
echo " # Each user gets their own isolated Vault token and policy"
# Create JupyterHub Vault token (uses admin policy for JWT operations)
create-jupyterhub-vault-token ttl="720h":
#!/bin/bash
set -euo pipefail
echo "Creating JupyterHub Vault token with admin policy..."
# JupyterHub needs admin privileges to read Keycloak credentials from Vault
# Create token and store in Vault
just vault::create-token-and-store admin jupyterhub/vault-token {{ ttl }}
echo "✓ JupyterHub Vault token created and stored"
echo ""
echo "To use in JupyterHub deployment:"
echo " JUPYTERHUB_VAULT_TOKEN=\$(just vault::get jupyterhub/vault-token token)"


@@ -1,26 +0,0 @@
# JupyterHub user policy for Vault access
# Read access to shared jupyter resources
path "secret/data/jupyter/shared/*" {
capabilities = ["read", "list"]
}
# Allow users to list shared directory
path "secret/metadata/jupyter/shared" {
capabilities = ["list"]
}
# Full access to user-specific paths
path "secret/data/jupyter/users/{{identity.entity.aliases.auth_jwt_*.metadata.username}}/*" {
capabilities = ["create", "update", "read", "delete", "list"]
}
# Allow users to list their own directory
path "secret/metadata/jupyter/users/{{identity.entity.aliases.auth_jwt_*.metadata.username}}/*" {
capabilities = ["list", "read", "delete"]
}
# Allow users to list only their own user directory for navigation
path "secret/metadata/jupyter/users/{{identity.entity.aliases.auth_jwt_*.metadata.username}}" {
capabilities = ["list"]
}


@@ -6,4 +6,5 @@ just = "1.42.4"
k3sup = "0.13.10"
kubelogin = "1.34.0"
node = "22.18.0"
python = "3.12.11"
vault = "1.20.2"


@@ -1,14 +1,14 @@
# buunstack
A Python package for buun-stack that provides secure secrets management with HashiCorp Vault and automatic Keycloak OIDC token refresh for JupyterHub users.
A Python package for buun-stack that provides secure secrets management with HashiCorp Vault using pre-acquired Vault tokens from JupyterHub for seamless authentication.
## Features
- 🔒 **Secure Secrets Management**: Integration with HashiCorp Vault
- 🔄 **Automatic Token Refresh**: Seamless Keycloak OIDC token management
- 🚀 **Pre-acquired Authentication**: Uses Vault tokens created at notebook spawn
- 📱 **Simple API**: Easy-to-use interface for secrets storage and retrieval
- 🔄 **Automatic Token Renewal**: Built-in token refresh for long-running sessions
- 🏢 **Enterprise Ready**: Built for production environments
- 🚀 **JupyterHub Integration**: Native support for JupyterHub workflows
## Quick Start
@@ -23,15 +23,15 @@ pip install buunstack
```python
from buunstack import SecretStore
# Initialize with automatic token refresh (default)
# Initialize with pre-acquired Vault token (automatic)
secrets = SecretStore()
# Put API keys and configuration
secrets.put('api-keys', {
'openai_key': 'sk-your-key-here',
'github_token': 'ghp_your-token',
'database_url': 'postgresql://user:pass@host:5432/db'
})
secrets.put('api-keys',
openai_key='sk-your-key-here',
github_token='ghp_your-token',
database_url='postgresql://user:pass@host:5432/db'
)
# Get secrets
api_keys = secrets.get('api-keys')
@@ -44,18 +44,19 @@ all_secrets = secrets.list()
### Configuration Options
```python
# Manual token management
secrets = SecretStore(auto_token_refresh=False)
# Disable JupyterHub token synchronization
secrets = SecretStore(sync_with_jupyterhub=False)
# Custom refresh timing
# Custom token validity buffer
secrets = SecretStore(
auto_token_refresh=True,
refresh_buffer_seconds=600, # Refresh 10 minutes before expiry
background_refresh_interval=3600 # Background refresh every hour
sync_with_jupyterhub=True,
refresh_buffer_seconds=600 # Sync tokens 10 minutes before expiry
)
# Start background auto-refresh
refresher = secrets.start_background_refresh()
# Check synchronization status
status = secrets.get_status()
print(f"JupyterHub sync enabled: {status['sync_with_jupyterhub']}")
print(f"API configured: {status.get('jupyterhub_api_configured', False)}")
```
### Environment Variables Helper

python-package/buunstack/.gitignore vendored Normal file

@@ -0,0 +1 @@
/examples/


@@ -8,6 +8,6 @@ try:
from ._version import __version__
except ImportError:
__version__ = "unknown"
__author__ = "Buun Stack Team"
__author__ = "Buun ch."
__all__ = ["SecretStore", "get_env_from_secrets", "put_env_to_secrets"]


@@ -12,7 +12,7 @@ def quickstart_example():
print("🚀 buunstack QuickStart Example")
print("=" * 40)
# Initialize SecretStore (auto-refresh enabled by default)
# Initialize SecretStore (JupyterHub sync enabled by default)
secrets = SecretStore()
print(f"✅ SecretStore initialized for user: {secrets.username}")
@@ -87,32 +87,29 @@ def advanced_example():
print("\n🔧 Advanced Configuration Example")
print("=" * 40)
# Manual token management
# Manual token management (disable JupyterHub sync)
print("\n1⃣ Manual token management:")
manual_secrets = SecretStore(auto_token_refresh=False)
print(f" Auto-refresh: {manual_secrets.auto_token_refresh}")
manual_secrets = SecretStore(sync_with_jupyterhub=False)
print(f" JupyterHub sync: {manual_secrets.sync_with_jupyterhub}")
# Custom timing
print("\n2⃣ Custom refresh timing:")
custom_secrets = SecretStore(
auto_token_refresh=True,
refresh_buffer_seconds=600, # Refresh 10 minutes before expiry
background_refresh_interval=3600, # Background refresh every hour
sync_with_jupyterhub=True,
refresh_buffer_seconds=600, # Sync 10 minutes before expiry
)
print(f" Refresh buffer: {custom_secrets.refresh_buffer_seconds}s")
print(f" Background interval: {custom_secrets.background_refresh_interval}s")
print(f" JupyterHub sync: {custom_secrets.sync_with_jupyterhub}")
# Background refresh (if auto_token_refresh is enabled)
if custom_secrets.auto_token_refresh and custom_secrets.refresh_token:
print("\n3⃣ Starting background refresher:")
refresher = custom_secrets.start_background_refresh()
refresher_status = refresher.get_status()
print(f" Running: {refresher_status['running']}")
print(f" Interval: {refresher_status['interval_seconds']}s")
# Stop the refresher
custom_secrets.stop_background_refresh()
print(" Stopped background refresher")
# Check JupyterHub API configuration
print("\n3⃣ JupyterHub API configuration:")
status = custom_secrets.get_status()
api_configured = status.get('jupyterhub_api_configured', False)
print(f" API configured: {api_configured}")
if api_configured:
print(f" API URL: {custom_secrets.jupyterhub_api_url}")
else:
print(" API token or URL not configured")
if __name__ == "__main__":


@@ -1,61 +1,60 @@
"""
Secrets management for JupyterHub with Vault backend
Secrets management with user-specific Vault token authentication
"""
import logging
import os
import threading
import warnings
from datetime import datetime, timedelta
from typing import Any, overload
import hvac
import jwt
import requests
# Suppress SSL warnings for self-signed certificates
warnings.filterwarnings("ignore", message="Unverified HTTPS request")
# Set up logging (disabled by default)
logger = logging.getLogger("buunstack")
logger.addHandler(logging.NullHandler()) # Default to no output
log_level_str = os.getenv("BUUNSTACK_LOG_LEVEL", "warning").upper()
log_level = getattr(logging, log_level_str, logging.WARNING)
logger.setLevel(log_level)
# For Jupyter notebooks, we need to ensure proper logging configuration
# Always add handler if none exists, regardless of conditions
if not logger.handlers:
handler = logging.StreamHandler()
handler.setLevel(log_level)
formatter = logging.Formatter(
"%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
handler.setFormatter(formatter)
logger.addHandler(handler)
# Disable propagation to avoid root logger interference in notebooks
logger.propagate = False
# Debug: Log the handler addition
if log_level <= logging.DEBUG:
print(f"DEBUG: Added StreamHandler to buunstack logger (level={log_level})")
logging.getLogger().setLevel(log_level)
# Additional debug information for troubleshooting
if log_level <= logging.DEBUG:
print(
f"DEBUG: buunstack logger initialized - level={logger.level}, handlers={len(logger.handlers)}"
)
class SecretStore:
"""
Simple secrets management for JupyterHub with Vault backend.
Secure secrets management with JupyterHub API authentication.
SecretStore provides a secure interface for managing secrets in JupyterHub
environments using HashiCorp Vault as the backend storage. It supports
automatic OIDC token refresh via Keycloak integration and provides both
manual and background token management options.
This class implements the singleton pattern to ensure only one instance
exists per user session, preventing duplicate background refresh threads.
Attributes
----------
auto_token_refresh : bool
Whether automatic token refresh is enabled.
refresh_buffer_seconds : int
Seconds before token expiry to trigger refresh.
background_refresh_interval : int
Seconds between background refresh checks.
username : str or None
JupyterHub username from environment.
vault_addr : str or None
Vault server address from environment.
base_path : str
Base path for user's secrets in Vault.
Uses JupyterHub's vault-token API endpoint to obtain Vault tokens
by exchanging auth_state JWT. Implements singleton pattern for
consistent state across imports.
Examples
--------
>>> secrets = SecretStore()
>>> secrets.put('api-keys', openai='sk-123', github='ghp-456')
>>> data = secrets.get('api-keys')
>>> print(data['openai'])
'sk-123'
>>> # Or get specific field directly
>>> openai_key = secrets.get('api-keys', field='openai')
>>> print(openai_key)
'sk-123'
@@ -65,203 +64,84 @@ class SecretStore:
_initialized = False
def __new__(cls, *args, **kwargs):
"""Return singleton SecretStore instance."""
if cls._instance is None:
cls._instance = super().__new__(cls)
return cls._instance
def __init__(
self,
auto_token_refresh: bool = True,
refresh_buffer_seconds: int = 300,
background_refresh_interval: int = 1800,
):
def __init__(self):
"""
Initialize SecretStore with authentication and configuration.
Initialize SecretStore with JupyterHub API authentication.
Note: Due to singleton pattern, parameters are only used on the first
instantiation. Subsequent calls return the existing instance with
its original configuration.
Parameters
----------
auto_token_refresh : bool, optional
Enable automatic token refresh using Keycloak OIDC, by default True.
Requires KEYCLOAK_HOST, KEYCLOAK_REALM, and JUPYTERHUB_OIDC_REFRESH_TOKEN
environment variables. Only used on first instantiation.
refresh_buffer_seconds : int, optional
Seconds before token expiry to trigger refresh, by default 300.
Only used when auto_token_refresh is True. Only used on first instantiation.
background_refresh_interval : int, optional
Seconds between background refresh checks, by default 1800.
Only used when background refresh is started. Only used on first instantiation.
Raises
------
ValueError
If required environment variables are missing:
- JUPYTERHUB_USER: JupyterHub username
- VAULT_ADDR: Vault server address
- JUPYTERHUB_OIDC_ACCESS_TOKEN: Initial access token
- KEYCLOAK_HOST, KEYCLOAK_REALM: Required for auto_token_refresh
ConnectionError
If unable to connect to Vault server or authenticate.
Examples
--------
>>> # Basic usage with auto-refresh
>>> secrets = SecretStore()
>>> # Manual token management
>>> secrets = SecretStore(auto_token_refresh=False)
>>> # Custom timing
>>> secrets = SecretStore(
... refresh_buffer_seconds=600,
... background_refresh_interval=3600
... )
Uses JupyterHub's vault-token API endpoint to exchange
auth_state JWT for Vault tokens.
"""
if self._initialized:
    return

self.username = os.getenv("JUPYTERHUB_USER")
self.vault_addr = os.getenv("VAULT_ADDR")
self.base_path = f"jupyter/users/{self.username}"

# Initialize Vault client
self.client = hvac.Client(url=self.vault_addr, verify=False)

# Attempt authentication using the pre-acquired Vault token from notebook spawn
self._authenticate_vault()

logger.info(f"SecretStore initialized for user: {self.username}")
logger.info("Using user-specific Vault token authentication")
self._initialized = True
def _get_token_expiry(self, token: str) -> datetime | None:
"""Extract expiry time from JWT token"""
if not token:
return None
try:
payload = jwt.decode(token, options={"verify_signature": False})
exp = payload.get("exp")
if exp:
return datetime.fromtimestamp(exp)
# Fallback to iat + 1 hour
iat = payload.get("iat")
if iat:
return datetime.fromtimestamp(iat + 3600)
except Exception as e:
logger.warning(f"Could not decode token expiry: {e}")
return datetime.now() + timedelta(hours=1)
def _is_token_valid(self) -> bool:
"""Check if current token is still valid"""
if not self.auto_token_refresh or not self.token_expiry:
return True # Assume valid if refresh is disabled
time_until_expiry = (self.token_expiry - datetime.now()).total_seconds()
return time_until_expiry > self.refresh_buffer_seconds
def _refresh_keycloak_tokens(self) -> bool:
"""Refresh tokens using Keycloak refresh token"""
if not self.auto_token_refresh:
return False
if not self.refresh_token or not self.keycloak_host or not self.keycloak_realm:
logger.error("Missing refresh token or Keycloak configuration")
return False
token_url = f"https://{self.keycloak_host}/realms/{self.keycloak_realm}/protocol/openid-connect/token"
try:
logger.info("Refreshing tokens from Keycloak...")
response = requests.post(
token_url,
data={
"grant_type": "refresh_token",
"refresh_token": self.refresh_token,
"client_id": self.keycloak_client_id,
},
verify=False,
)
if response.status_code == 200:
tokens = response.json()
# Update tokens
self.access_token = tokens["access_token"]
if "refresh_token" in tokens:
self.refresh_token = tokens["refresh_token"]
# Update environment variables
os.environ["JUPYTERHUB_OIDC_ACCESS_TOKEN"] = self.access_token
if "refresh_token" in tokens:
os.environ["JUPYTERHUB_OIDC_REFRESH_TOKEN"] = self.refresh_token
# Update token expiry
self.token_expiry = self._get_token_expiry(self.access_token)
logger.info("✅ Tokens refreshed successfully")
return True
else:
logger.error(
f"Token refresh failed: {response.status_code} - {response.text}"
)
return False
except Exception as e:
logger.error(f"Exception during token refresh: {e}")
return False
def _authenticate_vault(self):
    """
    Authenticate with Vault using user-specific token from notebook spawn.

    Raises
    ------
    Exception
        If user-specific Vault token is not available.
    """
    vault_token = os.getenv("NOTEBOOK_VAULT_TOKEN")
    if not vault_token:
        raise Exception(
            "No user-specific Vault token available. "
            "Please restart your notebook server."
        )
    self.client.token = vault_token
    logger.info("✅ Using user-specific Vault token from notebook spawn")
def _ensure_authenticated(self):
    """Ensure we have valid Vault authentication with token renewal."""
    try:
        if self.client.is_authenticated():
            # Check if token needs renewal (if renewable and close to expiry)
            try:
                token_info = self.client.auth.token.lookup_self()
                ttl = token_info.get("data", {}).get("ttl", 0)
                renewable = token_info.get("data", {}).get("renewable", False)
                # Renew if TTL < 10 minutes and renewable
                if renewable and ttl > 0 and ttl < 600:
                    logger.info(f"Renewing Vault token (TTL: {ttl}s)")
                    self.client.auth.token.renew_self()
                    logger.info("✅ Vault token renewed successfully")
            except Exception as e:
                logger.warning(f"Token renewal check failed: {e}")
            return
    except Exception:
        pass
    # Token expired or invalid - no fallback available with user-specific tokens
    raise Exception(
        "User-specific Vault token expired and cannot be refreshed. "
        "Please restart your notebook server."
    )
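The renewal policy above can be distilled into a small pure function. This is an illustrative sketch, not part of the library; the name `should_renew` is hypothetical, and it only mirrors the threshold check (renewable, positive TTL, under 10 minutes) so the logic can be tested without a Vault server:

```python
def should_renew(ttl: int, renewable: bool, threshold_seconds: int = 600) -> bool:
    """Renew only tokens Vault reports as renewable whose remaining
    TTL is positive but below the threshold (default: 10 minutes)."""
    return renewable and 0 < ttl < threshold_seconds
```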
def put(self, key: str, **kwargs: Any) -> None:
@@ -432,20 +312,24 @@ class SecretStore:
logger.warning(f'Could not get secret "{key}": {e}')
raise KeyError(f"Secret '{key}' not found") from e
def delete(self, key: str, field: str | None = None) -> None:
"""
Delete a secret or a specific field from your personal storage.
If field is None, permanently removes the entire secret and all its
versions. If field is specified, removes only that field from the secret.
This operation cannot be undone.
Parameters
----------
key : str
The key/name of the secret to delete or modify.
field : str, optional
Specific field to delete from the secret. If None, deletes entire secret.
Raises
------
KeyError
If the key or field doesn't exist.
ConnectionError
If unable to connect to Vault server.
hvac.exceptions.Forbidden
@@ -456,12 +340,20 @@ class SecretStore:
Examples
--------
>>> secrets = SecretStore()
>>> # Delete entire secret
>>> secrets.delete('old-api-key')
>>> # Secret is permanently removed
>>>
>>> # Delete only specific field
>>> secrets.put('credentials', github='token123', aws='secret456')
>>> secrets.delete('credentials', field='github')
>>> # Now only 'aws' field remains
"""
self._ensure_authenticated()
path = f"{self.base_path}/{key}"
if field is None:
# Delete entire secret
try:
self.client.secrets.kv.v2.delete_metadata_and_all_versions(
path=path, mount_point="secret"
@@ -470,6 +362,44 @@ class SecretStore:
except Exception as e:
logger.error(f'Failed to delete secret "{key}": {e}')
raise
else:
# Delete specific field only
try:
# First, get the current secret
response = self.client.secrets.kv.v2.read_secret_version(
path=path, mount_point="secret", raise_on_deleted_version=False
)
if response and "data" in response and "data" in response["data"]:
data = response["data"]["data"]
# Check if field exists
if field not in data:
raise KeyError(f"Field '{field}' not found in secret '{key}'")
# Remove the field
del data[field]
# If no fields remain, delete the entire secret
if not data:
self.client.secrets.kv.v2.delete_metadata_and_all_versions(
path=path, mount_point="secret"
)
logger.info(f"Deleted secret '{key}' (no fields remaining)")
else:
# Update the secret without the deleted field
self.client.secrets.kv.v2.create_or_update_secret(
path=path, secret=data, mount_point="secret"
)
logger.info(f"Deleted field '{field}' from secret '{key}'")
else:
raise KeyError(f"Secret '{key}' not found")
except KeyError:
raise
except Exception as e:
logger.error(
f"Failed to delete field '{field}' from secret '{key}': {e}"
)
raise
def list(self) -> list[str]:
"""
@@ -505,236 +435,35 @@ class SecretStore:
def get_status(self) -> dict[str, Any]:
"""
Get status information about the SecretStore instance.
Returns
-------
dict[str, Any]
    Status dictionary containing:
    - username: JupyterHub username
    - vault_addr: Vault server address
    - authentication_method: Authentication method used
    - vault_authenticated: Whether Vault client is authenticated
Examples
--------
>>> secrets = SecretStore()
>>> status = secrets.get_status()
>>> print(f"User: {status['username']}")
"""
status = {
    "username": self.username,
    "vault_addr": self.vault_addr,
    "authentication_method": "User-specific Vault token",
}
if self.auto_token_refresh:
status.update(
{
"has_refresh_token": bool(self.refresh_token),
"keycloak_configured": bool(
self.keycloak_host and self.keycloak_realm
),
}
)
if self.token_expiry:
time_remaining = (self.token_expiry - datetime.now()).total_seconds()
status.update(
{
"token_valid": self._is_token_valid(),
"token_expiry": self.token_expiry.isoformat(),
"seconds_remaining": max(0, time_remaining),
"minutes_remaining": max(0, time_remaining / 60),
}
)
return status
def start_background_refresh(self) -> "BackgroundRefresher":
"""
Start automatic background token refreshing.
Begins a background thread that periodically checks and refreshes
the access token before it expires. Only available when
auto_token_refresh is enabled.
Returns
-------
BackgroundRefresher
The background refresher instance that can be used to monitor
or control the refresh process.
Raises
------
ValueError
If auto_token_refresh is False. Background refresh requires
automatic token refresh to be enabled.
Examples
--------
>>> secrets = SecretStore(auto_token_refresh=True)
>>> refresher = secrets.start_background_refresh()
>>> status = refresher.get_status()
>>> print(f"Background refresh running: {status['running']}")
"""
if not self.auto_token_refresh:
raise ValueError("Background refresh requires auto_token_refresh=True")
if self._background_refresher is None:
self._background_refresher = BackgroundRefresher(
self, interval_seconds=self.background_refresh_interval
)
self._background_refresher.start()
return self._background_refresher
def stop_background_refresh(self) -> None:
"""
Stop the background token refresher.
Stops the background thread that was refreshing tokens automatically.
It's safe to call this method even if no background refresher is running.
Examples
--------
>>> secrets = SecretStore()
>>> refresher = secrets.start_background_refresh()
>>> # ... do some work ...
>>> secrets.stop_background_refresh()
"""
if self._background_refresher:
self._background_refresher.stop()
class BackgroundRefresher:
"""
Background token refresher for automatic token management.
This class runs in a separate daemon thread and periodically checks if
the access token needs to be refreshed, automatically handling the refresh
process to maintain uninterrupted access to Vault.
Attributes
----------
secret_store : SecretStore
The SecretStore instance to refresh tokens for.
interval_seconds : int
Seconds between refresh checks.
refresh_count : int
Number of successful refreshes performed.
last_refresh : datetime or None
Timestamp of the last successful refresh.
Examples
--------
>>> secrets = SecretStore(auto_token_refresh=True)
>>> refresher = secrets.start_background_refresh()
>>> # Refresher runs automatically in background
>>> status = refresher.get_status()
>>> print(f"Refreshes performed: {status['refresh_count']}")
"""
def __init__(self, secret_store: SecretStore, interval_seconds: int = 1800):
"""
Initialize the background refresher.
Parameters
----------
secret_store : SecretStore
The SecretStore instance to manage tokens for.
interval_seconds : int, optional
Seconds between refresh checks, by default 1800 (30 minutes).
"""
self.secret_store = secret_store
self.interval_seconds = interval_seconds
self._stop_event = threading.Event()
self._thread = None
self.refresh_count = 0
self.last_refresh = None
def start(self) -> None:
"""
Start the background refresh thread.
Creates and starts a daemon thread that will periodically check
and refresh tokens. Safe to call multiple times.
"""
if self._thread is None or not self._thread.is_alive():
self._stop_event.clear()
self._thread = threading.Thread(target=self._refresh_loop, daemon=True)
self._thread.start()
logger.info(
f"Started background refresher (interval: {self.interval_seconds}s)"
)
def stop(self) -> None:
"""
Stop the background refresh thread.
Signals the refresh thread to stop and waits up to 5 seconds
for it to finish gracefully.
"""
if self._thread and self._thread.is_alive():
self._stop_event.set()
self._thread.join(timeout=5)
logger.info("Stopped background refresher")
def _refresh_loop(self):
while not self._stop_event.is_set():
if self._stop_event.wait(self.interval_seconds):
break
try:
if self.secret_store._refresh_keycloak_tokens():
self.secret_store._authenticate_vault()
self.refresh_count += 1
self.last_refresh = datetime.now()
logger.info(
f"✅ Background refresh #{self.refresh_count} successful"
)
else:
logger.error("❌ Background refresh failed")
except Exception as e:
logger.error(f"Exception in background refresh: {e}")
status["vault_authenticated"] = self.client.is_authenticated()
except Exception:
status["vault_authenticated"] = False
def get_status(self) -> dict[str, Any]:
"""
Get the current status of the background refresher.
Returns
-------
dict[str, Any]
Status dictionary containing:
- running: Whether the refresh thread is active
- refresh_count: Number of successful refreshes performed
- last_refresh: ISO timestamp of last successful refresh (or None)
- interval_seconds: Configured refresh interval
Examples
--------
>>> refresher = secrets.start_background_refresh()
>>> status = refresher.get_status()
>>> print(f"Running: {status['running']}, Count: {status['refresh_count']}")
"""
return {
"running": self._thread and self._thread.is_alive(),
"refresh_count": self.refresh_count,
"last_refresh": self.last_refresh.isoformat()
if self.last_refresh
else None,
"interval_seconds": self.interval_seconds,
}
return status
# Utility functions
@@ -817,7 +546,7 @@ def put_env_to_secrets(
>>> # Store with custom key
>>> put_env_to_secrets(secrets, {'API_KEY': 'secret'}, 'production-config')
'jupyter/users/username/production-config'
"""
# Convert all values to strings and use **kwargs for put()
string_env_dict = {k: str(v) for k, v in env_dict.items()}
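The conversion step above exists because Vault KV stores string values; every environment value is coerced with `str()` before the `put()` call. A minimal sketch of that behavior (the name `stringify_env` is an assumption, not the library's API) — note that non-string values like ints and bools become their Python string forms:

```python
def stringify_env(env_dict: dict) -> dict[str, str]:
    # Mirror put_env_to_secrets: coerce every value to str so it is
    # storable in Vault KV, which holds string key/value pairs
    return {k: str(v) for k, v in env_dict.items()}
```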

View File

@@ -76,3 +76,7 @@ strict_equality = true
minversion = "6.0"
addopts = "-ra -q"
testpaths = ["tests"]
[tool.pyright]
reportUnusedParameter = "none"
reportUnusedVariable = "warning"

View File

@@ -136,6 +136,29 @@ create-admin-token root_token='': check-env
# Create token with admin policy
vault token create -policy=admin
# Create token with specified policy and store in Vault
create-token-and-store policy path ttl="24h" root_token='': check-env
#!/bin/bash
set -euo pipefail
{{ _vault_root_env_setup }}
echo "Creating token with policy '{{ policy }}'..."
# Create token with specified policy
token_output=$(vault token create -policy={{ policy }} -ttl={{ ttl }} -format=json)
service_token=$(echo "${token_output}" | jq -r '.auth.client_token')
echo "Storing token in Vault at path '{{ path }}'..."
# Store the token in Vault itself for later retrieval
vault kv put -mount=secret {{ path }} token="${service_token}"
echo "✓ Token created and stored in Vault"
echo "Policy: {{ policy }}"
echo "Path: secret/{{ path }}"
echo "Token (first 20 chars): ${service_token:0:20}..."
echo ""
echo "To retrieve the token later:"
echo " just vault::get {{ path }} token"
# Create admin policy for Vault
create-admin-policy root_token='':
#!/bin/bash
@@ -160,6 +183,12 @@ create-admin-policy root_token='':
path "sys/policies/acl/*" {
capabilities = ["create", "read", "update", "delete", "list"]
}
path "auth/token/create" {
capabilities = ["create", "update"]
}
path "auth/token/create/*" {
capabilities = ["create", "update"]
}
EOF
echo "Admin policy created successfully"
@@ -287,7 +316,7 @@ setup-jwt-auth audience role policy='default':
user_claim="preferred_username" \
token_policies="{{ policy }}" \
ttl="1h" \
max_ttl="24h"
max_ttl="48h"
echo "✓ JWT authentication configured"
echo " Audience: {{ audience }}"