docs: reconstruct docs

2025-11-13 13:26:57 +09:00
parent 972adc209d
commit 0ff24310ce
8 changed files with 1164 additions and 590 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -159,6 +159,7 @@ install:
 ```

 ServiceMonitor template (`servicemonitor.gomplate.yaml`):
+
 ```yaml
 {{- if eq .Env.MONITORING_ENABLED "true" }}
 apiVersion: monitoring.coreos.com/v1
@@ -366,3 +367,36 @@ receiving
 - Only write code comments when necessary, as the code should be self-explanatory
  (Avoid trivial comment for each code block)
 - Write output messages and code comments in English
+
+### Markdown Style
+
+When writing Markdown documentation:
+
+1. **NEVER use ordered lists as section headers**:
+   - Ordered lists indent content and are not suitable for headings
+   - Use proper heading levels (####) instead of numbered lists for section titles
+
+   ```markdown
+   <!-- INCORRECT: Ordered list used as headers -->
+   1. **Setup Instructions:**
+
+   Details here...
+
+   2. **Next Step:**
+
+   More details...
+
+   <!-- CORRECT: Use headings instead -->
+   #### Setup Instructions
+
+   Details here...
+
+   #### Next Step
+
+   More details...
+   ```
+
+2. **Always validate with markdownlint-cli2**:
+   - Run `markdownlint-cli2 <file>` before committing any Markdown files
+   - Fix all linting errors to ensure consistent formatting
+   - Pay attention to code block language specifications (MD040) and list formatting (MD029)
--- a/airflow/README.md
+++ b/airflow/README.md
@@ -46,7 +46,7 @@ This document covers Airflow installation, deployment, and debugging in the buun
   **Note**: New users have only Viewer access by default and cannot execute DAGs without role assignment.

 4. **Access Airflow Web UI**:
-   - Navigate to your Airflow instance (e.g., `https://airflow.buun.dev`)
+   - Navigate to your Airflow instance (e.g., `https://airflow.yourdomain.com`)
   - Login with your Keycloak credentials

 ### Uninstalling
@@ -63,7 +63,7 @@ just airflow::uninstall true

 ### 1. Access JupyterHub

-- Navigate to your JupyterHub instance (e.g., `https://jupyter.buun.dev`)
+- Navigate to your JupyterHub instance (e.g., `https://jupyter.yourdomain.com`)
 - Login with your credentials

 ### 2. Navigate to Airflow DAGs Directory
@@ -82,7 +82,7 @@ In JupyterHub, the Airflow DAGs directory is mounted at:

 ### 4. Verify Deployment

-1. Access Airflow Web UI (e.g., `https://airflow.buun.dev`)
+1. Access Airflow Web UI (e.g., `https://airflow.yourdomain.com`)
 2. Check that the DAG `csv_to_postgres` appears in the DAGs list
 3. If the DAG doesn't appear immediately, wait 1-2 minutes for Airflow to detect the new file

--- a/dagster/README.md
+++ b/dagster/README.md
@@ -28,7 +28,7 @@ This document covers Dagster installation, deployment, and debugging in the buun
   ```

 3. **Access Dagster Web UI**:
-   - Navigate to your Dagster instance (e.g., `https://dagster.buun.dev`)
+   - Navigate to your Dagster instance (e.g., `https://dagster.yourdomain.com`)
   - Login with your Keycloak credentials

 ### Uninstalling
--- a/docs/jupyterhub.md
+++ b/docs/jupyterhub.md
@@ -1,577 +1,5 @@
-# JupyterHub
+# JupyterHub Documentation

-JupyterHub provides a multi-user Jupyter notebook environment with Keycloak OIDC authentication, Vault integration for secure secrets management, and custom kernel images for data science workflows.
+This documentation has been moved to [jupyterhub/README.md](../jupyterhub/README.md).

-## Installation
-
-Install JupyterHub with interactive configuration:
-
-```bash
-just jupyterhub::install
-```
-
-This will prompt for:
-
-- JupyterHub host (FQDN)
-- NFS PV usage (if Longhorn is installed)
-- NFS server details (if NFS is enabled)
-- Vault integration setup (requires root token for initial setup)
-
-### Prerequisites
-
-- Keycloak must be installed and configured
-- For NFS storage: Longhorn must be installed
-- For Vault integration: Vault and External Secrets Operator must be installed
-- Helm repository must be accessible
-
-## Kernel Images
-
-### Important Note
-
-Building and using custom buun-stack images requires building the `buunstack` Python package first. The package wheel file will be included in the Docker image during build.
-
-JupyterHub supports multiple kernel image profiles:
-
-### Standard Profiles
-
-- **minimal**: Basic Python environment
-- **base**: Python with common data science packages
-- **datascience**: Full data science stack (default)
-- **pyspark**: PySpark for big data processing
-- **pytorch**: PyTorch for machine learning
-- **tensorflow**: TensorFlow for machine learning
-
-### Buun-Stack Profiles
-
-- **buun-stack**: Comprehensive data science environment with Vault integration
-- **buun-stack-cuda**: CUDA-enabled version with GPU support
-
-## Profile Configuration
-
-Enable/disable profiles using environment variables:
-
-```bash
-# Enable buun-stack profile (CPU version)
-JUPYTER_PROFILE_BUUN_STACK_ENABLED=true
-
-# Enable buun-stack CUDA profile (GPU version)
-JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED=true
-
-# Disable default datascience profile
-JUPYTER_PROFILE_DATASCIENCE_ENABLED=false
-```
-
-Available profile variables:
-
-- `JUPYTER_PROFILE_MINIMAL_ENABLED`
-- `JUPYTER_PROFILE_BASE_ENABLED`
-- `JUPYTER_PROFILE_DATASCIENCE_ENABLED`
-- `JUPYTER_PROFILE_PYSPARK_ENABLED`
-- `JUPYTER_PROFILE_PYTORCH_ENABLED`
-- `JUPYTER_PROFILE_TENSORFLOW_ENABLED`
-- `JUPYTER_PROFILE_BUUN_STACK_ENABLED`
-- `JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED`
-
-Only `JUPYTER_PROFILE_DATASCIENCE_ENABLED` is true by default.
-
-## Buun-Stack Images
-
-Buun-stack images provide comprehensive data science environments with:
-
-- All standard data science packages (NumPy, Pandas, Scikit-learn, etc.)
-- Deep learning frameworks (PyTorch, TensorFlow, Keras)
-- Big data tools (PySpark, Apache Arrow)
-- NLP and ML libraries (LangChain, Transformers, spaCy)
-- Database connectors and tools
-- **Vault integration** with `buunstack` Python package
-
-### Building Custom Images
-
-Build and push buun-stack images to your registry:
-
-```bash
-# Build images (includes building the buunstack Python package)
-just jupyterhub::build-kernel-images
-
-# Push to registry
-just jupyterhub::push-kernel-images
-```
-
-The build process:
-
-1. Builds the `buunstack` Python package wheel
-2. Copies the wheel into the Docker build context
-3. Installs the wheel in the Docker image
-4. Cleans up temporary files
-
-⚠️ **Note**: Buun-stack images are comprehensive and large (~13GB). Initial image pulls and deployments take significant time due to the extensive package set.
-
-### Image Configuration
-
-Configure image settings in `.env.local`:
-
-```bash
-# Image registry
-IMAGE_REGISTRY=localhost:30500
-
-# Image tag (current default)
-JUPYTER_PYTHON_KERNEL_TAG=python-3.12-28
-```
-
-## Vault Integration
-
-### Overview
-
-Vault integration enables secure secrets management directly from Jupyter notebooks. The system uses:
-
-- **ExternalSecret** to fetch the admin token from Vault
-- **Renewable tokens** with unlimited Max TTL to avoid 30-day system limitations
-- **Token renewal script** that automatically renews tokens at TTL/2 intervals (minimum 30 seconds)
-- **User-specific tokens** created during notebook spawn with isolated access
-
-### Architecture
-
-```plain
-┌────────────────────────────────────────────────────────────────┐
-│                         JupyterHub Hub Pod                     │
-│                                                                │
-│  ┌──────────────┐  ┌────────────────┐  ┌────────────────────┐  │
-│  │     Hub      │  │ Token Renewer  │  │  ExternalSecret    │  │
-│  │  Container   │◄─┤   Sidecar      │◄─┤   (mounted as      │  │
-│  │              │  │                │  │    Secret)         │  │
-│  └──────────────┘  └────────────────┘  └────────────────────┘  │
-│         │                    │                     ▲           │
-│         │                    │                     │           │
-│         ▼                    ▼                     │           │
-│  ┌──────────────────────────────────┐              │           │
-│  │    /vault/secrets/vault-token    │              │           │
-│  │  (Admin token for user creation) │              │           │
-│  └──────────────────────────────────┘              │           │
-└────────────────────────────────────────────────────┼───────────┘
-                                                     │
-                                         ┌───────────▼──────────┐
-                                         │       Vault          │
-                                         │  secret/jupyterhub/  │
-                                         │     vault-token      │
-                                         └──────────────────────┘
-```
-
-### Prerequisites
-
-Vault integration requires:
-
-- Vault server installed and configured
-- External Secrets Operator installed
-- ClusterSecretStore configured for Vault
-- Buun-stack kernel images (standard images don't include Vault integration)
-
-### Setup
-
-Vault integration is configured during JupyterHub installation:
-
-```bash
-just jupyterhub::install
-# Answer "yes" when prompted about Vault integration
-# Provide Vault root token when prompted
-```
-
-The setup process:
-
-1. Creates `jupyterhub-admin` policy with necessary permissions including `sudo` for orphan token creation
-2. Creates renewable admin token with 24h TTL and unlimited Max TTL
-3. Stores token in Vault at `secret/jupyterhub/vault-token`
-4. Creates ExternalSecret to fetch token from Vault
-5. Deploys token renewal sidecar for automatic renewal
-
-### Usage in Notebooks
-
-With Vault integration enabled, use the `buunstack` package in notebooks:
-
-```python
-from buunstack import SecretStore
-
-# Initialize (uses pre-acquired user-specific token)
-secrets = SecretStore()
-
-# Store secrets
-secrets.put('api-keys',
-    openai='sk-...',
-    github='ghp_...',
-    database_url='postgresql://...')
-
-# Retrieve secrets
-api_keys = secrets.get('api-keys')
-openai_key = secrets.get('api-keys', field='openai')
-
-# List all secrets
-secret_names = secrets.list()
-
-# Delete secrets or specific fields
-secrets.delete('old-api-key')  # Delete entire secret
-secrets.delete('api-keys', field='github')  # Delete only github field
-```
-
-### Security Features
-
-- **User isolation**: Each user receives an orphan token with access only to their namespace
-- **Automatic renewal**: Token renewal script renews admin token at TTL/2 intervals (minimum 30 seconds)
-- **ExternalSecret integration**: Admin token fetched securely from Vault
-- **Orphan tokens**: User tokens are orphan tokens, not limited by parent policy restrictions
-- **Audit trail**: All secret access is logged in Vault
-
-### Token Management
-
-#### Admin Token
-
-The admin token is managed through:
-
-1. **Creation**: `just jupyterhub::create-jupyterhub-vault-token` creates renewable token
-2. **Storage**: Stored in Vault at `secret/jupyterhub/vault-token`
-3. **Retrieval**: ExternalSecret fetches and mounts as Kubernetes Secret
-4. **Renewal**: `vault-token-renewer.sh` script renews at TTL/2 intervals
-
-#### User Tokens
-
-User tokens are created dynamically:
-
-1. **Pre-spawn hook** reads admin token from `/vault/secrets/vault-token`
-2. **Creates user policy** `jupyter-user-{username}` with restricted access
-3. **Creates orphan token** with user policy (requires `sudo` permission)
-4. **Sets environment variable** `NOTEBOOK_VAULT_TOKEN` in notebook container
-
-## Token Renewal Implementation
-
-### Admin Token Renewal
-
-The admin token renewal is handled by a sidecar container (`vault-token-renewer`) running alongside the JupyterHub hub:
-
-**Implementation Details:**
-
-1. **Renewal Script**: `/vault/config/vault-token-renewer.sh`
-   - Runs in the `vault-token-renewer` sidecar container
-   - Uses Vault 1.17.5 image with HashiCorp Vault CLI
-
-2. **Environment-Based TTL Configuration**:
-
-   ```bash
-   # Reads TTL from environment variable (set in .env.local)
-   TTL_RAW="${JUPYTERHUB_VAULT_TOKEN_TTL}"  # e.g., "5m", "24h"
-
-   # Converts to seconds and calculates renewal interval
-   RENEWAL_INTERVAL=$((TTL_SECONDS / 2))  # TTL/2 with minimum 30s
-   ```
-
-3. **Token Source**: ExternalSecret → Kubernetes Secret → mounted file
-
-   ```bash
-   # Token retrieved from ExternalSecret-managed mount
-   ADMIN_TOKEN=$(cat /vault/admin-token/token)
-   ```
-
-4. **Renewal Loop**:
-
-   ```bash
-   while true; do
-       vault token renew >/dev/null 2>&1
-       sleep $RENEWAL_INTERVAL
-   done
-   ```
-
-5. **Error Handling**: If renewal fails, re-retrieves token from ExternalSecret mount
-
-**Key Files:**
-
-- `vault-token-renewer.sh`: Main renewal script
-- `jupyterhub-vault-token-external-secret.gomplate.yaml`: ExternalSecret configuration
-- `vault-token-renewer-config` ConfigMap: Contains the renewal script
-
-### User Token Renewal
-
-User token renewal is handled within the notebook environment by the `buunstack` Python package:
-
-**Implementation Details:**
-
-1. **Token Source**: Environment variable set by pre-spawn hook
-
-   ```python
-   # In pre_spawn_hook.gomplate.py
-   spawner.environment["NOTEBOOK_VAULT_TOKEN"] = user_vault_token
-   ```
-
-2. **Automatic Renewal**: Built into `SecretStore` class operations
-
-   ```python
-   # In buunstack/secrets.py
-   def _ensure_authenticated(self):
-       token_info = self.client.auth.token.lookup_self()
-       ttl = token_info.get("data", {}).get("ttl", 0)
-       renewable = token_info.get("data", {}).get("renewable", False)
-
-       # Renew if TTL < 10 minutes and renewable
-       if renewable and ttl > 0 and ttl < 600:
-           self.client.auth.token.renew_self()
-   ```
-
-3. **Renewal Trigger**: Every `SecretStore` operation (get, put, delete, list)
-   - Checks token validity before operation
-   - Automatically renews if TTL < 10 minutes
-   - Transparent to user code
-
-4. **Token Configuration** (set during creation):
-   - **TTL**: `NOTEBOOK_VAULT_TOKEN_TTL` (default: 24h = 1 day)
-   - **Max TTL**: `NOTEBOOK_VAULT_TOKEN_MAX_TTL` (default: 168h = 7 days)
-   - **Policy**: User-specific `jupyter-user-{username}`
-   - **Type**: Orphan token (independent of parent token lifecycle)
-
-5. **Expiry Handling**: When token reaches Max TTL:
-   - Cannot be renewed further
-   - User must restart notebook server (triggers new token creation)
-   - Prevented by `JUPYTERHUB_CULL_MAX_AGE` setting (6 days < 7 day Max TTL)
-
-**Key Files:**
-
-- `pre_spawn_hook.gomplate.py`: User token creation logic
-- `buunstack/secrets.py`: Token renewal implementation
-- `user_policy.hcl`: User token permissions template
-
-### Token Lifecycle Summary
-
-```
-┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
-│   Admin Token   │    │   User Token     │    │  Pod Lifecycle  │
-│                 │    │                  │    │                 │
-│ Created: Manual │    │ Created: Spawn   │    │ Max Age: 7 days │
-│ TTL: 5m-24h     │    │ TTL: 1 day       │    │ Auto-restart    │
-│ Max TTL: ∞      │    │ Max TTL: 7 days  │    │ at Max TTL      │
-│ Renewal: Auto   │    │ Renewal: Auto    │    │                 │
-│ Interval: TTL/2 │    │ Trigger: Usage   │    │                 │
-└─────────────────┘    └──────────────────┘    └─────────────────┘
-         │                       │                       │
-         ▼                       ▼                       ▼
-   vault-token-renewer      buunstack.py            cull.maxAge
-   sidecar                  SecretStore            pod restart
-```
-
-## Storage Options
-
-### Default Storage
-
-Uses Kubernetes PersistentVolumes for user home directories.
-
-### NFS Storage
-
-For shared storage across nodes, configure NFS:
-
-```bash
-JUPYTERHUB_NFS_PV_ENABLED=true
-JUPYTER_NFS_IP=192.168.10.1
-JUPYTER_NFS_PATH=/volume1/drive1/jupyter
-```
-
-NFS storage requires:
-
-- Longhorn storage system installed
-- NFS server accessible from cluster nodes
-- Proper NFS export permissions configured
-
-## Configuration
-
-### Environment Variables
-
-Key configuration variables:
-
-```bash
-# Basic settings
-JUPYTERHUB_NAMESPACE=jupyter
-JUPYTERHUB_CHART_VERSION=4.2.0
-JUPYTERHUB_OIDC_CLIENT_ID=jupyterhub
-
-# Keycloak integration
-KEYCLOAK_REALM=buunstack
-
-# Storage
-JUPYTERHUB_NFS_PV_ENABLED=false
-
-# Vault integration
-JUPYTERHUB_VAULT_INTEGRATION_ENABLED=false
-VAULT_ADDR=https://vault.example.com
-
-# Image settings
-JUPYTER_PYTHON_KERNEL_TAG=python-3.12-28
-IMAGE_REGISTRY=localhost:30500
-
-# Vault token TTL settings
-JUPYTERHUB_VAULT_TOKEN_TTL=24h       # Admin token: renewed at TTL/2 intervals
-NOTEBOOK_VAULT_TOKEN_TTL=24h         # User token: 1 day (renewed on usage)
-NOTEBOOK_VAULT_TOKEN_MAX_TTL=168h    # User token: 7 days max
-
-# Server pod lifecycle settings
-JUPYTERHUB_CULL_MAX_AGE=604800       # Max pod age in seconds (7 days = 604800s)
-                                     # Should be <= NOTEBOOK_VAULT_TOKEN_MAX_TTL
-
-# Logging
-JUPYTER_BUUNSTACK_LOG_LEVEL=warning  # Options: debug, info, warning, error
-```
-
-### Advanced Configuration
-
-Customize JupyterHub behavior by editing `jupyterhub-values.gomplate.yaml` template before installation.
-
-## Management
-
-### Uninstall
-
-```bash
-just jupyterhub::uninstall
-```
-
-This removes:
-
-- JupyterHub deployment
-- User pods
-- PVCs
-- ExternalSecret
-
-### Update
-
-Upgrade to newer versions:
-
-```bash
-# Update image tag in .env.local
-export JUPYTER_PYTHON_KERNEL_TAG=python-3.12-29
-
-# Rebuild and push images
-just jupyterhub::build-kernel-images
-just jupyterhub::push-kernel-images
-
-# Upgrade JupyterHub deployment
-just jupyterhub::install
-```
-
-### Manual Token Refresh
-
-If needed, manually refresh the admin token:
-
-```bash
-# Create new renewable token
-just jupyterhub::create-jupyterhub-vault-token
-
-# Restart JupyterHub to pick up new token
-kubectl rollout restart deployment/hub -n jupyter
-```
-
-## Troubleshooting
-
-### Image Pull Issues
-
-Buun-stack images are large and may timeout:
-
-```bash
-# Check pod status
-kubectl get pods -n jupyter
-
-# Check image pull progress
-kubectl describe pod <pod-name> -n jupyter
-
-# Increase timeout if needed
-helm upgrade jupyterhub jupyterhub/jupyterhub --timeout=30m -f jupyterhub-values.yaml
-```
-
-### Vault Integration Issues
-
-Check token and authentication:
-
-```bash
-# Check ExternalSecret status
-kubectl get externalsecret -n jupyter jupyterhub-vault-token
-
-# Check if Secret was created
-kubectl get secret -n jupyter jupyterhub-vault-token
-
-# Check token renewal logs
-kubectl logs -n jupyter -l app.kubernetes.io/component=hub -c vault-token-renewer
-
-# In a notebook, verify environment
-%env NOTEBOOK_VAULT_TOKEN
-```
-
-Common issues:
-
-1. **"child policies must be subset of parent"**: Admin policy needs `sudo` permission for orphan tokens
-2. **Token not found**: Check ExternalSecret and ClusterSecretStore configuration
-3. **Permission denied**: Verify `jupyterhub-admin` policy has all required permissions
-
-### Authentication Issues
-
-Verify Keycloak client configuration:
-
-```bash
-# Check client exists
-just keycloak::get-client buunstack jupyterhub
-
-# Check redirect URIs
-just keycloak::update-client buunstack jupyterhub \
-  "https://your-jupyter-host/hub/oauth_callback"
-```
-
-## Technical Implementation Details
-
-### Helm Chart Version
-
-JupyterHub uses the official Zero to JupyterHub (Z2JH) Helm chart:
-
-- Chart: `jupyterhub/jupyterhub`
-- Version: `4.2.0` (configurable via `JUPYTERHUB_CHART_VERSION`)
-- Documentation: https://z2jh.jupyter.org/
-
-### Token System Architecture
-
-The system uses a three-tier token approach:
-
-1. **Renewable Admin Token**:
-   - Created with `explicit-max-ttl=0` (unlimited Max TTL)
-   - Renewed automatically at TTL/2 intervals (minimum 30 seconds)
-   - Stored in Vault and fetched via ExternalSecret
-
-2. **Orphan User Tokens**:
-   - Created with `create_orphan()` API call
-   - Not limited by parent token policies
-   - Individual TTL and Max TTL settings
-
-3. **Token Renewal Script**:
-   - Runs as sidecar container
-   - Reads token from ExternalSecret mount
-   - Handles renewal and re-retrieval on failure
-
-### Key Files
-
-- `jupyterhub-admin-policy.hcl`: Vault policy with admin permissions
-- `user_policy.hcl`: Template for user-specific policies
-- `vault-token-renewer.sh`: Token renewal script
-- `jupyterhub-vault-token-external-secret.gomplate.yaml`: ExternalSecret configuration
-
-## Performance Considerations
-
-- **Image Size**: Buun-stack images are ~13GB, plan storage accordingly
-- **Pull Time**: Initial pulls take 5-15 minutes depending on network
-- **Resource Usage**: Data science workloads require adequate CPU/memory
-- **Token Renewal**: Minimal overhead (renewal at TTL/2 intervals)
-
-For production deployments, consider:
-
-- Pre-pulling images to all nodes
-- Using faster storage backends
-- Configuring resource limits per user
-- Setting up monitoring and alerts
-
-## Known Limitations
-
-1. **Annual Token Recreation**: While tokens have unlimited Max TTL, best practice suggests recreating them annually
-
-2. **Token Expiry and Pod Lifecycle**: User tokens have a TTL of 1 day (`NOTEBOOK_VAULT_TOKEN_TTL=24h`) and maximum TTL of 7 days (`NOTEBOOK_VAULT_TOKEN_MAX_TTL=168h`). Daily usage extends the token for another day, allowing up to 7 days of continuous use. Server pods are automatically restarted after 7 days (`JUPYTERHUB_CULL_MAX_AGE=604800s`) to refresh tokens.
-
-3. **Cull Settings**: Server idle timeout is set to 2 hours by default. Adjust `cull.timeout` and `cull.every` in the Helm values for different requirements
-
-4. **NFS Storage**: When using NFS storage, ensure proper permissions are set on the NFS server. The default `JUPYTER_FSGID` is 100
-
-5. **ExternalSecret Dependency**: Requires External Secrets Operator to be installed and configured
+Please refer to the new location for complete JupyterHub setup, configuration, and usage documentation.
--- a/docs/resource-management.md
+++ b/docs/resource-management.md
@@ -0,0 +1,538 @@
+# Resource Management
+
+This document describes how to configure resource requests and limits for components in the buun-stack.
+
+## Table of Contents
+
+- [Overview](#overview)
+- [QoS Classes](#qos-classes)
+- [Using Goldilocks](#using-goldilocks)
+- [Configuring Resources](#configuring-resources)
+- [Best Practices](#best-practices)
+- [Troubleshooting](#troubleshooting)
+
+## Overview
+
+Kubernetes uses resource requests and limits to:
+
+- **Schedule pods** on nodes with sufficient resources
+- **Ensure quality of service** through QoS classes
+- **Prevent resource exhaustion** by limiting resource consumption
+
+All critical components in buun-stack should have resource requests and limits configured.
+
+## QoS Classes
+
+Kubernetes assigns one of three QoS classes to each pod based on its resource configuration:
+
+### Guaranteed QoS (Highest Priority)
+
+**Requirements:**
+
+- Every container must have CPU and memory requests
+- Every container must have CPU and memory limits
+- Requests and limits must be **equal** for both CPU and memory
+
+**Characteristics:**
+
+- Highest priority during resource contention
+- Last to be evicted when the node runs out of resources
+- Predictable performance
+
+**Example:**
+
+```yaml
+resources:
+  requests:
+    cpu: 200m
+    memory: 1Gi
+  limits:
+    cpu: 200m       # Same as requests
+    memory: 1Gi     # Same as requests
+```
+
+**Use for:** Critical data stores (PostgreSQL, Vault)
+
+### Burstable QoS (Medium Priority)
+
+**Requirements:**
+
+- At least one container has requests or limits
+- Does not meet Guaranteed QoS criteria
+- Typically `requests < limits`
+
+**Characteristics:**
+
+- Medium priority during resource contention
+- Can burst to limits when resources are available
+- More resource-efficient than Guaranteed
+
+**Example:**
+
+```yaml
+resources:
+  requests:
+    cpu: 50m
+    memory: 128Mi
+  limits:
+    cpu: 100m       # Can burst up to this
+    memory: 256Mi   # Can burst up to this
+```
+
+**Use for:** Operators, auxiliary services, variable workloads
+
+### BestEffort QoS (Lowest Priority)
+
+**Requirements:**
+
+- No resource requests or limits configured
+
+**Characteristics:**
+
+- Lowest priority during resource contention
+- First to be evicted when the node runs out of resources
+- **Not recommended for production**
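
The three sets of requirements above can be expressed as a small classifier. The following is a minimal sketch, not the kubelet's implementation: the function name `qos_class` and the plain-dict input shape are assumptions made for illustration, and quantities are compared as strings, so `1Gi` and `1024Mi` would count as different values.

```python
def qos_class(containers):
    """Classify a pod from a list of per-container `resources` dicts.

    Hypothetical helper: each item looks like
    {"requests": {"cpu": "50m"}, "limits": {"cpu": "100m"}}.
    """
    any_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


postgres = [{"requests": {"cpu": "200m", "memory": "1Gi"},
             "limits": {"cpu": "200m", "memory": "1Gi"}}]
operator = [{"requests": {"cpu": "50m", "memory": "128Mi"},
             "limits": {"cpu": "100m", "memory": "256Mi"}}]
print(qos_class(postgres), qos_class(operator), qos_class([{}]))
```

The authoritative value for a running pod is `.status.qosClass`; a helper like this is only useful for checking a values file before it is applied.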
+
+## Using Goldilocks
+
+Goldilocks uses Vertical Pod Autoscaler (VPA) to recommend resource settings based on actual usage.
+
+### Setup
+
+For installation and detailed setup instructions, see:
+
+- [VPA Installation and Configuration](../vpa/README.md)
+- [Goldilocks Installation and Configuration](../goldilocks/README.md)
+
+Quick start:
+
+```bash
+# Install VPA
+just vpa::install
+
+# Install Goldilocks
+just goldilocks::install
+
+# Enable monitoring for a namespace
+just goldilocks::enable-namespace <namespace>
+```
+
+Access the dashboard at your configured Goldilocks host (e.g., `https://goldilocks.example.com`).
+
+### Using the Dashboard
+
+- Navigate to the namespace
+- Expand "Containers" section for each workload
+- Review both "Guaranteed QoS" and "Burstable QoS" recommendations
+
+### Limitations
+
+Goldilocks only monitors **standard Kubernetes workloads** (Deployment, StatefulSet, DaemonSet). It **does not** automatically create VPAs for:
+
+- Custom Resource Definitions (CRDs)
+- Resources managed by operators (e.g., CloudNativePG Cluster)
+
+For CRDs, use alternative methods:
+
+- Check actual usage: `kubectl top pod <pod-name> -n <namespace>`
+- Use Grafana dashboards: `Kubernetes / Compute Resources / Pod`
+- Monitor over time and adjust based on observed patterns
+
+### Working with Recommendations
+
+#### For Standard Workloads (Supported by Goldilocks)
+
+Review Goldilocks recommendations in the dashboard, then configure resources based on your testing status:
+
+**With load testing:**
+
+- Use Goldilocks recommended values with minimal headroom (1.5-2x)
+- Round to clean values (50m, 100m, 200m, 512Mi, 1Gi, etc.)
+
+**Without load testing:**
+
+- Add more headroom to handle unexpected load (3-5x)
+- Round to clean values
+
+**Example:**
+
+Goldilocks recommendation: 50m CPU, 128Mi Memory
+
+- With load testing: 100m CPU, 256Mi Memory (2x, rounded)
+- Without load testing: 200m CPU, 512Mi Memory (4x, rounded)
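
The arithmetic in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions: `with_headroom` is a hypothetical helper, the step tables are the "clean values" this document recommends, and rounding always goes up to the next step, which is one possible reading of "rounded".

```python
import math

CPU_STEPS_M = [50, 100, 200, 500, 1000]            # millicores
MEMORY_STEPS_MI = [64, 128, 256, 512, 1024, 2048]  # MiB


def with_headroom(value, multiplier, steps):
    """Multiply a recommendation and round up to the next clean value."""
    target = value * multiplier
    for step in steps:
        if step >= target:
            return step
    # Beyond the table: round up to a multiple of the largest step.
    return math.ceil(target / steps[-1]) * steps[-1]


# Goldilocks recommendation: 50m CPU, 128Mi memory
print(with_headroom(50, 2, CPU_STEPS_M), with_headroom(128, 2, MEMORY_STEPS_MI))
print(with_headroom(50, 4, CPU_STEPS_M), with_headroom(128, 4, MEMORY_STEPS_MI))
```

Rounding up keeps the result at or above the chosen multiplier; rounding to the nearest step instead would occasionally give slightly less headroom than intended.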
+
+#### For CRDs and Unsupported Workloads
+
+Use Grafana to check actual resource usage:
+
+1. **Navigate to Grafana dashboard**: `Kubernetes / Compute Resources / Pod`
+2. **Select namespace and pod**
+3. **Review usage over 24+ hours** to identify peak values
+
+Then apply the same approach:
+
+**With load testing:**
+
+- Use observed peak values with minimal headroom (1.5-2x)
+
+**Without load testing:**
+
+- Add significant headroom (3-5x) for safety
+
+**Example:**
+
+Grafana shows peak: 40m CPU, 207Mi Memory
+
+- With load testing: 100m CPU, 512Mi Memory (2.5x/2.5x, rounded)
+- Without load testing: 200m CPU, 1Gi Memory (5x/5x, rounded, Guaranteed QoS)
+
+## Configuring Resources
+
+### Helm-Managed Components
+
+For components installed via Helm, configure resources in the values file.
+
+#### Example: PostgreSQL Operator (CNPG)
+
+**File:** `postgres/cnpg-values.yaml`
+
+```yaml
+resources:
+    requests:
+        cpu: 50m
+        memory: 128Mi
+    limits:
+        cpu: 100m
+        memory: 256Mi
+```
+
+**Apply:**
+
+```bash
+cd postgres
+helm upgrade --install cnpg cnpg/cloudnative-pg --version ${CNPG_CHART_VERSION} \
+    -n ${CNPG_NAMESPACE} -f cnpg-values.yaml
+```
+
+#### Example: Vault
+
+**File:** `vault/vault-values.gomplate.yaml`
+
+```yaml
+server:
+  resources:
+    requests:
+      cpu: 50m
+      memory: 512Mi
+    limits:
+      cpu: 50m
+      memory: 512Mi
+
+injector:
+  resources:
+    requests:
+      cpu: 50m
+      memory: 128Mi
+    limits:
+      cpu: 50m
+      memory: 128Mi
+
+csi:
+  enabled: true
+  agent:
+    resources:
+      requests:
+        cpu: 50m
+        memory: 128Mi
+      limits:
+        cpu: 50m
+        memory: 128Mi
+  resources:
+    requests:
+      cpu: 50m
+      memory: 64Mi
+    limits:
+      cpu: 50m
+      memory: 128Mi
+```
+
+**Apply:**
+
+```bash
+cd vault
+gomplate -f vault-values.gomplate.yaml -o vault-values.yaml
+helm upgrade vault hashicorp/vault --version ${VAULT_CHART_VERSION} \
+    -n vault -f vault-values.yaml
+```
+
+**Note:** After updating StatefulSet resources, delete the pod to apply changes:
+
+```bash
+kubectl delete pod vault-0 -n vault
+# Unseal Vault after restart
+kubectl exec -n vault vault-0 -- vault operator unseal <UNSEAL_KEY>
+```
+
+### CRD-Managed Components
+
+For components managed through custom resources (instances of a Custom Resource Definition), update the resource spec and re-apply it.
+
+#### Example: PostgreSQL Cluster (CloudNativePG)
+
+**Update values file**
+
+**File:** `postgres/postgres-cluster-values.gomplate.yaml`
+
+```yaml
+cluster:
+  instances: 1
+
+  # Resource configuration (Guaranteed QoS)
+  resources:
+    requests:
+      cpu: 200m
+      memory: 1Gi
+    limits:
+      cpu: 200m
+      memory: 1Gi
+
+  storage:
+    size: {{ .Env.POSTGRES_STORAGE_SIZE }}
+```
+
+**Apply via justfile:**
+
+```bash
+just postgres::create-cluster
+```
+
+**Restart pod to apply changes:**
+
+```bash
+kubectl delete pod postgres-cluster-1 -n postgres
+kubectl wait --for=condition=Ready pod/postgres-cluster-1 -n postgres --timeout=180s
+```
+
+**Data Safety:** PostgreSQL data is stored in a PersistentVolumeClaim (PVC) and is preserved across pod restarts.
+
+### Verification
+
+After applying resource configurations:
+
+**1. Check resource settings:**
+
+```bash
+# For standard workloads
+kubectl get deployment <name> -n <namespace> -o jsonpath='{.spec.template.spec.containers[0].resources}' | jq
+
+# For pods
+kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[0].resources}' | jq
+```
+
+**2. Verify QoS Class:**
+
+```bash
+kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.qosClass}'
+```
+
+**3. Check actual usage:**
+
+```bash
+kubectl top pod <pod-name> -n <namespace>
+```
+
+## Best Practices
+
+### Choosing QoS Class
+
+| Component Type | Recommended QoS | Rationale |
+|---------------|-----------------|-----------|
+| **Data stores** (PostgreSQL, Vault) | Guaranteed | Critical services, data integrity, predictable performance |
+| **Operators** (CNPG, etc.) | Burstable | Lightweight controllers, occasional spikes |
+| **Auxiliary services** (Injectors, CSI providers) | Burstable | Support services, variable load |
+
+### Setting Resource Values
+
+**1. Start with actual usage:**
+
+```bash
+# Check current usage
+kubectl top pod <pod-name> -n <namespace>
+
+# Check historical usage in Grafana
+# Dashboard: Kubernetes / Compute Resources / Pod
+```
+
+**2. Add appropriate headroom:**
+
+| Scenario | Recommended Multiplier | Example |
+|----------|----------------------|---------|
+| Stable, predictable load | 2-3x current usage | Current: 40m → Set: 100m |
+| Variable load | 5-10x current usage | Current: 40m → Set: 200m |
+| Growth expected | 5-10x current usage | Current: 200Mi → Set: 1Gi |
+
+**3. Use round numbers:**
+
+- CPU: 50m, 100m, 200m, 500m, 1000m (1 core)
+- Memory: 64Mi, 128Mi, 256Mi, 512Mi, 1Gi, 2Gi
+
+**4. Monitor and adjust:**
+
+- Check usage patterns after 1-2 weeks
+- Adjust based on observed peak usage
+- Iterate as workload changes
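
When revisiting a setting after the observation period, it can help to compute how much headroom the configured value actually leaves over the observed peak. The sketch below assumes hypothetical helper names (`cpu_millicores`, `memory_bytes`, `headroom`) and handles only the `m`, `Ki`, `Mi`, and `Gi` suffixes used in this document, not the full Kubernetes quantity syntax.

```python
MEMORY_FACTORS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_millicores(quantity):
    """Convert a CPU quantity such as '200m' or '1' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as '512Mi' or '1Gi' to bytes."""
    for suffix, factor in MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def headroom(configured, observed_peak, convert):
    """Return how many times the observed peak fits into the configured value."""
    return convert(configured) / convert(observed_peak)


# Configured 200m / 1Gi against an observed peak of 40m / 207Mi
print(headroom("200m", "40m", cpu_millicores))
print(round(headroom("1Gi", "207Mi", memory_bytes), 2))
```

A ratio that has drifted well below the multiplier chosen in step 2 suggests raising the value; a ratio far above it suggests the setting can be reduced.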
+
+### Resource Configuration Examples
+
+Based on actual deployments in buun-stack:
+
+```yaml
+# PostgreSQL Operator (Burstable)
+resources:
+  requests:
+    cpu: 50m
+    memory: 128Mi
+  limits:
+    cpu: 100m
+    memory: 256Mi
+
+# PostgreSQL Cluster (Guaranteed)
+resources:
+  requests:
+    cpu: 200m
+    memory: 1Gi
+  limits:
+    cpu: 200m
+    memory: 1Gi
+
+# Vault Server (Guaranteed)
+resources:
+  requests:
+    cpu: 50m
+    memory: 512Mi
+  limits:
+    cpu: 50m
+    memory: 512Mi
+
+# Vault Agent Injector (Guaranteed)
+resources:
+  requests:
+    cpu: 50m
+    memory: 128Mi
+  limits:
+    cpu: 50m
+    memory: 128Mi
+```
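+
+As a rough check of the labels above, the QoS class of a single-container pod can be derived from its `resources` block. The `qos_class` helper below is a hypothetical sketch, not a Kubernetes API. It compares quantities as literal strings, so `1Gi` and `1024Mi` would be treated as different, and it ignores multi-container pods, where every container must qualify.
+
+```python
+def qos_class(resources: dict) -> str:
+    """Classify a single-container pod from its `resources` mapping."""
+    requests = resources.get("requests", {})
+    limits = resources.get("limits", {})
+    if not requests and not limits:
+        return "BestEffort"
+    # Guaranteed needs CPU and memory limits, each equal to its request.
+    # Kubernetes defaults a missing request to the limit, mirrored here.
+    guaranteed = all(
+        res in limits and requests.get(res, limits[res]) == limits[res]
+        for res in ("cpu", "memory")
+    )
+    return "Guaranteed" if guaranteed else "Burstable"
+
+
+operator = {
+    "requests": {"cpu": "50m", "memory": "128Mi"},
+    "limits": {"cpu": "100m", "memory": "256Mi"},
+}
+cluster = {
+    "requests": {"cpu": "200m", "memory": "1Gi"},
+    "limits": {"cpu": "200m", "memory": "1Gi"},
+}
+
+print(qos_class(operator))  # Burstable
+print(qos_class(cluster))   # Guaranteed
+```
+
+The class Kubernetes actually assigned can be read from `.status.qosClass` on the running pod.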
+
+## Troubleshooting
+
+### Pod Stuck in Pending State
+
+**Symptom:**
+
+```plain
+NAME       READY   STATUS    RESTARTS   AGE
+my-pod     0/1     Pending   0          5m
+```
+
+**Check events:**
+
+```bash
+kubectl describe pod <pod-name> -n <namespace> | tail -20
+```
+
+**Common causes:**
+
+#### Insufficient resources
+
+```plain
+FailedScheduling: 0/1 nodes are available: 1 Insufficient cpu/memory
+```
+
+**Solution:** Reduce resource requests or add more nodes
+
+#### Pod anti-affinity
+
+```plain
+FailedScheduling: 0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules
+```
+
+**Solution:** Delete the old pod so the new pod can be scheduled
+
+```bash
+kubectl delete pod <old-pod-name> -n <namespace>
+```
+
+### OOMKilled (Out of Memory)
+
+**Symptom:**
+
+```plain
+NAME       READY   STATUS      RESTARTS   AGE
+my-pod     0/1     OOMKilled   1          5m
+```
+
+**Solution:**
+
+#### Check whether the memory limit is sufficient
+
+```bash
+kubectl top pod <pod-name> -n <namespace>
+```
+
+#### Increase memory limits
+
+```yaml
+resources:
+  limits:
+    memory: 2Gi  # Increase from 1Gi
+```
+
+### Helm Stuck in pending-upgrade
+
+**Symptom:**
+
+```bash
+helm status <release> -n <namespace>
+# STATUS: pending-upgrade
+```
+
+**Solution:**
+
+```bash
+# Remove pending release secret
+kubectl get secrets -n <namespace> -l owner=helm,name=<release> --sort-by=.metadata.creationTimestamp
+kubectl delete secret sh.helm.release.v1.<release>.v<pending-version> -n <namespace>
+
+# Verify status is back to deployed
+helm status <release> -n <namespace>
+
+# Re-run upgrade
+helm upgrade <release> <chart> -n <namespace> -f values.yaml
+```
+
+### VPA Not Providing Recommendations
+
+**Symptom:**
+
+- VPA shows "NoPodsMatched" or "ConfigUnsupported"
+- Goldilocks shows empty containers section
+
+**Cause:**
+VPA cannot monitor custom resources (instances of CRDs, such as a CloudNativePG `Cluster`) directly
+
+**Solution:**
+Use alternative monitoring methods:
+
+1. `kubectl top pod`
+2. Grafana dashboards
+3. Prometheus queries
+
+For workloads managed through custom resources, configure resources manually based on observed usage patterns.
+
+## References
+
+- [Kubernetes Resource Management](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)
+- [Kubernetes QoS Classes](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/)
+- [Goldilocks Documentation](https://goldilocks.docs.fairwinds.com/)
+- [CloudNativePG Resource Management](https://cloudnative-pg.io/documentation/current/resource_management/)
--- a/jupyterhub/README.md
+++ b/jupyterhub/README.md
@@ -1,24 +1,146 @@
 # JupyterHub

-Multi-user platform for interactive computing:
+JupyterHub provides a multi-user Jupyter notebook environment with Keycloak OIDC authentication, Vault integration for secure secrets management, and custom kernel images for data science workflows.

- Collaborative Jupyter notebook environment
- Integrated with Keycloak for OIDC authentication
- Persistent storage for user workspaces
- Support for multiple kernels and environments
- Vault integration for secure secrets management
+## Table of Contents

-See [JupyterHub Documentation](../docs/jupyterhub.md) for detailed setup and configuration.
+- [Installation](#installation)
+- [Prerequisites](#prerequisites)
+- [Access](#access)
+- [Kernel Images](#kernel-images)
+- [Profile Configuration](#profile-configuration)
+- [Buun-Stack Images](#buun-stack-images)
+- [buunstack Package & SecretStore](#buunstack-package--secretstore)
+- [Vault Integration](#vault-integration)
+- [Token Renewal Implementation](#token-renewal-implementation)
+- [Storage Options](#storage-options)
+- [Configuration](#configuration)
+- [Custom Container Images](#custom-container-images)
+- [Management](#management)
+- [Troubleshooting](#troubleshooting)
+- [Technical Implementation Details](#technical-implementation-details)
+- [Performance Considerations](#performance-considerations)
+- [Known Limitations](#known-limitations)

 ## Installation

+Install JupyterHub with interactive configuration:
+
 ```bash
 just jupyterhub::install
 ```

+This will prompt for:
+
+- JupyterHub host (FQDN)
+- NFS PV usage (if Longhorn is installed)
+- NFS server details (if NFS is enabled)
+- Vault integration setup (requires root token for initial setup)
+
+## Prerequisites
+
+- Keycloak must be installed and configured
+- For NFS storage: Longhorn must be installed
+- For Vault integration: Vault and External Secrets Operator must be installed
+- Helm repository must be accessible
+
 ## Access

-Access JupyterHub at `https://jupyter.yourdomain.com` and authenticate via Keycloak.
+Access JupyterHub at your configured host (e.g., `https://jupyter.example.com`) and authenticate via Keycloak.
+
+## Kernel Images
+
+### Important Note
+
+Building and using custom buun-stack images requires building the `buunstack` Python package first. The package wheel file will be included in the Docker image during build.
+
+JupyterHub supports multiple kernel image profiles:
+
+### Standard Profiles
+
+- **minimal**: Basic Python environment
+- **base**: Python with common data science packages
+- **datascience**: Full data science stack (default)
+- **pyspark**: PySpark for big data processing
+- **pytorch**: PyTorch for machine learning
+- **tensorflow**: TensorFlow for machine learning
+
+### Buun-Stack Profiles
+
+- **buun-stack**: Comprehensive data science environment with Vault integration
+- **buun-stack-cuda**: CUDA-enabled version with GPU support
+
+## Profile Configuration
+
+Enable/disable profiles using environment variables:
+
+```bash
+# Enable buun-stack profile (CPU version)
+JUPYTER_PROFILE_BUUN_STACK_ENABLED=true
+
+# Enable buun-stack CUDA profile (GPU version)
+JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED=true
+
+# Disable default datascience profile
+JUPYTER_PROFILE_DATASCIENCE_ENABLED=false
+```
+
+Available profile variables:
+
+- `JUPYTER_PROFILE_MINIMAL_ENABLED`
+- `JUPYTER_PROFILE_BASE_ENABLED`
+- `JUPYTER_PROFILE_DATASCIENCE_ENABLED`
+- `JUPYTER_PROFILE_PYSPARK_ENABLED`
+- `JUPYTER_PROFILE_PYTORCH_ENABLED`
+- `JUPYTER_PROFILE_TENSORFLOW_ENABLED`
+- `JUPYTER_PROFILE_BUUN_STACK_ENABLED`
+- `JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED`
+
+Only `JUPYTER_PROFILE_DATASCIENCE_ENABLED` is true by default.
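+
+The resolution of these flags can be sketched as follows. The `enabled_profiles` helper is hypothetical and assumes each variable is compared against the string `true`. The actual parsing lives in the justfile and templates, which are not shown here.
+
+```python
+# Profile names as they appear in the variable names above
+PROFILES = [
+    "minimal", "base", "datascience", "pyspark",
+    "pytorch", "tensorflow", "buun_stack", "buun_stack_cuda",
+]
+DEFAULT_ENABLED = {"datascience"}  # only datascience is on by default
+
+
+def enabled_profiles(env: dict) -> list[str]:
+    """Return the profile names whose *_ENABLED flag resolves to true."""
+    enabled = []
+    for name in PROFILES:
+        var = f"JUPYTER_PROFILE_{name.upper()}_ENABLED"
+        default = "true" if name in DEFAULT_ENABLED else "false"
+        if env.get(var, default).lower() == "true":
+            enabled.append(name.replace("_", "-"))
+    return enabled
+
+
+print(enabled_profiles({}))  # ['datascience']
+print(enabled_profiles({
+    "JUPYTER_PROFILE_BUUN_STACK_ENABLED": "true",
+    "JUPYTER_PROFILE_BUUN_STACK_CUDA_ENABLED": "true",
+    "JUPYTER_PROFILE_DATASCIENCE_ENABLED": "false",
+}))  # ['buun-stack', 'buun-stack-cuda']
+```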
+
+## Buun-Stack Images
+
+Buun-stack images provide comprehensive data science environments with:
+
+- All standard data science packages (NumPy, Pandas, Scikit-learn, etc.)
+- Deep learning frameworks (PyTorch, TensorFlow, Keras)
+- Big data tools (PySpark, Apache Arrow)
+- NLP and ML libraries (LangChain, Transformers, spaCy)
+- Database connectors and tools
+- **Vault integration** with `buunstack` Python package
+
+### Building Custom Images
+
+Build and push buun-stack images to your registry:
+
+```bash
+# Build images (includes building the buunstack Python package)
+just jupyterhub::build-kernel-images
+
+# Push to registry
+just jupyterhub::push-kernel-images
+```
+
+The build process:
+
+1. Builds the `buunstack` Python package wheel
+2. Copies the wheel into the Docker build context
+3. Installs the wheel in the Docker image
+4. Cleans up temporary files
+
+⚠️ **Note**: Buun-stack images are comprehensive and large (~13GB). Initial image pulls and deployments take significant time due to the extensive package set.
+
+### Image Configuration
+
+Configure image settings in `.env.local`:
+
+```bash
+# Image registry
+IMAGE_REGISTRY=localhost:30500
+
+# Image tag (current default)
+JUPYTER_PYTHON_KERNEL_TAG=python-3.12-28
+```

 ## buunstack Package & SecretStore

@@ -60,6 +182,305 @@ For detailed documentation, usage examples, and API reference, see:

 [📖 buunstack Package Documentation](../python-package/README.md)

+## Vault Integration
+
+### Overview
+
+Vault integration enables secure secrets management directly from Jupyter notebooks. The system uses:
+
+- **ExternalSecret** to fetch the admin token from Vault
+- **Renewable tokens** with unlimited Max TTL to avoid Vault's default system Max TTL (768h, i.e. 32 days)
+- **Token renewal script** that automatically renews tokens at TTL/2 intervals (minimum 30 seconds)
+- **User-specific tokens** created during notebook spawn with isolated access
+
+### Architecture
+
+```plain
+┌────────────────────────────────────────────────────────────────┐
+│                         JupyterHub Hub Pod                     │
+│                                                                │
+│  ┌──────────────┐  ┌────────────────┐  ┌────────────────────┐  │
+│  │     Hub      │  │ Token Renewer  │  │  ExternalSecret    │  │
+│  │  Container   │◄─┤   Sidecar      │◄─┤   (mounted as      │  │
+│  │              │  │                │  │    Secret)         │  │
+│  └──────────────┘  └────────────────┘  └────────────────────┘  │
+│         │                    │                     ▲           │
+│         │                    │                     │           │
+│         ▼                    ▼                     │           │
+│  ┌──────────────────────────────────┐              │           │
+│  │    /vault/secrets/vault-token    │              │           │
+│  │  (Admin token for user creation) │              │           │
+│  └──────────────────────────────────┘              │           │
+└────────────────────────────────────────────────────┼───────────┘
+                                                     │
+                                         ┌───────────▼──────────┐
+                                         │       Vault          │
+                                         │  secret/jupyterhub/  │
+                                         │     vault-token      │
+                                         └──────────────────────┘
+```
+
+### Prerequisites
+
+Vault integration requires:
+
+- Vault server installed and configured
+- External Secrets Operator installed
+- ClusterSecretStore configured for Vault
+- Buun-stack kernel images (standard images don't include Vault integration)
+
+### Setup
+
+Vault integration is configured during JupyterHub installation:
+
+```bash
+just jupyterhub::install
+# Answer "yes" when prompted about Vault integration
+# Provide Vault root token when prompted
+```
+
+The setup process:
+
+1. Creates `jupyterhub-admin` policy with necessary permissions including `sudo` for orphan token creation
+2. Creates renewable admin token with 24h TTL and unlimited Max TTL
+3. Stores token in Vault at `secret/jupyterhub/vault-token`
+4. Creates ExternalSecret to fetch token from Vault
+5. Deploys token renewal sidecar for automatic renewal
+
+### Usage in Notebooks
+
+With Vault integration enabled, use the `buunstack` package in notebooks:
+
+```python
+from buunstack import SecretStore
+
+# Initialize (uses pre-acquired user-specific token)
+secrets = SecretStore()
+
+# Store secrets
+secrets.put('api-keys',
+    openai='sk-...',
+    github='ghp_...',
+    database_url='postgresql://...')
+
+# Retrieve secrets
+api_keys = secrets.get('api-keys')
+openai_key = secrets.get('api-keys', field='openai')
+
+# List all secrets
+secret_names = secrets.list()
+
+# Delete secrets or specific fields
+secrets.delete('old-api-key')  # Delete entire secret
+secrets.delete('api-keys', field='github')  # Delete only github field
+```
+
+### Security Features
+
+- **User isolation**: Each user receives an orphan token with access only to their namespace
+- **Automatic renewal**: Token renewal script renews admin token at TTL/2 intervals (minimum 30 seconds)
+- **ExternalSecret integration**: Admin token fetched securely from Vault
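+- **Orphan tokens**: User tokens are orphan tokens, so they are not revoked together with the admin token that created them; assigning the user-specific policy relies on the admin policy's `sudo` capability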
+- **Audit trail**: All secret access is logged in Vault
+
+### Token Management
+
+#### Admin Token
+
+The admin token is managed through:
+
+1. **Creation**: `just jupyterhub::create-jupyterhub-vault-token` creates renewable token
+2. **Storage**: Stored in Vault at `secret/jupyterhub/vault-token`
+3. **Retrieval**: ExternalSecret fetches and mounts as Kubernetes Secret
+4. **Renewal**: `vault-token-renewer.sh` script renews at TTL/2 intervals
+
+#### User Tokens
+
+User tokens are created dynamically:
+
+1. **Pre-spawn hook** reads admin token from `/vault/secrets/vault-token`
+2. **Creates user policy** `jupyter-user-{username}` with restricted access
+3. **Creates orphan token** with user policy (requires `sudo` permission)
+4. **Sets environment variable** `NOTEBOOK_VAULT_TOKEN` in notebook container
+
+## Token Renewal Implementation
+
+### Admin Token Renewal
+
+The admin token renewal is handled by a sidecar container (`vault-token-renewer`) running alongside the JupyterHub hub:
+
+**Implementation Details:**
+
+1. **Renewal Script**: `/vault/config/vault-token-renewer.sh`
+   - Runs in the `vault-token-renewer` sidecar container
+   - Uses Vault 1.17.5 image with HashiCorp Vault CLI
+
+2. **Environment-Based TTL Configuration**:
+
+   ```bash
+   # Reads TTL from environment variable (set in .env.local)
+   TTL_RAW="${JUPYTERHUB_VAULT_TOKEN_TTL}"  # e.g., "5m", "24h"
+
+   # The script converts TTL_RAW to seconds (TTL_SECONDS), then halves it
+   RENEWAL_INTERVAL=$((TTL_SECONDS / 2))  # TTL/2; the full script enforces a 30s minimum
+   ```
+
+3. **Token Source**: ExternalSecret → Kubernetes Secret → mounted file
+
+   ```bash
+   # Token retrieved from ExternalSecret-managed mount
+   ADMIN_TOKEN=$(cat /vault/admin-token/token)
+   ```
+
+4. **Renewal Loop**:
+
+   ```bash
+   while true; do
+       vault token renew >/dev/null 2>&1
+       sleep $RENEWAL_INTERVAL
+   done
+   ```
+
+5. **Error Handling**: If renewal fails, re-retrieves token from ExternalSecret mount
+
+**Key Files:**
+
+- `vault-token-renewer.sh`: Main renewal script
+- `jupyterhub-vault-token-external-secret.gomplate.yaml`: ExternalSecret configuration
+- `vault-token-renewer-config` ConfigMap: Contains the renewal script
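+
+The interval calculation described above (TTL/2 with a 30-second floor) can be illustrated in Python. This is a sketch of the arithmetic only, not the contents of `vault-token-renewer.sh`. The helper names are hypothetical, and only single-unit values such as `5m` or `24h` are handled.
+
+```python
+import re
+
+UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
+
+
+def ttl_to_seconds(ttl: str) -> int:
+    """Convert a single-unit duration such as '5m' or '24h' to seconds."""
+    match = re.fullmatch(r"(\d+)([smhd])", ttl.strip())
+    if match is None:
+        raise ValueError(f"unsupported TTL format: {ttl!r}")
+    value, unit = match.groups()
+    return int(value) * UNIT_SECONDS[unit]
+
+
+def renewal_interval(ttl: str, minimum: int = 30) -> int:
+    """Renew at TTL/2, but never more often than every `minimum` seconds."""
+    return max(ttl_to_seconds(ttl) // 2, minimum)
+
+
+print(renewal_interval("24h"))  # 43200 (renew every 12 hours)
+print(renewal_interval("5m"))   # 150
+print(renewal_interval("40s"))  # 30 (TTL/2 would be 20, so the floor applies)
+```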
+
+### User Token Renewal
+
+User token renewal is handled within the notebook environment by the `buunstack` Python package:
+
+**Implementation Details:**
+
+1. **Token Source**: Environment variable set by pre-spawn hook
+
+   ```python
+   # In pre_spawn_hook.gomplate.py
+   spawner.environment["NOTEBOOK_VAULT_TOKEN"] = user_vault_token
+   ```
+
+2. **Automatic Renewal**: Built into `SecretStore` class operations
+
+   ```python
+   # In buunstack/secrets.py
+   def _ensure_authenticated(self):
+       token_info = self.client.auth.token.lookup_self()
+       ttl = token_info.get("data", {}).get("ttl", 0)
+       renewable = token_info.get("data", {}).get("renewable", False)
+
+       # Renew if TTL < 10 minutes and renewable
+       if renewable and ttl > 0 and ttl < 600:
+           self.client.auth.token.renew_self()
+   ```
+
+3. **Renewal Trigger**: Every `SecretStore` operation (get, put, delete, list)
+   - Checks token validity before operation
+   - Automatically renews if TTL < 10 minutes
+   - Transparent to user code
+
+4. **Token Configuration** (set during creation):
+   - **TTL**: `NOTEBOOK_VAULT_TOKEN_TTL` (default: 24h = 1 day)
+   - **Max TTL**: `NOTEBOOK_VAULT_TOKEN_MAX_TTL` (default: 168h = 7 days)
+   - **Policy**: User-specific `jupyter-user-{username}`
+   - **Type**: Orphan token (independent of parent token lifecycle)
+
+5. **Expiry Handling**: When token reaches Max TTL:
+   - Cannot be renewed further
+   - User must restart notebook server (triggers new token creation)
+   - Mitigated by the `JUPYTERHUB_CULL_MAX_AGE` setting, which restarts the server pod after 7 days (604800s), in line with the 7-day Max TTL
+
+**Key Files:**
+
+- `pre_spawn_hook.gomplate.py`: User token creation logic
+- `buunstack/secrets.py`: Token renewal implementation
+- `user_policy.hcl`: User token permissions template
+
+### Token Lifecycle Summary
+
+```plain
+┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
+│   Admin Token   │    │   User Token     │    │  Pod Lifecycle  │
+│                 │    │                  │    │                 │
+│ Created: Manual │    │ Created: Spawn   │    │ Max Age: 7 days │
+│ TTL: 5m-24h     │    │ TTL: 1 day       │    │ Auto-restart    │
+│ Max TTL: ∞      │    │ Max TTL: 7 days  │    │ at Max TTL      │
+│ Renewal: Auto   │    │ Renewal: Auto    │    │                 │
+│ Interval: TTL/2 │    │ Trigger: Usage   │    │                 │
+└─────────────────┘    └──────────────────┘    └─────────────────┘
+         │                       │                       │
+         ▼                       ▼                       ▼
+   vault-token-renewer      buunstack.py            cull.maxAge
+   sidecar                  SecretStore            pod restart
+```
+
+## Storage Options
+
+### Default Storage
+
+Uses Kubernetes PersistentVolumes for user home directories.
+
+### NFS Storage
+
+For shared storage across nodes, configure NFS:
+
+```bash
+JUPYTERHUB_NFS_PV_ENABLED=true
+JUPYTER_NFS_IP=192.168.10.1
+JUPYTER_NFS_PATH=/volume1/drive1/jupyter
+```
+
+NFS storage requires:
+
+- Longhorn storage system installed
+- NFS server accessible from cluster nodes
+- Proper NFS export permissions configured
+
+## Configuration
+
+### Environment Variables
+
+Key configuration variables:
+
+```bash
+# Basic settings
+JUPYTERHUB_NAMESPACE=jupyter
+JUPYTERHUB_CHART_VERSION=4.2.0
+JUPYTERHUB_OIDC_CLIENT_ID=jupyterhub
+
+# Keycloak integration
+KEYCLOAK_REALM=buunstack
+
+# Storage
+JUPYTERHUB_NFS_PV_ENABLED=false
+
+# Vault integration
+JUPYTERHUB_VAULT_INTEGRATION_ENABLED=false
+VAULT_ADDR=https://vault.example.com
+
+# Image settings
+JUPYTER_PYTHON_KERNEL_TAG=python-3.12-28
+IMAGE_REGISTRY=localhost:30500
+
+# Vault token TTL settings
+JUPYTERHUB_VAULT_TOKEN_TTL=24h       # Admin token: renewed at TTL/2 intervals
+NOTEBOOK_VAULT_TOKEN_TTL=24h         # User token: 1 day (renewed on usage)
+NOTEBOOK_VAULT_TOKEN_MAX_TTL=168h    # User token: 7 days max
+
+# Server pod lifecycle settings
+JUPYTERHUB_CULL_MAX_AGE=604800       # Max pod age in seconds (7 days = 604800s)
+                                     # Should be <= NOTEBOOK_VAULT_TOKEN_MAX_TTL
+
+# Logging
+JUPYTER_BUUNSTACK_LOG_LEVEL=warning  # Options: debug, info, warning, error
+```
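+
+The constraint between `JUPYTERHUB_CULL_MAX_AGE` and `NOTEBOOK_VAULT_TOKEN_MAX_TTL` is simple arithmetic and can be checked before installation. The helpers below are a hypothetical sketch and assume the Max TTL is given in hours, as in the defaults above.
+
+```python
+def hours_to_seconds(ttl: str) -> int:
+    """Convert an hour-based TTL such as '168h' to seconds."""
+    if not ttl.endswith("h"):
+        raise ValueError("this sketch only handles hour-based TTLs")
+    return int(ttl[:-1]) * 3600
+
+
+def cull_age_is_safe(cull_max_age: int, max_ttl: str) -> bool:
+    """True if pods are culled no later than the user token's Max TTL."""
+    return cull_max_age <= hours_to_seconds(max_ttl)
+
+
+print(hours_to_seconds("168h"))            # 604800
+print(cull_age_is_safe(604800, "168h"))    # True (defaults: 7 days vs 7 days)
+print(cull_age_is_safe(691200, "168h"))    # False (8 days exceeds the Max TTL)
+```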
+
+### Advanced Configuration
+
+Customize JupyterHub behavior by editing the `jupyterhub-values.gomplate.yaml` template before installation.
+
 ## Custom Container Images

 JupyterHub uses custom container images with pre-installed data science tools and integrations:
@@ -88,3 +509,156 @@ GPU-enabled notebook image based on `jupyter/pytorch-notebook:cuda12`:
 [📖 See Image Documentation](./images/datastack-cuda-notebook/README.md)

 Both images are based on the official [Jupyter Docker Stacks](https://github.com/jupyter/docker-stacks) and include all standard data science libraries (NumPy, pandas, scikit-learn, matplotlib, etc.).
+
+## Management
+
+### Uninstall
+
+```bash
+just jupyterhub::uninstall
+```
+
+This removes:
+
+- JupyterHub deployment
+- User pods
+- PVCs
+- ExternalSecret
+
+### Update
+
+Upgrade to newer versions:
+
+```bash
+# Update image tag in .env.local
+export JUPYTER_PYTHON_KERNEL_TAG=python-3.12-29
+
+# Rebuild and push images
+just jupyterhub::build-kernel-images
+just jupyterhub::push-kernel-images
+
+# Upgrade JupyterHub deployment
+just jupyterhub::install
+```
+
+### Manual Token Refresh
+
+If needed, manually refresh the admin token:
+
+```bash
+# Create new renewable token
+just jupyterhub::create-jupyterhub-vault-token
+
+# Restart JupyterHub to pick up new token
+kubectl rollout restart deployment/hub -n jupyter
+```
+
+## Troubleshooting
+
+### Image Pull Issues
+
+Buun-stack images are large, so image pulls may time out:
+
+```bash
+# Check pod status
+kubectl get pods -n jupyter
+
+# Check image pull progress
+kubectl describe pod <pod-name> -n jupyter
+
+# Increase timeout if needed
+helm upgrade jupyterhub jupyterhub/jupyterhub -n jupyter --timeout=30m -f jupyterhub-values.yaml
+```
+
+### Vault Integration Issues
+
+Check token and authentication:
+
+```bash
+# Check ExternalSecret status
+kubectl get externalsecret -n jupyter jupyterhub-vault-token
+
+# Check if Secret was created
+kubectl get secret -n jupyter jupyterhub-vault-token
+
+# Check token renewal logs
+kubectl logs -n jupyter -l app.kubernetes.io/component=hub -c vault-token-renewer
+
+# In a notebook, verify environment
+%env NOTEBOOK_VAULT_TOKEN
+```
+
+Common issues:
+
+1. **"child policies must be subset of parent"**: Admin policy needs `sudo` permission for orphan tokens
+2. **Token not found**: Check ExternalSecret and ClusterSecretStore configuration
+3. **Permission denied**: Verify `jupyterhub-admin` policy has all required permissions
+
+### Authentication Issues
+
+Verify Keycloak client configuration:
+
+```bash
+# Check client exists
+just keycloak::get-client buunstack jupyterhub
+
+# Check redirect URIs
+just keycloak::update-client buunstack jupyterhub \
+  "https://your-jupyter-host/hub/oauth_callback"
+```
+
+## Technical Implementation Details
+
+### Helm Chart Version
+
+JupyterHub uses the official Zero to JupyterHub (Z2JH) Helm chart:
+
+- Chart: `jupyterhub/jupyterhub`
+- Version: `4.2.0` (configurable via `JUPYTERHUB_CHART_VERSION`)
+- Documentation: <https://z2jh.jupyter.org/>
+
+### Token System Architecture
+
+The token system consists of three components:
+
+1. **Renewable Admin Token**:
+   - Created with `explicit-max-ttl=0` (unlimited Max TTL)
+   - Renewed automatically at TTL/2 intervals (minimum 30 seconds)
+   - Stored in Vault and fetched via ExternalSecret
+2. **Orphan User Tokens**:
+   - Created with `create_orphan()` API call
+   - Independent of the admin token's lifecycle (not revoked together with the parent)
+   - Individual TTL and Max TTL settings
+3. **Token Renewal Script**:
+   - Runs as sidecar container
+   - Reads token from ExternalSecret mount
+   - Handles renewal and re-retrieval on failure
+
+### Key Files
+
+- `jupyterhub-admin-policy.hcl`: Vault policy with admin permissions
+- `user_policy.hcl`: Template for user-specific policies
+- `vault-token-renewer.sh`: Token renewal script
+- `jupyterhub-vault-token-external-secret.gomplate.yaml`: ExternalSecret configuration
+
+## Performance Considerations
+
+- **Image Size**: Buun-stack images are ~13GB; plan node and registry storage accordingly
+- **Pull Time**: Initial pulls take 5-15 minutes depending on network
+- **Resource Usage**: Data science workloads require adequate CPU/memory
+- **Token Renewal**: Minimal overhead (renewal at TTL/2 intervals)
+
+For production deployments, consider:
+
+- Pre-pulling images to all nodes
+- Using faster storage backends
+- Configuring resource limits per user
+- Setting up monitoring and alerts
+
+## Known Limitations
+
+1. **Annual Token Recreation**: Although the admin token has unlimited Max TTL, best practice suggests recreating it annually.
+2. **Token Expiry and Pod Lifecycle**: User tokens have a TTL of 1 day (`NOTEBOOK_VAULT_TOKEN_TTL=24h`) and a maximum TTL of 7 days (`NOTEBOOK_VAULT_TOKEN_MAX_TTL=168h`). Daily usage extends the token for another day, allowing up to 7 days of continuous use. Server pods are automatically restarted after 7 days (`JUPYTERHUB_CULL_MAX_AGE=604800`, in seconds) to refresh tokens.
+3. **Cull Settings**: Server idle timeout is set to 2 hours by default. Adjust `cull.timeout` and `cull.every` in the Helm values for different requirements.
+4. **NFS Storage**: When using NFS storage, ensure proper permissions are set on the NFS server. The default `JUPYTER_FSGID` is 100.
+5. **ExternalSecret Dependency**: Requires External Secrets Operator to be installed and configured
--- a/trino/MCP.md
+++ b/trino/MCP.md
@@ -26,7 +26,7 @@ Create `.env.claude` with Trino connection settings:

 ```bash
 # Trino Connection (Password Authentication)
-TRINO_HOST=trino.buun.dev
+TRINO_HOST=trino.yourdomain.com
 TRINO_PORT=443
 TRINO_SCHEME=https
 TRINO_SSL=true
@@ -75,7 +75,7 @@ Create `~/.env.claude` in your home directory with 1Password references:

 ```bash
 # Trino Connection (Password Authentication)
-TRINO_HOST=trino.buun.dev
+TRINO_HOST=trino.yourdomain.com
 TRINO_PORT=443
 TRINO_SCHEME=https
 TRINO_SSL=true
--- a/trino/justfile
+++ b/trino/justfile
@@ -392,7 +392,7 @@ cli user="":
    TRINO_HOST="${TRINO_HOST}"
    while [ -z "${TRINO_HOST}" ]; do
        TRINO_HOST=$(gum input --prompt="Trino host (FQDN): " --width=100 \
-            --placeholder="e.g., trino.buun.dev")
+            --placeholder="e.g., trino.yourdomain.com")
    done
    TRINO_USER="{{ user }}"
    if [ -z "${TRINO_USER}" ]; then