feat(querybook): install Querybook
13
README.md
@@ -36,6 +36,7 @@ A remotely accessible Kubernetes home lab with OIDC authentication. Build a mode

- **[JupyterHub](https://jupyter.org/hub)**: Interactive computing with collaborative notebooks
- **[Trino](https://trino.io/)**: Distributed SQL query engine for querying multiple data sources
- **[Querybook](https://www.querybook.org/)**: Big data querying UI with notebook interface
- **[ClickHouse](https://clickhouse.com/)**: High-performance columnar analytics database
- **[Qdrant](https://qdrant.tech/)**: Vector database for AI/ML applications
- **[Lakekeeper](https://lakekeeper.io/)**: Apache Iceberg REST Catalog for data lake management
@@ -152,6 +153,17 @@ Business intelligence and data visualization platform with PostgreSQL integratio

[📖 See Metabase Documentation](./metabase/README.md)

### Querybook

Pinterest's big data querying UI with notebook interface for collaborative data exploration:

- **Trino Integration**: Execute SQL queries against multiple data sources with user impersonation
- **Notebook Interface**: Create shareable datadocs with queries, visualizations, and documentation
- **Keycloak Authentication**: OAuth2 integration with group-based admin access
- **Real-time Execution**: WebSocket-based query execution with live progress updates

[📖 See Querybook Documentation](./querybook/README.md)

### Trino

Fast distributed SQL query engine for big data analytics with:
@@ -299,6 +311,7 @@ kubectl --context yourpc-oidc get nodes
# Vault: https://vault.yourdomain.com
# Keycloak: https://auth.yourdomain.com
# Trino: https://trino.yourdomain.com
# Querybook: https://querybook.yourdomain.com
# Metabase: https://metabase.yourdomain.com
# Airflow: https://airflow.yourdomain.com
# JupyterHub: https://jupyter.yourdomain.com
1
justfile
@@ -23,6 +23,7 @@ mod minio
mod oauth2-proxy
mod postgres
mod qdrant
mod querybook
mod trino
mod utils
mod vault
10
querybook/.gitignore
vendored
Normal file
@@ -0,0 +1,10 @@
# Generated Helm values (contains OAuth client secret)
querybook-values.yaml

# Generated Kubernetes manifests
querybook-config-external-secret.yaml
keycloak-auth-configmap.yaml
traefik-middleware.yaml

# Cloned Helm chart repository
querybook-repo/
252
querybook/README.md
Normal file
@@ -0,0 +1,252 @@
# Querybook

Pinterest's big data querying UI with notebook interface, Keycloak OAuth authentication, and Trino integration.

## Overview

This module deploys Querybook using the official Helm chart from Pinterest with:

- **Keycloak OAuth2 authentication** for user login
- **Trino integration** with user impersonation for query attribution
- **PostgreSQL backend** for metadata storage
- **Redis** for caching and session management
- **Traefik integration** with WebSocket support for real-time query execution
- **Group-based admin access** via Keycloak groups

## Prerequisites

- Kubernetes cluster (k3s)
- Keycloak installed and configured
- PostgreSQL cluster (CloudNativePG)
- Trino with access control configured
- External Secrets Operator (optional, for Vault integration)

## Installation

### Basic Installation

```bash
just querybook::install
```

You will be prompted for:

1. **Querybook host (FQDN)**: e.g., `querybook.example.com`
2. **Keycloak host (FQDN)**: e.g., `auth.example.com`

### What Gets Installed

- Querybook web service
- Querybook scheduler (background jobs)
- Querybook workers (query execution)
- PostgreSQL database for Querybook metadata
- Redis for caching and sessions
- Keycloak OAuth2 client (confidential client)
- `querybook-admin` group in Keycloak for admin access
- Traefik Middleware for WebSocket and header forwarding

## Configuration

Environment variables (set in `.env.local` or override):

```bash
QUERYBOOK_NAMESPACE=querybook         # Kubernetes namespace
QUERYBOOK_HOST=querybook.example.com  # External hostname
KEYCLOAK_HOST=auth.example.com        # Keycloak hostname
KEYCLOAK_REALM=buunstack              # Keycloak realm name
```
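
The two Keycloak variables are enough to derive the OIDC endpoints the OAuth client talks to. The following is a minimal sketch, assuming Keycloak's standard `/realms/<realm>/protocol/openid-connect/...` paths (the install recipe in this commit prints the same authorization URL). The helper name `keycloak_endpoints` is hypothetical and not part of this module.

```python
def keycloak_endpoints(host: str, realm: str) -> dict:
    """Derive OIDC endpoint URLs from KEYCLOAK_HOST and KEYCLOAK_REALM."""
    base = f"https://{host}/realms/{realm}/protocol/openid-connect"
    return {
        "authorization_url": f"{base}/auth",
        "token_url": f"{base}/token",
        "profile_url": f"{base}/userinfo",
    }


# Example values taken from the configuration block above.
endpoints = keycloak_endpoints("auth.example.com", "buunstack")
for name, url in endpoints.items():
    print(f"{name}: {url}")
```

The dictionary keys mirror the `authorization_url`, `token_url`, and `profile_url` entries used by the custom auth backend's `oauth_config`.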

## Usage

### Access Querybook

1. Navigate to `https://your-querybook-host/`
2. Click "Login with OAuth" to authenticate with Keycloak
3. Create datadocs (notebooks) and execute queries

### Grant Admin Access

Add users to the `querybook-admin` group:

```bash
just keycloak::add-user-to-group <username> querybook-admin
```

Admin users can:

- Manage query engines
- Configure data sources
- Manage user permissions
- View all datadocs

### Configure Trino Query Engine

1. Log in as an admin user
2. Navigate to Admin → Query Engines
3. Click "Add Query Engine"
4. Configure:

   ```plain
   Name: Trino
   Language: Trino
   Environment: production (or your preferred environment name)
   ```

5. Navigate to Admin → Environments → [your environment]
6. Add new query engine connection:

   ```plain
   Connection String: trino://trino.example.com:443?SSL=true
   Username: admin
   Password: [from just trino::admin-password]
   ```

7. Optional: Configure additional connection parameters:
   - **Catalog**: Specify default catalog (e.g., `postgresql` or `iceberg`)
   - **Schema**: Specify default schema
   - **Proxy_user_id**: Leave empty or set to enable user impersonation

### User Impersonation

Querybook connects to Trino as `admin` but executes queries as the logged-in user via Trino's impersonation feature. This provides:

- **Query Attribution**: Queries are attributed to the actual user, not the admin account
- **Audit Logging**: Trino logs show the real user who executed each query
- **Access Control**: Future per-user access policies can be enforced

**How it Works**:

1. User logs into Querybook with Keycloak
2. Querybook connects to Trino using admin credentials
3. Querybook sends queries with `X-Trino-User: <username>` header
4. Trino impersonates the user (allowed by access control rules)
5. Query runs as if executed by the actual user
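
The steps above can be sketched at the HTTP level. This is an illustration only, assuming Trino's REST endpoint `POST /v1/statement` with HTTP Basic authentication. It is not how Querybook builds its requests internally (Querybook goes through the Python Trino client), and `build_trino_request`, the hostname, and the password are hypothetical.

```python
import base64
import urllib.request


def build_trino_request(base_url, sql, principal, password, run_as):
    """Build (without sending) a statement request that runs as `run_as`."""
    token = base64.b64encode(f"{principal}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/v1/statement",
        data=sql.encode("utf-8"),
        method="POST",
        headers={
            # Authenticates the connection as the privileged principal.
            "Authorization": f"Basic {token}",
            # Names the user the query should be attributed to.
            "X-Trino-User": run_as,
        },
    )


request = build_trino_request(
    "https://trino.example.com", "SELECT current_user", "admin", "example-password", "alice"
)
# urllib stores header names capitalized; HTTP header names are case-insensitive.
print(request.get_header("X-trino-user"))
```

The request is only constructed, never sent, so the sketch can be run without a Trino cluster. Whether Trino accepts it depends on the impersonation rules in its access control configuration.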

## Architecture

```
External Users
      ↓
Cloudflare Tunnel (HTTPS)
      ↓
Traefik Ingress (HTTPS)
      ├─ Traefik Middleware (X-Forwarded-*, WebSocket upgrade)
      └─ Backend: HTTP
            ↓
Querybook Web
      ├─ OAuth2 → Keycloak (authentication)
      ├─ PostgreSQL (metadata)
      ├─ Redis (cache/sessions)
      └─ WebSocket (real-time query updates)
            ↓
Querybook Workers
      ↓
Trino (HTTPS via external hostname)
      └─ Password auth + User impersonation
```

**Key Components**:

- **Traefik Middleware**: Handles WebSocket upgrade headers and X-Forwarded-* headers
- **OAuth2 Integration**: Uses standard OIDC scopes (openid, email, profile) with groups mapper
- **Trino Connection**: Must use external HTTPS hostname (not internal service name)
- **User Impersonation**: Admin credentials with X-Trino-User header for query attribution

## Authentication

### User Login (OAuth2)

- Users authenticate via Keycloak
- Standard OIDC flow with Authorization Code grant
- Group membership included in UserInfo endpoint response
- Session stored in Redis
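
To make the UserInfo handling concrete, here is a small sketch of how the claims could be read. It mirrors the fallback logic of `_parse_user_profile` in this commit but is a standalone illustration: `parse_userinfo` and the sample payload are hypothetical, and the exact form of the `groups` claim (for example, whether names carry a leading `/`) depends on how the Keycloak groups mapper is configured.

```python
def parse_userinfo(claims: dict) -> tuple:
    """Extract (username, email, groups) from an OIDC UserInfo payload."""
    email = claims.get("email", "")
    # Prefer preferred_username; fall back to the local part of the email.
    username = claims.get("preferred_username") or email.split("@")[0]
    # The groups claim is only present when a groups mapper is configured.
    groups = claims.get("groups", [])
    return username, email, groups


sample = {
    "preferred_username": "alice",
    "email": "alice@example.com",
    "groups": ["querybook-admin"],
}
username, email, groups = parse_userinfo(sample)
print(username, email, "querybook-admin" in groups)
```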

### Admin Access

- Controlled by Keycloak group membership
- Users in `querybook-admin` group have full admin privileges
- Regular users can create and manage their own datadocs

### Trino Connection

- Uses password authentication (admin user)
- Connects via external HTTPS hostname (Traefik provides TLS)
- Python Trino client enforces HTTPS when authentication is used
- User impersonation via X-Trino-User header
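
A connection string can be checked against these constraints before it is saved in the admin UI. The sketch below is a hypothetical helper, not part of Querybook or the Trino client. It assumes the `trino://host:port?SSL=true` form shown in this README and that in-cluster names end in `.svc.cluster.local`.

```python
from urllib.parse import parse_qs, urlsplit


def check_trino_connection_string(conn: str) -> list:
    """Return a list of problems found in a trino:// connection string."""
    parts = urlsplit(conn)
    problems = []
    if parts.scheme != "trino":
        problems.append("scheme must be trino://")
    # Password authentication needs TLS, so SSL=true must be set.
    ssl = parse_qs(parts.query).get("SSL", ["false"])[0].lower()
    if ssl != "true":
        problems.append("SSL=true is required when a password is used")
    # The in-cluster service speaks plain HTTP, so it cannot be used here.
    if parts.hostname and parts.hostname.endswith(".svc.cluster.local"):
        problems.append("use the external hostname, not the in-cluster service name")
    return problems


good = check_trino_connection_string("trino://trino.example.com:443?SSL=true")
bad = check_trino_connection_string("trino://trino.trino.svc.cluster.local:8080")
print(good)
print(bad)
```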

## Management

### Upgrade Querybook

```bash
just querybook::upgrade
```

Updates the Helm deployment with current configuration.

### Uninstall

```bash
# Keep PostgreSQL database
just querybook::uninstall false

# Delete PostgreSQL database too
just querybook::uninstall true
```

## Troubleshooting

### Check Pod Status

```bash
kubectl get pods -n querybook
```

### WebSocket Connection Fails

- Verify Traefik middleware exists: `kubectl get middleware querybook-headers -n querybook`
- Check WebSocket upgrade headers in middleware configuration
- Ensure Ingress annotation references middleware: `querybook-querybook-headers@kubernetescrd`
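
The doubled `querybook-` prefix in the annotation value is expected: Traefik's Kubernetes CRD provider names a middleware as `<namespace>-<name>@kubernetescrd`. A tiny sketch (the helper name is hypothetical) that rebuilds the value for a different namespace:

```python
def traefik_middleware_ref(namespace: str, name: str) -> str:
    """Build the reference Traefik expects for a Middleware custom resource."""
    return f"{namespace}-{name}@kubernetescrd"


# Namespace and middleware name as used by this module's defaults.
ref = traefik_middleware_ref("querybook", "querybook-headers")
print(ref)
```

If `QUERYBOOK_NAMESPACE` is overridden, the annotation value has to change with it.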

### OAuth Login Fails

- Verify Keycloak client exists: `just keycloak::list-clients`
- Check redirect URL: `https://<querybook-host>/oauth2callback`
- Verify client secret matches: Compare Vault/K8s secret with Keycloak
- Check Keycloak is accessible from Querybook pods

### Trino Connection Fails

- **Error: "cannot use authentication with HTTP"**
  - Must use external hostname with HTTPS: `trino://trino.example.com:443?SSL=true`
  - Do NOT use internal service name (e.g., `trino.trino.svc.cluster.local:8080`)
  - Python Trino client enforces HTTPS when authentication is used

- **Error: "500 Internal Server Error"**
  - Verify Trino is accessible via external hostname
  - Check Trino admin password: `just trino::admin-password`
  - Test Trino connection manually with curl

- **Error: "Access Denied: User admin cannot impersonate user X"**
  - Verify Trino access control is configured
  - Check impersonation rules: `kubectl exec -n trino deployment/trino-coordinator -- cat /etc/trino/access-control/rules.json`
  - Ensure admin can impersonate all users

### Query Execution Stuck

- Check worker pod logs: `just querybook::logs worker`
- Verify Redis is running: `kubectl get pods -n querybook | grep redis`
- Check Trino coordinator health: `kubectl get pods -n trino`

### Database Connection Issues

- Verify PostgreSQL cluster is running: `kubectl get cluster -n postgres`
- Check database exists: `just postgres::list-databases | grep querybook`
- Verify secret exists: `kubectl get secret querybook-secret -n querybook`
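
One more thing worth checking when the database connection fails is the connection string itself. The `create-secrets` recipe in this commit assembles `DATABASE_CONN` by plain string interpolation, so a password containing characters such as `@`, `/`, or `:` would produce a URL that does not parse as intended. Whether the generated passwords can contain such characters is not shown here. The sketch below, with a hypothetical helper and made-up credentials, shows a percent-encoded form:

```python
from urllib.parse import quote


def build_database_conn(user, password, host, port, database):
    """Build a PostgreSQL URL with the credentials percent-encoded."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


conn = build_database_conn(
    "admin", "p@ss/w:rd", "postgres-cluster-rw.postgres", "5432", "querybook"
)
print(conn)
```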

## References

- [Querybook Documentation](https://www.querybook.org/)
- [Querybook GitHub](https://github.com/pinterest/querybook)
- [Trino Integration](../trino/README.md)
- [Keycloak OAuth2](https://www.keycloak.org/docs/latest/securing_apps/#_oidc)
46
querybook/custom-auth/keycloak_auth.py
Normal file
@@ -0,0 +1,46 @@
"""
Keycloak OIDC authentication backend for Querybook
"""
from app.auth.oauth_auth import OAuthLoginManager, OAUTH_CALLBACK_PATH
from env import QuerybookSettings


class KeycloakLoginManager(OAuthLoginManager):
    @property
    def oauth_config(self):
        return {
            "callback_url": "{}{}".format(
                QuerybookSettings.PUBLIC_URL, OAUTH_CALLBACK_PATH
            ),
            "client_id": QuerybookSettings.OAUTH_CLIENT_ID,
            "client_secret": QuerybookSettings.OAUTH_CLIENT_SECRET,
            "authorization_url": QuerybookSettings.OAUTH_AUTHORIZATION_URL,
            "token_url": QuerybookSettings.OAUTH_TOKEN_URL,
            "profile_url": QuerybookSettings.OAUTH_USER_PROFILE,
            "scope": ["openid", "email", "profile"],
        }

    def _parse_user_profile(self, resp):
        """Parse standard OIDC UserInfo response from Keycloak"""
        user = resp.json()
        # Keycloak returns standard OIDC claims:
        # - preferred_username: username
        # - email: email address
        # - name: full name (optional)
        username = user.get("preferred_username") or user.get("email", "").split("@")[0]
        email = user.get("email", "")
        fullname = user.get("name", username)
        return username, email, fullname


login_manager = KeycloakLoginManager()

ignore_paths = [OAUTH_CALLBACK_PATH]


def init_app(app):
    login_manager.init_app(app)


def login(request):
    return login_manager.login(request)
327
querybook/justfile
Normal file
@@ -0,0 +1,327 @@
set fallback := true

export QUERYBOOK_NAMESPACE := env("QUERYBOOK_NAMESPACE", "querybook")
export QUERYBOOK_HOST := env("QUERYBOOK_HOST", "")
export QUERYBOOK_CHART_REPO := env("QUERYBOOK_CHART_REPO", "https://github.com/pinterest/querybook")
export QUERYBOOK_CHART_PATH := env("QUERYBOOK_CHART_PATH", "helm")
export EXTERNAL_SECRETS_NAMESPACE := env("EXTERNAL_SECRETS_NAMESPACE", "external-secrets")
export K8S_VAULT_NAMESPACE := env("K8S_VAULT_NAMESPACE", "vault")
export KEYCLOAK_REALM := env("KEYCLOAK_REALM", "buunstack")
export KEYCLOAK_HOST := env("KEYCLOAK_HOST", "")

[private]
default:
    @just --list --unsorted --list-submodules

# Create Querybook namespace
create-namespace:
    @kubectl get namespace ${QUERYBOOK_NAMESPACE} >/dev/null 2>&1 || \
        kubectl create namespace ${QUERYBOOK_NAMESPACE}

# Delete Querybook namespace
delete-namespace:
    @kubectl delete namespace ${QUERYBOOK_NAMESPACE} --ignore-not-found

# Clone Querybook Helm chart repository
clone-chart-repo:
    #!/bin/bash
    set -euo pipefail
    if [ ! -d "querybook-repo" ]; then
        echo "Cloning Querybook Helm chart repository..."
        git clone --depth 1 ${QUERYBOOK_CHART_REPO} querybook-repo
    else
        echo "Querybook repository already exists. Pulling latest changes..."
        cd querybook-repo && git pull
    fi

# Remove cloned chart repository
remove-chart-repo:
    rm -rf querybook-repo

# Create Keycloak client and OAuth secret for Querybook
create-keycloak-client:
    #!/bin/bash
    set -euo pipefail
    while [ -z "${QUERYBOOK_HOST}" ]; do
        QUERYBOOK_HOST=$(
            gum input --prompt="Querybook host (FQDN): " --width=100 \
                --placeholder="e.g., querybook.example.com"
        )
    done

    echo "Creating Keycloak client for Querybook..."

    # Delete existing client if present
    just keycloak::delete-client ${KEYCLOAK_REALM} querybook || true

    # Generate client secret
    CLIENT_SECRET=$(just utils::random-password)

    # Create 'querybook-admin' group if it doesn't exist
    echo "Creating 'querybook-admin' group..."
    just keycloak::create-group querybook-admin '' 'Querybook administrators' || echo "Group may already exist"

    # Create confidential client with client secret
    # Uses standard OIDC scopes: openid, email, profile (no custom scopes needed)
    just keycloak::create-client \
        realm=${KEYCLOAK_REALM} \
        client_id=querybook \
        redirect_url="https://${QUERYBOOK_HOST}/oauth2callback" \
        client_secret="${CLIENT_SECRET}"

    # Add groups mapper to include group membership in UserInfo
    echo "Adding groups mapper to querybook client..."
    just keycloak::add-groups-mapper querybook

    # Store client secret temporarily in Kubernetes Secret (always created)
    kubectl delete secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
    kubectl create secret generic querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} \
        --from-literal=client_secret="${CLIENT_SECRET}"

    # Also store in Vault if available
    if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null; then
        echo "Storing OAuth client secret in Vault..."
        just vault::put querybook/oauth client_secret="${CLIENT_SECRET}"
    fi

    echo "Keycloak client created successfully"
    echo "Client ID: querybook"
    echo "Scopes: openid, email, profile (standard OIDC scopes)"
    echo "Redirect URI: https://${QUERYBOOK_HOST}/oauth2callback"
    echo ""
    echo "Admin Group: querybook-admin"
    echo "To grant admin access, add users to 'querybook-admin' group:"
    echo " just keycloak::add-user-to-group <username> querybook-admin"

# Delete Keycloak client
delete-keycloak-client:
    #!/bin/bash
    set -euo pipefail
    echo "Deleting Keycloak client for Querybook..."
    just keycloak::delete-client ${KEYCLOAK_REALM} querybook || true
    echo "Deleting querybook-admin group..."
    just keycloak::delete-group querybook-admin || true
    kubectl delete secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} --ignore-not-found

# Create Querybook secrets
create-secrets:
    #!/bin/bash
    set -euo pipefail

    # Generate Flask secret key
    flask_secret=$(just utils::random-password)

    # Get PostgreSQL credentials
    pg_host="postgres-cluster-rw.postgres"
    pg_port="5432"
    pg_user=$(just postgres::admin-username)
    pg_password=$(just postgres::admin-password)
    pg_database="querybook"

    # Build database connection string
    database_conn="postgresql://${pg_user}:${pg_password}@${pg_host}:${pg_port}/${pg_database}"

    # Get OAuth client secret (created by create-keycloak-client)
    # Try Vault first, fallback to Kubernetes Secret
    if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null && \
       just vault::get querybook/oauth client_secret &>/dev/null; then
        oauth_client_secret=$(just vault::get querybook/oauth client_secret)
    elif kubectl get secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} &>/dev/null; then
        oauth_client_secret=$(kubectl get secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} \
            -o jsonpath='{.data.client_secret}' | base64 -d)
    else
        echo "Error: Cannot retrieve OAuth client secret. Please run 'just querybook::create-keycloak-client' first."
        exit 1
    fi

    if helm status external-secrets -n ${EXTERNAL_SECRETS_NAMESPACE} &>/dev/null; then
        echo "External Secrets Operator detected. Storing secrets in Vault..."

        just vault::put querybook/config \
            FLASK_SECRET_KEY="${flask_secret}" \
            DATABASE_CONN="${database_conn}" \
            REDIS_URL="redis://redis:6379/0" \
            ELASTICSEARCH_HOST="elasticsearch:9200" \
            OAUTH_CLIENT_SECRET="${oauth_client_secret}"

        kubectl delete secret querybook-secret -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
        kubectl delete externalsecret querybook-secret -n ${QUERYBOOK_NAMESPACE} --ignore-not-found

        gomplate -f querybook-config-external-secret.gomplate.yaml \
            -o querybook-config-external-secret.yaml
        kubectl apply -f querybook-config-external-secret.yaml

        echo "Waiting for ExternalSecret to sync..."
        kubectl wait --for=condition=Ready externalsecret/querybook-secret \
            -n ${QUERYBOOK_NAMESPACE} --timeout=60s
    else
        echo "External Secrets Operator not found. Creating secret directly..."
        kubectl delete secret querybook-secret -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
        kubectl create secret generic querybook-secret -n ${QUERYBOOK_NAMESPACE} \
            --from-literal=FLASK_SECRET_KEY="${flask_secret}" \
            --from-literal=DATABASE_CONN="${database_conn}" \
            --from-literal=REDIS_URL="redis://redis:6379/0" \
            --from-literal=ELASTICSEARCH_HOST="elasticsearch:9200" \
            --from-literal=OAUTH_CLIENT_SECRET="${oauth_client_secret}"

        if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null; then
            just vault::put querybook/config \
                FLASK_SECRET_KEY="${flask_secret}" \
                DATABASE_CONN="${database_conn}" \
                REDIS_URL="redis://redis:6379/0" \
                ELASTICSEARCH_HOST="elasticsearch:9200" \
                OAUTH_CLIENT_SECRET="${oauth_client_secret}"
        fi
    fi

# Delete Querybook secrets
delete-secrets:
    @kubectl delete secret querybook-secret -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
    @kubectl delete externalsecret querybook-secret -n ${QUERYBOOK_NAMESPACE} --ignore-not-found

# Create Keycloak auth ConfigMap
create-auth-configmap:
    #!/bin/bash
    set -euo pipefail
    echo "Creating Keycloak auth ConfigMap..."
    gomplate -f keycloak-auth-configmap.gomplate.yaml -o keycloak-auth-configmap.yaml
    kubectl apply -f keycloak-auth-configmap.yaml

# Create Traefik Middleware for WebSocket support
create-traefik-middleware:
    #!/bin/bash
    set -euo pipefail
    echo "Creating Traefik Middleware for WebSocket support..."
    gomplate -f traefik-middleware.gomplate.yaml -o traefik-middleware.yaml
    kubectl apply -f traefik-middleware.yaml

# Install Querybook
install:
    #!/bin/bash
    set -euo pipefail
    while [ -z "${QUERYBOOK_HOST}" ]; do
        QUERYBOOK_HOST=$(
            gum input --prompt="Querybook host (FQDN): " --width=100 \
                --placeholder="e.g., querybook.example.com"
        )
    done

    while [ -z "${KEYCLOAK_HOST}" ]; do
        KEYCLOAK_HOST=$(
            gum input --prompt="Keycloak host (FQDN): " --width=100 \
                --placeholder="e.g., auth.example.com"
        )
    done

    just create-namespace
    just postgres::create-db querybook
    just create-keycloak-client
    just create-secrets
    just clone-chart-repo

    # Get OAuth client secret for gomplate template
    # Try Vault first, fallback to Kubernetes Secret
    if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null && \
       just vault::get querybook/oauth client_secret &>/dev/null; then
        export OAUTH_CLIENT_SECRET=$(just vault::get querybook/oauth client_secret)
    elif kubectl get secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} &>/dev/null; then
        export OAUTH_CLIENT_SECRET=$(kubectl get secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} \
            -o jsonpath='{.data.client_secret}' | base64 -d)
    else
        echo "Error: Cannot retrieve OAuth client secret. Please run 'just querybook::create-keycloak-client' first."
        exit 1
    fi

    # Create Traefik Middleware (must exist before Helm install)
    just create-traefik-middleware

    # Create Keycloak auth ConfigMap (must exist before Helm install)
    just create-auth-configmap

    gomplate -f querybook-values.gomplate.yaml -o querybook-values.yaml

    helm upgrade --cleanup-on-fail --install querybook ./querybook-repo/${QUERYBOOK_CHART_PATH} \
        -n ${QUERYBOOK_NAMESPACE} --wait \
        -f querybook-values.yaml

    echo ""
    echo "Querybook installed successfully!"
    echo "Access URL: https://${QUERYBOOK_HOST}"
    echo ""
    echo "OAuth Configuration:"
    echo " Provider: Keycloak (custom OIDC backend)"
    echo " Realm: ${KEYCLOAK_REALM}"
    echo " Scopes: openid, email, profile"
    echo " Authorization URL: https://${KEYCLOAK_HOST}/realms/${KEYCLOAK_REALM}/protocol/openid-connect/auth"
    echo ""
    echo "Admin Access:"
    echo " To grant admin access, add users to 'querybook-admin' group:"
    echo " just keycloak::add-user-to-group <username> querybook-admin"
    echo ""

# Upgrade Querybook
upgrade:
    #!/bin/bash
    set -euo pipefail
    while [ -z "${QUERYBOOK_HOST}" ]; do
        QUERYBOOK_HOST=$(
            gum input --prompt="Querybook host (FQDN): " --width=100 \
                --placeholder="e.g., querybook.example.com"
        )
    done

    while [ -z "${KEYCLOAK_HOST}" ]; do
        KEYCLOAK_HOST=$(
            gum input --prompt="Keycloak host (FQDN): " --width=100 \
                --placeholder="e.g., auth.example.com"
        )
    done

    # Get OAuth client secret for gomplate template
    # Try Vault first, fallback to Kubernetes Secret
    if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null && \
       just vault::get querybook/oauth client_secret &>/dev/null; then
        export OAUTH_CLIENT_SECRET=$(just vault::get querybook/oauth client_secret)
    elif kubectl get secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} &>/dev/null; then
        export OAUTH_CLIENT_SECRET=$(kubectl get secret querybook-oauth-temp -n ${QUERYBOOK_NAMESPACE} \
            -o jsonpath='{.data.client_secret}' | base64 -d)
    else
        echo "Error: Cannot retrieve OAuth client secret. Please run 'just querybook::create-keycloak-client' first."
        exit 1
    fi

    echo "Upgrading Querybook..."

    # Update Traefik Middleware (must exist before Helm upgrade)
    just create-traefik-middleware

    # Update Keycloak auth ConfigMap (must exist before Helm upgrade)
    just create-auth-configmap

    gomplate -f querybook-values.gomplate.yaml -o querybook-values.yaml
    helm upgrade querybook ./querybook-repo/${QUERYBOOK_CHART_PATH} \
        -n ${QUERYBOOK_NAMESPACE} --wait \
        -f querybook-values.yaml

    echo "Querybook upgraded successfully"

# Uninstall Querybook
uninstall delete-db='true':
    #!/bin/bash
    set -euo pipefail
    helm uninstall querybook -n ${QUERYBOOK_NAMESPACE} --ignore-not-found --wait
    kubectl delete configmap querybook-keycloak-auth -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
    kubectl delete middleware querybook-headers -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
    kubectl delete serverstransport querybook-transport -n ${QUERYBOOK_NAMESPACE} --ignore-not-found
    just delete-secrets
    just delete-keycloak-client
    just delete-namespace
    if [ "{{ delete-db }}" = "true" ]; then
        just postgres::delete-db querybook
    fi

    # Clean up Vault entries if present
    if helm status vault -n ${K8S_VAULT_NAMESPACE} &>/dev/null; then
        just vault::delete querybook/config || true
        just vault::delete querybook/oauth || true
    fi
84
querybook/keycloak-auth-configmap.gomplate.yaml
Normal file
@@ -0,0 +1,84 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: querybook-keycloak-auth
  namespace: {{ .Env.QUERYBOOK_NAMESPACE }}
data:
  keycloak_auth.py: |
    """
    Keycloak OIDC authentication backend for Querybook
    """
    from app.auth.oauth_auth import OAuthLoginManager, OAUTH_CALLBACK_PATH
    from env import QuerybookSettings
    from lib.logger import get_logger
    from logic.user import get_user_by_name, create_user

    LOG = get_logger(__file__)


    class KeycloakLoginManager(OAuthLoginManager):
        def __init__(self):
            super().__init__()
            self._current_user_groups = []

        @property
        def oauth_config(self):
            return {
                "callback_url": "{}{}".format(
                    QuerybookSettings.PUBLIC_URL, OAUTH_CALLBACK_PATH
                ),
                "client_id": QuerybookSettings.OAUTH_CLIENT_ID,
                "client_secret": QuerybookSettings.OAUTH_CLIENT_SECRET,
                "authorization_url": QuerybookSettings.OAUTH_AUTHORIZATION_URL,
                "token_url": QuerybookSettings.OAUTH_TOKEN_URL,
                "profile_url": QuerybookSettings.OAUTH_USER_PROFILE,
                "scope": ["openid", "email", "profile"],
            }

        def _parse_user_profile(self, resp):
            """Parse standard OIDC UserInfo response from Keycloak"""
            user = resp.json()
            username = user.get("preferred_username") or user.get("email", "").split("@")[0]
            email = user.get("email", "")

            # Store groups for role synchronization
            self._current_user_groups = user.get("groups", [])
            LOG.info(f"User {username} groups: {self._current_user_groups}")

            return username, email

        def login_user(self, username, email, session=None):
            """Override login_user - using default Querybook behavior

            Note: Querybook automatically makes the first user an admin via
            create_admin_when_no_admin() function. Additional users can be
            granted admin access through Querybook's UI or database.
            """
            from .utils import AuthenticationError

            if not username or not isinstance(username, str):
                raise AuthenticationError("Please provide a valid username")

            user = get_user_by_name(username, session=session)
            if not user:
                user = create_user(
                    username=username, fullname=username, email=email, session=session
                )

            # Log group membership for debugging
            LOG.info(f"User {username} Keycloak groups: {self._current_user_groups}")

            return user


    login_manager = KeycloakLoginManager()

    ignore_paths = [OAUTH_CALLBACK_PATH]


    def init_app(app):
        login_manager.init_app(app)


    def login(request):
        return login_manager.login(request)
34
querybook/querybook-config-external-secret.gomplate.yaml
Normal file
@@ -0,0 +1,34 @@
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: querybook-secret
  namespace: {{ .Env.QUERYBOOK_NAMESPACE }}
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-secret-store
    kind: ClusterSecretStore
  target:
    name: querybook-secret
    creationPolicy: Owner
  data:
    - secretKey: FLASK_SECRET_KEY
      remoteRef:
        key: querybook/config
        property: FLASK_SECRET_KEY
    - secretKey: DATABASE_CONN
      remoteRef:
        key: querybook/config
        property: DATABASE_CONN
    - secretKey: REDIS_URL
      remoteRef:
        key: querybook/config
        property: REDIS_URL
    - secretKey: ELASTICSEARCH_HOST
      remoteRef:
        key: querybook/config
        property: ELASTICSEARCH_HOST
    - secretKey: OAUTH_CLIENT_SECRET
      remoteRef:
        key: querybook/config
        property: OAUTH_CLIENT_SECRET
187
querybook/querybook-values.gomplate.yaml
Normal file
@@ -0,0 +1,187 @@
# Querybook Helm Chart Values
# https://github.com/pinterest/querybook/tree/master/helm

# Worker configuration
worker:
  replicaCount: 1
  name: worker
  image:
    repository: querybook/querybook
    pullPolicy: IfNotPresent
    tag: latest
  resources:
    requests:
      memory: 1Gi
      cpu: 700m
    limits:
      memory: 2Gi
      cpu: 1

# Scheduler configuration
scheduler:
  replicaCount: 1
  name: scheduler
  image:
    repository: querybook/querybook
    pullPolicy: IfNotPresent
    tag: latest
  resources:
    requests:
      memory: 200Mi
      cpu: 100m
    limits:
      memory: 300Mi
      cpu: 200m

# Web server configuration
web:
  replicaCount: 1
  name: web
  image:
    repository: querybook/querybook
    pullPolicy: IfNotPresent
    tag: latest
  service:
    serviceType: ClusterIP
    servicePort: 80
    containerPort: 10001
  resources:
    requests:
      memory: 1Gi
      cpu: 500m
    limits:
      memory: 2Gi
      cpu: 1
|
||||
# Custom initContainer to inject Keycloak auth backend
initContainers:
  - name: copy-keycloak-auth
    image: busybox:latest
    command:
      - sh
      - -c
      - cp /config/keycloak_auth.py /auth/keycloak_auth.py && chmod 644 /auth/keycloak_auth.py
    volumeMounts:
      - name: keycloak-auth-config
        mountPath: /config
      - name: auth-volume
        mountPath: /auth

# Volume mounts for main container
volumeMounts:
  - name: auth-volume
    mountPath: /opt/querybook/querybook/server/app/auth/keycloak_auth.py
    subPath: keycloak_auth.py

# Volumes
volumes:
  - name: keycloak-auth-config
    configMap:
      name: querybook-keycloak-auth
  - name: auth-volume
    emptyDir: {}

# Use external PostgreSQL (buun-stack PostgreSQL cluster)
mysql:
  enabled: false

# Redis configuration (use Helm chart's embedded Redis)
redis:
  enabled: true
  replicaCount: 1
  name: redis
  image:
    repository: redis
    pullPolicy: IfNotPresent
    tag: "7.2"
  service:
    serviceType: ClusterIP
    servicePort: 6379
  resources:
    requests:
      memory: 512Mi
      cpu: 200m
    limits:
      memory: 1Gi
      cpu: 500m

# Elasticsearch configuration (use Helm chart's embedded Elasticsearch)
elasticsearch:
  enabled: true
  replicaCount: 1
  name: elasticsearch
  image:
    repository: docker.elastic.co/elasticsearch/elasticsearch
    pullPolicy: IfNotPresent
    tag: "7.17.16"
  extraEnvs:
    - name: ES_JAVA_OPTS
      value: -Xms1g -Xmx1g
    - name: bootstrap.memory_lock
      value: 'false'
    - name: cluster.name
      value: querybook-cluster
    - name: discovery.type
      value: single-node
  service:
    serviceType: ClusterIP
    servicePort: 9200
  resources:
    requests:
      memory: 2Gi
      cpu: 500m
    limits:
      memory: 3Gi
      cpu: 1

# Ingress configuration
ingress:
  enabled: true
  ingressClassName: traefik
  annotations:
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    # WebSocket support - apply middleware for X-Forwarded-Proto header
    traefik.ingress.kubernetes.io/router.middlewares: querybook-querybook-headers@kubernetescrd
    # Sticky sessions for WebSocket connections
    traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.name: querybook-session
    # Increase timeouts for WebSocket connections (in seconds)
    traefik.ingress.kubernetes.io/service.serversTransport: querybook-transport@kubernetescrd
  path: /
  pathType: Prefix
  hosts:
    - {{ .Env.QUERYBOOK_HOST }}
  tls:
    - hosts:
        - {{ .Env.QUERYBOOK_HOST }}

# Querybook environment variables
extraEnv:
  # Public URL (required for OAuth)
  PUBLIC_URL: https://{{ .Env.QUERYBOOK_HOST }}

  # WebSocket CORS origins (required for socket.io to accept connections)
  WS_CORS_ALLOWED_ORIGINS: '["https://{{ .Env.QUERYBOOK_HOST }}"]'

  # Authentication backend (custom Keycloak OIDC implementation)
  AUTH_BACKEND: app.auth.keycloak_auth

  # OAuth configuration for Keycloak
  OAUTH_CLIENT_ID: querybook
  OAUTH_CLIENT_SECRET: {{ .Env.OAUTH_CLIENT_SECRET }}
  OAUTH_AUTHORIZATION_URL: https://{{ .Env.KEYCLOAK_HOST }}/realms/{{ .Env.KEYCLOAK_REALM }}/protocol/openid-connect/auth
  OAUTH_TOKEN_URL: https://{{ .Env.KEYCLOAK_HOST }}/realms/{{ .Env.KEYCLOAK_REALM }}/protocol/openid-connect/token
  OAUTH_USER_PROFILE: https://{{ .Env.KEYCLOAK_HOST }}/realms/{{ .Env.KEYCLOAK_REALM }}/protocol/openid-connect/userinfo

  # Session configuration
  LOGS_OUT_AFTER: "0"  # Never expire (re-login on browser close)

# Use existing secret for Flask, database, Redis, and Elasticsearch configuration
existingSecret: querybook-secret

# Node selector, affinity, and tolerations
nodeSelector: {}
affinity: {}
tolerations: []
podAnnotations: {}
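The `extraEnv` block above derives several values from three inputs: the Keycloak host, the realm, and the Querybook host. The sketch below reproduces that derivation in Python so the URL layout is easy to check. `keycloak_oauth_env` is a hypothetical helper for illustration, and the hostnames in the usage line are placeholders; the actual values are rendered by gomplate from the `.Env` variables.

```python
import json
from typing import Dict


def keycloak_oauth_env(
    keycloak_host: str, realm: str, querybook_host: str
) -> Dict[str, str]:
    """Derive the URL-valued settings from the three host/realm inputs."""
    # Keycloak serves its OIDC endpoints under this per-realm prefix.
    base = f"https://{keycloak_host}/realms/{realm}/protocol/openid-connect"
    origin = f"https://{querybook_host}"
    return {
        "PUBLIC_URL": origin,
        # The values file passes the allowed origins as a JSON-encoded list.
        "WS_CORS_ALLOWED_ORIGINS": json.dumps([origin]),
        "OAUTH_AUTHORIZATION_URL": f"{base}/auth",
        "OAUTH_TOKEN_URL": f"{base}/token",
        "OAUTH_USER_PROFILE": f"{base}/userinfo",
    }


env = keycloak_oauth_env("auth.example.com", "myrealm", "querybook.example.com")
```

`PUBLIC_URL` and the single CORS origin are built from the same string here, which mirrors the values file: both use `https://` plus `QUERYBOOK_HOST`.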
25
querybook/traefik-middleware.gomplate.yaml
Normal file
@@ -0,0 +1,25 @@
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: querybook-headers
  namespace: {{ .Env.QUERYBOOK_NAMESPACE }}
spec:
  headers:
    customRequestHeaders:
      X-Forwarded-Proto: "https"
    customResponseHeaders:
      X-Forwarded-Proto: "https"
---
apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
  name: querybook-transport
  namespace: {{ .Env.QUERYBOOK_NAMESPACE }}
spec:
  serverName: ""
  insecureSkipVerify: false
  # Timeouts for WebSocket connections
  forwardingTimeouts:
    dialTimeout: 30s
    responseHeaderTimeout: 0s  # No timeout for response headers (needed for WebSocket)
    idleConnTimeout: 0s  # No timeout for idle connections (needed for WebSocket)
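When an Ingress annotation refers to a Traefik CRD resource such as the Middleware above, Traefik's Kubernetes CRD provider expects the form `<namespace>-<name>@kubernetescrd`. The sketch below shows how the reference used in the values file is assembled. `traefik_crd_ref` is a hypothetical helper for illustration, and the namespace `querybook` is an assumption about what `QUERYBOOK_NAMESPACE` resolves to.

```python
def traefik_crd_ref(namespace: str, name: str) -> str:
    """Build a cross-provider reference to a Traefik CRD resource."""
    return f"{namespace}-{name}@kubernetescrd"


# Assumes QUERYBOOK_NAMESPACE is "querybook".
middleware_ref = traefik_crd_ref("querybook", "querybook-headers")
transport_ref = traefik_crd_ref("querybook", "querybook-transport")
```

Under that assumption `middleware_ref` matches the hardcoded `querybook-querybook-headers@kubernetescrd` annotation, so the annotation would stop resolving if the namespace were changed. If ServersTransport references follow the same convention, the `querybook-transport@kubernetescrd` annotation may need the namespace prefix as well; this is worth checking against the Traefik documentation for the version in use.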