feat(querybook): use custom image to install sqlalchemy-trino
@@ -54,8 +54,108 @@ QUERYBOOK_NAMESPACE=querybook # Kubernetes namespace
|
|||||||
QUERYBOOK_HOST=querybook.example.com # External hostname
|
QUERYBOOK_HOST=querybook.example.com # External hostname
|
||||||
KEYCLOAK_HOST=auth.example.com # Keycloak hostname
|
KEYCLOAK_HOST=auth.example.com # Keycloak hostname
|
||||||
KEYCLOAK_REALM=buunstack # Keycloak realm name
|
KEYCLOAK_REALM=buunstack # Keycloak realm name
|
||||||
|
|
||||||
|
# Optional: Use custom Docker image (for testing fixes/patches)
|
||||||
|
QUERYBOOK_CUSTOM_IMAGE=localhost:30500/querybook # Custom image repository
|
||||||
|
QUERYBOOK_CUSTOM_IMAGE_TAG=trino-metastore # Custom image tag (default: latest)
|
||||||
|
QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY=Always # Image pull policy (default: Always)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Using Custom Image
|
||||||
|
|
||||||
|
To use a custom Querybook image (e.g., with patches or fixes):
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Set environment variables
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE=localhost:30500/querybook
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE_TAG=trino-metastore
|
||||||
|
|
||||||
|
# Install or upgrade Querybook
|
||||||
|
just querybook::install
|
||||||
|
# or
|
||||||
|
just querybook::upgrade
|
||||||
|
```
|
||||||
|
|
||||||
|
**When to use custom image**:
|
||||||
|
|
||||||
|
- Testing bug fixes before they are merged upstream
|
||||||
|
- Applying patches for specific issues (e.g., datetime JSON serialization)
|
||||||
|
- Using Trino Metastore integration (requires sqlalchemy-trino)
|
||||||
|
- Using modified versions with custom features
|
||||||
|
|
||||||
|
**Custom image includes** (`trino-metastore` tag):
|
||||||
|
|
||||||
|
- Datetime JSON serialization fixes for WebSocket communication
|
||||||
|
- `sqlalchemy-trino` package for Metastore integration
|
||||||
|
|
||||||
|
**Custom image behavior** (when `QUERYBOOK_CUSTOM_IMAGE` is set):
|
||||||
|
|
||||||
|
- Pull policy: `Always` (default, override with `QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY`)
|
||||||
|
- Ensures latest image is always pulled from registry
|
||||||
|
|
||||||
|
**Default behavior** (when `QUERYBOOK_CUSTOM_IMAGE` is not set):
|
||||||
|
|
||||||
|
- Uses official image: `querybook/querybook:latest`
|
||||||
|
- Pull policy: `IfNotPresent`
|
||||||
|
- Note: Official image does not include `sqlalchemy-trino`, so Trino Metastore integration will not work
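
The selection rules above can be sketched as a small shell function. This is only an illustration of the documented defaults, not the chart's actual logic; `resolve_image` is a hypothetical name introduced here.

```bash
# Print "<repository>:<tag> <pullPolicy>" following the documented defaults.
resolve_image() {
    if [ -n "${QUERYBOOK_CUSTOM_IMAGE:-}" ]; then
        # Custom image: tag falls back to "latest", pull policy to "Always"
        echo "${QUERYBOOK_CUSTOM_IMAGE}:${QUERYBOOK_CUSTOM_IMAGE_TAG:-latest} ${QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY:-Always}"
    else
        # Official image with a fixed pull policy
        echo "querybook/querybook:latest IfNotPresent"
    fi
}
```

Calling `resolve_image` with only `QUERYBOOK_CUSTOM_IMAGE` set shows which tag and pull policy a deployment would end up with before running `just querybook::upgrade`.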
|
||||||
|
|
||||||
|
### Building Custom Image
|
||||||
|
|
||||||
|
To build a custom Querybook image with `sqlalchemy-trino` support:
|
||||||
|
|
||||||
|
1. **Clone Querybook repository**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone https://github.com/pinterest/querybook.git
|
||||||
|
cd querybook
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Create requirements/local.txt**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cat > requirements/local.txt <<EOF
|
||||||
|
# Local additional requirements for buun-stack
|
||||||
|
# SQLAlchemy dialect for Trino (required for Metastore)
|
||||||
|
sqlalchemy-trino
|
||||||
|
EOF
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Build the Docker image**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# For remote Docker host (e.g., k3s node)
|
||||||
|
DOCKER_HOST=ssh://yourdomain.com docker build \
|
||||||
|
--build-arg EXTRA_PIP_INSTALLS=extra.txt \
|
||||||
|
-t localhost:30500/querybook:trino-metastore .
|
||||||
|
|
||||||
|
# For local Docker
|
||||||
|
docker build \
|
||||||
|
--build-arg EXTRA_PIP_INSTALLS=extra.txt \
|
||||||
|
-t localhost:30500/querybook:trino-metastore .
|
||||||
|
```
|
||||||
|
|
||||||
|
4. **Push to registry**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
DOCKER_HOST=ssh://yourdomain.com docker push localhost:30500/querybook:trino-metastore
|
||||||
|
# or for local Docker
|
||||||
|
docker push localhost:30500/querybook:trino-metastore
|
||||||
|
```
|
||||||
|
|
||||||
|
5. **Deploy to Kubernetes**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE=localhost:30500/querybook
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE_TAG=trino-metastore
|
||||||
|
just querybook::upgrade
|
||||||
|
```
|
||||||
|
|
||||||
|
**Notes**:
|
||||||
|
|
||||||
|
- The Dockerfile automatically includes `requirements/local.txt` if it exists (lines 40-42)
|
||||||
|
- `EXTRA_PIP_INSTALLS=extra.txt` ensures additional dependencies are installed during build
|
||||||
|
- The custom image will have both the official Querybook packages and `sqlalchemy-trino`
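
Before deploying, it may be worth confirming that the package actually landed in the image. The helper below is a sketch: `check_trino_dialect` is a hypothetical name, and it assumes `pip` is on the `PATH` inside the Querybook image. Setting `DOCKER=echo` previews the command without running it.

```bash
# Run "pip show sqlalchemy-trino" inside the given image.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
check_trino_dialect() {
    image="$1"
    ${DOCKER:-docker} run --rm --entrypoint pip "$image" show sqlalchemy-trino
}
```

For example, `check_trino_dialect localhost:30500/querybook:trino-metastore` should print the package metadata if the install succeeded, and exit non-zero otherwise.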
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
### Access Querybook
|
### Access Querybook
|
||||||
@@ -84,27 +184,74 @@ Admin users can:
|
|||||||
1. Log in as an admin user
|
1. Log in as an admin user
|
||||||
2. Navigate to Admin → Query Engines
|
2. Navigate to Admin → Query Engines
|
||||||
3. Click "Add Query Engine"
|
3. Click "Add Query Engine"
|
||||||
4. Configure:
|
4. Configure basic settings:
|
||||||
|
|
||||||
```plain
|
```plain
|
||||||
Name: Trino
|
Name: Trino
|
||||||
Language: Trino
|
Language: Trino
|
||||||
|
Executor: Trino (not SqlAlchemy)
|
||||||
Environment: production (or your preferred environment name)
|
Environment: production (or your preferred environment name)
|
||||||
```
|
```
|
||||||
|
|
||||||
5. Navigate to Admin → Environments → [your environment]
|
5. Configure connection settings:
|
||||||
6. Add new query engine connection:
|
|
||||||
|
|
||||||
```plain
|
```plain
|
||||||
Connection String: trino://trino.example.com:443?SSL=true
|
Connection String: trino://trino.example.com:443/iceberg?SSL=true
|
||||||
Username: admin
|
Username: admin
|
||||||
Password: [from just trino::admin-password]
|
Password: [from just trino::admin-password]
|
||||||
|
Proxy_user_id: (leave empty to use admin username)
|
||||||
```
|
```
|
||||||
|
|
||||||
7. Optional: Configure additional connection parameters:
|
**Important Notes**:
|
||||||
- **Catalog**: Specify default catalog (e.g., `postgresql` or `iceberg`)
|
- **Catalog in Connection String**: Include `/iceberg` (or your catalog name) after the port
|
||||||
- **Schema**: Specify default schema
|
- With catalog: `trino://host:443/iceberg?SSL=true` → queries work without `iceberg.` prefix
|
||||||
- **Proxy_user_id**: Leave empty or set to enable user impersonation
|
- Without catalog: `trino://host:443?SSL=true` → queries fail with "Catalog 'hive' not found"
|
||||||
|
- **Proxy_user_id**: Leave empty (defaults to Username field = admin)
|
||||||
|
- For user impersonation, configure Trino access control separately
|
||||||
|
|
||||||
|
6. Optional: Link to Metastore for table autocompletion:
|
||||||
|
- **Metastore**: Select created Metastore (see Metastore Configuration section below)
|
||||||
|
- Enables autocomplete for table and column names in query editor
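
Because a missing catalog is easy to overlook, the connection string format described above can be built and checked with two small shell helpers. Both names (`trino_conn_string`, `has_catalog`) are hypothetical and only illustrate the format. The check is deliberately simple and assumes no `/` appears in the credentials.

```bash
# Build a connection string in the form used above; the catalog is optional.
trino_conn_string() {
    host="$1"; port="$2"; catalog="${3:-}"
    if [ -n "$catalog" ]; then
        echo "trino://${host}:${port}/${catalog}?SSL=true"
    else
        echo "trino://${host}:${port}?SSL=true"
    fi
}

# Succeed if the connection string names a catalog after the port.
has_catalog() {
    rest="${1#trino://}"     # drop the scheme
    rest="${rest%%\?*}"      # drop the query string
    case "$rest" in
        */?*) return 0 ;;    # at least one character follows "host:port/"
        *)    return 1 ;;
    esac
}
```

`has_catalog "trino://trino.example.com:443?SSL=true"` fails, which corresponds to the "Catalog 'hive' not found" case described above.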
|
||||||
|
|
||||||
|
### Configure Metastore (Optional but Recommended)
|
||||||
|
|
||||||
|
Metastore enables table/column autocompletion and provides a browsable table catalog.
|
||||||
|
|
||||||
|
**Prerequisites**: Custom image with `sqlalchemy-trino` (official image does not include this package)
|
||||||
|
|
||||||
|
1. Navigate to Admin → Metastores
|
||||||
|
2. Click "Create Metastore"
|
||||||
|
3. Configure:
|
||||||
|
|
||||||
|
```plain
|
||||||
|
Name: Trino Iceberg
|
||||||
|
Metastore Loader: SqlAlchemyMetastoreLoader
|
||||||
|
Connection String: trino://admin:[password]@trino.example.com:443/iceberg?SSL=true
|
||||||
|
Acct Info (Key-Value):
|
||||||
|
http_scheme = https
|
||||||
|
Impersonate: OFF (recommended for shared table catalog)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Important Notes**:
|
||||||
|
- Include authentication in Connection String: `admin:[password]@host`
|
||||||
|
- Include catalog in Connection String: `/iceberg` after port
|
||||||
|
- `http_scheme` must be set to `https` in Acct Info
|
||||||
|
- Keep Impersonate OFF unless you need per-user table filtering
|
||||||
|
|
||||||
|
4. Click "Run Task" to sync table metadata
|
||||||
|
5. Verify in Admin → Metastores that "Last Synced" timestamp is updated
|
||||||
|
6. Check left sidebar "Tables" for table list
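
Since the Metastore connection string embeds the password, reserved characters such as `@`, `:` or `/` in the password generally need to be percent-encoded for the URL to parse correctly. The sketch below assembles the string in the form shown above; `urlencode` and `metastore_conn_string` are hypothetical helpers, and the encoder handles ASCII input only.

```bash
# Percent-encode a string for use in a URL (ASCII input only in this sketch).
urlencode() {
    s="$1"
    out=""
    while [ -n "$s" ]; do
        c="${s%"${s#?}"}"    # first character of s
        s="${s#?}"           # remainder of s
        case "$c" in
            [A-Za-z0-9._~-]) out="${out}${c}" ;;                # unreserved: keep
            *) out="${out}$(printf '%%%02X' "'$c")" ;;          # everything else: %XX
        esac
    done
    printf '%s\n' "$out"
}

# Assemble the Metastore connection string with an encoded password.
metastore_conn_string() {
    user="$1"; password="$2"; host="$3"; port="$4"; catalog="$5"
    echo "trino://${user}:$(urlencode "$password")@${host}:${port}/${catalog}?SSL=true"
}
```

Assuming `just trino::admin-password` prints only the password, a call such as `metastore_conn_string admin "$(just trino::admin-password)" trino.example.com 443 iceberg` would produce the value for the Connection String field.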
|
||||||
|
|
||||||
|
**Scheduled Updates** (recommended):
|
||||||
|
|
||||||
|
- Navigate to Admin → Metastores → [your metastore] → Schedule
|
||||||
|
- Set cron expression: `0 */6 * * *` (sync every 6 hours)
|
||||||
|
|
||||||
|
**Usage**:
|
||||||
|
|
||||||
|
- **Tables Sidebar**: Browse schemas and tables, view column details
|
||||||
|
- **Autocomplete**: Type table/column names in query editor, press Tab or Escape
|
||||||
|
- **Search**: Use search box in Tables sidebar to find tables by name
|
||||||
|
|
||||||
### User Impersonation
|
### User Impersonation
|
||||||
|
|
||||||
@@ -124,7 +271,7 @@ Querybook connects to Trino as `admin` but executes queries as the logged-in use
|
|||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
```
|
```plain
|
||||||
External Users
|
External Users
|
||||||
↓
|
↓
|
||||||
Cloudflare Tunnel (HTTPS)
|
Cloudflare Tunnel (HTTPS)
|
||||||
@@ -244,6 +391,34 @@ kubectl get pods -n querybook
|
|||||||
- Check database exists: `just postgres::list-databases | grep querybook`
|
- Check database exists: `just postgres::list-databases | grep querybook`
|
||||||
- Verify secret exists: `kubectl get secret querybook-config-secret -n querybook`
|
- Verify secret exists: `kubectl get secret querybook-config-secret -n querybook`
|
||||||
|
|
||||||
|
### Metastore Issues
|
||||||
|
|
||||||
|
- **Tables sidebar is empty**
|
||||||
|
- Check Admin → Metastores for "Last Synced" timestamp
|
||||||
|
- Click "Run Task" to manually sync
|
||||||
|
- Verify Metastore is linked to Query Engine (Admin → Query Engines → Metastore field)
|
||||||
|
- Check worker logs: `kubectl logs -n querybook deployment/worker --tail=100 | grep metastore`
|
||||||
|
|
||||||
|
- **Error: "Can't load plugin: sqlalchemy.dialects:trino"**
|
||||||
|
- Official Querybook image does not include `sqlalchemy-trino`
|
||||||
|
- Use custom image with `QUERYBOOK_CUSTOM_IMAGE_TAG=trino-metastore`
|
||||||
|
- See "Using Custom Image" section above
|
||||||
|
|
||||||
|
- **Error: "Connection.\_\_init\_\_() got an unexpected keyword argument 'password'"**
|
||||||
|
- Do not use `password` key in Acct Info
|
||||||
|
- Embed authentication in Connection String: `trino://admin:[password]@host:port/catalog?SSL=true`
|
||||||
|
- Set `http_scheme = https` in Acct Info
|
||||||
|
|
||||||
|
- **Only system.* schemas visible**
|
||||||
|
- Connection String is missing catalog specification
|
||||||
|
- Add `/iceberg` (or your catalog) after port: `trino://host:443/iceberg?SSL=true`
|
||||||
|
|
||||||
|
- **Autocomplete not working**
|
||||||
|
- Verify Query Engine has Metastore linked (Admin → Query Engines → Metastore field)
|
||||||
|
- Refresh DataDoc page (F5) after linking Metastore
|
||||||
|
- Check Environment matches between DataDoc and Query Engine
|
||||||
|
- Try Tab or Escape key instead of Ctrl+Space (macOS shortcut conflict)
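
For the `Can't load plugin: sqlalchemy.dialects:trino` case above, one way to confirm which image the pods are actually running is to list pod names alongside their container images. This is a sketch: `list_pod_images` is a hypothetical helper, it reads only the first container of each pod, and `KUBECTL=echo` previews the command without contacting the cluster.

```bash
# List each pod in the Querybook namespace with the image of its first container.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
list_pod_images() {
    ${KUBECTL:-kubectl} get pods -n "${QUERYBOOK_NAMESPACE:-querybook}" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'
}
```

If the output still shows `querybook/querybook:latest`, the custom image variables were most likely not set when `just querybook::upgrade` ran.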
|
||||||
|
|
||||||
## References
|
## References
|
||||||
|
|
||||||
- [Querybook Documentation](https://www.querybook.org/)
|
- [Querybook Documentation](https://www.querybook.org/)
|
||||||
|
|||||||
@@ -4,6 +4,9 @@ export QUERYBOOK_NAMESPACE := env("QUERYBOOK_NAMESPACE", "querybook")
|
|||||||
export QUERYBOOK_HOST := env("QUERYBOOK_HOST", "")
|
export QUERYBOOK_HOST := env("QUERYBOOK_HOST", "")
|
||||||
export QUERYBOOK_CHART_REPO := env("QUERYBOOK_CHART_REPO", "https://github.com/pinterest/querybook")
|
export QUERYBOOK_CHART_REPO := env("QUERYBOOK_CHART_REPO", "https://github.com/pinterest/querybook")
|
||||||
export QUERYBOOK_CHART_PATH := env("QUERYBOOK_CHART_PATH", "helm")
|
export QUERYBOOK_CHART_PATH := env("QUERYBOOK_CHART_PATH", "helm")
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE := env("QUERYBOOK_CUSTOM_IMAGE", "")
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE_TAG := env("QUERYBOOK_CUSTOM_IMAGE_TAG", "")
|
||||||
|
export QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY := env("QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY", "Always")
|
||||||
export EXTERNAL_SECRETS_NAMESPACE := env("EXTERNAL_SECRETS_NAMESPACE", "external-secrets")
|
export EXTERNAL_SECRETS_NAMESPACE := env("EXTERNAL_SECRETS_NAMESPACE", "external-secrets")
|
||||||
export K8S_VAULT_NAMESPACE := env("K8S_VAULT_NAMESPACE", "vault")
|
export K8S_VAULT_NAMESPACE := env("K8S_VAULT_NAMESPACE", "vault")
|
||||||
export KEYCLOAK_REALM := env("KEYCLOAK_REALM", "buunstack")
|
export KEYCLOAK_REALM := env("KEYCLOAK_REALM", "buunstack")
|
||||||
|
|||||||
@@ -6,9 +6,15 @@ worker:
|
|||||||
replicaCount: 1
|
replicaCount: 1
|
||||||
name: worker
|
name: worker
|
||||||
image:
|
image:
|
||||||
|
{{- if .Env.QUERYBOOK_CUSTOM_IMAGE }}
|
||||||
|
repository: {{ .Env.QUERYBOOK_CUSTOM_IMAGE }}
|
||||||
|
pullPolicy: {{ .Env.QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY | default "Always" }}
|
||||||
|
tag: {{ .Env.QUERYBOOK_CUSTOM_IMAGE_TAG | default "latest" }}
|
||||||
|
{{- else }}
|
||||||
repository: querybook/querybook
|
repository: querybook/querybook
|
||||||
pullPolicy: IfNotPresent
|
pullPolicy: IfNotPresent
|
||||||
tag: latest
|
tag: latest
|
||||||
|
{{- end }}
|
||||||
resources:
|
resources:
|
||||||
requests:
|
requests:
|
||||||
memory: 1Gi
|
memory: 1Gi
|
||||||
@@ -22,9 +28,15 @@ scheduler:
|
|||||||
replicaCount: 1
|
replicaCount: 1
|
||||||
name: scheduler
|
name: scheduler
|
||||||
image:
|
image:
|
||||||
|
{{- if .Env.QUERYBOOK_CUSTOM_IMAGE }}
|
||||||
|
repository: {{ .Env.QUERYBOOK_CUSTOM_IMAGE }}
|
||||||
|
pullPolicy: {{ .Env.QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY | default "Always" }}
|
||||||
|
tag: {{ .Env.QUERYBOOK_CUSTOM_IMAGE_TAG | default "latest" }}
|
||||||
|
{{- else }}
|
||||||
repository: querybook/querybook
|
repository: querybook/querybook
|
||||||
pullPolicy: IfNotPresent
|
pullPolicy: IfNotPresent
|
||||||
tag: latest
|
tag: latest
|
||||||
|
{{- end }}
|
||||||
resources:
|
resources:
|
||||||
requests:
|
requests:
|
||||||
memory: 200Mi
|
memory: 200Mi
|
||||||
@@ -38,9 +50,15 @@ web:
|
|||||||
replicaCount: 1
|
replicaCount: 1
|
||||||
name: web
|
name: web
|
||||||
image:
|
image:
|
||||||
|
{{- if .Env.QUERYBOOK_CUSTOM_IMAGE }}
|
||||||
|
repository: {{ .Env.QUERYBOOK_CUSTOM_IMAGE }}
|
||||||
|
pullPolicy: {{ .Env.QUERYBOOK_CUSTOM_IMAGE_PULL_POLICY | default "Always" }}
|
||||||
|
tag: {{ .Env.QUERYBOOK_CUSTOM_IMAGE_TAG | default "latest" }}
|
||||||
|
{{- else }}
|
||||||
repository: querybook/querybook
|
repository: querybook/querybook
|
||||||
pullPolicy: IfNotPresent
|
pullPolicy: IfNotPresent
|
||||||
tag: latest
|
tag: latest
|
||||||
|
{{- end }}
|
||||||
service:
|
service:
|
||||||
serviceType: ClusterIP
|
serviceType: ClusterIP
|
||||||
servicePort: 80
|
servicePort: 80
|
||||||
|
|||||||