buun-stack/trino/README.md
2025-10-15 20:42:51 +09:00

Trino

Fast distributed SQL query engine for big data analytics with Keycloak authentication.

Overview

This module deploys Trino using the official Helm chart with:

  • Keycloak OAuth2 authentication for Web UI access
  • Password authentication for JDBC clients (Metabase, etc.)
  • PostgreSQL catalog for querying PostgreSQL databases
  • Iceberg catalog with Lakekeeper (optional)
  • TPCH catalog with sample data for testing

Prerequisites

  • Kubernetes cluster (k3s)
  • Keycloak installed and configured
  • PostgreSQL cluster (CloudNativePG)
  • MinIO (optional, for Iceberg catalog)
  • External Secrets Operator (optional, for Vault integration)

Installation

Basic Installation

just trino::install

You will be prompted for:

  1. Trino host (FQDN): e.g., trino.example.com
  2. PostgreSQL catalog setup: Recommended for production use
  3. MinIO storage setup: Optional, for Iceberg/Hive catalogs

What Gets Installed

  • Trino coordinator (1 instance)
  • Trino workers (2 instances by default)
  • OAuth2 client in Keycloak
  • Password authentication for JDBC access
  • PostgreSQL catalog (if selected)
  • Iceberg catalog with Lakekeeper (if MinIO selected)
  • TPCH catalog with sample data

Configuration

Environment variables (set in .env.local, or override in the shell environment):

TRINO_NAMESPACE=trino                   # Kubernetes namespace
TRINO_CHART_VERSION=1.41.0              # Helm chart version
TRINO_IMAGE_TAG=477                     # Trino version
TRINO_COORDINATOR_MEMORY=4Gi            # Coordinator memory
TRINO_COORDINATOR_CPU=2                 # Coordinator CPU
TRINO_WORKER_MEMORY=4Gi                 # Worker memory
TRINO_WORKER_CPU=2                      # Worker CPU
TRINO_WORKER_COUNT=2                    # Number of workers
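
The memory settings add up across pods, so it helps to estimate the cluster-wide total before changing them. The following is a minimal sketch, assuming each value is a whole number of Gi and is applied per pod; the helper name trino_total_memory_gi is hypothetical and not part of this module.

```shell
#!/bin/sh
# Sketch only: estimate the total memory configured for all Trino pods.
# Variable names and defaults mirror the list above. The function name is
# hypothetical, and values are assumed to be whole numbers with a "Gi" suffix.

trino_total_memory_gi() {
  tm_coordinator="${TRINO_COORDINATOR_MEMORY:-4Gi}"
  tm_worker="${TRINO_WORKER_MEMORY:-4Gi}"
  tm_count="${TRINO_WORKER_COUNT:-2}"
  # Strip the "Gi" suffix so the shell can do integer arithmetic:
  # one coordinator plus tm_count workers.
  echo "$(( ${tm_coordinator%Gi} + ${tm_worker%Gi} * tm_count ))"
}
```

With the defaults this prints 12 (4 Gi for the coordinator plus 2 × 4 Gi for the workers). Compare the result with the memory available on your nodes before raising TRINO_WORKER_COUNT.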

Usage

Web UI Access

  1. Navigate to https://your-trino-host/
  2. Click "Sign in" to authenticate with Keycloak
  3. Execute queries in the Web UI

Get Admin Password

For JDBC/Metabase connections:

just trino::admin-password

Returns the password for the admin user.

Metabase Integration

Important: Trino requires TLS/SSL for password authentication. You must use the external hostname (not the internal Kubernetes service name).

  1. In Metabase, go to Admin → Databases → Add database

  2. Select Database type: Starburst (the Metabase driver used for Trino)

  3. Configure connection:

    Host: your-trino-host (e.g., trino.example.com)
    Port: 443
    Username: admin
    Password: [from just trino::admin-password]
    Catalog: postgresql
    SSL: Yes
    

Note: Do NOT use internal Kubernetes hostnames like trino.trino.svc.cluster.local as they do not have valid TLS certificates for password authentication.
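
For JDBC clients other than Metabase, the same settings can be expressed as a Trino JDBC URL. This is a minimal sketch: the function name trino_jdbc_url is hypothetical, and port 443 and the default catalog postgresql are taken from the connection settings above.

```shell
#!/bin/sh
# Sketch only: build a Trino JDBC URL from the external hostname.
# trino_jdbc_url is a hypothetical helper, not part of this module.

trino_jdbc_url() {
  ju_host="$1"                    # external FQDN, e.g. trino.example.com
  ju_catalog="${2:-postgresql}"   # default catalog, as in the Metabase setup
  # SSL=true enables TLS, which password authentication requires.
  printf 'jdbc:trino://%s:443/%s?SSL=true\n' "$ju_host" "$ju_catalog"
}
```

For example, `trino_jdbc_url trino.example.com` prints `jdbc:trino://trino.example.com:443/postgresql?SSL=true`. Supply the username admin and the password from just trino::admin-password through the client's own credential fields rather than in the URL.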

Example Queries

Query TPCH sample data:

SELECT * FROM tpch.tiny.customer LIMIT 10;

Query PostgreSQL (list the tables in the public schema):

SELECT table_name FROM postgresql.information_schema.tables WHERE table_schema = 'public';

Show all catalogs:

SHOW CATALOGS;

Show schemas in a catalog:

SHOW SCHEMAS FROM postgresql;
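
The queries above address tables by a three-part name: catalog, schema, table. As a rough sketch, a small shell helper can assemble such a query for scripting. The function name trino_sample_query is hypothetical.

```shell
#!/bin/sh
# Sketch only: print a sample query using Trino's catalog.schema.table naming.
# trino_sample_query is a hypothetical helper, not part of this module.

trino_sample_query() {
  sq_catalog="$1"      # e.g. tpch
  sq_schema="$2"       # e.g. tiny
  sq_table="$3"        # e.g. customer
  sq_limit="${4:-10}"  # row limit, defaults to 10
  printf 'SELECT * FROM %s.%s.%s LIMIT %s;\n' \
    "$sq_catalog" "$sq_schema" "$sq_table" "$sq_limit"
}
```

`trino_sample_query tpch tiny customer` prints the first example query from this section. Assuming the Trino CLI is installed locally, the output could be passed to it with `trino --server https://your-trino-host --user admin --password --execute "$(trino_sample_query tpch tiny customer)"`.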

Catalogs

TPCH (Always Available)

Sample TPC-H benchmark data for testing:

  • tpch.tiny.* - Small dataset (scale factor 0.01)
  • tpch.sf1.* - 1GB dataset (scale factor 1)

Tables: customer, orders, lineitem, part, partsupp, supplier, nation, region

PostgreSQL

Queries your CloudNativePG cluster:

  • Catalog: postgresql
  • Default schema: public
  • Database: trino

Iceberg (Optional)

Queries Iceberg tables via Lakekeeper:

  • Catalog: iceberg
  • Storage: MinIO S3-compatible

Management

Upgrade Trino

just trino::upgrade

Updates the Helm release with the current configuration.

Uninstall

# Keep PostgreSQL database
just trino::uninstall false

# Delete PostgreSQL database too
just trino::uninstall true

Cleanup All Resources

just trino::cleanup

Removes:

  • PostgreSQL database
  • Vault secrets
  • Keycloak OAuth client

Authentication

Web UI (OAuth2)

  • Uses Keycloak for authentication
  • Requires valid user in the configured realm
  • Automatic redirect to Keycloak login

JDBC/Metabase (Password)

  • Username: admin
  • Password: Retrieved via just trino::admin-password
  • Stored in Vault at trino/password

Architecture

External Users
      ↓
Cloudflare Tunnel (HTTPS)
      ↓
Traefik Ingress
      ↓
Trino Coordinator (HTTP:8080)
      ↓
Trino Workers (HTTP:8080)
      ↓
Data Sources:
  - PostgreSQL (CloudNativePG)
  - MinIO (S3)
  - Iceberg (Lakekeeper)

Troubleshooting

Check Pod Status

kubectl get pods -n trino
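
When the pod list is long, it can help to print only the pods that are not healthy. The filter below is a minimal sketch that reads the output of `kubectl get pods --no-headers` on stdin and assumes the default column order (NAME, READY, STATUS, ...). The function name trino_unready_pods is hypothetical.

```shell
#!/bin/sh
# Sketch only: list pods whose STATUS is neither Running nor Completed.
# Expects `kubectl get pods --no-headers` output on stdin, where the
# third whitespace-separated column is STATUS.
# trino_unready_pods is a hypothetical helper, not part of this module.

trino_unready_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " (" $3 ")" }'
}
```

Usage: `kubectl get pods -n trino --no-headers | trino_unready_pods`. No output means every pod reports Running or Completed.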

View Coordinator Logs

kubectl logs -n trino -l app.kubernetes.io/component=coordinator --tail=100

View Worker Logs

kubectl logs -n trino -l app.kubernetes.io/component=worker --tail=100
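
Coordinator and worker logs are mostly INFO lines, so a filter for warnings and errors can shorten the search. This is a minimal sketch, assuming the log level appears as a standalone WARN or ERROR token separated by whitespace; the function name trino_log_problems is hypothetical.

```shell
#!/bin/sh
# Sketch only: keep log lines whose level token is WARN or ERROR.
# Reads log text on stdin. Assumes the level is a whitespace-delimited token.
# trino_log_problems is a hypothetical helper, not part of this module.

trino_log_problems() {
  # grep exits non-zero when nothing matches; "|| true" keeps the helper
  # usable in scripts that run with "set -e".
  grep -E '(^|[[:space:]])(WARN|ERROR)([[:space:]]|$)' || true
}
```

Usage: `kubectl logs -n trino -l app.kubernetes.io/component=worker --tail=100 | trino_log_problems`.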

Test Authentication

# From inside coordinator pod
kubectl exec -n trino deployment/trino-coordinator -- \
  curl -u admin:PASSWORD http://localhost:8080/v1/info

Common Issues

Metabase Sync Fails

  • Ensure catalog is specified in connection settings (e.g., postgresql)
  • Check Trino coordinator logs for errors
  • Verify PostgreSQL/Iceberg connectivity

OAuth2 Login Fails

  • Verify Keycloak OAuth client exists: just keycloak::list-clients
  • Check redirect URL matches Trino host
  • Ensure Keycloak is accessible from Trino pods

Password Authentication Fails

  • Retrieve current password: just trino::admin-password
  • Ensure SSL/TLS is enabled in JDBC URL
  • For internal testing, HTTP is supported via http-server.authentication.allow-insecure-over-http=true
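
The two most common causes listed here, an internal hostname and a missing TLS flag, can be checked before opening a connection. The following is a minimal sketch that inspects a JDBC URL string only and does not contact Trino; the function name trino_check_jdbc_url is hypothetical.

```shell
#!/bin/sh
# Sketch only: sanity-check a Trino JDBC URL for password authentication.
# Inspects the string; it does not connect to the server.
# trino_check_jdbc_url is a hypothetical helper, not part of this module.

trino_check_jdbc_url() {
  cu_url="$1"
  case "$cu_url" in
    *.svc.cluster.local*)
      # Internal service names lack a valid TLS certificate.
      echo "internal Kubernetes hostname: use the external hostname"
      return 1 ;;
  esac
  case "$cu_url" in
    *SSL=true*)
      echo "ok"
      return 0 ;;
    *)
      echo "SSL=true is missing from the JDBC URL"
      return 1 ;;
  esac
}
```

For example, `trino_check_jdbc_url 'jdbc:trino://trino.example.com:443/postgresql?SSL=true'` prints ok, while a URL pointing at trino.trino.svc.cluster.local is rejected with a non-zero exit status.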
