feat(goldilocks): install Goldilocks and VPA

2025-11-10 14:17:53 +09:00
parent 189a376511
commit f429720617
9 changed files with 993 additions and 0 deletions
--- a/vpa/README.md
+++ b/vpa/README.md
@@ -0,0 +1,238 @@
+# Vertical Pod Autoscaler (VPA)
+
+Kubernetes resource monitoring and recommendation system:
+
+- **Monitoring-only mode**: Observes workloads without automatic scaling
+- **Prometheus integration**: The recommender loads usage history from Prometheus (`storage: prometheus`); the chart's bundled metrics-server is not installed
+- **Resource recommendations**: Generates CPU and memory suggestions based on actual usage
+- **Goldilocks integration**: Works with Goldilocks dashboard for visualization
+- **Non-intrusive**: Does not modify running workloads
+
+## Important Note
+
+**This VPA installation is configured for monitoring and recommendation only**:
+
+- ✅ **Recommender**: Enabled - Analyzes workload metrics and generates recommendations
+- ❌ **Updater**: Disabled - Does NOT automatically apply recommendations to pods
+- ❌ **Admission Controller**: Disabled - Does NOT modify pod resources at creation time
+
+This configuration ensures VPA observes your workloads without affecting them. You can review recommendations and manually adjust resource settings.
+
+## Prerequisites
+
+- Kubernetes cluster (k3s)
+- Prometheus (kube-prometheus-stack) installed
+
+In this setup the VPA recommender reads historical metrics from Prometheus, so install Prometheus first:
+
+```bash
+just prometheus::install
+```
+
+## Installation
+
+```bash
+just vpa::install
+```
+
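+The install recipe first checks that the `kube-prometheus-stack` Helm release exists in `PROMETHEUS_NAMESPACE` and aborts if it does not. It then renders `values.yaml` from `values.gomplate.yaml` and configures the recommender to use `PROMETHEUS_ADDRESS` as its metrics source.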
+
+## Configuration
+
+Environment variables (set in `.env.local` or override in the shell environment), shown with their defaults:
+
+```bash
+VPA_NAMESPACE=vpa                                                   # VPA namespace
+PROMETHEUS_NAMESPACE=monitoring                                     # Prometheus namespace
+PROMETHEUS_ADDRESS=http://kube-prometheus-stack-prometheus.monitoring.svc:9090  # Prometheus URL
+```
+
+## Usage
+
+### View VPA Status
+
+```bash
+just vpa::status
+```
+
+### View Recommender Logs
+
+```bash
+just vpa::logs-recommender
+```
+
+### View VPA Resources
+
+List all VPA resources across namespaces:
+
+```bash
+kubectl get vpa -A
+```
+
+View specific VPA recommendations:
+
+```bash
+kubectl describe vpa <vpa-name> -n <namespace>
+```
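+
+For orientation, the `Status` section of that output mirrors `status.recommendation` on the VPA object. The sketch below shows its general shape as YAML; the container name and every quantity are illustrative placeholders, not output from a real cluster:
+
+```yaml
+status:
+  recommendation:
+    containerRecommendations:
+      - containerName: my-app   # hypothetical container name
+        lowerBound:             # lower end of the recommended range
+          cpu: 25m
+          memory: 128Mi
+        target:                 # value the recommender suggests for requests
+          cpu: 50m
+          memory: 256Mi
+        uncappedTarget:         # target before any min/max policy is applied
+          cpu: 50m
+          memory: 256Mi
+        upperBound:             # upper end of the recommended range
+          cpu: 200m
+          memory: 512Mi
+```
+
+`target` is usually the value to compare against the workload's current requests.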
+
+Get recommendation in JSON format:
+
+```bash
+kubectl get vpa <vpa-name> -n <namespace> -o jsonpath='{.status.recommendation}' | jq
+```
+
+### Manual VPA Resource Creation
+
+Create a VPA resource for monitoring:
+
+```yaml
+apiVersion: autoscaling.k8s.io/v1
+kind: VerticalPodAutoscaler
+metadata:
+  name: my-app-vpa
+  namespace: default
+spec:
+  targetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: my-app
+  updatePolicy:
+    updateMode: "Off"  # Monitoring only
+```
+
+Apply with:
+
+```bash
+kubectl apply -f vpa-resource.yaml
+```
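+
+Recommendations can optionally be bounded per container with `resourcePolicy`. A minimal sketch that extends the manifest above; the bounds chosen here are arbitrary examples, not values used by this repository:
+
+```yaml
+apiVersion: autoscaling.k8s.io/v1
+kind: VerticalPodAutoscaler
+metadata:
+  name: my-app-vpa
+  namespace: default
+spec:
+  targetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: my-app
+  updatePolicy:
+    updateMode: "Off"  # Monitoring only
+  resourcePolicy:
+    containerPolicies:
+      - containerName: "*"  # wildcard: applies to every container in the pod
+        controlledResources: ["cpu", "memory"]
+        minAllowed:         # recommendations are not reported below this (example values)
+          cpu: 25m
+          memory: 128Mi
+        maxAllowed:         # recommendations are not reported above this (example values)
+          cpu: "1"
+          memory: 1Gi
+```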
+
+## Integration with Goldilocks
+
+VPA alone provides raw recommendations through kubectl commands. For a user-friendly dashboard experience, use Goldilocks:
+
+```bash
+# Install Goldilocks
+just goldilocks::install
+
+# Enable monitoring for specific namespaces
+just goldilocks::enable-namespace <namespace>
+```
+
+Goldilocks automatically creates VPA resources for all workloads in labeled namespaces and presents recommendations in a web dashboard.
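+
+The Troubleshooting section lists `goldilocks.fairwinds.com/enabled=true` as the label Goldilocks looks for. Assuming `enable-namespace` does nothing beyond setting that label (the recipe is not shown here), a declarative equivalent would be a Namespace manifest such as the following, where `my-namespace` is a placeholder:
+
+```yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: my-namespace  # placeholder namespace name
+  labels:
+    goldilocks.fairwinds.com/enabled: "true"  # label values must be strings, hence the quotes
+```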
+
+## Enabling Automatic Scaling
+
+To enable automatic pod resource updates, change the `updater` and `admissionController` sections of `values.gomplate.yaml`:
+
+```yaml
+updater:
+  enabled: true
+  replicaCount: 1
+  resources:
+    requests:
+      cpu: 50m
+      memory: 500Mi
+    limits:
+      cpu: 200m
+      memory: 1Gi
+  podMonitor:
+    enabled: true
+
+admissionController:
+  enabled: true
+  replicaCount: 1
+  generateCertificate: true
+  mutatingWebhookConfiguration:
+    failurePolicy: Ignore
+  resources:
+    requests:
+      cpu: 50m
+      memory: 200Mi
+    limits:
+      cpu: 200m
+      memory: 500Mi
+```
+
+Then reinstall:
+
+```bash
+just vpa::install
+```
+
+⚠️ **Warning**: Enabling the updater and admission controller lets VPA modify pod resources automatically for any VPA resource whose `updateMode` is not `Off`. Test thoroughly before enabling in production.
+
+## VPA Update Modes
+
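+VPA supports the following update modes (configured per VPA resource via `spec.updatePolicy.updateMode`):
+
+- **Off** (monitoring only, the mode used in this setup): Generates recommendations but does not apply them
+- **Initial**: Applies recommendations only when pods are created
+- **Recreate**: Applies recommendations at pod creation and evicts running pods whose requests differ significantly from the recommendation
+- **Auto**: Applies recommendations automatically; currently this behaves like `Recreate`, evicting and recreating pods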
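+
+In `Off` mode, acting on a recommendation is a manual edit to the workload. A sketch, assuming a hypothetical Deployment `my-app` and illustrative numbers; in practice the requests would be taken from the `target` values VPA reports:
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: my-app
+  namespace: default
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: my-app
+  template:
+    metadata:
+      labels:
+        app: my-app
+    spec:
+      containers:
+        - name: my-app
+          image: registry.example.com/my-app:1.0.0  # placeholder image
+          resources:
+            requests:
+              cpu: 50m       # copied from target.cpu (illustrative)
+              memory: 256Mi  # copied from target.memory (illustrative)
+            limits:
+              memory: 512Mi  # limits are a separate decision; VPA recommendations are expressed as requests
+```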
+
+## Management
+
+### Uninstall
+
+```bash
+just vpa::uninstall
+```
+
+This removes:
+
+- Helm release
+- VPA CRDs
+- Namespace
+
+## Troubleshooting
+
+### Recommender Not Starting
+
+Check the recommender logs for Prometheus connection errors:
+
+```bash
+just vpa::logs-recommender
+```
+
+Verify Prometheus is running:
+
+```bash
+kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
+```
+
+### No Recommendations Generated
+
+VPA requires workload metrics over time:
+
+- Minimum: A few minutes of runtime
+- Recommended: 24+ hours for accurate recommendations
+
+Verify workload is running and generating metrics:
+
+```bash
+kubectl get pods -n <namespace>
+kubectl top pods -n <namespace>
+```
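+
+The VPA object's `status.conditions` can also help here. Once the recommender has produced a recommendation, the object typically carries a `RecommendationProvided` condition similar to this sketch (other fields omitted):
+
+```yaml
+status:
+  conditions:
+    - type: RecommendationProvided
+      status: "True"  # "False" or absent suggests no recommendation exists yet
+```
+
+If the condition is missing or `"False"`, the recommender has likely not gathered enough data yet or could not match any pods for the target.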
+
+### VPA Resource Not Created
+
+For Goldilocks-managed VPA resources, ensure:
+
+1. Namespace has label: `goldilocks.fairwinds.com/enabled=true`
+2. Workload is managed by a controller (Deployment, StatefulSet, etc.)
+3. Goldilocks controller is running: `kubectl get pods -n goldilocks`
+
+### Check VPA Components
+
+```bash
+kubectl get pods -n vpa
+```
+
+With the updater and admission controller disabled, only the recommender pod should be listed:
+
+- `vpa-recommender-*`: Running
+
+## References
+
+- [Kubernetes VPA Documentation](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler)
+- [Fairwinds VPA Helm Chart](https://github.com/FairwindsOps/charts/tree/master/stable/vpa)
+- [VPA Design Proposals](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md)
--- a/vpa/justfile
+++ b/vpa/justfile
@@ -0,0 +1,89 @@
+set fallback := true
+
+export VPA_NAMESPACE := env("VPA_NAMESPACE", "vpa")
+export PROMETHEUS_NAMESPACE := env("PROMETHEUS_NAMESPACE", "monitoring")
+export PROMETHEUS_ADDRESS := env("PROMETHEUS_ADDRESS", "http://kube-prometheus-stack-prometheus." + PROMETHEUS_NAMESPACE + ".svc:9090")
+
+[private]
+default:
+    @just --list --unsorted --list-submodules
+
+# Add Helm repository
+add-helm-repo:
+    helm repo add fairwinds-stable https://charts.fairwinds.com/stable
+    helm repo update
+
+# Remove Helm repository
+remove-helm-repo:
+    helm repo remove fairwinds-stable
+
+# Create namespace
+create-namespace:
+    @kubectl get namespace ${VPA_NAMESPACE} >/dev/null 2>&1 || \
+        kubectl create namespace ${VPA_NAMESPACE}
+
+# Delete namespace
+delete-namespace:
+    @kubectl delete namespace ${VPA_NAMESPACE} --ignore-not-found
+
+# Install Vertical Pod Autoscaler
+install:
+    #!/bin/bash
+    set -euo pipefail
+
+    # Check if Prometheus is installed
+    if ! helm status kube-prometheus-stack -n ${PROMETHEUS_NAMESPACE} &>/dev/null; then
+        echo "Error: Prometheus (kube-prometheus-stack) is not installed."
+        echo "Please install Prometheus first using: just prometheus::install"
+        exit 1
+    fi
+
+    just add-helm-repo
+    just create-namespace
+
+    # Generate values.yaml from template
+    gomplate -f values.gomplate.yaml -o values.yaml
+
+    # Install VPA with Helm
+    helm upgrade --install vpa fairwinds-stable/vpa --namespace ${VPA_NAMESPACE} \
+        --values values.yaml --wait
+
+    echo "VPA installed successfully in namespace: ${VPA_NAMESPACE}"
+    echo ""
+    echo "To verify installation:"
+    echo "  kubectl get pods -n ${VPA_NAMESPACE}"
+    echo "  kubectl get vpa -A"
+
+# Uninstall Vertical Pod Autoscaler
+uninstall:
+    #!/bin/bash
+    set -euo pipefail
+
+    if ! helm status vpa -n ${VPA_NAMESPACE} &>/dev/null; then
+        echo "VPA is not installed."
+        exit 0
+    fi
+
+    if command -v gum &>/dev/null; then
+        if ! gum confirm "Are you sure you want to uninstall VPA?"; then
+            echo "Uninstall cancelled."
+            exit 0
+        fi
+    else
+        read -p "Are you sure you want to uninstall VPA? (y/N) " -n 1 -r
+        echo
+        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
+            echo "Uninstall cancelled."
+            exit 0
+        fi
+    fi
+
+    helm uninstall vpa -n ${VPA_NAMESPACE}
+
+    # Delete VPA CRDs
+    kubectl delete crd verticalpodautoscalercheckpoints.autoscaling.k8s.io --ignore-not-found
+    kubectl delete crd verticalpodautoscalers.autoscaling.k8s.io --ignore-not-found
+
+    just delete-namespace
+
+    echo "VPA uninstalled successfully."
--- a/vpa/values.gomplate.yaml
+++ b/vpa/values.gomplate.yaml
@@ -0,0 +1,59 @@
+# Fairwinds VPA Helm chart values
+# Optimized for resource monitoring with Prometheus + Goldilocks
+
+rbac:
+  create: true
+
+serviceAccount:
+  create: true
+  automountServiceAccountToken: true
+
+recommender:
+  enabled: true
+  replicaCount: 1
+
+  resources:
+    requests:
+      cpu: 50m
+      memory: 500Mi
+    limits:
+      cpu: 200m
+      memory: 1Gi
+
+  extraArgs:
+    v: '4' # Verbose logging level
+    pod-recommendation-min-cpu-millicores: 15
+    pod-recommendation-min-memory-mb: 100
+    storage: prometheus
+    prometheus-address: '{{ .Env.PROMETHEUS_ADDRESS }}'
+
+  podMonitor:
+    enabled: true
+
+updater:
+  enabled: false
+  # Disabled for monitoring-only mode
+  # The updater component automatically applies VPA recommendations to pods
+  # Enable this only if you want automatic pod resource updates
+
+admissionController:
+  enabled: false
+  # Disabled for monitoring-only mode
+  # The admission controller validates and mutates pod resources at creation time
+  # Enable this only if you want automatic resource enforcement
+
+metrics-server:
+  enabled: false
+
+podSecurityContext:
+  runAsNonRoot: true
+  runAsUser: 65534
+  seccompProfile:
+    type: RuntimeDefault
+
+securityContext:
+  readOnlyRootFilesystem: true
+  allowPrivilegeEscalation: false
+  capabilities:
+    drop:
+      - ALL
--- a/vpa/values.yaml
+++ b/vpa/values.yaml
@@ -0,0 +1,59 @@
+# Fairwinds VPA Helm chart values
+# Optimized for resource monitoring with Prometheus + Goldilocks
+
+rbac:
+  create: true
+
+serviceAccount:
+  create: true
+  automountServiceAccountToken: true
+
+recommender:
+  enabled: true
+  replicaCount: 1
+
+  resources:
+    requests:
+      cpu: 50m
+      memory: 500Mi
+    limits:
+      cpu: 200m
+      memory: 1Gi
+
+  extraArgs:
+    v: '4' # Verbose logging level
+    pod-recommendation-min-cpu-millicores: 15
+    pod-recommendation-min-memory-mb: 100
+    storage: prometheus
+    prometheus-address: 'http://kube-prometheus-stack-prometheus.monitoring.svc:9090'
+
+  podMonitor:
+    enabled: true
+
+updater:
+  enabled: false
+  # Disabled for monitoring-only mode
+  # The updater component automatically applies VPA recommendations to pods
+  # Enable this only if you want automatic pod resource updates
+
+admissionController:
+  enabled: false
+  # Disabled for monitoring-only mode
+  # The admission controller validates and mutates pod resources at creation time
+  # Enable this only if you want automatic resource enforcement
+
+metrics-server:
+  enabled: false
+
+podSecurityContext:
+  runAsNonRoot: true
+  runAsUser: 65534
+  seccompProfile:
+    type: RuntimeDefault
+
+securityContext:
+  readOnlyRootFilesystem: true
+  allowPrivilegeEscalation: false
+  capabilities:
+    drop:
+      - ALL