feat(goldilocks): install Goldilocks and VPA

Masaki Yatsu
2025-11-10 14:17:53 +09:00
parent 189a376511
commit f429720617
9 changed files with 993 additions and 0 deletions

238
vpa/README.md Normal file

@@ -0,0 +1,238 @@
# Vertical Pod Autoscaler (VPA)
Kubernetes resource monitoring and recommendation system:
- **Monitoring-only mode**: Observes workloads without automatic scaling
- **Prometheus integration**: Uses Prometheus as the recommender's metrics storage (`storage: prometheus`); the chart's bundled metrics-server is disabled
- **Resource recommendations**: Generates CPU and memory suggestions based on actual usage
- **Goldilocks integration**: Works with Goldilocks dashboard for visualization
- **Non-intrusive**: Does not modify running workloads
## Important Note
**This VPA installation is configured for monitoring and recommendation only**:
- **Recommender**: Enabled - Analyzes workload metrics and generates recommendations
- **Updater**: Disabled - Does NOT automatically apply recommendations to pods
- **Admission Controller**: Disabled - Does NOT modify pod resources at creation time
This configuration ensures VPA observes your workloads without affecting them. You can review recommendations and manually adjust resource settings.
## Prerequisites
- Kubernetes cluster (k3s)
- Prometheus (kube-prometheus-stack) installed
VPA requires Prometheus to collect historical metrics data. Install Prometheus first:
```bash
just prometheus::install
```
## Installation
```bash
just vpa::install
```
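The install recipe checks that the `kube-prometheus-stack` Helm release exists in `PROMETHEUS_NAMESPACE` and aborts if it does not. It then renders `values.yaml` from `values.gomplate.yaml`, passing `PROMETHEUS_ADDRESS` to the recommender as its Prometheus endpoint.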
## Configuration
Environment variables (set in `.env.local` or override):
```bash
VPA_NAMESPACE=vpa # VPA namespace
PROMETHEUS_NAMESPACE=monitoring # Prometheus namespace
PROMETHEUS_ADDRESS=http://kube-prometheus-stack-prometheus.monitoring.svc:9090 # Prometheus URL
```
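When `PROMETHEUS_ADDRESS` is not set, the justfile builds it from `PROMETHEUS_NAMESPACE`, so overriding the namespace alone is usually enough. The sketch below shows the equivalent string construction; `prometheus_address` is a hypothetical helper name used only for illustration, and `observability` in the usage note is an assumed namespace.

```bash
# Sketch: derive the Prometheus URL from a namespace, mirroring the
# default expression in vpa/justfile. The helper name is hypothetical.
prometheus_address() {
  local ns="${1:-monitoring}"
  printf 'http://kube-prometheus-stack-prometheus.%s.svc:9090\n' "$ns"
}
```

For example, `PROMETHEUS_NAMESPACE=observability PROMETHEUS_ADDRESS="$(prometheus_address observability)" just vpa::install` would point the recommender at a Prometheus in that namespace, assuming the service keeps the `kube-prometheus-stack-prometheus` name.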
## Usage
### View VPA Status
```bash
just vpa::status
```
### View Recommender Logs
```bash
just vpa::logs-recommender
```
### View VPA Resources
List all VPA resources across namespaces:
```bash
kubectl get vpa -A
```
View specific VPA recommendations:
```bash
kubectl describe vpa <vpa-name> -n <namespace>
```
Get recommendation in JSON format:
```bash
kubectl get vpa <vpa-name> -n <namespace> -o jsonpath='{.status.recommendation}' | jq
```
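The recommendation JSON holds one entry per container under `containerRecommendations`. As a minimal sketch, the hypothetical helper below (not part of this repository) reduces that JSON to one tab-separated line per container with the target CPU and memory; it assumes `jq` is installed.

```bash
# Sketch: read the JSON of .status.recommendation on stdin and print
# "<container>\t<cpu target>\t<memory target>" for each container.
# Produces no output when the VPA has no recommendation yet.
vpa_targets() {
  jq -r '.containerRecommendations[]? | [.containerName, .target.cpu, .target.memory] | @tsv'
}
```

It can be fed from the command above: `kubectl get vpa <vpa-name> -n <namespace> -o jsonpath='{.status.recommendation}' | vpa_targets`.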
### Manual VPA Resource Creation
Create a VPA resource for monitoring:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"  # Monitoring only
```
Apply with:
```bash
kubectl apply -f vpa-resource.yaml
```
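If several Deployments need the same monitoring-only VPA, the manifest can be generated instead of copied. The following is a sketch under that assumption; `vpa_manifest` is a hypothetical helper, and it only covers Deployments with `updateMode: "Off"`.

```bash
# Sketch: emit a monitoring-only VPA manifest for a Deployment.
# Usage: vpa_manifest <deployment-name> [namespace]
vpa_manifest() {
  local name="$1" ns="${2:-default}"
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ${name}-vpa
  namespace: ${ns}
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${name}
  updatePolicy:
    updateMode: "Off"
EOF
}
```

The output can be piped straight to the API server, for example `vpa_manifest my-app default | kubectl apply -f -`.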
## Integration with Goldilocks
VPA alone provides raw recommendations through kubectl commands. For a user-friendly dashboard experience, use Goldilocks:
```bash
# Install Goldilocks
just goldilocks::install
# Enable monitoring for specific namespaces
just goldilocks::enable-namespace <namespace>
```
Goldilocks automatically creates VPA resources for all workloads in labeled namespaces and presents recommendations in a web dashboard.
## Enabling Automatic Scaling
If you want to enable automatic pod resource updates, modify `values.gomplate.yaml`:
```yaml
updater:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: 50m
      memory: 500Mi
    limits:
      cpu: 200m
      memory: 1Gi
  podMonitor:
    enabled: true
admissionController:
  enabled: true
  replicaCount: 1
  generateCertificate: true
  mutatingWebhookConfiguration:
    failurePolicy: Ignore
  resources:
    requests:
      cpu: 50m
      memory: 200Mi
    limits:
      cpu: 200m
      memory: 500Mi
```
Then reinstall:
```bash
just vpa::install
```
⚠️ **Warning**: Enabling updater and admission controller will cause VPA to automatically modify pod resources. Test thoroughly before enabling in production.
## VPA Update Modes
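VPA supports the following update modes (set per VPA resource in `spec.updatePolicy.updateMode`):
- **Off** (monitoring only, used by this setup): Generates recommendations but does not apply them
- **Initial**: Applies recommendations only when pods are created
- **Recreate**: Applies recommendations at pod creation and evicts running pods whose requests differ significantly from the recommendation
- **Auto**: Automatically applies recommendations; currently this evicts and recreates pods, like Recreate

Modes other than Off have no effect in this installation, because the updater and admission controller are disabled.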
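To confirm that every VPA in the cluster really is in monitoring-only mode, the update mode can be listed next to each resource. This is a sketch using `kubectl`'s `custom-columns` output; `list_vpa_modes` is a hypothetical wrapper name.

```bash
# Sketch: list every VPA with its configured update mode.
# A MODE of <none> means the field is unset on that resource.
list_vpa_modes() {
  kubectl get vpa -A \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MODE:.spec.updatePolicy.updateMode'
}
```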
## Management
### Uninstall
```bash
just vpa::uninstall
```
This removes:
- Helm release
- VPA CRDs
- Namespace
## Troubleshooting
### Recommender Not Starting
Check the recommender logs for Prometheus connection errors:
```bash
just vpa::logs-recommender
```
Verify Prometheus is running:
```bash
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
```
### No Recommendations Generated
VPA requires workload metrics over time:
- Minimum: A few minutes of runtime
- Recommended: 24+ hours for accurate recommendations
Verify workload is running and generating metrics:
```bash
kubectl get pods -n <namespace>
kubectl top pods -n <namespace>
```
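A VPA also reports whether a recommendation exists through its `RecommendationProvided` status condition. The sketch below checks that condition; `vpa_has_recommendation` is a hypothetical helper, not a recipe in this repository.

```bash
# Sketch: succeed (exit 0) when the VPA reports
# RecommendationProvided=True, fail otherwise.
# Usage: vpa_has_recommendation <vpa-name> <namespace>
vpa_has_recommendation() {
  local name="$1" ns="$2" status
  status="$(kubectl get vpa "$name" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="RecommendationProvided")].status}')"
  [ "$status" = "True" ]
}
```

For example: `vpa_has_recommendation my-app-vpa default && echo "recommendation available"`.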
### VPA Resource Not Created
For Goldilocks-managed VPA resources, ensure:
1. Namespace has label: `goldilocks.fairwinds.com/enabled=true`
2. Workload is managed by a controller (Deployment, StatefulSet, etc.)
3. Goldilocks controller is running: `kubectl get pods -n goldilocks`
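The first item in the checklist above can be scripted. This is a minimal sketch that reads the namespace label with a JSONPath expression (dots in the label key are escaped with backslashes); `goldilocks_enabled` is a hypothetical helper name.

```bash
# Sketch: succeed when the namespace carries
# goldilocks.fairwinds.com/enabled=true, fail otherwise.
# Usage: goldilocks_enabled <namespace>
goldilocks_enabled() {
  local ns="$1" value
  value="$(kubectl get namespace "$ns" \
    -o jsonpath='{.metadata.labels.goldilocks\.fairwinds\.com/enabled}')"
  [ "$value" = "true" ]
}
```

For example: `goldilocks_enabled default || echo "namespace is not labeled for Goldilocks"`.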
### Check VPA Components
```bash
kubectl get pods -n vpa
```
In monitoring-only mode, only the recommender pod is expected:
- `vpa-recommender-*`: Running
## References
- [Kubernetes VPA Documentation](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler)
- [Fairwinds VPA Helm Chart](https://github.com/FairwindsOps/charts/tree/master/stable/vpa)
- [VPA Design Proposals](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md)

89
vpa/justfile Normal file

@@ -0,0 +1,89 @@
set fallback := true

export VPA_NAMESPACE := env("VPA_NAMESPACE", "vpa")
export PROMETHEUS_NAMESPACE := env("PROMETHEUS_NAMESPACE", "monitoring")
export PROMETHEUS_ADDRESS := env("PROMETHEUS_ADDRESS", "http://kube-prometheus-stack-prometheus." + PROMETHEUS_NAMESPACE + ".svc:9090")

[private]
default:
    @just --list --unsorted --list-submodules

# Add Helm repository
add-helm-repo:
    helm repo add fairwinds-stable https://charts.fairwinds.com/stable
    helm repo update

# Remove Helm repository
remove-helm-repo:
    helm repo remove fairwinds-stable

# Create namespace
create-namespace:
    @kubectl get namespace ${VPA_NAMESPACE} &>/dev/null || \
    kubectl create namespace ${VPA_NAMESPACE}

# Delete namespace
delete-namespace:
    @kubectl delete namespace ${VPA_NAMESPACE} --ignore-not-found

# Install Vertical Pod Autoscaler
install:
    #!/bin/bash
    set -euo pipefail

    # Check if Prometheus is installed
    if ! helm status kube-prometheus-stack -n ${PROMETHEUS_NAMESPACE} &>/dev/null; then
        echo "Error: Prometheus (kube-prometheus-stack) is not installed."
        echo "Please install Prometheus first using: just prometheus::install"
        exit 1
    fi

    just add-helm-repo
    just create-namespace

    # Generate values.yaml from template
    gomplate -f values.gomplate.yaml -o values.yaml

    # Install VPA with Helm
    helm upgrade --install vpa fairwinds-stable/vpa --namespace ${VPA_NAMESPACE} \
        --values values.yaml --wait

    echo "VPA installed successfully in namespace: ${VPA_NAMESPACE}"
    echo ""
    echo "To verify installation:"
    echo " kubectl get pods -n ${VPA_NAMESPACE}"
    echo " kubectl get vpa -A"

# Uninstall Vertical Pod Autoscaler
uninstall:
    #!/bin/bash
    set -euo pipefail

    if ! helm status vpa -n ${VPA_NAMESPACE} &>/dev/null; then
        echo "VPA is not installed."
        exit 0
    fi

    if command -v gum &>/dev/null; then
        if ! gum confirm "Are you sure you want to uninstall VPA?"; then
            echo "Uninstall cancelled."
            exit 0
        fi
    else
        read -p "Are you sure you want to uninstall VPA? (y/N) " -n 1 -r
        echo
        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
            echo "Uninstall cancelled."
            exit 0
        fi
    fi

    helm uninstall vpa -n ${VPA_NAMESPACE}

    # Delete VPA CRDs
    kubectl delete crd verticalpodautoscalercheckpoints.autoscaling.k8s.io --ignore-not-found
    kubectl delete crd verticalpodautoscalers.autoscaling.k8s.io --ignore-not-found

    just delete-namespace

    echo "VPA uninstalled successfully."

59
vpa/values.gomplate.yaml Normal file

@@ -0,0 +1,59 @@
# Fairwinds VPA Helm chart values
# Optimized for resource monitoring with Prometheus + Goldilocks

rbac:
  create: true

serviceAccount:
  create: true
  automountServiceAccountToken: true

recommender:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: 50m
      memory: 500Mi
    limits:
      cpu: 200m
      memory: 1Gi
  extraArgs:
    v: '4' # Verbose logging level
    pod-recommendation-min-cpu-millicores: 15
    pod-recommendation-min-memory-mb: 100
    storage: prometheus
    prometheus-address: '{{ .Env.PROMETHEUS_ADDRESS }}'
  podMonitor:
    enabled: true

updater:
  enabled: false
  # Disabled for monitoring-only mode
  # The updater component automatically applies VPA recommendations to pods
  # Enable this only if you want automatic pod resource updates

admissionController:
  enabled: false
  # Disabled for monitoring-only mode
  # The admission controller validates and mutates pod resources at creation time
  # Enable this only if you want automatic resource enforcement

metrics-server:
  enabled: false

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 65534
  seccompProfile:
    type: RuntimeDefault

securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL

59
vpa/values.yaml Normal file

@@ -0,0 +1,59 @@
# Fairwinds VPA Helm chart values
# Optimized for resource monitoring with Prometheus + Goldilocks

rbac:
  create: true

serviceAccount:
  create: true
  automountServiceAccountToken: true

recommender:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: 50m
      memory: 500Mi
    limits:
      cpu: 200m
      memory: 1Gi
  extraArgs:
    v: '4' # Verbose logging level
    pod-recommendation-min-cpu-millicores: 15
    pod-recommendation-min-memory-mb: 100
    storage: prometheus
    prometheus-address: 'http://kube-prometheus-stack-prometheus.monitoring.svc:9090'
  podMonitor:
    enabled: true

updater:
  enabled: false
  # Disabled for monitoring-only mode
  # The updater component automatically applies VPA recommendations to pods
  # Enable this only if you want automatic pod resource updates

admissionController:
  enabled: false
  # Disabled for monitoring-only mode
  # The admission controller validates and mutates pod resources at creation time
  # Enable this only if you want automatic resource enforcement

metrics-server:
  enabled: false

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 65534
  seccompProfile:
    type: RuntimeDefault

securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL