docs: update resource-management.md
36
CLAUDE.md
@@ -98,12 +98,29 @@ kubectl --context <host>-oidc get nodes # Test OIDC auth
|
|||||||
- **Templates**: `*.gomplate.yaml` files use environment variables from `.env.local`
|
- **Templates**: `*.gomplate.yaml` files use environment variables from `.env.local`
|
||||||
- **Custom Extensions**: `custom.just` can be created for additional workflows
|
- **Custom Extensions**: `custom.just` can be created for additional workflows
|
||||||
|
|
||||||
|
### Resource Management
|
||||||
|
|
||||||
|
All components should have appropriate resource requests and limits configured. See [docs/resource-management.md](docs/resource-management.md) for:
|
||||||
|
|
||||||
|
- QoS class selection (Guaranteed vs Burstable)
|
||||||
|
- Using Goldilocks/VPA for recommendations
|
||||||
|
- Configuration guidelines and examples
|
||||||
|
- **Important**: Never set resources below Goldilocks recommendations; always round up to clean values
|
||||||
|
|
||||||
### Gomplate Template Pattern
|
### Gomplate Template Pattern
|
||||||
|
|
||||||
**Environment Variable Management:**
|
**Environment Variable Management:**
|
||||||
|
|
||||||
- Justfile manages environment variables and their default values
|
- Justfile manages environment variables and their default values at the top using `export VAR := env("VAR", "default")`
|
||||||
- Gomplate templates access variables using `{{ .Env.VAR }}`
|
- Gomplate templates access variables using `{{ .Env.VAR }}`
|
||||||
|
- **IMPORTANT**: Variables exported at the top of justfile are automatically available to all recipes - do NOT use `export` again inside recipes
|
||||||
|
|
||||||
|
**Conditional Rendering Rules:**
|
||||||
|
|
||||||
|
- For boolean flags (enabled/disabled features), use simple truthiness check: `{{- if .Env.VAR }}`
|
||||||
|
- The justfile should set the variable to "true" (or any non-empty value) to enable, or empty string to disable
|
||||||
|
- **DO NOT use**: `{{- if eq (.Env.VAR | default "false") "true" }}` - this is redundant
|
||||||
|
- **CORRECT**: `{{- if .Env.VAR }}` - simple and clean
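
One consequence of the truthiness check is that any non-empty string enables the block, including the literal `false`, because Go templates treat only the empty string as falsy. Below is a minimal sketch of normalizing a flag inside a recipe before calling gomplate. The helper `normalize_flag` and its list of disabling spellings are assumptions for illustration and are not part of this repository:

```bash
#!/usr/bin/env bash
set -euo pipefail

# normalize_flag is a hypothetical helper (not part of the repository).
# It prints "true" for enabling values and an empty string for disabling
# ones, so a template check like `{{- if .Env.VAR }}` only fires when the
# feature is really enabled.
normalize_flag() {
    case "${1:-}" in
        ""|false|False|FALSE|0|no|No|NO) echo "" ;;   # assumed "off" spellings
        *) echo "true" ;;
    esac
}

# Collapse whatever the caller passed in to "true" or "".
MONITORING_ENABLED="$(normalize_flag "${MONITORING_ENABLED:-}")"
export MONITORING_ENABLED
```

With this in place, a caller that sets `MONITORING_ENABLED=false` ends up with an empty value, which the simple `{{- if .Env.MONITORING_ENABLED }}` check treats as disabled.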
|
||||||
|
|
||||||
**Example justfile pattern:**
|
**Example justfile pattern:**
|
||||||
|
|
||||||
@@ -111,12 +128,15 @@ kubectl --context <host>-oidc get nodes # Test OIDC auth
|
|||||||
# At the top of justfile - define variables with defaults
|
# At the top of justfile - define variables with defaults
|
||||||
export PROMETHEUS_NAMESPACE := env("PROMETHEUS_NAMESPACE", "monitoring")
|
export PROMETHEUS_NAMESPACE := env("PROMETHEUS_NAMESPACE", "monitoring")
|
||||||
export GRAFANA_HOST := env("GRAFANA_HOST", "")
|
export GRAFANA_HOST := env("GRAFANA_HOST", "")
|
||||||
|
export MONITORING_ENABLED := env("MONITORING_ENABLED", "")
|
||||||
|
|
||||||
# In recipes - export variables for gomplate
|
# In recipes - use variables directly (already exported at top)
|
||||||
install:
|
install:
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
set -euo pipefail
|
set -euo pipefail
|
||||||
export GRAFANA_OIDC_ENABLED="${GRAFANA_OIDC_ENABLED:-false}"
|
if gum confirm "Enable monitoring?"; then
|
||||||
|
MONITORING_ENABLED="true"
|
||||||
|
fi
|
||||||
gomplate -f values.gomplate.yaml -o values.yaml
|
gomplate -f values.gomplate.yaml -o values.yaml
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -128,8 +148,8 @@ namespace: {{ .Env.PROMETHEUS_NAMESPACE }}
|
|||||||
ingress:
|
ingress:
|
||||||
hosts:
|
hosts:
|
||||||
- {{ .Env.GRAFANA_HOST }}
|
- {{ .Env.GRAFANA_HOST }}
|
||||||
{{- if eq .Env.GRAFANA_OIDC_ENABLED "true" }}
|
{{- if .Env.MONITORING_ENABLED }}
|
||||||
oidc:
|
monitoring:
|
||||||
enabled: true
|
enabled: true
|
||||||
{{- end }}
|
{{- end }}
|
||||||
```
|
```
|
||||||
@@ -145,11 +165,11 @@ install:
|
|||||||
if [ -z "${MONITORING_ENABLED}" ]; then
|
if [ -z "${MONITORING_ENABLED}" ]; then
|
||||||
if gum confirm "Enable Prometheus monitoring?"; then
|
if gum confirm "Enable Prometheus monitoring?"; then
|
||||||
MONITORING_ENABLED="true"
|
MONITORING_ENABLED="true"
|
||||||
fi
|
|
||||||
fi
|
|
||||||
else
|
else
|
||||||
MONITORING_ENABLED="false"
|
MONITORING_ENABLED="false"
|
||||||
fi
|
fi
|
||||||
|
fi
|
||||||
|
fi
|
||||||
# ... helm install
|
# ... helm install
|
||||||
|
|
||||||
if [ "${MONITORING_ENABLED}" = "true" ]; then
|
if [ "${MONITORING_ENABLED}" = "true" ]; then
|
||||||
@@ -161,7 +181,7 @@ install:
|
|||||||
ServiceMonitor template (`servicemonitor.gomplate.yaml`):
|
ServiceMonitor template (`servicemonitor.gomplate.yaml`):
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
{{- if eq .Env.MONITORING_ENABLED "true" }}
|
{{- if .Env.MONITORING_ENABLED }}
|
||||||
apiVersion: monitoring.coreos.com/v1
|
apiVersion: monitoring.coreos.com/v1
|
||||||
kind: ServiceMonitor
|
kind: ServiceMonitor
|
||||||
metadata:
|
metadata:
|
||||||
|
|||||||
@@ -190,13 +190,25 @@ Given that VPA already includes:
|
|||||||
- **Critical services**: Use VPA + 1.5-2x for extra safety margin, or use Guaranteed QoS
|
- **Critical services**: Use VPA + 1.5-2x for extra safety margin, or use Guaranteed QoS
|
||||||
- **New services**: Start with VPA + 1.5x, monitor, adjust after 1-2 weeks
|
- **New services**: Start with VPA + 1.5x, monitor, adjust after 1-2 weeks
|
||||||
|
|
||||||
|
**IMPORTANT:** Never configure resources **below** Goldilocks recommendations. Setting values lower than recommended will:
|
||||||
|
- Cause Goldilocks dashboard to flag the workload as under-resourced
|
||||||
|
- Potentially lead to performance issues or OOMKilled events
|
||||||
|
- Defeat the purpose of using VPA-based recommendations
|
||||||
|
|
||||||
|
When rounding values, always round **up** to the next clean value, not down.
|
||||||
|
|
||||||
**Example:**
|
**Example:**
|
||||||
|
|
||||||
Goldilocks recommendation: 50m CPU, 128Mi Memory
|
Goldilocks recommendation: 50m CPU, 128Mi Memory
|
||||||
|
|
||||||
- Standard service: 50m CPU, 128Mi Memory (use as-is, rounded)
|
- Standard service: 50m CPU, 128Mi Memory (use as-is, rounded up if needed)
|
||||||
- Critical service: 100m CPU, 256Mi Memory (2x for extra safety)
|
- Critical service: 100m CPU, 256Mi Memory (2x for extra safety)
|
||||||
|
|
||||||
|
Goldilocks recommendation: 15m CPU, 105M Memory
|
||||||
|
|
||||||
|
- Correct: 25m CPU, 128Mi Memory (rounded up to clean values)
|
||||||
|
- Incorrect: 10m CPU, 100Mi Memory (below recommendations, will be flagged)
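
The round-up rule can be sketched as a small shell helper. The document does not define which values count as "clean", so the step ladders here are assumptions, and the function names `round_up_clean` and `mb_to_mib_ceil` are hypothetical:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed ladders of "clean" values; adjust to local conventions.
CPU_STEPS_M=(10 25 50 100 250 500 1000 2000)      # millicores
MEM_STEPS_MI=(32 64 128 256 512 1024 2048 4096)   # mebibytes

# Print the smallest step that is >= the recommendation.
# Usage: round_up_clean <recommendation> <step>...
round_up_clean() {
    local want="$1"
    shift
    local step
    for step in "$@"; do
        if (( step >= want )); then
            echo "${step}"
            return 0
        fi
    done
    # Recommendation exceeds the ladder: keep it rather than round down.
    echo "${want}"
}

# Convert decimal megabytes (M) to mebibytes (Mi), rounding up.
mb_to_mib_ceil() {
    local bytes=$(( $1 * 1000 * 1000 ))
    echo $(( (bytes + 1048575) / 1048576 ))
}

# Goldilocks recommendation from the example: 15m CPU, 105M memory.
echo "cpu: $(round_up_clean 15 "${CPU_STEPS_M[@]}")m"
echo "memory: $(round_up_clean "$(mb_to_mib_ceil 105)" "${MEM_STEPS_MI[@]}")Mi"
```

With these ladders the script prints `cpu: 25m` and `memory: 128Mi`, matching the "Correct" line above. The memory conversion matters because 105M is slightly more than 100Mi, which is why 100Mi falls below the recommendation.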
|
||||||
|
|
||||||
#### For CRDs and Unsupported Workloads
|
#### For CRDs and Unsupported Workloads
|
||||||
|
|
||||||
Use Grafana to check actual resource usage:
|
Use Grafana to check actual resource usage:
|
||||||
|
|||||||