VerticaAutoscaler use cases and best practices
VerticaAutoscaler provides advanced options for fine-tuning how your workloads scale in and out in response to metric changes. A well-tuned configuration is key to preventing issues such as flapping, over-provisioning, and slow responsiveness.
Use the following best practices to effectively configure advanced settings.
Key configuration fields and best practices
pollingInterval (ScaledObject)
- The interval (in seconds) at which KEDA queries Prometheus to retrieve metric values.
Default: 30s
- Use shorter intervals (5-15s) for applications that need rapid scaling.
- Use longer intervals (30-60s) for stable workloads where aggressive scaling is not necessary, such as batch jobs or cron-like tasks.
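For a latency-sensitive workload, a shorter polling interval might look like the following sketch of the underlying KEDA ScaledObject. The resource names, Prometheus address, query, and threshold are placeholders rather than values from this documentation; if you configure scaling through the VerticaAutoscaler custom resource, set the equivalent fields there.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vertica-scaledobject            # placeholder name
spec:
  scaleTargetRef:
    name: vertica-db-secondary          # placeholder scale target
  pollingInterval: 10                   # query Prometheus every 10 seconds
  minReplicaCount: 3
  maxReplicaCount: 12
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090   # placeholder address
      query: sum(rate(http_requests_total[2m]))              # placeholder query
      threshold: "100"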
cooldownPeriod (ScaledObject)
- The duration (in seconds) to wait before scaling in after the trigger metric falls below the target value.
Default: 30s
- Set to 30–60s if your database or subcluster handles short traffic bursts and you want to release capacity soon after a burst ends.
- Use 300s or more if your database takes time to warm up.
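cooldownPeriod sits next to pollingInterval in the ScaledObject spec. A minimal sketch for a subcluster with a long warm-up, assuming the same placeholder ScaledObject as above:

spec:
  pollingInterval: 30     # check the trigger every 30 seconds
  cooldownPeriod: 300     # wait 300 seconds after the last active trigger before scaling back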
behavior (ScaledObject and HPA)
- Controls and fine-tunes scaling behavior, such as how quickly to scale out or in. The following use cases show common behavior configurations; a sketch after the last use case shows where the behavior block is placed in a ScaledObject or HPA spec.
- Prevent scale flapping from spiky Prometheus metrics
Use case: The Prometheus metric is noisy.
Behavior configuration:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    selectPolicy: Max
  scaleUp:
    stabilizationWindowSeconds: 0
    selectPolicy: Max
- Prevents premature scale-in during brief traffic drops.
- Still enables rapid scale-out when demand increases.
- Control aggressive scaling for expensive resources
Use case: Each replica is expensive, so you want to limit how quickly replicas are added.
Behavior configuration:
behavior:
  scaleUp:
    policies:
    - type: Pods
      value: 1
      periodSeconds: 60
    selectPolicy: Max
- Limits scale-out to 1 pod per minute to prevent burst scaling.
- Useful for protecting resource quotas and avoiding API server rate limits.
- Enable rapid scale-in for cost-sensitive environments
Use case: You want to quickly free resources during low-traffic periods, for example, to optimize costs.
Behavior configuration:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 30
    selectPolicy: Min
- Maintains quick scale-in responsiveness.
- Uses a short window to evaluate metric drops.
- Mixed workloads with busy periods and idle gaps
Use case: You have predictable burst windows (for example, at the top of each hour) followed by idle time.
Behavior configuration:
behavior:
  scaleUp:
    stabilizationWindowSeconds: 10
    policies:
    - type: Percent
      value: 100
      periodSeconds: 30
  scaleDown:
    stabilizationWindowSeconds: 120
    selectPolicy: Max
- Enables rapid ramp-up during burst periods.
- Stabilizes scale-in to avoid dropping too fast when the burst ends.
- Smooth scaling for stateful workloads
Use case: The workload has startup and shutdown overhead.
Behavior configuration:
behavior:
  scaleUp:
    stabilizationWindowSeconds: 60
    selectPolicy: Max
  scaleDown:
    stabilizationWindowSeconds: 300
    selectPolicy: Max
- Adds a cooldown effect.
- Avoids rapid scaling before a pod is fully initialized.
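The behavior blocks in the use cases above are standard HPA scaling policies. In a plain autoscaling/v2 HorizontalPodAutoscaler they go directly under spec.behavior; in a KEDA ScaledObject they are nested under advanced.horizontalPodAutoscalerConfig. The following sketch shows the flapping-prevention settings from the first use case in that position; the resource names are placeholders.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vertica-scaledobject            # placeholder name
spec:
  scaleTargetRef:
    name: vertica-db-secondary          # placeholder scale target
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          selectPolicy: Max
        scaleUp:
          stabilizationWindowSeconds: 0
          selectPolicy: Max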