VerticaAutoscaler use cases and best practices

The VerticaAutoscaler provides advanced options for fine-tuning how your workloads scale in and out in response to metric changes. A good configuration is key to preventing issues such as flapping, over-provisioning, and slow responsiveness.

Use the following best practices to effectively configure advanced settings.

Key configuration fields and best practices

pollingInterval (ScaledObject)
The interval (in seconds) at which KEDA queries Prometheus to retrieve metric values.

Default: 30s

  • Use shorter intervals (5-15s) for applications that need rapid scaling.
  • Use longer intervals (30-60s) for stable workloads where aggressive scaling is not necessary, such as batch jobs or cron-like tasks.
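For example, a latency-sensitive subcluster might use a short polling interval. The following is a minimal sketch of a KEDA ScaledObject; the object names, Prometheus address, and query are illustrative assumptions, not values from your deployment:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vertica-scaledobject        # illustrative name
spec:
  scaleTargetRef:
    name: my-scale-target           # illustrative target
  pollingInterval: 10               # query Prometheus every 10 seconds
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # illustrative address
        query: sum(my_vertica_metric)                         # illustrative query
        threshold: "50"
```

A shorter pollingInterval makes KEDA react faster but increases query load on Prometheus, so raise it for workloads that do not need rapid scaling.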
cooldownPeriod (ScaledObject)
The duration (in seconds) that KEDA waits after the trigger metric falls below the target value before scaling in.

Default: 300s

  • Set to 30–60s if your database or subcluster handles quick traffic bursts.
  • Use 300s or more if your database takes time to warm up.
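The two intervals are set side by side on the ScaledObject spec. A minimal fragment (values illustrative):

```yaml
spec:
  pollingInterval: 30    # check the Prometheus metric every 30 seconds
  cooldownPeriod: 300    # wait 5 minutes of low metric values before scaling in
```

Together these settings trade responsiveness for stability: frequent polling with a long cooldown reacts quickly to load increases while tolerating brief traffic drops.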
behavior (ScaledObject and HPA)
Controls and fine-tunes scaling behavior, such as how quickly to scale out or in.
  1. Prevent scale flapping from spiky Prometheus metrics:

    Use case: Prometheus metric is noisy.

    Behavior configuration:

          
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 300
        selectPolicy: Max
      scaleUp:
        stabilizationWindowSeconds: 0
        selectPolicy: Max

    • Prevents premature scale-in during brief traffic drops.
    • Still enables rapid scale-out when demand increases.
  2. Control aggressive scaling for expensive resources

    Use case: Each replica is expensive so you want to limit how quickly they are added.

    Behavior configuration:

          
    behavior:
      scaleUp:
        policies:
          - type: Pods
            value: 1
            periodSeconds: 60
        selectPolicy: Max

    • Limits scale-out to 1 pod per minute to prevent burst scaling.
    • Useful for protecting resource quotas or avoiding API server rate limits.
  3. Enable rapid scale-in for cost-sensitive environments

    Use case: You want to quickly free resources during low-traffic periods, for example, to optimize costs.

    Behavior configuration:

          
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 30
        selectPolicy: Min

    • Maintains quick scale-in responsiveness.
    • Uses a short window to evaluate metric drops.
  4. Mixed workloads with busy periods and idle gaps

    Use case: You have predictable burst windows (for example, at the top of each hour), followed by idle time.

    Behavior configuration:

          
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 10
        policies:
          - type: Percent
            value: 100
            periodSeconds: 30
      scaleDown:
        stabilizationWindowSeconds: 120
        selectPolicy: Max

    • Enables rapid ramp-up during burst periods.
    • Stabilizes scale-in to avoid dropping too fast when the burst ends.
  5. Smooth scaling for stateful workloads

    Use case: Workload includes some startup and shutdown overhead.

    Behavior configuration:

          
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 60
        selectPolicy: Max
      scaleDown:
        stabilizationWindowSeconds: 300
        selectPolicy: Max

    • Adds a cooldown effect.
    • Avoids rapid scaling before a pod is fully initialized.
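In a KEDA ScaledObject, the behavior block shown in the use cases above is nested under advanced.horizontalPodAutoscalerConfig, which KEDA passes through to the HPA it creates. A minimal sketch of where it fits; the object names, Prometheus address, and query are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vertica-scaledobject        # illustrative name
spec:
  scaleTargetRef:
    name: my-scale-target           # illustrative target
  pollingInterval: 30
  cooldownPeriod: 300
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:                     # same structure as the snippets above
        scaleUp:
          stabilizationWindowSeconds: 60
          selectPolicy: Max
        scaleDown:
          stabilizationWindowSeconds: 300
          selectPolicy: Max
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # illustrative address
        query: sum(my_vertica_metric)                         # illustrative query
        threshold: "50"
```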