VerticaAutoscaler use cases and best practices

The VerticaAutoscaler provides advanced options for fine-tuning how your workloads scale in and out in response to metric changes. A good configuration is key to preventing issues such as flapping, over-provisioning, and slow responsiveness.

Use the following best practices to effectively configure advanced settings.

Key configuration fields and best practices

pollingInterval (ScaledObject)
The interval (in seconds) at which KEDA queries Prometheus to retrieve metric values.

Default: 30s

  • Use shorter intervals (5-15s) for applications that need rapid scaling.
  • Use longer intervals (30-60s) for stable workloads where aggressive scaling is not necessary, such as batch jobs or cron-like tasks.
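For example, a latency-sensitive subcluster might use a short polling interval. The following is a minimal sketch of a KEDA ScaledObject; the object names, Prometheus address, and query are illustrative assumptions, not values from your deployment:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vertica-scaledobject        # illustrative name
spec:
  scaleTargetRef:
    name: my-scale-target           # illustrative target
  pollingInterval: 10               # query Prometheus every 10 seconds
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # illustrative address
        query: sum(my_vertica_metric)                         # illustrative query
        threshold: "50"
```

A shorter pollingInterval makes KEDA react faster but increases query load on Prometheus, so raise it for workloads that do not need rapid scaling.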
cooldownPeriod (ScaledObject)
The duration (in seconds) that KEDA waits after the trigger metric falls below the target value before scaling in.

Default: 300s

  • Set to 30–60s if your database or subcluster handles quick traffic bursts.
  • Use 300s or more if your database takes time to warm up.
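The two intervals are set side by side on the ScaledObject spec. A minimal fragment (values illustrative):

```yaml
spec:
  pollingInterval: 30    # check the Prometheus metric every 30 seconds
  cooldownPeriod: 300    # wait 5 minutes of low metric values before scaling in
```

Together these settings trade responsiveness for stability: frequent polling with a long cooldown reacts quickly to load increases while tolerating brief traffic drops.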
behavior (ScaledObject and HPA)
Controls and fine-tunes scaling behavior, such as how quickly to scale out or in.
  1. Prevent scale flapping from spiky Prometheus metrics:

    Use case: Prometheus metric is noisy.

    Behavior configuration:

          
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 300
        selectPolicy: Max
      scaleUp:
        stabilizationWindowSeconds: 0
        selectPolicy: Max

    • Prevents premature scale-in during brief traffic drops.
    • Still enables rapid scale-out when demand increases.
  2. Control aggressive scaling for expensive resources

    Use case: Each replica is expensive so you want to limit how quickly they are added.

    Behavior configuration:

          
    behavior:
      scaleUp:
        policies:
          - type: Pods
            value: 1
            periodSeconds: 60
        selectPolicy: Max

    • Limits scale-out to 1 pod per minute to prevent burst scaling.
    • Useful for protecting resource quotas or avoiding API server rate limits.
  3. Enable rapid scale-in for cost-sensitive environments

    Use case: You want to quickly free resources during low-traffic periods, for example, to optimize costs.

    Behavior configuration:

          
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 30
        selectPolicy: Min

    • Maintains quick scale-in responsiveness.
    • Uses a short window to evaluate metric drops.
  4. Mixed workloads with busy periods and idle gaps

    Use case: You have predictable burst windows (for example, at the top of each hour), followed by idle time.

    Behavior configuration:

          
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 10
        policies:
          - type: Percent
            value: 100
            periodSeconds: 30
      scaleDown:
        stabilizationWindowSeconds: 120
        selectPolicy: Max

    • Enables rapid ramp-up during burst periods.
    • Stabilizes scale-in to avoid dropping too fast when the burst ends.
  5. Smooth scaling for stateful workloads

    Use case: Workload includes some startup and shutdown overhead.

    Behavior configuration:

          
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 60
        selectPolicy: Max
      scaleDown:
        stabilizationWindowSeconds: 300
        selectPolicy: Max

    • Adds a cooldown effect.
    • Avoids rapid scaling before a pod is fully initialized.
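In a KEDA ScaledObject, the behavior block shown in the use cases above is nested under advanced.horizontalPodAutoscalerConfig, which KEDA passes through to the HPA it creates. A minimal sketch of where it fits; the object names, Prometheus address, and query are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vertica-scaledobject        # illustrative name
spec:
  scaleTargetRef:
    name: my-scale-target           # illustrative target
  pollingInterval: 30
  cooldownPeriod: 300
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:                     # same structure as the snippets above
        scaleUp:
          stabilizationWindowSeconds: 60
          selectPolicy: Max
        scaleDown:
          stabilizationWindowSeconds: 300
          selectPolicy: Max
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # illustrative address
        query: sum(my_vertica_metric)                         # illustrative query
        threshold: "50"
```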