VerticaAutoscaler custom resource definition

The VerticaAutoscaler custom resource (CR) automatically scales resources for existing subclusters using one of the following strategies:

  • Subcluster scaling for short-running dashboard queries
  • Pod scaling for long-running analytic queries

The VerticaAutoscaler CR scales VerticaDB instances based on resource metrics or custom Prometheus metrics. OpenText™ Analytics Database manages subclusters by workload, which helps you pinpoint the best metrics to trigger a scaling event. To maintain data integrity, the operator does not scale in until all connections to the pods are drained and sessions are closed.

Additionally, the VerticaAutoscaler provides a webhook to validate state changes. By default, this webhook is enabled. You can configure this webhook with the webhook.enable Helm chart parameter.
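For example, if you deploy the operator with Helm, you can set this parameter at install time. The following is a minimal sketch that disables the webhook; the release name vdb-op and the vertica-charts repository alias are assumptions, so adjust them for your deployment:

$ helm repo add vertica-charts https://vertica.github.io/charts
$ helm install vdb-op vertica-charts/verticadb-operator --set webhook.enable=false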

Autoscalers

An autoscaler is a Kubernetes object that dynamically adjusts resource allocation based on metrics. The VerticaAutoscaler CR utilizes two types of autoscalers:

  • Horizontal Pod Autoscaler (HPA) - a native Kubernetes object
  • ScaledObject - a custom resource (CR) owned and managed by the Kubernetes Event-Driven Autoscaling (KEDA) operator

Horizontal Pod Autoscaler (HPA) vs Kubernetes Event-Driven Autoscaling (KEDA) and ScaledObject

In Kubernetes, both the Horizontal Pod Autoscaler (HPA) and Kubernetes Event-Driven Autoscaling (KEDA)'s ScaledObject enable automatic pod scaling based on specific metrics. However, they differ in their operation and the types of metrics they utilize for scaling.

Horizontal Pod Autoscaler

HPA is a built-in Kubernetes resource that automatically scales the number of pods in a deployment based on observed CPU utilization, memory usage, or custom application-specific metrics.

Key features:

  • Metrics: HPA primarily scales based on CPU utilization and memory usage sourced from the Metrics Server, or on custom metrics exposed through an adapter such as the Prometheus Adapter.
  • Scaling Trigger: HPA monitors the metric values (for example, CPU utilization) and compares them to a defined target (typically a percentage, such as 50% CPU utilization). If the actual value exceeds the target, it scales up the number of pods; if it falls below the target, it scales down accordingly.
  • Limitations: HPA is effective for resource-based scaling, but may face challenges when scaling based on event-driven triggers (such as message queues or incoming requests) unless combined with custom metrics.

For more information about the algorithm that determines when the HPA scales, see the Kubernetes documentation.
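For example, with a target average of 50 active sessions per pod, if three pods currently average 100 active sessions, the HPA computes ceil(3 × 100 / 50) = 6 desired replicas; if the average falls to 25, it computes ceil(3 × 25 / 50) = 2, bounded by minReplicas and maxReplicas.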

KEDA's ScaledObject

Kubernetes Event-Driven Autoscaling (KEDA) is designed to scale workloads based on event-driven metrics.

Key features:

  • Integrates with external event sources (like Prometheus).
  • Supports scaling to zero, ensuring no pods run when there is no demand.
  • Utilizes KEDA’s ScaledObject Custom Resource Definition (CRD).
  • Works with HPA internally, with KEDA managing the HPA configuration.

Following is a feature comparison between HPA and KEDA ScaledObject:

Feature          | HPA (Horizontal Pod Autoscaler)       | KEDA (ScaledObject)
Scaling trigger  | CPU, memory, custom metrics           | CPU, memory, custom metrics, external events (queues, databases, and so on)
Uses HPA?        | Native Kubernetes feature             | Uses HPA internally
Complexity       | Simple                                | Requires KEDA installation
Flexibility      | Focused on resource-based scaling     | Integrates with multiple external event sources
Response time    | Delayed (depends on the Metrics API)  | Faster (direct event triggers)

Examples

The examples in this section use the following VerticaDB custom resource. Each example uses the number of active sessions to trigger scaling:

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: v-test
spec:
  communal:
    path: "path/to/communal-storage"
    endpoint: "path/to/communal-endpoint"
    credentialSecret: credentials-secret
  licenseSecret: license
  subclusters:
    - name: pri1
      size: 3
      type: primary
      serviceName: primary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"
    - name: sec1
      size: 3
      type: secondary
      serviceName: secondary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"

Prerequisites

For HPA autoscaler:

  • Configure the Metrics Server for basic resource metrics, along with a custom metrics provider such as the Prometheus Adapter (see the installation sketch after this list).
  • Install Prometheus in the Kubernetes cluster to scrape metrics from your database instance.
  • Install Prometheus Adapter to make custom metrics available to HPA. This adapter exposes the custom metrics scraped by Prometheus to the Kubernetes API server (only necessary if scaling based on Prometheus metrics).
  • Configure Custom Metrics API (Prometheus Adapter) to expose custom Prometheus metrics.
  • Configure the database to expose Prometheus-compatible metrics.
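The Prometheus Adapter is commonly installed with Helm. The following is a minimal sketch, assuming the prometheus-community chart repository and a Prometheus service reachable at prometheus-kube-prometheus-prometheus.prometheus.svc on port 9090 (the same address used in the examples below); adjust the URL and namespace for your environment:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus-adapter prometheus-community/prometheus-adapter \
    --namespace prometheus \
    --set prometheus.url=http://prometheus-kube-prometheus-prometheus.prometheus.svc \
    --set prometheus.port=9090

The rules block shown in the procedures below follows this chart's values format, so you can supply it through a values file when you install or upgrade the adapter.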

For KEDA's ScaledObject:

  • Install KEDA v2.15.0 in your cluster (see the installation sketch after this list). KEDA is responsible for scaling workloads based on external metrics, such as custom Prometheus metrics.
  • KEDA 2.15 is compatible with Kubernetes versions 1.28 to 1.30. While it might work on earlier Kubernetes versions, it is outside the supported range and could result in unexpected issues.
  • Install Prometheus in the Kubernetes cluster to scrape metrics from your database instance.
  • Configure your database to expose Prometheus-compatible metrics.
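A minimal KEDA installation sketch, assuming the kedacore Helm chart repository; use the chart version that matches the KEDA release you intend to run:

$ helm repo add kedacore https://kedacore.github.io/charts
$ helm repo update
$ helm install keda kedacore/keda --namespace keda --create-namespace --version 2.15.0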

Subcluster scaling

Automatically adjust the number of subclusters in your custom resource to fine-tune resources for short-running dashboard queries. For example, increase the number of subclusters to increase throughput. For more information, see Improving query throughput using subclusters.

All subclusters share the same service object, so there are no required changes to external service objects. Pods in the new subcluster are load balanced by the existing service object.

The following example creates a VerticaAutoscaler custom resource that scales by subcluster when the average number of active sessions per pod exceeds 50:

Horizontal Pod Autoscaler

  1. Install the Prometheus adapter and configure it to retrieve metrics from Prometheus:

    rules:
      default: false
      custom:
          # Total number of active sessions. Used for testing
        - metricsQuery: sum(vertica_sessions_running_counter{type="active", initiator="user"}) by (namespace, pod)
          resources:
            overrides:
              namespace:
                resource: namespace
              pod:
                resource: pod
          name:
            matches: "^(.*)_counter$"
            as: "${1}_total" # vertica_sessions_running_total
          seriesQuery: vertica_sessions_running_counter{namespace!="", pod!="", type="active", initiator="user"}
    
  2. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1beta1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      scalingGranularity: Subcluster
      serviceName: primary1
      customAutoscaler:
        type: HPA
        hpa:
          minReplicas: 3
          maxReplicas: 12
          metrics:
            - metric:
                type: Pods
                pods:
                  metric:
                    name: vertica_sessions_running_total
                  target:
                    type: AverageValue
                    averageValue: 50
    
  3. This creates a HorizontalPodAutoscaler object with the following configuration:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: v-scale-hpa
    spec:
      maxReplicas: 12
      metrics:
      - type: Pods
        pods:
          metric:
            name: vertica_sessions_running_total
          target:
            type: AverageValue
            averageValue: "50"
      minReplicas: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale
    
  • Sets the target average number of active sessions to 50.
  • Scales between a minimum of three pods in one subcluster and a maximum of 12 pods across four subclusters.
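To confirm that the adapter configured in step 1 exposes the metric and that the generated HorizontalPodAutoscaler tracks it, you can query the custom metrics API and inspect the HPA. A minimal sketch, assuming the resources are deployed in the vertica namespace:

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/vertica/pods/*/vertica_sessions_running_total"
$ kubectl get hpa v-scale-hpa -n vertica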

ScaledObject

  1. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      serviceName: primary1
      scalingGranularity: Subcluster
      customAutoscaler:
        type: ScaledObject
        scaledObject:
          minReplicas: 3
          maxReplicas: 12
          metrics:
          - name: vertica_sessions_running_total
            metricType: AverageValue
            prometheus:
              serverAddress: "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"
              query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
              threshold: 50
    
  2. This creates a ScaledObject object with the following configuration:

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: v-scale-keda
    spec:
      maxReplicaCount: 12
      minReplicaCount: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale
      triggers:
      - metadata:
          query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
          serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
          threshold: "50"
        metricType: AverageValue
        name: vertica_sessions_running_total
        type: prometheus
    
  • Sets the target average number of active sessions to 50.
  • Scales between a minimum of three pods in one subcluster and a maximum of 12 pods across four subclusters.

KEDA directly queries Prometheus using the provided address without relying on the pod selector. The scaling is determined by the result of the query, so ensure the query performs as expected.
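For example, you can run the query against the Prometheus HTTP API before deploying the CR to verify that it returns the value you expect. A minimal sketch, assuming you can reach the Prometheus service through kubectl port-forward and that curl is available:

$ kubectl port-forward -n prometheus svc/prometheus-kube-prometheus-prometheus 9090:9090 &
$ curl -sG "http://localhost:9090/api/v1/query" \
    --data-urlencode 'query=sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})'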

Pod scaling

For long-running, analytic queries, increase the pod count for a subcluster. For additional information about analytic queries, see Using elastic crunch scaling to improve query performance.

When you scale pods in an Eon Mode database, you must consider the impact on database shards. For details, see Namespaces and shards.

The following example creates a VerticaAutoscaler custom resource that scales by pod when the average number of active sessions per pod exceeds 50.

Horizontal Pod Autoscaler

  1. Install the Prometheus adapter and configure it to retrieve metrics from Prometheus:

    rules:
      default: false
      custom:
          # Total number of active sessions. Used for testing
        - metricsQuery: sum(vertica_sessions_running_counter{type="active", initiator="user"}) by (namespace, pod)
          resources:
            overrides:
              namespace:
                resource: namespace
              pod:
                resource: pod
          name:
            matches: "^(.*)_counter$"
            as: "${1}_total" # vertica_sessions_running_total
          seriesQuery: vertica_sessions_running_counter{namespace!="", pod!="", type="active", initiator="user"}
    
  2. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1beta1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      scalingGranularity: Pod
      serviceName: primary1
      customAutoscaler:
        type: HPA
        hpa:
          minReplicas: 3
          maxReplicas: 12
          metrics:
            - metric:
                type: Pods
                pods:
                  metric:
                    name: vertica_sessions_running_total
                  target:
                    type: AverageValue
                    averageValue: 50 
    
  3. This creates a HorizontalPodAutoscaler object with the following configuration:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: v-scale-hpa
    spec:
      maxReplicas: 12
      metrics:
      - type: Pods
        pods:
          metric:
            name: vertica_sessions_running_total
          target:
            type: AverageValue
            averageValue: "50"
      minReplicas: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale 
    
    • Sets the target average number of active sessions to 50.
    • Scales the primary1 subcluster to a minimum of three pods and a maximum of 12 pods. If multiple subclusters are selected by the serviceName, the last one is scaled.

ScaledObject

  1. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      serviceName: primary1
      scalingGranularity: Pod
      customAutoscaler:
        type: ScaledObject
        scaledObject:
          minReplicas: 3
          maxReplicas: 12
          metrics:
          - name: vertica_sessions_running_total
            metricType: AverageValue
            prometheus:
              serverAddress: "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"
              query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
              threshold: 50
    
  2. This creates a ScaledObject object with the following configuration:

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: v-scale-keda
    spec:
      maxReplicaCount: 12
      minReplicaCount: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale
      triggers:
      - metadata:
          query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
          serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
          threshold: "50"
        metricType: AverageValue
        name: vertica_sessions_running_total
        type: prometheus
    
    • Sets the target average number of active sessions to 50.
    • Scales the primary1 subcluster to a minimum of three pods and a maximum of 12 pods. If multiple subclusters are selected by the serviceName, the last one is scaled.

Event monitoring

Horizontal Pod Autoscaler

To view the Horizontal Pod Autoscaler object, use the kubectl describe hpa command:

Name:                                                  v-scale-hpa
Namespace:                                             vertica
Reference:                                             VerticaAutoscaler/vertica.com/v1 v-scale
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Tue, 12 Feb 2024 15:11:28 -0300
Metrics:                       ( current / target )
  "vertica_sessions_running_total" on pods:  5 / 50                                            
Min replicas:                                          3
Max replicas:                                          12
VerticaAutoscaler pods:                                3 current / 3 desired
Conditions:
  Type              Status  Reason                Message
  ----              ------  ------                -------
  AbleToScale       True    ReadyForNewScale      the HPA controller was able to calculate a new replica count
  ScalingActive     True    ValidMetricFound      the HPA was able to successfully calculate a replica count from pods metric vertica_sessions_running_total
  ScalingLimited    False   DesiredWithinRange    the desired replica count is within the acceptable range
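
To follow scaling decisions as they happen, you can watch the HPA and filter the namespace events for it; for example, assuming the vertica namespace:

$ kubectl get hpa v-scale-hpa -n vertica --watch
$ kubectl get events -n vertica --field-selector involvedObject.name=v-scale-hpa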

ScaledObject

To view the ScaledObject, use the kubectl describe scaledobject command:

Name:         v-scale-keda
Namespace:    default
Labels:       <none>
Annotations:  <none>
CreationTimestamp:  <unknown>
Spec:
  Cooldown Period:   30
  Max Replica Count:  12
  Min Replica Count:  3
  Polling Interval:   30
  Scale Target Ref:
    API Version:  vertica.com/v1
    Kind:         VerticaAutoscaler
    Name:         v-scale
  Triggers:
    Metadata:
      Auth Modes:           
      Query:                 sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
      Server Address:        http://prometheus-tls-kube-promet-prometheus.prometheus-tls.svc:9090
      Threshold:             50
      Unsafe Ssl:            false
    Metric Type:             AverageValue
    Name:                    vertica_sessions_running_total
    Type:                    prometheus
    Use Cached Metrics:      false
Conditions:
  Type              Status  Reason                Message
  ----              ------  ------                -------
  Active            True    ScaledObjectActive    ScaledObject is active
  Ready             True    ScaledObjectReady     ScaledObject is ready
  Fallback          False   NoFallback            No fallback was triggered
Events:            <none>  

Viewing scaling events and autoscaler actions

When a scaling event occurs, you can view the newly created pods. Use kubectl to view the StatefulSets:

$ kubectl get statefulsets
NAME                                                   READY   AGE
v-test-v-scale-0                                        0/3     71s
v-test-primary1                                         3/3     39m
v-test-secondary1                                       3/3     39m

Use kubectl describe to view the executing commands:

$ kubectl describe vdb v-test | tail
  Upgrade Status:
Events:
  Type    Reason                   Age   From                Message
  ----    ------                   ----  ----                -------
  Normal  SubclusterAdded          10s   verticadb-operator  Added new subcluster 'v-scale-0'
  Normal  AddNodeStart             9s    verticadb-operator  Starting add database node for pod(s) 'v-test-v-scale-0-0, v-test-v-scale-0-1, v-test-v-scale-0-2'