VerticaAutoscaler custom resource definition

The VerticaAutoscaler custom resource (CR) automatically scales resources for existing subclusters using one of the following strategies:

  • Subcluster scaling for short-running dashboard queries
  • Pod scaling for long-running analytic queries

The VerticaAutoscaler CR scales VerticaDB instances based on resource metrics or custom Prometheus metrics. OpenText™ Analytics Database manages subclusters by workload, which helps you pinpoint the best metrics to trigger a scaling event. To maintain data integrity, the operator does not scale in until all connections to the pods are drained and sessions are closed.

Additionally, the VerticaAutoscaler provides a webhook to validate state changes. By default, this webhook is enabled. You can configure this webhook with the webhook.enable Helm chart parameter.
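For example, if you deploy the operator with Helm, you can set this parameter at install time. The following is a minimal sketch that disables the webhook; the release name vdb-op and the vertica-charts repository alias are assumptions, so adjust them for your deployment:

$ helm repo add vertica-charts https://vertica.github.io/charts
$ helm install vdb-op vertica-charts/verticadb-operator --set webhook.enable=false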

Autoscalers

An autoscaler is a Kubernetes object that dynamically adjusts resource allocation based on metrics. The VerticaAutoscaler CR utilizes two types of autoscalers:

  • Horizontal Pod Autoscaler (HPA) - a native Kubernetes object
  • ScaledObject - a custom resource (CR) owned and managed by the Kubernetes Event-Driven Autoscaling (KEDA) operator

Horizontal Pod Autoscaler (HPA) vs Kubernetes Event-Driven Autoscaling (KEDA) and ScaledObject

In Kubernetes, both the Horizontal Pod Autoscaler (HPA) and Kubernetes Event-Driven Autoscaling (KEDA)'s ScaledObject enable automatic pod scaling based on specific metrics. However, they differ in their operation and the types of metrics they utilize for scaling.

Horizontal Pod Autoscaler

HPA is a built-in Kubernetes resource that automatically scales the number of pods in a deployment based on observed CPU utilization, memory usage, or custom application-specific metrics.

Key features:

  • Metrics: HPA primarily scales based on CPU utilization and memory usage sourced from the Metrics Server, or on custom metrics exposed through an adapter such as the Prometheus Adapter.
  • Scaling Trigger: HPA monitors the metric values (for example, CPU utilization) and compares them to a defined target (typically a percentage, such as 50% CPU utilization). If the actual value exceeds the target, it scales up the number of pods; if it falls below the target, it scales down accordingly.
  • Limitations: HPA is effective for resource-based scaling, but may face challenges when scaling based on event-driven triggers (such as message queues or incoming requests) unless combined with custom metrics.

For more information about the algorithm that determines when the HPA scales, see the Kubernetes documentation.
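For example, with a target average of 50 active sessions per pod, if three pods currently average 100 active sessions, the HPA computes ceil(3 × 100 / 50) = 6 desired replicas; if the average falls to 25, it computes ceil(3 × 25 / 50) = 2, bounded by minReplicas and maxReplicas.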

KEDA's ScaledObject

Kubernetes Event-Driven Autoscaling (KEDA) is designed to scale workloads based on event-driven metrics.

Key features:

  • Integrates with external event sources (like Prometheus).
  • Supports scaling to zero, ensuring no pods run when there is no demand.
  • Utilizes KEDA’s ScaledObject Custom Resource Definition (CRD).
  • Works with HPA internally, with KEDA managing the HPA configuration.

Following is a feature comparison between HPA and KEDA ScaledObject:

Feature          | HPA (Horizontal Pod Autoscaler)       | KEDA (ScaledObject)
Scaling trigger  | CPU, memory, custom metrics           | CPU, memory, custom metrics, external events (queues, databases, and so on)
Uses HPA?        | Native Kubernetes feature             | Uses HPA internally
Complexity       | Simple                                | Requires KEDA installation
Flexibility      | Focused on resource-based scaling     | Integrates with multiple external event sources
Response time    | Delayed (depends on the Metrics API)  | Faster (direct event triggers)

Examples

The examples in this section use the following VerticaDB custom resource. Each example uses the number of active sessions to trigger scaling:

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: v-test
spec:
  communal:
    path: "path/to/communal-storage"
    endpoint: "path/to/communal-endpoint"
    credentialSecret: credentials-secret
  licenseSecret: license
  subclusters:
    - name: pri1
      size: 3
      type: primary
      serviceName: primary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"
    - name: sec1
      size: 3
      type: secondary
      serviceName: secondary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"

Prerequisites

For HPA autoscaler:

  • Configure the Metrics Server for basic resource metrics, along with a custom metrics provider such as the Prometheus Adapter (see the installation sketch after this list).
  • Install Prometheus in the Kubernetes cluster to scrape metrics from your database instance.
  • Install Prometheus Adapter to make custom metrics available to HPA. This adapter exposes the custom metrics scraped by Prometheus to the Kubernetes API server (only necessary if scaling based on Prometheus metrics).
  • Configure Custom Metrics API (Prometheus Adapter) to expose custom Prometheus metrics.
  • Configure the database to expose Prometheus-compatible metrics.
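The Prometheus Adapter is commonly installed with Helm. The following is a minimal sketch, assuming the prometheus-community chart repository and a Prometheus service reachable at prometheus-kube-prometheus-prometheus.prometheus.svc on port 9090 (the same address used in the examples below); adjust the URL and namespace for your environment:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus-adapter prometheus-community/prometheus-adapter \
    --namespace prometheus \
    --set prometheus.url=http://prometheus-kube-prometheus-prometheus.prometheus.svc \
    --set prometheus.port=9090

The rules block shown in the procedures below follows this chart's values format, so you can supply it through a values file when you install or upgrade the adapter.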

For KEDA's ScaledObject:

  • Install KEDA v2.15.0 in your cluster (see the installation sketch after this list). KEDA is responsible for scaling workloads based on external metrics, such as custom Prometheus metrics.
  • KEDA 2.15 is compatible with Kubernetes versions 1.28 to 1.30. While it might work on earlier Kubernetes versions, it is outside the supported range and could result in unexpected issues.
  • Install Prometheus in the Kubernetes cluster to scrape metrics from your database instance.
  • Configure your database to expose Prometheus-compatible metrics.
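A minimal KEDA installation sketch, assuming the kedacore Helm chart repository; use the chart version that matches the KEDA release you intend to run:

$ helm repo add kedacore https://kedacore.github.io/charts
$ helm repo update
$ helm install keda kedacore/keda --namespace keda --create-namespace --version 2.15.0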

Subcluster scaling

Automatically adjust the number of subclusters in your custom resource to fine-tune resources for short-running dashboard queries. For example, increase the number of subclusters to increase throughput. For more information, see Improving query throughput using subclusters.

All subclusters share the same service object, so there are no required changes to external service objects. Pods in the new subcluster are load balanced by the existing service object.

The following example creates a VerticaAutoscaler custom resource that scales by subcluster when the average number of active sessions per pod exceeds 50:

Horizontal Pod Autoscaler

  1. Install the Prometheus adapter and configure it to retrieve metrics from Prometheus:

    rules:
      default: false
      custom:
          # Total number of active sessions. Used for testing
        - metricsQuery: sum(vertica_sessions_running_counter{type="active", initiator="user"}) by (namespace, pod)
          resources:
            overrides:
              namespace:
                resource: namespace
              pod:
                resource: pod
          name:
            matches: "^(.*)_counter$"
            as: "${1}_total" # vertica_sessions_running_total
          seriesQuery: vertica_sessions_running_counter{namespace!="", pod!="", type="active", initiator="user"}
    
  2. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1beta1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      scalingGranularity: Subcluster
      serviceName: primary1
      customAutoscaler:
        type: HPA
        hpa:
          minReplicas: 3
          maxReplicas: 12
          metrics:
            - metric:
                type: Pods
                pods:
                  metric:
                    name: vertica_sessions_running_total
                  target:
                    type: AverageValue
                    averageValue: 50
    
  3. This creates a HorizontalPodAutoscaler object with the following configuration:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: v-scale-hpa
    spec:
      maxReplicas: 12
      metrics:
      - type: Pods
        pods:
          metric:
            name: vertica_sessions_running_total
          target:
            type: AverageValue
            averageValue: "50"
      minReplicas: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale
    
  • Sets the target average number of active sessions to 50.
  • Scales between a minimum of three pods in one subcluster and a maximum of 12 pods across four subclusters.
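To confirm that the adapter configured in step 1 exposes the metric and that the generated HorizontalPodAutoscaler tracks it, you can query the custom metrics API and inspect the HPA. A minimal sketch, assuming the resources are deployed in the vertica namespace:

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/vertica/pods/*/vertica_sessions_running_total"
$ kubectl get hpa v-scale-hpa -n vertica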

ScaledObject

  1. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      serviceName: primary1
      scalingGranularity: Subcluster
      customAutoscaler:
        type: ScaledObject
        scaledObject:
          minReplicas: 3
          maxReplicas: 12
          metrics:
          - name: vertica_sessions_running_total
            metricType: AverageValue
            prometheus:
              serverAddress: "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"
              query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
              threshold: 50
    
  2. This creates a ScaledObject object with the following configuration:

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: v-scale-keda
    spec:
      maxReplicaCount: 12
      minReplicaCount: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale
      triggers:
      - metadata:
          query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
          serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
          threshold: "50"
        metricType: AverageValue
        name: vertica_sessions_running_total
        type: prometheus
    
  • Sets the target average number of active sessions to 50.
  • Scales between a minimum of three pods in one subcluster and a maximum of 12 pods across four subclusters.

KEDA directly queries Prometheus using the provided address without relying on the pod selector. The scaling is determined by the result of the query, so ensure the query performs as expected.
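For example, you can run the query against the Prometheus HTTP API before deploying the CR to verify that it returns the value you expect. A minimal sketch, assuming you can reach the Prometheus service through kubectl port-forward and that curl is available:

$ kubectl port-forward -n prometheus svc/prometheus-kube-prometheus-prometheus 9090:9090 &
$ curl -sG "http://localhost:9090/api/v1/query" \
    --data-urlencode 'query=sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})'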

Pod scaling

For long-running, analytic queries, increase the pod count for a subcluster. For additional information about analytic queries, see Using elastic crunch scaling to improve query performance.

When you scale pods in an Eon Mode database, you must consider the impact on database shards. For details, see Namespaces and shards.

The following example creates a VerticaAutoscaler custom resource that scales by pod when the average number of active sessions per pod exceeds 50.

Horizontal Pod Autoscaler

  1. Install the Prometheus adapter and configure it to retrieve metrics from Prometheus:

    rules:
      default: false
      custom:
          # Total number of active sessions. Used for testing
        - metricsQuery: sum(vertica_sessions_running_counter{type="active", initiator="user"}) by (namespace, pod)
          resources:
            overrides:
              namespace:
                resource: namespace
              pod:
                resource: pod
          name:
            matches: "^(.*)_counter$"
            as: "${1}_total" # vertica_sessions_running_total
          seriesQuery: vertica_sessions_running_counter{namespace!="", pod!="", type="active", initiator="user"}
    
  2. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1beta1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      scalingGranularity: Pod
      serviceName: primary1
      customAutoscaler:
        type: HPA
        hpa:
          minReplicas: 3
          maxReplicas: 12
          metrics:
            - metric:
                type: Pods
                pods:
                  metric:
                    name: vertica_sessions_running_total
                  target:
                    type: AverageValue
                    averageValue: 50 
    
  3. This creates a HorizontalPodAutoscaler object with the following configuration:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: v-scale-hpa
    spec:
      maxReplicas: 12
      metrics:
      - type: Pods
        pods:
          metric:
            name: vertica_sessions_running_total
          target:
            type: AverageValue
            averageValue: "50"
      minReplicas: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale 
    
    • Sets the target average number of active sessions to 50.
    • Scales the primary1 subcluster to a minimum of three pods and a maximum of 12 pods. If multiple subclusters are selected by the serviceName, the last one is scaled.

ScaledObject

  1. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

    apiVersion: vertica.com/v1
    kind: VerticaAutoscaler
    metadata:
      name: v-scale
    spec:
      verticaDBName: v-test
      serviceName: primary1
      scalingGranularity: Pod
      customAutoscaler:
        type: ScaledObject
        scaledObject:
          minReplicas: 3
          maxReplicas: 12
          metrics:
          - name: vertica_sessions_running_total
            metricType: AverageValue
            prometheus:
              serverAddress: "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"
              query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
              threshold: 50
    
  2. This creates a ScaledObject object with the following configuration:

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: v-scale-keda
    spec:
      maxReplicaCount: 12
      minReplicaCount: 3
      scaleTargetRef:
        apiVersion: vertica.com/v1
        kind: VerticaAutoscaler
        name: v-scale
      triggers:
      - metadata:
          query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
          serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
          threshold: "50"
        metricType: AverageValue
        name: vertica_sessions_running_total
        type: prometheus
    
    • Sets the target average number of active sessions to 50.
    • Scales the primary1 subcluster to a minimum of three pods and a maximum of 12 pods. If multiple subclusters are selected by the serviceName, the last one is scaled.

Event monitoring

Horizontal Pod Autoscaler

To view the Horizontal Pod Autoscaler object, use the kubectl describe hpa command:

Name:                                                  v-scale-hpa
Namespace:                                             vertica
Reference:                                             VerticaAutoscaler/vertica.com/v1 v-scale
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Tue, 12 Feb 2024 15:11:28 -0300
Metrics:                       ( current / target )
  "vertica_sessions_running_total" on pods:  5 / 50                                            
Min replicas:                                          3
Max replicas:                                          12
VerticaAutoscaler pods:                                3 current / 3 desired
Conditions:
  Type              Status  Reason                Message
  ----              ------  ------                -------
  AbleToScale       True    ReadyForNewScale      the HPA controller was able to calculate a new replica count
  ScalingActive     True    ValidMetricFound      the HPA was able to successfully calculate a replica count from pods metric vertica_sessions_running_total
  ScalingLimited    False   DesiredWithinRange    the desired replica count is within the acceptable range
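
To follow scaling decisions as they happen, you can watch the HPA and filter the namespace events for it; for example, assuming the vertica namespace:

$ kubectl get hpa v-scale-hpa -n vertica --watch
$ kubectl get events -n vertica --field-selector involvedObject.name=v-scale-hpa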

ScaledObject

To view the ScaledObject, use the kubectl describe scaledobject command:

Name:         v-scale-keda
Namespace:    default
Labels:       <none>
Annotations:  <none>
CreationTimestamp:  <unknown>
Spec:
  Cooldown Period:   30
  Max Replica Count:  12
  Min Replica Count:  3
  Polling Interval:   30
  Scale Target Ref:
    API Version:  vertica.com/v1
    Kind:         VerticaAutoscaler
    Name:         v-scale
  Triggers:
    Metadata:
      Auth Modes:           
      Query:                 sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
      Server Address:        http://prometheus-tls-kube-promet-prometheus.prometheus-tls.svc:9090
      Threshold:             50
      Unsafe Ssl:            false
    Metric Type:             AverageValue
    Name:                    vertica_sessions_running_total
    Type:                    prometheus
    Use Cached Metrics:      false
Conditions:
  Type              Status  Reason                Message
  ----              ------  ------                -------
  Active            True    ScaledObjectActive    ScaledObject is active
  Ready             True    ScaledObjectReady     ScaledObject is ready
  Fallback          False   NoFallback            No fallback was triggered
Events:            <none>  

Viewing scaling events and autoscaler actions

When a scaling event occurs, you can view the newly created pods. Use kubectl to view the StatefulSets:

$ kubectl get statefulsets
NAME                                                   READY   AGE
v-test-v-scale-0                                        0/3     71s
v-test-primary1                                         3/3     39m
v-test-secondary1                                       3/3     39m

Use kubectl describe to view the executing commands:

$ kubectl describe vdb v-test | tail
  Upgrade Status:
Events:
  Type    Reason                   Age   From                Message
  ----    ------                   ----  ----                -------
  Normal  SubclusterAdded          10s   verticadb-operator  Added new subcluster 'v-scale-0'
  Normal  AddNodeStart             9s    verticadb-operator  Starting add database node for pod(s) 'v-test-v-scale-0-0, v-test-v-scale-0-1, v-test-v-scale-0-2'