VerticaAutoscaler custom resource definition
The VerticaAutoscaler custom resource (CR) automatically scales resources for existing subclusters using one of the following strategies:
- Subcluster scaling for short-running dashboard queries
- Pod scaling for long-running analytic queries
The VerticaAutoscaler CR scales VerticaDB instances using resource metrics or custom Prometheus metrics. OpenText™ Analytics Database manages subclusters by workload, which helps you pinpoint the best metrics to trigger a scaling event. To maintain data integrity, the operator does not scale in until all connections to the pods are drained and sessions are closed.
Additionally, the VerticaAutoscaler provides a webhook to validate state changes. By default, this webhook is enabled. You can configure this webhook with the webhook.enable Helm chart parameter.
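For example, if the operator was installed from the vertica-charts Helm repository under the release name vdb-op (both names are illustrative), the following command disables the webhook:

$ helm upgrade vdb-op vertica-charts/verticadb-operator --reuse-values --set webhook.enable=false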
Autoscalers
An autoscaler is a Kubernetes object that dynamically adjusts resource allocation based on metrics. The VerticaAutoscaler CR utilizes two types of autoscalers:
- Horizontal Pod Autoscaler (HPA) - a native Kubernetes object
- Scaled Object - a custom resource (CR) owned and managed by the Kubernetes Event-Driven Autoscaling (KEDA) operator.
Horizontal Pod Autoscaler (HPA) vs Kubernetes Event-Driven Autoscaling (KEDA) and ScaledObject
In Kubernetes, both the Horizontal Pod Autoscaler (HPA) and Kubernetes Event-Driven Autoscaling (KEDA)'s ScaledObject enable automatic pod scaling based on specific metrics. However, they differ in their operation and the types of metrics they utilize for scaling.
Horizontal Pod Autoscaler
HPA is a built-in Kubernetes resource that automatically scales the number of pods in a deployment based on observed CPU utilization or custom metrics, such as memory usage or other application-specific metrics.
Key features:
- Metrics: HPA scales based on CPU utilization and memory usage from the Metrics Server, or on custom metrics exposed through an adapter such as the Prometheus Adapter.
- Scaling Trigger: HPA monitors the metric values (for example, CPU utilization) and compares them to a defined target (typically a percentage, such as 50% CPU utilization). If the actual value exceeds the target, it scales up the number of pods; if it falls below the target, it scales down accordingly.
- Limitations: HPA is effective for resource-based scaling, but may face challenges when scaling on event-driven triggers (such as message queues or incoming requests) unless it is combined with custom metrics.
For more information about the algorithm that determines when the HPA scales, see the Kubernetes documentation.
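In short, the HPA computes the desired replica count from the ratio of the current metric value to the target:

desiredReplicas = ceil[ currentReplicas * ( currentMetricValue / desiredMetricValue ) ]

For example, if three pods report an average of 100 active sessions against a target of 50, the HPA requests ceil(3 * 100 / 50) = 6 replicas.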
KEDA's ScaledObject
Kubernetes Event-Driven Autoscaling (KEDA) is designed to scale workloads based on event-driven metrics.
Key features:
- Integrates with external event sources (like Prometheus).
- Supports scaling to zero, ensuring no pods run when there is no demand.
- Utilizes KEDA’s ScaledObject Custom Resource Definition (CRD).
- Works with HPA internally, with KEDA managing the HPA configuration.
Following is a feature comparison between HPA and KEDA ScaledObject:
| Feature | HPA (Horizontal Pod Autoscaler) | KEDA (ScaledObject) |
|---|---|---|
| Scaling trigger | CPU, memory, custom metrics | CPU, memory, custom metrics, external events (queues, databases, and so on) |
| Relationship to HPA | Native Kubernetes feature | Uses HPA internally |
| Complexity | Simple | Requires KEDA installation |
| Flexibility | Focused on resource-based scaling | Integrates with multiple external event sources |
| Response time | Delayed (depends on Metrics API) | Faster (direct event triggers) |
Examples
The examples in this section use the following VerticaDB custom resource. Each example uses the number of active sessions to trigger scaling:
apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: v-test
spec:
  communal:
    path: "path/to/communal-storage"
    endpoint: "path/to/communal-endpoint"
    credentialSecret: credentials-secret
  licenseSecret: license
  subclusters:
    - name: pri1
      size: 3
      type: primary
      serviceName: primary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"
    - name: sec1
      size: 3
      type: secondary
      serviceName: secondary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"
Prerequisites
- Complete Installing the VerticaDB operator.
- Install the kubectl command line tool.
- Complete VerticaDB custom resource definition.
- Confirm that you have the resources to scale.
Note
By default, the custom resource uses the free Community Edition (CE) license. This license allows you to deploy up to three nodes with a maximum of 1TB of data. To add resources beyond these limits, you must add your Vertica license to the custom resource as described in VerticaDB custom resource definition.
- To scale based on CPU utilization, you must set CPU limits and requests.
For the HPA autoscaler:
- Configure the Metrics Server for basic resource metrics, along with a custom metrics provider such as the Prometheus Adapter.
- Install Prometheus in the Kubernetes cluster to scrape metrics from your database instance (see the example Helm commands after this list).
- Install the Prometheus Adapter to make custom metrics available to the HPA. The adapter exposes the custom metrics scraped by Prometheus to the Kubernetes API server (only necessary if you scale based on Prometheus metrics).
- Configure the Custom Metrics API (Prometheus Adapter) to expose custom Prometheus metrics.
- Configure the database to expose Prometheus-compatible metrics.
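The following commands sketch one way to install these components with Helm. The release names and the prometheus namespace are examples only; adjust them for your environment:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
$ helm repo update
$ helm install metrics-server metrics-server/metrics-server --namespace kube-system
$ helm install prometheus prometheus-community/kube-prometheus-stack --namespace prometheus --create-namespace
$ helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace prometheus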
For KEDA's ScaledObject:
- Install KEDA v2.15.0 in your cluster (see the example Helm commands after this list). KEDA is responsible for scaling workloads based on external metrics, such as custom Prometheus metrics.
- KEDA 2.15 is compatible with Kubernetes versions 1.28 to 1.30. While it might work on earlier Kubernetes versions, they are outside the supported range and could result in unexpected issues.
- Install Prometheus in the Kubernetes cluster to scrape metrics from your database instance.
- Configure your database to expose Prometheus-compatible metrics.
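A minimal KEDA installation with Helm might look like the following. The keda namespace and release name are examples, and you should confirm that the chart version you choose installs KEDA v2.15.0:

$ helm repo add kedacore https://kedacore.github.io/charts
$ helm repo update
$ helm install keda kedacore/keda --namespace keda --create-namespace --version 2.15.0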
Subcluster scaling
Automatically adjust the number of subclusters in your custom resource to fine-tune resources for short-running dashboard queries. For example, increase the number of subclusters to increase throughput. For more information, see Improving query throughput using subclusters.
All subclusters share the same service object, so there are no required changes to external service objects. Pods in the new subcluster are load balanced by the existing service object.
The following example creates a VerticaAutoscaler custom resource that scales by subcluster when the average number of active sessions is 50:
Horizontal Pod Autoscaler
- Install the Prometheus adapter and configure it to retrieve metrics from Prometheus:

  rules:
    default: false
    custom:
      # Total number of active sessions. Used for testing
      - metricsQuery: sum(vertica_sessions_running_counter{type="active", initiator="user"}) by (namespace, pod)
        resources:
          overrides:
            namespace:
              resource: namespace
            pod:
              resource: pod
        name:
          matches: "^(.*)_counter$"
          as: "${1}_total" # vertica_sessions_running_total
        seriesQuery: vertica_sessions_running_counter{namespace!="", pod!="", type="active", initiator="user"}
- Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

  apiVersion: vertica.com/v1beta1
  kind: VerticaAutoscaler
  metadata:
    name: v-scale
  spec:
    verticaDBName: v-test
    scalingGranularity: Subcluster
    serviceName: primary1
    customAutoscaler:
      type: HPA
      hpa:
        minReplicas: 3
        maxReplicas: 12
        metrics:
          - metric:
              type: Pods
              pods:
                metric:
                  name: vertica_sessions_running_total
                target:
                  type: AverageValue
                  averageValue: 50
- This creates a HorizontalPodAutoscaler object with the following configuration:

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: v-scale-hpa
  spec:
    maxReplicas: 12
    metrics:
      - type: Pods
        pods:
          metric:
            name: vertica_sessions_running_total
          target:
            type: AverageValue
            averageValue: "50"
    minReplicas: 3
    scaleTargetRef:
      apiVersion: vertica.com/v1
      kind: VerticaAutoscaler
      name: v-scale
- Sets the target average number of active sessions to 50.
- Scales to a minimum of three pods in one subcluster and a maximum of 12 pods across four subclusters.
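Before the HPA can act on the metric, the Prometheus Adapter must expose it through the custom metrics API. As a quick check, you can query that API directly; the vertica namespace here is an assumption based on the sample output later in this section:

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/vertica/pods/*/vertica_sessions_running_total"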
ScaledObject
- Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

  apiVersion: vertica.com/v1
  kind: VerticaAutoscaler
  metadata:
    name: v-scale
  spec:
    verticaDBName: v-test
    serviceName: primary1
    scalingGranularity: Subcluster
    customAutoscaler:
      type: ScaledObject
      scaledObject:
        minReplicas: 3
        maxReplicas: 12
        metrics:
          - name: vertica_sessions_running_total
            metricType: AverageValue
            prometheus:
              serverAddress: "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"
              query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
              threshold: 50
- This creates a ScaledObject with the following configuration:

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: v-scale-keda
  spec:
    maxReplicaCount: 12
    minReplicaCount: 3
    scaleTargetRef:
      apiVersion: vertica.com/v1
      kind: VerticaAutoscaler
      name: v-scale
    triggers:
      - metadata:
          query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
          serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
          threshold: "50"
        metricType: AverageValue
        name: vertica_sessions_running_total
        type: prometheus
- Sets the target average number of active sessions to 50.
- Scales to a minimum of three pods in one subcluster and a maximum of 12 pods across four subclusters.
KEDA directly queries Prometheus using the provided address without relying on the pod selector. The scaling is determined by the result of the query, so ensure the query performs as expected.
Note
The service label must be set so that Prometheus returns metrics for that service only. spec.serviceName is the partial service name without the VerticaDB name as a prefix. In the Prometheus query, the service label must be the actual Kubernetes service object name, which has the VerticaDB name prepended (for example, v-test-primary1).
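Because KEDA evaluates the PromQL query directly, you can test the query against the Prometheus HTTP API before deploying the autoscaler. The following command assumes it runs from a pod inside the cluster and reuses the server address and query from the manifest above:

$ curl -sG "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090/api/v1/query" \
    --data-urlencode 'query=sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})'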
Pod scaling
For long-running analytic queries, increase the pod count for a subcluster. For additional information about analytic queries, see Using elastic crunch scaling to improve query performance.
When you scale pods in an Eon Mode database, you must consider the impact on database shards. For details, see Namespaces and shards.
The following example creates a VerticaAutoscaler custom resource that scales by pod when the average number of active sessions is 50.
Horizontal Pod Autoscaler
- Install the Prometheus adapter and configure it to retrieve metrics from Prometheus:

  rules:
    default: false
    custom:
      # Total number of active sessions. Used for testing
      - metricsQuery: sum(vertica_sessions_running_counter{type="active", initiator="user"}) by (namespace, pod)
        resources:
          overrides:
            namespace:
              resource: namespace
            pod:
              resource: pod
        name:
          matches: "^(.*)_counter$"
          as: "${1}_total" # vertica_sessions_running_total
        seriesQuery: vertica_sessions_running_counter{namespace!="", pod!="", type="active", initiator="user"}
- Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

  apiVersion: vertica.com/v1beta1
  kind: VerticaAutoscaler
  metadata:
    name: v-scale
  spec:
    verticaDBName: v-test
    scalingGranularity: Pod
    serviceName: primary1
    customAutoscaler:
      type: HPA
      hpa:
        minReplicas: 3
        maxReplicas: 12
        metrics:
          - metric:
              type: Pods
              pods:
                metric:
                  name: vertica_sessions_running_total
                target:
                  type: AverageValue
                  averageValue: 50
- This creates a HorizontalPodAutoscaler object with the following configuration:

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: v-scale-hpa
  spec:
    maxReplicas: 12
    metrics:
      - type: Pods
        pods:
          metric:
            name: vertica_sessions_running_total
          target:
            type: AverageValue
            averageValue: "50"
    minReplicas: 3
    scaleTargetRef:
      apiVersion: vertica.com/v1
      kind: VerticaAutoscaler
      name: v-scale
- Sets the target average number of active sessions to 50.
- Scales the primary1 subcluster to a minimum of three pods and a maximum of 12 pods. If multiple subclusters are selected by the serviceName, the last one is scaled.
ScaledObject
- Define the VerticaAutoscaler custom resource in a YAML-formatted manifest and deploy it:

  apiVersion: vertica.com/v1
  kind: VerticaAutoscaler
  metadata:
    name: v-scale
  spec:
    verticaDBName: v-test
    serviceName: primary1
    scalingGranularity: Pod
    customAutoscaler:
      type: ScaledObject
      scaledObject:
        minReplicas: 3
        maxReplicas: 12
        metrics:
          - name: vertica_sessions_running_total
            metricType: AverageValue
            prometheus:
              serverAddress: "http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090"
              query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
              threshold: 50
- This creates a ScaledObject with the following configuration:

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: v-scale-keda
  spec:
    maxReplicaCount: 12
    minReplicaCount: 3
    scaleTargetRef:
      apiVersion: vertica.com/v1
      kind: VerticaAutoscaler
      name: v-scale
    triggers:
      - metadata:
          query: sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
          serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
          threshold: "50"
        metricType: AverageValue
        name: vertica_sessions_running_total
        type: prometheus
- Sets the target average number of active sessions to 50.
- Scales the primary1 subcluster to a minimum of three pods and a maximum of 12 pods. If multiple subclusters are selected by the serviceName, the last one is scaled.
Event monitoring
Horizontal Pod Autoscaler
To view the Horizontal Pod Autoscaler object, use the kubectl describe hpa command:
Name: v-scale-hpa
Namespace: vertica
Reference: VerticaAutoscaler/vertica.com/v1 v-scale
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 12 Feb 2024 15:11:28 -0300
Metrics: ( current / target )
"vertica_sessions_running_total" on pods: 5 / 50
Min replicas: 3
Max replicas: 12
VerticaAutoscaler pods: 3 current / 3 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale the HPA controller was able to calculate a new replica count
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric vertica_sessions_running_total
ScalingLimited False DesiredWithinRange the desired replica count is within the acceptable range
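For a condensed view, kubectl get hpa shows the current and target metric values in the TARGETS column. With the example above, the output looks similar to the following:

$ kubectl get hpa v-scale-hpa -n vertica
NAME          REFERENCE                   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
v-scale-hpa   VerticaAutoscaler/v-scale   5/50      3         12        3          15m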
ScaledObject
To view the ScaledObject, use the kubectl describe scaledobject command:
Name: v-scale-keda
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: <unknown>
Spec:
  Cooldown Period:   30
  Max Replica Count: 12
  Min Replica Count: 3
  Polling Interval:  30
  Scale Target Ref:
    API Version: vertica.com/v1
    Kind:        VerticaAutoscaler
    Name:        v-scale
  Triggers:
    Metadata:
      Auth Modes:
      Query:           sum(vertica_sessions_running_counter{type="active", initiator="user", service="v-test-primary1"})
      Server Address:  http://prometheus-kube-prometheus-prometheus.prometheus.svc.cluster.local:9090
      Threshold:       50
      Unsafe Ssl:      false
    Metric Type:         AverageValue
    Name:                vertica_sessions_running_total
    Type:                prometheus
    Use Cached Metrics:  false
Conditions:
Type Status Reason Message
---- ------ ------ -------
Active True ScaledObjectActive ScaledObject is active
Ready True ScaledObjectReady ScaledObject is ready
Fallback False NoFallback No fallback was triggered
Events: <none>
Viewing scaling events and autoscaler actions
When a scaling event occurs, you can view the newly created pods. Use kubectl to view the StatefulSets:
$ kubectl get statefulsets
NAME                READY   AGE
v-test-v-scale-0    0/3     71s
v-test-primary1     3/3     39m
v-test-secondary1   3/3     39m
Use kubectl describe to view the executing commands:
$ kubectl describe vdb v-test | tail
Upgrade Status:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SubclusterAdded 10s verticadb-operator Added new subcluster 'v-scale-0'
Normal AddNodeStart 9s verticadb-operator Starting add database node for pod(s) 'v-test-v-scale-0-0, v-test-v-scale-0-1, v-test-v-scale-0-2'
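Because all subclusters behind the same serviceName share one service object, you can also confirm that the pods of the new subcluster were added as endpoints of the existing service. The service name below follows the sample VerticaDB in this section:

$ kubectl get endpoints v-test-primary1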