This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Containerized Vertica

Vertica leverages container technology to meet the needs of modern application development and operations workflows that must deliver software quickly and efficiently across a variety of infrastructures.

Vertica Eon Mode leverages container technology to meet the needs of modern application development and operations workflows that must deliver software quickly and efficiently across a variety of infrastructures. Containerized Vertica supports Kubernetes with automation tools to help maintain the state of your environment with minimal disruptions and manual intervention.

Containerized Vertica provides the following benefits:

  • Performance: Eon Mode separates compute from storage, which provides the optimal architecture for stateful, containerized applications. Eon Mode subclusters can target specific workloads and scale elastically according to the current computational needs.

  • High availability: Vertica containers provide a consistent, repeatable environment that you can deploy quickly. If a database host or service fails, you can easily replace the resource.

  • Resource utilization: A container is a runtime environment that packages an application and its dependencies in an isolated process. This isolation allows containerized applications to share hardware without interference, providing granular resource control and cost savings.

  • Flexibility: Kubernetes is the de facto container orchestration platform. It is supported by a large ecosystem of public and private cloud providers.

Containerized Vertica ecosystem

Vertica provides various tools and artifacts for production and development environments. The containerized Vertica ecosystem includes the following:

  • Vertica Helm chart: Helm is a Kubernetes package manager that bundles into a single package the YAML manifests that deploy Kubernetes objects. Download Vertica Helm charts from the Vertica Helm Charts Repository.

  • Custom Resource Definition (CRD): A CRD is a shared global object that extends the Kubernetes API with your custom resource types. You can use a CRD to instantiate a custom resource (CR), a deployable object with a desired state. Vertica provides CRDs that deploy and support the Eon Mode architecture on Kubernetes.

  • VerticaDB Operator: The operator is a custom controller that monitors the state of your CR and automates administrator tasks. If the current state differs from the declared state, the operator works to correct the current state.

  • Admission controller: The admission controller uses a webhook that the operator queries to verify changes to mutable states in a CR.

  • VerticaDB vlogger: The vlogger is a lightweight image used to deploy a sidecar utility container. The sidecar sends logs from vertica.log in the Vertica server container to standard output on the host node to simplify log aggregation.

  • Vertica Community Edition (CE) image: The CE image is the containerized version of the limited Enterprise Mode Vertica community edition (CE) license. The CE image provides a test environment consisting of an example database and developer tools.

    In addition to the pre-built CE image, you can build a custom CE image with the tools provided in the Vertica one-node-ce GitHub repository.

  • Communal Storage Options: Vertica supports a variety of public and private cloud storage providers. For a list of supported storage providers, see Containerized environments.

  • Kafka integration: Containerized Kafka Scheduler provides a CRD and Helm chart to create and launch the Vertica Kafka scheduler, a standalone Java application that automatically consumes data from one or more Kafka topics and then loads the structured data into Vertica.

Repositories

Vertica maintains the following open source projects on GitHub:

  • vertica-kubernetes: Vertica on Kubernetes is an open source project that welcomes contributions from the Vertica community. The vertica-kubernetes GitHub repository contains all the source code for the Vertica on Kubernetes integration, and includes contributing guidelines and instructions on how to set up development workflows.

  • vcluster: vcluster is a Go library that uses a high-level REST interface to perform database operations with the Node Management Agent (NMA) and HTTPS service. The vclusterops library replaces Administration tools (admintools), a traditional command-line interface that executes administrator commands through STDIN and required SSH keys for internal node communications. The vclusterops deployment is more efficient in containerized environments than the admintools deployment.

    vcluster is an open source project, so you can build custom implementations with the library. For details about migrating your existing admintools deployment to vcluster, see Upgrading Vertica on Kubernetes.

  • vertica-containers: GitHub repository that contains source code for the following container-based projects:

1 - Containerized Vertica on Kubernetes

Kubernetes is an open-source container orchestration platform that automatically manages infrastructure resources and schedules tasks for containerized applications at scale.

Kubernetes is an open-source container orchestration platform that automatically manages infrastructure resources and schedules tasks for containerized applications at scale. Kubernetes achieves automation with a declarative model that decouples the application from the infrastructure. The administrator provides Kubernetes the desired state of an application, and Kubernetes deploys the application and works to maintain that desired state. This frees the administrator to update the application as business needs evolve, without worrying about the implementation details.

An application consists of resources, which are stateful objects that you create from Kubernetes resource types. Kubernetes provides access to resource types through the Kubernetes API, an HTTP API that exposes resource types as endpoints. The most common way to create a resource is with a YAML-formatted manifest file that defines the desired state of the resource. You use the kubectl command-line tool to request a resource instance of that type from the Kubernetes API. In addition to the default resource types, you can extend the Kubernetes API and define your own resource types as a Custom Resource Definition (CRD).

To manage the infrastructure, Kubernetes uses a host to run the control plane, and designates one or more hosts as worker nodes. The control plane is a collection of services and controllers that maintain the desired state of Kubernetes objects and schedule tasks on worker nodes. Worker nodes complete tasks that the control plane assigns. Just as you can create a CRD to extend the Kubernetes API, you can create a custom controller that maintains the state of your custom resources (CR) created from the CRD.

Vertica custom resource definition and custom controller

The VerticaDB CRD extends the Kubernetes API so that you can create custom resources that deploy an Eon Mode database as a StatefulSet. In addition, Vertica provides the VerticaDB operator, a custom controller that maintains the desired state of your CR and automates lifecycle tasks. The result is a self-healing, highly-available, and scalable Eon Mode database that requires minimal manual intervention.

To simplify deployment, Vertica packages the CRD and the operator in Helm charts. A Helm chart bundles manifest files into a single package to create multiple resource type objects with a single command.

Custom resource definition architecture

The Vertica CRD creates a StatefulSet, a workload resource type that persists data with ephemeral Kubernetes objects. The following diagram describes the Vertica CRD architecture:

VerticaDB operator

The VerticaDB operator is a cluster-scoped custom controller that maintains the state of custom objects and automates administrator tasks across all namespaces in the cluster. The operator watches objects and compares their current state to the desired state declared in the custom resource. When the current state does not match the desired state, the operator works to restore the objects to the desired state.

In addition to state maintenance, the operator:

  • Installs Vertica

  • Creates an Eon Mode database

  • Upgrades Vertica

  • Revives an existing Eon Mode database

  • Restarts and reschedules DOWN pods

  • Scales subclusters

  • Manages services for pods

  • Monitors pod health

  • Handles load balancing for internal and external traffic

To validate changes to the custom resource, the operator queries the admission controller, a webhook that provides rules for mutable states in a custom resource.

Vertica makes the operator and admission controller available with Helm charts, kubectl command-line tool, or through OperatorHub.io. For details about installing the operator and the admission controller with both methods, see Installing the VerticaDB operator.

Vertica pod

A pod is essentially a wrapper around one or more logically grouped containers. A Vertica pod in the default configuration consists of two containers: the Vertica server container that runs the main Vertica process, and the Node Management Agent (NMA) container.

The NMA runs in a sidecar container, which is a container that contributes to the pod's main process, the Vertica server. The Vertica pod runs a single process per container to align each process lifetime with its container lifetime. This alignment provides the following benefits:

  • Accurate health checks. A container can have only one health check, so performing a health check on a container with multiple running processes might return inaccurate results.
  • Granular Kubernetes probe control. Kubernetes sets probes at the container level. If the Vertica container runs multiple processes, the NMA process might interfere with the probe that you set for the Vertica server process. This interference is not an issue with single-process containers.
  • Simplified monitoring. A container with multiple processes has multiple states, which complicates monitoring. A container with a single process returns a single state.
  • Easier troubleshooting. If a container runs multiple processes and crashes, it might be difficult to determine which failure caused the crash. Running one process per container makes it easier to pinpoint issues.

When a Vertica pod launches, the NMA process starts and prepares the configuration files for the Vertica server process. After the server container retrieves environment information from the NMA configuration files, the Vertica server process is ready.

All containers in the Vertica pod consume the host node resources in a shared execution environment. In addition to sharing resources, a pod extends the container to interact with Kubernetes services. For example, you can assign labels to associate pods to other objects, and you can implement affinity rules to schedule pods on specific host nodes.

DNS names provide continuity between pod lifecycles. Each pod is assigned an ordered and stable DNS name that is unique within its cluster. When a Vertica pod fails, the rescheduled pod uses the same DNS name as its predecessor. If a pod needs to persist data between lifecycles, you can mount a custom volume in its filesystem.

Rescheduled pods require information about the environment to become part of the cluster. This information is provided by the Downward API. Environment information, such as the superuser password Secret, is mounted in the /etc/podinfo directory.

NMA sidecar

The NMA sidecar exposes a REST API that the VerticaDB operator uses to administer your cluster. This container runs the same image as the Vertica server process, specified by the spec.image parameter setting in the VerticaDB custom resource definition.

The NMA sidecar is designed to consume minimal resources, but because the database size determines the amount of resources consumed by some NMA operations, there are no default resource limits. This prevents failures that result from inadequate available resources.

Running the NMA in a sidecar enables idiomatic Kubernetes logging, which sends all logs to STDOUT and STDERR on the host node. In addition, the kubectl logs command accepts a container name, so you can specify a container name during log collection.

Sidecar logger

The Vertica server process writes log messages to a catalog file named vertica.log. However, idiomatic Kubernetes practices send log messages to STDOUT and STDERR on the host node for log aggregation.

To align Vertica server logging with Kubernetes convention, Vertica provides the vertica-logger sidecar image. You can run this image in a sidecar, and it retrieves logs from vertica.log and sends them to the container's STDOUT and STDERR stream. If your sidecar logger needs to persist data, you can mount a custom volume in the filesystem.

For implementation details, see VerticaDB custom resource definition.

Persistent storage

A pod is an ephemeral, immutable object that requires access to external storage to persist data between lifecycles. To persist data, the operator uses the following API resource types:

  • StorageClass: Represents an external storage provider. You must create a StorageClass object separately from your custom resource and set this value with the local.storageClassName configuration parameter.

  • PersistentVolume (PV): A unit of storage that mounts in a pod to persist data. You dynamically or statically provision PVs. Each PV references a StorageClass.

  • PersistentVolumeClaim (PVC): The resource type that a pod uses to describe its StorageClass and storage requirements. When you delete a VerticaDB CR, its PVC is deleted.

A pod mounts a PV in its filesystem to persist data, but a PV is not associated with a pod by default. However, the pod is associated with a PVC that includes a StorageClass in its storage requirements. When a pod requests storage with a PVC, the operator observes this request and then searches for a PV that meets the storage requirements. If the operator locates a PV, it binds the PVC to the PV and mounts the PV as a volume in the pod. If the operator does not locate a PV, it must either dynamically provision one, or the administrator must manually provision one before the operator can bind it to a pod.

PVs persist data because they exist independently of the pod life cycle. When a pod fails or is rescheduled, it has no effect on the PV. When you delete a VerticaDB, the VerticaDB operator automatically deletes any PVCs associated with that VerticaDB instance.

For additional details about StorageClass, PersistentVolume, and PersistentVolumeClaim, see the Kubernetes documentation.

StorageClass requirements

The StorageClass affects how the Vertica server environment and operator function. For optimum performance, consider the following:

  • If you do not set the local.storageClassName configuration parameter, the operator uses the default storage class. If you use the default storage class, confirm that it meets storage requirements for a production workload.

  • Select a StorageClass that uses a recommended storage format type as its fsType.

  • Use dynamic volume provisioning. The operator requires on-demand volume provisioning to create PVs as needed.

Local volume mounts

The operator mounts a single PVC in the /home/dbadmin/local-data/ directory of each pod to persist data. Each of the following subdirectories is a sub-path into the volume that backs the PVC:

  • /catalog: Optional subdirectory that you can create if your environment requires a catalog location that is separate from the local data. You can customize this path with the local.catalogPath parameter.
    By default, the catalog is stored in the /data subdirectory.

  • /data: Stores any temporary files, and the catalog if local.catalogPath is not set. You can customize this path with the local.dataPath parameter.

  • /depot: Improves depot warming in a rescheduled pod. You can customize this path with the local.depotPath parameter.

  • /opt/vertica/config: Persists the contents of the configuration directory between restarts.

  • /opt/vertica/log: Persists log files between pod restarts.

  • /tmp/scrutinize: Target location for the final scruitinize tar file and any additional files generated during scrutinize diagnositics collection.

By default, each path mounted in the /local-data directory is owned by the user or group specified by the operator. To customize the user or group, set the podSecurityContext custom resource definition parameter.

Custom volume mounts

You might need to persist data between pod lifecycles in one of the following scenarios:

  • An external process performs a task that requires long-term access to the Vertica server data.

  • Your custom resource includes a sidecar container in the Vertica pod.

You can mount a custom volume in the Vertica pod or sidecar filesystem. To mount a custom volume in the Vertica pod, add the definition in the spec section of the CR. To mount the custom volume in the sidecar, add it in an element of the sidecars array.

The CR requires that you provide the volume type and a name for each custom volume. The CR accepts any Kubernetes volume type. The volumeMounts.name value identifies the volume within the CR, and has the following requirements and restrictions:

  • It must match the volumes.name parameter setting.

  • It must be unique among all volumes in the /local-data, /podinfo, or /licensing mounted directories.

For instructions on how to mount a custom volume in either the Vertica server container or in a sidecar, see VerticaDB custom resource definition.

Service objects

Vertica on Kubernetes provides two service objects: a headless service that requires no configuration to maintain DNS records and ordered names for each pod, and a load balancing service that manages internal traffic and external client requests for the pods in your cluster.

Load balancing services

Each subcluster uses a single load balancing service object. You can manually assign a name to a load balancing service object with the subclusters[i].serviceName parameter in the custom resource. Assigning a name is useful when you want to:

  • Direct traffic from a single client to multiple subclusters.

  • Scale subclusters by workload with more flexibility.

  • Identify subclusters by a custom service object name.

To configure the type of service object, use the subclusters[i].serviceType parameter in the custom resource to define a Kubernetes service type. Vertica supports the following service types:

  • ClusterIP: The default service type. This service provides internal load balancing, and sets a stable IP and port that is accessible from within the subcluster only.

  • NodePort: Provides external client access. You can specify a port number for each host node in the subcluster to open for client connections.

  • LoadBalancer: Uses a cloud provider load balancer to create NodePort and ClusterIP services as needed. For details about implementation, see the Kubernetes documentation and your cloud provider documentation.

Because native Vertica load balancing interferes with the Kubernetes service object, Vertica recommends that you allow the Kubernetes services to manage load balancing for the subcluster. You can configure the native Vertica load balancer within the Kubernetes cluster, but you receive unexpected results. For example, if you set the Vertica load balancing policy to ROUNDROBIN, the load balancing appears random.

For additional details about Kubernetes services, see the official Kubernetes documentation.

Security considerations

Vertica on Kubernetes supports both TLS and mTLS for communications between resource objects. You must manually configure TLS in your environment. For details, see TLS protocol.

The VerticaDB operator manages changes to the certificates. If you update an existing certificate, the operator replaces the certificate in the Vertica server container. If you add or delete a certificate, the operator reschedules the pod with the new configuration.

The subsequent sections detail internal and external connections that require TLS for secure communications.

Admission controller webhook certificates

The VerticaDB operator Helm chart includes the admission controller, a webhook that communicates with the Kubernetes API server to validate changes to a resource object. Because the API server communicates over HTTPS only, you must configure TLS certificates to authenticate communications between the API server and the webhook.

The method you use to install the VerticaDB operator determines how you manage TLS certificates for the admission controller:

  • Helm charts: In the default configuration, the operator generates self-signed certificates. You can add custom certificates with the webhook.certSource Helm chart parameter.
  • kubectl: The operator generates self-signed certificates.
  • OperatorHub.io: Runs on the Operator Lifecycle Manager (OLM) and automatically creates and mounts a self-signed certificate for the webhook. This installation method does not require additional action.

For details about each installation method, see Installing the VerticaDB operator.

Node Management Agent certificates

The Node Management Agent (NMA) exposes a REST API for cluster administration. Vertica on Kubernetes manages NMA certificates differently than a non-Kubernetes environment. Non-Kubernetes deployments use the TLS configuration generated by the install_vertica script, but Vertica on Kubernetes uses TLS certificates that are provided when a Vertica server pod starts.

You have the following options to provide TLS certificates at startup:

  • The VerticaDB operator generates production-safe, self-signed certificates for the NMA. This is the default configuration.
  • You can add custom certificates with the nmaTLSSecret custom resource parameter.

Communal storage certificates

Supported storage locations authenticate requests with a self-signed certificate authority (CA) bundle. For TLS configuration details for each provider, see Configuring communal storage.

Client-server certificates

You might require multiple certificates to authenticate external client connections to the load balancing service object. You can mount one or more custom certificates in the Vertica server container with the certSecrets custom resource parameter. Each certificate is mounted in the container at /certs/cert-name/key.

For details, see VerticaDB custom resource definition.

Prometheus metrics certificates

Vertica integrates with Prometheus to scrape metrics about the VerticaDB operator and the server process. The operator and server export metrics independently from one another, and each set of metrics requires a different TLS configuration.

The operator SDK framework enforces role-based access control (RBAC) to the metrics with a proxy sidecar that uses self-signed certificates to authenticate requests for authorized service accounts. If you run Prometheus outside of Kubernetes, you cannot authenticate with a service account, so you must provide the proxy sidecar with custom TLS certificates.

The Vertica server exports metrics with the HTTPS service. This service requires client, server, and CA certificates to configure mutual mode TLS for a secure connection.

For details about both the operator and server metrics, see Prometheus integration.

System configuration

As a best practice, make system configurations on the host node so that pods inherit those settings from the host node. This strategy eliminates the need to provide each pod a privileged security context to make system configurations on the host.

To manually configure host nodes, refer to the following sections:

The superuser account—historically, the dbadmin account—must use one of the authentication techniques described in Dbadmin authentication access.

2 - Vertica images

The following table describes Vertica server and automation tool images:.

The following table describes Vertica server and automation tool images:

Image Tags Description
Minimal image 24.3.0-0-minimal

Optimized for Kubernetes.

This image includes the following packages for UDx development:

  • C++
  • Python
Full image 24.3.0-0

Optimized for Kubernetes, and includes the following packages for machine learning and UDx development:

  • TensorFlow
  • Java 8
  • C++
  • Python
Community Edition 24.3.0-0

A single-node Enterprise Mode image for test environments. For more information, see Vertica community edition (CE).

This image includes the following packages for UDx development:

  • C++
  • Python
VerticaDB operator 24.3.0 The operator monitors the state of your custom resources and automates lifecycle tasks for Vertica on Kubernetes. For installation instructions, see Installing the VerticaDB operator.
Vertica logger 1.0.1 Lightweight image for sidecar logging. The logger sends the contents of vertica.log to STDOUT on the host node. For implementation details, see VerticaDB custom resource definition.
Kafka Scheduler 24.3.0 Containerized version of the Vertica Kafka Scheduler, a mechanism that automatically loads data from Kafka into a Vertica database. For details, see Containerized Kafka Scheduler.

Creating a custom Vertica image

The Creating a Vertica Image tutorial in the Vertica Integrator's Guide provides a line-by-line description of the Dockerfile hosted on GitHub. You can add dependencies to replicate your development and production environments.

Python container UDx

The Vertica images with Python UDx development capabilities include the vertica_sdk package and the Python Standard Library.

If your UDx depends on a Python package that is not included in the image, you must make the package available to the Vertica process during runtime. You can either mount a volume that contains the package dependencies, or you can create a custom Vertica server image.

Use the Python Package Index to download Python package source distributions.

Mounting Python libraries as volumes

You can mount a Python package dependency as a volume in the Vertica server container filesystem. A Python UDx can access the contents of the volume at runtime.

  1. Download the package source distribution to the host machine.

  2. On the host machine, extract the tar file contents into a mountable volume:

    $ tar -xvf lib-name.version.tar.gz -C /path/to/py-dependency-vol
    
  3. Mount the volume that contains the extracted source distribution in the custom resource (CR). The following snippet mounts the py-dependency-vol volume in the Vertica server container:

    spec:
      ...
      volumeMounts:
      - name: nfs
        mountPath: /path/to/py-dependency-vol
      volumes:
      - name: nfs
        nfs:
          path: /nfs
          server: nfs.example.com
      ...
    

    For details about mounting custom volumes in a CR, see VerticaDB custom resource definition.

Adding a Python library to a custom Vertica image

Create a custom image that includes any Python package dependencies in the Vertica server base image.

For a comprehensive guide about creating a custom Vertica image, see the Creating a Vertica Image tutorial in the Vertica Integrator's Guide.

  1. Download the package source distribution on the machine that builds the container.

  2. Create a Dockerfile that includes the Python source distribution. The ADD command automatically extracts the contents of the tar file into the target-dir directory:

    FROM opentext/vertica-k8s:version
    ADD lib-name.version.tar.gz /path/to/target-dir
    ...
    

    For a complete list of available Vertica server images, see opentext/vertica-k8s Docker Hub repository.

  3. Build the Dockerfile:

    $ docker build . -t image-name:tag
    
  4. Push the image to a container registry so that you can add the image to a Vertica custom resource:

    $ docker image push registry-host:port/registry-username/image-name:tag
    

3 - VerticaDB operator

The Vertica operator automates error-prone and time-consuming tasks that a Vertica on Kubernetes administrator must otherwise perform manually.

The Vertica operator automates error-prone and time-consuming tasks that a Vertica on Kubernetes administrator must otherwise perform manually. The operator:

  • Installs Vertica

  • Creates an Eon Mode database

  • Upgrades Vertica

  • Revives an existing Eon Mode database

  • Restarts and reschedules DOWN pods

  • Scales subclusters

  • Manages services for pods

  • Monitors pod health

  • Handles load balancing for internal and external traffic

The Vertica operator is a Go binary that uses the SDK operator framework. It runs in its own pod, and is cluster-scoped to manage any resource objects in any namespace across the cluster.

For details about installing and upgrading the operator, see Installing the VerticaDB operator.

Monitoring desired state

Because the operator is cluster-scoped, each cluster is allowed one operator pod that acts as a custom controller and monitors the state of the custom resource objects within all namespaces across the cluster. The operator uses the control loop mechanism to reconcile state changes by investigating state change notifications from the custom resource instance, and periodically comparing the current state with the desired state.

If the operator detects a change in the desired state, it determines what change occurred and reconciles the current state with the new desired state. For example, if the user deletes a subcluster from the custom resource instance and successfully saves the changes, the operator deletes the corresponding subcluster objects in Kubernetes.

Validating state changes

All VerticaDB operator installation options include an admission controller, which uses a webhook to prevent invalid state changes to the custom resource. When you save a change to a custom resource, the admission controller webhook queries a REST endpoint that provides rules for mutable states in a custom resource. If a change violates the state rules, the admission controller prevents the change and returns an error. For example, it returns an error if you try to save a change that violates K-Safety.

Limitations

The operator has the following limitations:

The VerticaDB operator 2.0.0 does not use Administration tools (admintools) with API version v1. The following features require admintools commands, so they are not available with that operator version and API version configuration:

To use these features with operator 2.0.0, you must a lower server version.

3.1 - Installing the VerticaDB operator

The custom resource definition (CRD), DB operator, and admission controller work together to maintain the state of your environment and automate tasks:.

The VerticaDB operator is a custom controller that monitors CR instances to maintain the desired state of VerticaDB objects. The operator includes an admission controller, which is a webhook that queries a REST endpoint to verify changes to mutable states in a CR instance.

By default, the operator is cluster-scoped—you can deploy one operator per cluster to monitor objects across all namespaces in the cluster. For flexibility, Vertica also provides a Helm chart deployment option that installs the operator at the namespace level.

Installation options

Vertica provides the following options to install the VerticaDB operator and admission controller:

  • Helm charts. Helm is a package manager for Kubernetes. The Helm chart option is the most common installation method and lets you customize your TLS configuration and environment setup. For example, Helm chart installations include operator logging levels and log rotation policy. For details about additional options, see Helm chart parameters.

    Vertica also provides the Quickstart Helm chart option so that you can get started quickly with minimal requirements.

  • kubectl installation. Apply the Custom Resource Definitions (CRDs) and VerticaDB operator directly. You can use the kubectl tool to apply the latest CRD available on vertica-kubernetes GitHub repository.

  • OperatorHub.io. This is a registry that lets vendors share Kubernetes operators.

Helm charts

Vertica packages the VerticaDb operator and admission controller in a Helm chart. The following sections detail different installation methods so that you can install the operator to meet your environment requirements. You can customize your operator during and after installation with Helm chart parameters.

For additional details about Helm, see the Helm documentation.

Prerequisites

Quickstart installation

The quickstart installation installs the VerticaDB Helm chart with minimal commands. This deployment installs the operator in the default configuration, which includes the following:

  • Cluster-scoped webhook and controllers that monitor resources across all namespaces in the cluster. For namespace-scoped deployments, see Namespace-scoped installation.
  • Self-signed certificates to communicate with the Kubernetes API server. If your environment requires custom certificates, see Custom certificate installation.

To quickly install the Helm chart, you must add the latest chart to your local repository and then install it in a namespace:

  1. The add command downloads the chart to your local repository, and the update command gets the latest charts from the remote repository. When you add the Helm chart to your local chart repository, provide a descriptive name for future reference.

    The following add command names the charts vertica-charts:

    $ helm repo add vertica-charts https://vertica.github.io/charts
      "vertica-charts" has been added to your repositories
    $ helm repo update
      Hang tight while we grab the latest from your chart repositories...
      ...Successfully got an update from the "vertica-charts" chart repository
      Update Complete. ⎈Happy Helming!⎈
    
  2. Install the Helm chart to deploy the VerticaDB operator in your cluster. The following command names this chart instance vdb-op, and creates a default namespace for the operator if it does not already exist:
    $ helm install vdb-op --namespace verticadb-operator --create-namespace vertica-charts/verticadb-operator
    

For helm install options, see the Helm documentation.

Namespace-scoped installation

By default, the VerticaDB operator is cluster-scoped. However, Vertica provides an option to install a namespace-scoped operator for environments that require more granular control over which resources an operator watches for state changes.

The VerticaDB operator includes a webhook and controllers. The webhook is cluster-scoped and verifies state changes for resources across all namespaces in the cluster. The controllers—the control loops that reconcile the current and desired states for resources—do not have a cluster-scope requirement, so you can install them at the namespace level. The namespace-scoped operator installs the webhook once at the cluster level, and then installs the controllers in the specified namespace. You can install these namespaced controllers in multiple namespaces per cluster.

To install a namespace-scoped operator, add the latest chart to your respository and issue separate commands to deploy the webhook and controllers:

  1. The add command downloads the chart to your local repository, and the update command gets the latest charts from the remote repository. When you add the Helm chart to your local chart repository, provide a descriptive name for future reference.

    The following add command names the charts vertica-charts:

    $ helm repo add vertica-charts https://vertica.github.io/charts
      "vertica-charts" has been added to your repositories
    $ helm repo update
      Hang tight while we grab the latest from your chart repositories...
      ...Successfully got an update from the "vertica-charts" chart repository
      Update Complete. ⎈Happy Helming!⎈
    
  2. Deploy the cluster-scoped webhook and install the required CRDs. To deploy the operator as a webhook without controllers, set controllers.enable to false. The following command deploys the webhook to the vertica namespace, which is the namespace for a Vertica cluster:

    $ helm install webhook vertica-charts/verticadb-operator --namespace vertica --set controllers.enable=false
    
  3. Deploy the namespace-scoped operator. To prevent a second webhook installation, set webhook.enable to false. To deploy only the controllers, set controllers.scope to namespace. The following command installs the operator in the default namespace:

    $ helm install vdb-op vertica-charts/verticadb-operator --namespace default --set webhook.enable=false,controllers.scope=namespace
    

For details about the controllers.* parameter settings, see Helm chart parameters. For helm install options, see the Helm documentation.

Custom certificate installation

The admission controller uses a webhook that communicates with the Kubernetes API over HTTPS. By default, the Helm chart generates a self-signed certificate before installing the admission controller. A self-signed certificate might not be suitable for your environment—you might require custom certificates that are signed by a trusted third-party certificate authority (CA).

To add custom certificates for the webhook:

  1. Set the TLS key's Subjective Alternative Name (SAN) to the admission controller's fully-qualified domain name (FQDN). Set the SAN in a configuration file using the following format:

    [alt_names]
    DNS.1 = verticadb-operator-webhook-service.operator-namespace.svc
    DNS.2 = verticadb-operator-webhook-service.operator-namespace.svc.cluster.local
    
  2. Create a Secret that contains the certificates. A Secret conceals your certificates when you pass them as command-line parameters.

    The following command creates a Secret named tls-secret. It stores the TLS key, TLS certificate, and CA certificate:

    $ kubectl create secret generic tls-secret --from-file=tls.key=/path/to/tls.key --from-file=tls.crt=/path/to/tls.crt --from-file=ca.crt=/path/to/ca.crt
    
  3. Install the Helm chart.

    The add command downloads the chart to your local repository, and the update command gets the latest charts from the remote repository. When you add the Helm chart to your local chart repository, provide a descriptive name for future reference.

    The following add command names the charts vertica-charts:

    $ helm repo add vertica-charts https://vertica.github.io/charts
      "vertica-charts" has been added to your repositories
    $ helm repo update
      Hang tight while we grab the latest from your chart repositories...
      ...Successfully got an update from the "vertica-charts" chart repository
      Update Complete. ⎈Happy Helming!⎈
    

    When you install the Helm chart with custom certificates for the admission controller, you have to use the webhook.certSource and webhook.tlsSecret Helm chart parameters:

    • webhook.certSource indicates whether you want the admission controller to install user-provided certificates. To install with custom certificates, set this parameter to secret.
    • webhook.tlsSecret accepts a Secret that contains your certificates.

    The following command deploys the operator with the TLS certificates and creates namespace if it does not already exist:

    $ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
        --set webhook.certSource=secret \
        --set webhook.tlsSecret=tls-secret
    

Granting user privileges

After the operator is deployed, the cluster administrator is the only user with privileges to create and modify VerticaDB CRs within the cluster. To grant other users the privileges required to work with custom resources, you can leverage namespaces and Kubernetes RBAC.

To grant these privileges, the cluster administrator creates a namespace for the user, then grants that user edit ClusterRole within that namespace. Next, the cluster administrator creates a Role with specific CR privileges, and binds that role to the user with a RoleBinding. The cluster administrator can repeat this process for each user that must create or modify VerticaDB CRs within the cluster.

To provide a user with privileges to create or modify a VerticaDB CR:

  1. Create a namespace for the application developer:

    $ kubectl create namespace user-namespace
    namespace/user-namespace created
    
  2. Grant the application developer edit role privileges in the namespace:

    $ kubectl create --namespace user-namespace rolebinding edit-access --clusterrole=edit --user=username
    rolebinding.rbac.authorization.k8s.io/edit-access created
    
  3. Create the Role with privileges to create and modify any CRs in the namespace. Vertica provides the verticadb-operator-cr-user-role.yaml file that defines these rules:

    $ kubectl --namespace user-namespace apply -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/verticadb-operator-cr-user-role.yaml
    role.rbac.authorization.k8s.io/vertica-cr-user-role created
    

    Verify the changes with kubectl get:

    $ kubectl get roles --namespace user-namespace
    NAME                   CREATED AT
    vertica-cr-user-role   2023-11-30T19:37:24Z
    
  4. Create a RoleBinding that associates this Role to the user. The following command creates a RoleBinding named vdb-access:

    $ kubectl create --namespace user-namespace rolebinding vdb-access --role=vertica-cr-user-role --user=username
    rolebinding.rbac.authorization.k8s.io/rolebinding created
    

    Verify the changes with kubectl get:

    $ kubectl get rolebinding --namespace user-namespace
    NAME          ROLE                        AGE
    edit-access   ClusterRole/edit            16m
    vdb-access    Role/vertica-cr-user-role   103s
    

Now, the user associated with username has access to create and modify VerticaDB CRs in the isolated user-namespace.

kubectl installation

You can install the VerticaDB operator from GitHub by applying the YAML manifests with the kubectl command-line tool:

  1. Install all Custom resource definitions. Because the size of the CRD is too large for client-side operations, you must use the server-side=true and --force-conflicts options to apply the manifests:

    kubectl apply --server-side=true --force-conflicts -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/crds.yaml
    

    For additional details about these commands, see Server-Side Apply documentation.

  2. Install the VerticaDB operator:
    $ kubectl apply -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/operator.yaml
    

OperatorHub.io

OperatorHub.io is a registry that allows vendors to share Kubernetes operators. Each vendor must adhere to packaging guidelines to simplify user adoption.

To install the VerticaDB operator from OperatorHub.io, navigate to the Vertica operator page and follow the install instructions.

3.2 - Upgrading the VerticaDB operator

Vertica supports two separate options to upgrade the VerticaDB operator:.

Vertica supports two separate options to upgrade the VerticaDB operator:

  • OperatorHub.io

  • Helm Charts

Prerequisites

OperatorHub.io

The Operator Lifecycle Manager (OLM) operator manages upgrades for OperatorHub.io installations. You can configure the OLM operator to upgrade the VerticaDB operator manually or automatically with the Subscription object's spec.installPlanApproval parameter.

Automatic upgrade

To configure automatic version upgrades, set spec.installPlanApproval to Automatic, or omit the setting entirely. When the OLM operator refreshes the catalog source, it installs the new VerticaDB operator automatically.

Manual upgrade

Upgrade the VerticaDB operator manually to approve version upgrades for specific install plans. To manually upgrade, set spec.installPlanApproval parameter to Manual and complete the following:

  1. Verify if there is an install plan that requires approval to proceed with the upgrade:

    $ kubectl get installplan
    NAME CSV APPROVAL APPROVED
    install-ftcj9 verticadb-operator.v1.7.0 Manual false
    install-pw7ph verticadb-operator.v1.6.0 Manual true
    

    The command output shows that the install plan install-ftcj9 for VerticaDB operator version 1.7.0 is not approved.

  2. Approve the install plan with a patch command:

    $ kubectl patch installplan install-ftcj9 --type=merge --patch='{"spec": {"approved": true}}'
    installplan.operators.coreos.com/install-ftcj9 patched
    

    After you set the approval, the OLM operator silently upgrades the VerticaDB operator.

  3. Optional. To monitor its progress, inspect the STATUS column of the Subscription object:

    $ kubectl describe subscription subscription-object-name
    

Helm charts

You must have cluster administrator privileges to upgrade the VerticaDB operator with Helm charts.

The Helm chart includes the CRD, but the helm install command does not overwrite an existing CRD. To upgrade the operator, you must update the CRD with the manifest from the GitHub repository.

Additionally, you must upgrade all custom resource definitions, even if you do deploy them in your environment. These CRDs are installed with the operator and maintained as separate YAML manifests. Upgrading all CRDs ensure that your operator is upgraded completely.

You can upgrade the CRDs and VerticaDB operator from GitHub by applying the YAML manifests with the kubectl command-line tool:

  1. Install all Custom resource definitions. Because the size of the CRD is too large for client-side operations, you must use the server-side=true and --force-conflicts options to apply the manifests:

    kubectl apply --server-side=true --force-conflicts -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/crds.yaml
    

    For additional details about these commands, see Server-Side Apply documentation.

  2. Upgrade the Helm chart:

    $ helm upgrade operator-name --wait vertica-charts/verticadb-operator
    

3.3 - Helm chart parameters

The following table describes the available settings for the VerticaDB operator and admission controller Helm chart.

The following list describes the available settings for the VerticaDB operator and admission controller Helm chart:

affinity
Applies rules that constrain the VerticaDB operator to specific nodes. It is more expressive than nodeSelector. If this parameter is not set, then the operator uses no affinity setting.
controllers.enable
Determines whether controllers are enabled when running the operator. Controllers watch and act on custom resources within the cluster.

For namespace-scoped operators, set this to false. This deploys the cluster-scoped operator only as a webhook, and then you can set webhook.enable to false and deploy the controllers to an individual namespace. For details, see Installing the VerticaDB operator.

Default: true

controllers.scope
Scope of the controllers in the VerticaDB operator. Controllers watch and act on custom resources within the cluster. This parameter accepts the following values:
  • cluster: The controllers watch for changes to all resources across all namespaces in the cluster.
  • namespace: The controllers watch for changes to resources only in the namespace specified during deployment. You must deploy the operator as a webhook for the cluster, then deploy the operator controllers in a namespace. You can deploy multiple namespace-scoped operators within the same cluster.

For details, see Installing the VerticaDB operator.

Default: cluster

image.name
Name of the image that runs the operator.

Default: vertica/verticadb-operator:version

imagePullSecrets
List of Secrets that store credentials to authenticate to the private container repository specified by image.repo and rbac_proxy_image. For details, see Specifying ImagePullSecrets in the Kubernetes documentation.
image.repo
Server that hosts the repository that contains image.name. Use this parameter for deployments that require control over a private hosting server, such as an air-gapped operator.

Use this parameter with rbac_proxy_image.name and rbac_proxy_image.repo.

Default: docker.io

logging.filePath

Path to a log file in the VerticaDB operator filesystem. If this value is not specified, Vertica writes logs to standard output.

Default: Empty string (' ') that indicates standard output.

logging.level
Minimum logging level. This parameter accepts the following values:
  • debug

  • info

  • warn

  • error

Default: info

logging.maxFileSize

When logging.filePath is set, the maximum size in MB of the logging file before log rotation occurs.

Default: 500

logging.maxFileAge

When logging.filePath is set, the maximum age in days of the logging file before log rotation deletes the file.

Default: 7

logging.maxFileRotation

When logging.filePath is set, the maximum number of files that are kept in rotation before the old ones are removed.

Default: 3

nameOverride
Sets the prefix for the name assigned to all objects that the Helm chart creates.

If this parameter is not set, each object name begins with the name of the Helm chart, verticadb-operator.

nodeSelector
Controls which nodes are used to schedule the operator pod. If this is not set, the node selector is omitted from the operator pod when it is created. To set this parameter, provide a list of key/value pairs.

The following example schedules the operator only on nodes that have the region=us-east label:

nodeSelector:
      region: us-east
  
priorityClassName
PriorityClass name assigned to the operator pod. This affects where the pod is scheduled.
prometheus.createProxyRBAC
When set to true, creates role-based access control (RBAC) rules that authorize access to the operator's /metrics endpoint for the Prometheus integration.

Default: true

prometheus.createServiceMonitor

When set to true, creates the ServiceMonitor custom resource for the Prometheus operator. You must install the Prometheus operator before you set this to true and install the Helm chart.

For details, see the Prometheus operator GitHub repository.

Default: false

prometheus.expose
Configures the operator's /metrics endpoint for the Prometheus integration. The following options are valid:
  • EnableWithAuthProxy: Creates a new service object that exposes an HTTPS /metrics endpoint. The RBAC proxy controls access to the metrics.

  • EnableWithoutAuth: Creates a new service object that exposes an HTTP /metrics endpoint that does not authorize connections. Any client with network access can read the metrics.

  • Disable: Prometheus metrics are not exposed.

Default: Disable

prometheus.tlsSecret
Secret that contains the TLS certificates for the Prometheus /metrics endpoint. You must create this Secret in the same namespace that you deployed the Helm chart.

The Secret requires the following values:

  • tls.key: TLS private key

  • tls.crt: TLS certificate for the private key

  • ca.crt: Certificate authority (CA) certificate

To ensure that the operator uses the certificates in this parameter, you must set prometheus.expose to EnableWithAuthProxy.

If prometheus.expose is not set to EnableWithAuthProxy, then this parameter is ignored, and the RBAC proxy sidecar generates its own self-signed certificate.

rbac_proxy_image.name
Name of the Kubernetes RBAC proxy image that performs authorization. Use this parameter for deployments that require authorization by a proxy server, such as an air-gapped operator.

Use this parameter with image.repo and rbac_proxy_image.repo.

Default: kubebuilder/kube-rbac-proxy:v0.11.0

rbac_proxy_image.repo
Server that hosts the repository that contains rbac_proxy_image.name. Use this parameter for deployments that perform authorization by a proxy server, such as an air-gapped operator.

Use this parameter with image.repo and rbac_proxy_image.name.

Default: gcr.io

reconcileConcurrency.verticaautoscaler
Number of concurrent reconciliation loops the operator runs for all VerticaAutoscaler CRs in the cluster.
reconcileConcurrency.verticadb
Number of concurrent reconciliation loops the operator runs for all VerticaDB CRs in the cluster.
reconcileConcurrency.verticaeventtrigger
Number of concurrent reconciliation loops the operator runs for all EventTrigger CRs in the cluster.
resources.limits and resources.requests
The resource requirements for the operator pod.

resources.limits is the maximum amount of CPU and memory that an operator pod can consume from its host node.

resources.requests is the maximum amount of CPU and memory that an operator pod can request from its host node.

Defaults:

resources:
  limits:
    cpu: 100m
    memory: 750Mi
  requests:
    cpu: 100m
    memory: 20Mi
  
serviceAccountAnnotations
Map of annotations that is added to the service account created for the operator.
serviceAccountNameOverride
Controls the name of the service account created for the operator.
tolerations
Any taints and tolerations that influence where the operator pod is scheduled.
webhook.certSource
How TLS certificates are provided for the admission controller webhook. This parameter accepts the following values:
  • internal: The VerticaDB operator internally generates a self-signed, 10-year expiry certificate before starting the managing controller. When the certificate expires, you must manually restart the operator pod to create a new certificate.

  • secret: You generate the custom certificates before you create the Helm chart and store them in a Secret. This option requires that you set webhook.tlsSecret.

    If webhook.tlsSecret is set, then this option is implicitly selected.

Default: internal

For details, see Installing the VerticaDB operator.

webhook.enable
Determines whether the Helm chart installs the admission controller webhooks for the custom resource definitions. The webhook is cluster-scoped, and you can install only one webhook per cluster.

If your environment uses namespace-scoped operators, you must install the webhook for the cluster, then disable the webhook for each namespace installation. For details, see Installing the VerticaDB operator.

Default: true

webhook.tlsSecret
Secret that contains a PEM-encoded certificate authority (CA) bundle and its keys.

The CA bundle validates the webhook's server certificate. If this is not set, the webhook uses the system trust roots on the apiserver.

This Secret includes the following keys for the CA bundle:

  • tls.key

  • ca.crt

  • tls.crt

3.4 - Red Hat OpenShift integration

Red Hat OpenShift is a hybrid cloud platform that provides enhanced security features and greater control over the Kubernetes cluster.

Red Hat OpenShift is a hybrid cloud platform that provides enhanced security features and greater control over the Kubernetes cluster. In addition, OpenShift provides the OperatorHub, a catalog of operators that meet OpenShift requirements.

For comprehensive instructions about the OpenShift platform, refer to the Red Hat OpenShift documentation.

Enhanced security with security context constraints

To enforce security measures, OpenShift requires that each deployment use a security context constraint (SCC). Vertica on Kubernetes supports the restricted-v2 SCC, the most restrictive default SCC available.

The SCC lets administrators control the privileges of the pods in a cluster without manual configuration. For example, you can restrict namespace access for specific users in a multi-user environment.

Installing the operator

The VerticaDB operator is a community operator that is maintained by Vertica. Each operator available in the OperatorHub must adhere to requirements defined by the Operator Lifecycle Manager (OLM). To meet these requirements, vendors must provide a cluster service version (CSV) manifest for each operator. Vertica provides a CSV for each version of the VerticaDB operator available in the OpenShift OperatorHub.

The VerticaDB operator supports OpenShift versions 4.8 and higher.

You must have cluster-admin privileges on your OpenShift account to install the VerticaDB operator. For detailed installation instructions, refer to the OpenShift documentation.

Deploying Vertica on OpenShift

After you installed the VerticaDB operator and added a supported SCC to your Vertica workloads service account, you can deploy Vertica on OpenShift.

For details about installing OpenShift in supported environments, see the OpenShift Container Platform installation overview.

Before you deploy Vertica on OpenShift, create the required Secrets to store sensitive information. For details about Secrets and OpenShift, see the OpenShift documentation. For guidance on deploying a Vertica custom resource, see VerticaDB custom resource definition.

3.5 - Prometheus integration

Vertica on Kubernetes integrates with Prometheus to scrape time series metrics about the VerticaDB operator.

Vertica on Kubernetes integrates with Prometheus to scrape time series metrics about the VerticaDB operator and Vertica server process. These metrics create a detailed model of your application over time to provide valuable performance and troubleshooting insights as well as facilitate internal and external communications and service discovery in microservice and containerized architectures.

Prometheus requires that you set up targets—metrics that you want to monitor. Each target is exposed on an endpoint, and Prometheus periodically scrapes that endpoint to collect target data. Vertica exports metrics and provides access methods for both the VerticaDB operator and server process.

Server metrics

Vertica exports server metrics on port 8443 at the following endpoint:

https://host-address:8443/api-version/metrics

Only the superuser can authenticate to the HTTPS service, and the service accepts only mutual TLS (mTLS) authentication. The setup for both Vertica on Kubernetes and non-containerized Vertica environments is identical. For details, see HTTPS service.

Vertica on Kubernetes lets you set a custom port for its HTTP service with the subclusters[i].verticaHTTPNodePort custom resource parameter. This parameter sets a custom port for the HTTPS service for NodePort serviceTypes.

For request and response examples, see the /metrics endpoint description. For a list of available metrics, see Prometheus metrics.

Grafana dashboards

You can visualize Vertica server time series metrics with Grafana dashboards. Vertica dashboards that use a Prometheus data source are available at Grafana Dashboards:

You can also download the source for each dashboard from the vertica/grafana-dashboards repository.

Operator metrics

The VerticaDB operator supports the Operator SDK framework, which requires that an authorization proxy impose role-based-access control (RBAC) to access operator metrics over HTTPS. To increase flexibility, Vertica provides the following options to access the Prometheus /metrics endpoint:

  • HTTPS access: Meet operator SDK requirements and use a sidecar container as an RBAC proxy to authorize connections.

  • HTTP access: Expose the /metrics endpoint to external connections without RBAC. Any client with network access can read from /metrics.

  • Disable Prometheus entirely.

Vertica provides Helm chart parameters and YAML manifests to configure each option.

Prerequisites

HTTPS with RBAC

The operator SDK framework requires that operators use an authorization proxy for metrics access. Because the operator sends metrics to localhost only, Vertica meets these requirements with a sidecar container with localhost access that enforces RBAC.

RBAC rules are cluster-scoped, and the sidecar authorizes connections from clients associated with a service account that has the correct ClusterRole and ClusterRoleBindings. Vertica provides the following example manifests:

For additional details about ClusterRoles and ClusterRoleBindings, see the Kubernetes documentation.

Create RBAC rules

The following steps create the ClusterRole and ClusterRoleBindings objects that grant access to the /metrics endpoint to a non-Kubernetes resource such as Prometheus. Because RBAC rules are cluster-scoped, you must create or add to an existing ClusterRoleBinding:

  1. Create a ClusterRoleBinding that binds the role for the RBAC sidecar proxy with a service account:

    • Create a ClusterRoleBinding:

      $ kubectl create clusterrolebinding verticadb-operator-proxy-rolebinding \
          --clusterrole=verticadb-operator-proxy-role \
          --serviceaccount=namespace:serviceaccount
      
    • Add a service account to an existing ClusterRoleBinding:

      $ kubectl patch clusterrolebinding verticadb-operator-proxy-rolebinding \
          --type='json' \
          -p='[{"op": "add", "path": "/subjects/-", "value": {"kind": "ServiceAccount", "name": "serviceaccount","namespace": "namespace" } }]'
      
  2. Create a ClusterRoleBinding that binds the role for the non-Kubernetes object to the RBAC sidecar proxy service account:

    • Create a ClusterRoleBinding:

      $ kubectl create clusterrolebinding verticadb-operator-metrics-reader \
          --clusterrole=verticadb-operator-metrics-reader \
          --serviceaccount=namespace:serviceaccount \
          --group=system:authenticated
      
    • Bind the service account to an existing ClusterRoleBinding:

      $ kubectl patch clusterrolebinding verticadb-operator-metrics-reader \
          --type='json' \
          -p='[{"op": "add", "path": "/subjects/-", "value": {"kind": "ServiceAccount", "name": "serviceaccount","namespace": "namespace"},{"op":"add","path":"/subjects/-","value":{"kind": "Group", "name": "system:authenticated"} }]'
      
      $ kubectl patch clusterrolebinding verticadb-operator-metrics-reader \
          --type='json' \
          -p='[{"op": "add", "path": "/subjects/-", "value": {"kind": "ServiceAccount", "name": "serviceaccount","namespace": "namespace" } }]'
      

When you install the Helm chart, the ClusterRole and ClusterRoleBindings are created automatically. By default, the prometheus.expose parameter is set to EnableWithProxy, which creates the service object and exposes the operator's /metrics endpoint.

For details about creating a sidecar container, see VerticaDB custom resource definition.

Service object

Vertica provides a service object verticadb-operator-metrics-service to access the Prometheus /metrics endpoint. The VerticaDB operator does not manage this service object. By default, the service object uses the ClusterIP service type to support RBAC.

Connect to the /metrics endpoint at port 8443 with the following path:

https://verticadb-operator-metrics-service.namespace.svc.cluster.local:8443/metrics

Bearer token authentication

Kubernetes authenticates requests to the API server with service account credentials. Each pod is associated with a service account and has the following credentials stored in the filesystem of each container in the pod:

  • Token at /var/run/secrets/kubernetes.io/serviceaccount/token

  • Certificate authority (CA) bundle at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Use these credentials to authenticate to the /metrics endpoint through the service object. You must use the credentials for the service account that you used to create the ClusterRoleBindings.

For example, the following cURL request accesses the /metrics endpoint. Include the --insecure option only if you do not want to verify the serving certificate:

$ curl --insecure --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://verticadb-operator-metrics-service.vertica:8443/metrics

For additional details about service account credentials, see the Kubernetes documentation.

TLS client certificate authentication

Some environments might prevent you from authenticating to the /metrics endpoint with the service account token. For example, you might run Prometheus outside of Kubernetes. To allow external client connections to the /metrics endpoint, you have to supply the RBAC proxy sidecar with TLS certificates.

You must create a Secret that contains the certificates, and then use the prometheus.tlsSecret Helm chart parameter to pass the Secret to the RBAC proxy sidecar when you install the Helm chart. The following steps create the Secret and install the Helm chart:

  1. Create a Secret that contains the certificates:

    $ kubectl create secret generic metrics-tls --from-file=tls.key=/path/to/tls.key --from-file=tls.crt=/path/to/tls.crt --from-file=ca.crt=/path/to/ca.crt
    
  2. Install the Helm chart with prometheus.tlsSecret set to the Secret that you just created:

    $ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
      --set prometheus.tlsSecret=metrics-tls
    

    The prometheus.tlsSecret parameter forces the RBAC proxy to use the TLS certificates stored in the Secret. Otherwise, the RBAC proxy sidecar generates its own self-signed certificate.

After you install the Helm chart, you can authenticate to the /metrics endpoint with the certificates in the Secret. For example:

$ curl --key tls.key --cert tls.crt --cacert ca.crt https://verticadb-operator-metrics-service.vertica.svc:8443/metrics

HTTP access

You might have an environment that does not require privileged access to Prometheus metrics. For example, you might run Prometheus outside of Kubernetes.

To allow external access to the /metrics endpoint with HTTP, set prometheus.expose to EnableWithoutAuth. For example:

$ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
    --set prometheus.expose=EnableWithoutAuth

Service object

Vertica provides a service object verticadb-operator-metrics-service to access the Prometheus /metrics endpoint. The VerticaDB operator does not manage this service object. By default, the service object uses the ClusterIP service type, so you must change the serviceType for external client access. The service object's fully-qualified domain name (FQDN) is as follows:

verticadb-operator-metrics-service.namespace.svc.cluster.local

Connect to the /metrics endpoint at port 8443 with the following path:

http://verticadb-operator-metrics-service.namespace.svc.cluster.local:8443/metrics

Prometheus operator integration (optional)

Vertica on Kubernetes integrates with the Prometheus operator, which provides custom resources (CRs) that simplify targeting metrics. Vertica supports the ServiceMonitor CR that discovers the VerticaDB operator automatically, and authenticates requests with a bearer token.

The ServiceMonitor CR is available as a release artifact in our GitHub repository. See Helm chart parameters for details about the prometheus.createServiceMonitor parameter.

Disabling Prometheus

To disable Prometheus, set the prometheus.expose Helm chart parameter to Disable:

$ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
    --set prometheus.expose=Disable

For details about Helm install commands, see Installing the VerticaDB operator.

Metrics

The following table describes the available VerticaDB operator metrics:

Name Type Description
controller_runtime_active_workers gauge Number of currently used workers per controller.
controller_runtime_max_concurrent_reconciles gauge Maximum number of concurrent reconciles per controller.
controller_runtime_reconcile_errors_total counter Total number of reconciliation errors per controller.
controller_runtime_reconcile_time_seconds histogram Length of time per reconciliation per controller.
controller_runtime_reconcile_total counter Total number of reconciliations per controller.
controller_runtime_webhook_latency_seconds histogram Histogram of the latency of processing admission requests.
controller_runtime_webhook_requests_in_flight gauge Current number of admission requests being served.
controller_runtime_webhook_requests_total counter Total number of admission requests by HTTP status code.
go_gc_duration_seconds summary A summary of the pause duration of garbage collection cycles.
go_goroutines gauge Number of goroutines that currently exist.
go_info gauge Information about the Go environment.
go_memstats_alloc_bytes gauge Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total counter Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter Total number of frees.
go_memstats_gc_sys_bytes gauge Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge Number of heap bytes that are in use.
go_memstats_heap_objects gauge Number of allocated objects.
go_memstats_heap_released_bytes gauge Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter Total number of pointer lookups.
go_memstats_mallocs_total counter Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge Number of bytes obtained from system.
go_threads gauge Number of OS threads created.
process_cpu_seconds_total counter Total user and system CPU time spent in seconds.
process_max_fds gauge Maximum number of open file descriptors.
process_open_fds gauge Number of open file descriptors.
process_resident_memory_bytes gauge Resident memory size in bytes.
process_start_time_seconds gauge Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge Maximum amount of virtual memory available in bytes.
vertica_cluster_restart_attempted_total counter The number of times we attempted a full cluster restart.
vertica_cluster_restart_failed_total counter The number of times we failed when attempting a full cluster restart.
vertica_cluster_restart_seconds histogram The number of seconds it took to do a full cluster restart.
vertica_nodes_restart_attempted_total counter The number of times we attempted to restart down nodes.
vertica_nodes_restart_failed_total counter The number of times we failed when trying to restart down nodes.
vertica_nodes_restart_seconds histogram The number of seconds it took to restart down nodes.
vertica_running_nodes_count gauge The number of nodes that have a running pod associated with it.
vertica_subclusters_count gauge The number of subclusters that exist.
vertica_total_nodes_count gauge The number of nodes that currently exist.
vertica_up_nodes_count gauge The number of nodes that have vertica running and can accept connections.
vertica_upgrade_total counter The number of times the operator performed an upgrade caused by an image change.
workqueue_adds_total counter Total number of adds handled by workqueue.
workqueue_depth gauge Current depth of workqueue.
workqueue_longest_running_processor_seconds gauge How many seconds has the longest running processor for workqueue been running.
workqueue_queue_duration_seconds histogram How long in seconds an item stays in workqueue before being requested.
workqueue_retries_total counter Total number of retries handled by workqueue.
workqueue_unfinished_work_seconds gauge How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
workqueue_work_duration_seconds histogram How long in seconds processing an item from workqueue takes.

3.6 - Secrets management

The Kubernetes declarative model requires that you develop applications with manifest files or command line interactions with the Kubernetes API. These workflows expose your sensitive information in your application code and shell history, which compromises your application security.

To mitigate any security risks, Kubernetes uses the concept of a secret to store this sensitive information. A secret is an object with a plain text name and a value stored as a base64 encoded string. When you reference a secret by name, Kubernetes retrieves and decodes its value. This lets you openly reference confidential information in your application code and shell without compromising your data.

Kubernetes supports secret workflows with its native Secret object, and cloud providers offer solutions that store your confidential information in a centralized location for easy management. By default, Vertica on Kubernetes supports native Secrets objects, and it also supports cloud solutions so that you have options for storing your confidential data.

For best practices about handling confidential data in Kubernetes, see the Kubernetes documentation.

Manually encode data

In some circumstances, you might need to manually base64 encode your secret value and add it to a Secret manifest or a cloud service secret manager. You can base64 encode data with tools available in your shell. For example, pass the string value to the echo command, and pipe the output to the base64 command to encode the value. In the echo command, include the -n option so that it does not append a newline character:

$ echo -n 'secret-value' | base64
c2VjcmV0LXZhbHVl

You can take the output of this command and add it to a Secret manifest or cloud service secret manager.

Kubernetes Secrets

A Secret is an Kubernetes object that you can reference by name that conceals confidential data in a base64 encoded string. For example, you can create a Secret named su-password that stores the database superuser password. In a manifest file, you can add su-password in place of the literal password value, and then you can safely store the manifest in a file system or pass it on the command line.

The idiomatic way to create a Secret in Kubernetes is with the kubectl command-line tool's create secret command, which provides options to create Secret object from various data sources. For example, the following command creates a Secret named superuser-password from a literal value passed on the command line:

$ kubectl create secret generic superuser-password \
    --from-literal=password=secret-value
secret/superuser-password created

Instead of creating a Kubernetes Secret with kubectl, you can manually base64 encode a string on the command line, and then add the encoded output to a Secrets manifest.

Cloud providers

Cloud providers offer services that let you store sensitive information in a central location and reference it securely. Vertica on Kubernetes requires a specific format for secrets stored in cloud providers. In addition, each cloud provider requires unique configuration before you can add a secret to your VerticaDB custom resource (CR).

The following VerticaDB CR parameters accept secrets from cloud services:

  • communal.credentialSecret
  • nmaTLSSecret
  • passwordSecret

Format requirements

Cloud provider secrets consist of a name and a secret value. To provide flexibility, cloud services let you store the value in a variety of formats. Vertica on Kubernetes requires that you format the secret value as a JSON document consisting of plain text string keys and base64 encoded values. For example, you might have a secret named tlsSecrets whose value is a JSON document in the following format:

{
  "ca.crt": "base64-endcoded-ca.crt",
  "tls.crt" "base64-endcoded-tls.crt",
  "tls.key": "base64-endcoded-tls.key",
  "password": "base64-endcoded-password"
}

Amazon Web Services

Amazon Web Services (AWS) provides the AWS Secrets Manager, a storage system for your sensitive data. To access secrets from your AWS console, go to Services > Security, Identity, & Compliance > Secrets Manager.

IAM permissions

Before you can add a secret to a CR, you must grant the following permissions to the VerticaDB operator pod and the Vertica server pods so they can access AWS Secret Manager. You can grant these permissions to the worker node's IAM policy or the IAM roles for service account (IRSA):

For instructions about adding permissions to an AWS Secrets Manager secret, see the AWS documentation. For details about Vertica on Kubernetes and AWS IRSA, see Configuring communal storage.

Adding a secret to a CR

AWS stores secrets with metadata that describe and track changes to the secret. An important piece of metadata is the Amazon Resource Name (ARN), a unique identifier for the secret. The ARN uses the following format:

arn:aws:secretsmanager:region:accountId:secret:SecretName-randomChars

To use an AWS secret in a CR, you have to add the ARN to the applicable CR parameter and prefix it with awssm://. For example:

    spec:
      ...
      passwordSecret: awssm://arn:aws:secretsmanager:region:account-id:secret:myPasswordSecret-randomChars
      nmaTLSSecret: awssm://arn:aws:secretsmanager:region:account-id:secret:myNmaTLSSecret-randomChars
      communal:
        credentialSecret: awssm://arn:aws:secretsmanager:region:account-id:secret:myCredentialSecret-randomChars
        path: s3://bucket-name/key-name
        ...

Google Cloud Platform

Google Cloud provides Google Secret Manager, a storage system for your sensitive data. To access your secrets from your Google Cloud console, go to Security > Secret Manager.

When you pass a Google secret as a CRD parameter, use the secret's resource name. The resource name uses the following format:

projects/project-id/secrets/secret-name/versions/version-number

To use a Secret Manager secret in a CR, you have to add the resource name to the applicable CR parameter and prefix it with gsm://. For example:

    spec:
      ...
      passwordSecret: gsm://projects/project-id/secrets/password-secret/versions/version-number
      nmaTLSSecret: gsm://projects/project-id/secrets/nma-certs-secret/versions/version-number
      communal:
        credentialSecret: gsm://projects/project-id/secrets/gcp-creds-secret/versions/version-number
        path: gs://bucket-name/path/to/database-name
        ...

4 - Configuring communal storage

Vertica on Kubernetes supports a variety of communal storage providers to accommodate your storage requirements.

Vertica on Kubernetes supports a variety of communal storage providers to accommodate your storage requirements. Each storage provider uses authentication methods that conceal sensitive information so that you can declare that information in your Custom Resource (CR) without exposing any literal values.

AWS S3 or S3-Compatible storage

Vertica on Kubernetes supports multiple authentication methods for Amazon Web Services (AWS) communal storage locations and private cloud S3 storage such as MinIO.

For additional details about Vertica and AWS, see Vertica on Amazon Web Services.

Secrets authentication

To connect to an S3-compatible storage location, create a Secret to store both your communal access and secret key credentials. Then, add the Secret, path, and S3 endpoint to the CR spec.

  1. The following command stores both your S3-compatible communal access and secret key credentials in a Secret named s3-creds:

    $ kubectl create secret generic s3-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
    
  2. Add the Secret to the communal section of the CR spec:

    spec:
      ...
      communal:
        credentialSecret: s3-creds
        endpoint: https://path/to/s3-endpoint
        path: s3://bucket-name/key-name
        ...
    

For a detailed description of an S3-compatible storage implementation, see VerticaDB custom resource definition.

IAM profile authentication

Identify and access management (IAM) profiles manage user identities and control which services and resources a user can access. IAM authentication to Vertica on Kubernetes reduces the number of manual updates when you rotate your access keys.

The IAM profile must have read and write access to the communal storage. The IAM profile is associated with the EC2 instances that run worker nodes.

  1. Create an EKS node group using a Node IAM role with a policy that allows read and write access to the S3 bucket used for communal storage.

  2. Deploy the VerticaDB operator in a namespace. For details, see Installing the VerticaDB operator.

  3. Create a VerticaDB custom resource (CR), and omit the communal.credentialSecret field:

    spec:
      ...
      communal:
        endpoint: https://path/to/s3-endpoint
        path: s3://bucket-name/key-name
    

When the Vertica server accesses the communal storage location, it uses the policy associated to the EKS node.

For additional details about authenticating to Vertica with an IAM profile, see AWS authentication.

IRSA profile authentication

You can use IAM roles for service accounts (IRSA) to associate an IAM role with a Kubernetes service account. You must set the IAM policies for the Kubernetes service account, and then pods running that service account have the IAM policies.

Before you begin, complete the following prerequisites:

  • Configure the EKS cluster's control plane. For details, see the Amazon documentation.

  • Create a bucket policy that has access to the S3 communal storage bucket. For details, see the Amazon documentation.

  1. Create an EKS node group using a Node IAM role that does not have S3 access.

  2. Use eksctl to create the IAM OpenID Connect (OIDC) provider for your EKS cluster:

    $ eksctl utils associate-iam-oidc-provider --cluster cluster --approve
    2022-10-07 08:31:37 []  will create IAM Open ID Connect provider for cluster "cluster" in "us-east-1"
    2022-10-07 08:31:38 []  created IAM Open ID Connect provider for cluster "cluster" in "us-east-1"
    
  3. Create the Kubernetes namespace where you plan to create the iamserviceaccount. The following command creates the vertica namespace:

    $ kubectl create ns vertica
    namespace/vertica created
    
  4. Use eksctl to create a Kubernetes service account in the vertica namespace. When you create a service account with eksctl, you can attach an IAM policy that allows S3 access:

    $ eksctl create iamserviceaccount --name shared-service-account --namespace vertica --cluster cluster --attach-policy-arn arn:aws:iam::profile:policy/policy --approve
    2022-10-07 08:38:32 []  1 iamserviceaccount (vertica/my-serviceaccount) was included (based on the include/exclude rules)
    2022-10-07 08:38:32 [!]  serviceaccounts that exist in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
    2022-10-07 08:38:32 []  1 task: {
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "vertica/my-serviceaccount",
            create serviceaccount "vertica/my-serviceaccount",
        } }2022-10-07 08:38:32 []  building iamserviceaccount stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
    2022-10-07 08:38:33 []  deploying stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
    2022-10-07 08:38:33 []  waiting for CloudFormation stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
    2022-10-07 08:39:03 []  waiting for CloudFormation stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
    2022-10-07 08:39:04 []  created serviceaccount "vertica/my-serviceaccount"
    
  5. Create a VerticaDB custom resource (CR). Specify the service account with the serviceAccountName field, and omit the communal.credentialSecret field:

    apiVersion: vertica.com/v1
    kind: VerticaDB
    metadata:
      name: irsadb
      annotations:
        vertica.com/run-nma-in-sidecar: "false"
    spec:
      image: vertica/vertica-k8s:12.0.3-0
      serviceAccountName: shared-service-account
      communal:
        path: "s3://path/to/s3-endpoint
        endpoint: https://s3.amazonaws.com
      subclusters:
        - name: sc
          size: 3
    

When pods are created, they use the service account that has a policy that provides access to the S3 communal storage.

Server-side encryption

If your S3 communal storage uses server-side encryption (SSE), you must configure the encryption type when you create the CR. Vertica supports the following types of SSE:

  • SSE-S3
  • SSE-KMS
  • SSE-C

For details about Vertica support for each encryption type, see S3 object store.

The following tabs provide examples on how to implement each SSE type. For details about the parameters, see Custom resource definition parameters.

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: verticadb
  annotations:
    vertica.com/run-nma-in-sidecar: "false"
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-S3

This setting requires that you use the communal.additionalConfig parameter to pass the key identifier (not the key) of the Key management service. Vertica must have permission to use the key, which is managed through KMS:

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: verticadb
  annotations:
    vertica.com/run-nma-in-sidecar: "false"
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-KMS
    additionalConfig:
      S3SseKmsKeyId: "kms-key-identifier"

Store the client key contents in a Secret and reference the Secret in the CR. The client key must be either a 32-character plaintext key or a 44-character base64-encoded key.

You must create the Secret in the same namespace as the CR:

  1. Create a Secret that stores the client key contents in the stringData.clientKey field:
   apiVersion: v1
   kind: Secret
   metadata:
     name: sse-c-key
     annotations:
       vertica.com/run-nma-in-sidecar: "false"
   stringData:
     clientKey: client-key-contents
   
  1. Add the Secret to the CR with the communal.s3SseCustomerKeySecret parameter:
   apiVersion: vertica.com/v1
   kind: VerticaDB
   metadata:
     name: verticadb
     annotations:
       vertica.com/run-nma-in-sidecar: "false"
   spec:
     communal:
       path: "s3://bucket-name"
       s3ServerSideEncryption: SSE-C
       s3SseCustomerKeySecret: "sse-c-key"
   ...
   

Google Cloud Storage

Authenticating to Google Cloud Storage (GCS) requires your hash-based message authentication code (HMAC) access and secret keys, and the path to your GCS bucket. For details about HMAC keys, see Eon Mode on GCP prerequisites.

You have two authentication options: you can authenticate with Kubernetes Secrets, or you can use the keys stored in Google Secret Manager.

For additional details about Vertica and GCS, see Vertica on Google Cloud Platform.

Kubernetes Secret authentication

To authenticate with a Kubernetes Secret, create the secret and add it to the CR manifest:

  1. The following command stores your HMAC access and secret key in a Secret named gcs-creds:

    $ kubectl create secret generic gcs-creds --from-literal=accesskey=accessKey --from-literal=secretkey=secretkey
    
  2. Add the Secret and the path to the GCS bucket that contains your Vertica database to the communal section of the CR spec:

    spec:
      ...
      communal:
        credentialSecret: gcs-creds
        path: gs://bucket-name/path/to/database-name
        ...
    

Azure Blob Storage

Micosoft Azure provides a variety of options to authenticate to Azure Blob Storage location. Depending on your environment, you can use one of the following combinations to store credentials in a Secret:

  • accountName and accountKey

  • accountName and shared access signature (SAS)

If you use an Azure storage emulator such as Azurite in a tesing environment, you can authenticate with accountName and blobStorage values.

  1. The following command stores accountName and accountKey in a Secret named azb-creds:

    $ kubectl create secret generic azb-creds --from-literal=accountKey=accessKey --from-literal=accountName=accountName
    

    Alternately, you could store your accountName and your SAS credentials in azb-creds:

    $ kubectl create secret generic azb-creds --from-literal=sharedAccessSignature=sharedAccessSignature --from-literal=accountName=accountName
    
  2. Add the Secret and the path that contains your AZB storage bucket to the communal section of the CR spec:

    spec:
      ...
      communal:
        credentialSecret: azb-creds
        path: azb://accountName/bucket-name/database-name
        ...
    

For details about Vertica and authenticating to Microsoft Azure, see Eon Mode on GCP prerequisites.

Hadoop file storage

Connect to Hadoop Distributed Filesystem (HDFS) communal storage with the standard webhdfs scheme, or the swebhdfs scheme for wire encryption. In addition, you must add your HDFS configuration files in a ConfigMap, a Kubernetes object that stores data in key-value pairs. You can optionally configure Kerberos to authenticate connections to your HDFS storage location.

The following example uses the swebhdfs wire encryption scheme that requires a certificate authority (CA) bundle in the CR spec.

  1. The following command stores a PEM-encoded CA bundle in a Secret named hadoop-cert:

    $ kubectl create secret generic hadoop-cert --from-file=ca-bundle.pem
    
  2. HDFS configuration files are located in the /etc/hadoop directory. The following command creates a ConfigMap named hadoop-conf:

    $ kubectl create configmap hadoop-conf --from-file=/etc/hadoop
    
  3. Add the configuration values to the communal and certSecrets sections of the spec:

    spec:
      ...
      hadoopConfig: hadoop-conf
      communal:
        path: "swebhdfs://path/to/database"
    
        caFile: /certs/hadoop-cert/ca-bundle.pem
      certSecrets:
        - name: hadoop-cert
      ...
    

    The previous example defines the following:

    • hadoopConfig: ConfigMap that stores the contents of the /etc/hadoop directory.
    • communal.path: Path to the database, using the wire encryption scheme. Enclose the path in double quotes.
    • communal.caFile: Mount path in the container filesystem containing the CA bundle used to create the hadoop-cert Secret.
    • certSecrets.name: Secret containing the CA bundle.

For additional details about HDFS and Vertica, see Apache Hadoop integration.

Kerberos authentication (optional)

Vertica authenticates connections to HDFS with Kerberos. The Kerberos configuration between Vertica on Kubernetes is the same as between a standard Eon Mode database and Kerberos, as described in Kerberos authentication.

  1. The following command stores the krb5.conf and krb5.keytab files in a Secret named krb5-creds:

    $ kubectl create secret generic krb5-creds --from-file=kerberos-conf=/etc/krb5.conf --from-file=kerberos-keytab=/etc/krb5.keytab
    

    Consider the following when managing the krb5.conf and krb5.keytab files in Vertica on Kubernetes:

    • Each pod uses the same krb5.keytab file, so you must update the krb5.keytab file before you begin any scaling operation.

    • When you update the contents of the krb5.keytab file, the operator updates the mounted files automatically, a process that does not require a pod restart.

    • The krb5.conf file must include a [domain_realm] section that maps the Kubernetes cluster domain to the Kerberos realm. The following example maps the default .cluster.local domain to a Kerberos realm named EXAMPLE.COM:

      [domain_realm]
        .cluster.local = EXAMPLE.COM
      
  2. Add the Secret and additional Kerberos configuration information to the CR:

    spec:
      ...
      hadoopConfig: hadoop-conf
      communal:
        path: "swebhdfs://path/to/database"
        additionalConfig:
          kerberosServiceName: verticadb
          kerberosRealm: EXAMPLE.COM    
      kerberosSecret: krb5-creds
      ...
    

The previous example defines the following:

  • hadoopConfig: ConfigMap that stores the contents of the /etc/hadoop directory.
  • communal.path: Path to the database, using the wire encryption scheme. Enclose the path in double quotes.
  • communal.additionalConfig.kerberosServiceName: Service name for the Vertica principal.
  • communal.additionalConfig.kerberosRealm: Realm portion of the principal.
  • kerberosSecret: Secret containing the krb5.conf and krb5.keytab files.

For a complete definition of each of the previous values, see Custom resource definition parameters.

5 - Custom resource definitions

The custom resource definition (CRD) is a shared global object that extends the Kubernetes API beyond the standard resource types.

The custom resource definition (CRD) is a shared global object that extends the Kubernetes API beyond the standard resource types. The CRD serves as a blueprint for custom resource (CR) instances. You create CRs that specify the desired state of your environment, and the operator monitors the CR to maintain state for the objects within its namespace.

5.1 - VerticaDB custom resource definition

The VerticaDB custom resource definition (CRD) deploys an Eon Mode database. Each subcluster is a StatefulSet, a workload resource type that persists data with ephemeral Kubernetes objects.

A VerticaDB custom resource (CR) requires a primary subcluster and a connection to a communal storage location to persist its data. The VerticaDB operator monitors the CR to maintain its desired state and validate state changes.

The following sections provide a YAML-formatted manifest that defines the minimum required fields to create a VerticaDB CR, and each subsequent section implements a production-ready recommendation or best practice using custom resource parameters. For a comprehensive list of all parameters and their definitions, see custom resource parameters.

Prerequisites

Minimal manifest

At minimum, a VerticaDB CR requires a connection to an empty communal storage bucket and a primary subcluster definition. The operator is namespace-scoped, so make sure that you apply the CR manifest in the same namespace as the operator.

The following VerticaDB CR connects to S3 communal storage and deploys a three-node primary subcluster on three nodes. This manifest serves as the starting point for all implementations detailed in the subsequent sections:

      
apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: cr-name
spec:
  licenseSecret: vertica-license
  passwordSecret: su-password
  communal:
    path: "s3://bucket-name/key-name"
    endpoint: "https://path/to/s3-endpoint"
    credentialSecret: s3-creds
    region: region
  subclusters:
    - name: primary
      size: 3
  shardCount: 6

The following sections detail the minimal manifest's CR parameters, and how to create the CR in the current namespace.

Required fields

Each VerticaDB manifest begins with required fields that describe the version, resource type, and metadata:

  • apiVersion: The API group and Kubernetes API version in api-group/version format.
  • kind: The resource type. VerticaDB is the name of the Vertica custom resource type.
  • metadata: Data that identifies objects in the namespace.
  • metadata.name: The name of this CR object. Provide a unique metadata.name value so that you can identify the CR and its resources in its namespace.

spec definition

The spec field defines the desired state of the CR. The operator control loop compares the spec definition to the current state and reconciles any differences.

Nest all fields that define your StatefulSet under the spec field.

Add a license

By default, the Helm chart pulls the free Vertica Community Edition (CE) image. The CE image has a restricted license that limits you to a three-node cluster and 1TB of data.

To add your license so that you can deploy more nodes and use more data, store your license in a Secret and add it to the manifest:

  1. Create a Secret from your Vertica license file:
    $ kubectl create secret generic vertica-license --from-file=license.dat=/path/to/license-file.dat
    
  2. Add the Secret to the licenseSecret field:
    ...
    spec:
      licenseSecret: vertica-license
      ...
    

The licenseSecret value is mounted in the Vertica server container in the /home/dbadmin/licensing/mnt directory.

Add password authentication

The passwordSecret field enables password authentication for the database. You must define this field when you create the CR—you cannot define a password for an existing database.

To create a database password, conceal it in a Secret before you add it to the mainfest:

  1. Create a Secret from a literal string. You must use password as the key:
    $ kubectl create secret generic su-passwd --from-literal=password=password-value 
    
  2. Add the Secret to the passwordSecret field:
    ...
    spec:
      ...
      passwordSecret: su-password
    

Connect to communal storage

Vertica on Kubernetes supports multiple communal storage locations. For implementation details for each communal storage location, see Configuring communal storage.

This CR connects to an S3 communal storage location. Define your communal storage location with the communal field:

      
...
spec:
  ...
  communal:
    path: "s3://bucket-name/key-name"
    endpoint: "https://path/to/s3-endpoint"
    credentialSecret: s3-creds
    region: region
  ...

This manifest sets the following parameters:

  • credentialSecret: The Secret that contains your communal access and secret key credentials.

    The following command stores both your S3-compatible communal access and secret key credentials in a Secret named s3-creds:

    $ kubectl create secret generic s3-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
    

  • endpoint: The S3 endpoint URL.

  • path: The location of the S3 storage bucket, in S3 bucket notation. This bucket must exist before you create the custom resource. After you create the custom resource, you cannot change this value.

  • region: The geographic location of the communal storage resources. This field is valid for AWS and GCP only. If you set the wrong region, you cannot connect to the communal storage location.

Define a primary subcluster

Each CR requires a primary subcluster or it returns an error. At minimum, you must define the name and size of the subcluster:

...
spec:
  ...
  subclusters:
    - name: primary
      size: 3
  ...

This manifest sets the following parameters:

  • name: The name of the subcluster.
  • size: The number of pods in the subcluster.

When you define a CR with a single subcluster, the operator designates it as the primary subcluster. If your manifest includes multiple subclusters, you must use the type parameter to identify the primary subcluster. For example:

spec:
  ...
  subclusters:
    - name: primary
      size: 3
      type: primary
    - name: secondary
      size: 3

For additional details about primary and secondary subclusters, see Subclusters.

Set the shard count

shardCount specifies the number of shards in the database, which determines how subcluster nodes subscribe to communal storage data. You cannot change this value after you instantiate the CR. When you change the number of pods in a subcluster or add or remove a subcluster, the operator rebalances shards automatically.

Vertica recommends that the shard count equals double the number of nodes in the cluster. Because this manifest creates a three-node cluster with one Vertica server container per node, set shardCount to 6:

...
spec:
  ...
  shardCount: 6

For guidance on selecting the shard count, see Configuring your Vertica cluster for Eon Mode. For details about limiting each node to one Vertica server container, see Node affinity.

Apply the manifest

After you define the minimal manifest in a YAML-formatted file, use kubectl to create the VerticaDB CR. The following command creates a CR in the current namespace:

$ kubectl apply -f minimal.yaml
verticadb.vertica.com/cr-name created

After you apply the manifest, the operator creates the primary subcluster, connects to the communal storage, and creates the database. You can use kubectl wait to see when the database is ready:

$ kubectl wait --for=condition=DBInitialized=True vdb/cr-name --timeout=10m 
verticadb.vertica.com/cr-name condition met

Specify an image

Each time the operator launches a container, it pulls the image for the most recently released Vertica version from the OpenText Dockerhub repository. Vertica recommends that you explicitly set the image that the operator pulls for your CR. For a list of available Vertica images, see the OpenText Dockerhub registry.

To run a specific image version, set the image parameter in docker-registry-hostname/image-name:tag format:

spec:
  ...
  image: vertica/vertica-k8s:version

When you specify an image other than the latest, the operator pulls the image only when it is not available locally. You can control when the operator pulls the image with the imagePullPolicy custom resource parameter.

Communal storage authentication

Your communal storage validates HTTPS connections with a self-signed certificate authority (CA) bundle. You must make the CA bundle's root certificate available to each Vertica server container so that the communal storage can authenticate requests from your subcluster.

This authentication requires that you set the following parameters:

  • certSecrets: Adds a Secret that contains the root certificate.

    This parameter is a list of Secrets that encrypt internal and external communications for your CR. Each certificate is mounted in the Vertica server container filesystem in the /certs/Secret-name/cert-name directory.

  • communal.caFile: Makes the communal storage location aware of the mount path that stores the certificate Secret.

Complete the following to add these parameters to the manifest:

  1. Create a Secret that contains the PEM-encoded root certificate. The following command creates a Secret named aws-cert:
    $ kubectl create secret generic aws-cert --from-file=root-cert.pem
    
  2. Add the certSecrets and communal.caFile parameters to the manifest:
    spec:
      ...
        communal:
          ...
          caFile: /certs/aws-cert/root_cert.pem
        certSecrets:
          - name: aws-cert
    

Now, the communal storage authenticates requests with the /certs/aws-cert/root_cert.pem file, whose contents are stored in the aws-cert Secret.

External client connections

Each subcluster communicates with external clients and internal pods through a service object. To configure the service object to accept external client connections, set the following parameters:

  • serviceName: Assigns a custom name to the service object. A custom name lets you identify it among multiple subclusters.

    Service object names use the metadata.name-serviceName naming convention.

  • serviceType: Defines the type of the subcluster service object.

    By default, a subcluster uses the ClusterIP serviceType, which sets a stable IP and port that is accessible from within Kubernetes only. In many circumstances, external client applications need to connect to a subcluster that is fine-tuned for that specific workload. For external client access, set the serviceType to NodePort or LoadBalancer.

  • serviceAnnotations: Assigns a custom annotation to the service object for implementation-specific services.

Add these external client connection parameters under the subclusters field:

spec:
  ...
  subclusters:
    ...
    serviceName: connections
    serviceType: LoadBalancer
    serviceAnnotations:
      service.beta.kubernetes.io/load-balancer-source-ranges: 10.0.0.0/24

This example creates a LoadBalancer service object named verticadb-connections. The serviceAnnotations parameter defines the CIDRs that can access the network load balancer (NLB). For additional details, see the AWS Load Balancer Controller documentation.

For additional details about Vertica and service objects, see Containerized Vertica on Kubernetes.

Authenticate clients

You might need to connect applications or command-line interface (CLI) tools to your VerticaDB CR. You can add TLS certificates that authenticate client requests with the certSecrets parameter:

  1. Create a Secret that contains your TLS certificates. The following command creates a Secret named mtls:
    $ kubectl create secret generic mtls --from-file=mtls=/path/to/mtls-cert
    
  2. Add the Secret to the certSecrets parameter:
    spec:
      ...
      certSecrets:
        ...
        - name: mtls
    
    This mounts the TLS certificates in the /certs/mtls/mtls-cert directory.

Sidecar logger

A sidecar is a utility container that runs in the same pod as your main application container and performs a task for that main application's process. The VerticaDB CR uses a sidecar container to handle logs for the Vertica server container. You can use the vertica-logger image to add a sidecar that sends logs from vertica.log to standard output on the host node for log aggregation.

Add a sidecar with the sidecars parameter. This parameter accepts a list of sidecar definitions, where each element specifies the following:

  • name: Name of the sidecar. name indicates the beginning of a sidecar element.
  • image: Image for the sidecar container.

The following example adds a single sidecar container that shares a pod with each Vertica server container:

spec:
  ...
  sidecars:
    - name: sidecar-container
      image: sidecar-image:latest

This configuration persists logs only for the lifecycle of the container. To persist log data between pod lifecycles, you must mount a custom volume in the sidecar filesystem.

Persist logs with a volume

An external service that requires long-term access to Vertica server data should use a volume to persist that data between pod lifecycles. For details about volumes, see the Kubernetes documentation.

The following parameters add a volume to your CR and mounts it in a sidecar container:

  • volumes: Make a custom volume available to the CR so that you can mount it in a container filesystem. This parameter requires a name value and a volume type.
  • sidecars[i].volumeMounts: Mounts one or more volumes in the sidecar container filesystem. This parameter requires a name value and a mountPath value that defines where the volume is mounted in the sidecar container.

The following example creates a volume of type emptyDir, and mounts it in the sidecar-container filesystem:

spec:
  ...
  volumes:
    - name: sidecar-vol
      emptyDir: {}
  ...
  sidecars:
    - name: sidecar-container
      image: sidecar-image:latest
      volumeMounts:
        - name: sidecar-vol
          mountPath: /path/to/sidecar-vol

Resource limits and requests

You should limit the amount of CPU and memory resources that each host node allocates for the Vertica server pod, and set the amount of resources each pod can request.

To control these values, set the following parameters under the subclusters.resources field:

  • limits.cpu: Maximum number of CPUs that each server pod can consume.
  • limits.memory: Maximum amount of memory that each server pod can consume.
  • requests.cpu: Number CPUs that each pod requests from the host node.
  • requests.memory: Amount of memory that each pod requests from a PV.

When you change resource settings, Kubernetes restarts each pod with the updated settings.

As a best practice, set resource.limits.* and resource.requests.* to equal values so that the pods are assigned to the Guaranteed Quality of Service (QoS) class. Equal settings also provide the best safeguard against the Out Of Memory (OOM) Killer in constrained environments.

The following example allocates 32 CPUs and 96 gigabytes of memory on the host node, and limits the requests to the same values. Because the limits.* and requests.* values are equal, the pods are assigned the Guaranteed QoS class:

spec:
  ...
  subclusters:
    ...
    resources:
      limits:
        cpu: 32
        memory: 96Gi
      requests:
        cpu: 32
        memory: 96Gi

Node affinity

Kubernetes affinity and anti-affinity settings control which resources the operator uses to schedule pods. As a best practice, you should set affinity to ensure that a single node does not serve more than one Vertica pod.

The following example creates an anti-affinity rule that schedules only one Vertica server pod per node:

spec:
  ...
  subclusters:
    ...
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - vertica
          topologyKey: "kubernetes.io/hostname"

The following provides a detailed explanation about all settings in the previous example:

  • affinity: Provides control over pod and host scheduling using labels.
  • podAntiAffinity: Uses pod labels to prevent scheduling on certain resources.
  • requiredDuringSchedulingIgnoredDuringExecution: The rules defined under this statement must be met before a pod is scheduled on a host node.
  • labelSelector: Identifies the pods affected by this affinity rule.
  • matchExpressions: A list of pod selector requirements that consists of a key, operator, and values definition. This matchExpression rule checks if the host node is running another pod that uses a vertica label.
  • topologyKey: Defines the scope of the rule. Because this uses the hostname topology label, this applies the rule in terms of pods and host nodes.

For additional details, see the Kubernetes documentation.

5.2 - EventTrigger custom resource definition

The EventTrigger custom resource definition (CRD) runs a task when the condition of a Kubernetes object changes to a specified status. EventTrigger extends the Kubernetes Job, a workload resource that creates pods, runs a task, then cleans up the pods after the task completes.

Prerequisites

  • Deploy a VerticaDB operator.
  • Confirm that you have the resources to deploy objects you plan to create.

Limitations

The EventTrigger CRD has the following limitations:

  • It can monitor a condition status on only one VerticaDB custom resource (CR).
  • You can match only one condition status.
  • The EventTrigger and the object that it watches must exist within the same namespace.

Creating an EventTrigger

An EventTrigger resource defines the Kubernetes object that you want to watch, the status condition that triggers the Job, and a pod template that contains the Job logic and provides resources to complete the Job.

This example creates a YAML-formatted file named eventtrigger.yaml. When you apply eventtrigger.yaml to your VerticaDB CR, it creates a single-column database table when the VerticaDB CR's DBInitialized condition status changes to True:

$ kubectl describe vdb verticadb-name
Status:
 ...
  Conditions:
    ...
    Last Transition Time:   transition-time
    Status:                 True 
    Type:                   DBInitialized

The following fields form the spec, which defines the desired state of the EventTrigger object:

  • references: The Kubernetes object whose condition status you want to watch.
  • matches: The condition and status that trigger the Job.
  • template: Specification for the pods that run the Job after the condition status triggers an event.

The following steps create an EventTrigger CR:

  1. Add the apiVersion, kind, and metadata.name required fields:

    apiVersion: vertica.com/v1beta1
    kind: EventTrigger
    metadata:
        name: eventtrigger-example
    
  2. Begin the spec definition with the references field. The object field is an array whose values identify the VerticaDB CR object that you want to watch. You must provide the VerticaDB CR's apiVersion, kind, and name:

    spec:
      references:
      - object:
          apiVersion: vertica.com/v1beta1
          kind: VerticaDB
          name: verticadb-example
    
  3. Define the matches field that triggers the Job. EventTrigger can match only one condition:

    spec:
      ...
      matches:
      - condition:
          type: DBInitialized
          status: "True"
    

    The preceding example defines the following:

    • condition.type: The condition that the operator watches for state change.
    • condition.status: The status that triggers the Job.
  4. Add the template that defines the pod specifications that run the Job after matches.condition triggers an event.

    A pod template requires its own spec definition, and it can optionally have its own metadata. The following example includes metadata.generateName, which instructs the operator to generate a unique, random name for any pods that it creates for the Job. The trailing dash (-) separates the user-provided portion from the generated portion:

    spec:
      ...
      template:
        metadata:
          generateName: create-user-table-
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
              - name: main
                image: "vertica/vertica-k8s:latest"
                command: ["/opt/vertica/bin/vsql", "-h", "verticadb-sample-defaultsubcluster", "-f", "CREATE TABLE T1 (C1 INT);"]
    

    The remainder of the spec defines the following:

    • restartPolicy: When to restart all containers in the pod.
    • containers: The containers that run the Job.
      • name: The name of the container.
      • image: The image that the container runs.
      • command: An array that contains a command, where each element in the array combines to form a command. The final element creates the single-column SQL table.

Apply the manifest

After you create the EventTrigger, apply the manifest in the same namespace as the VerticaDB CR:

$ kubectl apply -f eventtrigger.yaml

eventtrigger.vertica.com/eventtrigger-example created
configmap/create-user-table-sql created

After you create the database, the operator runs a Job that creates a table. You can check the status with kubectl get job:

$ kubectl get job
NAME                COMPLETIONS   DURATION   AGE
create-user-table   1/1           4s         7s

Verify that the table was created in the logs:

$ kubectl logs create-user-table-guid
CREATE TABLE

Complete file reference

apiVersion: vertica.com/v1beta1
kind: EventTrigger
metadata:
    name: eventtrigger-example
spec:
  references:
  - object:
      apiVersion: vertica.com/v1beta1
      kind: VerticaDB
      name: verticadb-example
  matches:
  - condition:
      type: DBInitialized
      status: "True"
  template:
    metadata:
      generateName: create-user-table-
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: main
            image: "vertica/vertica-k8s:latest"
            command: ["/opt/vertica/bin/vsql", "-h", "verticadb-sample-defaultsubcluster", "-f", "CREATE TABLE T1 (C1 INT);"]

Monitoring an EventTrigger

The following table describes the status fields that help you monitor an EventTrigger CR:

Status Field Description
references[].apiVersion Kubernetes API version of the object that the EventTrigger CR watches.
references[].kind Type of object that the EventTrigger CR watches.
references[].name Name of the object that the EventTrigger CR watches.
references[].namespace Namespace of the object that the EventTrigger CR watches. The EventTrigger and the object that it watches must exist within the same namespace.
references[].uid Generated UID of the reference object. The operator generates this identifier when it locates the reference object.
references[].resourceVersion Current resource version of the object that the EventTrigger watches.
references[].jobNamespace If a Job was created for the object that the EventTrigger watches, the namespace of the Job.
references[].jobName If a Job was created for the object that the EventTrigger watches, the name of the Job.

5.3 - VerticaAutoscaler custom resource definition

The VerticaAutoscaler custom resource (CR) is a HorizontalPodAutoscaler that automatically scales resources for existing subclusters using one of the following strategies:.

The VerticaAutoscaler custom resource (CR) is a HorizontalPodAutoscaler that automatically scales resources for existing subclusters using one of the following strategies:

  • Subcluster scaling for short-running dashboard queries.

  • Pod scaling for long-running analytic queries.

The VerticaAutoscaler CR scales using resource or custom metrics. Vertica manages subclusters by workload, which helps you pinpoint the best metrics to trigger a scaling event. To maintain data integrity, the operator does not scale down unless all connections to the pods are drained and sessions are closed.

For details about the algorithm that determines when the VerticaAutoscaler scales, see the Kubernetes documentation.

Additionally, the VerticaAutoscaler provides a webhook to validate state changes. By default, this webhook is enabled. You can configure this webhook with the webhook.enable Helm chart parameter.

Examples

The examples in this section use the following VerticaDB custom resource. Each example uses CPU to trigger scaling:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: dbname
spec:
  communal:
    path: "path/to/communal-storage"
    endpoint: "path/to/communal-endpoint"
    credentialSecret: credentials-secret
  subclusters:
    - name: primary1
      size: 3
      isPrimary: true
      serviceName: primary1
      resources:
        limits:
          cpu: "8"
        requests:
          cpu: "4"

Prerequisites

  • Set a value for the metric that triggers scaling. For example, if you want to scale by CPU utilization, you must set CPU limits and requests.

Subcluster scaling

Automatically adjust the number of subclusters in your custom resource to fine-tune resources for short-running dashboard queries. For example, increase the number of subclusters to increase throughput. For more information, see Improving query throughput using subclusters.

All subclusters share the same service object, so there are no required changes to external service objects. Pods in the new subcluster are load balanced by the existing service object.

The following example creates a VerticaAutoscaler custom resource that scales by subcluster when the VerticaDB uses 50% of the node's available CPU:

  1. Define the VerticaAutoscaler custom resource in a YAML-formatted manifest:

    apiVersion: vertica.com/v1beta1
    kind: VerticaAutoscaler
    metadata:
      name: autoscaler-name
    spec:
      verticaDBName: dbname
      scalingGranularity: Subcluster
      serviceName: primary1
    
  2. Create the VerticaAutoscaler with the kubectl autoscale command:

    $ kubectl autoscale verticaautoscaler autoscaler-name --cpu-percent=50 --min=3 --max=12
    

    The previous command creates a HorizontalPodAutoscaler object that:

    • Sets the target CPU utilization to 50%.

    • Scales to a minimum of three pods in one subcluster, and 12 pods in four subclusters.

Pod scaling

For long-running, analytic queries, increase the pod count for a subcluster. For additional information about Vertica and analytic queries, see Using elastic crunch scaling to improve query performance.

When you scale pods in an Eon Mode database, you must consider the impact on database shards. For details, see Namespaces and shards.

The following example creates a VerticaAutoscaler custom resource that scales by pod when the VerticaDB uses 50% of the node's available CPU:

  1. Define the VerticaAutoScaler custom resource in a YAML-formatted manifest:

    apiVersion: vertica.com/v1beta1
    kind: VerticaAutoscaler
    metadata:
      name: autoscaler-name
    spec:
      verticaDBName: dbname
      scalingGranularity: Pod
      serviceName: primary1
    
  2. Create the autoscaler instance with the kubectl autoscale command:

    $ kubectl autoscale verticaautoscaler autoscaler-name --cpu-percent=50 --min=3 --max=12
    

    The previous command creates a HorizontalPodAutoscaler object that:

    • Sets the target CPU utilization to 50%.

    • Scales to a minimum of three pods in one subcluster, and 12 pods in four subclusters.

Event monitoring

To view the VerticaAutoscaler object, use the kubetctl describe hpa command:

$ kubectl describe hpa autoscaler-name
Name:                                                  as
Namespace:                                             vertica
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Tue, 12 Apr 2022 15:11:28 -0300
Reference:                                             VerticaAutoscaler/as
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (9m) / 50%
Min replicas:                                          3
Max replicas:                                          12
VerticaAutoscaler pods:                                3 current / 3 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range

When a scaling event occurs, you can view the admintools commands to scale the cluster. Use kubectl to view the StatefulSets:

$ kubectl get statefulsets
NAME                                                   READY   AGE
db-name-as-instance-name-0                             0/3     71s
db-name-primary1                                       3/3     39m

Use kubectl describe to view the executing commands:

$ kubectl describe vdb dbname | tail
  Upgrade Status:
Events:
  Type    Reason                   Age   From                Message
  ----    ------                   ----  ----                -------
  Normal  ReviveDBStart            41m   verticadb-operator  Calling 'admintools -t revive_db'
  Normal  ReviveDBSucceeded        40m   verticadb-operator  Successfully revived database. It took 25.255683916s
  Normal  ClusterRestartStarted    40m   verticadb-operator  Calling 'admintools -t start_db' to restart the cluster
  Normal  ClusterRestartSucceeded  39m   verticadb-operator  Successfully called 'admintools -t start_db' and it took 44.713787718s
  Normal  SubclusterAdded          10s   verticadb-operator  Added new subcluster 'as-0'
  Normal  AddNodeStart             9s    verticadb-operator  Calling 'admintools -t db_add_node' for pod(s) 'db-name-as-instance-name-0-0, db-name-as-instance-name-0-1, db-name-as-instance-name-0-2'

5.4 - VerticaRestorePointsQuery custom resource definition

The VerticaRestorePointsQuery custom resource (CR) retrieves details about saved restore points that you can use to roll back your database to a previous state or restore specific objects in a VerticaDB CR.

A VerticaRestorePointsQuery CR defines query parameters that the VerticaDB operator uses to retrieve restore points from an archive. A restore point is a snapshot of a database at a specific point in time that can consist of an entire database or a subset of database objects. Each restore point has a unique identifier and a timestamp. An archive is a collection of chronologically organized restore points.

You specify the archive and an optional period of time, and the operator queries the archive and retrieves details about restore points saved in the archive. You can use the query results to revive a VerticaDB CR with the data saved in the restore point.

Prerequisites

Save restore points

Before the VerticaDB operator can retrieve restore points, you must create an archive and save restore points to that archive. You can leverage stored procedures and scheduled execution to save restore points to an archive on a regular schedule. In the following sections, you schedule a stored procedure to save restore points to an archive every night at 9:00 PM.

Create the archive and schedule restore points

Create an archive and then create a stored procedure that saves a restore point to that archive:

  1. Create an archive with CREATE ARCHIVE. The following statement creates an archive named nightly because it will store restore points that are saved every night:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
    -w password \
    -c "CREATE ARCHIVE nightly;"
    CREATE ARCHIVE
    
  2. Create a stored procedure that saves a restore point. The SAVE RESTORE POINT TO ARCHIVE statement creates a restore point and saves it to the nightly archive:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
      -w password \
      -c "CREATE OR REPLACE PROCEDURE take_nightly()
      LANGUAGE PLvSQL AS \$\$
      BEGIN
        EXECUTE 'SAVE RESTORE POINT TO ARCHIVE nightly';
      END;
      \$\$;"
    CREATE PROCEDURE
    
  3. To test the stored procedure, execute it with the CALL statement:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
    -w password \
    -c "CALL take_nightly();"
     take_nightly
    --------------
                0
    (1 row)
    
  4. To verify that the stored procedure saved the restore point, query the ARCHIVE_RESTORE_POINTS system table to return the number of restore points in the specified archive:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
    -w password \
    -c "SELECT COUNT(*) FROM ARCHIVE_RESTORE_POINTS
    WHERE ARCHIVE = 'nightly';"
     COUNT
    -------
         1
    (1 row)
    

Schedule the stored procedure

Schedule the stored procedure so that it saves a restore point to the nightly archive each night:

  1. Schedule a time to execute the stored procedure with CREATE SCHEDULE. This function uses a cron expression to create a schedule at 9:00 PM each night:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
    -w password \
    -c "CREATE SCHEDULE nightly_sched USING CRON '0 21 * * *';"
    CREATE SCHEDULE
    
  2. Set CREATE TRIGGER to execute the take_nightly stored procedure with the nightly_sched schedule:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
    -w password \
    -c "CREATE TRIGGER trigger_nightly_sched ON SCHEDULE nightly_sched
    EXECUTE PROCEDURE take_nightly() AS DEFINER;"
    CREATE TRIGGER
    

Verify the archive automation

After you create the stored procedure and configure its schedule, test that it executes and saves a stored procedure at the scheduled time:

  1. Before the cron job is scheduled to run, verify the system time with the date shell built-in:
    $ date -u
    Thu Feb 29 20:59:15 UTC 2024
    
  2. Wait until the scheduled time elapses:
    $ date -u
    Thu Feb 29 21:00:07 UTC 2024
    
  3. To verify that the scheduled stored procedure executed on time, query ARCHIVE_RESTORE_POINTS system table for details about the nightly archive:
    $ kubectl exec -it restorepoints-primary1-0 -c server -- vsql \
    -w password \
    -c "SELECT COUNT(*) FROM ARCHIVE_RESTORE_POINTS WHERE ARCHIVE = 'nightly';"
     COUNT
    -------
         2
    (1 row)
    
    COUNT is incremented by one, so the stored procedure saved the restore point on schedule.

Create a VerticaRestorePointsQuery

A VerticaRestorePointsQuery manifest specifies an archive and an optional time duration. The VerticaDB operator uses this information to retrieve details about the restore points that were saved to the archive.

Create and apply the manifest

The following manifest defines a VerticaRestorePointsQuery CR named vrqp. The vrqp CR instructs the operator to retrieve from the nightly archive all restore points saved on Feburary 29, 2024:

  1. Create a file named vrpq.yaml that contains the following manifest. This CR retrieves restore points :

    apiVersion: vertica.com/v1beta1
    kind: VerticaRestorePointsQuery
    metadata:
      name: vrqp
    spec:
      verticaDBName: restorepoints
      filterOptions:
        archiveName: "nightly"
        startTimestamp: 2024-02-29
        endTimestamp: 2024-02-29
    

    The spec contains the following fields:

    • verticaDBName: Name of the VerticaDB CR that you want to retrieve restore points for.
    • filterOptions.archiveName: Archive that contains the restore points that you want to retrieve.
    • filterOptions.startTimestamp: Retrieve restore points that were saved on or after this date.
    • filterOptions.endTimestamp: Retrieve restore points that were saved on or before this date.

    For additional details about these parameters, see Custom resource definition parameters.

  2. Apply the manifest in the current namespace with kubectl:

    $ kubectl apply -f vrpq.yaml
    verticarestorepointsquery.vertica.com/vrpq created
    

    After you apply the manifest, the operator begins working to retrieve the restore points.

  3. Verify that the query succeeded with kubectl:

    $ kubectl get vrpq
    NAME   VERTICADB       STATE              AGE
    vrpq   restorepoints   Query successful   10s
    

View retrieved restore points

After you apply the VerticaRestorePointsQuery CR, you can view the retrieved restore points with kubectl describe. kubectl describe returns a Status section, which describes the query activity and properties for each retrieved restore point:

$ kubectl describe vrpq
Name:         vrpq
...
Status:
  Conditions:
    Last Transition Time:  2024-03-15T17:40:39Z
    Message:
    Reason:                Completed
    Status:                True
    Type:                  QueryReady
    Last Transition Time:  2024-03-15T17:40:41Z
    Message:
    Reason:                Completed
    Status:                False
    Type:                  Querying
    Last Transition Time:  2024-03-15T17:40:41Z
    Message:
    Reason:                Completed
    Status:                True
    Type:                  QueryComplete
  Restore Points:
    Archive:          nightly
    Id:               af8cd407-246a-4500-bc69-0b534e998cc6
    Index:            1
    Timestamp:        2024-02-29 21:00:00.728787
    vertica_version:  version
  State:              Query successful
...

The Status section contains relevant restore points details in the Conditions and Restore Points fields.

Conditions

The Conditions field summarizes each stage of the restore points query and contains the following fields:

  • Last Transition Time: Timestamp that indicates when the status condition last changed.
  • Message: This field is not in use, you can safely ignore it.
  • Reason: Indicates why the query stage is in its current Status.
  • Status: Boolean, indicates whether the query stage is currently in process.
  • Type: The query that the VerticaDB operator is executing in this stage.

The following table describes each Conditions.Type, and all possible value combinations for its Reason and Status field values:

Type Description Status Reason
QueryReady The operator verified that the query is executable in the environment. True Completed
False

IncompatibleDB: CR specified by verticaDBName is not version 24.2 or later.

AdmintoolsNotSupported: CR specified by verticaDBName does not use apiVersion v1. For details, see VerticaDB custom resource definition.

Querying The operator is running the query. True Started
False

Failed

Completed

QueryComplete The query is complete and the restore points are available in the Restore Points array. True Completed

Restore Points

The Restore Points field lists each restore point that was retrieved from the archive and contains the following fields:

  • Archive: The archive that contains this restore point.
  • Id: Unique identifier for the restore point.
  • Index: Restore point rank ordering in the archive, by descending timestamp. 1 is the most recent restore point.
  • Timestamp: Time that indicates when the restore point was created.
  • vertica_version: Database version when this restore point was saved to the archive.

Restore the database

After the operator retrieves the restore points, you can restore the database with the archive name and either the restore point Index or Id. In addition, you must set initPolicy to Revive:

  1. Delete the existing CR:
    $ kubectl delete -f restorepoints.yaml
    verticadb.vertica.com "restorepoints" deleted
    
  2. Update the CR. Change the initPolicy to Revive, and add the restore point information. You might have to set ignore-cluster-lease to true:
    apiVersion: vertica.com/v1
    kind: VerticaDB
    metadata:
      name: restorepoints
      annotations:
        vertica.com/ignore-cluster-lease: "true"
    spec:
      initPolicy: Revive
      restorePoint:
        archive: "nightly"
        index: 1
      ...
    
  3. Apply the updated manifest:
    $ kubectl apply -f restorepoints.yaml
    verticadb.vertica.com/restorepoints created
    

5.5 - VerticaScrutinize custom resource definition

The VerticaScrutinize custom resource (CR) runs scrutinize on a VerticaDB CR, which collects diagnostic information about the VerticaDB cluster and packages it in a tar file. This diagnostic information is commonly requested when resolving a case with Vertica Support.

When you create a VerticaScrutinize CR in your cluster, the VerticaDB operator creates a short-lived pod and runs scrutinize in two stages:

  1. An init container runs scrutinize on the VerticaDB CR. This produces a tar file named VerticaScrutinize.timestamp.tar that contains the diagnostic information. Optionally, you can define one or more init containers that perform additional processing after scrutinize completes.
  2. A main container persists the tar file in its file system in the /tmp/scrutinize/ directory. This main container lives for 30 minutes.

When resolving a support case, Vertica Support might request that you upload the tar file to a secure location, such as Vertica Advisor Report.

Prerequisites

Create a VerticaScrutinize CR

A VerticaScrutinize CR spec requires only the name of the VerticaDB CR for which you want to collect diagnostic information. The following example defines the CR as a YAML-formatted file named vscrutinize-example.yaml:

apiVersion: vertica.com/v1beta1
kind: VerticaScrutinize
metadata:
  name: vscrutinize-example
spec:
  verticaDBName: verticadb-name

For a complete list of parameters that you can set for a VerticaScrutinize CR, see Custom resource definition parameters.

Apply the manifest

After you create the VerticaScrutinize CR, apply the manifest in the same namespace as the CR specified by verticaDBName:

$ kubectl apply -f vscrutinize-example.yaml
verticascrutinize.vertica.com/vscrutinize-example created

The operator creates an init container that runs scrutinize:

$ kubectl get pods
NAME                                          READY   STATUS     RESTARTS   AGE
...
verticadb-operator-manager-68b7d45854-22c8p   1/1     Running    0          3d17h
vscrutinize-example                           0/1     Init:0/1   0          14s

After the init container completes, a new container is created, and the tar file is stored in its file system at /tmp/scrutinize. This container persists for 30 minutes:

$ kubectl get pods
NAME                                          READY   STATUS    RESTARTS   AGE
...
verticadb-operator-manager-68b7d45854-22c8p   1/1     Running   0          3d20h
vscrutinize-example                           1/1     Running   0          21s

Add init containers

When you apply a VerticaScrutinize CR, the VerticaDB operator creates an init container that prepares and runs the scrutinize command. You can add one or more init containers to perform additional steps after scrutinize creates a tar file and before the tar file is saved in the main container.

For example, you can define an init container that sends the tar file to another location, such as an S3 bucket. The following manifest defines an initContainer field that uploads the scrutinize tar file to an S3 bucket:

apiVersion: vertica.com/v1beta1
kind: VerticaScrutinize
metadata:
  name: vscrutinize-example-copy-to-s3
spec:
  verticaDBName: verticadb-name
  initContainers:
    - command:
        - bash
        - '-c'
        - 'aws s3 cp $(SCRUTINIZE_TARBALL) s3://k8test/scrutinize/'
      env:
        - name: AWS_REGION
          value: us-east-1
      image: 'amazon/aws-cli:2.2.24'
      name: copy-tarfile-to-s3
      securityContext:
        privileged: true

In the previous example, initContainers.command executes a command that accesses the SCRUTINIZE_TARBALL environment variable. The operator sets this environment variable in the scrutinize pod, and it defines the location of the tar file in the main container.

6 - Custom resource definition parameters

The following table describes the available settings for the Vertica Custom Resource Definition.

The following lists describes the available settings for Vertica custom resource definitions (CRDs).

VerticaDB

Parameters

annotations

Custom annotations added to all of the objects that the operator creates. Each annotation is encoded as an environment variable in the Vertica server container. Annotations accept the following characters:

  • Letters
  • Numbers
  • Underscores

Invalid character values are converted to underscore characters. For example:

vertica.com/git-ref: 1234abcd

Is converted to:

VERTICA_COM_GIT_REF=1234abcd

autoRestartVertica
Whether the operator restarts the Vertica process when the process is not running.

Set this parameter to false when performing manual maintenance that requires a DOWN database. This prevents the operator from interfering with the database state.

Default: true

certSecrets
A list of Secrets for custom TLS certificates.

Each certificate is mounted in the container at /certs/cert-name/key. For example, a PEM-encoded CA bundle named root_cert.pem and concealed in a Secret named aws-cert is mounted in /certs/aws-cert/root_cert.pem.

If you update the certificate after you add it to a custom resource, the operator updates the value automatically. If you add or delete a certificate, the operator reschedules the pod with the new configuration.

For implementation details, see VerticaDB custom resource definition.

communal.additionalConfig
Sets one or more configuration parameters in the CR:
      
spec:
  communal:
    additionalConfig:
      config-param: "value"
      ...
... 
  

Configuration parameters are set only when the database is initialized. After the database is initialized, changes to this parameter have no effect in the server.

communal.caFile
The mount path in the container filesystem to a CA certificate file that validates HTTPS connections to a communal storage endpoint.

Typically, the certificate is stored in a Secret and included in certSecrets. For details, see VerticaDB custom resource definition.

communal.credentialSecret
The name of the Secret that stores the credentials for the communal storage endpoint. This parameter is optional when you authenticate to an S3-compatible endpoint with an Identity and Access Management (IAM) profile.

You can store this value as a secret in AWS Secrets Manager or Google Secret Manager. For implementation details, see Secrets management.

For implementation details for each supported communal storage location, see Configuring communal storage.

communal.endpoint
A communal storage endpoint URL. The endpoint must begin with either the http:// or https:// protocol. For example:

https://path/to/endpoint

You cannot change this value after you create the custom resource instance.

If you omit this setting, Vertica selects one of the following endpoints based on your communal storage provider:

  • AWS: https://s3.amazonaws.com
  • GCS: https://storage.googleapis.com
communal.s3ServerSideEncryption
Server-side encryption type used when reading from or writing to S3. The value depends on which type of encryption at rest is configured for S3.

This parameter accepts the following values:

  • SSE-S3
  • SSE-KMS: Requires that you pass the key identifier with the communal.additionalConfig parameter.
  • SSE-C: Requires that you pass the client key with the communal.s3SSECustomerKeySecret parameter.

You cannot change this value after you create the custom resource instance.

For implementation examples of all encryption types, see Configuring communal storage.

For details about each encryption type, see S3 object store.

Default: Empty string (""), no encryption

communal.s3SSECustomerKeySecret
If s3ServerSideEncryption is set to SSE-C, a Secret containing the client key for S3 access with the following requirements:
  • The Secret must be in the same namespace as the CR.
  • You must set the client key contents with the clientKey field.

The client key must use one of the following formats:

  • 32-character plaintext
  • 44-character base64-encoded

For additional implementation details, see Configuring communal storage.

communal.path
The path to the communal storage bucket. For example:

s3://bucket-name/key-name

You must create this bucket before you create the Vertica database.

The following initPolicy values determine how to set this value:

  • Create: The path must be empty.

  • Revive: The path cannot be empty.

You cannot change this value after you create the custom resource.

communal.region
The geographic location where the communal storage resources are located.

If you do not set the correct region, the configuration fails. You might experience a delay because Vertica retries several times before failing.

This setting is valid for Amazon Web Services (AWS) and Google Cloud Platform (GCP) only. Vertica ignores this setting for other communal storage providers.

Default:

  • AWS: us-east-1

  • GCP: US-EAST1

dbName
The database name. When initPolicy is set to Revive or ScheduleOnly, this must match the name of the source database.

Default: vertdb

encryptSpreadComm
Sets the EncryptSpreadComm security parameter to configure Spread encryption for a new Vertica database. The VerticaDB operator ignores this parameter unless you set initPolicy to Create.

Spread encryption is enabled by default. This parameter accepts the following values:

  • vertica or Empty string (""): Enables Spread encryption. Vertica generates the Spread encryption key for the database cluster.

  • disabled: Clears encryption.

Default: Empty string ("")

hadoopConfig
A ConfigMap that contains the contents of the /etc/hadoop directory.

This is mounted in the container to configure connections to a Hadoop Distributed File System (HDFS) communal path.

image
The image that defines the Vertica server container's runtime environment. If the container is hosted in a private container repository, this name must include the path to the repository.

When you update the image, the operator stops and restarts the cluster.

imagePullPolicy
How often Kubernetes pulls the image for an object. For details, see Updating Images in the Kubernetes documentation.

Default: If the image tag is latest, the default is Always. Otherwise, the default is IfNotPresent.

imagePullSecrets
List of Secrets that store credentials for authentication to a private container repository. For details, see Specifying imagePullSecrets in the Kubernetes documentation.
initPolicy
How to initialize the Vertica database in Kubernetes. This parameter accepts the following values:
kerberosSecret
The Secret that stores the following values for Kerberos authentication to Hadoop Distributed File System (HDFS):
  • krb5.conf: Contains Kerberos configuration information.

  • krb5.keytab: Contains credentials for the Vertica Kerberos principal. This file must be readable by the file owner that is running the process.

The default location for each of these files is the /etc directory.

labels
Custom labels added to all of the objects that the operator creates.
licenseSecret
The Secret that contains the contents of license files. The Secret must share a namespace with the custom resource (CR). Each of the keys in the Secret is mounted as a file in /home/dbadmin/licensing/mnt.

If this value is set when the CR is created, the operator installs one of the licenses automatically, choosing the first one alphabetically.

If you update this value after you create the custom resource, you must manually install the Secret in each Vertica pod.

livenessProbeOverride
Overrides default livenessProbe settings that indicate whether the container is running. The VerticaDB operator sets or updates the liveness probe in the StatefulSet.

For example, the following object overrides the default initialDelaySeconds, periodSeconds, and failureThreshold settings:

      
spec:
...
  livenessProbeOverride:
    initialDelaySeconds: 120
    periodSeconds: 15
    failureThreshold: 8
  

For a detailed list of the available probe settings, see the Kubernetes documentation.

local.catalogPath
Optional parameter that sets a custom path in the container filesystem for the catalog, if your environment requires that the catalog is stored in a location separate from the local data.

If initPolicy is set to Revive or ScheduleOnly, local.catalogPath for the new database must match local.catalogPath for the source database.

local.dataPath
The path in the container filesystem for the local data. If local.catalogPath is not set, the catalog is stored in this location.

If initPolicy is set to Revive or ScheduleOnly, the dataPath for the new database must match the dataPath for the source database.

Default: /data

local.depotPath
The path in the container filesystem that stores the depot.

If initPolicy is set to Revive or ScheduleOnly, the depotPath for the new database must match the depotPath for the source database.

Default: /depot

local.depotVolume
The type of volume to use for the depot. This parameter accepts the following values:
  • PersistentVolume: A PersistentVolume is used to store the depot data. This volume type persists depot data between pod lifecycles.
  • EmptyDir: A volume of type emptyDir is used to store the depot data. When the pod is removed from a node, the contents of the volume are deleted. If a container crashes, the depot data is unaffected.

For details about each volume type, see the Kubernetes documentation.

Default: PersistentVolume

local.requestSize
The minimum size of the local data volume when selecting a PersistentVolume (PV).

If local.storageClass allows volume expansion, the operator automatically increases the size of the PV when you change this setting. It expands the size of the depot if the following conditions are met:

  • local.storageClass is set to PersistentVolume.
  • Depot storage is allocated using a percentage of the total disk space rather than a unit, such as a gigabyte.

If you decrease this value, the operator does not decrease the size of the PV or the depot.

Default: 500 Gi

local.storageClass
The StorageClass for the PersistentVolumes that persist local data between pod lifecycles. Select this value when defining the persistent volume claim (PVC).

By default, this parameter is not set. The PVC in the default configuration uses the default storage class set by Kubernetes.

nmaTLSSecret
Adds custom Node Management Agent (NMA) certificates to the CR. The value must include the tls.key, tls.crt, and ca.crt encoded in base64 format.

You can store this value as a secret in AWS Secrets Manager or Google Secret Manager. For implementation details, see Secrets management.

If you omit this setting, the operator generates self-signed certificates for the NMA.

passwordSecret
The Secret that contains the database superuser password. Create this Secret before deployment.

If you do not create this Secret before deployment, there is no password authentication for the database.

The Secret must use a key named password:

$ kubectl create secret generic su-passwd --from-literal=password=secret-password

Add this Secret to the custom resource:

      
spec:
  passwordSecret: su-passwd
  

You can store this value as a secret in AWS Secrets Manager or Google Secret Manager. For implementation details, see Secrets management.

podSecurityContext
Overrides any pod-level security context. This setting is merged with the default context for the pods in the cluster.

vclusterops deployments can use this parameter to set a custom UID or GID:

...
spec:
  ...
  podSecurityContext:
    runAsUser: 3500
    runAsGroup: 3500
  ...

For details about the available settings for this parameter, see the Kubernetes documentation.

readinessProbeOverride
Overrides default readinessProbe settings that indicate whether the Vertica pod is ready to accept traffic. The VerticaDB operator sets or updates the readiness probe in the StatefulSet.

For example, the following object overrides the default timeoutSeconds and periodSeconds settings:

      
spec:
...
  readinessProbeOverride:
    initialDelaySeconds: 0
    periodSeconds: 10
    failureThreshold: 3
  

For a detailed list of the available probe settings, see the Kubernetes documentation.

restorePoint.archive
Archive that contains the restore points that you want to use in a restore operation. When you revive a database with a restore point, this parameter is required.
restorePoint.id
Unique identifier for the restore point. When you revive a database with a restore point, you must provide either restorePoint.id or restorePoint.index.
restorePoint.index
Identifier that describes the restore point's chronological position in the archive. Restore points are ordered by descending timestamp, where the most recent index is 1.
reviveOrder
The order of nodes during a revive operation. Each entry contains the subcluster index, and the number of pods to include from the subcluster.

For example, consider a database with the following setup:

      
- v_db_node0001: subcluster A
- v_db_node0002: subcluster A
- v_db_node0003: subcluster B
- v_db_node0004: subcluster A
- v_db_node0005: subcluster B
- v_db_node0006: subcluster B
  

If the subclusters[] list is defined as {'A', 'B'}, the revive order is as follows:

      
- {subclusterIndex:0, podCount:2} # 2 pods from subcluster A
- {subclusterIndex:1, podCount:1} # 1 pod from subcluster B
- {subclusterIndex:0, podCount:1} # 1 pod from subcluster A
- {subclusterIndex:1, podCount:2} # 2 pods from subcluster B
  

This parameter is used only when initPolicy is set to Revive.

sandboxes[i].image
Name of the image to use for the sandbox. If omitted, image from the main cluster will be used. Changing this will force an upgrade for the sandbox where it is defined.
sandboxes[i].name
Name of the sandbox.
sandboxes[i].subclusters[i].name
Name of the secondary subcluster to be added to the sandbox. The sandbox must include at least one secondary subcluster.

The following example adds a sandbox named sandbox1 with subclusters sc2 and sc3 to the custom resource:

      
spec:
...
  sandboxes:
  - name: sandbox1
    subclusters:
      - name: sc2
	  - name: sc3
  
securityContext
Sets any additional security context for the Vertica server container. This setting is merged with the security context value set for the VerticaDB Operator.

For example, if you need a core file for the Vertica server process, you can set the privileged property to true to elevate the server privileges on the host node:

      
spec:
  ...
  securityContext:
    privileged: true
  

For additional information about generating a core file, see Metrics gathering. For details about this parameter, see the Kubernetes documentation.

serviceAccountName
Sets the name of the ServiceAccount. This lets you create a service account independently of an operator or VerticaDB instance so that you can add it to the CR as needed.

If you omit this setting, the operator uses the default service account. If you specify a service account that does not exist, the operator creates that service account and then uses it.

shardCount
The number of shards in the database. You cannot update this value after you create the custom resource.

For more information about database shards and Eon Mode, see Configuring your Vertica cluster for Eon Mode.

sidecars[]
One or more optional utility containers that complete tasks for the Vertica server container. Each sidecar entry is a fully-formed container spec, similar to the container that you add to a Pod spec.

The following example adds a sidecar named vlogger to the custom resource:

      
  spec:
  ...
  sidecars:
    - name: vlogger
      image: vertica/vertica-logger:1.0.0
      volumeMounts:
        - name: my-custom-vol
          mountPath: /path/to/custom-volume
  

volumeMounts.name is the name of a custom volume. This value must match volumes.name to mount the custom volume in the sidecar container filesystem. See volumes for additional details.

For implementation details, see VerticaDB custom resource definition.

sidecars[i].volumeMounts
List of custom volumes and mount paths that persist sidecar container data. Each volume element requires a name value and a mountPath.

To mount a volume in the Vertica sidecar container filesystem, volumeMounts.name must match the volumes.name value for the corresponding sidecar definition, or the webhook returns an error.

For implementation details, see VerticaDB custom resource definition.

startupProbeOverride
Overrides the default startupProbe settings that indicate whether the Vertica process is started in the container. The VerticaDB operator sets or updates the startup probe in the StatefulSet.

For example, the following object overrides the default initialDelaySeconds, periodSeconds, and failureThreshold settings:

      
spec:
...
  startupProbeOverride:
    initialDelaySeconds: 30
    periodSeconds: 10
    failureThreshold: 117
    timeoutSeconds: 5
  

For a detailed list of the available probe settings, see the Kubernetes documentation.

subclusters[i].affinity
Applies rules that constrain the Vertica server pod to specific nodes. It is more expressive than nodeSelector. If this parameter is not set, then the pods use no affinity setting.

In production settings, it is a best practice to configure affinity to run one server pod per host node. For configuration details, see VerticaDB custom resource definition.

subclusters[i].externalIPs
Enables the service object to attach to a specified external IP.

If not set, the external IP is empty in the service object.

subclusters[i].verticaHTTPNodePort
When subclusters[i].serviceType is set to NodePort, sets the port on each node that listens for external connections to the HTTPS service. The port must be within the defined range allocated by the control plane (ports 30000-32767).

If you do not manually define a port number, Kubernetes chooses the port automatically.

subclusters[i].type
Indicates the subcluster type. Valid values include the following:
  • primary
  • secondary
  • sandboxprimary: Subcluster type automatically assigned by the VerticaDB operator when a subcluster is sandboxed and cannot be manually selected in the VerticaDB CRD.

The admission controller's webhook verifies that each database has at least one primary subcluster.

Default: primary

subclusters[i].loadBalancerIP
When subcluster[i].serviceType is set to LoadBalancer, assigns a static IP to the load balancing service.

Default: Empty string ("")

subclusters[i].name
The subcluster name. This is a required setting. If you change the name of an existing subcluster, the operator deletes the old subcluster and creates a new one with the new name.

Kubernetes derives names for the subcluster Statefulset, service object, and pod from the subcluster name. For additional details about Kubernetes and subcluster naming conventions, see Subclusters on Kubernetes.

subclusters[i].clientNodePort
When subclusters[i].serviceType is set to NodePort, sets the port on each node that listens for external client connections. The port must be within the defined range allocated by the control plane (ports 30000-32767).

If you do not manually define a port number, Kubernetes chooses the port automatically.

subclusters[i].nodeSelector

List of label key/value pairs that restrict Vertica pod scheduling to nodes with matching labels. For details, see the Kubernetes documentation.

The following example schedules server pods only at nodes that have the disktype=ssd and region=us-east labels:

      
subclusters:
  - name: defaultsubcluster
    nodeSelector:
      disktype: ssd
      region: us-east
  
subclusters[i].priorityClassName

The PriorityClass name assigned to pods in the StatefulSet. This affects where the pod gets scheduled.

For details, see the Kubernetes documentation.

subclusters[i].resources.limits
The resource limits for pods in the StatefulSet, which sets the maximum amount of CPU and memory that the server pod can consume from its host.

Vertica recommends that you set these values equal to subclusters[i].resources.requests to ensure that the pods are assigned to the guaranteed QoS class. This reduces the possibility that pods are chosen by the out of memory (OOM) Killer.

For more information, see Recommendations for Sizing Vertica Nodes and Clusters in the Vertica Knowledge Base.

subclusters[i].resources.requests
The resource requests for pods in the StatefulSet, which sets the amount of CPU and memory that the server pod requests during pod scheduling.

Vertica recommends that you set these values equal to subclusters[i].resources.limits to ensure that the pods are assigned to the guaranteed QoS class. This reduces the possibility that pods are chosen by the out of memory (OOM) Killer.

For more information, see Recommendations for Sizing Vertica Nodes and Clusters in the Vertica Knowledge Base.

subclusters[i].serviceAnnotations

Custom annotations added to implementation-specific services. Managed Kubernetes use service annotations to configure services such as network load balancers, virtual private cloud (VPC) subnets, and loggers.

subclusters[i].serviceName
Identifies the service object that directs client traffic to the subcluster. Assign a single service object to multiple subclusters to process client data with one or more subclusters. For example:
      
spec:
  ...
  subclusters:
    - name: subcluster-1
      size: 3
      serviceName: connections
    - name: subcluster-2
      size: 3
      serviceName: connections
  

The previous example creates a service object named metadata.name-connections that load balances client traffic among its assigned subclusters.

For implementation details, see VerticaDB custom resource definition.

subclusters[i].serviceType
Identifies the type of Kubernetes service to use for external client connectivity. The default is type is ClusterIP, which sets a stable IP and port that is accessible only from within Kubernetes itself.

Depending on the service type, you might need to set nodePort or externalIPs in addition to this configuration parameter.

Default: ClusterIP

subclusters[i].size
The number of pods in the subcluster. This determines the number of Vertica nodes in the subcluster. Changing this number deletes or schedules new pods.

The minimum size of a subcluster is 1. The subclusters kSafety setting determines the minimum and maximum size of the cluster.

subclusters[i].tolerations

Any tolerations and taints that aid in determining where to schedule a pod.

temporarySubclusterRouting.names
The existing subcluster that accepts traffic during a read-only online upgrade. The operator routes traffic to the first subcluster that is online. For example:
      
spec:
  ...
  temporarySubclusterRouting:
    names:
      - subcluster-2
      - subcluster-1
  

In the previous example, the operator selects subcluster-2 during the upgrade, and then routes traffic to subcluster-1 when subcluster-2 is down. As a best practice, use secondary subclusters when rerouting traffic.

temporarySubclusterRouting.template
Instructs the operator to create a new secondary subcluster during a read-only online upgrade. The operator creates the subcluster when the upgrade begins and deletes it when the upgrade completes.

To define a temporary subcluster, provide a name and size value. For example:

      
spec:
  ...
  temporarySubclusterRouting:
    template:
      name: transient
      size: 1
  
upgradePolicy
Determines how the operator upgrades Vertica server versions. Accepts the following values:
  • Offline: The operator stops the cluster to prevent multiple versions from running simultaneously.
  • ReadOnlyOnline: The cluster continues to operator during a rolling update. The data is in read-only mode while the operator upgrades the image for the primary subcluster.
  • Online: The cluster continues to operate during an online upgrade. You can modify the data while the operator upgrades the image for the primary subcluster.

The ReadOnlyOnline setting has the following restrictions:

  • The cluster must currently run Vertica server version 11.1.0 or higher.

  • If you have only one subcluster, you must configure temporarySubclusterRouting.template to create a new secondary subcluster during the read-only online upgrade. Otherwise, the operator performs an Offline upgrade, regardless of the setting.

  • Auto: The operator selects either Offline or ReadOnlyOnline depending on the configuration. The operator selects ReadOnlyOnline if all of the following are true:

    • A license Secret exists.

    • K-Safety is 1.

    • The cluster is currently running Vertica version 11.1.0 or higher.

The Online setting has the following restrictions:

  • The cluster must currently run Vertica server version 24.3.0-2 or higher. If not, the operator will fallback to ReadOnlyOnline.

  • The cluster must be deployed using vclusterops.

  • The cluster must have sufficient resources. During the upgrade, the operator creates a sandbox that replicates the cluster, doubling the number of pods temporarily.

Default: Auto

volumeMounts
List of custom volumes and mount paths that persist Vertica server container data. Each volume element requires a name value and a mountPath.

To mount a volume in the Vertica server container filesystem, volumeMounts.name must match the volumes.name value defined in the spec definition, or the webhook returns an error.

For implementation details, see VerticaDB custom resource definition.

volumes
List of custom volumes that persist Vertica server container data. Each volume element requires a name value and a volume type. volumes accepts any Kubernetes volume type.

To mount a volume in a filesystem, volumes.name must match the volumeMounts.name value for the corresponding volume mount, or the webhook returns an error.

For implementation details, see VerticaDB custom resource definition.

Annotations

Apply each of the following annotations to the metadata.annotations section in the CR:

vertica.com/https-tls-conf-generation
Determines whether the Vertica pod stores a plain text configuration file used to generate default certificates for the HTTPS service.

Set this to false to hide the configuration file when you are certain that the HTTPS service can start in its current configuration. For example:

The presence of this configuration file does not interfere with either of these certificate configurations.

Default: true

vertica.com/ignore-cluster-lease
Ignore the cluster lease when starting or reviving the database.

Default: false

vertica.com/ignore-upgrade-path
When set to false, the operator ensures that you do not downgrade to an earlier release.

Default: true

vertica.com/include-uid-in-path
When set to true, the operator includes in the path the unique identifier (UID) that Kubernetes assigns to the VerticaDB object. Including the UID creates a unique database path so that you can reuse the communal path in the same endpoint.

Default: false

vertica.com/restart-timeout
When restarting pods, the number of seconds before the operation times out.

Default: 0. The operator uses a 20 minute default.

vertica.com/superuser-name
For vclusterops deployments, sets a custom superuser name. All admintools deployments use the default superuser name, dbadmin.
vertica.com/vcluster-ops
Determines whether the VerticaDB CR installs the vclusterops library to manage the cluster. When omitted, API version v1 assumes this annotation is set to true, and the v1beta1 annotation is set to false.

API version v1 must install vclusterops. You can omit this setting to use the default empty string, or explicitly set this to true.

For deprecated API version v1beta1, you must set this to true for vclusterops deployments. For admintools deployments, you can omit this setting or set it to false.

Default: Empty string ("")

EventTrigger

For implementation details, see EventTrigger custom resource definition.

matches[].condition.status
The status portion of the status condition match. The operator watches the condition specified by matches[].condition.type on the EventTrigger reference object. When that condition changes to the status specified in this parameter, the operator runs the task defined in the EventTrigger.
matches[].condition.type
The condition portion of the status condition match. The operator watches this condition on the EventTrigger reference object. When this condition changes to the status specified with matches[].condition.status, the operator runs the task defined in the EventTrigger.
references[].object.apiVersion
Kubernetes API version of the object that the EventTrigger watches.
references[].object.kind
The type of object that the EventTrigger watches.
references[].object.name
The name of the object that the EventTrigger watches.
references[].object.namespace
Optional. The namespace of the object that the EventTrigger watches. The object and the EventTrigger CR must exist within the same namespace.

If omitted, the operator uses the same namespace as the EventTrigger.

template
Full spec for the Job that EventTrigger runs when references[].condition.type and references[].condition.status are found for a reference object.

For implementation details, see EventTrigger custom resource definition.

VerticaAutoScaler

verticaDBName
Required. Name of the VerticaDB CR that the VerticaAutoscaler CR scales resources for.
scalingGranularity
Required. The scaling strategy. This parameter accepts one of the following values:
  • Subcluster: Create or delete entire subclusters. To create a new subcluster, the operator uses a template or an existing subcluster with the same serviceName.
  • Pod: Increase or decrease the size of an existing subcluster.

Default: Subcluster

serviceName
Required. Refers to the subclusters[i].serviceName for the VerticaDB CR.

VerticaAutoscaler uses this value as a selector when scaling subclusters together.

template
When scalingGranularity is set to Subcluster, you can use this parameter to define how VerticaAutoscaler scales the new subcluster. The following is an example:
      
spec:
    verticaDBName: dbname
    scalingGranularity: Subcluster
    serviceName: service-name
    template:
        name: autoscaler-name
        size: 2
        serviceName: service-name
        isPrimary: false
  

If you set template.size to 0, VerticaAutoscaler selects as a template an existing subcluster that uses service-name.

This setting is ignored when scalingGranularity is set to Pod.

VerticaReplicator

source.passwordSecret
Stores the password secret for the specified username. If this field and username are omitted, the default is set to the superuser password secret found in the VerticaDB. An empty value indicates no password. By default, the secret is assumed to be a Kubernetes (k8s) secret unless a secret path reference is specified, in which case it is retrieved from an external secret storage manager.
source.sandboxName
Specify the sandbox name to establish a connection. If no sandbox name is provided, the system defaults to the main database cluster.
source.userName
The username to connect to Vertica with. If no username is specified, the database will default to the superuser. Custom username for source database is not supported yet.
source.verticaDB
Required. Name of an existing VerticaDB.
target.passwordSecret
Stores the password secret for the specified username. If this field and username are omitted, the default is set to the superuser password secret found in the VerticaDB. An empty value indicates no password. By default, the secret is assumed to be a Kubernetes (k8s) secret unless a secret path reference is specified, in which case it is retrieved from an external secret storage manager.
target.sandboxName
Specify the sandbox name to establish a connection. If no sandbox name is provided, the system defaults to the main database cluster.
target.userName
The username to connect to Vertica with. If no username is specified, the database will default to the superuser. Custom username for source database is not supported yet.
target.verticaDB
Required. Name of an existing VerticaDB.
tlsConfig
Optional. TLS configurations to use when connecting from the source database to the target database. It refers to an existing TLS configuration in the source. Using TLS configuration for target database authentication requires the same username for both source and target. Additionally, the security config parameter EnableConnectCredentialForwarding must be enabled on the source database. Custom username for source and target databases is not yet supported when using TLS configuration.

VerticaRestorePointsQuery

verticaDBName
The VerticaDB CR instance that you want to retrieve restore points for.
filterOptions.archive
The archive that contains the restore points that you want to retrieve. If omitted, the query returns all restore points from all archives.
filterOptions.startTimestamp
Limits the query results to restore points with a UTC timestamp that is equal to or later than this value. This parameter accepts the following UTC formats:
  • YYYY-MM-DD
  • YYYY-MM-DD HH:MM:ss
  • YYYY-MM-DD HH:MM:ss.SSSSSSSSS
filterOptions.endTimestamp
Limits the query results to restore points with a UTC timestamp that is equal to or earlier than this value. This parameter accepts the following UTC date and time formats:
  • YYYY-MM-DD. When you use this format, the VerticaDB operator populates the time portion with 23:59:59.999999999.
  • YYYY-MM-DD HH:MM:ss.
  • YYYY-MM-DD HH:MM:ss.SSSSSSSSS.

VerticaScrutinize

affinity
Applies rules that constrain the VerticaScrutinize pod to specific nodes. It is more expressive than nodeSelector. If this parameter is not set, then the scrutinize pod uses no affinity setting.

For details, see the Kubernetes documentation.

annotations
Custom annotations added to all objects created to run scrutinize.
initContainers
A list of custom init containers to run after the init container collects diagnotic information with scrutinize. You can use an init container to perform additional processing on the scrutinize tar file, such as uploading it to an external storage location.
labels
Custom labels added to the scrutinize pod.
nodeSelector

List of label key/value pairs that restrict Vertica pod scheduling to nodes with matching labels. For details, see the Kubernetes documentation.

priorityClassName
The PriorityClass name assigned to the scrutinize pod. This affects where the pod gets scheduled.

For details, see the Kubernetes documentation.

resources.limits
The resource limits for the scrutinize pod, which sets the maximum amount of CPU and memory that the pod can consume from its host.
resources.requests
The resource requests for the scrutinize pod, which sets the amount of CPU and memory that the pod requests during pod scheduling.
tolerations

Any tolerations and taints that aid in determining where to schedule a pod.

verticaDBName
Required. Name of the VerticaDB CR that the VerticaScrutinize CR collects diagnostic information for. The VerticaDB CR must exist in the same namespace as the VerticaScrutinize CR.
volume
Custom volume that stores the finalized scrutinize tar file and any intermediate files. The volume must have enough space to store the scrutinize data. The volume is mounted in /tmp/scrutinize.

If this setting is omitted, an emptyDir volume is created to store the scrutinize data.

Default: emptyDir

7 - Subclusters on Kubernetes

Eon Mode uses subclusters for workload isolation and scaling.

Eon Mode uses subclusters for workload isolation and scaling. The VerticaDB operator provides tools to direct external client communications to specific subclusters, and automate scaling without stopping your database.

The custom resource definition (CRD) provides parameters that allow you to fine-tune each subcluster for specific workloads. For example, you can increase the subcluster size setting for increased throughput, or adjust the resource requests and limits to manage compute power. When you create a custom resource instance, the operator deploys each subcluster as a StatefulSet. Each StatefulSet has a service object, which allows an external client to connect to a specific subcluster.

Naming conventions

Kubernetes derives names for the subcluster Statefulset, service object, and pod from the subcluster name. This naming convention tightly couples the subcluster objects to help Kubernetes manage the cluster effectively. If you want to rename a subcluster, you must delete it from the CRD and redefine it so that the operator can create new objects with a derived name.

Kubernetes forms an object's fully qualified domain name (FQDN) with its resource type name, so resource type names must follow FQDN naming conventions. The underscore character ( "_" ) does not follow FQDN rules, but you can use it in the subcluster name. Vertica converts each underscore to a hyphen ( "-" ) in the FQDN for any object name derived from the subcluster name. For example, Vertica generates a default subcluster and names it default_subcluster, and then converts the corresponding portion of the derived object's FQDN to default-subcluster.

For additional naming guidelines, see the Kubernetes documentation.

External client connections

External clients can target specific subclusters that are fine-tuned to handle their workload. Each subcluster has a service object that handles external connections. To target multiple subclusters with a single service object, assign each subcluster the same spec.subclusters.serviceName value in the custom resource (CR). For implementation details, see VerticaDB custom resource definition.

The operator performs health monitoring that checks whether the Vertica daemon is running on each pod. If the daemon is running, then the operator allows the service object to route traffic to the pod.

By default, the service object derives its name from the custom resource name and the associated subcluster and uses the following format:

customResourceName-subclusterName

To override this default format, set the subclusters[i].serviceName CR parameter, which changes the format to the following:

metadata.name-serviceName

Vertica supports the following service object types:

  • ClusterIP: The default service type. This service provides internal load balancing, and sets a stable IP and port that is accessible from within the subcluster only.

  • NodePort: Provides external client access. You can specify a port number for each host node in the subcluster to open for client connections.

  • LoadBalancer: Uses a cloud provider load balancer to create NodePort and ClusterIP services as needed. For details about implementation, see the Kubernetes documentation and your cloud provider documentation.

For configuration details, see VerticaDB custom resource definition.

Managing internal and external workloads

The Vertica StatefulSet is associated with an external service object. All external client requests are sent through this service object and load balanced among the pods in the cluster.

Import and export

Importing and exporting data between a cluster outside of Kubernetes requires that you expose the service with the NodePort or LoadBalancer service type and properly configure the network.

7.1 - Scaling subclusters

The operator enables you to scale the number of subclusters and the number of pods per subcluster automatically.

The operator enables you to scale the number of subclusters and the number of pods per subcluster automatically. This utilizes or conserves resources depending on the immediate needs of your workload.

The following sections explain how to scale resources for new workloads. For details about scaling resources for existing workloads, see VerticaAutoscaler custom resource definition.

Prerequisites

Scaling the number of subclusters

Adjust the number of subclusters in your custom resource to fine-tune resources for short-running dashboard queries. For example, increase the number of subclusters to increase throughput. For more information, see Improving query throughput using subclusters.

  1. Use kubectl edit to open your default text editor and update the YAML file for the specified custom resource. The following command opens a custom resource named vdb for editing:

    $ kubectl edit vdb
    
  2. In the spec section of the custom resource, locate the subclusters subsection. Begin with the type field to define a new subcluster.

    The type field indicates the subcluster type. Because there is already a primary subcluster, enter Secondary:

    spec:
    ...
      subclusters:
      ...
      - type: secondary
    
  3. Follow the steps in VerticaDB custom resource definition to complete the subcluster definition. The following completed example adds a secondary subcluster for dashboard queries:

    spec:
    ...
      subclusters:
      - type: primary
        name: primary-subcluster
      ...
      - type: secondary
        name: dashboard
        clientNodePort: 32001
        resources:
          limits:
            cpu: 32
            memory: 96Gi
          requests:
            cpu: 32
            memory: 96Gi
        serviceType: NodePort
        size: 3
    
  4. Save and close the custom resource file. When the update completes, you receive a message similar to the following:

    verticadb.vertica.com/vertica-db edited
    
  5. Use the kubectl wait command to monitor when the new pods are ready:

    $ kubectl wait --for=condition=Ready pod --selector app.kubernetes.io/name=verticadb --timeout 180s
    pod/vdb-dashboard-0 condition met
    pod/vdb-dashboard-1 condition met
    pod/vdb-dashboard-2 condition met
    

Scaling the pods in a subcluster

For long-running, analytic queries, increase the pod count for a subcluster. See Using elastic crunch scaling to improve query performance.

  1. Use kubectl edit to open your default text editor and update the YAML file for the specified custom resource. The following command opens a custom resource named verticadb for editing:

    $ kubectl edit verticadb
    
  2. Update the subclusters.size value to 6:

    spec:
    ...
      subclusters:
      ...
      - type: secondary
        ...
        size: 6
    

    Shards are rebalanced automatically.

  3. Save and close the custom resource file. You receive a message similar to the following when you successfully update the file:

    verticadb.vertica.com/verticadb edited

  4. Use the kubectl wait command to monitor when the new pods are ready:

    $ kubectl wait --for=condition=Ready pod --selector app.kubernetes.io/name=verticadb --timeout 180s
    pod/vdb-subcluster1-3 condition met
    pod/vdb-subcluster1-4 condition met
    pod/vdb-subcluster1-5 condition met
    

Removing a subcluster

Remove a subcluster when it is no longer needed, or to preserve resources.

  1. Use kubectl edit to open your default text editor and update the YAML file for the specified custom resource. The following command opens a custom resource named verticadb for editing:

    $ kubectl edit verticadb
    
  2. In the subclusters subsection nested under spec, locate the subcluster that you want to delete. Delete the element in the subcluster array represents the subcluster that you want to delete. Each element is identified by a hyphen (-).

  3. After you delete the subcluster and save, you receive a message similar to the following:

    verticadb.vertica.com/verticadb edited
    

8 - Sandboxing on K8s

Sandboxing on Kubernetes allows you to create isolated testing environments without the need to set up a new database or reload data, making it easier to test Vertica features in new versions. Sandboxing enables seamless online upgrades in Kubernetes. While users stay connected to the main cluster, the upgrade is performed on the sandbox. Once the upgrade is complete, the sandbox is promoted to the main cluster. The operator automates the sandboxing process for Vertica subclusters within a custom resource (CR). For more information, see Subcluster sandboxing.

Prerequisites

Sandboxing a Subcluster

The following specification in VerticaDB CR (VerticaDB custom resource definition) has the sandbox information:

Parameter Description
spec.sandboxes[i].name Name of the sandbox.
spec.sandboxes[i].subclusters[i].name Name of the secondary subcluster to be added to the sandbox. The sandbox must include at least one secondary subcluster.
spec.sandboxes[i].image Name of the image to use for the sandbox. If omitted, image from the main cluster will be used. Changing this will force an upgrade for the sandbox where it is defined.

To sandbox a subcluster

  1. Use kubectl edit to open your default text editor and update the YAML file for the specified custom resource. The following command opens a custom resource named vdb for editing:

    $ kubectl edit vdb
    
  2. In the spec section of the custom resource, locate the subclusters subsection and identify the secondary subcluster that you want to sandbox. In the following example, we will sandbox the secondary subcluster, sc2:

    apiVersion: vertica.com/v1
    kind: VerticaDB
    metadata:
     name: vertica-db
    spec:
    ...
     subclusters:
     - affinity: {}
       name: sc1
       resources: {}
       serviceName: sc1
       serviceType: ClusterIP
       size: 3
       type: primary
     - affinity: {}
       name: sc2
       resources: {}
       serviceName: sc2
       serviceType: ClusterIP
       size: 1
       type: secondary
     - affinity: {}
       name: sc3
       resources: {}
       serviceName: sc3
       serviceType: ClusterIP
       size: 1
       type: secondary
    
  3. Now, add an entry for the sandbox. Provide a sandbox name and name of the subcluster(s) that you want to sandbox. For example, we will sandbox subcluster sc2 and name it sandbox1:

    spec:
    ...
     subclusters:
     - affinity: {}
       name: sc1
       resources: {}
       serviceName: sc1
       serviceType: ClusterIP
       size: 3
       type: primary
     - affinity: {}
       name: sc2
       resources: {}
       serviceName: sc2
       serviceType: ClusterIP
       size: 3
       type: secondary
     - affinity: {}
       name: sc3
       resources: {}
       serviceName: sc3
       serviceType: ClusterIP
       size: 3
       type: secondary
     sandboxes:
     - name: sandbox1
       subclusters:
        - name: sc2
    
  4. Save and close the custom resource file. When the update completes, you will receive the following message:

    verticadb.vertica.com/vertica-db edited
    

If you want to include another subcluster in the sandbox, go back to VerticaDB and modify the sandbox information. Following are the contents of the VerticaDB after adding sc3:

spec:
...
  subclusters:
  - affinity: {}
    name: sc1
    resources: {}
    serviceName: sc1
    serviceType: ClusterIP
    size: 3
    type: primary
  - affinity: {}
    name: sc2
    resources: {}
    serviceName: sc2
    serviceType: ClusterIP
    size: 3
    type: secondary
  - affinity: {}
    name: sc3
    resources: {}
    serviceName: sc3
    serviceType: ClusterIP
    size: 3
    type: secondary
  sandboxes:
  - name: sandbox1
	image: opentext/vertica-k8s:24.3.0-0
    subclusters:
      - name: sc2
	  - name: sc3

Checking sandboxing status

You can check the status of sandboxing as follows:

$ kubectl describe vdb
Name:         vertica-db
...
Events:
  Type     Reason                      Age                From                Message
  ----     ------                      ----               ----                -------
  ...
  Normal   SandboxSubclusterStart      8m1s               verticadb-operator  Starting add subcluster "sc2" to sandbox "sandbox1"
  Normal   SandboxSubclusterSucceeded  7m39s              verticadb-operator  Successfully added subcluster "sc2" to sandbox "sandbox1"

You can verify if sandboxing is successful by checking the VerticaDB CR to see if the subcluster type changed from secondary to sandboxprimary.

spec:
...
  subclusters:
  - affinity: {}
    name: sc1
    resources: {}
    serviceName: sc1
    serviceType: ClusterIP
    size: 3
    type: primary
  - affinity: {}
    name: sc2
    resources: {}
    serviceName: sc2
    serviceType: ClusterIP
    size: 3
    type: sandboxprimary
  - affinity: {}
    name: sc3
    resources: {}
    serviceName: sc3
    serviceType: ClusterIP
    size: 3
    type: secondary
  sandboxes:
  - name: sandbox1
	image: opentext/vertica-k8s:24.3.0-0
    subclusters:
      - name: sc2
	  - name: sc3

Alternatively, you can connect to any node in the subcluster using vsql client and query the system table subclusters to check that sandboxing is successful.

$ vsql -h 10.244.2.166 -U dbadmin
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)

select distinct subcluster_name, is_primary, sandbox from subclusters where subcluster_name = 'sc2';
 subcluster_name | is_primary | sandbox
-----------------+------------+----------
 sc2             | t          | sandbox1
(1 row)

Upgrading the subcluster

To upgrade the sandbox, simply update the spec.sandboxes.image field.

spec:
...
  sandboxes:
  - image: opentext/vertica-k8s:24.3.0-1
    name: sandbox1
    subclusters:
    - name: sc2
    - name: sc3

Removing Sandboxes

Removing sandboxes allows you to remove a subcluster from the sandbox and return it to the main cluster.

To remove a subcluster from the sandbox, you need to remove the subcluster name from spec.sandboxes.subclusters in VerticaDB. In the following example, subcluster sc3 will be removed from the sandbox:

spec:
...
  sandboxes:
  - image: opentext/vertica-k8s:24.3.0-1
    name: sandbox1
    subclusters:
    - name: sc2

To remove the complete sandbox, remove its information from the VerticaDB:

spec:
...
  sandboxes:[]

Checking unsandboxing status

You can check if the sandbox was removed as follows:

$ kubectl describe vdb
Name:         vertica-db
...
Events:
  Type    Reason                        Age   From                Message
  ----    ------                        ----  ----                -------
  Normal  UnsandboxSubclusterStart      2m3s  verticadb-operator  Starting unsandbox subcluster "sc2"
  Normal  UnsandboxSubclusterSucceeded  111s  verticadb-operator  Successfully unsandboxed subcluster "sc2"
  Normal  UnsandboxSubclusterStart      111s  verticadb-operator  Starting unsandbox subcluster "sc3"
  Normal  UnsandboxSubclusterSucceeded  99s   verticadb-operator  Successfully unsandboxed subcluster "sc3"
  Normal  NodeRestartStarted            90s   verticadb-operator  Starting database restart node of the following pods: vertica-db-sc2-0, vertica-db-sc3-0
  Normal  NodeRestartSucceeded          65s   verticadb-operator  Successfully restarted database nodes and it took 24s

You can verify if the sandbox was removed successfully by opening the VerticaDB CR and checking that subcluster type has changed from sandboxprimary to secondary.

spec:
...
  subclusters:
  - affinity: {}
    name: sc1
    resources: {}
    serviceName: sc1
    serviceType: ClusterIP
    size: 3
    type: primary
  - affinity: {}
    name: sc2
    resources: {}
    serviceName: sc2
    serviceType: ClusterIP
    size: 3
    type: secondary
  - affinity: {}
    name: sc3
    resources: {}
    serviceName: sc3
    serviceType: ClusterIP
    size: 3
    type: secondary

Alternatively, you can connect to any node in the subcluster using the vsql client and query the subclusters system table to verify if unsandboxing was successful.

$ vsql -h 10.244.2.166 -U dbadmin
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)

vertdb=> select distinct subcluster_name, is_primary, sandbox from subclusters where subcluster_name = 'sc2';
 subcluster_name | is_primary | sandbox
-----------------+------------+---------
 sc2             | f          |
(1 row)

9 - Upgrading Vertica on Kubernetes

The operator automates Vertica server version upgrades for a custom resource (CR).

The operator automates Vertica server version upgrades for a custom resource (CR). Use the upgradePolicy setting in the CR to determine whether your cluster remains online or is taken offline during the version upgrade.

Prerequisites

Before you begin, complete the following:

Setting the policy

The upgradePolicy CR parameter setting determines how the operator upgrades Vertica server versions. It provides the following options:

Setting Description
Offline

The operator shuts down the cluster to prevent multiple versions from running simultaneously.

The operator performs all server version upgrades using the Offline setting in the following circumstances:

  • You have only one subcluster

  • You are upgrading from a Vertica server version prior to version 11.1.0

ReadOnlyOnline

The cluster continues to operate during a read-only online upgrade. The database is in read-only mode while the operator upgrades the image for the primary subcluster.

Online

The cluster continues to operate during an online upgrade. You can modify the data while the operator upgrades the database.

Auto

The default setting. The operator selects either Offline or ReadOnlyOnline depending on the configuration. The operator performs a ReadOnlyOnline upgrade if all of the following are true:

  • A license Secret exists

  • K-Safety is 1

  • The cluster is currently running a Vertica version 11.1.0 or higher

If the current configuration does not meet all of the previous requirements, the operator performs an Offline upgrade.

Reconcile loop iteration time

During an upgrade, the operator runs the reconcile loop to compare the actual state of the objects to the desired state defined in the CR. The operator requeues any unfinished work, and the reconcile loop compares states with a set period of time between each reconcile iteration.

Online upgrade

An online upgrade allows you to load data with minimal downtime, keeping the database active with continuous writes though replication. By leveraging sandboxes, instead of shutting down the primary subclusters and limiting secondary subclusters to read-only mode, you can sandbox a secondary subcluster. This allows ongoing read and write access to the database while the primary subcluster is being upgraded.

Online upgrade workflow

The following outlines the workflow during an online upgrade:

  1. Enable no-ddl mode: This mode restricts certain actions, such as creating new users or views. You can only insert data into existing tables or create new tables.
  2. Create a sandbox: The operator creates a new sandbox that replicates the main cluster. This requires additional resources temporarily. In the following example vertica-db-sc1 is the original cluster while vertica-db-sc1-sb is the sandboxed copy.
    $ kubectl get pods
    NAME                                          READY   STATUS    RESTARTS   AGE
    vertica-db-sc1-0                              2/2     Running   0          23m
    vertica-db-sc1-1                              2/2     Running   0          23m
    vertica-db-sc1-2                              2/2     Running   0          23m
    vertica-db-sc1-sb-0                           2/2     Running   0          83s
    vertica-db-sc1-sb-1                           2/2     Running   0          83s
    vertica-db-sc1-sb-2                           2/2     Running   0          83s
    verticadb-operator-manager-5f4564f946-qmklq   1/1     Running   163        7d4h
    
  3. Upgrade the sandbox: The sandbox is upgraded in Offline mode.
  4. Replicate data and redirect connections: Changes are synchronized by replicating data from the main cluster to the sandbox and connections are redirected to the sandbox environment.
  5. Promote the sandbox: The sandbox is now promoted to the main cluster.
  6. Remove the old cluster: After redirect is complete, the old cluster is removed. The new StatefulSet name and Vertica node names will differ from those of the old cluster.
    $ kubectl get pods
    NAME                                          READY   STATUS    RESTARTS   AGE
    vertica-db-sc1-sb-0                           2/2     Running   0          3m34s
    vertica-db-sc1-sb-1                           2/2     Running   0          3m34s
    vertica-db-sc1-sb-2                           2/2     Running   0          3m33s
    verticadb-operator-manager-5f4564f946-qmklq   1/1     Running   163        7d4h
    

Routing client traffic during a ReadOnlyOnline upgrade

During a read-only online upgrade, the operator begins by upgrading the Vertica server version in the primary subcluster to form a cluster with the new version. When the operator restarts the primary nodes, it places the secondary subclusters in read-only mode. Next, the operator upgrades any secondary subclusters one at a time. During the upgrade for any subcluster, all client connections are drained, and traffic is rerouted to either an existing subcluster or a temporary subcluster.

Read-only online upgrades require more than one subcluster so that the operator can reroute client traffic for the subcluster while it is upgrading. By default, the operator selects which subcluster receives the rerouted traffic using the following rules:

  • When rerouting traffic for the primary subcluster, the operator selects the first secondary subcluster defined in the CR.

  • When restarting the first secondary subcluster after the upgrade, the operator selects the first subcluster that is defined in the CR that is up.

  • If no secondary subclusters exist, you cannot perform a read-only online upgrade. The operator selects the first primary subcluster defined in the CR and performs an offline upgrade.

Route to an existing subcluster

You might want to control which subclusters handle rerouted client traffic due to subcluster capacity or licensing limitations. You can set the temporarySubclusterRouting.names parameter to specify an existing subcluster to receive the rerouted traffic:

spec:
  ...
  temporarySubclusterRouting:
    names:
      - subcluster-2
      - subcluster-1

In the previous example, subcluster-2 accepts traffic when the other subcluster-1 is offline. When subcluster-2 is down, subcluster-1 accepts its traffic.

Route to a temporary subcluster

To create a temporary subcluster that exists for the duration of the upgrade process, use the temporarySubclusterRouting.template parameter to provide a name and size for the temporary subcluster:

spec:
  ...
  temporarySubclusterRouting:
    template:
      name: transient
      size: 3

If you choose to upgrade with a temporary subcluster, ensure that you have the necessary resources.

Migrating deployment types

Beginning with Vertica server version 24.1.0, the operator manages deployments with vclusterops, a Go library that uses a high-level REST interface to perform database operations with the Node Management Agent (NMA) and HTTPS service. The vclusterops library replaces Administration tools (admintools), a traditional command-line interface that executes administrator commands through STDIN and required SSH keys for internal node communications. The vclusterops deployment is more efficient in containerized environments than the admintools deployment.

Because version 24.1.0 does not include admintools, you must migrate to the vcluster deployment type when you upgrade from an earlier server version.

Migrate the VerticaDB CR

Before you can migrate deployment types, you must upgrade the VerticaDB operator to version 2.0.0.

To migrate deployment types, update the manifest and apply it:

  1. Update the manifest to a vcluster deployment. The following sample manifest includes all fields that are required to migrate to a vclusterops deployment:

    apiVersion: vertica.com/v1
    kind: VerticaDB
    metadata:
      name: cr-name
      annotations:
        vertica.com/vcluster-ops: "true"
        vertica.com/run-nma-in-sidecar: "false"
    spec:
      image: "vertica/vertica-k8s:24.1.0-0"
      ...
    

    This manifest sets the following parameters:

    • apiVersion: By default, v1 supports vcluster deployments. Deprecated API version v1beta1 also supports vcluster, but Vertica recommends that you change to v1.
    • vertica.com/vcluster-ops: Set to true. With API version v1, this field and setting are optional. If you use the deprecated v1beta1, this setting is required or the migration fails.
    • vertica.com/run-nma-in-sidecar: You must set this to false for vcluster deployments. For additional details, see VerticaDB custom resource definition.
    • spec.image: Set this to a 24.1.0 image version. For a list images, see Vertica images.
  2. Apply the updated manifest to complete the migration:

    $ kubectl apply -f migration.yaml
    

Upgrade the Vertica server version

After you select your upgrade policy, use the kubectl command line tool to perform the upgrade and monitor its progress. The following steps demonstrate an online upgrade:

  1. Set the upgrade policy to Online:

    $ kubectl patch verticadb cluster-name --type=merge --patch '{"spec": {"upgradePolicy": "Online"}}'
    
  2. Update the image setting in the CR:

    $ kubectl patch verticadb cluster-name --type=merge --patch '{"spec": {"image": "vertica/vertica-k8s:new-version"}}'
    
  3. Use kubectl wait to wait until the operator leaves upgrade mode:

    $ kubectl wait --for=condition=UpgradeInProgress=False vdb/cluster-name --timeout=800s
    

View the upgrade process

To view the current phase of the upgrade process, use kubectl get to inspect the upgradeStatus status field:

$ kubectl get vdb -n namespacedatabase-name -o jsonpath='{.status.upgradeStatus}{"\n"}'
Restarting cluster with new image

To view the entire upgrade process, use kubectl describe to list the events the operator generated during the upgrade:

$ kubectl describe vdb cluster-name

...
Events:
  Type     Reason                                   Age                From                Message
  ----     ------                                   ----               ----                -------
  Normal   SubclusterRemoved                        32m                verticadb-operator  Removed subcluster 'sc_3'
  Normal   SubclusterRemoved                        32m                verticadb-operator  Removed subcluster 'sc2'
  Normal   SubclusterAdded                          18m                verticadb-operator  Added new subcluster 'sc1-sb'
  Normal   AddNodeStart                             18m                verticadb-operator  Starting add database node for pod(s) 'vertica-db-sc1-0, vertica-db-sc1-1, vertica-db-sc1-2'
  Normal   AddNodeSucceeded                         17m                verticadb-operator  Successfully added database nodes and it took 38s
  Normal   RebalanceShards                          17m                verticadb-operator  Successfully called 'rebalance_shards' for 'sc1-sb'
  Normal   SandboxSubclusterStart                   17m                verticadb-operator  Starting add subcluster "sc1-sb" to sandbox "replica-group-b-e904a"
  Normal   SandboxSubclusterSucceeded               17m                verticadb-operator  Successfully added subcluster "sc1-sb" to sandbox "replica-group-b-e904a"
  Normal   UpgradeStart                             16m (x2 over 18m)  verticadb-operator  Vertica server upgrade has started.
  Normal   ClusterShutdownStarted                   16m                verticadb-operator  Starting stop database on sandbox replica-group-b-e904a
  Normal   ClusterShutdownSucceeded                 16m                verticadb-operator  Successfully shutdown the database on sandbox replica-group-b-e904a and it took 17s
  Warning  LowLocalDataAvailSpace                   16m                verticadb-operator  Low disk space in persistent volume attached to vertica-db-sc1-sb-1
  Normal   ClusterRestartStarted                    14m (x3 over 15m)  verticadb-operator  Starting restart of the sandbox replica-group-b-e904a
  Normal   ClusterRestartSucceeded                  13m                verticadb-operator  Successfully restarted the sandbox replica-group-b-e904a and it took 70s
  Normal   PromoteSandboxSubclusterToMainStart      12m                verticadb-operator  Starting promote sandbox "replica-group-b-e904a" to main
  Normal   PromoteSandboxSubclusterToMainSucceeded  11m                verticadb-operator  Successfully promote sandbox "replica-group-b-e904a" to main
  Normal   SubclusterRemoved                        11m                verticadb-operator  Removed subcluster 'sc1'
  Normal   RenameSubclusterStart                    11m                verticadb-operator  Starting rename subcluster "sc1-sb" to "sc1"
  Normal   RenameSubclusterSucceeded                11m                verticadb-operator  Successfully rename subcluster "sc1-sb" to "sc1"
  Normal   UpgradeSucceeded                         11m                verticadb-operator  Vertica server upgrade has completed successfully.  New image is 'vertica/vertica-k8s:new-version'

10 - Hybrid Kubernetes clusters

An Eon Mode database can run hosts separate from the database and within Kubernetes.

An Eon Mode database can run hosts separate from the database and within Kubernetes. This architecture is useful in the following scenarios:

  • Leveraging Kubernetes tooling to quickly create a secondary subcluster for a database.

  • Creating an isolated sandbox environment to run ad hoc queries on a communal dataset.

  • Experimenting with the Vertica on Kubernetes performance overhead without migrating your primary subcluster into Kubernetes.

Define the Kubernetes portion of a hybrid architecture with a custom resource (CR). The custom resource has no knowledge of Vertica hosts that exist separately from the custom resource. This limits the operator's functionality and requires that you manually complete some tasks that the operator automates for a standard Vertica on Kubernetes custom resource.

Requirements and restrictions

The hybrid Kubernetes architecture has the following requirements and restrictions:

  • Hybrid Kubernetes clusters require a tool that enables Border Gateway Protocol (BGP) so that pods are accessible to your on-premises subcluster for external communication. For example, you can use the Calico CNI plugin to enable BGP.

  • You cannot use network address translation (NAT) between the Kubernetes pods and the on-premises cluster.

Operator limitations

In a hybrid architecture, the operator has no visibility outside of the custom resource. This limited visibility means that the operator cannot interact with the Eon Mode database or the primary subcluster. Within the scope of the custom resource, the operator automates only the following:

  • Schedules pods based on the manifest.

  • Creates service objects for the subcluster.

  • Creates a PersistentVolumeClaim (PVC) that persists data for each pod.

  • Executes the restart_node administration tool command if the Vertica server process is not running. To override this default behavior, set the autoRestartVertica custom resource parameter to false.

Defining a hybrid cluster

To define a hybrid cluster, you must set up SSH communications between the Eon Mode nodes and containers, and then define the hybrid CR.

SSH between environments

In an Eon Mode database, nodes communicate through SSH. Vertica containers use SSH with a static key. Because the CR has no knowledge of any of the Eon Mode hosts, you must make the containers aware of the Eon Mode SSH keys.

You can create a Secret for the CR that stores SSH credentials for both the Eon Mode database and the Vertica container. The Secret must contain the following:

  • id_rsa: private key shared among the pods.
  • id_rsa.pub: public key shared among the pods.
  • authorized_keys: file that contains the following keys:
    • id_rsa.pub for pod-to-pod traffic.
    • public key of on-premises root account.
    • public key of on-prem dbadmin account.

The following command creates a Secret named ssh-key that stores these SSH credentials. The Secret persists between life cycles to allow secure connections between the on-premises nodes and the CR:

$ kubectl create secret generic ssh-keys --from-file=$HOME/.ssh

Hybrid CR definition

Create a custom resource to define a subcluster that runs outside your standard Eon Mode database:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: hybrid-secondary-sc
spec:
  image: vertica/vertica-k8s:latest
  initPolicy: ScheduleOnly
  sshSecret: ssh-keys
  local:
    dataPath: /data
    depotPath: /depot
  dbName: vertdb
  subclusters:
    - name: sc1
      size: 3
    - name: sc2
      size: 3

In the previous example:

  • initPolicy: Hybrid clusters require that you set this to ScheduleOnly.

  • sshSecret: The Secret that contains SSH keys that authenticate connections to Vertica hosts outside the CR.

  • local: Required. The values persist data to the PersistentVolume (PV). These values must match the directory locations in the Eon Mode database that is associated with the Kubernetes pods.

  • dbName: This value must match the name of the standard Eon Mode database that is associated with this subcluster.

  • subclusters: Definition for each subcluster.

For complete implementation details, see VerticaDB custom resource definition. For details about each setting, see Custom resource definition parameters.

Maintaining quorum

If quorum is lost, you must manually restart the cluster with admintools:

$ /opt/vertica/bin/admintools -t restart_db --database database-name;

For details about maintaining quorum, see Data integrity and high availability in an Eon Mode database.

Scaling the Kubernetes subcluster

When you scale a hybrid cluster, you add nodes from the primary subcluster to the secondary subcluster on Kubernetes.

HDFS with Kerberos authentication

If you are scaling a cluster that authenticates Hadoop file storage (HDFS) data with Kerberos, you must alter the database configuration before you scale.

In the default configuration, the Vertica server process running in the Kubernetes pods cannot access the HDFS data due to incorrect permissions on the keytab file mounted in the pod. This requires that you set the KerberosEnableKeytabPermissionCheck Kerberos parameter:

  1. Set the KerberosEnableKeytabPermissionCheck configuration parameter to 0:
    => ALTER DATABASE DEFAULT SET KerberosEnableKeytabPermissionCheck = 0;
    WARNING 4324:  Parameter KerberosEnableKeytabPermissionCheck will not take effect until database restart
    ALTER DATABASE
    
  2. Restart the cluster with admintools so that the new setting takes effect:
    $ /opt/vertica/bin/admintools -t restart_db --database database-name;
    

For additional details about Vertica on Kubernetes and HDFS, see Configuring communal storage.

Scale the subcluster

When you add nodes from the primary subcluster to the secondary subcluster on Kubernetes, you must set up the configuration directory for the new nodes and change operator behavior during the scaling event:

  1. Execute the update_vertica script to set up the configuration directory. Vertica on Kubernetes requires the following configuration options for update_vertica:

    $ /opt/vertica/sbin/update_vertica \
        --accept-eula \
        --add-hosts host-list \
        --dba-user-password dba-user-password \
        --failure-threshold NONE \
        --no-system-configuration \
        --point-to-point \
        --data-dir /data-dir \
        --dba-user dbadmin \
        --no-package-checks \
        --no-ssh-key-install
    
  2. Set autoRestartVertica to false so that the operator does not interfere with the scaling operation:

    $ kubectl patch vdb database-name --type=merge --patch='{"spec": {"autoRestartVertica": false}}'
    
  3. Add the new nodes with the admintools db_add_node option:

    $ /opt/vertica/bin/admintools \
     -t db_add_node \
     --hosts host-list \
     --database database-name\
     --subcluster sc-name \
     --noprompt
    

    For details, see Adding and removing nodes from subclusters.

  4. After the scaling operation, set autoRestartVertica back to true:

    $ kubectl patch vdb database-name --type=merge --patch='{"spec": {"autoRestartVertica": true}}'
    

11 - Generating a custom resource from an existing Eon Mode database

To simplify Vertica on Kubernetes adoption, Vertica provides the vdb-gen migration tool that revives an existing Eon Mode database as a StatefulSet in Kubernetes.

To simplify Vertica on Kubernetes adoption, Vertica provides the vdb-gen migration tool that revives an existing Eon Mode database as a StatefulSet in Kubernetes. vdb-gen generates a custom resource (CR) from an existing Eon Mode database by connecting to the database and writing to standard output.

The vdb-gen tool is available for download as a release artifact in the vertica-kubernetes GitHub repository.

Use the -h flag to view a full list of the available vdb-gen options, including options for debugging and working with environment variables. The following steps generate a CR using basic commands:

  1. Execute vdb-gen and redirect the output to a YAML-formatted file:

    $ vdb-gen --password secret --name mydb 10.20.30.40 vertdb > vdb.yaml
    

    The previous command uses the following flags and values:

    • password: The existing database superuser secret password.

    • name: The name of the new custom resource object.

    • 10.20.30.40: The IP address of the existing database

    • vertdb: The name of the existing Eon Mode database.

    • vdb.yaml: The YAML formatted file that contains the custom resource definition generated by the vdb-gen tool.

  2. Use the admintools stop_db command to stop the existing database:

    $ /opt/vertica/bin/admintools -t stop_db -d vertdb
    

    Wait for the cluster lease to expire before continuing. For details, see Reviving an Eon Mode database cluster.

  3. Apply the YAML-formatted manifest that was generated by the vdb-gen tool:

    $ kubectl apply -f vdb.yaml
    verticadb.vertica.com/mydb created
    
  4. The operator creates the StatefulSet, installs Vertica on each pod, and runs revive. To view the events generated for the new database, use kubectl describe:

    $ kubectl describe vdb mydb
    

12 - Containerized Kafka Scheduler

The Vertica Apache Kafka integration includes a scheduler, a mechanism that you can configure to automatically consume data from Kafka and load that data into a Vertica database. The Vertica Kafka Scheduler is the containerized version of that scheduler that runs natively on Kubernetes. Both schedulers have identical functionality and accept the same configuration parameters.

This document provides quickstart instructions about how to create, configure, and launch the Vertica Kafka Scheduler on Kubernetes. It includes minimal details about each command. For in-depth documentation about scheduler behavior and advanced configuration, see Automatically consume data from Kafka with a scheduler.

Prerequisites

Add the Helm charts

To simplify deployment, Vertica packages the Kafka Scheduler in a Helm chart. Add the charts to your local helm repository:

$ helm repo add vertica-charts https://vertica.github.io/charts
$ helm repo update

Launch a scheduler

When you launch a scheduler, you must update the scheduler configuration, create the scheduler, set up a Vertica database to consume data from the scheduler, and then launch the scheduler.

The Vertica Kafka scheduler has two modes:

  • initializer: Configuration mode. Starts a container so that you can exec into it and configure it.
  • launcher: Launch mode. Launches the scheduler. Starts a container that calls vkconfig launch automatically. Run this mode after you configure the container in initializer mode.

Use the initializer mode to configure all the scheduler settings. After you configure the scheduler, you upgrade the helm chart to launch it in launcher mode.

Install the scheduler

Install the scheduler Helm chart to start the scheduler in initializer mode. The following helm install command deploys a scheduler named vkscheduler in the kafka namespace:

$ helm install vkscheduler --namespace kafka vertica-charts/vertica-kafka-scheduler
NAME: vkscheduler
LAST DEPLOYED: Tue Apr  2 11:53:49 2024
NAMESPACE: kafka
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Vertica's Kafka Scheduler has been deployed.

The initializer pod is running. You can exec into it and run your vkconfig
commands with this command:

kubectl exec -n kafka -it vkscheduler-vertica-kafka-scheduler-initializer -- bash

The command output provides the kubectl exec command that you can use to access a shell in the initializer pod and configure the scheduler.

Verify that the scheduler's initializer pod is running:

$ kubectl get pods --namespace kafka
NAME                                              READY   STATUS    RESTARTS      AGE
...
vkscheduler-vertica-kafka-scheduler-initializer   1/1     Running   1 (12s ago)   77s

Create the target table

The target table is the Vertica database table that stores the data that the scheduler loads from Kafka. In this example, you create a flex table so that you can load data with an unknown or varying schema:

  1. Create a flex table to store the data:
    => CREATE FLEX TABLE KafkaFlex();
    CREATE TABLE
    
  2. Create a user for the flex table:
    => CREATE USER KafkaUser;
    CREATE USER
    
  3. Create a resource pool for the scheduler. Vertica recommends that each scheduler have exclusive use of its own resource pool so that you can fine-tune the scheduler's impact on your Vertica cluster's performance:
    => CREATE RESOURCE POOL scheduler_pool PLANNEDCONCURRENCY 1;
    CREATE RESOURCE POOL
    
    For additional details, see Managing scheduler resources and performance.

Override scheduler configuration

After you install the scheduler, you need to configure it for your environment. The scheduler configuration file is vkconfig.conf, and it is stored in the following location in the initializer pod:

/opt/vertica/packages/kafka/config/vkconfig.conf

By default, vkconfig.conf contains the following values:

config-schema=Scheduler
dbport=5433
enable-ssl=false
username=dbadmin

vkconfig.conf is read-only from within the filesystem, so you must upgrade the Helm chart to override the default settings. The following YAML-formatted file provides a template for scheduler overrides:

image:
  repository: opentext/kafka-scheduler
  pullPolicy: IfNotPresent
  tag: scheduler-version
launcherEnabled: false
replicaCount: 1
initializerEnabled: true
conf:
  generate: true
  content:
    config-schema: scheduler-name
    username: dbadmin
    dbport: "5433"
    enable-ssl: "false"
    dbhost: vertica-db-host-ip
tls:
  enabled: false
serviceAccount:
  create: true

This template requires that you update the following values:

  • image.tag: Scheduler version. The scheduler version must match the version of the Vertica database that you used to create the target table.
  • conf.content.config-schema: Scheduler name. When you launch the scheduler, the Vertica database creates a schema that you can track with data streaming tables.
  • conf.content.dbhost: IP address for a host in your Vertica cluster.

For example, the scheduler-overrides.yaml file contains the following values:

image:
  repository: opentext/kafka-scheduler
  pullPolicy: IfNotPresent
  tag: 24.2.0
launcherEnabled: false
replicaCount: 1
initializerEnabled: true
conf:
  generate: true
  content:
    config-schema: scheduler-sample
    username: dbadmin
    dbport: "5433"
    enable-ssl: "false"
    dbhost: 10.20.30.40
tls:
  enabled: false
serviceAccount:
  create: true

After you define your overrides, use helm upgrade to apply the overrides to the scheduler initializer pod:

$ helm upgrade vkscheduler --namespace kafka vertica-charts/vertica-kafka-scheduler -f scheduler-overrides.yaml
Release "vkscheduler" has been upgraded. Happy Helming!
NAME: vkscheduler
LAST DEPLOYED: Tue Apr  2 11:54:35 2024
NAMESPACE: kafka
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Vertica's Kafka Scheduler has been deployed.

The initializer pod is running. You can exec into it and run your vkconfig
commands with this command:

kubectl exec -n kafka -it vkscheduler-vertica-kafka-scheduler-initializer -- bash

Configure the scheduler

After you update vkconfig.conf, you need to configure the scheduler itself. A scheduler is a combination of multiple components that you must configure individually with the vkconfig command.

To configure the scheduler, you must access the scheduler initializer pod to execute the vkconfig commands:

  1. Access a bash shell in the scheduler initializer pod:

    $ kubectl exec -n kafka -it vkscheduler-vertica-kafka-scheduler-initializer -- bash
    
  2. Define the scheduler. This command identifies the Vertica user, resource pool, and settings such as frame duration:

    bash-5.1$ vkconfig scheduler --conf /opt/vertica/packages/kafka/config/vkconfig.conf \
         --frame-duration 00:00:10 \
         --create --operator KafkaUser \
         --eof-timeout-ms 2000 \
         --config-refresh 00:01:00 \
         --new-source-policy START \
         --resource-pool scheduler_pool
    
  3. Define the target, which is the Vertica database table that the loads data from the scheduler:

    bash-5.1$ vkconfig target --add --conf /opt/vertica/packages/kafka/config/vkconfig.conf \
         --target-schema public \
         --target-table KafkaFlex
    
  4. Define the load spec. This defines how Vertica parses the data from Kafka:

    bash-5.1$ vkconfig load-spec --add --conf /opt/vertica/packages/kafka/config/vkconfig.conf \
         --load-spec KafkaSpec \
         --parser kafkajsonparser \
         --load-method DIRECT \
         --message-max-bytes 1000000
    
  5. Define the cluster. This identifies your Kafka cluster:

    bash-5.1$ vkconfig cluster --add --conf /opt/vertica/packages/kafka/config/vkconfig.conf \
         --cluster KafkaCluster \
         --hosts kafka01.example.com:9092,kafka03.example.com:9092
    
  6. Define the source. The source is the Kafka topic and partitions that you want to load data from:

    bash-5.1$ vkconfig source --add --conf /opt/vertica/packages/kafka/config/vkconfig.conf \
         --cluster KafkaCluster \
         --source KafkaTopic1 \
         --partitions 1
    
  7. Define the microbatch. The microbatch combines the components you created in the previous steps:

    bash-5.1$ vkconfig microbatch --add --conf /opt/vertica/packages/kafka/config/vkconfig.conf \
         --microbatch KafkaBatch1 \
         --add-source KafkaTopic1 \
         --add-source-cluster KafkaCluster \
         --target-schema public \
         --target-table KafkaFlex \
         --rejection-schema public \
         --rejection-table KafkaFlex_rej \
         --load-spec KafkaSpec
    

After you configure the scheduler, exit the pod:

bash-5.1$ exit
exit
$

Launch the scheduler

After you configure the scheduler, you must launch it. To launch the scheduler, upgrade the Helm chart to change the launcherEnabled field to true:

$ helm upgrade --namespace kafka vkscheduler vertica-charts/vertica-kafka-scheduler \
    --set "launcherEnabled=true"

A new pod starts that runs the scheduler in launch mode:

$ kubectl get pods
NAME                                              READY   STATUS      RESTARTS      AGE
tester-vertica-kafka-scheduler-66d5c49dbf-nc86k   1/1     Running     0             14s
tester-vertica-kafka-scheduler-initializer        1/1     Running     0             85m

Test your deployment

Now that you have a containerized Kafka cluster and VerticaDB CR running, you can test that the scheduler is automatically sending data from the Kafka producer to Vertica:

  1. Open a shell that is running your Kafka producer and send sample JSON data:

    >{"a": 1}
    >{"a": 1000}
    
  2. Open a terminal with access to your Vertica cluster and vsql. Query the KafkaFlex table to confirm that it contains the sample JSON data that you sent through the Kafka producer:

    => SELECT compute_flextable_keys_and_build_view('KafkaFlex');
                                     compute_flextable_keys_and_build_view                    
    --------------------------------------------------------------------------------------------------------
     Please see public.KafkaFlex_keys for updated keys
    The view public.KafkaFlex_view is ready for querying
    (1 row)
    
    => SELECT a from KafkaFlex_view;
     a
    -----
     1
     1000
    (2 rows)
    

Clean up

To delete the scheduler, you must use the vkconfig command with the scheduler tool --drop option and the scheduler schema. You must access a shell within the scheduler pod to run the commands:

$ kubectl exec -n kafka -it vkscheduler-vertica-kafka-scheduler-initializer -- bash
bash-5.1$ /opt/vertica/packages/kafka/bin/vkconfig scheduler --drop --config-schema scheduler_sample

You can delete your Kubeneretes resources with the helm uninstall command:

$ helm uninstall vkscheduler -n kafka

12.1 - Kafka scheduler parameters

The following list describes the available settings for the Vertica Kafka Scheduler:

affinity
Applies affinity rules that constrain the scheduler to specific nodes.
conf.configMapName
Name of the ConfigMap to use and optionally generate. If omitted, the chart picks a suitable default.
conf.content
Set of key-value pairs in the generated ConfigMap. If conf.generate is false, this setting is ignored.
conf.generate
When set to true, the Helm chart controls the creation of the vkconfig.conf ConfigMap.

Default: true

fullNameOverride
Gives the Helm chart full control over the name of the objects that get created. This takes precedence over nameOverride.
initializerEnabled
When set to true, the initializer pod is created. This can be used to run any setup tasks needed.

Default: true

image.pullPolicy
How often Kubernetes pulls the image for an object. For details, see Updating Images in the Kubernetes documentation.

Default: IfNotPresent

image.repository
The image repository and name that contains the Vertica Kafka Scheduler.

Default: opentext/kafka-scheduler

image.tag
Version of the Vertica Kafka Scheduler. This setting must match the version of the Vertica server that the scheduler connects to.

For a list of available tags, see opentext/kafka-scheduler.

Default: Helm chart's appVersion

imagePullSecrets
List of Secrets that contain the required credentials to pull the image.
launcherEnabled
When set to true, the Helm chart creates the launch deployment. Enable this setting after you configure the scheduler options in the container.

Default: true

jvmOpts
Values to assign to the VKCONFIG_JVM_OPTS environment variable in the pods.
nameOverride
Controls the name of the objects that get created. This is combined with the Helm chart release to form the name.
nodeSelector
nodeSelector that controls where the pod is scheduled.
podAnnotations
Annotations that you want to attach to the pods.
podSecurityContext
Security context for the pods.
replicaCount
Number of launch pods that the chart deploys.

Default: 1

resources
Host resources to use for the pod.
securityContext
Security context for the container in the pod.
serviceAccount.annotations
Annotations to attach to the ServiceAccount.
serviceAccount.create
When set to true, a ServiceAccount is created as part of the deployment.

Default: true

serviceAccount.name
Name of the service accountt. If this parameter is not set and serviceAccount.create is set to true, a name is generated using the fullname template.
timezone
Manages the timezone of the logger. As logging employs log4j, ensure you use a Java-friendly timezone ID. For details, see this Oracle documentation.

Default: UTC

tls.enabled
When set to true, the scheduler is set up for TLS authentication.

Default: false

tls.keyStoreMountPath
Directory name where the keystore is mounted in the pod. This setting controls the name of the keystore within the pod. The full path to the keystore is constructed by combining this parameter and tls.keyStoreSecretKey.
tls.keyStorePassword
Password that protects the keystore. If this setting is omitted, then no password is used.
tls.keyStoreSecretKey
Key within tls.keyStoreSecretName that is used as the keystore file name. This setting and tls.keyStoreMountPath form the full path to the key in the pod.
tls.keyStoreSecretName
Name of an existing Secret that contains the keystore. If this setting is omitted, no keystore information is included.
tls.trustStoreMountPath
Directory name where the truststore is mounted in the pod. This setting controls the name of the truststore within the pod. The full path to the truststore is constructed by combining this parameter with tls.trustStoreSecretKey.
tls.trustStorePassword
Password that protects the truststore. If this setting is omitted, then no password is used.
tls.trustStoreSecretKey
Key within tls.trustStoreSecretName that is used as the truststore file name. This is used with tls.trustStoreMountPath to form the full path to the key in the pod.
tls.trustStoreSecretName
Name of an existing Secret that contains the truststore. If this setting is omitted, then no truststore information is included.

13 - Backup and restore containerized Vertica (vbr)

In Vertica on Kubernetes, backup and restore operations use the same components and tooling as non-containerized environments, including the following:

Containerized backup and restore operations require that you make these components and tools available to the Vertica server process running within a pod. The following sections describe strategies that back up and restore your VerticaDB custom resource (CR) with Kubernetes objects.

For comprehensive backup and restore documentation, see Backing up and restoring the database.

Prerequisites

Sample configuration file

The vbr configuration file defines parameters that the vbr utility uses to execute backup and restore tasks. For details, see the following:

To define a vbr configuration file in Kubernetes, you can create a ConfigMap whose data field defines vbr configuration values. After you create the ConfigMap object in your Kubernetes environment, you can run vbr commands from within a pod that has access to the ConfigMap.

The following backup-configmap.yaml manifest creates a configuration file named backup.ini that backs up to an S3 bucket:

apiVersion: v1
kind: ConfigMap
metadata:
  name: backup-configmap
data:
  backup-host: |
        backup-pod-dns
  backup.ini: |
    [CloudStorage]
    cloud_storage_backup_path = s3://backup-bucket/database-backup-path
    cloud_storage_backup_file_system_path = [backup-pod-dns]:/opt/vertica/config/

    [Database]
    dbName = database-name

    [Misc]
    tempDir = /tmp/vbr
    restorePointLimit = 7
    objectRestoreMode = coexist    

To create the ConfigMap object, apply the manifest to your Kubernetes environment:

$ kubectl apply -f backup-configmap.yaml

backup-host definition

In the sample configuration file, backup-pod-dns is a portion of the pod's fully qualified domain name (FQDN). Vertica on Kubernetes creates a headless service object that constructs the FQDN for each object. The DNS format for each pod is as follows:

podName.headlessServiceName

The podName portion of the DNS is itself constructed from Kubernetes object names. For example, the following is a complete pod DNS:

vdb-main-0.vdb

In the preceding example:

  • vdb: VerticaDB CR name
  • main: Subcluster name
  • 0: StatefulSet ordinal index
  • vdb: Headless service name (always identical to the VerticaDB CR name)

To access a pod from outside the namespace, append the namespace to the pod DNS:

podName.headlessService.namespace

For additional details, see the Kubernetes documentation.

Mount the configuration file

After you define a ConfigMap with vbr configuration information, you must make it available to the Vertica pods that can execute the vbr utility. You can mount the ConfigMap as a volume in a VerticaDB CR instance. For details about mounting volumes in the VerticaDB CR, see Mounting custom volumes.

Cloud storage locations require access to information that you cannot provide in your configuration file, such as environment variables. You can set environment variables in your CR with annotations.

The following mounted-vbr-config.yaml manifest mounts a backup-config ConfigMap object in the Vertica container's /vbr directory:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb
spec:
  annotations:
    VBR_BACKUP_STORAGE_SECRET_ACCESS_KEY: "access-key"
    VBR_BACKUP_STORAGE_ACCESS_KEY_ID: "access-key-id"
    VBR_BACKUP_STORAGE_ENDPOINT_URL: "https://path/to/backup/storage"
    VBR_COMMUNAL_STORAGE_SECRET_ACCESS_KEY: "access-key"
    VBR_COMMUNAL_STORAGE_ACCESS_KEY_ID: "access-key-id"
    VBR_COMMUNAL_STORAGE_ENDPOINT_URL: "https://path/to/communal/storage"
  communal:
    endpoint: https://path/to/s3-endpoint
    path: "s3://bucket/database-path"
  image: vertica/vertica-k8s:version
  subclusters:
    - isPrimary: true
      name: main
  volumeMounts:
    - name: backup-configmap
      mountPath: /vbr
  volumes:
    - name: backup-configmap
      configMap:
        name: backup-configmap

To mount the ConfigMap, apply the manifest

$ kubectl apply -f mounted-vbr-config.yaml

After you apply the manifest, each Vertica pod restarts, and the new backup volume is mounted.

Prepare the backup location

Before you can run a backup, you must prepare the backup location with the vbr init command. This command initializes a directory on the backup host to receive and store Vertica backup data. You need to initialize a backup location only once. For details, see Setting up backup locations.

The following backup-init.yaml manifest creates a pod to initialize the backup-host defined in the sample configuration file:

apiVersion: v1
kind: Pod
metadata:
  name: backup-init
spec:
  restartPolicy: OnFailure
  containers:
    - name: main
      image: vertica/vertica-k8s:version
      command:
        - bash
        - -c
        - "ssh -o 'StrictHostKeyChecking no' -i /home/dbadmin/.ssh/id_rsa dbadmin@$BACKUP_HOST '/opt/vertica/bin/vbr -t init --cloud-force-init --config-file /vbr/backup.ini'"
      env:
        - name: BACKUP_HOST
          valueFrom:
            configMapKeyRef:
              key: backup-host
              name: backup-configmap

Apply the manifest to initialize the backup location:

$ kubectl create -f backup-init.yaml

Run a backup

Your organization might run backups as needed or on a schedule. The following sections use the sample configuration file ConfigMap to demonstrate both scenarios.

On-demand backups

In some circumstances, you might need to run backup operations as needed. You can create a Kubernetes Job to run an on-demand backup. The following backup-on-demand.yaml manifest creates a Job object that executes a backup:

apiVersion: batch/v1
kind: Job
metadata:
  generateName: vertica-backup-
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: main
          image: vertica/vertica-k8s:version
          command:
            - bash
            - -c
            - "ssh -o 'StrictHostKeyChecking no' -i /home/dbadmin/.ssh/id_rsa dbadmin@$BACKUP_HOST '/opt/vertica/bin/vbr -t backup --config-file /vbr/backup.ini'"
          env:
            - name: BACKUP_HOST
              valueFrom:
                configMapKeyRef:
                  key: backup-host
                  name: backup-configmap

Each time that you want to create a new backup, execute the following command:

$ kubectl create -f backup-on-demand.yaml

Scheduled backups

You might need to schedule a backup at a fixed time or interval. You can run the backup as a Kubenetes CronJob object that schedules a Kubernetes Job as specified in Cron format.

The following backup-cronjob.yaml manifest runs a daily backup at 2:00 AM:

apiVersion: batch/v1
kind: CronJob
metadata:
  generateName: vertica-backup-
spec:
  schedule: "00 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: main
              image: vertica/vertica-k8s:version
              command:
                - bash
                - -c
                - "ssh -o 'StrictHostKeyChecking no' -i /home/dbadmin/.ssh/id_rsa dbadmin@$BACKUP_HOST '/opt/vertica/bin/vbr -t backup --config-file /vbr/backup.ini'"
              env:
                - name: BACKUP_HOST
                  valueFrom:
                    configMapKeyRef:
                      key: backup-host
                      name: backup-configmap

To schedule the backup, create the CronJob object:

$ kubectl create -f backup-cronjob.yaml

Restore from a backup

You can create a Kubernetes Job to restore database objects from a backup. For comprehensive documentation about the vbr restore task, see Restoring backups.

The following restore-on-demand-job.yaml manifest creates a Job object that restores a database:

apiVersion: batch/v1
kind: Job
metadata:
  generateName: vertica-restore-
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: main
          image: vertica/vertica-k8s:version
          command:
            - bash
            - -c
            - "ssh -o 'StrictHostKeyChecking no' -i /home/dbadmin/.ssh/id_rsa dbadmin@$BACKUP_HOST '/opt/vertica/bin/vbr -t restore --config-file /vbr/backup.ini'"
          env:
            - name: BACKUP_HOST
              valueFrom:
                configMapKeyRef:
                  key: backup-host
                  name: backup-configmap

The restore process requires that you stop the database, run the restore operation, and then restart the database. This workflow requires additional steps in a containerized environment because Kubernetes has components that monitor and maintain the desired state of the database. You must temporarily adjust some settings to provide time for the restore operation to complete. For details about the settings in this section, see Custom resource definition parameters.

The following steps change the environment for the resource process and then restore the original values:

  1. Update the CR to extend the livenessProbe timeout. This timeout triggers a container restart when it expires. The default livenessProbe timeout is about two and half minutes, which does not provide enough time to restore the database. The following patch command uses the livenessProberOverride parameter to set the timeout to about 20 minutes:

    $ kubectl patch vdb customResourceName --type=json --patch '[ { "op": "add", "path": "/spec/livenessProbeOverride", "value": {"initialDelaySeconds": 60, "periodSeconds": 30, "failureThreshold": 38}}]'
    
  2. Delete the StatefulSet for each subcluster so that the pods are restarted with the new livenessProberOverride setting:

    $ kubectl delete statefulset customResourceName-subclusterName
    
  3. Wait until the pods restart and the new pod IPs are present in admintools.conf:

    $ kubectl wait --for=condition=Ready=True pod --selector=app.kubernetes.io/instance=customResourceName --timeout=10m
    
  4. Set autoRestartVertica to false so that the Vertica server process does not automatically restart when you stop the database:

    $ kubectl patch vdb customResourceName --type=merge --patch '{"spec": {"autoRestartVertica": false}}'
    
  5. Access a shell in a host that is running a Vertica pod, and stop the database with admintools:

    $ kubectl exec -it hostname -- admintools -t stop_db -d database-name
    

    After you stop the database, wait for the cluster lease to expire.

  6. Apply the manifest to run a Job that restores the backup:

    $ kubectl create -f restore-on-demand-job.yaml
    
  7. After the Job completes, use patch to reset the livenessProbe timeout to its default setting:

    $ kubectl patch vdb customResourceName --type=json --patch '[{ "op": "remove", "path": "/spec/livenessProbeOverride" }]'
    
  8. Set autoRestartVertica back to true to reset the restart behavior to its state before the restore operation:

    $ kubectl patch vdb customResourceName --type=merge --patch '{"spec": {"autoRestartVertica": true}}'
    
  9. To speed up the restart process, delete the StatefulSet for each subcluster. The restart speed was affected when you increased the livenessProbeOverride setting:

    $ kubectl delete statefulset customResourceName-subclusterName
    
  10. Wait for the Vertica server to restart:

    $ kubectl wait --for=condition=Ready=True pod --selector=app.kubernetes.io/instance=customResourceName --timeout=10m
    

14 - Troubleshooting your Kubernetes cluster

These tips can help you avoid issues related to your Vertica on Kubernetes deployment and troubleshoot any problems that occur.

These tips can help you avoid issues related to your Vertica on Kubernetes deployment and troubleshoot any problems that occur.

Download the kubectl command line tool to debug your Kubernetes resources.

14.1 - General cluster and database

Inspect objects to diagnose issues

When you deploy a custom resource (CR), you might encounter a variety of issues. To pinpoint an issue, use the following commands to inspect the objects that the CR creates:

kubectl get returns basic information about deployed objects:

$ kubectl get pods -n namespace
$ kubectl get statefulset -n namespace
$ kubectl get pvc -n namespace
$ kubectl get event

kubectl describe returns detailed information about deployed objects:

$ kubectl describe pod pod-name -n namespace
$ kubectl describe statefulset name -n namespace
$ kubectl describe custom-resource-name -n namespace

Verify updates to a custom resource

Because the operator takes time to perform tasks, updates to the custom resource are not effective immediately. Use the kubectl command line tool to verify that changes are applied.

You can use the kubectl wait command to wait for a specified condition. For example, the operator uses the UpgradeInProgress condition to provide an upgrade status. After you begin the image version upgrade, wait until the operator acknowledges the upgrade and sets this condition to True:

$ kubectl wait --for=condition=UpgradeInProgress=True vdb/cluster-name –-timeout=180s

After the upgrade begins, you can wait until the operator leaves upgrade mode and sets this condition to False:

$ kubectl wait --for=condition=UpgradeInProgress=False vdb/cluster-name –-timeout=800s

For more information about kubectl wait, see the kubectl reference documentation.

Pods are running but the database is not ready

When you check the pods in your cluster, the pods are running but the database is not ready:

$ kubectl get pods
NAME                                                    READY   STATUS    RESTARTS   AGE
vertica-crd-sc1-0                                       0/1     Running   0          12m
vertica-crd-sc1-1                                       0/1     Running   1          12m
vertica-crd-sc1-2                                       0/1     Running   0          12m
verticadb-operator-controller-manager-5d9cdc9b8-kw9nv   2/2     Running   0          24m

To find the root cause of the issue, use kubectl logs to check the operator manager. The following example shows that the communal storage bucket does not exist:

$ kubectl logs -l app.kubernetes.io/name=verticadb-operator -c manager -f
2021-08-04T20:03:00.289Z        INFO    controllers.VerticaDB   ExecInPod entry {"verticadb": "default/vertica-crd", "pod": {"namespace": "default", "name": "vertica-crd-sc1-0"}, "command": "bash -c ls -l /opt/vertica/config/admintools.conf && grep '^node\\|^v_\\|^host' /opt/vertica/config/admintools.conf "}
2021-08-04T20:03:00.369Z        INFO    controllers.VerticaDB   ExecInPod stream        {"verticadb": "default/vertica-crd", "pod": {"namespace": "default", "name": "vertica-crd-sc1-0"}, "err": null, "stdout": "-rw-rw-r-- 1 dbadmin verticadba 1243 Aug  4 20:00 /opt/vertica/config/admintools.conf\nhosts = 10.244.1.5,10.244.2.4,10.244.4.6\nnode0001 = 10.244.1.5,/data,/data\nnode0002 = 10.244.2.4,/data,/data\nnode0003 = 10.244.4.6,/data,/data\n", "stderr": ""}
2021-08-04T20:03:00.369Z        INFO    controllers.VerticaDB   ExecInPod entry {"verticadb": "default/vertica-crd", "pod": {"namespace": "default", "name": "vertica-crd-sc1-0"}, "command": "/opt/vertica/bin/admintools -t create_db --skip-fs-checks --hosts=10.244.1.5,10.244.2.4,10.244.4.6 --communal-storage-location=s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c --communal-storage-params=/home/dbadmin/auth_parms.conf --sql=/home/dbadmin/post-db-create.sql --shard-count=12 --depot-path=/depot --database verticadb --force-cleanup-on-failure --noprompt --password ******* "}
2021-08-04T20:03:00.369Z        DEBUG   controller-runtime.manager.events       Normal  {"object": {"kind":"VerticaDB","namespace":"default","name":"vertica-crd","uid":"26100df1-93e5-4e64-b665-533e14abb67c","apiVersion":"vertica.com/v1","resourceVersion":"11591"}, "reason": "CreateDBStart", "message": "Calling 'admintools -t create_db'"}
2021-08-04T20:03:17.051Z        INFO    controllers.VerticaDB   ExecInPod stream        {"verticadb": "default/vertica-crd", "pod": {"namespace": "default", "name": "vertica-crd-sc1-0"}, "err": "command terminated with exit code 1", "stdout": "Default depot size in use\nDistributing changes to cluster.\n\tCreating database verticadb\nBootstrap on host 10.244.1.5 return code 1 stdout '' stderr 'Logged exception in writeBufferToFile: RecvFiles failed in closing file [s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c/verticadb_rw_access_test.txt]: The specified bucket does not exist. Writing test data to file s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c/verticadb_rw_access_test.txt failed.\\nTesting rw access to communal location s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c/ failed\\n'\n\nError: Bootstrap on host 10.244.1.5 return code 1 stdout '' stderr 'Logged exception in writeBufferToFile: RecvFiles failed in closing file [s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c/verticadb_rw_access_test.txt]: The specified bucket does not exist. Writing test data to file s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c/verticadb_rw_access_test.txt failed.\\nTesting rw access to communal location s3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c/ failed\\n'\n\n", "stderr": ""}
2021-08-04T20:03:17.051Z        INFO    controllers.VerticaDB   aborting reconcile of VerticaDB {"verticadb": "default/vertica-crd", "result": {"Requeue":true,"RequeueAfter":0}, "err": null}
2021-08-04T20:03:17.051Z        DEBUG   controller-runtime.manager.events       Warning {"object": {"kind":"VerticaDB","namespace":"default","name":"vertica-crd","uid":"26100df1-93e5-4e64-b665-533e14abb67c","apiVersion":"vertica.com/v1","resourceVersion":"11591"}, "reason": "S3BucketDoesNotExist", "message": "The bucket in the S3 path 's3://newbucket/db/26100df1-93e5-4e64-b665-533e14abb67c' does not exist"}

Create an S3 bucket for the cluster:

$ S3_BUCKET=newbucket
$ S3_CLUSTER_IP=$(kubectl get svc | grep minio | head -1 | awk '{print $3}')
$ export AWS_ACCESS_KEY_ID=minio
$ export AWS_SECRET_ACCESS_KEY=minio123
$ aws s3 mb s3://$S3_BUCKET --endpoint-url http://$S3_CLUSTER_IP
make_bucket: newbucket

Use kubectl get pods to verify that the cluster uses the new S3 bucket and the database is ready:

$ kubectl get pods
NAME                                                    READY   STATUS    RESTARTS   AGE
minio-ss-0-0                                            1/1     Running   0          18m
minio-ss-0-1                                            1/1     Running   0          18m
minio-ss-0-2                                            1/1     Running   0          18m
minio-ss-0-3                                            1/1     Running   0          18m
vertica-crd-sc1-0                                       1/1     Running   0          20m
vertica-crd-sc1-1                                       1/1     Running   0          20m
vertica-crd-sc1-2                                       1/1     Running   0          20m
verticadb-operator-controller-manager-5d9cdc9b8-kw9nv   2/2     Running   0          63m

Database is not available

After you create a custom resource instance, the database is not available. The kubectl get custom-resource command does not display information:

$ kubectl get vdb
NAME          AGE   SUBCLUSTERS   INSTALLED   DBADDED   UP
vertica-crd   4s

Use kubectl describe custom-resource to check the events for the pods to identify any issues:

$ kubectl describe vdb
Name:         vertica-crd
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  vertica.com/v1
Kind:         VerticaDB
Metadata:
  ...
  Superuser Password Secret:  su-passwd
Events:
  Type     Reason                           Age                From                Message
  ----     ------                           ----               ----                -------
  Warning  SuperuserPasswordSecretNotFound  5s (x12 over 15s)  verticadb-operator  Secret for superuser password 'su-passwd' was not found

In this circumstance, the custom resource uses a Secret named su-passwd to store the Superuser Password Secret, but there is no such Secret available. Create a Secret named su-passwd to store the Secret:

$ kubectl create secret generic su-passwd --from-literal=password=sup3rs3cr3t
secret/su-passwd created

Use kubectl get custom-resource to verify the issue is resolved:

$ kubectl get vdb
NAME          AGE   SUBCLUSTERS   INSTALLED   DBADDED   UP
vertica-crd   89s   1             0           0         0

Image pull failure

You receive an ImagePullBackOff error when you deploy a Vertica cluster with Helm charts, but you do not pre-pull the Vertica image from the local registry server:

$ kubectl describe pod pod-name-0
...
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  ...
  Warning  Failed            2m32s                  kubelet            Failed to pull image "k8s-rhel7-01:5000/vertica-k8s:default-1": rpc error: code = Unknown desc = context canceled
  Warning  Failed            2m32s                  kubelet            Error: ErrImagePull
  Normal   BackOff           2m32s                  kubelet            Back-off pulling image "k8s-rhel7-01:5000/vertica-k8s:default-1"
  Warning  Failed            2m32s                  kubelet            Error: ImagePullBackOff
  Normal   Pulling           2m18s (x2 over 4m22s)  kubelet            Pulling image "k8s-rhel7-01:5000/vertica-k8s:default-1"

This occurs because the Vertica image size is too big to pull from the registry while deploying the Vertica cluster. Execute the following command on a Kubernetes host:

$ docker image list | grep vertica-k8s
k8s-rhel7-01:5000/vertica-k8s default-1 2d6f5d3d90d6 9 days ago 1.55GB

To solve this issue, complete one of the following:

  • Pull the Vertica images on each node before creating the Vertica StatefulSet:

    $ NODES=`kubectl get nodes | grep -v NAME | awk '{print $1}'`
    $ for node in $NODES; do ssh $node docker pull $DOCKER_REGISTRY:5000/vertica-k8s:$K8S_TAG; done
    
  • Use the reduced-size vertica/vertica-k8s:latest image for the Vertica server.

Pending pods due to insufficient CPU

If your host nodes do not have enough resources to fulfill the resource request from a pod, the pod stays in pending status.

In the following example, the pod requests 40 CPUs on the host node, and the pod stays in Pending:

$ kubectl describe pod cluster-vertica-defaultsubcluster-0
...
Status:         Pending
...
Containers:
  server:
    Image:       docker.io/library/vertica-k8s:default-1
    Ports:       5433/TCP, 5434/TCP, 22/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Command:
      /opt/vertica/bin/docker-entrypoint.sh
      restart-vertica-node
    Limits:
      memory:  200Gi
    Requests:
      cpu: 40
      memory:  200Gi
...
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  3h20m  default-scheduler  0/5 nodes are available: 5 Insufficient cpu.

To confirm the resources available on the host node. The following command confirms that the host node has only 40 allocatable CPUs:

$ kubectl describe node host-node-1
...
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sat, 20 Mar 2021 22:39:10 -0400   Sat, 20 Mar 2021 13:07:02 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sat, 20 Mar 2021 22:39:10 -0400   Sat, 20 Mar 2021 13:07:02 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sat, 20 Mar 2021 22:39:10 -0400   Sat, 20 Mar 2021 13:07:02 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Sat, 20 Mar 2021 22:39:10 -0400   Sat, 20 Mar 2021 13:07:12 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  172.19.0.5
  Hostname:    eng-g9-191
Capacity:
  cpu:                40
  ephemeral-storage:  285509064Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             263839236Ki
  pods:               110
Allocatable:
  cpu:                40
  ephemeral-storage:  285509064Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             263839236Ki
  pods:               110
...
Non-terminated Pods:          (3 in total)
  Namespace                   Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                   ------------  ----------  ---------------  -------------  ---
  default                     cluster-vertica-defaultsubcluster-0    38 (95%)      0 (0%)      200Gi (79%)      200Gi (79%)    51m
  kube-system                 kube-flannel-ds-8brv9                  100m (0%)     100m (0%)   50Mi (0%)        50Mi (0%)      9h
  kube-system                 kube-proxy-lgjhp                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         9h
...

To correct this issue, reduce the resource.requests in the subcluster to values lower than the maximum allocatable CPUs. The following example uses a YAML-formatted file named patch.yaml to lower the resource requests for the pod:

$ cat patch.yaml
spec:
  subclusters:
    - name: defaultsubcluster
      resources:
        requests:
          memory: 238Gi
          cpu: "38"
        limits:
          memory: 238Gi
$ kubectl patch vdb cluster-vertica –-type=merge --patch “$(cat patch.yaml)verticadb.vertica.com/cluster-vertica patched

Pending pod after node removed

When you remove a host node from your Kubernetes cluster, a Vertica pod might stay in pending status if the pod uses a PersistentVolume (PV) that has a node affinity rule that prevents the pod from running on another node.

To resolve this issue, you must verify that the pods are pending because of an affinity rule, and then use the vdb-gen tool to revive the entire cluster.

First, determine if the pod is pending because of a node affinity rule. This requires details about the pending pod, the PersistentVolumeClaim (PVC) associated with the pod, and the PersistentVolume (PV) associated with the PVC:

  1. Use kubectl describe to return details about the pending pod:

    $ kubectl describe pod pod-name
    ...
    Events:
      Type     Reason            Age                From               Message
      ----     ------            ----               ----               -------
      Warning  FailedScheduling  28s (x2 over 48s)  default-scheduler  0/2 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 1 node(s) had volume node affinity conflict, 1 node(s) were unschedulable. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
    

    The Message column verifies that the pod was not scheduled due a volume node affinity conflict.

  2. Get the name of the PVC associated with the pod:

    $ kubectl get pod -o jsonpath='{.spec.volumes[0].persistentVolumeClaim.claimName}{"\n"}' pod-name
    local-data-pod-name
    
  3. Use the PVC to get the PV. PVs are associated with nodes:

    $ kubectl get pvc -o jsonpath='{.spec.volumeName}{"\n"}' local-data-pod-name
    pvc-1926ae96-574d-4433-99b4-ec9ab0e5e497
    
  4. Use the PV to get the name of the node that has the affinity rule:

    $ kubectl get pv -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]}{"\n"}' pvc-1926ae96-574d-4433-99b4-ec9ab0e5e497
    ip-10-20-30-40.ec2.internal
    
  5. Verify that the node with the affinity rule is the node that was removed from the Kubernetes cluster.

Next, you must revive the entire cluster to get all pods running again. When you revive the cluster, you create new PVCs that restore the association between each pod and a PV to satisfy the node affinity rule.

While you have nodes running in the cluster, you can use the vdb-gen tool to generate a manifest and revive the database:

  1. Download the vdb-gen tool from the vertica-kubernetes GitHub repository:

    $ wget https://github.com/vertica/vertica-kubernetes/releases/latest/download/vdb-gen
    
  2. Copy the tool into a pod that has a running Vertica process:

    $ kubectl cp vdb-gen pod-name:/tmp/vdb-gen
    
  3. The vdb-gen tool requires the database name, so retrieve it with the following command:

    $ kubectl get vdb -o jsonpath='{.spec.dbName}{"\n"}' v
    database-name
    
  4. Run the vdb-gen tool with the database name. The following command runs the tool and pipes the output to a file named revive.yaml:

    $ kubectl exec -i pod-name -- bash -c "chmod +x /tmp/vdb-gen && /tmp/vdb-gen --ignore-cluster-lease --name v localhost database-name | tee /tmp/revive.yaml"
    
  5. Copy revive.yaml to your local machine so that you can use it after you remove the cluster:

    $ kubectl cp pod-name:/tmp/revive.yaml revive.yaml
    
  6. Save the current VerticaDB Custom Resource (CR). For example, the following command saves a CR named vertdb to a file named orig.yaml:

    $ kubectl get vdb vertdb -o yaml > orig.yaml
    
  7. Update revive.yaml with parts of orig.yaml that vdb-gen did not capture. For example, custom resource limits.

  8. Delete the existing Vertica cluster:

    $ kubectl delete vdb vertdb
    verticadb.vertica.com "vertdb" deleted
    
  9. Confirm that all PVCs that are associated with the deleted cluster were removed:

    1. Retrieve the PVC names. A PVC name uses the dbname-subcluster-podindex format:

      $ kubectl get pvc
      NAME                     STATUS   VOLUME                                     CAPACITY ACCESS MODES   STORAGECLASS   AGE
      local-data-vertdb-sc-0   Bound    pvc-e9834c18-bf60-4a4b-a686-ba8f7b601230   1Gi      RWO            local-path     34m
      local-data-vertdb-sc-1   Bound    pvc-1926ae96-574d-4433-99b4-ec9ab0e5e497   1Gi      RWO            local-path     34m
      local-data-vertdb-sc-2   Bound    pvc-4541f7c9-3afc-47f0-8d04-67fac370ee88   1Gi      RWO            local-path     34m
      
    2. Delete the PVCs:

      $ kubectl delete pvc local-data-vertdb-sc-0 local-data-vertdb-sc-1 local-data-vertdb-sc-2
      persistentvolumeclaim "local-data-vertdb-sc-0" deleted
      persistentvolumeclaim "local-data-vertdb-sc-1" deleted
      persistentvolumeclaim "local-data-vertdb-sc-2" deleted
      
  10. Revive the database with revive.yaml:

    $ kubectl apply -f revive.yaml
    verticadb.vertica.com/vertdb created
    

After the revive completes, all Vertica pods are running, and PVCs are recreated on new nodes. Wait for the operator to start the database.

Deploying to Istio

Vertica does not officially support Istio because the Istio sidecar port requirement conflicts with the port that Vertica requires for internal node communication. However, you can deploy Vertica on Kubernetes to Istio with changes to the Istio InboundInterceptionMode setting. Vertica provides access to this setting with annotations on the VerticaDB CR.

REDIRECT mode

REDIRECT mode is the default InboundInterceptionMode setting, and it requires that you disable network address translation (NAT) on port 5434, the port that the pods use for internal communication. Disable NAT on this port with the excludeInboundPorts annotation:

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: vdb
spec:
  annotations:
    traffic.sidecar.istio.io/excludeInboundPorts: "5434"

14.2 - Helm charts

Custom certificate helm install error

If you use custom certificates when you install the operator with the Helm chart, the helm install or kubectl apply command might return an error similar to the following:

$ kubectl apply -f ../operatorcrd.yaml
Error from server (InternalError): error when creating "../operatorcrd.yaml": Internal error occurred: failed calling webhook "mverticadb.kb.io": Post "https://verticadb-operator-webhook-service.namespace.svc:443/mutate-vertica-com-v1-verticadb?timeout=10s": x509: certificate is valid for ip-10-0-21-169.ec2.internal, test-bastion, not verticadb-operator-webhook-service.default.svc

You receive this error when the TLS key's Domain Name System (DNS) or Subject Alternate Name (SAN) is incorrect. To correct this error, define the DNS and SAN in a configuration file in the following format:

commonName = verticadb-operator-webhook-service.namespace.svc
...
[alt_names]
DNS.1 = verticadb-operator-webhook-service.namespace.svc
DNS.2 = verticadb-operator-webhook-service.namespace.svc.cluster.local

For additional details, see Installing the VerticaDB operator.

14.3 - Metrics gathering

Adding and testing the vlogger sidecar

Vertica provides the vlogger image that sends logs from vertica.log to standard output on the host node for log aggregation.

To add the sidecar to the CR, add an element to the spec.sidecars definition:

spec:
  ...
  sidecars:
    - name: vlogger
      image: vertica/vertica-logger:1.0.0

To test the sidecar, run the following command and verify that it returns logs:

$ kubectl logs pod-name -c vlogger

2021-12-08 14:39:08.538 DistCall Dispatch:0x7f3599ffd700-c000000000997e [Txn
2021-12-08 14:40:48.923 INFO New log
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> Log /data/verticadb/v_verticadb_node0002_catalog/vertica.log opened; #1
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> Processing command line: /opt/vertica/bin/vertica -D /data/verticadb/v_verticadb_node0002_catalog -C verticadb -n v_verticadb_node0002 -h 10.20.30.40 -p 5433 -P 4803 -Y ipv4
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> Starting up Vertica Analytic Database v11.0.2-20211201
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO>
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> vertica(v11.0.2) built by @re-docker5 from master@a44ffabdf3f05e8d104426506b088192f741c485 on 'Wed Dec  1 06:10:34 2021' $BuildId$
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> CPU architecture: x86_64
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> 64-bit Optimized Build
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> Compiler Version: 7.3.1 20180303 (Red Hat 7.3.1-5)
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> LD_LIBRARY_PATH=/opt/vertica/lib
2021-12-08 14:40:48.923 Main Thread:0x7fbbe2cf6280 [Init] <INFO> LD_PRELOAD=
2021-12-08 14:40:48.925 Main Thread:0x7fbbe2cf6280 <LOG> @v_verticadb_node0002: 00000/5081: Total swap memory used: 0
2021-12-08 14:40:48.925 Main Thread:0x7fbbe2cf6280 <LOG> @v_verticadb_node0002: 00000/4435: Process size resident set: 28651520
2021-12-08 14:40:48.925 Main Thread:0x7fbbe2cf6280 <LOG> @v_verticadb_node0002: 00000/5075: Total Memory free + cache: 59455180800
2021-12-08 14:40:48.925 Main Thread:0x7fbbe2cf6280 [Txn] <INFO> Looking for catalog at: /data/verticadb/v_verticadb_node0002_catalog/Catalog
...

Generating core files

In some circumstances, you might need to examine a core file that contains information about the Vertica server container process.

Vertica server container process

The following steps generate a core file for the Vertica server process:

  1. Use the securityContext value to set the privileged property to true:

    apiVersion: vertica.com/v1
    kind: VerticaDB
    ...
    spec:
      ...
      securityContext:
        privileged: true
    
  2. On the host machine, verify that /proc/sys/kernel/core_pattern is set to core:

    $ cat /proc/sys/kernel/core_pattern
    core
    

    The /proc/sys/kernel/core_pattern file is not namespaced, so setting this value affects all containers running on that host.

When Vertica generates a core, the machine writes a message to vertica.log that indicates where you can locate the core file.

OpenShift core files

If you want to generate a core file in OpenShift, you must add the SYS_PTRACE capability in the CR to collect vstacks:

  1. Use the securityContext value to set the capabilities.add property to ["SYS_PRTRACE"]:

        apiVersion: vertica.com/v1
     kind: VerticaDB
     ...
     spec:
       ...
       securityContext:
         capabilities:
           add: ["SYS_PTRACE"]
    
  2. Apply the changes:

    $ kubectl apply -f core-file-manifest.yaml
    
  3. Get a shell in the container and execute vstack as the superuser:

    $ kubectl exec svc/subcluster-name -- sh -c "echo root | su root /opt/vertica/bin/vstack"
    

14.4 - VerticaAutoscaler

Cannot find CPU metrics with VerticaAutoscaler

You might notice that your VerticaAutoScaler is not scaling correctly according to CPU utilization:

$ kubectl get hpa
NAME                REFERENCE                           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
autoscaler-name     VerticaAutoscaler/autoscaler-name   <unknown>/50%   3         12        0          19h

$ kubectl describe hpa
Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
Name: autoscaler-name
Namespace: namespace
Labels: <none>
Annotations: <none>
CreationTimestamp: Thu, 12 May 2022 10:25:02 -0400
Reference: VerticaAutoscaler/autoscaler-name
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 50%
Min replicas: 3
Max replicas: 12
VerticaAutoscaler pods: 3 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 7s horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Warning FailedComputeMetricsReplicas 7s horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)

You receive this error because the metrics server is not installed:

$ kubectl top nodes
error: Metrics API not available

To install the metrics server:

  1. Download the components.yaml file:

    $ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    
  2. Optionally, disable TLS:

    $ if ! grep kubelet-insecure-tls components.yaml; then
      sed -i 's/- args:/- args:\n - --kubelet-insecure-tls/' components.yaml;
    
  3. Apply the YAML file:

    $ kubectl apply -f components.yaml
    
  4. Verify that the metrics server is running:

    $ kubectl get svc metrics-server -n namespace
    NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
    metrics-server   ClusterIP   10.105.239.175   <none>        443/TCP   19h
    

CPU request error with VerticaAutoscaler

You might receive an error that states:

failed to get cpu utilization: missing request for cpu

You get this error because you must set resource limits on all containers, including sidecar containers. To correct this error:

  1. Verify the error:

    $ kubectl get hpa
    NAME                REFERENCE                           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
    autoscaler-name     VerticaAutoscaler/autoscaler-name   <unknown>/50%   3         12        0          19h
    
    $ kubectl describe hpa
    Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
    Name: autoscaler-name
    Namespace: namespace
    Labels: <none>
    Annotations: <none>
    CreationTimestamp: Thu, 12 May 2022 15:58:31 -0400
    Reference: VerticaAutoscaler/autoscaler-name
    Metrics: ( current / target )
    resource cpu on pods (as a percentage of request): <unknown> / 50%
    Min replicas: 3
    Max replicas: 12
    VerticaAutoscaler pods: 3 current / 0 desired
    Conditions:
    Type Status Reason Message
    ---- ------ ------ -------
    AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
    ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: failed to get cpu utilization: missing request for cpu
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Warning FailedGetResourceMetric 4s (x5 over 64s) horizontal-pod-autoscaler failed to get cpu utilization: missing request for cpu
    Warning FailedComputeMetricsReplicas 4s (x5 over 64s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
    
  2. Add resource limits to the CR:

    $ cat /tmp/vdb.yaml
    apiVersion: vertica.com/v1
    kind: VerticaDB
    metadata:
      name: vertica-vdb
    spec:
      sidecars:
        - name: vlogger
          image: vertica/vertica-logger:latest
          resources:
            requests:
              memory: "100Mi"
              cpu: "100m"
            limits:
              memory: "100Mi"
              cpu: "100m"
      communal:
        credentialSecret: communal-creds
        endpoint: https://endpoint
            path: s3://bucket-location
      dbName: verticadb
      image: vertica/vertica-k8s:latest
      subclusters:
      - type: primary
        name: sc1
        resources:
          requests:
            memory: "4Gi"
            cpu: 2
          limits:
            memory: "4Gi"
            cpu: 2
        serviceType: ClusterIP
        serviceName: sc1
        size: 3
      upgradePolicy: Auto
    
  3. Apply the update:

    $ kubectl apply -f /tmp/vdb.yaml
    verticadb.vertica.com/vertica-vdb created
    

When you set a new CPU resource limit, Kubernetes reschedules each pod in the StatefulSet in a rolling update until all pods have the updated CPU resource limit.