Creating a custom resource

The custom resource definition (CRD) is a shared global object that extends the Kubernetes API beyond the standard resource types. The CRD serves as a blueprint for custom resource (CR) instances. You create CRs that specify the desired state of your environment, and the operator monitors the CR to maintain state for the objects within its namespace.

For convenience, this example CR uses a YAML-formatted file. For details about all available CR settings, see Custom resource parameters.

Prerequisites

Creating secrets

Use the kubectl command line tool to create Secrets that store sensitive information in your custom resource without exposing the values they represent.

  1. Create a Secret named vertica-license for your Vertica license:

    $ kubectl create secret generic vertica-license --from-file=license.dat=/path/to/license.dat
    

    By default, the Helm chart uses the free Community Edition license. This license is limited to 3 nodes and 1 TB of data.

  2. Create a Secret named su-passwd to store your superuser password. If you do not create this Secret, the database is created without a superuser password:

    $ kubectl create secret generic su-passwd --from-literal=password=secret-password
    
  3. Create a Secret named s3-creds that stores your S3-compatible communal access key and secret key credentials:

    $ kubectl create secret generic s3-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
    
  4. This tutorial configures a certificate authority (CA) bundle that authenticates the S3-compatible connections to your custom resource. Create a Secret named aws-cert:

    $ kubectl create secret generic aws-cert --from-file=root_cert.pem
    
  5. You can mount multiple certificates in the Vertica server filesystem. The following command stores your mTLS certificate in a Secret named mtls:

    $ kubectl create secret generic mtls --from-file=mtls=/path/to/mtls-cert
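
To confirm that the Secrets exist before you reference them in the custom resource, list them by name. This check assumes that you created each Secret in the current namespace:

    $ kubectl get secrets vertica-license su-passwd s3-creds aws-cert mtls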
    

Required fields

The VerticaDB definition begins with required fields that describe the version, resource type, and metadata:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb

The previous example defines the following:

  • apiVersion: The API group and Kubernetes API version in api-group/version format.

  • kind: The resource type. VerticaDB is the name of the Vertica custom resource type.

  • metadata: Data that identifies objects in the namespace.

    • name: The name of this CR object.

Spec definition

The spec field defines the desired state of the CR. During the control loop, the operator compares the spec values to the current state and reconciles any differences.

The following sections nest values under the spec field to define the desired state of your custom resource object.

Image management

Each custom resource instance requires access to a Vertica server image and instructions about how often to download a new one:

spec:
  image: vertica/vertica-k8s:latest
  imagePullPolicy: Always

The previous example defines the following:

  • image: The image to run in the Vertica server container pod, defined here in docker-registry-hostname/image-name:tag format. For a full list of available Vertica images, see the Vertica Docker Hub registry.

  • imagePullPolicy: Controls when the operator pulls the image from the container registry. When you use the latest tag, set this value to Always. Because the latest tag is overwritten with each new release, check the image registry to confirm which image is actually in use. A pinned alternative is sketched below.
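
To make image upgrades reproducible, you can pin a specific version tag instead of latest. The following sketch uses an illustrative tag; check the registry for the tags that are actually available:

spec:
  image: vertica/vertica-k8s:11.0.1-0
  imagePullPolicy: IfNotPresent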

Cluster description values

This section logically groups fields that configure the database and how it operates:

spec:
  ...
  initPolicy: Create
  kSafety: "1"
  licenseSecret: vertica-license
  superuserPasswordSecret: su-passwd

The previous example defines the following:

  • initPolicy: Specifies how to initialize the database. Create initializes a new database for the custom resource.

  • kSafety: Determines the fault tolerance for the subcluster. For a three-pod subcluster, set kSafety to 1.

  • licenseSecret: The Secret that contains your Vertica license key. The license is mounted in the /home/dbadmin/licensing/mnt directory.

  • superuserPasswordSecret: The Secret that contains the database superuser password.
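
If you need to recover the superuser password later, you can read it back from the Secret. The following command assumes the Secret is in the current namespace:

    $ kubectl get secret su-passwd -o jsonpath='{.data.password}' | base64 --decode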

Mounting custom TLS certificates

certSecrets is a list that contains each Secret that you created to encrypt internal and external communications for your CR. Use the name key to add each certificate:

spec:
  ...
  certSecrets:
    - name: mtls
    - name: aws-cert

certSecrets accepts an unlimited number of name values. If you update an existing certificate, the operator replaces the certificate in the Vertica server container. If you add or delete a certificate, the operator reschedules the pod with the new configuration.

Each certSecret is mounted in the Vertica server container in the /certs/certSecrets.name directory. For example, the aws-cert Secret is mounted in the /certs/aws-cert directory.
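
After the pods are running, you can confirm the mounts from inside the server container. The pod name in this check is a placeholder; substitute one of your own pods:

    $ kubectl exec verticadb-primary-subcluster-0 -- ls /certs/mtls /certs/aws-cert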

Configuring communal storage

The following example configures communal storage for an S3 endpoint. For a list of supported communal storage locations, see Containerized environments. For implementation details for each communal storage location, see Configuring communal storage.

Provide the location and credentials for the storage location in the communal section:

spec:
  ...
  communal:
    credentialSecret: s3-creds
    endpoint: https://path/to/s3-endpoint
    path: s3://bucket-name/key-name
    caFile: /certs/aws-cert/root_cert.pem
    region: aws-region

The previous example defines the following:

  • credentialSecret: The Secret that contains your communal access and secret key credentials.

  • endpoint: The S3 endpoint URL.

  • path: The location of the S3 storage bucket, in S3 bucket notation. This bucket must exist before you create the custom resource. After you create the custom resource, you cannot change this value.

  • caFile: The path in the server container filesystem to the certificate file that validates S3-compatible connections to your custom resource. The CA file is mounted in the same directory as the aws-cert Secret that you added in Mounting custom TLS certificates.

  • region: The geographic location of the communal storage resources.
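
Because the communal path must exist before you create the custom resource, create the bucket first. For example, with the AWS CLI (assuming your credentials and region are already configured):

    $ aws s3 mb s3://bucket-name --region aws-region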

Adding a sidecar container

A sidecar is a utility container that runs in the same pod as the Vertica server container and performs a task for the Vertica server process. For example, you can use the vertica-logger image to add a sidecar that sends logs from vertica.log to standard output on the host node for log aggregation.

sidecars accepts a list of sidecar definitions, where each element defines the following values:

spec:
  ...
  sidecars:
    - name: sidecar-container
      image: sidecar-image:latest

The previous example defines the following:

  • name: The name of the sidecar. name indicates the beginning of a sidecar element.

  • image: The image for the sidecar container.

A sidecar that shares information with the Vertica server process must persist data between pod life cycles. The following section mounts a custom volume in the sidecar filesystem.

Mounting custom volumes

You might need to mount a custom volume to persist data between pod life cycles if an external service requires long-term access to your Vertica server data.

Use the volumeMounts.* parameters to mount one or more custom volumes. To mount a custom volume for the Vertica server container, add the volumeMounts.* values directly under spec. To mount a custom volume for a sidecar container, nest the volumeMounts.* values in the sidecars array as part of an individual sidecar element definition.

The volumes.* parameters make the custom volume available to the CR to mount in the appropriate container filesystem. Indent each volumes entry to the same level as its corresponding volumeMounts entry. The following example mounts custom volumes for both the Vertica server container and the sidecar utility container:

spec:
  ...
  volumeMounts:
  - name: tenants-vol
    mountPath: /path/to/tenants-vol
  volumes:
    - name: tenants-vol
      persistentVolumeClaim:
        claimName: vertica-pvc
  ...
  sidecars:
    - name: sidecar-container
      image: sidecar-image:latest
      volumeMounts:
        - name: sidecar-vol
          mountPath: /path/to/sidecar-vol
      volumes:
        - name: sidecar-vol
          emptyDir: {}

The previous example defines the following:

  • volumes: Accepts a list of custom volumes and volume types to persist data for a container.

  • volumes.name: The name of the custom volume that persists data. This value must match the corresponding volumeMounts.name value.

  • persistentVolumeClaim and emptyDir: The volume type and its configuration. The Vertica custom resource accepts any Kubernetes volume type.
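
Because any Kubernetes volume type is accepted, the definitions above are interchangeable with other types. For example, a sidecar volume backed by a hypothetical ConfigMap named sidecar-config would look like this:

volumes:
  - name: sidecar-vol
    configMap:
      name: sidecar-config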

Local container information

Each container persists catalog, depot, configuration, and log data in a PersistentVolume (PV). You must provide information about the data and depot locations for operations such as pod rescheduling:

spec:
  ...
  local:
    dataPath: /data
    depotPath: /depot
    requestSize: 500Gi

The previous example defines the following:

  • dataPath: Where the /data directory is mounted in the container filesystem. The /data directory stores the local catalogs and temporary files.

  • depotPath: Where the depot is mounted in the container filesystem. Eon Mode databases cache data locally in a depot to reduce the time it takes to fetch data from communal storage to perform operations.

  • requestSize: The minimum size of the local data volume that must be available when binding a PV to the pod.

You must configure a StorageClass to bind the pods to a PersistentVolumeClaim (PVC). For details, see Containerized Vertica on Kubernetes.
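
The StorageClass itself is defined outside the custom resource. The following is a minimal sketch that assumes the AWS EBS CSI driver is installed in your cluster; adjust the provisioner for your platform:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vertica-storage
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer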

Shard count

The shardCount setting specifies the number of shards in the database:

spec:
  ...
  shardCount: 12

You cannot change this value after you instantiate the CR. When you change the number of pods in a subcluster, or add or remove a subcluster, the operator rebalances shards automatically.

For guidance on selecting the shard count, see Configuring your Vertica cluster for Eon Mode.

Subcluster definition

The subclusters section is a list of elements, where each element represents a subcluster and its properties:

spec:
  ...
  subclusters:
  - isPrimary: true
    name: primary-subcluster
    size: 3

The previous example defines the following:

  • isPrimary: Designates a subcluster as primary or secondary. Each CR requires a primary subcluster or it returns an error. For details, see Subclusters.

  • name: The name of the subcluster.

  • size: The number of pods in the subcluster.
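
To add a secondary subcluster, append another element to the list and set isPrimary to false. The name and size below are illustrative:

spec:
  ...
  subclusters:
  - isPrimary: true
    name: primary-subcluster
    size: 3
  - isPrimary: false
    name: secondary-subcluster
    size: 3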

Subcluster service object

Each subcluster communicates with external clients and internal pods through a service object:

spec:
  ...
  subclusters:
    ...
    serviceName: connections
    serviceType: LoadBalancer
    serviceAnnotations:
      service.beta.kubernetes.io/load-balancer-source-ranges: 10.0.0.0/24

In the previous example:

  • serviceName: Assigns a custom name to the service object so that you can use the same service object for multiple subclusters, if needed.

    Service object names use the metadata.name-serviceName naming convention. This example creates a service object named verticadb-connections.

  • serviceType: Defines the subcluster service object.

    By default, a subcluster uses the ClusterIP serviceType, which sets a stable IP and port that are accessible only from within Kubernetes. In many circumstances, external client applications need to connect to a subcluster that is fine-tuned for that specific workload. For external client access, set the serviceType to NodePort or LoadBalancer.

    The LoadBalancer service type is managed by your cloud provider. For implementation details, refer to the Kubernetes documentation and your cloud provider's documentation.

  • serviceAnnotations: Assigns a custom annotation to the service. This annotation defines the CIDRs that can access the network load balancer (NLB). For additional details, see the AWS Load Balancer Controller documentation.
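
After the operator creates the service object, you can confirm its type and external address. Following the naming convention described above, the service in this example is named verticadb-connections:

    $ kubectl get service verticadb-connections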

For details about Vertica and service objects, see Containerized Vertica on Kubernetes.

Pod resource limits and requests

Set the amount of CPU and memory resources each host node allocates for the Vertica server pod, and the amount of resources each pod can request:

spec:
  ...
  subclusters:
    ...
    resources:
      limits:
        cpu: 32
        memory: 96Gi
      requests:
        cpu: 32
        memory: 96Gi

In the previous example:

  • resources: The amount of resources each pod requests from its host node. When you change resource settings, Kubernetes restarts each pod with the updated resource configuration.

  • limits: The maximum amount of CPU and memory that each server pod can consume.

  • requests: The amount of CPU and memory resources that each pod requests from its host node.

    For guidance on setting production limits and requests, see Recommendations for Sizing Vertica Nodes and Clusters.

    As a best practice, set the resource request and limit to equal values so that the pods are assigned the Guaranteed QoS class. Equal settings also provide the best safeguard against the Out Of Memory (OOM) Killer in constrained environments. A verification command is shown below.
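
To confirm the assigned QoS class after the pods are scheduled, inspect the pod status. The pod name is a placeholder:

    $ kubectl get pod verticadb-primary-subcluster-0 -o jsonpath='{.status.qosClass}'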

Node affinity

Kubernetes provides affinity and anti-affinity settings to control which resources the operator uses to schedule pods. As a best practice, set affinity to ensure that a single node does not serve two Vertica pods:

spec:
  ...
  subclusters:
    ...
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - vertica
          topologyKey: "kubernetes.io/hostname"

In the previous example:

  • affinity: Provides control over pod and host scheduling using labels.

  • podAntiAffinity: Uses pod labels to prevent scheduling on certain resources.

  • requiredDuringSchedulingIgnoredDuringExecution: The rules defined under this statement must be met before a pod is scheduled on a host node.

  • labelSelector: Identifies the pods affected by this affinity rule.

  • matchExpressions: A list of pod selector requirements that consists of a key, operator, and values definition. This matchExpression rule checks whether the host node is already running a pod that carries the app.kubernetes.io/name: vertica label.

  • topologyKey: Defines the scope of the rule. Because this example uses the kubernetes.io/hostname topology label, the rule is applied in terms of pods and host nodes.
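
To verify that the anti-affinity rule spread the pods across nodes, list the pods with their node assignments and confirm that no two Vertica pods share a NODE value:

    $ kubectl get pods -o wide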

Complete file reference

As a reference, below is the complete CR YAML file:

apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb
spec:
  image: vertica/vertica-k8s:latest
  imagePullPolicy: Always
  initPolicy: Create
  kSafety: "1"
  licenseSecret: vertica-license
  superuserPasswordSecret: su-passwd
  communal:
    credentialSecret: s3-creds
    endpoint: https://path/to/s3-endpoint
    path: s3://bucket-name/key-name
    caFile: /certs/aws-cert/root_cert.pem
    region: aws-region
  volumeMounts:
  - name: tenants-vol
    mountPath: /path/to/tenants-vol
  volumes:
    - name: tenants-vol
      persistentVolumeClaim:
        claimName: vertica-pvc
  sidecars:
    - name: sidecar-container
      image: sidecar-image:latest
      volumeMounts:
        - name: sidecar-vol
          mountPath: /path/to/sidecar-vol
      volumes:
        - name: sidecar-vol
          emptyDir: {}
  certSecrets:
    - name: mtls
    - name: aws-cert
  local:
    dataPath: /data
    depotPath: /depot
    requestSize: 500Gi
  shardCount: 12
  subclusters:
  - isPrimary: true
    name: primary-subcluster
    size: 3
    serviceName: connections
    serviceType: LoadBalancer
    serviceAnnotations:
      service.beta.kubernetes.io/load-balancer-source-ranges: 10.0.0.0/24
    resources:
      limits:
        cpu: 32
        memory: 96Gi
      requests:
        cpu: 32
        memory: 96Gi
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - vertica
          topologyKey: "kubernetes.io/hostname"
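
To create the custom resource from this file, apply it with kubectl. The filename is an assumption; use the name that you saved the file as:

    $ kubectl apply -f verticadb.yaml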