This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

VerticaDB operator

The Vertica operator automates error-prone and time-consuming tasks that a Vertica on Kubernetes administrator must otherwise perform manually.

The Vertica operator automates error-prone and time-consuming tasks that a Vertica on Kubernetes administrator must otherwise perform manually. The operator:

  • Installs Vertica

  • Creates an Eon Mode database

  • Upgrades Vertica

  • Revives an existing Eon Mode database

  • Restarts and reschedules DOWN pods

  • Scales subclusters

  • Manages services for pods

  • Monitors pod health

  • Handles load balancing for internal and external traffic

The Vertica operator is a Go binary that uses the SDK operator framework. It runs in its own pod, and is cluster-scoped to manage any resource objects in any namespace across the cluster.

For details about installing and upgrading the operator, see Installing the VerticaDB operator.

Monitoring desired state

Because the operator is cluster-scoped, each cluster is allowed one operator pod that acts as a custom controller and monitors the state of the custom resource objects within all namespaces across the cluster. The operator uses the control loop mechanism to reconcile state changes by investigating state change notifications from the custom resource instance, and periodically comparing the current state with the desired state.

If the operator detects a change in the desired state, it determines what change occurred and reconciles the current state with the new desired state. For example, if the user deletes a subcluster from the custom resource instance and successfully saves the changes, the operator deletes the corresponding subcluster objects in Kubernetes.

Validating state changes

All VerticaDB operator installation options include an admission controller, which uses a webhook to prevent invalid state changes to the custom resource. When you save a change to a custom resource, the admission controller webhook queries a REST endpoint that provides rules for mutable states in a custom resource. If a change violates the state rules, the admission controller prevents the change and returns an error. For example, it returns an error if you try to save a change that violates K-Safety.

Limitations

The operator has the following limitations:

The VerticaDB operator 2.0.0 does not use Administration tools (admintools) with API version v1. The following features require admintools commands, so they are not available with that operator version and API version configuration:

To use these features with operator 2.0.0, you must a lower server version.

1 - Installing the VerticaDB operator

The custom resource definition (CRD), DB operator, and admission controller work together to maintain the state of your environment and automate tasks:.

The VerticaDB operator is a custom controller that monitors CR instances to maintain the desired state of VerticaDB objects. The operator includes an admission controller, which is a webhook that queries a REST endpoint to verify changes to mutable states in a CR instance.

By default, the operator is cluster-scoped—you can deploy one operator per cluster to monitor objects across all namespaces in the cluster. For flexibility, Vertica also provides a Helm chart deployment option that installs the operator at the namespace level.

Installation options

Vertica provides the following options to install the VerticaDB operator and admission controller:

  • Helm charts. Helm is a package manager for Kubernetes. The Helm chart option is the most common installation method and lets you customize your TLS configuration and environment setup. For example, Helm chart installations include operator logging levels and log rotation policy. For details about additional options, see Helm chart parameters.

    Vertica also provides the Quickstart Helm chart option so that you can get started quickly with minimal requirements.

  • kubectl installation. Apply the Custom Resource Definitions (CRDs) and VerticaDB operator directly. You can use the kubectl tool to apply the latest CRD available on vertica-kubernetes GitHub repository.

  • OperatorHub.io. This is a registry that lets vendors share Kubernetes operators.

Helm charts

Vertica packages the VerticaDb operator and admission controller in a Helm chart. The following sections detail different installation methods so that you can install the operator to meet your environment requirements. You can customize your operator during and after installation with Helm chart parameters.

For additional details about Helm, see the Helm documentation.

Prerequisites

Quickstart installation

The quickstart installation installs the VerticaDB Helm chart with minimal commands. This deployment installs the operator in the default configuration, which includes the following:

  • Cluster-scoped webhook and controllers that monitor resources across all namespaces in the cluster. For namespace-scoped deployments, see Namespace-scoped installation.
  • Self-signed certificates to communicate with the Kubernetes API server. If your environment requires custom certificates, see Custom certificate installation.

To quickly install the Helm chart, you must add the latest chart to your local repository and then install it in a namespace:

  1. The add command downloads the chart to your local repository, and the update command gets the latest charts from the remote repository. When you add the Helm chart to your local chart repository, provide a descriptive name for future reference.

    The following add command names the charts vertica-charts:

    $ helm repo add vertica-charts https://vertica.github.io/charts
      "vertica-charts" has been added to your repositories
    $ helm repo update
      Hang tight while we grab the latest from your chart repositories...
      ...Successfully got an update from the "vertica-charts" chart repository
      Update Complete. ⎈Happy Helming!
  2. Install the Helm chart to deploy the VerticaDB operator in your cluster. The following command names this chart instance vdb-op, and creates a default namespace for the operator if it does not already exist:
    $ helm install vdb-op --namespace verticadb-operator --create-namespace vertica-charts/verticadb-operator
    

For helm install options, see the Helm documentation.

Namespace-scoped installation

By default, the VerticaDB operator is cluster-scoped. However, Vertica provides an option to install a namespace-scoped operator for environments that require more granular control over which resources an operator watches for state changes.

The VerticaDB operator includes a webhook and controllers. The webhook is cluster-scoped and verifies state changes for resources across all namespaces in the cluster. The controllers—the control loops that reconcile the current and desired states for resources—do not have a cluster-scope requirement, so you can install them at the namespace level. The namespace-scoped operator installs the webhook once at the cluster level, and then installs the controllers in the specified namespace. You can install these namespaced controllers in multiple namespaces per cluster.

To install a namespace-scoped operator, add the latest chart to your respository and issue separate commands to deploy the webhook and controllers:

  1. The add command downloads the chart to your local repository, and the update command gets the latest charts from the remote repository. When you add the Helm chart to your local chart repository, provide a descriptive name for future reference.

    The following add command names the charts vertica-charts:

    $ helm repo add vertica-charts https://vertica.github.io/charts
      "vertica-charts" has been added to your repositories
    $ helm repo update
      Hang tight while we grab the latest from your chart repositories...
      ...Successfully got an update from the "vertica-charts" chart repository
      Update Complete. ⎈Happy Helming!
  2. Deploy the cluster-scoped webhook and install the required CRDs. To deploy the operator as a webhook without controllers, set controllers.enable to false. The following command deploys the webhook to the vertica namespace, which is the namespace for a Vertica cluster:

    $ helm install webhook vertica-charts/verticadb-operator --namespace vertica --set controllers.enable=false
    
  3. Deploy the namespace-scoped operator. To prevent a second webhook installation, set webhook.enable to false. To deploy only the controllers, set controllers.scope to namespace. The following command installs the operator in the default namespace:

    $ helm install vdb-op vertica-charts/verticadb-operator --namespace default --set webhook.enable=false,controllers.scope=namespace
    

For details about the controllers.* parameter settings, see Helm chart parameters. For helm install options, see the Helm documentation.

Custom certificate installation

The admission controller uses a webhook that communicates with the Kubernetes API over HTTPS. By default, the Helm chart generates a self-signed certificate before installing the admission controller. A self-signed certificate might not be suitable for your environment—you might require custom certificates that are signed by a trusted third-party certificate authority (CA).

To add custom certificates for the webhook:

  1. Set the TLS key's Subjective Alternative Name (SAN) to the admission controller's fully-qualified domain name (FQDN). Set the SAN in a configuration file using the following format:

    [alt_names]
    DNS.1 = verticadb-operator-webhook-service.operator-namespace.svc
    DNS.2 = verticadb-operator-webhook-service.operator-namespace.svc.cluster.local
    
  2. Create a Secret that contains the certificates. A Secret conceals your certificates when you pass them as command-line parameters.

    The following command creates a Secret named tls-secret. It stores the TLS key, TLS certificate, and CA certificate:

    $ kubectl create secret generic tls-secret --from-file=tls.key=/path/to/tls.key --from-file=tls.crt=/path/to/tls.crt --from-file=ca.crt=/path/to/ca.crt
    
  3. Install the Helm chart.

    The add command downloads the chart to your local repository, and the update command gets the latest charts from the remote repository. When you add the Helm chart to your local chart repository, provide a descriptive name for future reference.

    The following add command names the charts vertica-charts:

    $ helm repo add vertica-charts https://vertica.github.io/charts
      "vertica-charts" has been added to your repositories
    $ helm repo update
      Hang tight while we grab the latest from your chart repositories...
      ...Successfully got an update from the "vertica-charts" chart repository
      Update Complete. ⎈Happy Helming!

    When you install the Helm chart with custom certificates for the admission controller, you have to use the webhook.certSource and webhook.tlsSecret Helm chart parameters:

    • webhook.certSource indicates whether you want the admission controller to install user-provided certificates. To install with custom certificates, set this parameter to secret.
    • webhook.tlsSecret accepts a Secret that contains your certificates.

    The following command deploys the operator with the TLS certificates and creates namespace if it does not already exist:

    $ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
        --set webhook.certSource=secret \
        --set webhook.tlsSecret=tls-secret
    

Granting user privileges

After the operator is deployed, the cluster administrator is the only user with privileges to create and modify VerticaDB CRs within the cluster. To grant other users the privileges required to work with custom resources, you can leverage namespaces and Kubernetes RBAC.

To grant these privileges, the cluster administrator creates a namespace for the user, then grants that user edit ClusterRole within that namespace. Next, the cluster administrator creates a Role with specific CR privileges, and binds that role to the user with a RoleBinding. The cluster administrator can repeat this process for each user that must create or modify VerticaDB CRs within the cluster.

To provide a user with privileges to create or modify a VerticaDB CR:

  1. Create a namespace for the application developer:

    $ kubectl create namespace user-namespace
    namespace/user-namespace created
    
  2. Grant the application developer edit role privileges in the namespace:

    $ kubectl create --namespace user-namespace rolebinding edit-access --clusterrole=edit --user=username
    rolebinding.rbac.authorization.k8s.io/edit-access created
    
  3. Create the Role with privileges to create and modify any CRs in the namespace. Vertica provides the verticadb-operator-cr-user-role.yaml file that defines these rules:

    $ kubectl --namespace user-namespace apply -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/verticadb-operator-cr-user-role.yaml
    role.rbac.authorization.k8s.io/vertica-cr-user-role created
    

    Verify the changes with kubectl get:

    $ kubectl get roles --namespace user-namespace
    NAME                   CREATED AT
    vertica-cr-user-role   2023-11-30T19:37:24Z
    
  4. Create a RoleBinding that associates this Role to the user. The following command creates a RoleBinding named vdb-access:

    $ kubectl create --namespace user-namespace rolebinding vdb-access --role=vertica-cr-user-role --user=username
    rolebinding.rbac.authorization.k8s.io/rolebinding created
    

    Verify the changes with kubectl get:

    $ kubectl get rolebinding --namespace user-namespace
    NAME          ROLE                        AGE
    edit-access   ClusterRole/edit            16m
    vdb-access    Role/vertica-cr-user-role   103s
    

Now, the user associated with username has access to create and modify VerticaDB CRs in the isolated user-namespace.

kubectl installation

You can install the VerticaDB operator from GitHub by applying the YAML manifests with the kubectl command-line tool:

  1. Install all Custom resource definitions. Because the size of the CRD is too large for client-side operations, you must use the server-side=true and --force-conflicts options to apply the manifests:

    kubectl apply --server-side=true --force-conflicts -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/crds.yaml
    

    For additional details about these commands, see Server-Side Apply documentation.

  2. Install the VerticaDB operator:
    $ kubectl apply -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/operator.yaml
    

OperatorHub.io

OperatorHub.io is a registry that allows vendors to share Kubernetes operators. Each vendor must adhere to packaging guidelines to simplify user adoption.

To install the VerticaDB operator from OperatorHub.io, navigate to the Vertica operator page and follow the install instructions.

2 - Upgrading the VerticaDB operator

Vertica supports two separate options to upgrade the VerticaDB operator:.

Vertica supports two separate options to upgrade the VerticaDB operator:

  • OperatorHub.io

  • Helm Charts

Prerequisites

OperatorHub.io

The Operator Lifecycle Manager (OLM) operator manages upgrades for OperatorHub.io installations. You can configure the OLM operator to upgrade the VerticaDB operator manually or automatically with the Subscription object's spec.installPlanApproval parameter.

Automatic upgrade

To configure automatic version upgrades, set spec.installPlanApproval to Automatic, or omit the setting entirely. When the OLM operator refreshes the catalog source, it installs the new VerticaDB operator automatically.

Manual upgrade

Upgrade the VerticaDB operator manually to approve version upgrades for specific install plans. To manually upgrade, set spec.installPlanApproval parameter to Manual and complete the following:

  1. Verify if there is an install plan that requires approval to proceed with the upgrade:

    $ kubectl get installplan
    NAME CSV APPROVAL APPROVED
    install-ftcj9 verticadb-operator.v1.7.0 Manual false
    install-pw7ph verticadb-operator.v1.6.0 Manual true
    

    The command output shows that the install plan install-ftcj9 for VerticaDB operator version 1.7.0 is not approved.

  2. Approve the install plan with a patch command:

    $ kubectl patch installplan install-ftcj9 --type=merge --patch='{"spec": {"approved": true}}'
    installplan.operators.coreos.com/install-ftcj9 patched
    

    After you set the approval, the OLM operator silently upgrades the VerticaDB operator.

  3. Optional. To monitor its progress, inspect the STATUS column of the Subscription object:

    $ kubectl describe subscription subscription-object-name
    

Helm charts

You must have cluster administrator privileges to upgrade the VerticaDB operator with Helm charts.

The Helm chart includes the CRD, but the helm install command does not overwrite an existing CRD. To upgrade the operator, you must update the CRD with the manifest from the GitHub repository.

Additionally, you must upgrade all custom resource definitions, even if you do deploy them in your environment. These CRDs are installed with the operator and maintained as separate YAML manifests. Upgrading all CRDs ensure that your operator is upgraded completely.

You can upgrade the CRDs and VerticaDB operator from GitHub by applying the YAML manifests with the kubectl command-line tool:

  1. Install all Custom resource definitions. Because the size of the CRD is too large for client-side operations, you must use the server-side=true and --force-conflicts options to apply the manifests:

    kubectl apply --server-side=true --force-conflicts -f https://github.com/vertica/vertica-kubernetes/releases/latest/download/crds.yaml
    

    For additional details about these commands, see Server-Side Apply documentation.

  2. Upgrade the Helm chart:

    $ helm upgrade operator-name --wait vertica-charts/verticadb-operator
    

3 - Helm chart parameters

The following table describes the available settings for the VerticaDB operator and admission controller Helm chart.

The following list describes the available settings for the VerticaDB operator and admission controller Helm chart:

affinity
Applies rules that constrain the VerticaDB operator to specific nodes. It is more expressive than nodeSelector. If this parameter is not set, then the operator uses no affinity setting.
controllers.enable
Determines whether controllers are enabled when running the operator. Controllers watch and act on custom resources within the cluster.

For namespace-scoped operators, set this to false. This deploys the cluster-scoped operator only as a webhook, and then you can set webhook.enable to false and deploy the controllers to an individual namespace. For details, see Installing the VerticaDB operator.

Default: true

controllers.scope
Scope of the controllers in the VerticaDB operator. Controllers watch and act on custom resources within the cluster. This parameter accepts the following values:
  • cluster: The controllers watch for changes to all resources across all namespaces in the cluster.
  • namespace: The controllers watch for changes to resources only in the namespace specified during deployment. You must deploy the operator as a webhook for the cluster, then deploy the operator controllers in a namespace. You can deploy multiple namespace-scoped operators within the same cluster.

For details, see Installing the VerticaDB operator.

Default: cluster

image.name
Name of the image that runs the operator.

Default: vertica/verticadb-operator:version

imagePullSecrets
List of Secrets that store credentials to authenticate to the private container repository specified by image.repo and rbac_proxy_image. For details, see Specifying ImagePullSecrets in the Kubernetes documentation.
image.repo
Server that hosts the repository that contains image.name. Use this parameter for deployments that require control over a private hosting server, such as an air-gapped operator.

Use this parameter with rbac_proxy_image.name and rbac_proxy_image.repo.

Default: docker.io

logging.filePath

Path to a log file in the VerticaDB operator filesystem. If this value is not specified, Vertica writes logs to standard output.

Default: Empty string (' ') that indicates standard output.

logging.level
Minimum logging level. This parameter accepts the following values:
  • debug

  • info

  • warn

  • error

Default: info

logging.maxFileSize

When logging.filePath is set, the maximum size in MB of the logging file before log rotation occurs.

Default: 500

logging.maxFileAge

When logging.filePath is set, the maximum age in days of the logging file before log rotation deletes the file.

Default: 7

logging.maxFileRotation

When logging.filePath is set, the maximum number of files that are kept in rotation before the old ones are removed.

Default: 3

nameOverride
Sets the prefix for the name assigned to all objects that the Helm chart creates.

If this parameter is not set, each object name begins with the name of the Helm chart, verticadb-operator.

nodeSelector
Controls which nodes are used to schedule the operator pod. If this is not set, the node selector is omitted from the operator pod when it is created. To set this parameter, provide a list of key/value pairs.

The following example schedules the operator only on nodes that have the region=us-east label:

nodeSelector:
      region: us-east
  
priorityClassName
PriorityClass name assigned to the operator pod. This affects where the pod is scheduled.
prometheus.createProxyRBAC
When set to true, creates role-based access control (RBAC) rules that authorize access to the operator's /metrics endpoint for the Prometheus integration.

Default: true

prometheus.createServiceMonitor

When set to true, creates the ServiceMonitor custom resource for the Prometheus operator. You must install the Prometheus operator before you set this to true and install the Helm chart.

For details, see the Prometheus operator GitHub repository.

Default: false

prometheus.expose
Configures the operator's /metrics endpoint for the Prometheus integration. The following options are valid:
  • EnableWithAuthProxy: Creates a new service object that exposes an HTTPS /metrics endpoint. The RBAC proxy controls access to the metrics.

  • EnableWithoutAuth: Creates a new service object that exposes an HTTP /metrics endpoint that does not authorize connections. Any client with network access can read the metrics.

  • Disable: Prometheus metrics are not exposed.

Default: Disable

prometheus.tlsSecret
Secret that contains the TLS certificates for the Prometheus /metrics endpoint. You must create this Secret in the same namespace that you deployed the Helm chart.

The Secret requires the following values:

  • tls.key: TLS private key

  • tls.crt: TLS certificate for the private key

  • ca.crt: Certificate authority (CA) certificate

To ensure that the operator uses the certificates in this parameter, you must set prometheus.expose to EnableWithAuthProxy.

If prometheus.expose is not set to EnableWithAuthProxy, then this parameter is ignored, and the RBAC proxy sidecar generates its own self-signed certificate.

rbac_proxy_image.name
Name of the Kubernetes RBAC proxy image that performs authorization. Use this parameter for deployments that require authorization by a proxy server, such as an air-gapped operator.

Use this parameter with image.repo and rbac_proxy_image.repo.

Default: kubebuilder/kube-rbac-proxy:v0.11.0

rbac_proxy_image.repo
Server that hosts the repository that contains rbac_proxy_image.name. Use this parameter for deployments that perform authorization by a proxy server, such as an air-gapped operator.

Use this parameter with image.repo and rbac_proxy_image.name.

Default: gcr.io

reconcileConcurrency.verticaautoscaler
Number of concurrent reconciliation loops the operator runs for all VerticaAutoscaler CRs in the cluster.
reconcileConcurrency.verticadb
Number of concurrent reconciliation loops the operator runs for all VerticaDB CRs in the cluster.
reconcileConcurrency.verticaeventtrigger
Number of concurrent reconciliation loops the operator runs for all EventTrigger CRs in the cluster.
resources.limits and resources.requests
The resource requirements for the operator pod.

resources.limits is the maximum amount of CPU and memory that an operator pod can consume from its host node.

resources.requests is the maximum amount of CPU and memory that an operator pod can request from its host node.

Defaults:

resources:
  limits:
    cpu: 100m
    memory: 750Mi
  requests:
    cpu: 100m
    memory: 20Mi
  
serviceAccountAnnotations
Map of annotations that is added to the service account created for the operator.
serviceAccountNameOverride
Controls the name of the service account created for the operator.
tolerations
Any taints and tolerations that influence where the operator pod is scheduled.
webhook.certSource
How TLS certificates are provided for the admission controller webhook. This parameter accepts the following values:
  • internal: The VerticaDB operator internally generates a self-signed, 10-year expiry certificate before starting the managing controller. When the certificate expires, you must manually restart the operator pod to create a new certificate.

  • secret: You generate the custom certificates before you create the Helm chart and store them in a Secret. This option requires that you set webhook.tlsSecret.

    If webhook.tlsSecret is set, then this option is implicitly selected.

Default: internal

For details, see Installing the VerticaDB operator.

webhook.enable
Determines whether the Helm chart installs the admission controller webhooks for the custom resource definitions. The webhook is cluster-scoped, and you can install only one webhook per cluster.

If your environment uses namespace-scoped operators, you must install the webhook for the cluster, then disable the webhook for each namespace installation. For details, see Installing the VerticaDB operator.

Default: true

webhook.tlsSecret
Secret that contains a PEM-encoded certificate authority (CA) bundle and its keys.

The CA bundle validates the webhook's server certificate. If this is not set, the webhook uses the system trust roots on the apiserver.

This Secret includes the following keys for the CA bundle:

  • tls.key

  • ca.crt

  • tls.crt

4 - Red Hat OpenShift integration

Red Hat OpenShift is a hybrid cloud platform that provides enhanced security features and greater control over the Kubernetes cluster.

Red Hat OpenShift is a hybrid cloud platform that provides enhanced security features and greater control over the Kubernetes cluster. In addition, OpenShift provides the OperatorHub, a catalog of operators that meet OpenShift requirements.

For comprehensive instructions about the OpenShift platform, refer to the Red Hat OpenShift documentation.

Enhanced security with security context constraints

To enforce security measures, OpenShift requires that each deployment use a security context constraint (SCC). Vertica on Kubernetes supports the restricted-v2 SCC, the most restrictive default SCC available.

The SCC lets administrators control the privileges of the pods in a cluster without manual configuration. For example, you can restrict namespace access for specific users in a multi-user environment.

Installing the operator

The VerticaDB operator is a community operator that is maintained by Vertica. Each operator available in the OperatorHub must adhere to requirements defined by the Operator Lifecycle Manager (OLM). To meet these requirements, vendors must provide a cluster service version (CSV) manifest for each operator. Vertica provides a CSV for each version of the VerticaDB operator available in the OpenShift OperatorHub.

The VerticaDB operator supports OpenShift versions 4.8 and higher.

You must have cluster-admin privileges on your OpenShift account to install the VerticaDB operator. For detailed installation instructions, refer to the OpenShift documentation.

Deploying Vertica on OpenShift

After you installed the VerticaDB operator and added a supported SCC to your Vertica workloads service account, you can deploy Vertica on OpenShift.

For details about installing OpenShift in supported environments, see the OpenShift Container Platform installation overview.

Before you deploy Vertica on OpenShift, create the required Secrets to store sensitive information. For details about Secrets and OpenShift, see the OpenShift documentation. For guidance on deploying a Vertica custom resource, see VerticaDB custom resource definition.

5 - Prometheus integration

Vertica on Kubernetes integrates with Prometheus to scrape time series metrics about the VerticaDB operator.

Vertica on Kubernetes integrates with Prometheus to scrape time series metrics about the VerticaDB operator and Vertica server process. These metrics create a detailed model of your application over time to provide valuable performance and troubleshooting insights as well as facilitate internal and external communications and service discovery in microservice and containerized architectures.

Prometheus requires that you set up targets—metrics that you want to monitor. Each target is exposed on an endpoint, and Prometheus periodically scrapes that endpoint to collect target data. Vertica exports metrics and provides access methods for both the VerticaDB operator and server process.

Server metrics

Vertica exports server metrics on port 8443 at the following endpoint:

https://host-address:8443/api-version/metrics

Only the superuser can authenticate to the HTTPS service, and the service accepts only mutual TLS (mTLS) authentication. The setup for both Vertica on Kubernetes and non-containerized Vertica environments is identical. For details, see HTTPS service.

Vertica on Kubernetes lets you set a custom port for its HTTP service with the subclusters[i].verticaHTTPNodePort custom resource parameter. This parameter sets a custom port for the HTTPS service for NodePort serviceTypes.

For request and response examples, see the /metrics endpoint description. For a list of available metrics, see Prometheus metrics.

Grafana dashboards

You can visualize Vertica server time series metrics with Grafana dashboards. Vertica dashboards that use a Prometheus data source are available at Grafana Dashboards:

You can also download the source for each dashboard from the vertica/grafana-dashboards repository.

Operator metrics

The VerticaDB operator supports the Operator SDK framework, which requires that an authorization proxy impose role-based-access control (RBAC) to access operator metrics over HTTPS. To increase flexibility, Vertica provides the following options to access the Prometheus /metrics endpoint:

  • HTTPS access: Meet operator SDK requirements and use a sidecar container as an RBAC proxy to authorize connections.

  • HTTP access: Expose the /metrics endpoint to external connections without RBAC. Any client with network access can read from /metrics.

  • Disable Prometheus entirely.

Vertica provides Helm chart parameters and YAML manifests to configure each option.

Prerequisites

HTTPS with RBAC

The operator SDK framework requires that operators use an authorization proxy for metrics access. Because the operator sends metrics to localhost only, Vertica meets these requirements with a sidecar container with localhost access that enforces RBAC.

RBAC rules are cluster-scoped, and the sidecar authorizes connections from clients associated with a service account that has the correct ClusterRole and ClusterRoleBindings. Vertica provides the following example manifests:

For additional details about ClusterRoles and ClusterRoleBindings, see the Kubernetes documentation.

Create RBAC rules

The following steps create the ClusterRole and ClusterRoleBindings objects that grant access to the /metrics endpoint to a non-Kubernetes resource such as Prometheus. Because RBAC rules are cluster-scoped, you must create or add to an existing ClusterRoleBinding:

  1. Create a ClusterRoleBinding that binds the role for the RBAC sidecar proxy with a service account:

    • Create a ClusterRoleBinding:

      $ kubectl create clusterrolebinding verticadb-operator-proxy-rolebinding \
          --clusterrole=verticadb-operator-proxy-role \
          --serviceaccount=namespace:serviceaccount
      
    • Add a service account to an existing ClusterRoleBinding:

      $ kubectl patch clusterrolebinding verticadb-operator-proxy-rolebinding \
          --type='json' \
          -p='[{"op": "add", "path": "/subjects/-", "value": {"kind": "ServiceAccount", "name": "serviceaccount","namespace": "namespace" } }]'
      
  2. Create a ClusterRoleBinding that binds the role for the non-Kubernetes object to the RBAC sidecar proxy service account:

    • Create a ClusterRoleBinding:

      $ kubectl create clusterrolebinding verticadb-operator-metrics-reader \
          --clusterrole=verticadb-operator-metrics-reader \
          --serviceaccount=namespace:serviceaccount \
          --group=system:authenticated
      
    • Bind the service account to an existing ClusterRoleBinding:

      $ kubectl patch clusterrolebinding verticadb-operator-metrics-reader \
          --type='json' \
          -p='[{"op": "add", "path": "/subjects/-", "value": {"kind": "ServiceAccount", "name": "serviceaccount","namespace": "namespace"},{"op":"add","path":"/subjects/-","value":{"kind": "Group", "name": "system:authenticated"} }]'
      
      $ kubectl patch clusterrolebinding verticadb-operator-metrics-reader \
          --type='json' \
          -p='[{"op": "add", "path": "/subjects/-", "value": {"kind": "ServiceAccount", "name": "serviceaccount","namespace": "namespace" } }]'
      

When you install the Helm chart, the ClusterRole and ClusterRoleBindings are created automatically. By default, the prometheus.expose parameter is set to EnableWithProxy, which creates the service object and exposes the operator's /metrics endpoint.

For details about creating a sidecar container, see VerticaDB custom resource definition.

Service object

Vertica provides a service object verticadb-operator-metrics-service to access the Prometheus /metrics endpoint. The VerticaDB operator does not manage this service object. By default, the service object uses the ClusterIP service type to support RBAC.

Connect to the /metrics endpoint at port 8443 with the following path:

https://verticadb-operator-metrics-service.namespace.svc.cluster.local:8443/metrics

Bearer token authentication

Kubernetes authenticates requests to the API server with service account credentials. Each pod is associated with a service account and has the following credentials stored in the filesystem of each container in the pod:

  • Token at /var/run/secrets/kubernetes.io/serviceaccount/token

  • Certificate authority (CA) bundle at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Use these credentials to authenticate to the /metrics endpoint through the service object. You must use the credentials for the service account that you used to create the ClusterRoleBindings.

For example, the following cURL request accesses the /metrics endpoint. Include the --insecure option only if you do not want to verify the serving certificate:

$ curl --insecure --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://verticadb-operator-metrics-service.vertica:8443/metrics

For additional details about service account credentials, see the Kubernetes documentation.

TLS client certificate authentication

Some environments might prevent you from authenticating to the /metrics endpoint with the service account token. For example, you might run Prometheus outside of Kubernetes. To allow external client connections to the /metrics endpoint, you have to supply the RBAC proxy sidecar with TLS certificates.

You must create a Secret that contains the certificates, and then use the prometheus.tlsSecret Helm chart parameter to pass the Secret to the RBAC proxy sidecar when you install the Helm chart. The following steps create the Secret and install the Helm chart:

  1. Create a Secret that contains the certificates:

    $ kubectl create secret generic metrics-tls --from-file=tls.key=/path/to/tls.key --from-file=tls.crt=/path/to/tls.crt --from-file=ca.crt=/path/to/ca.crt
    
  2. Install the Helm chart with prometheus.tlsSecret set to the Secret that you just created:

    $ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
      --set prometheus.tlsSecret=metrics-tls
    

    The prometheus.tlsSecret parameter forces the RBAC proxy to use the TLS certificates stored in the Secret. Otherwise, the RBAC proxy sidecar generates its own self-signed certificate.

After you install the Helm chart, you can authenticate to the /metrics endpoint with the certificates in the Secret. For example:

$ curl --key tls.key --cert tls.crt --cacert ca.crt https://verticadb-operator-metrics-service.vertica.svc:8443/metrics

HTTP access

You might have an environment that does not require privileged access to Prometheus metrics. For example, you might run Prometheus outside of Kubernetes.

To allow external access to the /metrics endpoint with HTTP, set prometheus.expose to EnableWithoutAuth. For example:

$ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
    --set prometheus.expose=EnableWithoutAuth

Service object

Vertica provides a service object verticadb-operator-metrics-service to access the Prometheus /metrics endpoint. The VerticaDB operator does not manage this service object. By default, the service object uses the ClusterIP service type, so you must change the serviceType for external client access. The service object's fully-qualified domain name (FQDN) is as follows:

verticadb-operator-metrics-service.namespace.svc.cluster.local

Connect to the /metrics endpoint at port 8443 with the following path:

http://verticadb-operator-metrics-service.namespace.svc.cluster.local:8443/metrics

Prometheus operator integration (optional)

Vertica on Kubernetes integrates with the Prometheus operator, which provides custom resources (CRs) that simplify targeting metrics. Vertica supports the ServiceMonitor CR that discovers the VerticaDB operator automatically, and authenticates requests with a bearer token.

The ServiceMonitor CR is available as a release artifact in our GitHub repository. See Helm chart parameters for details about the prometheus.createServiceMonitor parameter.

Disabling Prometheus

To disable Prometheus, set the prometheus.expose Helm chart parameter to Disable:

$ helm install operator-name --namespace namespace --create-namespace vertica-charts/verticadb-operator \
    --set prometheus.expose=Disable

For details about Helm install commands, see Installing the VerticaDB operator.

Metrics

The following table describes the available VerticaDB operator metrics:

Name Type Description
controller_runtime_active_workers gauge Number of currently used workers per controller.
controller_runtime_max_concurrent_reconciles gauge Maximum number of concurrent reconciles per controller.
controller_runtime_reconcile_errors_total counter Total number of reconciliation errors per controller.
controller_runtime_reconcile_time_seconds histogram Length of time per reconciliation per controller.
controller_runtime_reconcile_total counter Total number of reconciliations per controller.
controller_runtime_webhook_latency_seconds histogram Histogram of the latency of processing admission requests.
controller_runtime_webhook_requests_in_flight gauge Current number of admission requests being served.
controller_runtime_webhook_requests_total counter Total number of admission requests by HTTP status code.
go_gc_duration_seconds summary A summary of the pause duration of garbage collection cycles.
go_goroutines gauge Number of goroutines that currently exist.
go_info gauge Information about the Go environment.
go_memstats_alloc_bytes gauge Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total counter Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter Total number of frees.
go_memstats_gc_sys_bytes gauge Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge Number of heap bytes that are in use.
go_memstats_heap_objects gauge Number of allocated objects.
go_memstats_heap_released_bytes gauge Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter Total number of pointer lookups.
go_memstats_mallocs_total counter Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge Number of bytes obtained from system.
go_threads gauge Number of OS threads created.
process_cpu_seconds_total counter Total user and system CPU time spent in seconds.
process_max_fds gauge Maximum number of open file descriptors.
process_open_fds gauge Number of open file descriptors.
process_resident_memory_bytes gauge Resident memory size in bytes.
process_start_time_seconds gauge Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge Maximum amount of virtual memory available in bytes.
vertica_cluster_restart_attempted_total counter The number of times we attempted a full cluster restart.
vertica_cluster_restart_failed_total counter The number of times we failed when attempting a full cluster restart.
vertica_cluster_restart_seconds histogram The number of seconds it took to do a full cluster restart.
vertica_nodes_restart_attempted_total counter The number of times we attempted to restart down nodes.
vertica_nodes_restart_failed_total counter The number of times we failed when trying to restart down nodes.
vertica_nodes_restart_seconds histogram The number of seconds it took to restart down nodes.
vertica_running_nodes_count gauge The number of nodes that have a running pod associated with it.
vertica_subclusters_count gauge The number of subclusters that exist.
vertica_total_nodes_count gauge The number of nodes that currently exist.
vertica_up_nodes_count gauge The number of nodes that have vertica running and can accept connections.
vertica_upgrade_total counter The number of times the operator performed an upgrade caused by an image change.
workqueue_adds_total counter Total number of adds handled by workqueue.
workqueue_depth gauge Current depth of workqueue.
workqueue_longest_running_processor_seconds gauge How many seconds has the longest running processor for workqueue been running.
workqueue_queue_duration_seconds histogram How long in seconds an item stays in workqueue before being requested.
workqueue_retries_total counter Total number of retries handled by workqueue.
workqueue_unfinished_work_seconds gauge How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
workqueue_work_duration_seconds histogram How long in seconds processing an item from workqueue takes.

6 - Secrets management

The Kubernetes declarative model requires that you develop applications with manifest files or command line interactions with the Kubernetes API. These workflows expose your sensitive information in your application code and shell history, which compromises your application security.

To mitigate any security risks, Kubernetes uses the concept of a secret to store this sensitive information. A secret is an object with a plain text name and a value stored as a base64 encoded string. When you reference a secret by name, Kubernetes retrieves and decodes its value. This lets you openly reference confidential information in your application code and shell without compromising your data.

Kubernetes supports secret workflows with its native Secret object, and cloud providers offer solutions that store your confidential information in a centralized location for easy management. By default, Vertica on Kubernetes supports native Secrets objects, and it also supports cloud solutions so that you have options for storing your confidential data.

For best practices about handling confidential data in Kubernetes, see the Kubernetes documentation.

Manually encode data

In some circumstances, you might need to manually base64 encode your secret value and add it to a Secret manifest or a cloud service secret manager. You can base64 encode data with tools available in your shell. For example, pass the string value to the echo command, and pipe the output to the base64 command to encode the value. In the echo command, include the -n option so that it does not append a newline character:

$ echo -n 'secret-value' | base64
c2VjcmV0LXZhbHVl

You can take the output of this command and add it to a Secret manifest or cloud service secret manager.

Kubernetes Secrets

A Secret is an Kubernetes object that you can reference by name that conceals confidential data in a base64 encoded string. For example, you can create a Secret named su-password that stores the database superuser password. In a manifest file, you can add su-password in place of the literal password value, and then you can safely store the manifest in a file system or pass it on the command line.

The idiomatic way to create a Secret in Kubernetes is with the kubectl command-line tool's create secret command, which provides options to create Secret object from various data sources. For example, the following command creates a Secret named superuser-password from a literal value passed on the command line:

$ kubectl create secret generic superuser-password \
    --from-literal=password=secret-value
secret/superuser-password created

Instead of creating a Kubernetes Secret with kubectl, you can manually base64 encode a string on the command line, and then add the encoded output to a Secrets manifest.

Cloud providers

Cloud providers offer services that let you store sensitive information in a central location and reference it securely. Vertica on Kubernetes requires a specific format for secrets stored in cloud providers. In addition, each cloud provider requires unique configuration before you can add a secret to your VerticaDB custom resource (CR).

The following VerticaDB CR parameters accept secrets from cloud services:

  • communal.credentialSecret
  • nmaTLSSecret
  • passwordSecret

Format requirements

Cloud provider secrets consist of a name and a secret value. To provide flexibility, cloud services let you store the value in a variety of formats. Vertica on Kubernetes requires that you format the secret value as a JSON document consisting of plain text string keys and base64 encoded values. For example, you might have a secret named tlsSecrets whose value is a JSON document in the following format:

{
  "ca.crt": "base64-endcoded-ca.crt",
  "tls.crt" "base64-endcoded-tls.crt",
  "tls.key": "base64-endcoded-tls.key",
  "password": "base64-endcoded-password"
}

Amazon Web Services

Amazon Web Services (AWS) provides the AWS Secrets Manager, a storage system for your sensitive data. To access secrets from your AWS console, go to Services > Security, Identity, & Compliance > Secrets Manager.

IAM permissions

Before you can add a secret to a CR, you must grant the following permissions to the VerticaDB operator pod and the Vertica server pods so they can access AWS Secret Manager. You can grant these permissions to the worker node's IAM policy or the IAM roles for service account (IRSA):

For instructions about adding permissions to an AWS Secrets Manager secret, see the AWS documentation. For details about Vertica on Kubernetes and AWS IRSA, see Configuring communal storage.

Adding a secret to a CR

AWS stores secrets with metadata that describe and track changes to the secret. An important piece of metadata is the Amazon Resource Name (ARN), a unique identifier for the secret. The ARN uses the following format:

arn:aws:secretsmanager:region:accountId:secret:SecretName-randomChars

To use an AWS secret in a CR, you have to add the ARN to the applicable CR parameter and prefix it with awssm://. For example:

    spec:
      ...
      passwordSecret: awssm://arn:aws:secretsmanager:region:account-id:secret:myPasswordSecret-randomChars
      nmaTLSSecret: awssm://arn:aws:secretsmanager:region:account-id:secret:myNmaTLSSecret-randomChars
      communal:
        credentialSecret: awssm://arn:aws:secretsmanager:region:account-id:secret:myCredentialSecret-randomChars
        path: s3://bucket-name/key-name
        ...

Google Cloud Platform

Google Cloud provides Google Secret Manager, a storage system for your sensitive data. To access your secrets from your Google Cloud console, go to Security > Secret Manager.

When you pass a Google secret as a CRD parameter, use the secret's resource name. The resource name uses the following format:

projects/project-id/secrets/secret-name/versions/version-number

To use a Secret Manager secret in a CR, you have to add the resource name to the applicable CR parameter and prefix it with gsm://. For example:

    spec:
      ...
      passwordSecret: gsm://projects/project-id/secrets/password-secret/versions/version-number
      nmaTLSSecret: gsm://projects/project-id/secrets/nma-certs-secret/versions/version-number
      communal:
        credentialSecret: gsm://projects/project-id/secrets/gcp-creds-secret/versions/version-number
        path: gs://bucket-name/path/to/database-name
        ...