Hybrid Kubernetes clusters

An Eon Mode database can run on hosts both outside of Kubernetes, such as an on-premises cluster, and within Kubernetes. This hybrid architecture is useful in the following scenarios:

Leveraging Kubernetes tooling to quickly create a secondary subcluster for a database.
Creating an isolated sandbox environment to run ad hoc queries on a communal dataset.
Experimenting with OpenText™ Analytics Database on Kubernetes performance overhead without migrating your primary subcluster into Kubernetes.

Define the Kubernetes portion of a hybrid architecture with a custom resource (CR). The custom resource has no knowledge of database hosts that exist separately from the custom resource. This limits the operator's functionality and requires that you manually complete some tasks that the operator automates for a standard OpenText™ Analytics Database on Kubernetes custom resource.

Requirements and restrictions

The hybrid Kubernetes architecture has the following requirements and restrictions:

Hybrid Kubernetes clusters require a tool that enables Border Gateway Protocol (BGP) so that pods are accessible to your on-premises subcluster for external communication. For example, you can use the Calico CNI plugin to enable BGP.
You cannot use network address translation (NAT) between the Kubernetes pods and the on-premises cluster. For example, you can set natOutgoing to false in the Calico IPPool:
```
kubectl patch ippools hybrid-ipv4-ippool --type=merge --patch '{"spec": {"natOutgoing": false}}'
```

Operator limitations

In a hybrid architecture, the operator has no visibility outside of the custom resource. This limited visibility means that the operator cannot interact with the Eon Mode database or the primary subcluster. Within the scope of the custom resource, the operator automates only the following:

Schedules pods when initPolicy is set to ScheduleOnly in the Hybrid CR manifest.
Creates service objects for the subcluster.
Creates a PersistentVolumeClaim (PVC) that persists data for each pod.

In addition, set the autoRestartVertica custom resource parameter to false so that the operator does not interfere with the scaling operation.

Define an on-premises primary cluster

An Eon Mode database can run on hosts in both the on-premises cluster and the Kubernetes cluster.

Install OpenText™ Analytics Database in the on-premises cluster:
```
sudo /opt/vertica/sbin/install_vertica \
    --hosts $HOSTLIST \
    --failure-threshold NONE \
    --point-to-point \
    --data-dir /data \
    --license /path/to/vertica/license.key \
    --accept-eula
```
In this example:
- --point-to-point: Configures spread to use direct point-to-point communication between all database nodes.
- --data-dir: This value must match the directory locations in spec.local in the Kubernetes Hybrid CR manifest.

Deploy the primary database cluster:

```
/opt/vertica/bin/vcluster create_db \
    --db-name vertdb \
    --hosts $HOSTLIST \
    --password <password> \
    --catalog-path /data \
    --data-path /data \
    --depot-path /depot \
    --shard-count 12 \
    --generate-http-certs \
    --communal-storage-location s3://nimbusdb/db-$$ \
    --config-param AWSAuth=$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY,AWSEndpoint=$AWS_ENDPOINT,AWSEnableHttps=$AWS_ENABLE_HTTPS,AWSRegion=$AWS_REGION
```

In this example:
- --db-name: This value must match dbName in the Kubernetes Hybrid CR manifest.
- --catalog-path, --data-path, --depot-path: These values must match the directory locations in spec.local in the Kubernetes Hybrid CR manifest.
- --generate-http-certs: Generates the certificates that enable HTTPS communication.
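
Because several create_db values must agree with the Hybrid CR manifest, a quick text check can catch mismatches before you deploy. The following is a minimal sketch, not part of the product tooling: manifest_value and check_match are hypothetical helper names, hybrid-cr.yaml is an assumed file name, and the lookup is a plain-text match on the first "key: value" line rather than a real YAML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "key: value" line in a file.
# This is a plain-text match, not a YAML parser.
manifest_value() {
    # $1 = manifest file, $2 = key name (for example dbName)
    sed -n "s/^[[:space:]]*$2:[[:space:]]*//p" "$1" | head -n 1
}

# Hypothetical helper: compare an on-premises value with a manifest value.
check_match() {
    # $1 = label, $2 = on-premises value, $3 = manifest value
    if [ "$2" = "$3" ]; then
        echo "OK: $1 = $2"
    else
        echo "MISMATCH: $1 on-premises=$2 manifest=$3"
        return 1
    fi
}

# Example usage, assuming the manifest is saved as hybrid-cr.yaml:
#   check_match dbName vertdb "$(manifest_value hybrid-cr.yaml dbName)"
#   check_match data-path /data "$(manifest_value hybrid-cr.yaml dataPath)"
#   check_match depot-path /depot "$(manifest_value hybrid-cr.yaml depotPath)"
```

The check stays deliberately simple. It does not handle quoted values or trailing comments, so treat a mismatch as a prompt to inspect the manifest by hand.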

Define a hybrid cluster

Create a customized NMA secret

To communicate with the database nodes in the primary cluster, the Kubernetes cluster must use the same NMA certificates as the primary cluster. For example, you can create a custom TLS secret using the certificate files from the on-premises primary cluster:

```
# Get the NMA certificates from the on-premises primary cluster host
scp <primary-cluster-host>:/opt/vertica/config/https_certs/dbadmin.key tls.key
scp <primary-cluster-host>:/opt/vertica/config/https_certs/dbadmin.pem tls.crt
scp <primary-cluster-host>:/opt/vertica/config/https_certs/rootca.pem ca.crt

# Create a custom TLS secret
kubectl create secret generic nma-tls-certs \
    --from-file=tls.key=tls.key \
    --from-file=tls.crt=tls.crt \
    --from-file=ca.crt=ca.crt
```

Next, create the Hybrid CR manifest that defines the configuration of the secondary cluster within the Kubernetes environment:
```
apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: hybrid-secondary-sc
spec:
  autoRestartVertica: false
  image: opentext/vertica-k8s:latest
  initPolicy: ScheduleOnly
  local:
    dataPath: /data
    depotPath: /depot
  dbName: vertdb
  nmaTLSSecret: nma-tls-certs
  subclusters:
    - name: sc1
      size: 3
    - name: sc2
      size: 3
```
In this example:
- apiVersion: vertica.com/v1: Use v1 as v1beta1 is deprecated.
- initPolicy: ScheduleOnly: Required for hybrid clusters. Ensures pods are scheduled without initializing the database.
- autoRestartVertica: false: Disable auto restart for the operator so it does not interfere with the scaling operation.
- local: Required. The values persist data to the PersistentVolume (PV). These values must match the directory locations in the Eon Mode database that is associated with the Kubernetes pods.
- dbName: This value must match the name of the standard Eon Mode database that is associated with this subcluster.
- nmaTLSSecret: nma-tls-certs: The secret that contains the NMA certificates used to communicate with the database nodes in the primary cluster. These certificates must match the certificates used by the primary cluster.
- subclusters: Defines the subclusters and their sizes.

Note

The Hybrid CR can be applied inside a vcluster namespace using standard kubectl commands.
Disable the firewall, or configure it to allow traffic on the database ports.
The operator runs inside vcluster and manages only the resources within its namespace.
Hybrid CRs ignore parameters such as communal.* and subclusters[i].isPrimary.

Scale the hybrid subcluster

When you scale a hybrid cluster, you add the Kubernetes pods as nodes of a secondary subcluster in the database whose primary subcluster runs on-premises.

Add the secondary subcluster to the on-premises cluster:
```
/opt/vertica/bin/vcluster add_subcluster --subcluster sc1 --password <password>
```
In this example:
- --subcluster sc1: The subcluster name must match spec.subclusters[].name in the Kubernetes Hybrid CR manifest.

Scale the secondary subcluster by adding nodes

After the subcluster is created, add the Kubernetes nodes to it. Run the following command on the on-premises primary cluster:

```
/opt/vertica/bin/vcluster add_node \
    --new-hosts hybrid-secondary-sc1-0,hybrid-secondary-sc1-1,hybrid-secondary-sc1-2 \
    --subcluster sc1 \
    --password <password>
```
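
For larger subclusters, typing the comma-separated host list by hand is error-prone. The following sketch builds the list from a name prefix and a size. host_list is a hypothetical helper, and the prefix-ordinal naming pattern is taken only from the example above. The actual pod names and addresses depend on your deployment and must be reachable from the on-premises hosts, so confirm them first, for example with kubectl get pods -o wide.

```shell
#!/bin/sh
# Hypothetical helper: print "<prefix>-0,<prefix>-1,...,<prefix>-(size-1)".
# POSIX sh has no local variables, so prefix, size, i, and list are globals.
host_list() {
    prefix=$1   # name prefix, assumed from the example (hybrid-secondary-sc1)
    size=$2     # number of pods in the subcluster (spec.subclusters[].size)
    i=0
    list=""
    while [ "$i" -lt "$size" ]; do
        # Prepend a comma only when the list already has an entry.
        list="${list:+$list,}$prefix-$i"
        i=$((i + 1))
    done
    echo "$list"
}

# Example usage with the values from this page:
#   /opt/vertica/bin/vcluster add_node \
#       --new-hosts "$(host_list hybrid-secondary-sc1 3)" \
#       --subcluster sc1 \
#       --password <password>
```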

Enable auto-restart for OpenText™ Analytics Database (admintools only)

After the scaling operation, set autoRestartVertica back to true to enable automatic restart of VerticaDB:

```
kubectl patch vdb database-name --type=merge --patch='{"spec": {"autoRestartVertica": true}}'
```

When VerticaDB is deployed with admintools, the operator automatically restarts the database process and rejoins the on-premises cluster if the process is terminated or the VerticaDB pod crashes.
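
The autoRestartVertica parameter is set to false before scaling and back to true afterward, so a small helper that builds the patch body can reduce quoting mistakes. This is a sketch under stated assumptions: auto_restart_patch is a hypothetical function name, and database-name is the same placeholder used in the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the merge-patch JSON for autoRestartVertica.
# Accepts only the literal values "true" or "false".
auto_restart_patch() {
    case "$1" in
        true|false)
            printf '{"spec": {"autoRestartVertica": %s}}\n' "$1"
            ;;
        *)
            echo "expected true or false, got: $1" >&2
            return 1
            ;;
    esac
}

# Example usage (requires access to the Kubernetes cluster):
#   kubectl patch vdb database-name --type=merge --patch="$(auto_restart_patch false)"
#   ...scale the subcluster...
#   kubectl patch vdb database-name --type=merge --patch="$(auto_restart_patch true)"
```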