Configuring communal storage
Vertica on Kubernetes supports a variety of communal storage providers to accommodate your storage requirements. Each storage provider uses authentication methods that conceal sensitive information so that you can declare that information in your Custom Resource (CR) without exposing any literal values.
Note
If your Kubernetes cluster is in the cloud or on a managed service, each Vertica node must operate in the same availability zone.
AWS S3 or S3-Compatible storage
Vertica on Kubernetes supports multiple authentication methods for Amazon Web Services (AWS) communal storage locations and private cloud S3 storage such as MinIO.
For additional details about Vertica and AWS, see Vertica on Amazon Web Services.
Secrets authentication
To connect to an S3-compatible storage location, create a Secret to store both your communal access and secret key credentials. Then, add the Secret, path, and S3 endpoint to the CR spec.
-
The following command stores both your S3-compatible communal access and secret key credentials in a Secret named s3-creds:
$ kubectl create secret generic s3-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
-
Add the Secret to the communal section of the CR spec:
spec:
  ...
  communal:
    credentialSecret: s3-creds
    endpoint: https://path/to/s3-endpoint
    path: s3://bucket-name/key-name
  ...
For a detailed description of an S3-compatible storage implementation, see VerticaDB CRD.
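As a declarative alternative to the kubectl create secret command above, the same credentials can be written as a Secret manifest. The following is a minimal sketch rather than a documented Vertica procedure: the function name s3_secret_manifest is hypothetical, it reuses the accesskey and secretkey key names from the command above, and it assumes that the key values contain no double quotes or backslashes. The function only prints the manifest.

```shell
# Hypothetical helper: print a Secret manifest that carries the same
# accesskey and secretkey entries as the kubectl create secret command.
# Arguments: $1 = Secret name, $2 = access key, $3 = secret key.
# Assumption: the key values contain no double quotes or backslashes.
s3_secret_manifest() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
stringData:
  accesskey: "$2"
  secretkey: "$3"
EOF
}
```

For example, running s3_secret_manifest s3-creds "$ACCESS_KEY" "$SECRET_KEY" | kubectl apply -f - would create the Secret in the current namespace, where ACCESS_KEY and SECRET_KEY are placeholder shell variables that hold your credentials.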
IAM profile authentication
Identity and access management (IAM) profiles manage user identities and control which services and resources a user can access. IAM authentication to Vertica on Kubernetes reduces the number of manual updates when you rotate your access keys.
The IAM profile must have read and write access to the communal storage. The IAM profile is associated with the EC2 instances that run worker nodes.
-
Create an EKS node group using a Node IAM role with a policy that allows read and write access to the S3 bucket used for communal storage.
-
Deploy the VerticaDB operator in a namespace. For details, see Installing the Vertica DB operator.
-
Create a VerticaDB custom resource (CR), and omit the communal.credentialSecret field:
spec:
  ...
  communal:
    endpoint: https://path/to/s3-endpoint
    path: s3://bucket-name/key-name
When the Vertica server accesses the communal storage location, it uses the policy associated with the EKS node.
For additional details about authenticating to Vertica with an IAM profile, see AWS authentication.
IRSA profile authentication
Important
This authentication method requires an image running Vertica server version 12.0.3 or later.
You can use IAM roles for service accounts (IRSA) to associate an IAM role with a Kubernetes service account. After you set the IAM policies for the Kubernetes service account, pods that run under that service account have those IAM policies.
Before you begin, complete the following prerequisites:
-
Configure the EKS cluster's control plane. For details, see the Amazon documentation.
-
Create a bucket policy that has access to the S3 communal storage bucket. For details, see the Amazon documentation.
-
Create an EKS node group using a Node IAM role that does not have S3 access.
-
Use eksctl to create the IAM OpenID Connect (OIDC) provider for your EKS cluster:
$ eksctl utils associate-iam-oidc-provider --cluster cluster --approve
2022-10-07 08:31:37 [ℹ] will create IAM Open ID Connect provider for cluster "cluster" in "us-east-1"
2022-10-07 08:31:38 [✔] created IAM Open ID Connect provider for cluster "cluster" in "us-east-1"
-
Create the Kubernetes namespace where you deploy the VerticaDB operator:
$ kubectl create ns vertica
namespace/vertica created
-
Use eksctl to create a Kubernetes service account in the vertica namespace. When you create a service account with eksctl, you can attach an IAM policy that allows S3 access:
$ eksctl create iamserviceaccount --name my-serviceaccount --namespace vertica --cluster cluster --attach-policy-arn arn:aws:iam::profile:policy/policy --approve
2022-10-07 08:38:32 [ℹ] 1 iamserviceaccount (vertica/my-serviceaccount) was included (based on the include/exclude rules)
2022-10-07 08:38:32 [!] serviceaccounts that exist in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
2022-10-07 08:38:32 [ℹ] 1 task: { 2 sequential sub-tasks: { create IAM role for serviceaccount "vertica/my-serviceaccount", create serviceaccount "vertica/my-serviceaccount", } }
2022-10-07 08:38:32 [ℹ] building iamserviceaccount stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
2022-10-07 08:38:33 [ℹ] deploying stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
2022-10-07 08:38:33 [ℹ] waiting for CloudFormation stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
2022-10-07 08:39:03 [ℹ] waiting for CloudFormation stack "eksctl-cluster-addon-iamserviceaccount-vertica-my-serviceaccount"
2022-10-07 08:39:04 [ℹ] created serviceaccount "vertica/my-serviceaccount"
-
Install the VerticaDB operator, and set the service account:
$ helm install vdb-op --namespace vertica vertica-charts/verticadb-operator --set serviceAccountNameOverride=my-serviceaccount
-
Create a VerticaDB custom resource (CR), and omit the communal.credentialSecret field. When pods are created, they use the service account that has a policy that provides access to the S3 communal storage:
apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: irsadb
spec:
  image: vertica/vertica-k8s:12.0.3-0
  communal:
    path: "s3://path/to/s3-endpoint"
    endpoint: https://s3.amazonaws.com
  subclusters:
  - name: sc
    size: 3
Server-side encryption
Important
Vertica supports S3 server-side encryption in versions 12.0.1 and higher.
If your S3 communal storage uses server-side encryption (SSE), you must configure the encryption type when you create the CR. Vertica supports the following types of SSE:
- SSE-S3
- SSE-KMS
- SSE-C
For details about Vertica support for each encryption type, see S3 object store.
The following examples show how to implement each SSE type. For details about the parameters, see Custom resource definition parameters.
SSE-S3
apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-S3
SSE-KMS
This setting requires that you use the communal.additionalConfig parameter to pass the key identifier (not the key) of the Key Management Service (KMS). Vertica must have permission to use the key, which is managed through KMS:
apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-KMS
    additionalConfig:
      S3SseKmsKeyId: "kms-key-identifier"
SSE-C
Store the client key contents in a Secret and reference the Secret in the CR. The client key must be either a 32-character plaintext key or a 44-character base64-encoded key.
You must create the Secret in the same namespace as the CR:
- Create a Secret that stores the client key contents in the stringData.clientKey field:
apiVersion: v1
kind: Secret
metadata:
  name: sse-c-key
stringData:
  clientKey: client-key-contents
- Add the Secret to the CR with the communal.s3SseCustomerKeySecret parameter:
apiVersion: vertica.com/v1beta1
kind: VerticaDB
metadata:
  name: verticadb
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-C
    s3SseCustomerKeySecret: "sse-c-key"
  ...
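Because the SSE-C client key must be either 32 or 44 characters long, it can help to check the length before you store the key in the Secret. The following is a minimal sketch under that assumption only: the function name is_valid_sse_c_key_length is hypothetical, and the check covers length alone, so it does not confirm that a 44-character value is valid base64.

```shell
# Hypothetical helper: succeed when the key has one of the two lengths
# named above (32-character plaintext or 44-character base64).
# Argument: $1 = client key contents.
# Length check only; the encoding of the value is not validated.
is_valid_sse_c_key_length() {
  case "${#1}" in
    32|44) return 0 ;;
    *)     return 1 ;;
  esac
}
```

One way to produce a 44-character base64 value, assuming openssl is installed, is openssl rand -base64 32, which encodes 32 random bytes.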
Google Cloud Storage
Authenticating to Google Cloud Storage (GCS) requires your hash-based message authentication code (HMAC) access and secret keys, and the path to your GCS bucket. For details about HMAC keys, see Eon Mode on GCP prerequisites.
-
The following command stores your HMAC access and secret key in a Secret named gcs-creds:
$ kubectl create secret generic gcs-creds --from-literal=accesskey=accessKey --from-literal=secretkey=secretkey
-
Add the Secret and the path to the GCS bucket that contains your Vertica database to the communal section of the CR spec:
spec:
  ...
  communal:
    credentialSecret: gcs-creds
    path: gs://bucket-name/path/to/database-name
  ...
For additional details about Vertica and GCS, see Vertica on Google Cloud Platform.
Azure Blob Storage
Microsoft Azure provides a variety of options to authenticate to an Azure Blob Storage location. Depending on your environment, you can use one of the following combinations to store credentials in a Secret:
-
accountName and accountKey
-
accountName and shared access signature (SAS)
If you use an Azure storage emulator such as Azurite in a testing environment, you can authenticate with accountName and blobStorage values.
Important
Vertica does not officially support Azure storage emulators as a communal storage location.
-
The following command stores accountName and accountKey in a Secret named azb-creds:
$ kubectl create secret generic azb-creds --from-literal=accountKey=accessKey --from-literal=accountName=accountName
Alternately, you could store your accountName and your SAS credentials in azb-creds:
$ kubectl create secret generic azb-creds --from-literal=sharedAccessSignature=sharedAccessSignature --from-literal=accountName=accountName
-
Add the Secret and the path that contains your AZB storage bucket to the communal section of the CR spec:
spec:
  ...
  communal:
    credentialSecret: azb-creds
    path: azb://accountName/bucket-name/database-name
  ...
For details about Vertica and authenticating to Microsoft Azure, see Eon Mode on GCP prerequisites.
Hadoop file storage
Connect to Hadoop Distributed Filesystem (HDFS) communal storage with the standard webhdfs scheme, or the swebhdfs scheme for wire encryption. In addition, you must add your HDFS configuration files in a ConfigMap, a Kubernetes object that stores data in key-value pairs. You can optionally configure Kerberos to authenticate connections to your HDFS storage location.
The following example uses the swebhdfs wire encryption scheme that requires a certificate authority (CA) bundle in the CR spec.
-
The following command stores a PEM-encoded CA bundle in a Secret named hadoop-cert:
$ kubectl create secret generic hadoop-cert --from-file=ca-bundle.pem
-
HDFS configuration files are located in the /etc/hadoop directory. The following command creates a ConfigMap named hadoop-conf:
$ kubectl create configmap hadoop-conf --from-file=/etc/hadoop
-
Add the configuration values to the communal and certSecrets sections of the spec:
spec:
  ...
  communal:
    path: "swebhdfs://path/to/database"
    hadoopConfig: hadoop-conf
    caFile: /certs/hadoop-cert/ca-bundle.pem
  certSecrets:
  - name: hadoop-cert
  ...
The previous example defines the following:
-
communal.path: The path to the database, using the wire encryption scheme. Enclose the path in double quotes.
-
communal.hadoopConfig: The ConfigMap storing the contents of the /etc/hadoop directory.
-
communal.caFile: The mount path in the container filesystem containing the CA bundle used to create the hadoop-cert Secret.
-
certSecrets.name: The Secret containing the CA bundle.
For additional details about HDFS and Vertica, see Apache Hadoop integration.
Kerberos authentication (optional)
Vertica authenticates connections to HDFS with Kerberos. The Kerberos configuration for Vertica on Kubernetes is the same as the configuration between a standard Eon Mode database and Kerberos, as described in Kerberos authentication.
-
The following command stores the krb5.conf and krb5.keytab files in a Secret named krb5-creds:
$ kubectl create secret generic krb5-creds --from-file=kerberos-conf=/etc/krb5.conf --from-file=kerberos-keytab=/etc/krb5.keytab
Consider the following when managing the krb5.conf and krb5.keytab files in Vertica on Kubernetes:
  -
  Each pod uses the same krb5.keytab file, so you must update the krb5.keytab file before you begin any scaling operation.
  -
  When you update the contents of the krb5.keytab file, the operator updates the mounted files automatically, a process that does not require a pod restart.
  -
  The krb5.conf file must include a [domain_realm] section that maps the Kubernetes cluster domain to the Kerberos realm. The following example maps the default .cluster.local domain to a Kerberos realm named EXAMPLE.COM:
  [domain_realm]
  .cluster.local = EXAMPLE.COM
-
Add the Secret and additional Kerberos configuration information to the CR:
spec:
  ...
  communal:
    path: "swebhdfs://path/to/database"
    hadoopConfig: hadoop-conf
    kerberosServiceName: verticadb
    kerberosRealm: EXAMPLE.COM
    kerberosSecret: krb5-creds
  ...
The previous example defines the following:
-
communal.path: The path to the database, using the wire encryption scheme. Enclose the path in double quotes.
-
communal.hadoopConfig: The ConfigMap storing the contents of the /etc/hadoop directory.
-
communal.kerberosServiceName: The service name for the Vertica principal.
-
communal.kerberosRealm: The realm portion of the principal.
-
communal.kerberosSecret: The Secret containing the krb5.conf and krb5.keytab files.
For a complete definition of each of the previous values, see Custom resource definition parameters.
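The [domain_realm] mapping required in the Kerberos steps above can be generated rather than typed by hand. The following is a minimal sketch, assuming the two-line format shown in that example: the function name domain_realm_section is hypothetical, and it takes the cluster domain without its leading dot.

```shell
# Hypothetical helper: print a [domain_realm] section that maps a
# Kubernetes cluster domain to a Kerberos realm.
# Arguments: $1 = cluster domain without the leading dot, $2 = realm.
domain_realm_section() {
  printf '[domain_realm]\n.%s = %s\n' "$1" "$2"
}
```

For example, domain_realm_section cluster.local EXAMPLE.COM prints the same two lines as the example above, which you can append to the krb5.conf file before you create the krb5-creds Secret.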