Configuring communal storage

Vertica on Kubernetes supports a variety of communal storage providers to accommodate your storage requirements. Each storage provider uses authentication methods that conceal sensitive information so that you can declare that information in your Custom Resource (CR) without exposing any literal values.

AWS S3 or S3-Compatible storage

Vertica on Kubernetes supports multiple authentication methods for Amazon Web Services (AWS) communal storage locations and private cloud S3 storage such as MinIO.

For additional details about Vertica and AWS, see Vertica on Amazon Web Services.

Secrets authentication

To connect to an S3-compatible storage location, create a Secret to store both your communal access and secret key credentials. Then, add the Secret, path, and S3 endpoint to the CR spec.

  1. The following command stores both your S3-compatible communal access and secret key credentials in a Secret named s3-creds:

    $ kubectl create secret generic s3-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
    
  2. Add the Secret to the communal section of the CR spec:

    spec:
      ...
      communal:
        credentialSecret: s3-creds
        endpoint: https://path/to/s3-endpoint
        path: s3://bucket-name/key-name
        ...
    

For a detailed description of an S3-compatible storage implementation, see VerticaDB custom resource definition.
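
Passing keys with --from-literal can leave them in your shell history. As an alternative, the following is a minimal sketch that renders an equivalent Secret manifest from variables. The function name render_credential_secret and the variable names in the usage note are hypothetical; the accesskey and secretkey field names come from the command in step 1.

```shell
# Hypothetical helper: print a Secret manifest that holds communal storage keys.
# $1 = Secret name, $2 = access key, $3 = secret key
# Assumes the key values contain no double quotes or backslashes, which
# would need YAML escaping.
render_credential_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
type: Opaque
stringData:
  accesskey: "$2"
  secretkey: "$3"
EOF
}
```

For example, with the keys already exported as S3_ACCESS_KEY and S3_SECRET_KEY (assumed names), you could pipe the result to kubectl: render_credential_secret s3-creds "$S3_ACCESS_KEY" "$S3_SECRET_KEY" | kubectl apply -f -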

IAM profile authentication

Identity and access management (IAM) profiles manage user identities and control which services and resources a user can access. IAM authentication to Vertica on Kubernetes reduces the number of manual updates when you rotate your access keys.

The IAM profile must have read and write access to the communal storage. The IAM profile is associated with the EC2 instances that run worker nodes.

  1. Create an EKS node group using a Node IAM role with a policy that allows read and write access to the S3 bucket used for communal storage.

  2. Deploy the VerticaDB operator in a namespace. For details, see Installing the VerticaDB operator.

  3. Create a VerticaDB custom resource (CR), and omit the communal.credentialSecret field:

    spec:
      ...
      communal:
        endpoint: https://path/to/s3-endpoint
        path: s3://bucket-name/key-name
    

When the Vertica server accesses the communal storage location, it uses the policy associated with the EKS node.

For additional details about authenticating to Vertica with an IAM profile, see AWS authentication.
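
Step 1 calls for a policy that allows read and write access to the bucket. The following is a rough sketch of such a policy document, rendered by a hypothetical render_s3_rw_policy function. The action list is an assumption rather than Vertica's documented minimum, so verify it against the AWS authentication requirements for your Vertica version before you attach it to the Node IAM role.

```shell
# Hypothetical helper: print an IAM policy document that grants read and
# write access to a single S3 bucket.
# $1 = bucket name
# The actions listed here are an assumption, not an authoritative list.
render_s3_rw_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::$1"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::$1/*"
    }
  ]
}
EOF
}
```

The bucket-level statement and the object-level statement are kept separate because s3:ListBucket applies to the bucket ARN, while the object actions apply to the keys beneath it.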

IRSA profile authentication

You can use IAM roles for service accounts (IRSA) to associate an IAM role with a Kubernetes service account. After you set the IAM policies for the Kubernetes service account, any pod that runs under that service account receives those policies.

Before you begin, complete the following prerequisites:

  • Configure the EKS cluster's control plane. For details, see the Amazon documentation.

  • Create a bucket policy that has access to the S3 communal storage bucket. For details, see the Amazon documentation.

  1. Create an EKS node group using a Node IAM role that does not have S3 access.

  2. Use eksctl to create the IAM OpenID Connect (OIDC) provider for your EKS cluster:

    $ eksctl utils associate-iam-oidc-provider --cluster cluster --approve
    2022-10-07 08:31:37 []  will create IAM Open ID Connect provider for cluster "cluster" in "us-east-1"
    2022-10-07 08:31:38 []  created IAM Open ID Connect provider for cluster "cluster" in "us-east-1"
    
  3. Create the Kubernetes namespace where you plan to create the iamserviceaccount. The following command creates the vertica namespace:

    $ kubectl create ns vertica
    namespace/vertica created
    
  4. Use eksctl to create a Kubernetes service account in the vertica namespace. When you create a service account with eksctl, you can attach an IAM policy that allows S3 access:

    $ eksctl create iamserviceaccount --name shared-service-account --namespace vertica --cluster cluster --attach-policy-arn arn:aws:iam::profile:policy/policy --approve
    2022-10-07 08:38:32 []  1 iamserviceaccount (vertica/shared-service-account) was included (based on the include/exclude rules)
    2022-10-07 08:38:32 [!]  serviceaccounts that exist in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
    2022-10-07 08:38:32 []  1 task: {
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "vertica/shared-service-account",
            create serviceaccount "vertica/shared-service-account",
        } }
    2022-10-07 08:38:32 []  building iamserviceaccount stack "eksctl-cluster-addon-iamserviceaccount-vertica-shared-service-account"
    2022-10-07 08:38:33 []  deploying stack "eksctl-cluster-addon-iamserviceaccount-vertica-shared-service-account"
    2022-10-07 08:38:33 []  waiting for CloudFormation stack "eksctl-cluster-addon-iamserviceaccount-vertica-shared-service-account"
    2022-10-07 08:39:03 []  waiting for CloudFormation stack "eksctl-cluster-addon-iamserviceaccount-vertica-shared-service-account"
    2022-10-07 08:39:04 []  created serviceaccount "vertica/shared-service-account"
    
  5. Create a VerticaDB custom resource (CR). Specify the service account with the serviceAccountName field, and omit the communal.credentialSecret field:

    apiVersion: vertica.com/v1
    kind: VerticaDB
    metadata:
      name: irsadb
      annotations:
        vertica.com/run-nma-in-sidecar: "false"
    spec:
      image: vertica/vertica-k8s:12.0.3-0
      serviceAccountName: shared-service-account
      communal:
        path: "s3://path/to/s3-endpoint"
        endpoint: https://s3.amazonaws.com
      subclusters:
        - name: sc
          size: 3
    

When pods are created, they use the service account that has a policy that provides access to the S3 communal storage.
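
The eksctl command in step 4 creates the IAM role and a service account that references the role through the eks.amazonaws.com/role-arn annotation. As a sketch of what that object looks like, the hypothetical render_irsa_service_account function below prints an equivalent ServiceAccount manifest. The account ID and role name in the usage note are placeholders, not values from this guide.

```shell
# Hypothetical helper: print a ServiceAccount manifest annotated for IRSA.
# $1 = service account name, $2 = namespace, $3 = IAM role ARN
render_irsa_service_account() {
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: $1
  namespace: $2
  annotations:
    eks.amazonaws.com/role-arn: $3
EOF
}
```

For example: render_irsa_service_account shared-service-account vertica arn:aws:iam::111122223333:role/vertica-s3-access. Comparing this output with kubectl get serviceaccount shared-service-account -n vertica -o yaml is one way to confirm that the annotation is present before you create the CR.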

Server-side encryption

If your S3 communal storage uses server-side encryption (SSE), you must configure the encryption type when you create the CR. Vertica supports the following types of SSE:

  • SSE-S3
  • SSE-KMS
  • SSE-C

For details about Vertica support for each encryption type, see S3 object store.

The following examples show how to implement each SSE type. For details about the parameters, see Custom resource definition parameters.

SSE-S3

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: verticadb
  annotations:
    vertica.com/run-nma-in-sidecar: "false"
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-S3

SSE-KMS

This setting requires that you use the communal.additionalConfig parameter to pass the key identifier (not the key itself) of the Key Management Service (KMS) key. Vertica must have permission to use the key, which is managed through KMS:

apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: verticadb
  annotations:
    vertica.com/run-nma-in-sidecar: "false"
spec:
  communal:
    path: "s3://bucket-name"
    s3ServerSideEncryption: SSE-KMS
    additionalConfig:
      S3SseKmsKeyId: "kms-key-identifier"

SSE-C

Store the client key contents in a Secret and reference the Secret in the CR. The client key must be either a 32-character plaintext key or a 44-character base64-encoded key.

You must create the Secret in the same namespace as the CR:

  1. Create a Secret that stores the client key contents in the stringData.clientKey field:
   apiVersion: v1
   kind: Secret
   metadata:
     name: sse-c-key
     annotations:
       vertica.com/run-nma-in-sidecar: "false"
   stringData:
     clientKey: client-key-contents
   
  2. Add the Secret to the CR with the communal.s3SseCustomerKeySecret parameter:
   apiVersion: vertica.com/v1
   kind: VerticaDB
   metadata:
     name: verticadb
     annotations:
       vertica.com/run-nma-in-sidecar: "false"
   spec:
     communal:
       path: "s3://bucket-name"
       s3ServerSideEncryption: SSE-C
       s3SseCustomerKeySecret: "sse-c-key"
   ...
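
The two accepted key lengths are related: base64-encoding 32 bytes always produces 44 characters, including one padding character. The following sketch generates a random key in the base64 form and checks a key's length before you store it in the Secret. Both function names are hypothetical, and the sketch assumes that head, /dev/urandom, and base64 are available on your workstation.

```shell
# Hypothetical helper: print a random SSE-C client key.
# 32 random bytes, base64-encoded, yield a 44-character string.
generate_sse_c_key() {
  head -c 32 /dev/urandom | base64
}

# Hypothetical helper: succeed only if the key has one of the two lengths
# described above (32-character plaintext or 44-character base64).
# This checks length only; it does not verify that the value is valid base64.
is_valid_sse_c_key_length() {
  case "${#1}" in
    32|44) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, key="$(generate_sse_c_key)" followed by is_valid_sse_c_key_length "$key" succeeds, and you can then paste the value into the stringData.clientKey field.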
   

Google Cloud Storage

Authenticating to Google Cloud Storage (GCS) requires your hash-based message authentication code (HMAC) access and secret keys, and the path to your GCS bucket. For details about HMAC keys, see Eon Mode on GCP prerequisites.

You have two authentication options: you can authenticate with Kubernetes Secrets, or you can use the keys stored in Google Secret Manager.

For additional details about Vertica and GCS, see Vertica on Google Cloud Platform.

Kubernetes Secret authentication

To authenticate with a Kubernetes Secret, create the secret and add it to the CR manifest:

  1. The following command stores your HMAC access and secret key in a Secret named gcs-creds:

    $ kubectl create secret generic gcs-creds --from-literal=accesskey=accesskey --from-literal=secretkey=secretkey
    
  2. Add the Secret and the path to the GCS bucket that contains your Vertica database to the communal section of the CR spec:

    spec:
      ...
      communal:
        credentialSecret: gcs-creds
        path: gs://bucket-name/path/to/database-name
        ...
    

Azure Blob Storage

Microsoft Azure provides a variety of options to authenticate to Azure Blob Storage locations. Depending on your environment, you can use one of the following combinations to store credentials in a Secret:

  • accountName and accountKey

  • accountName and shared access signature (SAS)

If you use an Azure storage emulator such as Azurite in a testing environment, you can authenticate with accountName and blobStorage values.

  1. The following command stores accountName and accountKey in a Secret named azb-creds:

    $ kubectl create secret generic azb-creds --from-literal=accountKey=accountKey --from-literal=accountName=accountName
    

    Alternatively, you can store your accountName and your SAS credentials in azb-creds:

    $ kubectl create secret generic azb-creds --from-literal=sharedAccessSignature=sharedAccessSignature --from-literal=accountName=accountName
    
  2. Add the Secret and the path that contains your AZB storage bucket to the communal section of the CR spec:

    spec:
      ...
      communal:
        credentialSecret: azb-creds
        path: azb://accountName/bucket-name/database-name
        ...
    

For details about Vertica and authenticating to Microsoft Azure, see Eon Mode on Azure prerequisites.
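
Because the Secret holds accountName plus either accountKey or a shared access signature, a small wrapper can catch a Secret that has neither or both before it reaches the cluster. The following is a sketch only: render_azb_secret is a hypothetical function, and rejecting a Secret that contains both credentials is an assumption drawn from the "one of the following combinations" wording above, not documented operator behavior.

```shell
# Hypothetical helper: print a Secret manifest for Azure Blob Storage.
# $1 = Secret name, $2 = accountName, $3 = accountKey, $4 = SAS
# Pass an empty string for whichever of $3 and $4 you do not use.
render_azb_secret() {
  key="${3:-}"
  sas="${4:-}"
  if [ -n "$key" ] && [ -n "$sas" ]; then
    echo "provide accountKey or sharedAccessSignature, not both" >&2
    return 1
  fi
  if [ -z "$key" ] && [ -z "$sas" ]; then
    echo "provide accountKey or sharedAccessSignature" >&2
    return 1
  fi
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
type: Opaque
stringData:
  accountName: "$2"
EOF
  # Emit exactly one credential field alongside accountName.
  if [ -n "$key" ]; then
    printf '  accountKey: "%s"\n' "$key"
  else
    printf '  sharedAccessSignature: "%s"\n' "$sas"
  fi
}
```

For example, render_azb_secret azb-creds "$AZ_ACCOUNT" "$AZ_KEY" "" prints the accountKey variant, where AZ_ACCOUNT and AZ_KEY are assumed variable names.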

Hadoop file storage

Connect to Hadoop Distributed Filesystem (HDFS) communal storage with the standard webhdfs scheme, or the swebhdfs scheme for wire encryption. In addition, you must add your HDFS configuration files in a ConfigMap, a Kubernetes object that stores data in key-value pairs. You can optionally configure Kerberos to authenticate connections to your HDFS storage location.

The following example uses the swebhdfs wire encryption scheme that requires a certificate authority (CA) bundle in the CR spec.

  1. The following command stores a PEM-encoded CA bundle in a Secret named hadoop-cert:

    $ kubectl create secret generic hadoop-cert --from-file=ca-bundle.pem
    
  2. HDFS configuration files are located in the /etc/hadoop directory. The following command creates a ConfigMap named hadoop-conf:

    $ kubectl create configmap hadoop-conf --from-file=/etc/hadoop
    
  3. Add the configuration values to the communal and certSecrets sections of the spec:

    spec:
      ...
      hadoopConfig: hadoop-conf
      communal:
        path: "swebhdfs://path/to/database"
        caFile: /certs/hadoop-cert/ca-bundle.pem
      certSecrets:
        - name: hadoop-cert
      ...
    

    The previous example defines the following:

    • hadoopConfig: ConfigMap that stores the contents of the /etc/hadoop directory.
    • communal.path: Path to the database, using the wire encryption scheme. Enclose the path in double quotes.
    • communal.caFile: Mount path in the container filesystem containing the CA bundle used to create the hadoop-cert Secret.
    • certSecrets.name: Secret containing the CA bundle.

For additional details about HDFS and Vertica, see Apache Hadoop integration.
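
Whether you need the certSecrets and communal.caFile settings depends on the scheme in communal.path. As a minimal sketch, the hypothetical hdfs_needs_ca_bundle function below makes that decision from the path prefix, based only on the two schemes described in this section.

```shell
# Hypothetical helper: decide from the communal path whether a CA bundle
# is required.
# Returns 0 for swebhdfs (wire encryption, CA bundle needed),
#         1 for webhdfs (no CA bundle needed),
#         2 for any other scheme.
hdfs_needs_ca_bundle() {
  case "$1" in
    swebhdfs://*) return 0 ;;
    webhdfs://*) return 1 ;;
    *) echo "not an HDFS communal path: $1" >&2; return 2 ;;
  esac
}
```

For example, hdfs_needs_ca_bundle "swebhdfs://path/to/database" succeeds, which signals that the hadoop-cert Secret and the caFile mount path must be present in the CR.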

Kerberos authentication (optional)

Vertica can authenticate connections to HDFS with Kerberos. The Kerberos configuration for Vertica on Kubernetes is the same as the configuration for a standard Eon Mode database, as described in Kerberos authentication.

  1. The following command stores the krb5.conf and krb5.keytab files in a Secret named krb5-creds:

    $ kubectl create secret generic krb5-creds --from-file=kerberos-conf=/etc/krb5.conf --from-file=kerberos-keytab=/etc/krb5.keytab
    

    Consider the following when managing the krb5.conf and krb5.keytab files in Vertica on Kubernetes:

    • Each pod uses the same krb5.keytab file, so you must update the krb5.keytab file before you begin any scaling operation.

    • When you update the contents of the krb5.keytab file, the operator updates the mounted files automatically, a process that does not require a pod restart.

    • The krb5.conf file must include a [domain_realm] section that maps the Kubernetes cluster domain to the Kerberos realm. The following example maps the default .cluster.local domain to a Kerberos realm named EXAMPLE.COM:

      [domain_realm]
        .cluster.local = EXAMPLE.COM
      
  2. Add the Secret and additional Kerberos configuration information to the CR:

    spec:
      ...
      hadoopConfig: hadoop-conf
      communal:
        path: "swebhdfs://path/to/database"
        additionalConfig:
          kerberosServiceName: verticadb
          kerberosRealm: EXAMPLE.COM
      kerberosSecret: krb5-creds
      ...
    

The previous example defines the following:

  • hadoopConfig: ConfigMap that stores the contents of the /etc/hadoop directory.
  • communal.path: Path to the database, using the wire encryption scheme. Enclose the path in double quotes.
  • communal.additionalConfig.kerberosServiceName: Service name for the Vertica principal.
  • communal.additionalConfig.kerberosRealm: Realm portion of the principal.
  • kerberosSecret: Secret containing the krb5.conf and krb5.keytab files.

For a complete definition of each of the previous values, see Custom resource definition parameters.
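
A missing [domain_realm] mapping is easy to overlook before you create the krb5-creds Secret. The following sketch checks a krb5.conf file for a mapping of the cluster domain. The function name has_domain_realm_mapping is hypothetical, and the check covers only simple "key = value" lines; it does not follow include directives.

```shell
# Hypothetical helper: succeed if the [domain_realm] section of a krb5.conf
# file contains an entry for the given domain.
# $1 = path to krb5.conf, $2 = cluster domain (for example .cluster.local)
has_domain_realm_mapping() {
  awk -v domain="$2" '
    # A line that starts with "[" opens a new section.
    /^\[/ { in_section = ($0 ~ /^\[domain_realm\]/); next }
    in_section {
      split($0, kv, "=")
      key = kv[1]
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
      if (key == domain) found = 1
    }
    END { exit !found }
  ' "$1"
}
```

For example, has_domain_realm_mapping /etc/krb5.conf .cluster.local succeeds only when the file maps the default Kubernetes cluster domain to a realm.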