Per-bucket S3 configurations

You can manage configurations and credentials for individual buckets with the configuration parameters S3BucketConfig and S3BucketCredentials.

You can manage configurations and credentials for individual buckets with the configuration parameters S3BucketConfig and S3BucketCredentials. These parameters take a JSON object, whose respective properties behave like AWSAuth and AWSEndpoint.

For example, you can create a different configuration for each of your S3 buckets by setting S3BucketConfig at the database level with ALTER DATABASE. The following S3BucketConfig specifies all possible properties:

=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "exampleAWS",
        "region": "us-east-2",
        "protocol": "https",
        "requesterPays": true
    },
    {
        "bucket": "examplePureStorage",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';

Users can then access a bucket by setting S3BucketCredentials at the session level with ALTER SESSION. The following S3BucketCredentials specifies all possible properties and authenticates to both exampleAWS and examplePureStorage simultaneously.

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "exampleAWS",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "examplePureStorage",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>",
    }
]';

The recommended usage is as follows:

  • Define in your S3 storage system one set of credentials per principal, per storage system.

  • It is often most convenient to set S3BucketConfig once at the database level and have users authenticate by setting S3BucketCredentials at the session level.

  • To access buckets outside those configured at the database level, set both S3BucketConfig and S3BucketCredentials at the session level.

If you cannot define credentials for your S3 storage, you can set S3BucketCredentials or AWSAuth at the database level with ALTER DATABASE, but this comes with certain drawbacks:

  • Storing credentials statically in another location (in this case, in the Vertica catalog) always incurs additional risk.

  • This increases overhead for the dbadmin, who needs to create user storage locations and grant access to each user or role.

  • Users share one set of credentials, increasing the potential impact if the credentials are compromised.

Precedence of per-bucket and standard parameters

Vertica uses the following rules to determine the effective set of properties for an S3 connection:

  • If set, S3BucketCredentials takes priority over its standard parameters. S3BucketCredentials is checked first at the session level and then at the database level.

  • The level/source of the S3 credential parameters determines the source of the S3 configuration parameters:

    • If credentials come from the session level, then the configuration can come from either the session or database level (with the session level taking priority).

    • If your credentials come from the database level, then the configuration can only come from the database level.

  • If S3BucketConfig is set, it takes priority over its standard parameters. If an S3BucketConfig property isn't specified, Vertica falls back to the missing property's equivalent parameter. For example, if S3BucketConfig specifies every property except protocol, Vertica falls back to the standard parameter AWSEnableHttps.

Examples

Using per-bucket parameters

This example configures a real Amazon S3 bucket AWSBucket and a Pure Storage bucket PureStorageBucket with S3BucketConfig.

AWSBucket does not specify an endpoint or protocol, so Vertica falls back to AWSEndpoint (defaults to s3.amazonaws.com) and AWSEnableHttps (defaults to 1).

In this example environment, access to the PureStorageBucket is over a secure network, so HTTPS is disabled:

ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "AWSBucket",
        "region": "us-east-2"
    },
    {
        "bucket": "PureStorageBucket",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';

Bob can then set S3BucketCredentials at the session level to authenticate to AWSBucket:

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    }
]';

Similarly, Alice can authenticate to PureStorageBucket:

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

Charlie provides credentials for both AWSBucket and PureStorageBucket and authenticates to them simultaneously. This allows him to perform cross-endpoint joins, export from one bucket to another, etc.

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

Non-amazon S3 storage with AWSEndpoint and S3BucketConfig

If AWSEndpoint is set to a non-Amazon S3 bucket like Pure Storage or MinIO and you want to configure S3BucketConfig for a real Amazon S3 bucket, the following requirements apply:

  • If your real Amazon S3 region is not us-east-1 (the default), you must specify the region.

  • Set endpoint to an empty string ("").

In this example, AWSEndpoint is set to a Pure Storage bucket.

=> ALTER DATABASE DEFAULT SET AWSEndpoint='pure.mycorp.net:1234';

To configure S3BucketConfig for a real Amazon S3 bucket realAmazonS3Bucket in region "us-east-2":

=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "realAmazonS3Bucket",
        "region": "us-east-2",
        "endpoint": ""
    },
]';