Deploy Vertica from the Google cloud marketplace

The Vertica entries in the Google Cloud Launcher Marketplace let you quickly deploy a Vertica cluster in the Google Cloud Platform (GCP). Currently, three entries let you select the database mode and the license you want to use:

  • The Enterprise Mode launcher deploys a Vertica database with 3 or more nodes, plus an additional VM running the Management Console (MC). See Deploying an Enterprise Mode database in GCP from the marketplace for more information.

  • The Eon Mode BYOL (bring your own license) launcher deploys a single instance running the MC. You use this MC instance to deploy a Vertica database running on Eon Mode. This database has a community license applied to it initially. You can later upgrade it to a license you have obtained from Vertica. See Deploying an Eon Mode database on GCP for more information.

  • The Eon Mode BTH (by the hour) launcher also deploys a single instance running the MC that you use to deploy a database. This database has a by-the-hour license applied to it. Instead of paying for a license up front, you pay an hourly fee that covers both Vertica and running your instances. The BTH license is automatically applied to all clusters you create using a BTH MC instance. See Deploying an Eon Mode database on GCP for more information. If you choose, you can upgrade this hourly license to a longer-term license you purchase from Vertica. To move a BTH cluster to a BYOL license, follow the instructions in Moving a cloud installation from by the hour (BTH) to bring your own license (BYOL).

1 - Deploying an Enterprise Mode database in GCP from the marketplace

The Vertica Cloud Launcher solution creates a Vertica Enterprise Mode database. The solution includes the Vertica Management Console (MC) as the primary UI for you to get started.

The launcher automatically creates a database named vdb using the Community Edition (CE) license. The CE license is limited to a maximum of 3 nodes. You can tell the launcher to add more than 3 nodes to your deployment. In this case, it uses the first three nodes in the cluster to create the database. The remaining nodes are not part of the database, but are added to your cluster. To add these nodes to your database, you must replace the Community Edition license with a license key you receive from the Software Entitlement support site. See Managing licenses for more information.
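The node arithmetic described above can be sketched as follows. This is an illustration only: the function name and the constant are hypothetical, and the only fact taken from the text is the 3-node Community Edition limit.

```python
from typing import Tuple

CE_NODE_LIMIT = 3  # Community Edition node limit stated in the text above


def split_nodes(requested_nodes: int, ce_limit: int = CE_NODE_LIMIT) -> Tuple[int, int]:
    """Return (nodes_in_database, spare_cluster_nodes) for a CE deployment.

    Hypothetical helper: the launcher itself performs this split; this only
    mirrors the rule that the first `ce_limit` nodes form the database.
    """
    if requested_nodes < 1:
        raise ValueError("requested_nodes must be at least 1")
    in_database = min(requested_nodes, ce_limit)
    return in_database, requested_nodes - in_database


# Requesting 5 nodes: 3 join the database, 2 wait in the cluster
# until the CE license is replaced.
print(split_nodes(5))
```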

After the launcher creates the initial database, it configures the MC to attach to that database automatically.

Configure the Vertica cloud launcher solution

To get started with a deployment of Vertica from the Google Cloud Launcher, search for the Vertica Data Warehouse, Enterprise Mode entry.

Follow these steps:

  1. Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.

  2. From the listing page, click LAUNCH.

  3. On the New Vertica Analytics Platform deployment page, enter the following information:

    • Deployment name: Each deployment must have a unique name. That name is used as the prefix for the names of all VMs created during the deployment. The deployment name can only contain lowercase characters, numbers, and dashes. The name must start with a lowercase letter and cannot end with a dash.

    • Zone: GCP breaks its cloud data centers into regions and zones. Regions are a collection of zones in the same geographical location. Zones are collections of compute resources, which vary from zone to zone.

      For best results, pick the zone in your designated region that supports the latest Intel CPUs. For a complete listing of regions and zones, including supported processors, see Regions and Zones.

    • Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.

    • Under Vertica Management Console, choose the configuration for the virtual machine that will run the Management Console. The Vertica Analytics Platform in Cloud Launcher always deploys the Vertica Management Console (MC) as part of the solution.

      The default machine type for MC is sufficient for most deployments. You can choose another machine type that better suits any additional purposes, such as serving as a target node for backups, data transformation, or additional management tools.

    • Node count for Vertica Cluster: The total number of VMs you want to deploy in the Vertica Cluster. The default is 3.

    • Machine type for Vertica Cluster nodes: The Cloud Launcher builds each node in the cluster using the same machine type. Modify the machine type for your nodes based on the workloads you expect your database to handle. See Supported GCP machine types for more information.

    • Data disk type: GCP offers two types of persistent disk storage: Standard and SSD. The costs associated with Standard are less, but the performance of SSD storage is much better. Vertica recommends you use SSD storage. For more information on Standard and SSD persistent disks, see Storage Options.

    • Disk size in GB: Disk performance is directly tied to the disk size in GCP. The default value of 2000 GB (2 TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.

      If you select a smaller disk size, the throughput performance decreases. If you select a larger disk size, the performance remains the same as the 2 TB option.

    • Network: VMs in GCP must exist on a virtual private cloud (VPC). When you created your GCP account, a default VPC was created. Create additional VPCs to isolate solutions or projects from one another. The Vertica Analytics Platform creates all the nodes in the same VPC.

    • Subnetwork: Just as a GCP account may have multiple VPCs, each VPC may also have multiple subnets. Use additional subnets to group or isolate solutions within the same VPC.

    • Firewall: If you want your MC to be accessible via the internet, check the Allow access to the Management Console from the Internet box. Vertica recommends you protect your MC using a firewall that restricts access to just the IP addresses of users that need to access it. You can enter one or more comma-separated CIDR address ranges.

After you have entered all the required information, click Deploy to begin the deployment process.
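The deployment name rules listed above (lowercase characters, numbers, and dashes; must start with a lowercase letter; cannot end with a dash) can be checked before you open the form. The following is a minimal sketch. The function name is hypothetical, and it checks only the rules stated in this document; any additional limits GCP may enforce, such as a maximum length, are not covered.

```python
import re

# Starts with a lowercase letter, then any mix of lowercase letters, digits,
# and dashes, and the final character may not be a dash.
_DEPLOYMENT_NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_deployment_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules described above."""
    return _DEPLOYMENT_NAME_RE.fullmatch(name) is not None


for candidate in ("vertica-prod-1", "Vertica", "1vertica", "vertica-"):
    print(candidate, is_valid_deployment_name(candidate))
```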

Monitor the deployment

After the deployment begins, Google Cloud Launcher automatically opens the Deployment Manager page that displays the status of the deployment. Items that are still being processed have a spinning circle to the left of them and the text is a light gray color. Items that have been created are dark gray in color, with an icon designating that resource type on the left.

After the deployment completes, a green check mark appears next to the deployment name in the upper left-hand section of the screen.

Accessing the cluster after deployment

After the deployment completes, the right-hand section of the screen displays the following information:

  • dbadmin password: A randomly generated password for the dbadmin account on the nodes. For security reasons, change the dbadmin password when you first log in to one of the Vertica cluster nodes.

  • mcadmin password: A randomly generated password for the mcadmin account for accessing the Management Console. For security reasons, change the mcadmin password after you first log in to the MC.

  • Vertica Node 1 IP address: The external IP address for the first node in the Vertica cluster is exposed here so that you can connect to the VM using a standard SSH client.

To access the MC, click the Access Vertica MC button in the Get Started section of the dialog box. Copy the mcadmin password and paste it when asked.

For more information on using the MC, see Management Console.

Access the cluster nodes

There are two ways to access the cluster nodes directly:

  • Use GCP's integrated SSH shell by selecting the SSH button in the Get Started section. This shell opens a pop-up in your browser that runs GCP's web-based SSH client. You are automatically logged on as the user you authenticated as in the GCP environment.

    After you have access to the first Vertica cluster node, execute the su dbadmin command, and authenticate using the dbadmin password.

  • Alternatively, use a standard SSH client to connect directly to the first Vertica cluster node. Connect to the Vertica Node 1 IP address listed on the screen as the dbadmin user, and authenticate with the dbadmin password.

    Follow the on-screen directions to log in using the mcadmin account and accept the EULA. After you've been authenticated, access the initial database by clicking the vdb icon (looks like a green cylinder) in the Recent Databases section.

Using a custom service account

In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:

  • The custom service account must have the Editor role.

  • Individual user accounts must have the Service Account User role on the custom service account.

2 - Eon Mode databases on GCP

You deploy an Eon Mode database to GCP using Google Cloud Platform Launcher to deploy a Management Console (MC) instance. You then use the MC instance to provision and deploy an Eon Mode database.

2.1 - GCP Eon Mode instance recommendations

When you use the MC to deploy an Eon Mode database to the Google Cloud Platform (GCP), you choose the instance type to deploy as the database's nodes. The default instance settings in the MC are the more conservative option (currently, n1-standard-16). They are sufficient for most workloads. However, you may choose instances with more memory (such as n1-highmem-16) if your queries perform complex joins that may otherwise spill to disk. You can also choose instances with more cores (such as n1-standard-32) if you perform highly complex, compute-intensive analysis. The following links provide additional information about GCP machine type instances and Vertica:

The more powerful the instance you choose, the higher the cost per hour. You need to balance using fewer, higher-powered but more expensive instances against relying on a larger number of lower-powered instances that cost less. Thanks to Eon Mode's elasticity, if you choose to use the less-powerful instances, you can always add more nodes to meet peak demands. When you reduce the number of instances to a minimum during off-peak times, you'll spend less than if you had a similar number of more-powerful instances.
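A small worked example may make this trade-off concrete. All numbers here are hypothetical: the hourly rates (in cents), node counts, and the 8-hour peak window are assumptions chosen for illustration, not GCP or Vertica prices.

```python
def daily_cost_cents(rate_cents_per_hour: int, peak_nodes: int,
                     offpeak_nodes: int, peak_hours: int = 8) -> int:
    """Daily cost for a cluster that scales between peak and off-peak sizes."""
    offpeak_hours = 24 - peak_hours
    node_hours = peak_nodes * peak_hours + offpeak_nodes * offpeak_hours
    return rate_cents_per_hour * node_hours


# Hypothetical scenario: a larger instance costs twice as much per hour and
# needs half as many nodes at peak, but both clusters shrink to 3 nodes
# off-peak.
smaller = daily_cost_cents(rate_cents_per_hour=80, peak_nodes=8, offpeak_nodes=3)
larger = daily_cost_cents(rate_cents_per_hour=160, peak_nodes=4, offpeak_nodes=3)

print(smaller)  # 8960 cents per day
print(larger)   # 12800 cents per day
```

Under these assumed numbers the peak cost is identical, and the saving comes entirely from the off-peak hours, when the same minimum node count runs on cheaper instances.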

Storage options

The MC's deployment wizard also asks you to select the type of local storage for your instances. You can select different options for each type of local storage that Vertica uses: the catalog, the depot, and temporary space. For all of these storage locations, you choose the type of disks to use (standard vs. SSD). You will see the best performance with SSD disks. However, SSD disks cost more.

For the depot, you also choose whether to use local or persistent disks. The local option is faster, as it resides directly on the virtual machine host. However, whenever you shut down the node, this storage is wiped clean. The persistent storage is slower than the local option, as it is not stored directly on the machine hosting the instance. However, it is not wiped out whenever you shut down the instance. See the Google Cloud documentation's Storage options page for more information.

Which of these options you choose depends on how much depot warming the nodes must perform when starting. If the content of your node's depots change little over time (or you tend to frequently start and stop instances), using persistent storage makes sense. In this case, the depot's warming period will be shorter because most of the data the node needs to participate in queries may still be in its depot when it starts. It will perform fewer fetches of data from communal storage while participating in queries.

If your working data set is rapidly changing or you tend to leave nodes stopped for extended periods of time, your best choice is usually to use local storage. In this scenario, the data in the node's depot when it restarts is usually stale. To participate in queries, the node must fetch much of the data it needs from communal storage, resulting in slower performance until it has warmed its depot. Using local ephemeral storage makes sense here, because you will get the benefit of having faster depot storage. Because your nodes have to warm their depots anyhow, there is less of a downside of having the depot on ephemeral storage.
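The rule of thumb in the preceding paragraphs can be condensed into a small decision helper. This is only a sketch of the guidance given here; the function and parameter names are hypothetical, and real deployments should also weigh cost and workload specifics.

```python
def suggest_depot_storage(data_changes_rapidly: bool,
                          nodes_stopped_for_long_periods: bool) -> str:
    """Return 'local' or 'persistent' following the guidance above.

    If the depot contents are usually stale when a node restarts, the depot
    must be re-warmed anyway, so faster local (ephemeral) disks cost little.
    Otherwise persistent disks keep still-useful depot data across restarts.
    """
    if data_changes_rapidly or nodes_stopped_for_long_periods:
        return "local"
    return "persistent"


# Stable data set with frequent short stop/start cycles: keep the depot.
print(suggest_depot_storage(False, False))
```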

For general guidelines on scaling your cluster for Eon Mode database, see Configuring your Vertica cluster for Eon Mode.

2.2 - Eon Mode on GCP prerequisites

Before deploying an Eon Mode database on GCP, you must take several steps:

  • Review the default service account's permissions for your GCP project.

  • Create an HMAC key to use when creating your cluster.

  • Create a communal storage location.

Service account permissions

Service accounts allow automated processes to authenticate with GCP. The Eon Mode database deployment process uses the service account for your GCP project to deploy instances. When you create a new project, GCP automatically creates a default service account (identified by project_number-compute@developer.gserviceaccount.com) for the project and grants it the IAM role Editor. See the Google Cloud documentation's Understanding roles for details about this and other IAM roles.

The Editor role lets the service account create resources from the Marketplace. When you create an instance of the Management Console (MC), the MC uses the account to deploy further resources, such as provisioning instances for a database.

For details, see the Google Cloud documentation's Understanding service accounts page.

Permissions and roles

To deploy Vertica on GCP, your user account must have the:

  • Editor role.

  • runtimeconfig.waiters.getIamPolicy permission.
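One way to review role assignments is to export the project's IAM policy as JSON (for example with gcloud projects get-iam-policy) and inspect its bindings. The sketch below assumes the standard policy shape, a list of bindings that each pair a role with its members. The helper names, the sample policy, and the sample identities are hypothetical. Because runtimeconfig.waiters.getIamPolicy is a permission rather than a role, it cannot be confirmed from the bindings alone, and this sketch does not attempt to.

```python
from typing import Dict, Set

EDITOR_ROLE = "roles/editor"  # role ID of the Editor role


def default_compute_service_account(project_number: str) -> str:
    """Build the member string for the project's default service account."""
    return f"serviceAccount:{project_number}-compute@developer.gserviceaccount.com"


def roles_for_member(policy: Dict, member: str) -> Set[str]:
    """Collect every role that the policy binds directly to `member`."""
    return {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }


# Hypothetical policy used only to exercise the helpers.
sample_policy = {
    "bindings": [
        {"role": "roles/editor",
         "members": ["user:alice@example.com",
                     default_compute_service_account("123456789012")]},
        {"role": "roles/viewer", "members": ["user:bob@example.com"]},
    ]
}

print(EDITOR_ROLE in roles_for_member(sample_policy, "user:alice@example.com"))
print(EDITOR_ROLE in roles_for_member(sample_policy, "user:bob@example.com"))
```

Roles inherited through groups or higher-level resources do not appear as direct bindings, so a missing role here is a prompt to investigate rather than proof that access is absent.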

Creating an HMAC key

Vertica uses a hash-based message authentication code (HMAC) key to authenticate requests to access the communal storage location. This key has two parts: an access ID and a secret. When you create an Eon Mode database in GCP, you provide both parts of an HMAC key for the nodes to use to access communal storage.
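To illustrate why the key has two parts, the following sketch shows the general HMAC idea using Python's standard library: the access ID identifies the caller, while the secret is used to compute a signature that the receiver can recompute and compare. This is a generic HMAC-SHA256 illustration under assumed names and values; it does not reproduce the actual request-signing scheme used by Google Cloud Storage or by Vertica.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature of `message` using `secret`."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, message: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(secret, message), signature)


# Hypothetical secret and request description, for illustration only.
secret = b"example-secret"
request = b"GET /verticabucket/mydatabase"
signature = sign(secret, request)

print(verify(secret, request, signature))           # signature matches
print(verify(b"wrong-secret", request, signature))  # different secret fails
```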

To create an HMAC key:

  1. Log in to your Google Cloud account.

  2. If the name of the project you will use to create your database does not appear in the top banner, click the dropdown and select the correct project.

  3. In the navigation menu in the upper-left corner, under the Storage heading, click Storage and select Settings.

  4. In the Settings page, click Interoperability.

  5. Scroll to the bottom of the page and find the User account HMAC heading.

  6. Unless you have already set a default project, you see a message stating that you haven’t set a default project for your user account yet. Click the Set project-id as default project button to choose the current project as your default for interoperability.

  7. Under Access keys for your user account, click Create a key.

  8. Your new access key and secret appear in the HMAC key list. You will need them when you create your Eon Mode database. You can copy them to a handy location (such as a text editor) or leave a browser tab open to this page while you use another tab or window to create your database. These keys remain available on this page, so you do not need to worry about saving them elsewhere.

Creating a communal storage location

Your Eon Mode database needs a storage location for its communal storage. Eon Mode databases running on GCP use Google Cloud Storage (GCS) for their communal storage location. When you create your new Eon Mode database, you will supply the MC's wizard with a GCS URL for the storage location.

This location needs to meet the following criteria:

  • The URL must include at least a bucket name. You can use one or more levels of folders, as well. For example, the following GCS URLs are valid:

    • gs://verticabucket/mydatabase

    • gs://verticabucket/databases/mydatabase

    • gs://verticabucket

    Multiple databases can share the same bucket, as long as each has its own folder.

  • If provided, the lowest-level folder in the URL must not already exist. For example, in the GCS URL gs://verticabucket/databases/mydatabase, the bucket named verticabucket and the directory named databases must exist. The subdirectory named mydatabase must not exist. The Vertica install process expects to create the final folder itself. If the folder already exists, the installation process fails.

  • The permissions on the bucket must be set to allow the service account read, write, and delete privileges on the bucket. The best role to assign to the user to gain these permissions is Storage Object Admin.

  • To prevent performance issues, the bucket must be in the same region as all of the nodes running the Eon Mode database.

  • If you create the database through the admintools UI, you must set gcsauth as a bootstrap parameter in admintools.conf. For more information on this and other GCP parameters, see Google Cloud Storage parameters.

    [BootstrapParameters]
    gcsauth = ID:secret
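
The URL criteria in the list above can be checked with a short parser before you start the wizard. This is a minimal sketch: the function names are hypothetical, and it only splits the URL into its bucket and folder levels. It does not contact GCS, so it cannot tell you whether the bucket exists or whether the final folder is absent.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlparse


def parse_gcs_url(url: str) -> Tuple[str, List[str]]:
    """Split a gs:// URL into (bucket, [folder, ...])."""
    parsed = urlparse(url)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a GCS URL with a bucket name: {url!r}")
    folders = [part for part in parsed.path.split("/") if part]
    return parsed.netloc, folders


def folder_vertica_creates(url: str) -> Optional[str]:
    """Lowest-level folder, which must not exist yet (None if bucket only)."""
    _, folders = parse_gcs_url(url)
    return folders[-1] if folders else None


def folders_that_must_exist(url: str) -> List[str]:
    """Every folder above the lowest level, which must already exist."""
    _, folders = parse_gcs_url(url)
    return folders[:-1]


print(parse_gcs_url("gs://verticabucket/databases/mydatabase"))
print(folder_vertica_creates("gs://verticabucket/databases/mydatabase"))
```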
    

2.3 - Deploying an Eon Mode database on GCP

Once you have taken the steps listed in Eon Mode on GCP prerequisites, you are ready to deploy an Eon Mode database in GCP. This process has two steps: deploy a single-node MC instance, then use the MC to provision and deploy a database. The following topics explain these steps.

2.3.1 - Deploying an MC instance to GCP for Eon Mode

To deploy an MC instance that is able to deploy Eon Mode databases to GCP:

  1. Log in to your GCP account, if you are not currently logged in.

  2. Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.

  3. Verify that the name of the GCP project you want to use for the deployment appears in the top banner. If it does not, click the down arrow next to the project name and select the correct project.

  4. Click the navigation menu icon in the top left of the page and select Marketplace.

  5. In the Search for solutions box, type Vertica Eon Mode and press enter.

  6. Click the search result for Vertica Data Warehouse, Eon Mode. There are two license options: by the hour (BTH) and bring your own license (BYOL). See Deploy Vertica from the Google cloud marketplace for more information on this license choice.

  7. Click Launch on the license option you prefer.

  8. On the following page, fill in the fields to configure your MC instance:

    • Deployment name identifies your MC deployment in the GCP Deployments page.

    • Zone is the location where the virtual machine running your MC instance will be deployed. Make this the same location where your communal storage bucket is located.

    • Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.

    • Machine Type is the virtual hardware configuration of the instance that will run the MC. The default values here are "middle of the road" settings which are sufficient for most use cases. If you are doing a small proof-of-concept deployment, you can choose a less powerful instance to save some money. If you are planning on deploying multiple large databases, consider increasing the count of virtual CPUs and RAM.
      For details about Vertica's default volume configurations, see Eon Mode volume configuration defaults for GCP.

    • User Name for Access to MC is the administrator username for the MC. You can customize this if you want.

    • Network and Subnetwork are the virtual private cloud (VPC) network and subnet within that network you want your MC instance and your Vertica nodes to use. This setting does not affect your MC's external network address. If you want to isolate your Vertica cluster from other GCP instances in your project, create a custom VPC network and optionally a subnet in your GCP project and select them in these fields. See the Google Cloud documentation's VPC network overview page for more information.

    • Firewall enables access to the MC from the internet by opening port 5450 in the firewall. You can choose to not open this port by clearing the I accept opening a port in the firewall (5450) for Vertica box. However, if you do not open the port in the firewall, your MC instance will only be accessible from within the VPC network. Not opening the port will make accessing your MC instance much harder.

    • Source IP ranges for MC traffic: If you choose to open the MC for external access, add one or more CIDR address ranges to this box for the network addresses that you want to be able to access the MC.

  9. Click the Deploy button to start the deployment of your MC instance.

The deployment process will take several minutes.
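Before you paste values into the Source IP ranges for MC traffic field, you can check that each entry is a well-formed CIDR range. The sketch below uses Python's standard ipaddress module. It assumes the ranges are separated by commas, as in the Enterprise Mode launcher, and the function names are hypothetical.

```python
import ipaddress
from typing import List


def parse_cidr_list(text: str) -> List:
    """Parse a comma-separated list of CIDR ranges into network objects.

    Raises ValueError for malformed entries, including ranges that have
    host bits set (for example 203.0.113.5/24).
    """
    networks = []
    for item in text.split(","):
        item = item.strip()
        if item:
            networks.append(ipaddress.ip_network(item, strict=True))
    return networks


def allows_everyone(networks: List) -> bool:
    """True if any range has a zero-length prefix, such as 0.0.0.0/0."""
    return any(network.prefixlen == 0 for network in networks)


ranges = parse_cidr_list("203.0.113.0/24, 198.51.100.7/32")
print([str(network) for network in ranges])
print(allows_everyone(ranges))
```

A check like allows_everyone is a useful guard, because a range such as 0.0.0.0/0 opens the MC to every address on the internet.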

Using a custom service account

In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:

  • The custom service account must have the Editor role.

  • Individual user accounts must have the Service Account User role on the custom service account.

Connect and log into the MC instance

After the deployment process is finished, the Deployment Manager page for your MC instance contains links to connect to the MC via your browser or ssh.

To connect to the MC instance:

  1. The MC administrator user has a randomly-generated password that you need to log into the MC. Copy the password in the MC Admin Password field to the clipboard.

  2. Click Access Management Console.

  3. A new browser tab or window opens, showing you a page titled Redirection Notice. Click the link for the MC URL to continue to the MC login page.

  4. Your browser will likely show you a security warning. The MC instance uses a self-signed security certificate. Most browsers treat these certificates as a security hazard because they cannot verify their origin. You can safely ignore this warning and continue. In most browsers, click the Advanced button on the warning page, and select the option to proceed. In Chrome, this is a link titled Proceed to xxx.xxx.xxx.xxx (unsafe). In Firefox, it is a button labeled Accept the Risk and Continue.

  5. At the login screen, enter the MC administrator user name into the Username box. This user name is mcadmin, unless you changed the user name in the MC deployment form.

  6. Paste the automatically-generated password you copied from the MC Admin Password field earlier into the Password box.

  7. Click Log In.

Once you have logged into the MC, change the MC administrator account's password.

To change the password:

  1. On the home page of the MC, under the MC Tools section, click MC Settings.

  2. In the left-hand menu, click User Management.

  3. Select the entry for the MC administrator account and click Edit.

  4. Click either the Generate new or Edit password button to change the password. If you click the Generate new button, be sure to save the automatically-generated password in a safe location. If you click Edit password, you are prompted to enter a new password twice.

  5. Click Save to update the password.

Now that you have created your MC instance, you are ready to deploy a Vertica Eon Mode cluster. See Using the MC to provision and create an Eon Mode database in GCP.

2.3.2 - Using the MC to provision and create an Eon Mode database in GCP

After you deploy an MC instance to GCP, use it to deploy an Eon Mode database.

To use the MC to provision and deploy a new Eon Mode database on GCP:

  1. From the MC home screen, click Create new database to launch the Create a Vertica Cluster on Google Cloud wizard.

  2. On the first page of the wizard enter the following information:

    • Google Cloud Storage HMAC Access Key and HMAC Secret Key: Copy and paste the HMAC access key and secret you created earlier. You find these values on the Interoperability tab of the Storage Settings page. See Eon Mode on GCP prerequisites for details.

    • Zone: This value defaults to the zone containing your MC instance. Make this value the same as the zone containing the Google Cloud Storage bucket that your database will use for communal storage.

    • CIDR Range: The IP address range for clients to whom you want to grant access to your database. Make this range as restrictive as possible to limit access to your database.

  3. Click Next, and supply the following information:

    • Vertica Database Name: the name for your new database. See Creating a database name and password for database name requirements.

    • Vertica Version: select the desired Vertica database version. You can select from the latest hotfix of recent Vertica releases. For each database version, you can also select the operating system.

    • Vertica Database User Name: the name of the database superuser. This name defaults to dbadmin, but you can enter another user name here.

    • Password and Confirm Password: Enter a password for the database superuser account.

    • Database Size: The number of nodes in your initial database. If you specify more than three nodes here, you must supply a valid Vertica license file in the Vertica License field (below).

    • Vertica License: Click Browse to locate and upload your Vertica license key file. If you do not supply a license key file here, the wizard deploys your database with a Vertica Community Edition license. This license has a three-node limit, so the value in the Database Size field cannot be larger than 3 if you do not supply a license. If you use a Community Edition license for your deployment, you can upgrade the license later to expand your cluster or load more than 1TB of data. See Managing licenses for more information.

    • Load example data: Check this box if you want your deployed database to load some example clickstream data. This option is useful if you are testing features and just want some preloaded data in the database to query.

  4. Click Next and supply the following information:

    • Instance Type: the specifications of the virtual machine instances the MC will use to deploy your database nodes. See the Google Cloud documentation's Machine types page for details of each instance type. Also see GCP Eon Mode instance recommendations.

    • Database Depot Path and Disk Type: the local mount point for the depot, and the type and number of local disks dedicated to the depot for each node. You cannot change the mount path for the depot. The disks you select in the Disk Type field are only used to store the depot. On the next page of the wizard, you will configure disks for the catalog and temporary disk space. You will see the best performance when using SSD disks, although at a higher cost. You can choose to use faster local storage for your depot. However, local storage is ephemeral—GCP wipes the disk clean whenever you stop the instance. This means each time you start a node, it will have to warm its depot from scratch, rather than taking advantage of any still-current data in its depot. See the Google Cloud documentation's Storage options page for more information about the local disk options.

    • Volume Size: the amount of disk space available on each disk attached to each node in your cluster. This field shows you the total disk space available per node in your cluster. For the best practices on choosing the amount of disk space for your nodes, see Configuring your Vertica cluster for Eon Mode.

    • Data Segmentation Shards: sets the number of shards in your database. After you set this value, you cannot change it later. See Configuring your Vertica cluster for Eon Mode for recommendations. The default value is based on the number of nodes you entered in the Database Size field earlier. It is usually sufficient, unless you anticipate greatly expanding your cluster beyond your initial node count.

    • Communal Location: a Google Cloud Storage URL that specifies where to store your database's communal data. See Eon Mode on GCP prerequisites for requirements.

    • Instance IP settings: specify whether the nodes in your database will have static or ephemeral network addresses that are accessible from the internet, or addresses that are only accessible from within the internal virtual network.

  5. Click Next. The wizard validates your communal storage location URL. If there is a problem with the URL you entered, it displays an error message and prompts you to fix the URL.

    After your communal storage URL passes validation, fill in the following information:

    • Database Catalog Path, Disk Type, and Size (GB) per Available Node: the mount point, disk type, and disk size for the local copy of the database catalog on each node. You cannot edit the mount point. You choose the type of local disk to use for the catalog, and its size. You can only choose persistent disk storage for the catalog. SSD drives are faster, but more expensive than standard disks. The default setting for the disk size is adequate for most medium size databases. Increase the size if you anticipate maintaining a large database.

    • Database Temp Path, Disk Type, and Size (GB) per Available Node: the mount point, disk type, and disk size for the temporary storage space on each node. You cannot edit the mount point. You choose the type of local disk to use, and its size. You can only choose persistent disk storage for the temporary disk space. SSD drives are faster, but more expensive than standard disks. The default setting is adequate for most databases. Consider increasing the temporary space if you perform many complex merges that spill to disk.

    • Label Instances: check this box to enable adding labels to your node's instances. Many organizations use labels to organize, track responsibility, and assign costs for instances. See the Google Cloud documentation's Labeling resources page for more information. If you choose to add labels, enter the label name and value, and click Add.

  6. Click Next. Review the summary of all your database settings. If you need to make a correction, use the Back button to step back to previous pages of the wizard.

  7. When you are satisfied with the database settings, check Accept terms and conditions and click Create.

The process of provisioning and creating the database takes several minutes. After it completes successfully, the MC displays a Get Started button. This button leads to a page of useful links for getting started with your new database.
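Two of the wizard's constraints lend themselves to a quick check before you begin: the zone should be in the same region as the communal storage bucket, and a database size above three nodes requires a license file. The sketch below is illustrative. The function names are hypothetical, and it assumes that zone names follow the usual region-plus-letter pattern (for example, us-east1-b) and that the bucket is a single-region bucket whose location is the region name; multi-region buckets are not handled.

```python
CE_NODE_LIMIT = 3  # Community Edition node limit stated in the wizard steps


def region_of_zone(zone: str) -> str:
    """Derive the region from a zone name such as 'us-east1-b'."""
    region, separator, suffix = zone.rpartition("-")
    if not separator or not region or not suffix:
        raise ValueError(f"unexpected zone name: {zone!r}")
    return region


def zone_matches_bucket(zone: str, bucket_location: str) -> bool:
    """Compare case-insensitively; locations may be reported in upper case."""
    return region_of_zone(zone).lower() == bucket_location.lower()


def license_required(node_count: int, ce_limit: int = CE_NODE_LIMIT) -> bool:
    """True if the requested database size exceeds the CE node limit."""
    return node_count > ce_limit


print(zone_matches_bucket("us-east1-b", "US-EAST1"))
print(license_required(6))
```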
