Vertica supports automatic deployment on Google Cloud Platform (GCP) through the Google Cloud Launcher, or manual installation and deployment on GCP machines.
You can deploy a Vertica database on GCP running in either Enterprise Mode or Eon Mode. In Eon Mode, Vertica stores its data communally using Google Cloud Storage (GCS).
This section explains how to deploy a Vertica database to GCP.
Vertica Analytic Database supports a range of machine types, each optimized for different workloads. When you deploy your Vertica Analytic Database cluster to the Google Cloud Platform (GCP), different machine types are available depending on how you provision your database.
Note
Some machine types are not available across all regions.
The sections below list the GCP machine types that Vertica supports for Vertica cluster hosts, and for use in Management Console. For details on the configuration of the machine type options, see the Google Cloud documentation's Machine types page.
Machine types available for MC hosts
Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy an instance for running the Vertica Management Console.
Tip
In most cases, 8 vCPUs are sufficient when selecting a machine type for running the Management Console.
Machine types available for Vertica database cluster hosts
Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy cluster hosts.
Machine types for Vertica database cluster hosts provisioned from MC
The table below lists the GCP machine types that Vertica supports when you provision your cluster from Management Console.
Machine Type      Machine Name
N1 standard       n1-standard-16, n1-standard-32, n1-standard-64
N1 high-memory    n1-highmem-16, n1-highmem-32, n1-highmem-64
N2 standard       n2-standard-16, n2-standard-32, n2-standard-48, n2-standard-64
N2 high-memory    n2-highmem-16, n2-highmem-32, n2-highmem-48, n2-highmem-64
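As a convenience, the list of machine types above can be expressed as a small lookup. The following Python sketch is illustrative only: the dictionary and function names are hypothetical, and the machine names are copied from the list above.

```python
# Machine types supported when provisioning a cluster from Management Console,
# copied from the list above. The names of this dict and helper are hypothetical.
MC_PROVISIONED_MACHINE_TYPES = {
    "N1 standard": ["n1-standard-16", "n1-standard-32", "n1-standard-64"],
    "N1 high-memory": ["n1-highmem-16", "n1-highmem-32", "n1-highmem-64"],
    "N2 standard": ["n2-standard-16", "n2-standard-32", "n2-standard-48", "n2-standard-64"],
    "N2 high-memory": ["n2-highmem-16", "n2-highmem-32", "n2-highmem-48", "n2-highmem-64"],
}


def is_supported_for_mc_provisioning(machine_name: str) -> bool:
    """Return True if machine_name appears in the list above."""
    name = machine_name.strip().lower()
    return any(name in names for names in MC_PROVISIONED_MACHINE_TYPES.values())


print(is_supported_for_mc_provisioning("n2-standard-48"))  # True
print(is_supported_for_mc_provisioning("n1-standard-48"))  # False (not listed for N1)
```

Remember that some machine types are not available in every region, so a name passing this check may still be unavailable in your zone.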
2 - Deploy Vertica from the Google cloud marketplace
The Vertica entries in the Google Cloud Launcher Marketplace let you quickly deploy a Vertica cluster in the Google Cloud Platform (GCP). Currently, three entries let you select the database mode and the license you want to use:
The Eon Mode BYOL (bring your own license) launcher deploys a single instance running the MC. You use this MC instance to deploy a Vertica database running on Eon Mode. This database has a community license applied to it initially. You can later upgrade it to a license you have obtained from Vertica. See Deploy an MC instance in GCP for Eon Mode for more information.
The Eon Mode BTH (by the hour) launcher also deploys a single instance running the MC that you use to deploy a database. This database has a by-the-hour license applied to it. Instead of paying for a license up front, you pay an hourly fee that covers both Vertica and running your instances. The BTH license is automatically applied to all clusters you create using a BTH MC instance. See Deploy an MC instance in GCP for Eon Mode for more information. If you choose, you can upgrade this hourly license to a longer-term license you purchase from Vertica. To move a BTH cluster to a BYOL license, follow the instructions in Moving a cloud installation from by the hour (BTH) to bring your own license (BYOL).
Note
Vertica clusters that use IPv6 to identify hosts have not been tested on GCP. Vertica recommends you use IPv4 addresses to identify the hosts in your cluster on GCP.
2.1 - Eon Mode on GCP prerequisites
Before deploying an Eon Mode database on GCP, you must take several steps:
Review the default service account's permissions for your GCP project.
Create an HMAC key to use when creating your cluster.
Create a communal storage location.
Service account permissions
Service accounts allow automated processes to authenticate with GCP. The Eon Mode database deployment process uses the service account for your GCP project to deploy instances. When you create a new project, GCP automatically creates a default service account (identified by project_number-compute@developer.gserviceaccount.com) for the project and grants it the IAM role Editor. See the Google Cloud documentation's Understanding roles for details about this and other IAM roles.
The Editor role lets the service account create resources from the Marketplace. When you create an instance of the Management Console (MC), the MC uses the account to deploy further resources, such as provisioning instances for a database.
To deploy Vertica on GCP, your user account must have the following:
Editor role.
runtimeconfig.waiters.getIamPolicy permission.
Creating an HMAC key
Vertica uses a hash-based message authentication code (HMAC) key to authenticate requests to access the communal storage location. This key has two parts: an access ID and a secret. When you create an Eon Mode database in GCP, you provide both parts of an HMAC key for the nodes to use to access communal storage.
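To illustrate the general idea behind hash-based message authentication, the following Python sketch signs a message with a secret and verifies it using only the standard library. It is a generic illustration, not the request-signing scheme that GCS or Vertica actually uses. The access ID, secret, message, and helper names are all made up for the example.

```python
import hashlib
import hmac

# Hypothetical key store: the access ID identifies which secret to use.
KNOWN_KEYS = {"EXAMPLE_ACCESS_ID": "example-secret"}


def sign(secret: str, message: str) -> str:
    """Return the hex HMAC-SHA256 digest of message under secret."""
    return hmac.new(secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(access_id: str, message: str, signature: str) -> bool:
    """Look up the secret for access_id and check the signature in constant time."""
    secret = KNOWN_KEYS.get(access_id)
    if secret is None:
        return False
    return hmac.compare_digest(sign(secret, message), signature)


message = "GET /verticabucket/mydatabase"
good = sign("example-secret", message)
bad = sign("wrong-secret", message)
print(verify("EXAMPLE_ACCESS_ID", message, good))  # True
print(verify("EXAMPLE_ACCESS_ID", message, bad))   # False
```

The sketch shows why both parts matter: the access ID selects the key, and only a caller who holds the matching secret can produce a signature that verifies.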
To create an HMAC key:
Log in to your Google Cloud account.
If the name of the project you will use to create your database does not appear in the top banner, click the dropdown and select the correct project.
In the navigation menu in the upper-left corner, under the Storage heading, click Storage and select Settings.
In the Settings page, click Interoperability.
Scroll to the bottom of the page and find the User account HMAC heading.
Unless you have already set a default project, you see a message stating that you haven't set a default project for your user account yet. Click the Set project-id as default project button to choose the current project as your default for interoperability.
Note
The project ID appears in the button label, not the project name.
Under Access keys for your user account, click Create a key.
Your new access key and secret appear in the HMAC key list. You will need them when you create your Eon Mode database. You can copy them to a handy location (such as a text editor) or leave a browser tab open to this page while you use another tab or window to create your database. These keys remain available on this page, so you do not need to worry about saving them elsewhere.
Caution
It is vital that you protect the security of your HMAC key. It can grant others access to your Eon Mode database's communal storage location. This means they could access all of the data in your database. Do not write the HMAC key anyplace where it may be exposed, such as email, shared folders, or similar insecure locations.
Creating a communal storage location
Your Eon Mode database needs a storage location for its communal storage. Eon Mode databases running on GCP use Google Cloud Storage (GCS) for their communal storage location. When you create your new Eon Mode database, you will supply the MC's wizard with a GCS URL for the storage location.
This location needs to meet the following criteria:
The URL must include at least a bucket name. You can use one or more levels of folders, as well. For example, the following GCS URLs are valid:
gs://verticabucket/mydatabase
gs://verticabucket/databases/mydatabase
gs://verticabucket
Multiple databases can share the same bucket, as long as each has its own folder.
If provided, the lowest-level folder in the URL must not already exist. For example, in the GCS URL gs://verticabucket/databases/mydatabase, the bucket named verticabucket and the directory named databases must exist. The subdirectory named mydatabase must not exist. The Vertica install process expects to create the final folder itself. If the folder already exists, the installation process fails.
The bucket's permissions must grant the service account read, write, and delete privileges on the bucket. The best role to assign to gain these permissions is Storage Object Admin.
To prevent performance issues, the bucket must be in the same region as all of the nodes running the Eon Mode database.
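The URL rules above can be checked before you start the MC wizard. The following Python sketch splits a GCS URL into its bucket and folder levels and reports which part must already exist and which folder the installer expects to create. The function names are hypothetical, and the sketch only inspects the URL text; it does not contact GCS to confirm that anything exists.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlparse


def parse_gcs_url(url: str) -> Tuple[str, List[str]]:
    """Split a gs:// URL into (bucket, [folder, ...])."""
    parsed = urlparse(url)
    if parsed.scheme != "gs":
        raise ValueError(f"not a GCS URL: {url!r}")
    if not parsed.netloc:
        raise ValueError(f"GCS URL has no bucket name: {url!r}")
    folders = [part for part in parsed.path.split("/") if part]
    return parsed.netloc, folders


def existing_and_new(url: str) -> Tuple[str, Optional[str]]:
    """Return (path that must already exist, final folder that must not exist)."""
    bucket, folders = parse_gcs_url(url)
    if not folders:
        return f"gs://{bucket}", None
    parent = "/".join([bucket] + folders[:-1])
    return f"gs://{parent}", folders[-1]


print(existing_and_new("gs://verticabucket/databases/mydatabase"))
print(existing_and_new("gs://verticabucket"))
```

For gs://verticabucket/databases/mydatabase, the first element is the path that must exist (the bucket and the databases folder) and the second is the folder that must not exist yet.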
If you create the database through the admintools UI, you must set gcsauth as a bootstrap parameter in admintools.conf. For more information on this and other GCP parameters, see Google Cloud Storage parameters.
[BootstrapParameters]
gcsauth = ID:secret
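If you generate this fragment with a script, Python's standard configparser module can render it. The sketch below only produces the text of the [BootstrapParameters] section shown above. It does not edit a real admintools.conf, and the helper name and sample credentials are hypothetical.

```python
import configparser
import io


def bootstrap_section(access_id: str, secret: str) -> str:
    """Render a [BootstrapParameters] fragment with gcsauth set to ID:secret."""
    if not access_id or not secret:
        raise ValueError("both parts of the HMAC key are required")
    # interpolation=None keeps any '%' characters in the values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser["BootstrapParameters"] = {"gcsauth": f"{access_id}:{secret}"}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(bootstrap_section("EXAMPLE_ACCESS_ID", "example-secret"))
```

Because the rendered text contains the HMAC secret, treat any file or log that receives it with the same care as the key itself.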
2.2 - Deploy an Enterprise Mode database in GCP from the marketplace
The Vertica Cloud Launcher solution creates a Vertica Enterprise Mode database. The solution includes the Vertica Management Console (MC) as the primary UI for you to get started.
The launcher automatically creates a database named vdb using the Community Edition (CE) license. The CE license is limited to a maximum of 3 nodes. You can tell the launcher to add more than 3 nodes to your deployment. In this case, it uses the first three nodes in the cluster to create the database. The remaining nodes are not part of the database, but are added to your cluster. To add these nodes to your database, you must replace the Community Edition license with a license key you receive from the Software Entitlement support site. See Managing licenses for more information.
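The node split described above can be pictured with a few lines of Python. This is only a sketch of the behavior as stated in this section: the host names are hypothetical, and the launcher's real node-selection logic is not shown here.

```python
from typing import List, Tuple

CE_MAX_NODES = 3  # Community Edition node limit stated above


def split_nodes(hosts: List[str]) -> Tuple[List[str], List[str]]:
    """Return (nodes used for the database, nodes left in the cluster only)."""
    return hosts[:CE_MAX_NODES], hosts[CE_MAX_NODES:]


database_nodes, spare_nodes = split_nodes(
    ["node-1", "node-2", "node-3", "node-4", "node-5"]
)
print(database_nodes)  # ['node-1', 'node-2', 'node-3']
print(spare_nodes)     # ['node-4', 'node-5']
```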
After the launcher creates the initial database, it configures the MC to attach to that database automatically.
Configure the Vertica cloud launcher solution
To get started with a deployment of Vertica from the Google Cloud Launcher, search for the Vertica Data Warehouse, Enterprise Mode entry.
Follow these steps:
Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.
From the listing page, click LAUNCH.
On the New Vertica Analytics Platform deployment page, enter the following information:
Deployment name: Each deployment must have a unique name. That name is used as the prefix for the names of all VMs created during the deployment. The deployment name can only contain lowercase characters, numbers, and dashes. The name must start with a lowercase letter and cannot end with a dash.
Zone: GCP breaks its cloud data centers into regions and zones. Regions are a collection of zones in the same geographical location. Zones are collections of compute resources, which vary from zone to zone.
For best results, pick the zone in your designated region that supports the latest Intel CPUs. For a complete listing of regions and zones, including supported processors, see Regions and Zones.
Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.
Under Vertica Management Console, choose the configuration for the virtual machine that will run the Management Console. The Vertica Analytics Platform in Cloud Launcher always deploys the Vertica Management Console (MC) as part of the solution.
The default machine type for MC is sufficient for most deployments. You can choose another machine type that better suits any additional purposes, such as serving as a target node for backups, data transformation, or additional management tools.
Node count for Vertica Cluster: The total number of VMs you want to deploy in the Vertica Cluster. The default is 3.
Note
As mentioned above, the Cloud Launcher automatically deploys the Vertica Community Edition license, which limits the database to 3 nodes and up to 1 TB in raw data. Any additional nodes will be part of your database cluster, but will not be part of your database.
If you intend to use the Community Edition license for your database, leave the setting at 3. Otherwise, you would add nodes that will sit idle and cost you money without being part of your database.
Machine type for Vertica Cluster nodes: The Cloud Launcher builds each node in the cluster using the same machine type. Modify the machine type for your nodes based on the workloads you expect your database to handle. See Supported GCP machine types for more information.
Data disk type: GCP offers two types of persistent disk storage: Standard and SSD. The costs associated with Standard are less, but the performance of SSD storage is much better. Vertica recommends you use SSD storage. For more information on Standard and SSD persistent disks, see Storage Options.
Disk size in GB: Disk performance is directly tied to the disk size in GCP. The default value of 2000 GB (2 TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.
If you select a smaller disk size, the throughput performance decreases. If you select a larger disk size, the performance remains the same as the 2 TB option.
Network: VMs in GCP must exist on a virtual private cloud (VPC). When you created your GCP account, a default VPC was created. Create additional VPCs to isolate solutions or projects from one another. The Vertica Analytics Platform creates all the nodes in the same VPC.
Subnetwork: Just as a GCP account may have multiple VPCs, each VPC may also have multiple subnets. Use additional subnets to group or isolate solutions within the same VPC.
Firewall: If you want your MC to be accessible via the internet, check the Allow access to the Management Console from the Internet box. Vertica recommends you protect your MC using a firewall that restricts access to just the IP addresses of users that need to access it. You can enter one or more comma-separated CIDR address ranges.
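The deployment name rules above translate directly into a regular expression. The following Python sketch checks a candidate name before you submit the form. The function name is hypothetical, and the check covers only the rules stated in this section; any additional limits that GCP enforces, such as length, are not checked.

```python
import re

# Starts with a lowercase letter; contains only lowercase letters, digits,
# and dashes; does not end with a dash.
_DEPLOYMENT_NAME = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_deployment_name(name: str) -> bool:
    """Return True if name satisfies the rules listed above."""
    return _DEPLOYMENT_NAME.fullmatch(name) is not None


for candidate in ["vertica-demo-1", "Vertica", "1vertica", "vertica-"]:
    print(candidate, is_valid_deployment_name(candidate))
```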
After you have entered all the required information, click Deploy to begin the deployment process.
Monitor the deployment
After the deployment begins, Google Cloud Launcher automatically opens the Deployment Manager page that displays the status of the deployment. Items that are still being processed have a spinning circle to the left of them and the text is a light gray color. Items that have been created are dark gray in color, with an icon designating that resource type on the left.
After the deployment completes, a green check mark appears next to the deployment name in the upper left-hand section of the screen.
Accessing the cluster after deployment
After the deployment completes, the right-hand section of the screen displays the following information:
dbadmin password: A randomly generated password for the dbadmin account on the nodes. For security reasons, change the dbadmin password when you first log in to one of the Vertica cluster nodes.
mcadmin password: A randomly generated password for the mcadmin account for accessing the Management Console. For security reasons, change the mcadmin password after you first log in to the MC.
Vertica Node 1 IP address: The external IP address for the first node in the Vertica cluster is exposed here so that you can connect to the VM using a standard SSH client.
To access the MC, press the Access Vertica MC button in the Get Started section of the dialog box. Copy the mcadmin password and paste it when asked.
There are two ways to access the cluster nodes directly:
Use GCP's integrated SSH shell by selecting the SSH button in the Get Started section. This shell opens a pop-up in your browser that runs GCP's web-based SSH client. You are automatically logged on as the user you authenticated as in the GCP environment.
After you have access to the first Vertica cluster node, execute the su dbadmin command, and authenticate using the dbadmin password.
Alternatively, use a standard SSH client to connect directly to the first Vertica cluster node. Connect to the Vertica Node 1 IP address listed on the screen as the dbadmin user, and authenticate with the dbadmin password.
Follow the on-screen directions to log in using the mcadmin account and accept the EULA. After you've been authenticated, access the initial database by clicking the vdb icon (looks like a green cylinder) in the Recent Databases section.
Using a custom service account
In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:
The custom service account must have the Editor role.
Individual user accounts must have the Service Account User role on the custom service account.
2.3 - Deploy an MC instance in GCP for Eon Mode
To deploy an Eon Mode database to GCP using Google Cloud Platform Launcher, you must deploy a Management Console (MC) instance. You then use the MC instance to provision and deploy an Eon Mode database.
Once you have taken the steps listed in Eon Mode on GCP prerequisites, you are ready to deploy an Eon Mode database in GCP. To deploy an MC instance that is able to deploy Eon Mode databases to GCP:
Log into your GCP account, if you are not currently logged in.
Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.
Verify that the name of the GCP project you want to use for the deployment appears in the top banner. If it does not, click the down arrow next to the project name and select the correct project.
Click the navigation menu icon in the top left of the page and select Marketplace.
In the Search for solutions box, type Vertica Eon Mode and press Enter.
Click the search result for Vertica Data Warehouse, Eon Mode. There are two license options: by the hour (BTH) and bring your own license (BYOL). See Deploy Vertica from the Google cloud marketplace for more information on this license choice.
Click Launch on the license option you prefer.
On the following page, fill in the fields to configure your MC instance:
Deployment name identifies your MC deployment in the GCP Deployments page.
Zone is the location where the virtual machine running your MC instance will be deployed. Make this the same location where your communal storage bucket is located.
Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.
Machine Type is the virtual hardware configuration of the instance that will run the MC. The default values here are "middle of the road" settings which are sufficient for most use cases. If you are doing a small proof-of-concept deployment, you can choose a less powerful instance to save some money. If you are planning on deploying multiple large databases, consider increasing the count of virtual CPUs and RAM. For details about Vertica's default volume configurations, see Eon Mode volume configuration defaults for GCP.
User Name for Access to MC is the administrator username for the MC. You can customize this if you want.
Network and Subnetwork are the virtual private cloud (VPC) network and subnet within that network you want your MC instance and your Vertica nodes to use. This setting does not affect your MC's external network address. If you want to isolate your Vertica cluster from other GCP instances in your project, create a custom VPC network and optionally a subnet in your GCP project and select them in these fields. See the Google Cloud documentation's VPC network overview page for more information.
Firewall enables access to the MC from the internet by opening port 5450 in the firewall. You can choose to not open this port by clearing the I accept opening a port in the firewall (5450) for Vertica box. However, if you do not open the port in the firewall, your MC instance will only be accessible from within the VPC network. Not opening the port will make accessing your MC instance much harder.
Source IP ranges for MC traffic: If you choose to open the MC for external access, add one or more CIDR address ranges to this box for the network addresses that you want to be able to access the MC.
Caution
Make the address ranges as limited as possible to reduce the chances of unauthorized access to your MC instance.
Click the Deploy button to start the deployment of your MC instance.
The deployment process will take several minutes.
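Before you fill in the Source IP ranges for MC traffic field, you can check your CIDR ranges locally. The following Python sketch parses a list of ranges with the standard ipaddress module and flags any that look broader than intended. It assumes the ranges are comma-separated, and the /24 threshold is an arbitrary example, not a Vertica recommendation. The function names are hypothetical.

```python
import ipaddress
from typing import List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_source_ranges(text: str) -> List[Network]:
    """Parse comma-separated CIDR ranges; raise ValueError on a malformed entry."""
    ranges = []
    for item in text.split(","):
        item = item.strip()
        if item:
            # strict=True rejects entries with host bits set, such as 203.0.113.5/24.
            ranges.append(ipaddress.ip_network(item, strict=True))
    return ranges


def overly_broad(ranges: List[Network], min_prefix: int = 24) -> List[Network]:
    """Return the ranges whose prefix is shorter (broader) than min_prefix."""
    return [net for net in ranges if net.prefixlen < min_prefix]


ranges = parse_source_ranges("203.0.113.0/24, 198.51.100.7/32, 0.0.0.0/0")
print([str(net) for net in overly_broad(ranges)])  # ['0.0.0.0/0']
```

A range such as 0.0.0.0/0 admits every address on the internet, which is exactly what the caution above advises against.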
Using a custom service account
In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:
The custom service account must have the Editor role.
Individual user accounts must have the Service Account User role on the custom service account.
Connect and log into the MC instance
After the deployment process is finished, the Deployment Manager page for your MC instance contains links to connect to the MC via your browser or SSH.
To connect to the MC instance:
The MC administrator user has a randomly-generated password that you need to log into the MC. Copy the password in the MC Admin Password field to the clipboard.
Click Access Management Console.
A new browser tab or window opens, showing you a page titled Redirection Notice. Click the link for the MC URL to continue to the MC login page.
Your browser will likely show you a security warning. The MC instance uses a self-signed security certificate. Most browsers treat these certificates as a security hazard because they cannot verify their origin. You can safely ignore this warning and continue. In most browsers, click the Advanced button on the warning page, and select the option to proceed. In Chrome, this is a link titled Proceed to xxx.xxx.xxx.xxx (unsafe). In Firefox, it is a button labeled Accept the Risk and Continue.
At the login screen, enter the MC administrator user name into the Username box. This user name is mcadmin, unless you changed the user name in the MC deployment form.
Paste the automatically-generated password you copied from the MC Admin Password field earlier into the Password box.
Click Log In.
Once you have logged into the MC, change the MC administrator account's password.
Caution
The automatically-generated password appears on the MC instance's deployment page and can be revealed in several locations in the deployment logs. Failure to change this password can lead to unauthorized access to your MC instance.
To change the password:
On the home page of the MC, under the MC Tools section, click MC Settings.
In the left-hand menu, click User Management.
Select the entry for the MC administrator account and click Edit.
Click either the Generate new or Edit password button to change the password. If you click the Generate new button, be sure to save the automatically-generated password in a safe location. If you click Edit password, you are prompted to enter a new password twice.
3 - Manually deploy an Enterprise Mode database on GCP
Before you create your Vertica cluster in Google Cloud Platform (GCP) using manual steps, you must create a virtual machine (VM) instance from the Compute Engine section of GCP.
Configure and launch a new instance
All VM instances that you create should be launched in the same virtual private cloud (VPC).
To configure and launch a new VM instance, follow these instructions:
From within the Compute Engine section of GCP, from the menu on the left-hand side of the screen, select VM Instances.
GCP displays all the VM instances that you have created so far.
Click CREATE INSTANCE.
Enter a name for the new instance.
Select the zone where you plan to deploy the instance.
GCP breaks its cloud data centers down by regions and zones. Regions are a collection of zones that are all in the same geographical location. Zones are collections of compute resources, which vary from zone to zone. Always pick the zone in your designated region that supports the latest Intel CPUs.
For a complete listing of regions and zones, including supported processors, see Regions and Zones.
Select a machine type.
GCE offers many different types of VM instances. For best results, only deploy Vertica on VM instances with 8 vCPUs or more and at least 30 GB of RAM.
Select the boot disk (image).
You create VM instances from a public or custom image. If you are starting with Vertica in GCP for the first time, select either the CentOS 7 or RHEL 7 public image. Those images have been tested thoroughly with Vertica.
After you have configured the VM instance to be used as a Vertica cluster node, GCP allows you to convert that instance into a custom image. Doing so allows you to deploy multiple versions of that VM instance; each VM instance is identical except for the node name and IP address.
Before you can connect to any of the VMs you created, you must first identify the external IP address. The VM instance section of GCP contains a list of all currently deployed VMs and their associated external IP addresses.
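The sizing guidance in the machine type step (8 or more vCPUs and at least 30 GB of RAM) is easy to encode as a check. The following Python sketch is illustrative only: the MachineSpec class and the two sample specifications are hypothetical, so look up real vCPU and memory figures on the Google Cloud documentation's Machine types page.

```python
from dataclasses import dataclass

MIN_VCPUS = 8     # minimum vCPUs recommended above
MIN_RAM_GB = 30   # minimum RAM in GB recommended above


@dataclass(frozen=True)
class MachineSpec:
    name: str
    vcpus: int
    ram_gb: float


def meets_vertica_minimum(spec: MachineSpec) -> bool:
    """Return True if the spec satisfies both recommended minimums."""
    return spec.vcpus >= MIN_VCPUS and spec.ram_gb >= MIN_RAM_GB


# Hypothetical specifications used only to exercise the check.
for spec in (MachineSpec("example-small", 4, 15), MachineSpec("example-large", 16, 60)):
    print(spec.name, meets_vertica_minimum(spec))
```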
Connect to your VM
To connect to your VM, complete the following tasks:
Connect to your VM using SSH with the external IP address of the instance you created in the configuration steps.
Authenticate using the credentials and SSH key that you provided to your GCP account upon creation.
Connect to other VMs
To connect to other virtual machines in your virtual network:
Use SSH to connect to your publicly connected VM.
Use SSH again from that VM to connect through the private IP addresses of your other VMs.
Because GCP forces the use of private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use SSH to connect to other VMs in your virtual network.
Prepare the virtual machines
After you create your VMs, you need to prepare them for cluster formation.
Add the Vertica license and private key
Prepare your nodes by adding your private key (if you are using one) and your Vertica license. The following steps assume that the initial user you configured is the DBADMIN user:
As the DBADMIN user, copy your private key file from where you saved it locally onto your primary node.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
If you see the previous failure message, enter the following command to correct permissions on your private key file:
$ chmod 600 /<name-of-key>.pem
Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
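If you copy key files with a script, you can check and correct the permissions at the same time. The following Python sketch performs the same correction as the chmod 600 command above using the standard os and stat modules. It assumes a POSIX system such as the CentOS or RHEL images mentioned earlier. The function name and the sample path in the comment are hypothetical.

```python
import os
import stat


def ensure_private_key_mode(path: str) -> bool:
    """Set path to mode 600 if needed. Return True if the mode was changed."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    if current == 0o600:
        return False
    os.chmod(path, 0o600)  # owner read/write only, same as: chmod 600 <file>
    return True


# Example call (hypothetical path):
# ensure_private_key_mode("/home/dbadmin/examplekey.pem")
```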
Install software dependencies for Vertica on GCP
In addition to the Vertica standard Package dependencies, as the root user, you must install the following packages before you install Vertica:
pstack
mcelog
sysstat
dialog
Configure storage
For best disk performance in GCP, Vertica recommends that you use SSD persistent storage, configured to at least 2 TB (2000 GB) in size. Disk performance is directly tied to the disk size in GCP, and 2000 GB (2 TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.
Caution
Do not store your information on the root volume, especially in your data and catalog directories. Storing information on the root volume may result in data loss.
When configuring your storage, make sure to use a supported file system. See the Vertica documentation for details on supported file systems.
Create a swap file
In addition to storage volumes to store your data, Vertica requires a swap volume or swap file for the setup script to complete.
Create a swap file or swap volume of at least 2 GB. The following steps show how to create a swap file within Vertica on GCP:
After you complete the download and extraction, use the install_vertica script to form a cluster and install the Vertica database software, as described in the next section.
Form a cluster and install Vertica
Use the install_vertica script to combine two or more individual VMs to form a cluster and install your Vertica database.
Before you run the install_vertica script, follow these steps:
Check the VM Instances page of the Compute Engine section on GCP to locate a list of current VMs and their associated internal IP addresses.
Identify your storage location on your VMs. The installer assumes that you have mounted your storage to /home/dbadmin. To specify another location, use the --data-dir argument.
Caution
Do not store your data on the root drive.
The following steps show how to combine virtual machines (VMs) into a cluster using the install_vertica script:
While connected to your primary node, construct the following command to combine your nodes into a cluster.
Substitute the IP addresses for your VMs, and include your root key file name, if applicable.
Include the --point-to-point parameter to configure spread to use direct point-to-point communication among all Vertica nodes, as required for clusters on GCP when installing or updating Vertica.
If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.
After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:
$ shred examplekey.pem
Important
You need your key file to perform future Vertica updates.
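The installer options described in the steps above can be assembled with a small helper. The following Python sketch builds an argument list rather than running anything. It relies on several assumptions: --hosts and --ssh-identity are installer options that this section does not name, the installer path /opt/vertica/sbin/install_vertica is the usual default location, and sudo is used to run as root. Verify all of these against the install_vertica reference for your Vertica version. The function name, IP addresses, and file paths are hypothetical.

```python
import shlex
from typing import List, Optional

INSTALLER = "/opt/vertica/sbin/install_vertica"  # assumed default install path


def build_install_command(
    hosts: List[str],
    data_dir: Optional[str] = None,
    ssh_identity: Optional[str] = None,
    license_file: Optional[str] = None,
) -> List[str]:
    """Build an install_vertica argument list for a GCP cluster."""
    if len(hosts) < 2:
        raise ValueError("the steps above combine two or more VMs into a cluster")
    cmd = ["sudo", INSTALLER, "--hosts", ",".join(hosts)]
    cmd.append("--point-to-point")  # required for clusters on GCP, per the steps above
    if data_dir:
        cmd += ["--data-dir", data_dir]  # omit to use the installer's default location
    if ssh_identity:
        cmd += ["--ssh-identity", ssh_identity]  # assumed option for the root key file
    # With no license file, fall back to Community Edition as described above.
    cmd += ["-L", license_file or "CE"]
    return cmd


command = build_install_command(
    ["10.0.0.11", "10.0.0.12", "10.0.0.13"],
    data_dir="/vertica/data",
    ssh_identity="~/examplekey.pem",
)
print(shlex.join(command))
```

Building the command as a list and joining it with shlex.join keeps paths that contain spaces correctly quoted when you paste the result into a shell.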
When you installed Vertica, a database administrator user (usually named dbadmin) was created with the DBADMIN role. Use this account to create and start a database.