This is the multi-page printable view of this section.
Click here to print.
Return to the regular view of this page.
Connection load balancing
Each client connection to a host in the Vertica cluster requires a small overhead in memory and processor time.
Each client connection to a host in the Vertica cluster requires a small overhead in memory and processor time. If many clients connect to a single host, this overhead can begin to affect the performance of the database. You can spread the overhead of client connections by dictating that certain clients connect to specific hosts in the cluster. However, this manual balancing becomes difficult as new clients and hosts are added to your environment.
Connection load balancing helps automatically spread the overhead of client connections across the cluster by having hosts redirect client connections to other hosts. By redirecting connections, the overhead from client connections is spread across the cluster without having to manually assign particular hosts to individual clients. Clients can connect to a small handful of hosts, and they are naturally redirected to other hosts in the cluster. Load balancing does not redirect connections to draining hosts. For more information see, Drain client connections.
Native connection load balancing
Native connection load balancing is a feature built into the Vertica Analytic Database server and client libraries as well as vsql. Both the server and the client need to enable load balancing for it to function. If connection load balancing is enabled, a host in the database cluster can redirect a client's attempt to connect to it to another currently-active host in the cluster. This redirection is based on a load balancing policy. This redirection only takes place once, so a client is not bounced from one host to another.
Because native connection load balancing is incorporated into the Vertica client libraries, any client application that connects to Vertica transparently takes advantage of it simply by setting a connection parameter.
How you choose to implement connection load balancing depends on your network environment. Since native load connection balancing is easier to implement, you should use it unless your network configuration requires that clients be separated from the hosts in the Vertica database by a firewall.
For more about native connection load balancing, see About Native Connection Load Balancing.
Workload routing
Workload routing lets you create rules for routing client connections to particular subclusters based on their workloads.
The primary advantages of this type of load balancing is as follows:
- Database administrators can associate certain subclusters with certain workloads (as opposed to client IP addresses).
- Clients do not need to know anything about the subcluster they will be routed to, only the type of workload they have.
- Database administrators can change workload routing policies at any time, and these changes are transparent to all clients.
For details, see Workload routing.
1 - About native connection load balancing
Native connection load balancing is a feature built into the Vertica server and client libraries that helps spread the CPU and memory overhead caused by client connections across the hosts in the database.
Native connection load balancing is a feature built into the Vertica server and client libraries that helps spread the CPU and memory overhead caused by client connections across the hosts in the database. It can prevent unequal distribution of client connections among hosts in the cluster.
There are two types of native connection load balancing:
-
Cluster-wide balancing—This method the legacy method of connection load balancing. It was the only type of load balancing prior to Vertica version 9.2. Using this method, you apply a single load balancing policy across the entire cluster. All connection to the cluster are handled the same way.
-
Load balancing policies—This method lets you set different load balancing policies depending on the source of client connection. For example, you can have a policy that redirects connections from outside of your local network to one set of nodes in your cluster, and connections from within your local network to another set of nodes.
2 - Classic connection load balancing
The classic connection load balancing feature applies a single policy for all client connections to your database.
The classic connection load balancing feature applies a single policy for all client connections to your database. Both your database and the client must enable the load balancing option in order for connections to be load balanced. When both client and server enable load balancing, the following process takes place when the client attempts to open a connection to Vertica:
-
The client connects to a host in the database cluster, with a connection parameter indicating that it is requesting a load-balanced connection.
-
The host chooses a host from the list of currently up hosts in the cluster, according to the current load balancing scheme. Under all schemes, it is possible for a host to select itself.
-
The host tells the client which host it selected to handle the client's connection.
-
If the host chose another host in the database to handle the client connection, the client disconnects from the initial host. Otherwise, the client jumps to step 6.
-
The client establishes a connection to the host that will handle its connection. The client sets this second connection request so that the second host does not interpret the connection as a request for load balancing.
-
The client connection proceeds as usual, (negotiating encryption if the connection has SSL enabled, and proceeding to authenticating the user ).
This process is transparent to the client application. The client driver automatically disconnects from the initial host and reconnects to the host selected for load balancing.
Requirements
-
In mixed IPv4 and IPv6 environments, balancing only works for the address family for which you have configured native load balancing. For example, if you have configured load balancing using an IPv4 address, then IPv6 clients cannot use load balancing, however the IPv6 clients can still connect, but load balancing does not occur.
-
The native load balancer returns an IP address for the client to use. This address must be one that the client can reach. If your nodes are on a private network, native load-balancing requires you to publish a public address in one of two ways:
-
Set the public address on each node. Vertica saves that address in the export_address
field in the NODES system table.
-
Set the subnet on the database. Vertica saves that address in the export_subnet
field in the DATABASES system table.
Load balancing schemes
The load balancing scheme controls how a host selects which host to handle a client connection. There are three available schemes:
-
NONE
(default): Disables native connection load balancing.
-
ROUNDROBIN
: Chooses the next host from a circular list of hosts in the cluster that are up—for example, in a three-node cluster, iterates over node1, node2, and node3, then wraps back to node1. Each host in the cluster maintains its own pointer to the next host in the circular list, rather than there being a single cluster-wide state.
-
RANDOM
: Randomly chooses a host from among all hosts in the cluster that are up.
You set the native connection load balancing scheme using the SET_LOAD_BALANCE_POLICY function. See Enabling and Disabling Native Connection Load Balancing for instructions.
Driver notes
-
Native connection load balancing works with the ADO.NET driver's connection pooling. The connection the client makes to the initial host, and the final connection to the load-balanced host, use pooled connections if they are available.
-
If a client application uses the JDBC and ODBC driver with third-party connection pooling solutions, the initial connection is not pooled because it is not a full client connection. The final connection is pooled because it is a standard client connection.
Connection failover
The client libraries include a failover feature that allow them to connect to backup hosts if the host specified in the connection properties is unreachable. When using native connection load balancing, this failover feature is only used for the initial connection to the database. If the host to which the client was redirected does not respond to the client's connection request, the client does not attempt to connect to a backup host and instead returns a connection error to the user.
Clients are redirected only to hosts that are known to be up. Thus, this sort of connection failure should only occur if the targeted host goes down at the same moment the client is redirected to it. For more information, see ADO.NET connection failover, JDBC connection failover, and Connection failover.
2.1 - Enabling and disabling classic connection load balancing
Only a database can enable or disable classic cluster-wide connection load balancing.
Only a database superuser can enable or disable classic cluster-wide connection load balancing. To enable or disable load balancing, use the SET_LOAD_BALANCE_POLICY function to set the load balance policy. Setting the load balance policy to anything other than 'NONE' enables load balancing on the server. The following example enables native connection load balancing by setting the load balancing policy to ROUNDROBIN.
=> SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');
SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: roundrobin
(1 row)
To disable native connection load balancing, use SET_LOAD_BALANCE_POLICY to set the policy to 'NONE':
=> SELECT SET_LOAD_BALANCE_POLICY('NONE');
SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: none
(1 row)
Note
When a client makes a connection, the native load-balancer chooses a node and returns the value from the
export_address
column in the
NODES table. The client then uses the
export_address
to connect. The
node_address
specifies the address to use for inter-node and spread communications. When a database is installed, the
export_address
and
node_address
are set to the same value. If you installed Vertica on a private address, then you must set the export_address to a
public address for each node.
By default, client connections are not load balanced, even when connection load balancing is enabled on the server. Clients must set a connection parameter to indicates they are willing to have their connection request load balanced. See Load balancing in ADO.NET, Load balancing in JDBC, and Load balancing, for information on enabling load balancing on the client. For vsql, use the -C
command-line option to enable load balancing.
Important
In mixed IPv4 and IPv6 environments, balancing only works for the address family for which you have configured load balancing. For example, if you have configured load balancing using an IPv4 address, then IPv6 clients cannot use load balancing, however the IPv6 clients can still connect, but load balancing does not occur.
Resetting the load balancing state
When the load balancing policy is ROUNDROBIN, each host in the Vertica cluster maintains its own state of which host it will select to handle the next client connection. You can reset this state to its initial value (usually, the host with the lowest-node id) using the RESET_LOAD_BALANCE_POLICY function:
=> SELECT RESET_LOAD_BALANCE_POLICY();
RESET_LOAD_BALANCE_POLICY
-------------------------------------------------------------------------
Successfully reset stateful client load balance policies: "roundrobin".
(1 row)
See also
2.2 - Monitoring legacy connection load balancing
Query the LOAD_BALANCE_POLICY column of the V_CATALOG.DATABASES to determine the state of native connection load balancing on your server:.
Query the LOAD_BALANCE_POLICY column of the V_CATALOG.DATABASES to determine the state of native connection load balancing on your server:
=> SELECT LOAD_BALANCE_POLICY FROM V_CATALOG.DATABASES;
LOAD_BALANCE_POLICY
---------------------
roundrobin
(1 row)
Determining to which node a client has connected
A client can determine the node to which is has connected by querying the NODE_NAME column of the V_MONITOR.CURRENT_SESSION table:
=> SELECT NODE_NAME FROM V_MONITOR.CURRENT_SESSION;
NODE_NAME
------------------
v_vmart_node0002
(1 row)
3 - Connection load balancing policies
Connection load balancing policies help spread the load of servicing client connections by redirecting connections based on the connection's origin.
Connection load balancing policies help spread the load of servicing client connections by redirecting connections based on the connection's origin. These policies can also help prevent nodes reaching their client connection limits and rejecting new connections by spreading connections among nodes. See Limiting the number and length of client connections for more information about client connection limits.
A load balancing policy consists of:
-
Network addresses that identify particular IP address and port number combinations on a node.
-
One or more connection load balancing groups that consists of network addresses that you want to handle client connections. You define load balancing groups using fault groups, subclusters, or a list of network addresses.
-
One or more routing rules that map a range of client IP addresses to a connection load balancing group.
When a client connects to a node in the database with load balancing enabled, the node evaluates all of the routing rules based on the client's IP address to determine if any match. If more than one rule matches the IP address, the node applies the most specific rule (the one that affects the fewest IP addresses).
If the node finds a matching rule, it uses the rule to determine the pool of potential nodes to handle the client connection. When evaluating potential target nodes, it always ensures that the nodes are currently up. The initially-contacted node then chooses one of the nodes in the group based on the group's distribution scheme. This scheme can be either choosing a node at random, or choosing a node in a rotating "round-robin" order. For example, in a three-node cluster, the round robin order would be node 1, then node 2, then node 3, and then back to node 1 again.
After it processes the rules, if the node determines that another node should handle the client's connection, it tells the client which node it has chosen. The client disconnects from the initial node and connects to the chosen node to continue with the connection process (either negotiating encryption if the connection has TLS/SSL enabled, or authentication).
If the initial node chooses itself based on the routing rules, it tells the client to proceed to the next step of the connection process.
If no routing rule matches the incoming IP address, the node checks to see if classic connection load balancing is enabled by both Vertica and the client. If so, it handles the connection according to the classic load balancing policy. See Classic connection load balancing for more information.
Finally, if the database is running in Eon Mode, the node tries to apply a default interior load balancing rule. See Default Subcluster Interior Load Balancing Policy below.
If no routing rule matches the incoming IP address and classic load balancing and the default subcluster interior load balancing rule did not apply, the node handles the connection itself. It also handles the connection itself if it cannot follow the load balancing rule. For example, if all nodes in the load balancing group targeted by the rule are down, then the initially-contacted node handles the client connection itself. In this case, the node does not attempt to apply any other less-restrictive load balancing rules that would apply to the incoming connection. It only attempts to apply a single load balancing rule.
Use cases
Using load balancing policies you can:
-
Ensure connections originating from inside or outside of your internal network are directed to a valid IP address for the client. For example, suppose your Vertica nodes have two IP addresses: one for the external network and another for the internal network. These networks are mutually exclusive. You cannot reach the private network from the public, and you cannot reach the public network from the private. Your load balancing rules need to provide the client with an IP address they can actually reach.
-
Enable access to multiple nodes of a Vertica cluster that are behind a NAT router. A NAT router is accessible from the outside network via a single IP address. Systems within the NAT router's private network can be accessed on this single IP address using different port numbers. You can create a load balancing policy that redirects a client connection to the NAT's IP address but with a different port number.
-
Designate sets of nodes to service client connections from an IP address range. For example, if your ETL systems have a set range of IP addresses, you could limit their client connections to an arbitrary set of Vertica nodes, a subcluster, or a fault group. This technique lets you isolate the overhead of servicing client connections to a few nodes. It is useful when you are using subclusters in an Eon Mode database to isolate workloads (see Subclusters for more information).
Using connection load balancing policies with IPv4 and IPv6
Connection load balancing policies work with both IPv4 and IPv6. As far as the load balancing policies are concerned, the two address families represent separate networks. If you want your load balancing policy to handle both IPv4 and IPv6 addresses, you must create separate sets of network addresses, load balancing groups, and rules for each protocol. When a client opens a connection to a node in the cluster, the addressing protocol it uses determines which set of rules Vertica consults when deciding whether and how to balance the connection.
Default subcluster interior load balancing policy
Databases running in Eon Mode have a default connection load balancing policy that helps spread the load of handling client connections among the nodes in a subcluster. When a client connects to a node while opting into connection load balancing, the node checks for load balancing policies that apply to the client's IP address. If it does not find any applicable load balancing rule, and classic load balancing is not enabled, the node falls back to the default interior load balancing rule. This rule distributes connections among the nodes in the same subcluster as the initially-contacted node.
As with other connection load balancing policies, the nodes in the subcluster must have a network address defined for them to be eligible to handle the client connection. If no nodes in the subcluster have a network address, the node does not apply the default subcluster interior load balancing rule, and the connection is not load balanced.
This default rule is convenient when you are primarily interested in load balancing connections within each subcluster. You just create network addresses for the nodes in your subcluster. You do not need to create load balancing groups or rules. Clients that opt-in to load balancing are then automatically balanced among the nodes in the subcluster.
Interior load balancing policy with multiple network addresses
If your nodes have multiple network addresses, the default subcluster interior load balancing rule chooses the address that was created first as the target of load balancing rule. For example, suppose you create a network address on a node for the private IP address 192.168.1.10. Then you create another network address for the node for the public IP address 233.252.0.1. The default subcluster interior connection load balancing rule always selects 192.168.1.10 as the target of the rule.
If you want the default interior load balancing rule to choose a different network address as its target, drop the other network addresses on the node and then recreate them. Deleting and recreating other addresses makes the address you want the rule to select the oldest address. For example, suppose you want the rule to use a public address (233.252.0.1) that was created after a private address (192.168.1.10). In this case, you can drop the address for 192.168.1.10 and then recreate it. The rule then defaults to the older public 233.252.0.1 address.
If you intend to create multiple network addresses for the nodes in your subcluster, create the network addresses you want to use with the default subcluster interior load balancing first. For example, suppose you want to use the default interior load balancing subcluster rule to load balance most client connections. However, you also want to create a connection load balancing policy to manage connections coming in from a group of ETL systems. In this case, create the network addresses you want to use for the default interior load balancing rule first, then create the network addresses for the ETL systems.
Load balancing policies vs. classic load balancing
There are several differences between the classic load balancing feature and the load balancing policy feature:
-
In classic connection load balancing, you just enable the load balancing option on both client and server, and load balancing is enabled. There are more steps to implement load balancing policies: you have to create addresses, groups, and rules and then enable load balancing on the client.
-
Classic connection load balancing only supports a single, cluster-wide policy for redirecting connections. With connection load balancing policies, you get to choose which nodes handle client connections based on the connection's origin. This gives you more flexibility to handle complex situations. Examples include routing connections through a NAT-based router or having nodes that are accessible via multiple IP addresses on different networks.
-
In classic connection load balancing, each node in the cluster can only be reached via a single IP address. This address is set in the EXPORT_ADDRESS column of the NODES system table. With connection load balancing policies, you can create a network address for each IP address associated with a node. Then you create rules that redirect to those addresses.
Steps to create a load balancing policy
There are three steps you must follow to create a load balancing policy:
-
Create one or more network addresses for each node that you want to participate in the connection load balancing policies.
-
Create one or more load balancing groups to be the target of the routing rules. Load balancing groups can target a collection of specific network addresses. Alternatively, you can create a group from a fault group or subcluster. You can limit the members of the load balance group to a subset of the fault group or subcluster using an IP address filter.
-
Create one or more routing rules.
While not absolutely necessary, it is always a good idea to idea to test your load balancing policy to ensure it works the way you expect it to.
After following these steps, Vertica will apply the load balancing policies to client connections that opt into connection load balancing. See Load balancing in ADO.NET, Load balancing in JDBC, and Load balancing, for information on enabling load balancing on the client. For vsql, use the -C
command-line option to enable load balancing.
These steps are explained in the other topics in this section.
See also
3.1 - Creating network addresses
Network addresses assign a name to an IP address and port number on a node.
Network addresses assign a name to an IP address and port number on a node. You use these addresses when you define load balancing groups. A node can have multiple network addresses associated with it. For example, suppose a node has one IP address that is only accessible from outside of the local network, and another that is accessible only from inside the network. In this case, you can define one network address using the external IP address, and another using the internal address. You can then create two different load balancing policies, one for external clients, and another for internal clients.
Note
You must create network addresses for your nodes, even if you intend to base your connection load balance groups on fault groups or subclusters. Load balancing rules can only select nodes that have a network address defined for them.
You create a network address using the CREATE NETWORK ADDRESS statement. This statement takes:
-
The name to assign to the network address
-
The name of the node
-
The IP address of the node to associate with the network address
-
The port number the node uses to accept client connections (optional)
Note
You can use hostnames instead of IP addresses when creating network addresses. However, doing so may lead to confusion if you are not sure which IP address a hostname resolves to. Using hostnames can also cause problems if your DNS server maps the hostname to multiple IP addresses.
The following example demonstrates creating three network addresses, one for each node in a three-node database.
=> SELECT node_name,node_address,node_address_family FROM v_catalog.nodes;
node_name | node_address | node_address_family
------------------+--------------+----------------------
v_vmart_node0001 | 10.20.110.21 | ipv4
v_vmart_node0002 | 10.20.110.22 | ipv4
v_vmart_node0003 | 10.20.110.23 | ipv4
(4 rows)
=> CREATE NETWORK ADDRESS node01 ON v_vmart_node0001 WITH '10.20.110.21';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_vmart_node0002 WITH '10.20.110.22';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 on v_vmart_node0003 WITH '10.20.110.23';
CREATE NETWORK ADDRESS
Creating network addresses for IPv6 addresses works the same way:
=> CREATE NETWORK ADDRESS node1_ipv6 ON v_vmart_node0001 WITH '2001:0DB8:7D5F:7433::';
CREATE NETWORK ADDRESS
Vertica does not perform any tests on the IP address you supply in the CREATE NETWORK ADDRESS statement. You must test the IP addresses you supply to this statement to confirm they correspond to the right node.
Vertica does not restrict the address you supply because it is often not aware of all the network addresses through which the node is accessible. For example, your node may be accessible from an external network via an IP address that Vertica is not configured to use. Or, your node can have both an IPv4 and an IPv6 address, only one of which Vertica is aware of.
For example, suppose v_vmart_node0003 from the previous example is not accessible via the IP address 192.168.1.5. You can still create a network address for it using that address:
=> CREATE NETWORK ADDRESS node04 ON v_vmart_node0003 WITH '192.168.1.5';
CREATE NETWORK ADDRESS
If you create a network group and routing rule that targets this address, client connections would either connect to the wrong node, or fail due to being connected to a host that's not part of a Vertica cluster.
Specifying a port number in a network address
By default, the CREATE NETWORK ADDRESS statement assumes the port number for the node's client connection is the default 5433. Sometimes, you may have a node listening for client connections on a different port. You can supply an alternate port number for the network address using the PORT keyword.
For example, suppose your nodes are behind a NAT router. In this case, you can have your nodes listen on different port numbers so the NAT router can route connections to them. When creating network addresses for these nodes, you supply the IP address of the NAT router and the port number the node is listening on. For example:
=> CREATE NETWORK ADDRESS node1_nat ON v_vmart_node0001 WITH '192.168.10.10' PORT 5433;
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node2_nat ON v_vmart_node0002 with '192.168.10.10' PORT 5434;
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node3_nat ON v_vmart_node0003 with '192.168.10.10' PORT 5435;
CREATE NETWORK ADDRESS
3.2 - Creating connection load balance groups
After you have created network addresses for nodes, you create collections of them so you can target them with routing rules.
After you have created network addresses for nodes, you create collections of them so you can target them with routing rules. These collections of network addresses are called load balancing groups. You have two ways to select the addresses to include in a load balancing group:
-
A list of network addresses
-
The name of one or more fault groups or subclusters, plus an IP address range in CIDR format. The address range selects which network addresses in the fault groups or subclusters Vertica adds to the load balancing group. Only the network addresses that are within the IP address range you supply are added to the load balance group. This filter lets you base your load balance group on a portion of the nodes that make up the fault group or subcluster.
Note
Load balance groups can only be based on fault groups or subclusters, or contain an arbitrary list of network addresses. You cannot mix these sources. For example, if you create a load balance group based on one or more fault groups, then you can only add additional fault groups to it. Vertica will return an error if you try to add a network address or subcluster to the load balance group.
You create a load balancing group using the CREATE LOAD BALANCE GROUP statement. When basing your group on a list of addresses, this statement takes the name for the group and the list of addresses. The following example demonstrates creating addresses for four nodes, and then creating two groups based on those nodes.
=> CREATE NETWORK ADDRESS addr01 ON v_vmart_node0001 WITH '10.20.110.21';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr02 ON v_vmart_node0002 WITH '10.20.110.22';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr03 on v_vmart_node0003 WITH '10.20.110.23';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr04 on v_vmart_node0004 WITH '10.20.110.24';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP group_1 WITH ADDRESS addr01, addr02;
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_2 WITH ADDRESS addr03, addr04;
CREATE LOAD BALANCE GROUP
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
------------+------------+-----------------+-----------------------+-------------
group_1 | ROUNDROBIN | | Network Address Group | addr01
group_1 | ROUNDROBIN | | Network Address Group | addr02
group_2 | ROUNDROBIN | | Network Address Group | addr03
group_2 | ROUNDROBIN | | Network Address Group | addr04
(4 rows)
A network address can be a part of as many load balancing groups as you like. However, each group can only have a single network address per node. You cannot add two network addresses belonging to the same node to the same load balancing group.
Creating load balancing groups from fault groups
To create a load balancing group from one or more fault groups, you supply:
-
The name for the load balancing group
-
The name of one or more fault groups
-
An IP address filter in CIDR format that filters the fault groups to be added to the load balancing group basd on their IP addresses. Vertica excludes any network addresses in the fault group that do not fall within this range. If you want all of the nodes in the fault groups to be added to the load balance group, specify the filter 0.0.0.0/0.
This example creates two load balancing groups from a fault group. The first includes all network addresses in the group by using the CIDR notation for all IP addresses. The second limits the fault group to three of the four nodes in the fault group by using the IP address filter.
=> CREATE FAULT GROUP fault_1;
CREATE FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0001;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0002;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0003;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0004;
ALTER FAULT GROUP
=> SELECT node_name,node_address,node_address_family,export_address
FROM v_catalog.nodes;
node_name | node_address | node_address_family | export_address
------------------+--------------+---------------------+----------------
v_vmart_node0001 | 10.20.110.21 | ipv4 | 10.20.110.21
v_vmart_node0002 | 10.20.110.22 | ipv4 | 10.20.110.22
v_vmart_node0003 | 10.20.110.23 | ipv4 | 10.20.110.23
v_vmart_node0004 | 10.20.110.24 | ipv4 | 10.20.110.24
(4 rows)
=> CREATE LOAD BALANCE GROUP group_all WITH FAULT GROUP fault_1 FILTER
'0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_some WITH FAULT GROUP fault_1 FILTER
'10.20.110.21/30';
CREATE LOAD BALANCE GROUP
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
----------------+------------+-----------------+-----------------------+-------------
group_all | ROUNDROBIN | 0.0.0.0/0 | Fault Group | fault_1
group_some | ROUNDROBIN | 10.20.110.21/30 | Fault Group | fault_1
(2 rows)
You can also supply multiple fault groups to the CREATE LOAD BALANCE GROUP statement:
=> CREATE LOAD BALANCE GROUP group_2_faults WITH FAULT GROUP
fault_2, fault_3 FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
Note
If you supply a filter range that does not match any network addresses of the nodes in the fault groups, Vertica creates an empty load balancing group. Any routing rules that direct connections to the empty load balance group will fail, because no nodes are set to handle connections for the group. In this case, the node that the client connected to initially handles the client connection itself.
Creating load balance groups from subclusters
Creating a load balance group from a subcluster is similar to creating a load balance group from a fault group. You just use WITH SUBCLUSTER instead of WITH FAULT GROUP in the CREATE LOAD BALANCE GROUP statement.
=> SELECT node_name,node_address,node_address_family,subcluster_name
FROM v_catalog.nodes;
node_name | node_address | node_address_family | subcluster_name
----------------------+--------------+---------------------+--------------------
v_verticadb_node0001 | 10.11.12.10 | ipv4 | load_subcluster
v_verticadb_node0002 | 10.11.12.20 | ipv4 | load_subcluster
v_verticadb_node0003 | 10.11.12.30 | ipv4 | load_subcluster
v_verticadb_node0004 | 10.11.12.40 | ipv4 | analytics_subcluster
v_verticadb_node0005 | 10.11.12.50 | ipv4 | analytics_subcluster
v_verticadb_node0006 | 10.11.12.60 | ipv4 | analytics_subcluster
(6 rows)
=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH '10.11.12.10';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH '10.11.12.20';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH '10.11.12.30';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '10.11.12.40';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '10.11.12.50';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '10.11.12.60';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP load_subcluster WITH SUBCLUSTER load_subcluster
FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP analytics_subcluster WITH SUBCLUSTER
analytics_subcluster FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
Setting the group's distribution policy
A load balancing group has a policy setting that determines how the initially-contacted node chooses a target from the group. CREATE LOAD BALANCE GROUP supports three policies:
-
ROUNDROBIN (default) rotates among the available members of the load balancing group. The initially-contacted node keeps track of which node it chose last time, and chooses the next one in the cluster.
Note
Each node in the cluster maintains its own round-robin pointer that indicates which node it should pick next for each load-balancing group. Therefore, if clients connect to different initial nodes, they may be redirected to the same node.
-
RANDOM chooses an available node from the group randomly.
-
NONE disables load balancing.
The following example demonstrates creating a load balancing group with a RANDOM distribution policy.
=> CREATE LOAD BALANCE GROUP group_random WITH ADDRESS node01, node02,
node03, node04 POLICY 'RANDOM';
CREATE LOAD BALANCE GROUP
The next step
After creating the load balancing group, you must add a load balancing routing rule that tells Vertica how incoming connections should be redirected to the groups. See Creating load balancing routing rules.
3.3 - Creating load balancing routing rules
Once you have created one or more connection load balancing groups, you are ready to create load balancing routing rules.
Once you have created one or more connection load balancing groups, you are ready to create load balancing routing rules. These rules tell Vertica how to redirect client connections based on their IP addresses.
You create routing rules using the CREATE ROUTING RULE statement. You pass this statement:
The following example creates two rules. The first redirects connections coming from the IP address range 192.168.1.0 through 192.168.1.255 to a load balancing group named group_1. The second routes connections from the IP range 10.20.1.0 through 10.20.1.255 to the load balancing group named group_2.
=> CREATE ROUTING RULE internal_clients ROUTE '192.168.1.0/24' TO group_1;
CREATE ROUTING RULE
=> CREATE ROUTING RULE external_clients ROUTE '10.20.1.0/24' TO group_2;
CREATE ROUTING RULE
Creating a catch-all routing rule
Vertica applies routing rules in most specific to least specific order. This behavior lets you create a "catch-all" rule that handles all incoming connections. Then you can create rules to handle smaller IP address ranges for specific purposes. For example, suppose you wanted to create a catch-all rule that worked with the rules created in the previous example. Then you can create a new rule that routes 0.0.0.0/0 (the CIDR notation for all IP addresses) to a group that should handle connections that aren't handled by either of the previously-created rules. For example:
=> CREATE LOAD BALANCE GROUP group_all WITH ADDRESS node01, node02, node03, node04;
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE catch_all ROUTE '0.0.0.0/0' TO group_all;
CREATE ROUTING RULE
After running the above statements, any connection that does not originate from the IP address ranges 192.168.1.* or 10.20.1.* are routed to the group_all group.
3.4 - Testing connection load balancing policies
After creating your routing rules, you should test them to verify that they perform the way you expect.
After creating your routing rules, you should test them to verify that they perform the way you expect. The best way to test your rules is to call the DESCRIBE_LOAD_BALANCE_DECISION function with an IP address. This function evaluates the routing rules and reports back how Vertica would route a client connection from the IP address. It uses the same logic that Vertica uses when handling client connections, so the results reflect the actual connection load balancing result you will see from client connections. It also reflects the current state of the your Vertica cluster, so it will not redirect connections to down nodes.
The following example demonstrates testing a set of rules. One rule handles all connections from the range 192.168.1.0 to 192.168.1.255, while the other handles all connections originating from the 192 subnet. The third call demonstrates what happens when no rules apply to the IP address you supply.
=> SELECT describe_load_balance_decision('192.168.1.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]
(1 row)
=> SELECT describe_load_balance_decision('192.168.2.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.2.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
matches this rule
Matched to load balance group [group_all] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
(2) LB Address: [10.20.100.249]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]
(1 row)
=> SELECT describe_load_balance_decision('1.2.3.4');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [1.2.3.4]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not match
any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Classic load balancing considered, but either
the policy was NONE or no target was available. Details: [NONE or invalid]
(1 row)
The DESCRIBE_LOAD_BALANCE_DECISION function also takes into account the classic cluster-wide load balancing settings:
=> SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');
SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: roundrobin
(1 row)
=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('1.2.3.4');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [1.2.3.4]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not
match any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Success. Load balance redirect to: [10.20.100.247]
port [5433]
(1 row)
Note
The DESCRIBE_LOAD_BALANCE_DECISION function assumes the client connection has opted to be load balanced. In reality, clients may not enable load balancing. This setting prevents the load-balancing features from redirecting the connection.
The function can also help you debug connection issues you notice after going live with your load balancing policy. For example, if you notice that one node is handling a large number of client connections, you can test the client IP addresses against your policies to see why the connections are not being balanced.
3.5 - Load balancing policy examples
The following examples demonstrate some common use cases for connection load balancing policies.
The following examples demonstrate some common use cases for connection load balancing policies.
Enabling client connections from multiple networks
Suppose you have a Vertica cluster that is accessible from two (or more) different networks. Some examples of this situation are:
-
You have an internal and an external network. In this configuration, your database nodes usually have two or more IP addresses, which each address only accessible from one of the networks. This configuration is common when running Vertica in a cloud environment. In many cases, you can create a catch-all rule that applies to all IP addresses, and then add additional routing rules for the internal subnets.
-
You want clients to be load balanced whether they use IPv4 or IPv6 protocols. From the database's perspective, IPv4 and IPv6 connections are separate networks because each node has a separate IPv4 and IPv6 IP address.
When creating a load balancing policy for a database that is accessible from multiple networks, client connections must be directed to IP addresses on the network they can access. The best solution is to create load balancing groups for each set of IP addresses assigned to a node. Then create routing rules that redirect client connections to the IP addresses that are accessible from their network.
The following example:
-
Creates two sets of network addresses: one for the internal network and another for the external network.
-
Creates two load balance groups: one for the internal network and one for the external.
-
Creates three routing rules: one for the internal network, and two for the external. The internal routing rule covers a subset of the network covered by one of the external rules.
-
Tests the routing rules using internal and external IP addresses.
=> CREATE NETWORK ADDRESS node01_int ON v_vmart_node0001 WITH '192.168.0.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node01_ext ON v_vmart_node0001 WITH '203.0.113.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02_int ON v_vmart_node0002 WITH '192.168.0.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02_ext ON v_vmart_node0002 WITH '203.0.113.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03_int ON v_vmart_node0003 WITH '192.168.0.3';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03_ext ON v_vmart_node0003 WITH '203.0.113.3';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP internal_group WITH ADDRESS node01_int, node02_int, node03_int;
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP external_group WITH ADDRESS node01_ext, node02_ext, node03_ext;
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE internal_rule ROUTE '192.168.0.0/24' TO internal_group;
CREATE ROUTING RULE
=> CREATE ROUTING RULE external_rule ROUTE '0.0.0.0/0' TO external_group;
CREATE ROUTING RULE
=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('198.51.100.10');
DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
Describing load balance decision for address [198.51.100.10]
Load balance cache internal version id (node-local): [3]
Considered rule [internal_rule] source ip filter [192.168.0.0/24]... input
address does not match source ip filter for this rule.
Considered rule [external_rule] source ip filter [0.0.0.0/0]... input
address matches this rule
Matched to load balance group [external_group] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [203.0.113.1]:5433
(1) LB Address: [203.0.113.2]:5433
(2) LB Address: [203.0.113.3]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [203.0.113.3] port [5433]
(1 row)
=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('198.51.100.10');
DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
Describing load balance decision for address [192.168.0.79]
Load balance cache internal version id (node-local): [3]
Considered rule [internal_rule] source ip filter [192.168.0.0/24]... input
address matches this rule
Matched to load balance group [internal_group] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [192.168.0.1]:5433
(1) LB Address: [192.168.0.3]:5433
(2) LB Address: [192.168.0.2]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [192.168.0.2] port
[5433]
(1 row)
Isolating workloads
You may want to control which nodes in your cluster are used by specific types of clients. For example, you may want to limit clients that perform data-loading tasks to one set of nodes, and reserve the rest of the nodes for running queries. This separation of workloads is especially common for Eon Mode databases. See Controlling Where a Query Runs for an example of using load balancing policies in an Eon Mode database to control which subcluster a client connects to.
You can create client load balancing policies that support workload isolation if clients performing certain types of tasks always originate from a limited IP address range. For example, if the clients that load data into your system always fall into a specific subnet, you can create a policy that limits which nodes those clients can access.
In the following example:
-
There are two fault groups (group_a and group_b) that separate workloads in an Eon Mode database. These groups are used as the basis of the load balancing groups.
-
The ETL client connections all originate from the 203.0.113.0/24 subnet.
-
User connections originate in the range of 192.0.0.0 to 199.255.255.255.
=> CREATE NETWORK ADDRESS node01 ON v_vmart_node0001 WITH '192.0.2.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_vmart_node0002 WITH '192.0.2.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_vmart_node0003 WITH '192.0.2.3';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node04 ON v_vmart_node0004 WITH '192.0.2.4';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_vmart_node0005 WITH '192.0.2.5';
CREATE NETWORK ADDRESS
^
=> CREATE LOAD BALANCE GROUP lb_users WITH FAULT GROUP group_a FILTER '192.0.2.0/24';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP lb_etl WITH FAULT GROUP group_b FILTER '192.0.2.0/24';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE users_rule ROUTE '192.0.0.0/5' TO lb_users;
CREATE ROUTING RULE
=> CREATE ROUTING RULE etl_rule ROUTE '203.0.113.0/24' TO lb_etl;
CREATE ROUTING RULE
=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('198.51.200.129');
DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
Describing load balance decision for address [198.51.200.129]
Load balance cache internal version id (node-local): [6]
Considered rule [etl_rule] source ip filter [203.0.113.0/24]... input address
does not match source ip filter for this rule.
Considered rule [users_rule] source ip filter [192.0.0.0/5]... input address
matches this rule
Matched to load balance group [lb_users] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [192.0.2.1]:5433
(1) LB Address: [192.0.2.2]:5433
(2) LB Address: [192.0.2.3]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [192.0.2.2] port
[5433]
(1 row)
=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('203.0.113.24');
DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
Describing load balance decision for address [203.0.113.24]
Load balance cache internal version id (node-local): [6]
Considered rule [etl_rule] source ip filter [203.0.113.0/24]... input address
matches this rule
Matched to load balance group [lb_etl] the group has policy [ROUNDROBIN] number
of addresses [2]
(0) LB Address: [192.0.2.4]:5433
(1) LB Address: [192.0.2.5]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [192.0.2.5] port
[5433]
(1 row)
=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('10.20.100.25');
DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
Describing load balance decision for address [10.20.100.25]
Load balance cache internal version id (node-local): [6]
Considered rule [etl_rule] source ip filter [203.0.113.0/24]... input address
does not match source ip filter for this rule.
Considered rule [users_rule] source ip filter [192.0.0.0/5]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not match
any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Classic load balancing considered, but either the
policy was NONE or no target was available. Details: [NONE or invalid]
(1 row)
Enabling the default subcluster interior load balancing policy
Vertica attempts to apply the default subcluster interior load balancing policy if no other load balancing policy applies to an incoming connection and classic load balancing is not enabled. See Default Subcluster Interior Load Balancing Policy for a description of this rule.
To enable default subcluster interior load balancing, you must create network addresses for the nodes in a subcluster. Once you create the addresses, Vertica attempts to apply this rule to load balance connections within a subcluster when no other rules apply.
The following example confirms the database has no load balancing groups or rules. Then it adds publicly-accessible network addresses to the nodes in the primary subcluster. When these addresses are added, Vertica applies the default subcluster interior load balancing policy.
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
------+--------+--------+------+-------------
(0 rows)
=> SELECT * FROM ROUTING_RULES;
name | source_address | destination_name
------+----------------+------------------
(0 rows)
=> CREATE NETWORK ADDRESS node01_ext ON v_verticadb_node0001 WITH '203.0.113.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02_ext ON v_verticadb_node0002 WITH '203.0.113.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03_ext ON v_verticadb_node0003 WITH '203.0.113.3';
CREATE NETWORK ADDRESS
=> SELECT describe_load_balance_decision('11.0.0.100');
describe_load_balance_decision
-----------------------------------------------------------------------------------------------
Describing load balance decision for address [11.0.0.100] on subcluster [default_subcluster]
Load balance cache internal version id (node-local): [2]
Considered rule [auto_rr_default_subcluster] subcluster interior filter [default_subcluster]...
current subcluster matches this rule
Matched to load balance group [auto_lbg_sc_default_subcluster] the group has policy
[ROUNDROBIN] number of addresses [3]
(0) LB Address: [203.0.113.1]:5433
(1) LB Address: [203.0.113.2]:5433
(2) LB Address: [203.0.113.3]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [203.0.113.2] port [5433]
(1 row)
Load balance both IPv4 and IPv6 connections
Connection load balancing policies regard IPv4 and IPv6 connections as separate networks. To load balance both types of incoming client connections, create two sets of network addresses, at least two load balancing groups, and two load balancing , once for each network address family.
This example creates two load balancing policies for the default subcluster: one for the IPv4 network addresses (192.168.111.31 to 192.168.111.33) and one for the IPv6 network addresses (fd9b:1fcc:1dc4:78d3::31 to fd9b:1fcc:1dc4:78d3::33).
=> SELECT node_name,node_address,subcluster_name FROM NODES;
node_name | node_address | subcluster_name
----------------------+----------------+--------------------
v_verticadb_node0001 | 192.168.111.31 | default_subcluster
v_verticadb_node0002 | 192.168.111.32 | default_subcluster
v_verticadb_node0003 | 192.168.111.33 | default_subcluster
=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH
'192.168.111.31';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node01_ipv6 ON v_verticadb_node0001 WITH
'fd9b:1fcc:1dc4:78d3::31';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH
'192.168.111.32';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02_ipv6 ON v_verticadb_node0002 WITH
'fd9b:1fcc:1dc4:78d3::32';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH
'192.168.111.33';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03_ipv6 ON v_verticadb_node0003 WITH
'fd9b:1fcc:1dc4:78d3::33';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP group_ipv4 WITH SUBCLUSTER default_subcluster
FILTER '192.168.111.0/24';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_ipv6 WITH SUBCLUSTER default_subcluster
FILTER 'fd9b:1fcc:1dc4:78d3::0/64';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE all_ipv4 route '0.0.0.0/0' TO group_ipv4;
CREATE ROUTING RULE
=> CREATE ROUTING RULE all_ipv6 route '0::0/0' TO group_ipv6;
CREATE ROUTING RULE
=> SELECT describe_load_balance_decision('203.0.113.50');
describe_load_balance_decision
-----------------------------------------------------------------------------------------------
Describing load balance decision for address [203.0.113.50] on subcluster [default_subcluster]
Load balance cache internal version id (node-local): [3]
Considered rule [all_ipv4] source ip filter [0.0.0.0/0]... input address matches this rule
Matched to load balance group [ group_ipv4] the group has policy [ROUNDROBIN] number of addresses [3]
(0) LB Address: [192.168.111.31]:5433
(1) LB Address: [192.168.111.32]:5433
(2) LB Address: [192.168.111.33]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [192.168.111.33] port [5433]
(1 row)
=> SELECT describe_load_balance_decision('2001:0DB8:EA04:8F2C::1');
describe_load_balance_decision
---------------------------------------------------------------------------------------------------------
Describing load balance decision for address [2001:0DB8:EA04:8F2C::1] on subcluster [default_subcluster]
Load balance cache internal version id (node-local): [3]
Considered rule [all_ipv4] source ip filter [0.0.0.0/0]... input address does not match source ip filter for this rule.
Considered rule [all_ipv6] source ip filter [0::0/0]... input address matches this rule
Matched to load balance group [ group_ipv6] the group has policy [ROUNDROBIN] number of addresses [3]
(0) LB Address: [fd9b:1fcc:1dc4:78d3::31]:5433
(1) LB Address: [fd9b:1fcc:1dc4:78d3::32]:5433
(2) LB Address: [fd9b:1fcc:1dc4:78d3::33]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [fd9b:1fcc:1dc4:78d3::33] port [5433]
(1 row)
Other examples
For other examples of using connection load balancing, see the following topics:
3.6 - Viewing load balancing policy configurations
Query the following system tables in the V_CATALOG Schema to see the load balance policies defined in your database:.
Query the following system tables in the V_CATALOG schema to see the load balance policies defined in your database:
-
NETWORK_ADDRESSES lists all of the network addresses defined in your database.
-
LOAD_BALANCE_GROUPS lists the contents of your load balance groups.
Note
This table does not directly lists all of your database's load balance groups. Instead, it lists the contents of the load balance groups. It is possible for your database to have load balancing groups that are not in this table because they do not contain any network addresses or fault groups.
-
ROUTING_RULES lists all of the routing rules defined in your database.
This example demonstrates querying each of the load balancing policy system tables.
=> \x
Expanded display is on.
=> SELECT * FROM V_CATALOG.NETWORK_ADDRESSES;
-[ RECORD 1 ]----+-----------------
name | node01
node | v_vmart_node0001
address | 10.20.100.247
port | 5433
address_family | ipv4
is_enabled | t
is_auto_detected | f
-[ RECORD 2 ]----+-----------------
name | node02
node | v_vmart_node0002
address | 10.20.100.248
port | 5433
address_family | ipv4
is_enabled | t
is_auto_detected | f
-[ RECORD 3 ]----+-----------------
name | node03
node | v_vmart_node0003
address | 10.20.100.249
port | 5433
address_family | ipv4
is_enabled | t
is_auto_detected | f
-[ RECORD 4 ]----+-----------------
name | alt_node1
node | v_vmart_node0001
address | 192.168.1.200
port | 8080
address_family | ipv4
is_enabled | t
is_auto_detected | f
-[ RECORD 5 ]----+-----------------
name | test_addr
node | v_vmart_node0001
address | 192.168.1.100
port | 4000
address_family | ipv4
is_enabled | t
is_auto_detected | f
=> SELECT * FROM LOAD_BALANCE_GROUPS;
-[ RECORD 1 ]----------------------
name | group_all
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node01
-[ RECORD 2 ]----------------------
name | group_all
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node02
-[ RECORD 3 ]----------------------
name | group_all
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node03
-[ RECORD 4 ]----------------------
name | group_1
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node01
-[ RECORD 5 ]----------------------
name | group_1
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node02
-[ RECORD 6 ]----------------------
name | group_2
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node01
-[ RECORD 7 ]----------------------
name | group_2
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node02
-[ RECORD 8 ]----------------------
name | group_2
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node03
-[ RECORD 9 ]----------------------
name | etl_group
policy | ROUNDROBIN
filter |
type | Network Address Group
object_name | node01
=> SELECT * FROM ROUTING_RULES;
-[ RECORD 1 ]----+-----------------
name | internal_clients
source_address | 192.168.1.0/24
destination_name | group_1
-[ RECORD 2 ]----+-----------------
name | etl_rule
source_address | 10.20.100.0/24
destination_name | etl_group
-[ RECORD 3 ]----+-----------------
name | subnet_192
source_address | 192.0.0.0/8
destination_name | group_all
3.7 - Maintaining load balancing policies
Once you have created load balancing policies, you maintain them using the following statements:.
Once you have created load balancing policies, you maintain them using the following statements:
-
ALTER NETWORK ADDRESS letsyou: rename, change the IP address, and enable or disable a network address.
-
ALTER LOAD BALANCE GROUP letsyou rename, add or remove network addresses or fault groups, change the fault group IP address filter, or change the policy of a load balance group.
-
ALTER ROUTING RULE letsyou rename, change the source IP address, and the target load balance group of a rule.
See the refence pages for these statements for examples.
Deleting load balancing policy objects
You can also delete existing load balance policy objects using the following statements:
4 - Workload routing
Workload routing routes client connections to subclusters based on their workloads.
Workload routing routes client connections to subclusters. This lets you reserve subclusters for certain types of tasks.
When a client connects to Vertica, they connect to a Connection node, which then routes the client to the correct subcluster based on the client's specified workload, user, or role and the database's routing rules. If multiple subclusters are associated with the same workload, the client is randomly routed to one of those subclusters.
In this context, "routing" refers to the connection node acting as a proxy for the client and the Execution node in the target subcluster. All queries and query results are first sent to the connection node and then passed on to the execution node and client, respectively.
The primary advantages of this type of load balancing are as follows:
- Database administrators can associate certain subclusters with certain workloads and roles (as opposed to client IP addresses).
- Clients do not need to know anything about the subcluster they will be routed to, only the type of workload they have or the role they should use.
Workload routing depends on actions from both the database administrator and the client:
- The database administrator must create rules for handling various workloads.
- The client must either specify the type of workload they have or have an enabled role (either from a manual SET ROLE or default role) associated with a routing rule.
View the current workload
To view the workload associated with the current session, use SHOW WORKLOAD:
=> SHOW WORKLOAD;
name | setting
----------+------------
workload | my_workload
(1 row)
Create workload routing rules
Routing rules apply to a client's specified workload and either user or role. If you specify more than one subcluster in a routing rule, the client is randomly routed to one of those subclusters.
If multiple routing rules could apply to a client's session, the rule with the highest priority is used. For details, see Priorities.
To view existing routing rules, see WORKLOAD_ROUTING_RULES.
To view workloads available to you and your enabled roles, use SHOW AVAILABLE WORKLOADS:
=> SHOW AVAILABLE WORKLOADS;
name | setting
---------------------+------------------------
available workloads | reporting, analytics
(1 row)
Workload-based routing
Workload-based routing rules apply to clients that specify a particular workload, routing them to one of the subclusters listed in the rule. In this example, when a client connects to the database and specifies the analytics
workload, their connection is randomly routed to either sc_analytics
or sc_analytics_2
:
=> CREATE ROUTING RULE ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics, sc_analytics_2;
To alter a routing rule, use ALTER ROUTING RULE. For example, to route analytics
workloads to sc_analytics
:
=> ALTER ROUTING RULE FOR WORKLOAD analytics SET SUBCLUSTER TO sc_analytics;
To add or remove a subcluster, use ALTER ROUTING RULE. For example:
- To add a subcluster
sc_01
:
=> ALTER ROUTING RULE FOR WORKLOAD analytics ADD SUBCLUSTER sc_01;
- To remove a subcluster
sc_01
:
=> ALTER ROUTING RULE FOR WORKLOAD analytics REMOVE SUBCLUSTER sc_01;
To drop a routing rule, use DROP ROUTING RULE and specify the workload. For example, to drop the routing rule for the analytics
workload:
=> DROP ROUTING RULE FOR WORKLOAD analytics;
User- and role-based routing
You can grant USAGE privileges to a user or role to let them route their queries to one of the subclusters listed in the routing rule. In this example, when a client connects to the database and enables the analytics_role
role, their connection is randomly routed to either sc_analytics
or sc_analtyics_2
:
=> CREATE ROUTING RULE ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics, sc_analytics_2;
=> GRANT USAGE ON ROUTING RULE analytics TO analytics_role;
Users can then enable the role and set their workload to analytics
for the session:
=> SET ROLE analytics_role;
=> SET SESSION WORKLOAD analytics;
Users can also enable the role automatically by setting it as a default role and then specify the workload when they connect. For details on default roles, see Enabling roles automatically.
Similarly, in this example, when a client connects to the database as user analytics_user
, they are randomly routed to either sc_analytics
or sc_analtyics_2
:
=> CREATE ROUTING RULE ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics, sc_analytics_2;
=> GRANT USAGE ON ROUTING RULE analytics TO analytics_user;
Priorities
Only one workload routing rule can apply to a session at any given time. If multiple routing rules are granted to a user or role, the priority associated with the rule determines which one applies. If multiple rules have the highest priority, Vertica chooses one of those rules randomly.
You can force a routing rule to apply to your session, ignoring priority, by using SET SESSION WORKLOAD.
In this example, the user Gunther is granted usage on the routing rule for workload reporting
. The following statement does not specify a priority, so the priority defaults to 0:
=> GRANT USAGE ON WORKLOAD reporting TO Gunther;
Gunther also has the roles analytics_role
and qa_role
, each of which are granted their own routing rules:
=> GRANT USAGE ON WORKLOAD analytics TO analytics_role;
=> GRANT USAGE ON WORKLOAD qa TO qa_role;
When Gunther first connects to Vertica, the routing rule for reporting
applies. If Gunther then enables the analytics_role
and qa_role
, the routing rule for qa_role
applies instead because it has the greatest priority value.
Similarly, if the priority of the routing rule for reporting
is changed to 3 with ALTER ROUTING RULE, that rule applies instead:
=> ALTER ROUTING RULE FOR WORKLOAD reporting SET PRIORITY TO 3;
To ignore priorities and force the session to use the analytics
rule, Gunther can use SET SESSION WORKLOAD:
=> SET SESSION WORKLOAD analytics;
Specify a workload
The workload for a given connection can be reported by the client when they connect. The method for specifying a workload depends on the client.
The following examples set the workload to analytics
for several clients. After you connect, you can verify that the workload was set with SHOW WORKLOAD.
vsql uses --workload
:
$ vsql --dbname databaseName --host node01.example.com --username Bob --password my_pwd --workload analytics
JDBC uses workload
:
jdbc:vertica://node01.example.com:5443/databaseName?user=analytics_user&password=***&workload=analytics
ODBC uses Workload
:
Database=databaseName;Servername=node01.mydomain.com;Port=5433;UID=analytics_user;PWD=***;Workload=analytics
ADO.NET uses Workload
:
Database=databaseName;Host=node01.mydomain.com;Port=5433;User=analytics_user;Password=***;Workload=analytics
vertica-sql-go uses Workload
:
var query = url.URL{
Scheme: "vertica",
User: url.UserPassword(user, password),
Host: fmt.Sprintf("%s:%d", host, port),
Path: databaseName,
Workload: "analytics",
RawQuery: rawQuery.Encode(),
}
vertica-python uses workload
:
conn_info = {'host': '127.0.0.1',
'port': 5433,
'user': 'some_user',
'password': 'my_pwd',
'database': 'databaseName',
'workload': 'analytics',
# autogenerated session label by default,
'session_label': 'some_label',
# default throw error on invalid UTF-8 results
'unicode_error': 'strict',
# SSL is disabled by default
'ssl': False,
# autocommit is off by default
'autocommit': True,
# using server-side prepared statements is disabled by default
'use_prepared_statements': False,
# connection timeout is not enabled by default
# 5 seconds timeout for a socket operation (Establishing a TCP connection or read/write operation)
'connection_timeout': 5}
vertica-nodejs uses workload
:
const client = new Client({
user: "vertica_user",
host: "node01.example.com",
database: "verticadb",
password: "",
port: "5433",
workload: "analytics",
})
Clients can also change their workload type after they connect with SET SESSION WORKLOAD:
=> SET SESSION WORKLOAD my_workload;