Managing client connections

Vertica provides several settings to control client connections:

  • Limit the number of client connections a user can have open at the same time.

  • Limit the time a client connection can be idle before being automatically disconnected.

  • Use connection load balancing to spread the overhead of servicing client connections among nodes.

  • Detect unresponsive clients with TCP keepalive.

  • Drain a subcluster to reject any new client connections to that subcluster. For details, see Drain client connections.

When Vertica closes a client connection, the client's ongoing operations, if any, are canceled.

1 - Limiting the number and length of client connections

You can manage how many active sessions a user can open to the server, and the duration of those sessions. Doing so helps prevent overuse of available resources, and can improve overall throughput.

You can define connection limits at two levels:

  • Set the MAXCONNECTIONS property on individual users. This property specifies how many sessions a user can open concurrently on individual nodes, or across the database cluster. For example, the following ALTER USER statement allows user Joe up to 10 concurrent sessions:

    => ALTER USER Joe MAXCONNECTIONS 10 ON DATABASE;
    
  • Set the configuration parameter MaxClientSessions on the database or on individual nodes. This parameter specifies the maximum number of client sessions that can run on each node in the database cluster; the default is 50. An extra five sessions are always reserved for dbadmin users, which enables them to log in when the total number of client sessions equals MaxClientSessions. Example statements follow this list.
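
For example, the following statements adjust MaxClientSessions at the database level; the values are illustrative:

=> ALTER DATABASE DEFAULT SET MaxClientSessions = 100; -- raise the per-node limit
=> ALTER DATABASE DEFAULT SET MaxClientSessions = 0;   -- block new non-dbadmin sessions, for example during maintenance
=> ALTER DATABASE DEFAULT SET MaxClientSessions = 50;  -- restore the default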

Total client connections to a given node cannot exceed the limits set in MaxClientSessions.

Changes to a client's MAXCONNECTIONS property have no effect on current sessions; these changes apply only to new sessions. For example, if you change a user's connection mode from DATABASE to NODE, current node connections are unaffected. This change applies only to new sessions, which are reserved on the invoking node.
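
For example, the following statement switches a user to a per-node connection limit; the user name and limit are illustrative, and the change affects only new sessions:

=> ALTER USER Joe MAXCONNECTIONS 5 ON NODE;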

Managing TCP keepalive settings

Vertica uses kernel TCP keepalive parameters to detect unresponsive clients and determine when the connection should be closed. Vertica also supports a set of equivalent KeepAlive parameters that can override the TCP keepalive settings. By default, all Vertica KeepAlive parameters are set to 0, which means that the kernel TCP keepalive settings are used. To override the TCP keepalive settings, set the equivalent Vertica parameters at the database level with ALTER DATABASE, or for the current session with ALTER SESSION.

  • tcp_keepalive_time (Vertica: KeepAliveIdleTime): Length (in seconds) of the idle period before the first TCP keepalive probe is sent to ensure that the client is still connected.

  • tcp_keepalive_probes (Vertica: KeepAliveProbeCount): Number of consecutive keepalive probes that must go unacknowledged by the client before the client connection is considered lost and closed.

  • tcp_keepalive_intvl (Vertica: KeepAliveProbeInterval): Time interval (in seconds) between keepalive probes.

Examples

The following examples use Vertica KeepAlive parameters to override the TCP keepalive settings as follows:

  • After 600 seconds (ten minutes), the first keepalive probe is sent to the client.

  • Consecutive keepalive probes are sent every 30 seconds.

  • If the client fails to respond to 10 keepalive probes, the connection is considered lost and is closed.

To make this the default policy for client connections, use ALTER DATABASE:

=> ALTER DATABASE DEFAULT SET KeepAliveIdleTime = 600;
=> ALTER DATABASE DEFAULT SET KeepAliveProbeInterval = 30;
=> ALTER DATABASE DEFAULT SET KeepAliveProbeCount = 10;

To override database-level policies for the current session, use ALTER SESSION:

=> ALTER SESSION SET KeepAliveIdleTime = 400;
=> ALTER SESSION SET KeepAliveProbeInterval = 72;
=> ALTER SESSION SET KeepAliveProbeCount = 60;

Query system table CONFIGURATION_PARAMETERS to verify database and session settings of the three Vertica KeepAlive parameters:

=> SELECT parameter_name, database_value, current_value FROM configuration_parameters WHERE parameter_name ILIKE 'KeepAlive%';
     parameter_name     | database_value | current_value
------------------------+----------------+---------------
 KeepAliveProbeCount    | 10             | 60
 KeepAliveIdleTime      | 600            | 400
 KeepAliveProbeInterval | 30             | 72
(3 rows)

Limiting idle session length

If a client continues to respond to TCP keepalive probes, but is not running any queries, the client's session is considered idle. Idle sessions eventually time out. The maximum time that sessions are allowed to idle can be set at three levels, in descending order of precedence:

  • As dbadmin, set the IDLESESSIONTIMEOUT property for individual users. This property overrides all other session timeout settings.
  • Users can limit the idle time of the current session with SET SESSION IDLESESSIONTIMEOUT. Non-superusers can only set their session idle time to a value equal to or lower than their own IDLESESSIONTIMEOUT setting. If no session idle time is explicitly set for a user, the session idle time for that user is inherited from the node or database settings.
  • As dbadmin, set the configuration parameter DEFAULTIDLESESSIONTIMEOUT on the database or on individual nodes. This parameter sets the default timeout value for all non-superusers. A sketch combining all three levels follows this list.
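
The following statements sketch all three levels; the user name and timeout values are illustrative:

=> ALTER USER Joe IDLESESSIONTIMEOUT '10 minute';                  -- user property (highest precedence)
=> SET SESSION IDLESESSIONTIMEOUT '5 minute';                      -- current session only
=> ALTER DATABASE DEFAULT SET DEFAULTIDLESESSIONTIMEOUT = '1 day'; -- default for all non-superusers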

All settings apply to sessions that are continuously idle—that is, sessions where no queries are running. If a client is slow or unresponsive during query execution, that time does not apply to timeouts. For example, the time that is required for a streaming batch insert is not counted toward the timeout. The server identifies a session as idle starting from the moment it starts to wait for any type of message from that session.

Viewing session settings

The following sections demonstrate how you can query the database for details about the session and connection limits.

Session length limits

Use SHOW DATABASE to view the session length limit for the database:

=> SHOW DATABASE DEFAULT DEFAULTIDLESESSIONTIMEOUT;
           name            | setting
---------------------------+---------
 DefaultIdleSessionTimeout | 2 day
(1 row)

Use SHOW to view the length limit for the current session:

=> SHOW IDLESESSIONTIMEOUT;
        name        | setting
--------------------+---------
 idlesessiontimeout | 1
(1 row)

Connection limits

Use SHOW DATABASE to view the connection limits for the database:

=> SHOW DATABASE DEFAULT MaxClientSessions;
       name        | setting
-------------------+---------
 MaxClientSessions | 50
(1 row)

Query USERS to view the connection limits for users:

=> SELECT user_name, max_connections, connection_limit_mode FROM users
     WHERE user_name != 'dbadmin';
 user_name | max_connections | connection_limit_mode
-----------+-----------------+-----------------------
 SuzyX     | 3               | database
 Joe       | 10              | database
(2 rows)

Closing user sessions

To manually close a user session, use CLOSE_USER_SESSIONS:

=> SELECT CLOSE_USER_SESSIONS ('Joe');
                             close_user_sessions
------------------------------------------------------------------------------
 Close all sessions for user Joe sent. Check v_monitor.sessions for progress.
(1 row)

Example

A user executes a query, and for some reason the query takes an unusually long time to finish (for example, because of server traffic or query complexity). In this case, the user might think the query failed, and opens another session to run the same query. Now, two sessions run the same query, using an extra connection.

To prevent this situation, you can limit how many sessions individual users can run by modifying their MAXCONNECTIONS user property. Doing so helps minimize the chances of running redundant queries. It also helps prevent users from consuming all available connections, as set by the database configuration parameter MaxClientSessions. For example, the following setting on user SuzyQ limits her to two database sessions at any time:

=> CREATE USER SuzyQ MAXCONNECTIONS 2 ON DATABASE;

Another issue that limiting client connections prevents is a single user opening many connections to the server. Too many user connections can exhaust the number of allowable connections set by the database configuration parameter MaxClientSessions.

Cluster changes and connections

Behavior changes can occur with client connection limits when the following changes occur to a cluster:

  • You add or remove a node.

  • A node goes down or comes back up.

Changes in node availability between connection requests have little impact on connection limits.

No special action is needed to handle these changes. However, if a node goes down, its active sessions exit, and other nodes in the cluster also drop their sessions, which frees up connections. Queries running in those sessions may hang; in that case, the blocked sessions are expected behavior.

2 - Drain client connections

Eon Mode only

Draining client connections in a subcluster prepares the subcluster for shutdown by marking all nodes in the subcluster as draining. Work from existing user sessions continues on draining nodes, but the nodes refuse new client connections and are excluded from load-balancing operations. If clients attempt to connect to a draining node, they receive an error that informs them of the draining status. Because load balancing operations exclude draining nodes, clients that opt in to connection load balancing should receive a connection error only if all nodes in the load balancing policy are draining. You do not need to change any connection load balancing configurations to use this feature. dbadmin can still connect to draining nodes.

To drain client connections before shutting down a subcluster, you can use the SHUTDOWN_WITH_DRAIN function. This function performs a Graceful Shutdown that marks a subcluster as draining until either the existing connections complete their work and close or a user-specified timeout is reached. When one of these conditions is met, the function proceeds to shut down the subcluster. Vertica also provides meta-functions that let you perform each step of the SHUTDOWN_WITH_DRAIN process independently: use the START_DRAIN_SUBCLUSTER function to mark a subcluster as draining, and then the SHUTDOWN_SUBCLUSTER function to shut down the subcluster once its connections have closed.
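
For example, the following sketch assumes a subcluster named analytics and a 300-second drain timeout; it shows the single-step approach followed by the equivalent two-step approach:

=> SELECT SHUTDOWN_WITH_DRAIN('analytics', 300); -- drain, then shut down when sessions close or the timeout expires

-- Two-step alternative: mark the subcluster as draining, then shut it down
-- after its client connections have closed:
=> SELECT START_DRAIN_SUBCLUSTER('analytics');
=> SELECT SHUTDOWN_SUBCLUSTER('analytics');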

You can use the CANCEL_DRAIN_SUBCLUSTER function to mark all nodes in a subcluster as not draining. As soon as a node is both UP and not draining, the node accepts new client connections. If all nodes in a draining subcluster are down, the draining status of its nodes is automatically reset to not draining.

You can query the DRAINING_STATUS system table to monitor the draining status of each node as well as client connection information, such as the number of active user sessions on each node.

The following example drains a subcluster named analytics, then cancels the draining of the subcluster.

To mark the analytics subcluster as draining, call SHUTDOWN_WITH_DRAIN with a negative timeout value:

=> SELECT SHUTDOWN_WITH_DRAIN('analytics', -1);
NOTICE 0:  Draining has started on subcluster (analytics)

You can confirm that the subcluster is draining by querying the DRAINING_STATUS system table:

=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name          | subcluster_name    | is_draining
-------------------+--------------------+--------------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics          | t
verticadb_node0005 | analytics          | t
verticadb_node0006 | analytics          | t

If a client attempts to connect directly to a node in the draining subcluster, they receive the following error message:

$ /opt/vertica/bin/vsql -h nodeIP --password password verticadb analyst
vsql: FATAL 10611:  New session rejected because subcluster to which this node belongs is draining connections

To cancel the graceful shutdown of the analytics subcluster, you can type Ctrl+C:

=> SELECT SHUTDOWN_WITH_DRAIN('analytics', -1);
NOTICE 0:  Draining has started on subcluster (analytics)
^CCancel request sent
ERROR 0:  Cancel received after draining started and before shutdown issued. Nodes will not be shut down. The subclusters are still in the draining state.
HINT:  Run cancel_drain_subcluster('') to restore all nodes to the 'not_draining' state

As mentioned in the above hint, you can run CANCEL_DRAIN_SUBCLUSTER to reset the status of the draining nodes in the subcluster to not draining:

=> SELECT CANCEL_DRAIN_SUBCLUSTER('analytics');
              CANCEL_DRAIN_SUBCLUSTER
--------------------------------------------------------
Targeted subcluster: 'analytics'
Action: CANCEL DRAIN

(1 row)

To confirm that the subcluster is no longer draining, you can again query the DRAINING_STATUS system table:

=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name          | subcluster_name    | is_draining
-------------------+--------------------+--------------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics          | f
verticadb_node0005 | analytics          | f
verticadb_node0006 | analytics          | f
(6 rows)

3 - Connection load balancing

Each client connection to a host in the Vertica cluster requires a small overhead in memory and processor time. If many clients connect to a single host, this overhead can begin to affect the performance of the database. You can spread the overhead of client connections by dictating that certain clients connect to specific hosts in the cluster. However, this manual balancing becomes difficult as new clients and hosts are added to your environment.

Connection load balancing helps automatically spread the overhead of client connections across the cluster by having hosts redirect client connections to other hosts. By redirecting connections, the overhead from client connections is spread across the cluster without having to manually assign particular hosts to individual clients. Clients can connect to a small handful of hosts, and they are naturally redirected to other hosts in the cluster. Load balancing does not redirect connections to draining hosts. For more information, see Drain client connections.

Native connection load balancing overview

Native connection load balancing is a feature built into the Vertica Analytic Database server and client libraries as well as vsql. Both the server and the client need to enable load balancing for it to function. If connection load balancing is enabled, a host in the database cluster can redirect a client's attempt to connect to it to another currently-active host in the cluster. This redirection is based on a load balancing policy. This redirection only takes place once, so a client is not bounced from one host to another.

Because native connection load balancing is incorporated into the Vertica client libraries, any client application that connects to Vertica transparently takes advantage of it simply by setting a connection parameter.

How you choose to implement connection load balancing depends on your network environment. Since native connection load balancing is easier to implement, you should use it unless your network configuration requires that clients be separated from the hosts in the Vertica database by a firewall.

For more about native connection load balancing, see About Native Connection Load Balancing.

3.1 - About native connection load balancing

Native connection load balancing is a feature built into the Vertica server and client libraries that helps spread the CPU and memory overhead caused by client connections across the hosts in the database. It can prevent unequal distribution of client connections among hosts in the cluster.

There are two types of native connection load balancing:

  • Cluster-wide balancing—This is the legacy method of connection load balancing, and it was the only type of load balancing prior to Vertica version 9.2. Using this method, you apply a single load balancing policy across the entire cluster. All connections to the cluster are handled the same way.

  • Load balancing policies—This method lets you set different load balancing policies depending on the source of the client connection. For example, you can have a policy that redirects connections from outside of your local network to one set of nodes in your cluster, and connections from within your local network to another set of nodes.

3.2 - Classic connection load balancing

The classic connection load balancing feature applies a single policy for all client connections to your database. Both your database and the client must enable the load balancing option in order for connections to be load balanced. When both client and server enable load balancing, the following process takes place when the client attempts to open a connection to Vertica:

  1. The client connects to a host in the database cluster, with a connection parameter indicating that it is requesting a load-balanced connection.

  2. The host chooses a host from the list of currently up hosts in the cluster, according to the current load balancing scheme. Under all schemes, it is possible for a host to select itself.

  3. The host tells the client which host it selected to handle the client's connection.

  4. If the host chose another host in the database to handle the client connection, the client disconnects from the initial host. Otherwise, the client jumps to step 6.

  5. The client establishes a connection to the host that will handle its connection. The client sets this second connection request so that the second host does not interpret the connection as a request for load balancing.

  6. The client connection proceeds as usual (negotiating encryption if the connection has SSL enabled, then authenticating the user).

This process is transparent to the client application. The client driver automatically disconnects from the initial host and reconnects to the host selected for load balancing.

Requirements

  • In mixed IPv4 and IPv6 environments, load balancing works only for the address family for which you have configured native load balancing. For example, if you configured load balancing using an IPv4 address, IPv6 clients cannot use load balancing. IPv6 clients can still connect, but their connections are not load balanced.

  • The native load balancer returns an IP address for the client to use. This address must be one that the client can reach. If your nodes are on a private network, native load-balancing requires you to publish a public address in one of two ways:

    • Set the public address on each node. Vertica saves that address in the export_address field in the NODES system table.

    • Set the subnet on the database. Vertica saves that address in the export_subnet field in the DATABASES system table.

Load balancing schemes

The load balancing scheme controls how a host selects which host to handle a client connection. There are three available schemes:

  • NONE (default): Disables native connection load balancing.

  • ROUNDROBIN: Chooses the next host from a circular list of hosts in the cluster that are up—for example, in a three-node cluster, iterates over node1, node2, and node3, then wraps back to node1. Each host in the cluster maintains its own pointer to the next host in the circular list, rather than there being a single cluster-wide state.

  • RANDOM: Randomly chooses a host from among all hosts in the cluster that are up.

You set the native connection load balancing scheme using the SET_LOAD_BALANCE_POLICY function. See Enabling and Disabling Native Connection Load Balancing for instructions.

Driver notes

  • Native connection load balancing works with the ADO.NET driver's connection pooling. The connection the client makes to the initial host, and the final connection to the load-balanced host, use pooled connections if they are available.

  • If a client application uses the JDBC and ODBC driver with third-party connection pooling solutions, the initial connection is not pooled because it is not a full client connection. The final connection is pooled because it is a standard client connection.

Connection failover

The client libraries include a failover feature that allows them to connect to backup hosts if the host specified in the connection properties is unreachable. When using native connection load balancing, this failover feature is used only for the initial connection to the database. If the host to which the client was redirected does not respond to the client's connection request, the client does not attempt to connect to a backup host and instead returns a connection error to the user.

Clients are redirected only to hosts that are known to be up. Thus, this sort of connection failure should only occur if the targeted host goes down at the same moment the client is redirected to it. For more information, see ADO.NET connection failover, JDBC connection failover, and Connection failover.

3.2.1 - Enabling and disabling classic connection load balancing

Only a database superuser can enable or disable classic cluster-wide connection load balancing. To enable or disable load balancing, use the SET_LOAD_BALANCE_POLICY function to set the load balance policy. Setting the load balance policy to anything other than 'NONE' enables load balancing on the server. The following example enables native connection load balancing by setting the load balancing policy to ROUNDROBIN.

=> SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');
                  SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: roundrobin
(1 row)

To disable native connection load balancing, use SET_LOAD_BALANCE_POLICY to set the policy to 'NONE':

=> SELECT SET_LOAD_BALANCE_POLICY('NONE');
SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: none
(1 row)

By default, client connections are not load balanced, even when connection load balancing is enabled on the server. Clients must set a connection parameter to indicate that they are willing to have their connection request load balanced. See Load balancing in ADO.NET, Load balancing in JDBC, and Load balancing for information on enabling load balancing on the client. For vsql, use the -C command-line option to enable load balancing.
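
For example, a vsql invocation that opts in to load balancing might look like the following; the host, password, database, and user name are illustrative:

$ /opt/vertica/bin/vsql -h 10.20.110.21 -C --password mypassword verticadb analyst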

Resetting the load balancing state

When the load balancing policy is ROUNDROBIN, each host in the Vertica cluster maintains its own state of which host it will select to handle the next client connection. You can reset this state to its initial value (usually, the host with the lowest node ID) using the RESET_LOAD_BALANCE_POLICY function:

=> SELECT RESET_LOAD_BALANCE_POLICY();
RESET_LOAD_BALANCE_POLICY
-------------------------------------------------------------------------
Successfully reset stateful client load balance policies: "roundrobin".
(1 row)

3.2.2 - Monitoring legacy connection load balancing

Query the LOAD_BALANCE_POLICY column of the V_CATALOG.DATABASES to determine the state of native connection load balancing on your server:

=> SELECT LOAD_BALANCE_POLICY FROM V_CATALOG.DATABASES;
LOAD_BALANCE_POLICY
---------------------
roundrobin
(1 row)

Determining to which node a client has connected

A client can determine the node to which it has connected by querying the NODE_NAME column of the V_MONITOR.CURRENT_SESSION table:

=> SELECT NODE_NAME FROM V_MONITOR.CURRENT_SESSION;
NODE_NAME
------------------
v_vmart_node0002
(1 row)

3.3 - Connection load balancing policies

Connection load balancing policies help spread the load of servicing client connections by redirecting connections based on the connection's origin. These policies can also help prevent nodes from reaching their client connection limits and rejecting new connections by spreading connections among nodes. See Limiting the number and length of client connections for more information about client connection limits.

A load balancing policy consists of:

  • Network addresses that identify particular IP address and port number combinations on a node.

  • One or more connection load balancing groups that consist of the network addresses you want to handle client connections. You define load balancing groups using fault groups, subclusters, or a list of network addresses.

  • One or more routing rules that map a range of client IP addresses to a connection load balancing group.

When a client connects to a node in the database with load balancing enabled, the node evaluates all of the routing rules based on the client's IP address to determine if any match. If more than one rule matches the IP address, the node applies the most specific rule (the one that affects the fewest IP addresses).

If the node finds a matching rule, it uses the rule to determine the pool of potential nodes to handle the client connection. When evaluating potential target nodes, it always ensures that the nodes are currently up. The initially-contacted node then chooses one of the nodes in the group based on the group's distribution scheme. This scheme can be either choosing a node at random, or choosing a node in a rotating "round-robin" order. For example, in a three-node cluster, the round robin order would be node 1, then node 2, then node 3, and then back to node 1 again.

After it processes the rules, if the node determines that another node should handle the client's connection, it tells the client which node it has chosen. The client disconnects from the initial node and connects to the chosen node to continue with the connection process (either negotiating encryption if the connection has TLS/SSL enabled, or authentication).

If the initial node chooses itself based on the routing rules, it tells the client to proceed to the next step of the connection process.

If no routing rule matches the incoming IP address, the node checks to see if classic connection load balancing is enabled by both Vertica and the client. If so, it handles the connection according to the classic load balancing policy. See Classic connection load balancing for more information.

Finally, if the database is running in Eon Mode, the node tries to apply a default interior load balancing rule. See Default Subcluster Interior Load Balancing Policy below.

If no routing rule matches the incoming IP address, and neither classic load balancing nor the default subcluster interior load balancing rule applies, the node handles the connection itself. It also handles the connection itself if it cannot follow the load balancing rule. For example, if all nodes in the load balancing group targeted by the rule are down, then the initially-contacted node handles the client connection itself. In this case, the node does not attempt to apply any other, less-restrictive load balancing rules that would apply to the incoming connection. It only attempts to apply a single load balancing rule.

Use cases

Using load balancing policies you can:

  • Ensure connections originating from inside or outside of your internal network are directed to a valid IP address for the client. For example, suppose your Vertica nodes have two IP addresses: one for the external network and another for the internal network. These networks are mutually exclusive. You cannot reach the private network from the public, and you cannot reach the public network from the private. Your load balancing rules need to provide the client with an IP address they can actually reach.


  • Enable access to multiple nodes of a Vertica cluster that are behind a NAT router. A NAT router is accessible from the outside network via a single IP address. Systems within the NAT router's private network can be accessed on this single IP address using different port numbers. You can create a load balancing policy that redirects a client connection to the NAT's IP address but with a different port number.


  • Designate sets of nodes to service client connections from an IP address range. For example, if your ETL systems have a set range of IP addresses, you could limit their client connections to an arbitrary set of Vertica nodes, a subcluster, or a fault group. This technique lets you isolate the overhead of servicing client connections to a few nodes. It is useful when you are using subclusters in an Eon Mode database to isolate workloads (see Subclusters for more information).

Using connection load balancing policies with IPv4 and IPv6

Connection load balancing policies work with both IPv4 and IPv6. As far as the load balancing policies are concerned, the two address families represent separate networks. If you want your load balancing policy to handle both IPv4 and IPv6 addresses, you must create separate sets of network addresses, load balancing groups, and rules for each protocol. When a client opens a connection to a node in the cluster, the addressing protocol it uses determines which set of rules Vertica consults when deciding whether and how to balance the connection.
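
For example, the following sketch adds an IPv6-only policy for a single node; the IPv6 address mirrors the example in Creating network addresses, while the group name, rule name, and CIDR range are illustrative:

=> CREATE NETWORK ADDRESS node01_ipv6 ON v_vmart_node0001 WITH '2001:0DB8:7D5F:7433::';
=> CREATE LOAD BALANCE GROUP group_ipv6 WITH ADDRESS node01_ipv6;
=> CREATE ROUTING RULE ipv6_clients ROUTE '2001:0DB8::/32' TO group_ipv6;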

Default subcluster interior load balancing policy

Databases running in Eon Mode have a default connection load balancing policy that helps spread the load of handling client connections among the nodes in a subcluster. When a client connects to a node while opting into connection load balancing, the node checks for load balancing policies that apply to the client's IP address. If it does not find any applicable load balancing rule, and classic load balancing is not enabled, the node falls back to the default interior load balancing rule. This rule distributes connections among the nodes in the same subcluster as the initially-contacted node.

As with other connection load balancing policies, the nodes in the subcluster must have a network address defined for them to be eligible to handle the client connection. If no nodes in the subcluster have a network address, the node does not apply the default subcluster interior load balancing rule, and the connection is not load balanced.

This default rule is convenient when you are primarily interested in load balancing connections within each subcluster. You just create network addresses for the nodes in your subcluster. You do not need to create load balancing groups or rules. Clients that opt-in to load balancing are then automatically balanced among the nodes in the subcluster.
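
For example, for a three-node subcluster, creating one network address per node is enough for the default interior rule to take effect; the node names and addresses mirror the subcluster example in Creating connection load balance groups:

=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH '10.11.12.10';
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH '10.11.12.20';
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH '10.11.12.30';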

Interior load balancing policy with multiple network addresses

If your nodes have multiple network addresses, the default subcluster interior load balancing rule chooses the address that was created first as the target of the rule. For example, suppose you create a network address on a node for the private IP address 192.168.1.10. Then you create another network address for the node for the public IP address 233.252.0.1. The default subcluster interior connection load balancing rule always selects 192.168.1.10 as the target of the rule.

If you want the default interior load balancing rule to choose a different network address as its target, drop the other network addresses on the node and then recreate them. Deleting and recreating other addresses makes the address you want the rule to select the oldest address. For example, suppose you want the rule to use a public address (233.252.0.1) that was created after a private address (192.168.1.10). In this case, you can drop the address for 192.168.1.10 and then recreate it. The rule then defaults to the older public 233.252.0.1 address.
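
A sketch of that drop-and-recreate sequence, assuming the node is v_vmart_node0001 and the existing address is named node01_private (both names are illustrative):

=> DROP NETWORK ADDRESS node01_private;
=> CREATE NETWORK ADDRESS node01_private ON v_vmart_node0001 WITH '192.168.1.10';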

If you intend to create multiple network addresses for the nodes in your subcluster, create the network addresses you want to use with the default subcluster interior load balancing first. For example, suppose you want to use the default interior load balancing subcluster rule to load balance most client connections. However, you also want to create a connection load balancing policy to manage connections coming in from a group of ETL systems. In this case, create the network addresses you want to use for the default interior load balancing rule first, then create the network addresses for the ETL systems.

Load balancing policies vs. classic load balancing

There are several differences between the classic load balancing feature and the load balancing policy feature:

  • In classic connection load balancing, you simply enable the load balancing option on both client and server, and load balancing takes effect. Implementing load balancing policies requires more steps: you must create addresses, groups, and rules, and then enable load balancing on the client.

  • Classic connection load balancing only supports a single, cluster-wide policy for redirecting connections. With connection load balancing policies, you get to choose which nodes handle client connections based on the connection's origin. This gives you more flexibility to handle complex situations. Examples include routing connections through a NAT-based router or having nodes that are accessible via multiple IP addresses on different networks.

  • In classic connection load balancing, each node in the cluster can only be reached via a single IP address. This address is set in the EXPORT_ADDRESS column of the NODES system table. With connection load balancing policies, you can create a network address for each IP address associated with a node. Then you create rules that redirect to those addresses.

Steps to create a load balancing policy

There are three steps you must follow to create a load balancing policy:

  1. Create one or more network addresses for each node that you want to participate in the connection load balancing policies.

  2. Create one or more load balancing groups to be the target of the routing rules. Load balancing groups can target a collection of specific network addresses. Alternatively, you can create a group from a fault group or subcluster. You can limit the members of the load balance group to a subset of the fault group or subcluster using an IP address filter.

  3. Create one or more routing rules.

While not absolutely necessary, it is always a good idea to test your load balancing policy to ensure that it works the way you expect.
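
For orientation, the following minimal sketch combines the three steps for two nodes; the names, addresses, and CIDR range are illustrative, and each step is covered in detail in the topics that follow:

=> CREATE NETWORK ADDRESS addr01 ON v_vmart_node0001 WITH '10.20.110.21';
=> CREATE NETWORK ADDRESS addr02 ON v_vmart_node0002 WITH '10.20.110.22';
=> CREATE LOAD BALANCE GROUP etl_group WITH ADDRESS addr01, addr02;
=> CREATE ROUTING RULE etl_rule ROUTE '10.20.100.0/24' TO etl_group;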

After following these steps, Vertica will apply the load balancing policies to client connections that opt into connection load balancing. See Load balancing in ADO.NET, Load balancing in JDBC, and Load balancing, for information on enabling load balancing on the client. For vsql, use the -C command-line option to enable load balancing.

These steps are explained in the other topics in this section.

3.3.1 - Creating network addresses

Network addresses assign a name to an IP address and port number on a node. You use these addresses when you define load balancing groups. A node can have multiple network addresses associated with it. For example, suppose a node has one IP address that is only accessible from outside of the local network, and another that is accessible only from inside the network. In this case, you can define one network address using the external IP address, and another using the internal address. You can then create two different load balancing policies, one for external clients, and another for internal clients.

You create a network address using the CREATE NETWORK ADDRESS statement. This statement takes:

  • The name to assign to the network address

  • The name of the node

  • The IP address of the node to associate with the network address

  • The port number the node uses to accept client connections (optional)

The following example demonstrates creating three network addresses, one for each node in a three-node database.

=> SELECT node_name,node_address,node_address_family FROM v_catalog.nodes;
    node_name     | node_address | node_address_family
------------------+--------------+----------------------
 v_vmart_node0001 | 10.20.110.21 | ipv4
 v_vmart_node0002 | 10.20.110.22 | ipv4
 v_vmart_node0003 | 10.20.110.23 | ipv4
(3 rows)


=> CREATE NETWORK ADDRESS node01 ON v_vmart_node0001 WITH '10.20.110.21';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_vmart_node0002 WITH '10.20.110.22';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 on v_vmart_node0003 WITH '10.20.110.23';
CREATE NETWORK ADDRESS

Creating network addresses for IPv6 addresses works the same way:

=> CREATE NETWORK ADDRESS node1_ipv6 ON v_vmart_node0001 WITH '2001:0DB8:7D5F:7433::';
CREATE NETWORK ADDRESS

Vertica does not perform any tests on the IP address you supply in the CREATE NETWORK ADDRESS statement. You must test the IP addresses you supply to this statement to confirm they correspond to the right node.

Vertica does not restrict the address you supply because it is often not aware of all the network addresses through which the node is accessible. For example, your node may be accessible from an external network via an IP address that Vertica is not configured to use. Or, your node can have both an IPv4 and an IPv6 address, only one of which Vertica is aware of.

For example, suppose v_vmart_node0003 from the previous example is not accessible via the IP address 192.168.1.5. You can still create a network address for it using that address:

=> CREATE NETWORK ADDRESS node04 ON v_vmart_node0003 WITH '192.168.1.5';
CREATE NETWORK ADDRESS

If you create a load balancing group and routing rule that target this address, client connections would either connect to the wrong node or fail because the address does not belong to a host in the Vertica cluster.

Specifying a port number in a network address

By default, the CREATE NETWORK ADDRESS statement assumes the port number for the node's client connection is the default 5433. Sometimes, you may have a node listening for client connections on a different port. You can supply an alternate port number for the network address using the PORT keyword.

For example, suppose your nodes are behind a NAT router. In this case, you can have your nodes listen on different port numbers so the NAT router can route connections to them. When creating network addresses for these nodes, you supply the IP address of the NAT router and the port number the node is listening on. For example:

=> CREATE NETWORK ADDRESS node1_nat ON v_vmart_node0001 WITH '192.168.10.10' PORT 5433;
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node2_nat ON v_vmart_node0002 with '192.168.10.10' PORT 5434;
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node3_nat ON v_vmart_node0003 with '192.168.10.10' PORT 5435;
CREATE NETWORK ADDRESS

3.3.2 - Creating connection load balance groups

After you have created network addresses for nodes, you create collections of them so you can target them with routing rules. These collections of network addresses are called load balancing groups. You have two ways to select the addresses to include in a load balancing group:

  • A list of network addresses

  • The name of one or more fault groups or subclusters, plus an IP address range in CIDR format. The address range selects which network addresses in the fault groups or subclusters Vertica adds to the load balancing group. Only the network addresses that are within the IP address range you supply are added to the load balance group. This filter lets you base your load balance group on a portion of the nodes that make up the fault group or subcluster.

You create a load balancing group using the CREATE LOAD BALANCE GROUP statement. When basing your group on a list of addresses, this statement takes the name for the group and the list of addresses. The following example demonstrates creating addresses for four nodes, and then creating two groups based on those nodes.

=> CREATE NETWORK ADDRESS addr01 ON v_vmart_node0001 WITH '10.20.110.21';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr02 ON v_vmart_node0002 WITH '10.20.110.22';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr03 on v_vmart_node0003 WITH '10.20.110.23';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr04 on v_vmart_node0004 WITH '10.20.110.24';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP group_1 WITH ADDRESS addr01, addr02;
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_2 WITH ADDRESS addr03, addr04;
CREATE LOAD BALANCE GROUP

=> SELECT * FROM LOAD_BALANCE_GROUPS;
    name    |   policy   |     filter      |         type          | object_name
------------+------------+-----------------+-----------------------+-------------
 group_1    | ROUNDROBIN |                 | Network Address Group | addr01
 group_1    | ROUNDROBIN |                 | Network Address Group | addr02
 group_2    | ROUNDROBIN |                 | Network Address Group | addr03
 group_2    | ROUNDROBIN |                 | Network Address Group | addr04
(4 rows)

A network address can be a part of as many load balancing groups as you like. However, each group can only have a single network address per node. You cannot add two network addresses belonging to the same node to the same load balancing group.
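
For example, reusing an address from the previous example in a third group is allowed, because each group still contains at most one address per node; the group name is illustrative:

=> CREATE LOAD BALANCE GROUP group_3 WITH ADDRESS addr01, addr03;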

Creating load balancing groups from fault groups

To create a load balancing group from one or more fault groups, you supply:

  • The name for the load balancing group

  • The name of one or more fault groups

  • An IP address filter in CIDR format that filters which network addresses from the fault groups are added to the load balancing group, based on their IP addresses. Vertica excludes any network addresses in the fault group that do not fall within this range. If you want all of the nodes in the fault groups to be added to the load balance group, specify the filter 0.0.0.0/0.

This example creates two load balancing groups from a fault group. The first includes all network addresses in the group by using the CIDR notation for all IP addresses. The second limits the fault group to three of the four nodes in the fault group by using the IP address filter.

=> CREATE FAULT GROUP fault_1;
CREATE FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE  v_vmart_node0001;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE  v_vmart_node0002;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE  v_vmart_node0003;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE  v_vmart_node0004;
ALTER FAULT GROUP
=> SELECT node_name,node_address,node_address_family,export_address
   FROM v_catalog.nodes;
    node_name     | node_address | node_address_family | export_address
------------------+--------------+---------------------+----------------
 v_vmart_node0001 | 10.20.110.21 | ipv4                | 10.20.110.21
 v_vmart_node0002 | 10.20.110.22 | ipv4                | 10.20.110.22
 v_vmart_node0003 | 10.20.110.23 | ipv4                | 10.20.110.23
 v_vmart_node0004 | 10.20.110.24 | ipv4                | 10.20.110.24
(4 rows)

=> CREATE LOAD BALANCE GROUP group_all WITH FAULT GROUP fault_1 FILTER
   '0.0.0.0/0';
CREATE LOAD BALANCE GROUP

=> CREATE LOAD BALANCE GROUP group_some WITH FAULT GROUP fault_1 FILTER
   '10.20.110.21/30';
CREATE LOAD BALANCE GROUP

=> SELECT * FROM LOAD_BALANCE_GROUPS;
      name      |   policy   |     filter      |         type          | object_name
----------------+------------+-----------------+-----------------------+-------------
 group_all      | ROUNDROBIN | 0.0.0.0/0       | Fault Group           | fault_1
 group_some     | ROUNDROBIN | 10.20.110.21/30 | Fault Group           | fault_1
(2 rows)

You can also supply multiple fault groups to the CREATE LOAD BALANCE GROUP statement:

=> CREATE LOAD BALANCE GROUP group_2_faults WITH FAULT GROUP
   fault_2, fault_3 FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP

Creating load balance groups from subclusters

Creating a load balance group from a subcluster is similar to creating a load balance group from a fault group. You just use WITH SUBCLUSTER instead of WITH FAULT GROUP in the CREATE LOAD BALANCE GROUP statement.

=> SELECT node_name,node_address,node_address_family,subcluster_name
   FROM v_catalog.nodes;
      node_name       | node_address | node_address_family |  subcluster_name
----------------------+--------------+---------------------+--------------------
 v_verticadb_node0001 | 10.11.12.10  | ipv4                | load_subcluster
 v_verticadb_node0002 | 10.11.12.20  | ipv4                | load_subcluster
 v_verticadb_node0003 | 10.11.12.30  | ipv4                | load_subcluster
 v_verticadb_node0004 | 10.11.12.40  | ipv4                | analytics_subcluster
 v_verticadb_node0005 | 10.11.12.50  | ipv4                | analytics_subcluster
 v_verticadb_node0006 | 10.11.12.60  | ipv4                | analytics_subcluster
(6 rows)

=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH '10.11.12.10';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH '10.11.12.20';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH '10.11.12.30';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '10.11.12.40';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '10.11.12.50';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '10.11.12.60';
CREATE NETWORK ADDRESS

=> CREATE LOAD BALANCE GROUP load_subcluster WITH SUBCLUSTER load_subcluster
   FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP analytics_subcluster WITH SUBCLUSTER
   analytics_subcluster FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP

Setting the group's distribution policy

A load balancing group has a policy setting that determines how the initially-contacted node chooses a target from the group. CREATE LOAD BALANCE GROUP supports three policies:

  • ROUNDROBIN (default) rotates among the available members of the load balancing group. The initially-contacted node keeps track of which node it chose last time, and chooses the next one in the group.

  • RANDOM chooses an available node from the group randomly.

  • NONE disables load balancing.

The following example demonstrates creating a load balancing group with a RANDOM distribution policy.

=> CREATE LOAD BALANCE GROUP group_random WITH ADDRESS node01, node02,
   node03, node04 POLICY 'RANDOM';
CREATE LOAD BALANCE GROUP

The next step

After creating the load balancing group, you must add a load balancing routing rule that tells Vertica how incoming connections should be redirected to the groups. See Creating load balancing routing rules.

3.3.3 - Creating load balancing routing rules

Once you have created one or more connection load balancing groups, you are ready to create load balancing routing rules. These rules tell Vertica how to redirect client connections based on their IP addresses.

You create routing rules using the CREATE ROUTING RULE statement. You pass this statement:

  • The name for the rule

  • The source IP address range (either IPv4 or IPv6) in CIDR format the rule applies to

  • The name of the load balancing group to handle the connection

The following example creates two rules. The first redirects connections coming from the IP address range 192.168.1.0 through 192.168.1.255 to a load balancing group named group_1. The second routes connections from the IP range 10.20.1.0 through 10.20.1.255 to the load balancing group named group_2.

=> CREATE ROUTING RULE internal_clients ROUTE '192.168.1.0/24' TO group_1;
CREATE ROUTING RULE

=> CREATE ROUTING RULE external_clients ROUTE '10.20.1.0/24' TO group_2;
CREATE ROUTING RULE

Creating a catch-all routing rule

Vertica applies routing rules in most specific to least specific order. This behavior lets you create a "catch-all" rule that handles all incoming connections. Then you can create rules that handle smaller IP address ranges for specific purposes. For example, suppose you want a catch-all rule that works with the rules created in the previous example. You can add a new rule that routes 0.0.0.0/0 (the CIDR notation for all IP addresses) to a group that handles any connection not matched by the previously-created rules. For example:

=> CREATE LOAD BALANCE GROUP group_all WITH ADDRESS node01, node02, node03, node04;
CREATE LOAD BALANCE GROUP

=> CREATE ROUTING RULE catch_all ROUTE '0.0.0.0/0' TO group_all;
CREATE ROUTING RULE

After running the above statements, any connection that does not originate from the IP address ranges 192.168.1.* or 10.20.1.* is routed to the group_all group.

3.3.4 - Testing connection load balancing policies

After creating your routing rules, you should test them to verify that they perform the way you expect. The best way to test your rules is to call the DESCRIBE_LOAD_BALANCE_DECISION function with an IP address. This function evaluates the routing rules and reports how Vertica would route a client connection from that IP address. It uses the same logic that Vertica uses when handling client connections, so the results reflect the actual connection load balancing behavior you will see. It also reflects the current state of your Vertica cluster, so it will not redirect connections to down nodes.

The following example demonstrates testing a set of rules. One rule handles all connections from the range 192.168.1.0 to 192.168.1.255, while the other handles all connections originating from the 192 subnet. The third call demonstrates what happens when no rules apply to the IP address you supply.

=> SELECT describe_load_balance_decision('192.168.1.25');
                        describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]

(1 row)

=> SELECT describe_load_balance_decision('192.168.2.25');
                        describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.2.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
matches this rule
Matched to load balance group [group_all] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
(2) LB Address: [10.20.100.249]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]

(1 row)

=> SELECT describe_load_balance_decision('1.2.3.4');
                         describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [1.2.3.4]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not match
any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Classic load balancing considered, but either
the policy was NONE or no target was available. Details: [NONE or invalid]

(1 row)

The DESCRIBE_LOAD_BALANCE_DECISION function also takes into account the classic cluster-wide load balancing settings:

=>  SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');
                            SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------------
 Successfully changed the client initiator load balancing policy to: roundrobin
(1 row)

=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('1.2.3.4');
                            describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [1.2.3.4]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not
match any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Success. Load balance redirect to: [10.20.100.247]
port [5433]

(1 row)

The function can also help you debug connection issues you notice after going live with your load balancing policy. For example, if you notice that one node is handling a large number of client connections, you can test the client IP addresses against your policies to see why the connections are not being balanced.

3.3.5 - Load balancing policy examples

The following examples demonstrate some common use cases for connection load balancing policies.

Enabling client connections from multiple networks

Suppose you have a Vertica cluster that is accessible from two (or more) different networks. Some examples of this situation are:

  • You have an internal and an external network. In this configuration, your database nodes usually have two or more IP addresses, with each address accessible from only one of the networks. This configuration is common when running Vertica in a cloud environment. In many cases, you can create a catch-all rule that applies to all IP addresses, and then add routing rules for the internal subnets.

  • You want clients to be load balanced whether they use IPv4 or IPv6 protocols. From the database's perspective, IPv4 and IPv6 connections are separate networks because each node has a separate IPv4 and IPv6 IP address.

When creating a load balancing policy for a database that is accessible from multiple networks, client connections must be directed to IP addresses on the network they can access. The best solution is to create load balancing groups for each set of IP addresses assigned to a node. Then create routing rules that redirect client connections to the IP addresses that are accessible from their network.

The following example:

  1. Creates two sets of network addresses: one for the internal network and another for the external network.

  2. Creates two load balance groups: one for the internal network and one for the external.

  3. Creates three routing rules: one for the internal network, and two for the external. The internal routing rule covers a subset of the network covered by one of the external rules.

  4. Tests the routing rules using internal and external IP addresses.

=> CREATE NETWORK ADDRESS node01_int ON v_vmart_node0001 WITH '192.168.0.1';
CREATE NETWORK ADDRESS

=> CREATE NETWORK ADDRESS node01_ext ON v_vmart_node0001 WITH '203.0.113.1';
CREATE NETWORK ADDRESS

=> CREATE NETWORK ADDRESS node02_int ON v_vmart_node0002 WITH '192.168.0.2';
CREATE NETWORK ADDRESS

=> CREATE NETWORK ADDRESS node02_ext ON v_vmart_node0002 WITH '203.0.113.2';
CREATE NETWORK ADDRESS

=> CREATE NETWORK ADDRESS node03_int ON v_vmart_node0003 WITH '192.168.0.3';
CREATE NETWORK ADDRESS

=> CREATE NETWORK ADDRESS node03_ext ON v_vmart_node0003 WITH '203.0.113.3';
CREATE NETWORK ADDRESS

=> CREATE LOAD BALANCE GROUP internal_group WITH ADDRESS node01_int, node02_int, node03_int;
CREATE LOAD BALANCE GROUP

=> CREATE LOAD BALANCE GROUP external_group WITH ADDRESS node01_ext, node02_ext, node03_ext;
CREATE LOAD BALANCE GROUP

=> CREATE ROUTING RULE internal_rule ROUTE '192.168.0.0/24' TO internal_group;
CREATE ROUTING RULE

=> CREATE ROUTING RULE external_rule ROUTE '0.0.0.0/0' TO external_group;
CREATE ROUTING RULE

=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('198.51.100.10');
                         DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
 Describing load balance decision for address [198.51.100.10]
Load balance cache internal version id (node-local): [3]
Considered rule [internal_rule] source ip filter [192.168.0.0/24]... input
address does not match source ip filter for this rule.
Considered rule [external_rule] source ip filter [0.0.0.0/0]... input
address matches this rule
Matched to load balance group [external_group] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [203.0.113.1]:5433
(1) LB Address: [203.0.113.2]:5433
(2) LB Address: [203.0.113.3]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [203.0.113.3] port [5433]

(1 row)

=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('192.168.0.79');
                         DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.0.79]
Load balance cache internal version id (node-local): [3]
Considered rule [internal_rule] source ip filter [192.168.0.0/24]... input
address matches this rule
Matched to load balance group [internal_group] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [192.168.0.1]:5433
(1) LB Address: [192.168.0.3]:5433
(2) LB Address: [192.168.0.2]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [192.168.0.2] port
[5433]

(1 row)

Isolating workloads

You may want to control which nodes in your cluster are used by specific types of clients. For example, you may want to limit clients that perform data-loading tasks to one set of nodes, and reserve the rest of the nodes for running queries. This separation of workloads is especially common for Eon Mode databases. See Controlling Where a Query Runs for an example of using load balancing policies in an Eon Mode database to control which subcluster a client connects to.

You can create client load balancing policies that support workload isolation if clients performing certain types of tasks always originate from a limited IP address range. For example, if the clients that load data into your system always fall into a specific subnet, you can create a policy that limits which nodes those clients can access.

In the following example:

  • There are two fault groups (group_a and group_b) that separate workloads in an Eon Mode database. These groups are used as the basis of the load balancing groups.

  • The ETL client connections all originate from the 203.0.113.0/24 subnet.

  • User connections originate in the range of 192.0.0.0 to 199.255.255.255.

=> CREATE NETWORK ADDRESS node01 ON v_vmart_node0001 WITH '192.0.2.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_vmart_node0002 WITH '192.0.2.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_vmart_node0003 WITH '192.0.2.3';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node04 ON v_vmart_node0004 WITH '192.0.2.4';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_vmart_node0005 WITH '192.0.2.5';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP lb_users WITH FAULT GROUP group_a FILTER '192.0.2.0/24';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP lb_etl WITH FAULT GROUP group_b FILTER '192.0.2.0/24';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE users_rule ROUTE '192.0.0.0/5' TO lb_users;
CREATE ROUTING RULE
=> CREATE ROUTING RULE etl_rule ROUTE '203.0.113.0/24' TO lb_etl;
CREATE ROUTING RULE

=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('198.51.200.129');
                          DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
 Describing load balance decision for address [198.51.200.129]
Load balance cache internal version id (node-local): [6]
Considered rule [etl_rule] source ip filter [203.0.113.0/24]... input address
does not match source ip filter for this rule.
Considered rule [users_rule] source ip filter [192.0.0.0/5]... input address
matches this rule
Matched to load balance group [lb_users] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [192.0.2.1]:5433
(1) LB Address: [192.0.2.2]:5433
(2) LB Address: [192.0.2.3]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [192.0.2.2] port
[5433]

(1 row)

=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('203.0.113.24');
                             DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
 Describing load balance decision for address [203.0.113.24]
Load balance cache internal version id (node-local): [6]
Considered rule [etl_rule] source ip filter [203.0.113.0/24]... input address
matches this rule
Matched to load balance group [lb_etl] the group has policy [ROUNDROBIN] number
of addresses [2]
(0) LB Address: [192.0.2.4]:5433
(1) LB Address: [192.0.2.5]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [192.0.2.5] port
[5433]

(1 row)

=> SELECT DESCRIBE_LOAD_BALANCE_DECISION('10.20.100.25');
                           DESCRIBE_LOAD_BALANCE_DECISION
-------------------------------------------------------------------------------
 Describing load balance decision for address [10.20.100.25]
Load balance cache internal version id (node-local): [6]
Considered rule [etl_rule] source ip filter [203.0.113.0/24]... input address
does not match source ip filter for this rule.
Considered rule [users_rule] source ip filter [192.0.0.0/5]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not match
any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Classic load balancing considered, but either the
policy was NONE or no target was available. Details: [NONE or invalid]

(1 row)

Enabling the default subcluster interior load balancing policy

Vertica attempts to apply the default subcluster interior load balancing policy if no other load balancing policy applies to an incoming connection and classic load balancing is not enabled. See Default Subcluster Interior Load Balancing Policy for a description of this rule.

To enable default subcluster interior load balancing, you must create network addresses for the nodes in a subcluster. Once you create the addresses, Vertica attempts to apply this rule to load balance connections within a subcluster when no other rules apply.

The following example confirms that the database has no load balancing groups or rules. Then it adds publicly accessible network addresses to the nodes in the primary subcluster. Once these addresses are added, Vertica applies the default subcluster interior load balancing policy.

=> SELECT * FROM LOAD_BALANCE_GROUPS;
 name | policy | filter | type | object_name
------+--------+--------+------+-------------
(0 rows)

=> SELECT * FROM ROUTING_RULES;
 name | source_address | destination_name
------+----------------+------------------
(0 rows)

=> CREATE NETWORK ADDRESS node01_ext ON v_verticadb_node0001 WITH '203.0.113.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02_ext ON v_verticadb_node0002 WITH '203.0.113.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03_ext ON v_verticadb_node0003 WITH '203.0.113.3';
CREATE NETWORK ADDRESS

=> SELECT describe_load_balance_decision('11.0.0.100');
                                describe_load_balance_decision
-----------------------------------------------------------------------------------------------
Describing load balance decision for address [11.0.0.100] on subcluster [default_subcluster]
Load balance cache internal version id (node-local): [2]
Considered rule [auto_rr_default_subcluster] subcluster interior filter  [default_subcluster]...
current subcluster matches this rule
Matched to load balance group [auto_lbg_sc_default_subcluster] the group has policy
[ROUNDROBIN] number of addresses [3]
(0) LB Address: [203.0.113.1]:5433
(1) LB Address: [203.0.113.2]:5433
(2) LB Address: [203.0.113.3]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [203.0.113.2] port [5433]

(1 row)

Load balance both IPv4 and IPv6 connections

Connection load balancing policies regard IPv4 and IPv6 connections as separate networks. To load balance both types of incoming client connections, create two sets of network addresses, two load balancing groups, and two routing rules: one of each for the IPv4 address family and one for the IPv6 address family.

This example creates two load balancing policies for the default subcluster: one for the IPv4 network addresses (192.168.111.31 to 192.168.111.33) and one for the IPv6 network addresses (fd9b:1fcc:1dc4:78d3::31 to fd9b:1fcc:1dc4:78d3::33).

=> SELECT node_name,node_address,subcluster_name FROM NODES;
      node_name       |  node_address  |  subcluster_name
----------------------+----------------+--------------------
 v_verticadb_node0001 | 192.168.111.31 | default_subcluster
 v_verticadb_node0002 | 192.168.111.32 | default_subcluster
 v_verticadb_node0003 | 192.168.111.33 | default_subcluster
(3 rows)

=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH
   '192.168.111.31';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node01_ipv6 ON v_verticadb_node0001 WITH
   'fd9b:1fcc:1dc4:78d3::31';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH
   '192.168.111.32';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02_ipv6 ON v_verticadb_node0002 WITH
   'fd9b:1fcc:1dc4:78d3::32';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH
   '192.168.111.33';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03_ipv6 ON v_verticadb_node0003 WITH
   'fd9b:1fcc:1dc4:78d3::33';
CREATE NETWORK ADDRESS

=> CREATE LOAD BALANCE GROUP group_ipv4 WITH SUBCLUSTER default_subcluster
   FILTER '192.168.111.0/24';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_ipv6 WITH SUBCLUSTER default_subcluster
   FILTER 'fd9b:1fcc:1dc4:78d3::0/64';
CREATE LOAD BALANCE GROUP

=> CREATE ROUTING RULE all_ipv4 ROUTE '0.0.0.0/0' TO group_ipv4;
CREATE ROUTING RULE
=> CREATE ROUTING RULE all_ipv6 ROUTE '0::0/0' TO group_ipv6;
CREATE ROUTING RULE

=> SELECT describe_load_balance_decision('203.0.113.50');
                                describe_load_balance_decision
-----------------------------------------------------------------------------------------------
Describing load balance decision for address [203.0.113.50] on subcluster [default_subcluster]
Load balance cache internal version id (node-local): [3]
Considered rule [all_ipv4] source ip filter [0.0.0.0/0]... input address matches this rule
Matched to load balance group [group_ipv4] the group has policy [ROUNDROBIN] number of addresses [3]
(0) LB Address: [192.168.111.31]:5433
(1) LB Address: [192.168.111.32]:5433
(2) LB Address: [192.168.111.33]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [192.168.111.33] port [5433]

(1 row)

=> SELECT describe_load_balance_decision('2001:0DB8:EA04:8F2C::1');
                                     describe_load_balance_decision
---------------------------------------------------------------------------------------------------------
Describing load balance decision for address [2001:0DB8:EA04:8F2C::1] on subcluster [default_subcluster]
Load balance cache internal version id (node-local): [3]
Considered rule [all_ipv4] source ip filter [0.0.0.0/0]... input address does not match source ip filter for this rule.
Considered rule [all_ipv6] source ip filter [0::0/0]... input address matches this rule
Matched to load balance group [group_ipv6] the group has policy [ROUNDROBIN] number of addresses [3]
(0) LB Address: [fd9b:1fcc:1dc4:78d3::31]:5433
(1) LB Address: [fd9b:1fcc:1dc4:78d3::32]:5433
(2) LB Address: [fd9b:1fcc:1dc4:78d3::33]:5433
Chose address at position [2]
Routing table decision: Success. Load balance redirect to: [fd9b:1fcc:1dc4:78d3::33] port [5433]

(1 row)

3.3.6 - Viewing load balancing policy configurations

Query the following system tables in the V_CATALOG schema to see the load balance policies defined in your database:

  • NETWORK_ADDRESSES lists all of the network addresses defined in your database.

  • LOAD_BALANCE_GROUPS lists the contents of your load balance groups.

  • ROUTING_RULES lists all of the routing rules defined in your database.

This example demonstrates querying each of the load balancing policy system tables.

=> \x
Expanded display is on.
=> SELECT * FROM V_CATALOG.NETWORK_ADDRESSES;
-[ RECORD 1 ]----+-----------------
name             | node01
node             | v_vmart_node0001
address          | 10.20.100.247
port             | 5433
address_family   | ipv4
is_enabled       | t
is_auto_detected | f
-[ RECORD 2 ]----+-----------------
name             | node02
node             | v_vmart_node0002
address          | 10.20.100.248
port             | 5433
address_family   | ipv4
is_enabled       | t
is_auto_detected | f
-[ RECORD 3 ]----+-----------------
name             | node03
node             | v_vmart_node0003
address          | 10.20.100.249
port             | 5433
address_family   | ipv4
is_enabled       | t
is_auto_detected | f
-[ RECORD 4 ]----+-----------------
name             | alt_node1
node             | v_vmart_node0001
address          | 192.168.1.200
port             | 8080
address_family   | ipv4
is_enabled       | t
is_auto_detected | f
-[ RECORD 5 ]----+-----------------
name             | test_addr
node             | v_vmart_node0001
address          | 192.168.1.100
port             | 4000
address_family   | ipv4
is_enabled       | t
is_auto_detected | f

=> SELECT * FROM LOAD_BALANCE_GROUPS;
-[ RECORD 1 ]----------------------
name        | group_all
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node01
-[ RECORD 2 ]----------------------
name        | group_all
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node02
-[ RECORD 3 ]----------------------
name        | group_all
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node03
-[ RECORD 4 ]----------------------
name        | group_1
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node01
-[ RECORD 5 ]----------------------
name        | group_1
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node02
-[ RECORD 6 ]----------------------
name        | group_2
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node01
-[ RECORD 7 ]----------------------
name        | group_2
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node02
-[ RECORD 8 ]----------------------
name        | group_2
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node03
-[ RECORD 9 ]----------------------
name        | etl_group
policy      | ROUNDROBIN
filter      |
type        | Network Address Group
object_name | node01

=> SELECT * FROM ROUTING_RULES;
-[ RECORD 1 ]----+-----------------
name             | internal_clients
source_address   | 192.168.1.0/24
destination_name | group_1
-[ RECORD 2 ]----+-----------------
name             | etl_rule
source_address   | 10.20.100.0/24
destination_name | etl_group
-[ RECORD 3 ]----+-----------------
name             | subnet_192
source_address   | 192.0.0.0/8
destination_name | group_all
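 
To see which network addresses each routing rule ultimately targets, you can join these tables on the rule's destination. The following is a minimal sketch for rules that route to load balance groups, using only the columns shown above (the aliases are illustrative):

=> SELECT r.name AS rule_name,
          r.source_address,
          g.name AS group_name,
          g.policy,
          g.object_name AS member
   FROM ROUTING_RULES r
   JOIN LOAD_BALANCE_GROUPS g ON g.name = r.destination_name
   ORDER BY r.name, g.object_name;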

3.3.7 - Maintaining load balancing policies

Once you have created load balancing policies, you maintain them using the following statements:

  • ALTER NETWORK ADDRESS lets you rename a network address, change its IP address, and enable or disable it.

  • ALTER LOAD BALANCE GROUP lets you rename a load balance group, add or remove network addresses or fault groups, change the fault group IP address filter, or change the group's policy.

  • ALTER ROUTING RULE lets you rename a rule and change its source IP address or target load balance group.

See the reference pages for these statements for examples; a brief sketch of their general shape follows.
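 
The following sketch is a rough illustration only: the object names are hypothetical, and the exact clauses supported can vary by version, so confirm the syntax on the reference pages before use.

=> -- Rename a network address (hypothetical names).
=> ALTER NETWORK ADDRESS node01_ext RENAME TO node01_public;

=> -- Add another address to an existing load balance group.
=> ALTER LOAD BALANCE GROUP external_group ADD ADDRESS node04_ext;

=> -- Rename a routing rule.
=> ALTER ROUTING RULE external_rule RENAME TO public_rule;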

Deleting load balancing policy objects

You can also delete existing load balance policy objects using the following statements: