Workload routing

Workload routing routes client connections to subclusters based on their workloads.

Workload routing routes client connections to subclusters. This lets you reserve subclusters for certain types of tasks.

When a client connects to Vertica, they connect to a Connection node, which then routes the client to the correct subcluster based on the client's specified workload, user, or role and the database's routing rules. If multiple subclusters are associated with the same workload, the client is randomly routed to one of those subclusters.

In this context, "routing" refers to the connection node acting as a proxy for the client and the Execution node in the target subcluster. All queries and query results are first sent to the connection node and then passed on to the execution node and client, respectively.

The primary advantages of this type of load balancing are as follows:

  • Database administrators can associate certain subclusters with certain workloads and roles (as opposed to client IP addresses).
  • Clients do not need to know anything about the subcluster they will be routed to, only the type of workload they have or the role they should use.

Workload routing depends on actions from both the database administrator and the client:

  • The database administrator must create rules for handling various workloads.
  • The client must either specify the type of workload they have or have an enabled role (either from a manual SET ROLE or default role) associated with a routing rule.

View the current workload

To view the workload associated with the current session, use SHOW WORKLOAD:

=> SHOW WORKLOAD;
   name   |  setting
----------+------------
 workload | my_worload
(1 row)

Create workload routing rules

Routing rules apply to a client's specified workload and either user or role. If you specify more than one subcluster in a routing rule, the client is randomly routed to one of those subclusters.

If multiple routing rules could apply to a client's session, the rule with the highest priority is used. For details, see Priorities.

To view existing routing rules, see WORKLOAD_ROUTING_RULES.

To view workloads available to you and your enabled roles, use SHOW AVAILABLE WORKLOADS:

=> SHOW AVAILABLE WORKLOADS;
                name | setting
---------------------+------------------------
 available workloads | reporting, analytics
(1 row) 

Workload-based routing

Workload-based routing rules apply to clients that specify a particular workload, routing them to one of the subclusters listed in the rule. In this example, when a client connects to the database and specifies the analytics workload, their connection is randomly routed to either sc_analytics or sc_analytics_2:

=> CREATE ROUTING RULE ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics, sc_analytics_2;

To alter a routing rule, use ALTER ROUTING RULE. For example, to route analytics workloads to sc_analytics:

=> ALTER ROUTING RULE FOR WORKLOAD analytics SET SUBCLUSTER TO sc_analytics;

To drop a routing rule, use DROP ROUTING RULE and specify the workload. For example, to drop the routing rule for the analytics workload:

=> DROP ROUTING RULE FOR WORKLOAD analytics;

User- and role-based routing

You can grant USAGE privileges to a user or role to let them route their queries to one of the subclusters listed in the routing rule. In this example, when a client connects to the database and enables the analytics_role role, their connection is randomly routed to either sc_analytics or sc_analtyics_2:

=> CREATE ROUTING RULE ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics, sc_analytics_2;
=> GRANT USAGE ON ROUTING RULE analytics TO analytics_role;

Users can then enable the role and set their workload to analytics for the session:

=> SET ROLE analytics_role;
=> SET SESSION WORKLOAD analytics;

Users can also enable the role automatically by setting it as a default role and then specify the workload when they connect. For details on default roles, see Enabling roles automatically.

Similarly, in this example, when a client connects to the database as user analytics_user, they are randomly routed to either sc_analytics or sc_analtyics_2:

=> CREATE ROUTING RULE ROUTE WORKLOAD analytics TO SUBCLUSTER sc_analytics, sc_analytics_2;
=> GRANT USAGE ON ROUTING RULE analytics TO analytics_user;

Priorities

Only one workload routing rule can apply to a session at any given time. If multiple routing rules are granted to a user or role, the priority associated with the rule determines which one applies. If multiple rules have the highest priority, Vertica chooses one of those rules randomly.

You can force a routing rule to apply to your session, ignoring priority, by using SET SESSION WORKLOAD.

In this example, the user Gunther is granted usage on the routing rule for workload reporting. The following statement does not specify a priority, so the priority defaults to 0:

=> GRANT USAGE ON WORKLOAD reporting TO Gunther;

Gunther also has the roles analytics_role and qa_role, each of which are granted their own routing rules:

=> GRANT USAGE ON WORKLOAD analytics TO analytics_role;
=> GRANT USAGE ON WORKLOAD qa TO qa_role;

When Gunther first connects to Vertica, the routing rule for reporting applies. If Gunther then enables the analytics_role and qa_role, the routing rule for qa_role applies instead because it has the greatest priority value.

Similarly, if the priority of the routing rule for reporting is changed to 3 with ALTER ROUTING RULE, that rule applies instead:

=> ALTER ROUTING RULE FOR WORKLOAD reporting SET PRIORITY TO 3;

To ignore priorities and force the session to use the analytics rule, Gunther can use SET SESSION WORKLOAD:

=> SET SESSION WORKLOAD analytics;

Specify a workload

The workload for a given connection can be reported by the client when they connect. The method for specifying a workload depends on the client.

The following examples set the workload to analytics for several clients. After you connect, you can verify that the workload was set with SHOW WORKLOAD.

vsql uses --workload:

$ vsql --dbname databaseName --host node01.example.com --username Bob --password my_pwd --workload analytics

JDBC uses workload:

jdbc:vertica://node01.example.com:5443/databaseName?user=analytics_user&password=***&workload=analytics

ODBC uses Workload:

Database=databaseName;Servername=node01.mydomain.com;Port=5433;UID=analytics_user;PWD=***;Workload=analytics

ADO.NET uses Workload:

Database=databaseName;Host=node01.mydomain.com;Port=5433;User=analytics_user;Password=***;Workload=analytics

vertica-sql-go uses Workload:

var query = url.URL{
    Scheme:   "vertica",
    User:     url.UserPassword(user, password),
    Host:     fmt.Sprintf("%s:%d", host, port),
    Path:     databaseName,
    Workload: "analytics",
    RawQuery: rawQuery.Encode(),
}

vertica-python uses workload:

conn_info = {'host': '127.0.0.1',
            'port': 5433,
            'user': 'some_user',
            'password': 'my_pwd',
            'database': 'databaseName',
            'workload': 'analytics',
            # autogenerated session label by default,
            'session_label': 'some_label',
            # default throw error on invalid UTF-8 results
            'unicode_error': 'strict',
            # SSL is disabled by default
            'ssl': False,
            # autocommit is off by default
            'autocommit': True,
            # using server-side prepared statements is disabled by default
            'use_prepared_statements': False,
            # connection timeout is not enabled by default
            # 5 seconds timeout for a socket operation (Establishing a TCP connection or read/write operation)
            'connection_timeout': 5}

vertica-nodejs uses workload:

const client = new Client({
    user:     "vertica_user",
    host:     "node01.example.com",
    database: "verticadb",
    password: "",
    port:     "5433",
    workload: "analytics",
})

Clients can also change their workload type after they connect with SET SESSION WORKLOAD:

=> SET SESSION WORKLOAD my_workload;