Scaling your Eon Mode database

One of the strengths of an Eon Mode database is its ability to grow or shrink to meet your workload demands. You can add nodes to and remove nodes from your database to meet changing workload demands. For an overview of why you would scale your database and how it affects queries, see Elasticity.

Scaling up your database by starting stopped nodes

The easiest way to scale up your database is to start any stopped nodes.

Scaling up your database by adding nodes

If you do not have stopped nodes in your database, or the stopped nodes are not in the subclusters where you want to add new nodes, then you can add new nodes to the database. In supported environments, you can use the MC to provision and add new nodes to your database in a single step. See Viewing and managing your cluster for more information.

You can also manually add new nodes.

Controlling how Vertica uses your new nodes

New nodes can improve your database's performance in one of two ways:

  • Increase the query throughput (the number of queries your database processes at the same time).

  • Increase individual query performance (how fast each query runs).

See Elasticity for details on these performance improvements. You control how the new nodes improve your database's performance by choosing what subclusters you add them to. The following topics explain how to use scaling to improve throughput and query performance.

1 - Reshard the default namespace

The initial number of shards for the default_namespace is set when you create a database. You might choose to change the number of shards in this namespace for the following reasons:

  • Improve large subcluster performance. For example, if you have a 24-node subcluster and a namespace with 6 shards, the subcluster uses Elastic Crunch Scaling (ECS) to split the responsibility for processing the data in each shard among the nodes. Re-sharding the namespace to 24 shards avoids using ECS, which improves performance because ECS is not as efficient as having a 1:1 shard:node ratio. For more information, see Using elastic crunch scaling to improve query performance.

  • Reduce the catalog size. Reducing the number of shards in the default_namespace reduces the size of the catalog.

  • Improve performance after migrating from Enterprise Mode to Eon Mode. When you migrate your database from Enterprise Mode to Eon Mode, the number of shards in your default_namespace is initially set to the number of nodes in your Enterprise Mode database. This default number of shards might not be ideal. For details, see Choosing the initial node and shard counts.

  • Scale your database effectively. To evenly distribute work among nodes, the number of nodes in the database should be a multiple of the number of shards. You might re-shard the namespace if you plan to scale the subclusters to a size that is incompatible with this guidance. For example, a namespace with 7 shards works best with subclusters that have a multiple of 7 nodes. Choosing a shard count with more divisors, such as 8, gives you greater flexibility in choosing the number of nodes in a subcluster.

You should not re-shard your default_namespace every time you scale subclusters. While in progress, re-sharding might affect the database's performance. After re-sharding, the storage containers on the subcluster are not immediately aligned with the new shard subscription bounds. This misalignment adds overhead to query execution.

Re-sharding the default_namespace

To re-shard the default_namespace, call the RESHARD_DATABASE function with the new shard count as the argument. This function takes a global catalog lock, so avoid running it during busy periods or when performing heavy ETL loads. The runtime depends on the size of your catalog.

After RESHARD_DATABASE completes, the nodes in the cluster use the new catalog shard definitions. However, the re-sharding process does not immediately alter the storage containers in communal storage. The shards continue to point to the existing storage containers. For example, if you double the number of shards in the namespace, each storage container now has two associated shards. During queries, each node filters out the data in the storage containers that does not apply to its subscribed shard. This adds a small overhead to the query. Eventually, the Tuple Mover's background reflexive mergeout processes automatically update the storage containers so that they align with the new shard definitions. You can call DO_TM_TASK to run a 'RESHARDMERGEOUT' task that has the Tuple Mover realign the storage containers immediately.
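
For example, to trigger that realignment right away rather than waiting for the background mergeout, you can run the task directly. The following is a minimal sketch based on the task name mentioned above; it assumes the one-argument form of DO_TM_TASK runs the task across all tables. See the DO_TM_TASK reference for the optional table argument and other details:

=> SELECT DO_TM_TASK('RESHARDMERGEOUT');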

The following query returns the details of any storage containers that Tuple Mover has not yet realigned:

=> SELECT * FROM storage_containers WHERE original_segment_lower_bound IS NOT NULL AND original_segment_upper_bound IS NOT NULL;

Example

This example demonstrates the re-sharding process and how it affects shard assignments and storage containers. To illustrate the impact of re-sharding, the shard assignment and storage container details are compared before and after re-sharding. The following four queries return information about the namespace's shard count, shard bounds, node subscriptions, and storage container catalog objects:

=> SELECT * FROM NAMESPACES;
   namespace_oid   |  namespace_name   | is_default | default_shard_count
-------------------+-------------------+------------+---------------------
 45035996273704988 | default_namespace | t          |                   4

=> SELECT shard_name, lower_hash_bound, upper_hash_bound FROM shards ORDER BY shard_name;

shard_name  | lower_hash_bound | upper_hash_bound
------------+------------------+------------------
replica     |                  |
segment0001 |                0 |       1073741825
segment0002 |       1073741826 |       2147483649
segment0003 |       2147483650 |       3221225473
segment0004 |       3221225474 |       4294967295
(5 rows)

=> SELECT node_name, shard_name, is_primary, is_resubscribing, is_participating_primary FROM node_subscriptions;

node_name | shard_name  | is_primary | is_resubscribing | is_participating_primary
----------+-------------+------------+------------------+--------------------------
initiator | replica     | t          | f                | t
e0        | replica     | f          | f                | t
e1        | replica     | f          | f                | t
e2        | replica     | f          | f                | t
e0        | segment0002 | t          | f                | t
e1        | segment0003 | t          | f                | t
e2        | segment0004 | t          | f                | t
initiator | segment0001 | t          | f                | t
(8 rows)

=> SELECT node_name, projection_name, storage_oid, sal_storage_id, total_row_count, deleted_row_count, segment_lower_bound, segment_upper_bound, shard_name FROM storage_containers WHERE projection_name = 't_super';

node_name | projection_name |    storage_oid    |                  sal_storage_id                  | total_row_count | deleted_row_count | segment_lower_bound | segment_upper_bound | shard_name
----------+-----------------+-------------------+--------------------------------------------------+-----------------+-------------------+---------------------+---------------------+-------------
initiator | t_super         | 45035996273842990 | 022e836bff54b0aed318df2fe73b5afe00a0000000021b2d |               4 |                 0 |                   0 |          1073741825 | segment0001
e0        | t_super         | 49539595901213486 | 024bbf043c1ca3f5c7a86a423fc7e1e300b0000000021b2d |               3 |                 0 |          1073741826 |          2147483649 | segment0002
e1        | t_super         | 54043195528583990 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b35 |               8 |                 0 |          2147483650 |          3221225473 | segment0003
e2        | t_super         | 54043195528583992 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b31 |               6 |                 0 |          3221225474 |          4294967295 | segment0004
(4 rows)

The following call to RESHARD_DATABASE changes the shard count of default_namespace to eight:

=> SELECT RESHARD_DATABASE(8);

                RESHARD_DATABASE
----------------------------------------------------------
The database has been re-sharded from 4 shards to 8 shards
(1 row)

To confirm that the reshard operation was successful, query the NAMESPACES system table:

=> SELECT * FROM NAMESPACES;
   namespace_oid   |  namespace_name   | is_default | default_shard_count
-------------------+-------------------+------------+---------------------
 45035996273704988 | default_namespace | t          |                   8

You can use the following query to view the namespace's new shard definitions:

=> SELECT shard_name, lower_hash_bound, upper_hash_bound FROM shards ORDER BY shard_name;

shard_name  | lower_hash_bound | upper_hash_bound
-------------+------------------+------------------
replica     |                  |
segment0001 |                0 |        536870913
segment0002 |        536870914 |       1073741825
segment0003 |       1073741826 |       1610612737
segment0004 |       1610612738 |       2147483649
segment0005 |       2147483650 |       2684354561
segment0006 |       2684354562 |       3221225473
segment0007 |       3221225474 |       3758096385
segment0008 |       3758096386 |       4294967295
(9 rows)

The namespace now has eight shards. Because re-sharding cut the boundary range of each shard in half, each shard is responsible for about half as much of the communal storage data.

The following query returns the namespace's new node subscriptions:

=> SELECT node_name, shard_name, is_primary, is_resubscribing, is_participating_primary FROM node_subscriptions;

node_name | shard_name  | is_primary | is_resubscribing | is_participating_primary
-----------+-------------+------------+------------------+--------------------------
initiator | replica     | t          | f                | t
e0        | replica     | f          | f                | t
e1        | replica     | f          | f                | t
e2        | replica     | f          | f                | t
initiator | segment0001 | t          | f                | t
e0        | segment0002 | t          | f                | t
e1        | segment0003 | t          | f                | t
e2        | segment0004 | t          | f                | t
initiator | segment0005 | t          | f                | t
e0        | segment0006 | t          | f                | t
e1        | segment0007 | t          | f                | t
e2        | segment0008 | t          | f                | t
(12 rows)

After re-sharding, each node now subscribes to two shards instead of one.

You can use the following query to see how re-sharding affected the database's storage container catalog objects:

=> SELECT node_name, projection_name, storage_oid, sal_storage_id, total_row_count, deleted_row_count, segment_lower_bound, segment_upper_bound, shard_name FROM storage_containers WHERE projection_name = 't_super';

node_name | projection_name |    storage_oid    |                  sal_storage_id                  | total_row_count | deleted_row_count | segment_lower_bound | segment_upper_bound | shard_name
----------+-----------------+-------------------+--------------------------------------------------+-----------------+-------------------+---------------------+---------------------+-------------
initiator | t_super         | 45035996273843145 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b35 |               8 |                 0 |          2147483650 |          3221225473 | segment0005
initiator | t_super         | 45035996273843149 | 022e836bff54b0aed318df2fe73b5afe00a0000000021b2d |               4 |                 0 |                   0 |          1073741825 | segment0001
e0        | t_super         | 49539595901213641 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b35 |               8 |                 0 |          2147483650 |          3221225473 | segment0006
e0        | t_super         | 49539595901213645 | 022e836bff54b0aed318df2fe73b5afe00a0000000021b2d |               4 |                 0 |                   0 |          1073741825 | segment0002
e1        | t_super         | 54043195528584141 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b31 |               6 |                 0 |          3221225474 |          4294967295 | segment0007
e1        | t_super         | 54043195528584143 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b31 |               6 |                 0 |          1073741826 |          2147483649 | segment0003
e2        | t_super         | 54043195528584137 | 024bbf043c1ca3f5c7a86a423fc7e1e300b0000000021b2d |               3 |                 0 |          3221225474 |          4294967295 | segment0008
e2        | t_super         | 54043195528584139 | 024bbf043c1ca3f5c7a86a423fc7e1e300b0000000021b2d |               3 |                 0 |          1073741826 |          2147483649 | segment0004
(8 rows)

The shards point to storage files with the same sal_storage_id values as before the re-shard. Eventually, the Tuple Mover's mergeout processes automatically update the storage containers so that they align with the new shard definitions.

You can query the RESHARDING_EVENTS system table for information about current and historical resharding operations, such as a node's previous shard subscription bounds and the current status of the resharding operation:

=> SELECT node_name, running_status, old_shard_name, old_shard_lower_bound, old_shard_upper_bound FROM RESHARDING_EVENTS;
node_name | running_status  | old_shard_name |   old_shard_lower_bound   |   old_shard_upper_bound
----------+-----------------+----------------+---------------------------+-------------------------
e0        | Running         | segment0001    |                         0 |              1073741825
e0        | Running         | segment0002    |                1073741826 |              2147483649
e0        | Running         | segment0003    |                2147483650 |              3221225473
e0        | Running         | segment0004    |                3221225474 |              4294967295
e1        | Running         | segment0001    |                         0 |              1073741825
e1        | Running         | segment0002    |                1073741826 |              2147483649
e1        | Running         | segment0003    |                2147483650 |              3221225473
e1        | Running         | segment0004    |                3221225474 |              4294967295
initiator | Running         | segment0001    |                         0 |              1073741825
initiator | Running         | segment0002    |                1073741826 |              2147483649
initiator | Running         | segment0003    |                2147483650 |              3221225473
initiator | Running         | segment0004    |                3221225474 |              4294967295
(12 rows)

After you re-shard the namespace, you can query the DC_ROSES_CREATED table to track the original ROS containers and DVMiniROSes from which the new storage containers were derived:

=> SELECT node_name, projection_name, storage_oid, old_storage_oid, is_dv FROM DC_ROSES_CREATED;
     node_name       | projection_name  |    storage_oid    |   old_storage_oid  | is_dv
---------------------+------------------+-------------------+--------------------+-------
initiator            | t_super          | 45035996273860625 | 45035996273843149  |  f
initiator            | t_super          | 45035996273860632 |                 0  |  f
e0                   | t_super          | 45035996273843149 |                 0  |  f
(3 rows)

2 - Improving query throughput using subclusters

Improving query throughput increases the number of queries your Eon Mode database processes at the same time. You are usually concerned about your database's throughput when your workload consists of many short-running queries, often referred to as "dashboard queries." This term describes the type of workload you see when a large number of users have web-based dashboard pages open to monitor some sort of status. These dashboards tend to update frequently, using simple, short-running queries instead of analytics-heavy, long-running queries.

The best way to improve your database's throughput is to add new subclusters to the database or start any stopped subclusters. Then distribute the client connections among these subclusters using connection load balancing policies. Subclusters independently process queries. By adding more subclusters, you improve your database's parallelism.

For the best performance, make the number of nodes in your subcluster the same as the number of shards in a namespace. If you choose to have fewer nodes than the number of shards, make the number of nodes an even divisor of the number of shards. When the number of shards is divisible by the number of nodes, the data in your namespace is equally divided among the nodes in the subcluster.
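
To check the current ratio, you can compare each subcluster's node count to the namespace's shard count. The following query is a minimal sketch: it assumes the V_CATALOG.SUBCLUSTERS system table is available in your version and excludes the replica shard by name, as in the shard listings shown in Reshard the default namespace:

=> SELECT sc.subcluster_name,
          COUNT(DISTINCT sc.node_name) AS node_count,
          (SELECT COUNT(*) FROM shards WHERE shard_name <> 'replica') AS shard_count
   FROM subclusters sc
   GROUP BY sc.subcluster_name;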

The easiest way of adding subclusters is to use the MC:

  1. From the MC home page, click the database you want to add subclusters to.

  2. Click Manage.

  3. Click Add Subcluster.

  4. Follow the steps in the wizard to add the subcluster. Normally, the only items you need to fill in are the subcluster name and the number of instances to add to it.

Distributing clients among the throughput subclusters

To gain benefits from the added subclusters, you must have clients that will execute short-running queries connect to the nodes that the subclusters contain. Queries run only on the subcluster that contains the initiator node (the node that the client is connected to). Use connection load balancing policies to spread the connections across all of the subclusters you created to increase query throughput. See Connection load balancing policies for details.

The following example creates a load balancing policy that spreads client connections across two three-node subclusters named query_pool_a and query_pool_b. This example:

  • Creates network addresses on the six nodes that are in the two subclusters.

  • Creates a load balance group from all the nodes in the two subclusters.

  • Creates the routing rule to redirect all incoming connections to the two subclusters.

=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '203.0.113.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '203.0.113.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '203.0.113.3';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node07 ON v_verticadb_node0007 WITH '203.0.113.4';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node08 ON v_verticadb_node0008 WITH '203.0.113.5';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node09 ON v_verticadb_node0009 WITH '203.0.113.6';
CREATE NETWORK ADDRESS

=> CREATE LOAD BALANCE GROUP query_subclusters WITH SUBCLUSTER query_pool_a,
   query_pool_b FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE query_clients ROUTE '0.0.0.0/0' TO query_subclusters;
CREATE ROUTING RULE

After creating the policy, any client that opts into load balancing is redirected to one of the nodes in the two subclusters. For example, when you connect to node 1 in the cluster (with the IP address 203.0.113.1) using vsql with the -C flag, you see output similar to this:

$ vsql -h 203.0.113.1 -U dbadmin -w mypassword -C
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)

INFO: Connected using a load-balanced connection.
INFO: Connected to 203.0.113.7 at port 5433.
=>

Connection load balancing policies take into account nodes that are stopped when picking a node to handle a client connection. If you shut down one or more subclusters to save money during low-demand periods, you do not need to adjust your load balancing policy as long as some of the nodes are still up.

3 - Using elastic crunch scaling to improve query performance

You can choose to add nodes to your database to improve the performance of complex, long-running analytic queries.

When you have more nodes in a subcluster than you have shards in a namespace, multiple nodes subscribe to each shard. To involve all of the nodes in the subcluster in queries, the Vertica query optimizer automatically uses a feature called Elastic Crunch Scaling (ECS). This feature splits the responsibility for processing the data in each shard among the nodes that subscribe to it. During a query, each node has less data to process and usually finishes the query faster.

For example, suppose you have a six-node subcluster in a three-shard namespace. In this subcluster, two nodes subscribe to each shard of the namespace. When you execute a query, Vertica assigns each node roughly half of the data in the shard it subscribes to. Because all nodes in the subcluster participate in the query, the query usually finishes faster than if only half the nodes had participated.

ECS lets a subcluster that has more nodes than shards in a namespace act as if the shard count in the namespace were higher. In a three-shard namespace, a six-node subcluster acts as if the namespace has six shards by splitting each shard in half. However, using ECS isn't as efficient as having a higher shard count. In practice, you will see slightly slower query performance from a six-node subcluster in a three-shard namespace than from a six-node subcluster in a six-shard namespace.

If the only namespace in your database is default_namespace, you can call RESHARD_DATABASE to change the number of shards in the namespace. If the new number of shards is greater than or equal to the number of nodes in the subcluster, the subcluster no longer uses ECS. This will generally lead to faster query performance. However, re-sharding produces a larger catalog size and storage containers that are initially misaligned with the new shard definitions. Until the storage containers are realigned, queries must filter out the data in the storage containers that is outside the new shard bounds. This adds a small overhead to queries. For details, see Reshard the default namespace.

You can determine when the optimizer will use ECS in a subcluster by querying the V_CATALOG.SESSION_SUBSCRIPTIONS system table and looking for nodes whose is_collaborating column is TRUE. Subclusters whose node count is less than or equal to the number of shards in a namespace have only participating nodes. Subclusters that have more nodes than a namespace's shard count assign the "extra" nodes the role of collaborators. The difference between the two types of nodes does not matter when you execute queries; it relates only to how Vertica organizes the nodes to execute ECS-enabled queries.

This example shows how to get the list of nodes that are participating or collaborating in resolving queries for the current session:

=> SELECT node_name, shard_name, is_collaborating, is_participating
        FROM V_CATALOG.SESSION_SUBSCRIPTIONS
        WHERE is_participating = TRUE OR is_collaborating = TRUE
        ORDER BY shard_name, node_name;
      node_name       | shard_name  | is_collaborating | is_participating
----------------------+-------------+------------------+------------------
 v_verticadb_node0004 | replica     | f                | t
 v_verticadb_node0005 | replica     | f                | t
 v_verticadb_node0006 | replica     | t                | f
 v_verticadb_node0007 | replica     | f                | t
 v_verticadb_node0008 | replica     | t                | f
 v_verticadb_node0009 | replica     | t                | f
 v_verticadb_node0007 | segment0001 | f                | t
 v_verticadb_node0008 | segment0001 | t                | f
 v_verticadb_node0005 | segment0002 | f                | t
 v_verticadb_node0009 | segment0002 | t                | f
 v_verticadb_node0004 | segment0003 | f                | t
 v_verticadb_node0006 | segment0003 | t                | f
(12 rows)

You can see that nodes 4, 5, and 7 are participating, and nodes 6, 8, and 9 are collaborating.

You can also see that ECS is enabled by looking at an EXPLAIN plan for a query. At the top of the plan for an ECS-enabled query is the statement "this query involves non-participating nodes." These non-participating nodes are the collaborating nodes that are splitting the data in the shard with the participating nodes. The plan also lists the nodes taking part in the query.

This example shows an explain plan for an ECS-enabled query in a six-node subcluster in a three-shard namespace:

=> EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
     FROM online_sales.online_sales_fact
     INNER JOIN online_sales.call_center_dimension
     ON (online_sales.online_sales_fact.call_center_key
         = online_sales.call_center_dimension.call_center_key
         AND sale_date_key = 156)
     ORDER BY sales_dollar_amount DESC;

                                      QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes. Crunch scaling
 strategy preserves data segmentation
 ------------------------------

 EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
 FROM online_sales.online_sales_fact
 INNER JOIN online_sales.call_center_dimension
 ON (online_sales.online_sales_fact.call_center_key
     = online_sales.call_center_dimension.call_center_key
     AND sale_date_key = 156)
 ORDER BY sales_dollar_amount DESC;

 Access Path:
 +-SORT [Cost: 6K, Rows: 754K] (PATH ID: 1)
 |  Order: online_sales_fact.sales_dollar_amount DESC
 |  Execute on: v_verticadb_node0007, v_verticadb_node0004, v_verticadb_node0005,
 |     v_verticadb_node0006, v_verticadb_node0008, v_verticadb_node0009
 | +---> JOIN MERGEJOIN(inputs presorted) [Cost: 530, Rows: 754K (202 RLE)] (PATH ID: 2)
 | |      Join Cond: (online_sales_fact.call_center_key = call_center_dimension.call_center_key)
 | |      Materialize at Output: online_sales_fact.sales_quantity,
 | |            online_sales_fact.sales_dollar_amount, online_sales_fact.transaction_type
 | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | |            v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
 | |            v_verticadb_node0009
 | | +-- Outer -> STORAGE ACCESS for online_sales_fact [Cost: 13, Rows: 754K (202 RLE)] (PATH ID: 3)
 | | |      Projection: online_sales.online_sales_fact_DBD_18_seg_vmart_b0
 | | |      Materialize: online_sales_fact.call_center_key
 | | |      Filter: (online_sales_fact.sale_date_key = 156)
 | | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | | |          v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
 | | |          v_verticadb_node0009
 | | |      Runtime Filter: (SIP1(MergeJoin): online_sales_fact.call_center_key)
 | | +-- Inner -> STORAGE ACCESS for call_center_dimension [Cost: 17, Rows: 200] (PATH ID: 4)
 | | |      Projection: online_sales.call_center_dimension_DBD_16_seg_vmart_b0
 | | |      Materialize: call_center_dimension.call_center_key, call_center_dimension.cc_name
 | | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | | |          v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
 | | |          v_verticadb_node0009
 . . .

Taking advantage of ECS

To take advantage of ECS, create a secondary subcluster where the number of nodes is a multiple of the number of shards in a namespace. For example, in a 12-shard namespace, create a subcluster that contains a multiple of 12 nodes such as 24 or 36. The number of nodes must be a multiple of the number of shards to evenly distribute the data in the namespace across the nodes in the subcluster. See Subclusters for more information.

Once you have created the subcluster, have users connect to it and run their analytic queries. Vertica automatically enables ECS in the subcluster because it has more nodes than there are shards in a namespace.
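
One way to route analytic clients to the new subcluster is a connection load balancing policy like the one shown in Improving query throughput using subclusters. The following sketch assumes a secondary subcluster named analytics_pool whose nodes already have network addresses defined; the group name, rule name, and CIDR ranges are placeholders to adjust for your environment:

=> CREATE LOAD BALANCE GROUP analytics_group WITH SUBCLUSTER analytics_pool
   FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE analytics_clients ROUTE '192.0.2.0/24' TO analytics_group;
CREATE ROUTING RULE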

How the optimizer assigns data responsibilities to nodes

The optimizer has two strategies to choose from when dividing the data in a shard among its subscribing nodes. One strategy is optimized for queries that use data segmentation. Queries that contain a JOIN or GROUP BY clause rely on data segmentation. The other strategy is for queries that do not need segmentation.

By default, the optimizer automatically chooses the strategy to use. For most queries, the automatically-chosen strategy results in faster query performance. For some queries, you may want to manually override the strategy using hints. In a small number of queries, ECS does not help performance. In these cases, you can disable ECS. See Manually choosing an ECS strategy for details.

4 - Manually choosing an ECS strategy

When the number of nodes in a subcluster is greater than the number of database shards, the Vertica query optimizer uses elastic crunch scaling (ECS) to involve all nodes in processing queries. For each shard, the optimizer divides responsibility for processing shard data among its subscribing nodes, using one of the following strategies:

  • I/O-optimized: The optimizer divides the list of ROS containers in the shard among the subscribing nodes. Use this strategy when nodes must fetch the data for the query from communal storage, rather than the depot. Nodes fetch from communal storage only the ROS containers they need to resolve the query, reducing the amount of data each node transfers. Due to the arbitrary division of data among the nodes, this strategy does not support query optimizations that rely on data segmentation.

  • Compute-optimized: The optimizer uses data segmentation to assign portions of the data to each subscribing node. The nodes scan the entire shard, but use sub-segment filtering to find their assigned segments of the data. Use this strategy when most of the data for the query is in the depot, because nodes must scan the entire contents of the shard. Because this strategy uses data segmentation, it supports optimizations, such as local joins, that the I/O-optimized strategy cannot.

The optimizer automatically chooses a strategy based on whether the query can take advantage of data segmentation. You can tell which strategy the optimizer chooses for a query by using EXPLAIN. The top of the plan explanation states whether ECS is preserving segmentation. For example, this simple query on a single table does not need to use segmentation, so it uses the I/O-optimized strategy:

=> EXPLAIN SELECT employee_last_name,
            employee_first_name,employee_age
            FROM employee_dimension
            ORDER BY employee_age DESC;

                                 QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy does not preserve data segmentation
 ------------------------------
. . .

A more complex query using a JOIN results in ECS preserving data segmentation by using the compute-optimized strategy. The query plan tells you that segmentation is preserved:

=> EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
    FROM online_sales.online_sales_fact
    INNER JOIN online_sales.call_center_dimension
    ON (online_sales.online_sales_fact.call_center_key
        = online_sales.call_center_dimension.call_center_key
        AND sale_date_key = 156)
    ORDER BY sales_dollar_amount DESC;

                                     QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
 . . .

In most cases, the optimizer chooses the best strategy for dividing the data among the nodes subscribing to the same shard. However, you might occasionally find that some queries perform poorly. In these cases, you can embed the ECSMODE hint in the query to specify which strategy to use, or even to disable ECS.

Setting the ECS strategy for individual queries

You can use the ECSMODE hint in a query to force the optimizer to use a specific ECS strategy (or disable ECS entirely). The ECSMODE hint takes one of the following arguments:

  • AUTO: The optimizer chooses the strategy to use, useful only if ECS mode is set at the session level (see Setting the ECS Strategy for the Session or Database).

  • IO_OPTIMIZED: Use I/O-optimized strategy.

  • COMPUTE_OPTIMIZED: Use compute-optimized strategy.

  • NONE: Disable use of ECS for this query. Only participating nodes are involved in query execution; collaborating nodes are not.

The following example shows the query plan for a simple single-table query that is forced to use the compute-optimized strategy:

=> EXPLAIN SELECT /*+ECSMode(COMPUTE_OPTIMIZED)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
. . .

This example disables ECS in a six-node subcluster in a three-shard database:

=> EXPLAIN SELECT /*+ECSMode(NONE)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

                                     QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT /*+ECSMode(NONE)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

 Access Path:
 +-SORT [Cost: 243, Rows: 10K] (PATH ID: 1)
 |  Order: employee_dimension.employee_age DESC
 |  Execute on: v_verticadb_node0007, v_verticadb_node0004, v_verticadb_node0005
 | +---> STORAGE ACCESS for employee_dimension [Cost: 71, Rows: 10K] (PATH ID: 2)
 | |      Projection: public.employee_dimension_DBD_8_seg_vmart_b0
 | |      Materialize: employee_dimension.employee_first_name,
 | |      employee_dimension.employee_last_name, employee_dimension.employee_age
 | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | |          v_verticadb_node0005
 . . .

Note that this query plan lacks the "this query involves non-participating nodes" statement, indicating that it does not use ECS. It also lists just three participating nodes. These nodes are marked as participating in the V_CATALOG.SESSION_SUBSCRIPTIONS system table.

Setting the ECS strategy for the session or database

You can use the ECSMode configuration parameter to set the ECS strategy for the current session. This parameter accepts the same values as the ECSMODE hint except NONE, which is valid only for individual queries.

The following example demonstrates using the configuration parameter to force a simple query to use the COMPUTE_OPTIMIZED strategy. It then sets the parameter back to its default value of AUTO:

=> EXPLAIN SELECT employee_first_name,employee_age
    FROM employee_dimension ORDER BY employee_age DESC;

                                 QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy does not preserve data segmentation
 ------------------------------
. . .

=> ALTER SESSION SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER SESSION
=> EXPLAIN SELECT employee_first_name,employee_age
    FROM employee_dimension ORDER BY employee_age DESC;


                                  QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
 . . .

=> ALTER SESSION SET ECSMode = 'AUTO';
ALTER SESSION

Individual query hints override the session-level settings. This example sets the session default to use COMPUTE_OPTIMIZED, then restores the default behavior for a query by using the ECSMode hint with the value AUTO:

=> ALTER SESSION SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER SESSION
=> EXPLAIN SELECT /*+ECSMode(AUTO)*/ employee_first_name,employee_age
   FROM employee_dimension ORDER BY employee_age DESC;


                                  QUERY PLAN
-----------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy does not preserve data segmentation
 ------------------------------

Note that setting the ECSMode hint to AUTO lets the optimizer pick the I/O-optimized strategy (which does not preserve segmentation) instead of using the compute-optimized strategy set at the session level.

You can also set the ECS strategy at the database level using ALTER DATABASE. However, doing so overrides the Vertica optimizer's settings for all users in all subclusters that use ECS. Before setting the ECS strategy at the database level, verify that the majority of the queries run by users of the ECS-enabled subclusters benefit from overriding the optimizer's default behavior. If they do not, use the session-level or query-level settings to override the optimizer only for the queries that benefit from a specific strategy.
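
For example, a database-wide override might look like the following sketch. It assumes that ALTER DATABASE accepts the ECSMode parameter in the same form that ALTER SESSION does and that DEFAULT refers to the current database; verify the exact syntax against the ALTER DATABASE reference for your version:

=> ALTER DATABASE DEFAULT SET ECSMode = 'IO_OPTIMIZED';
ALTER DATABASE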