This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Best practices for managing workload resources

This section provides general guidelines and best practices on how to set up and tune resource pools for various common scenarios.

This section provides general guidelines and best practices on how to set up and tune resource pools for various common scenarios.

1 - Basic principles for scalability and concurrency tuning

A Vertica database runs on a cluster of commodity hardware.

A Vertica database runs on a cluster of commodity hardware. All loads and queries running against the database take up system resources, such as CPU, memory, disk I/O bandwidth, file handles, and so forth. The performance (run time) of a given query depends on how much resource it has been allocated.

When running more than one query concurrently on the system, both queries are sharing the resources; therefore, each query could take longer to run than if it was running by itself. In an efficient and scalable system, if a query takes up all the resources on the machine and runs in X time, then running two such queries would double the run time of each query to 2X. If the query runs in > 2X, the system is not linearly scalable, and if the query runs in < 2X then the single query was wasteful in its use of resources. Note that the above is true as long as the query obtains the minimum resources necessary for it to run and is limited by CPU cycles. Instead, if the system becomes bottlenecked so the query does not get enough of a particular resource to run, then the system has reached a limit. In order to increase concurrency in such cases, the system must be expanded by adding more of that resource.

In practice, Vertica should achieve near linear scalability in run times, with increasing concurrency, until a system resource limit is reached. When adequate concurrency is reached without hitting bottlenecks, then the system can be considered as ideally sized for the workload.

2 - Setting a runtime limit for queries

You can set a limit for the amount of time a query is allowed to run.

You can set a limit for the amount of time a query is allowed to run. You can set this limit at three levels, listed in descending order of precedence:

  1. The resource pool to which the user is assigned.

  2. User profile with RUNTIMECAP configuredby CREATE USER/ ALTER USER

  3. Session queries, set by SET SESSION RUNTIMECAP

In all cases, you set the runtime limit with an interval value that does not exceed one year. When you set runtime limit at multiple levels, Vertica always uses the shortest value. If a runtime limit is set for a non-superuser, that user cannot set any session to a longer runtime limit. Superusers can set the runtime limit for other users and for their own sessions, to any value up to one year, inclusive.

Example

user1 is assigned to the ad_hoc_queries resource pool:

=> CREATE USER user1 RESOURCE POOL ad_hoc_queries;

RUNTIMECAP for user1 is set to 1 hour:

=> ALTER USER user1 RUNTIMECAP '60 minutes';

RUNTIMECAP for the ad_hoc_queries resource pool is set to 30 minutes:

=> ALTER RESOURCE POOL ad_hoc_queries RUNTIMECAP '30 minutes';

In this example, Vertica terminates user1's queries if they exceed 30 minutes. Although the user1's runtime limit is set to one hour, the pool on which the query runs, which has a 30-minute runtime limit, has precedence.

See also

3 - Handling session socket blocking

A session socket can be blocked while awaiting client input or output for a given query.

A session socket can be blocked while awaiting client input or output for a given query. Session sockets are typically blocked for numerous reasons—for example, when the Vertica execution engine transmits data to the client, or a COPY LOCAL operation awaits load data from the client.

In rare cases, a session socket can remain indefinitely blocked. For example, a query times out on the client, which tries to forcibly cancel the query, or relies on the session RUNTIMECAP setting to terminate it. In either case, if the query ends while awaiting messages or data, the socket can remain blocked and the session hang until it is forcibly closed.

Configuring a grace period

You can configure the system with a grace period, during which a lagging client or server can catch up and deliver a pending response. If the socket is blocked for a continuous period that exceeds the grace period setting, the server shuts down the socket and throws a fatal error. The session is then terminated. If no grace period is set, the query can maintain its block on the socket indefinitely.

You should set the session grace period high enough to cover an acceptable range of latency and avoid closing sessions prematurely—for example, normal client-side delays in responding to the server. Very large load operations might require you to adjust the session grace period as needed.

You can set the grace period at four levels, listed in descending order of precedence:

  1. Session (highest)

  2. User

  3. Node

  4. Database

Setting grace periods for the database and nodes

At the database and node levels, you set the grace period to any interval up to 20 days, through configuration parameter BlockedSocketGracePeriod:

  • ALTER DATABASE db-name SET BlockedSocketGracePeriod = 'interval';
  • ALTER NODE node-name SET BlockedSocketGracePeriod = 'interval';

By default, the grace period for both levels is set to an empty string, which allows unlimited blocking.

Setting grace periods for users and sessions

You can set the grace period for individual users and for a given session, as follows:

A user can set a session to any interval equal to or less than the grace period set for that user. Superusers can set the grace period for other users, and for their own sessions, to any value up to 20 days, inclusive.

Examples

Superuser dbadmin sets the database grace period to 6 hours. This limit only applies to non-superusers. dbadmin can sets the session grace period for herself to any value up to 20 days—in this case, 10 hours:

=> ALTER DATABASE VMart SET BlockedSocketGracePeriod = '6 hours';
ALTER DATABASE
=> SHOW CURRENT BlockedSocketGracePeriod;
  level   |           name           | setting
----------+--------------------------+---------
 DATABASE | BlockedSocketGracePeriod | 6 hours
(1 row)

=> SET SESSION GRACEPERIOD '10 hours';
SET
=> SHOW GRACEPERIOD;
    name     | setting
-------------+---------
 graceperiod | 10:00
(1 row)

dbadmin creates user user777 created with no grace period setting. Thus, the effective grace period for user777 is derived from the database setting of BlockedSocketGracePeriod, which is 6 hours. Any attempt by user777 to set the session grace period to a value greater than 6 hours returns with an error:


=> CREATE USER user777;
=> \c - user777
You are now connected as user "user777".
=> SHOW GRACEPERIOD;
    name     | setting
-------------+---------
 graceperiod | 06:00
(1 row)

=> SET SESSION GRACEPERIOD '7 hours';
ERROR 8175:  The new period 07:00 would exceed the database limit of 06:00

dbadmin sets a grace period of 5 minutes for user777. Now, user777 can set the session grace period to any value equal to or less than the user-level setting:


=> \c
You are now connected as user "dbadmin".
=> ALTER USER user777 GRACEPERIOD '5 minutes';
ALTER USER
=> \c - user777
You are now connected as user "user777".
=> SET SESSION GRACEPERIOD '6 minutes';
ERROR 8175:  The new period 00:06 would exceed the user limit of 00:05
=> SET SESSION GRACEPERIOD '4 minutes';
SET

4 - Managing workloads with resource pools and user profiles

The scenarios in this section describe common workload-management issues, and provide solutions with examples.

The scenarios in this section describe common workload-management issues, and provide solutions with examples.

4.1 - Periodic batch loads

You do batch loads every night, or occasionally (infrequently) during the day.

Scenario

You do batch loads every night, or occasionally (infrequently) during the day. When loads are running, it is acceptable to reduce resource usage by queries, but at all other times you want all resources to be available to queries.

Solution

Create a separate resource pool for loads with a higher priority than the preconfigured setting on the build-in GENERAL pool.

In this scenario, nightly loads get preference when borrowing memory from the GENERAL pool. When loads are not running, all memory is automatically available for queries.

Example

Create a resource pool that has higher priority than the GENERAL pool:

  1. Create resource pool load_pool with PRIORITY set to 10:

    => CREATE RESOURCE POOL load_pool PRIORITY 10;
    
  2. Modify user load_user to use the new resource pool:

    => ALTER USER load_user RESOURCE POOL load_pool;
    

4.2 - CEO query

The CEO runs a report every Monday at 9AM, and you want to be sure that the report always runs.

Scenario

The CEO runs a report every Monday at 9AM, and you want to be sure that the report always runs.

Solution

To ensure that a certain query or class of queries always gets resources, you could create a dedicated pool for it as follows:

  1. Using the PROFILE command, run the query that the CEO runs every week to determine how much memory should be allocated:

    => PROFILE SELECT DISTINCT s.product_key, p.product_description
    -> FROM store.store_sales_fact s, public.product_dimension p
    -> WHERE s.product_key = p.product_key AND s.product_version = p.product_version
    -> AND s.store_key IN (
    ->  SELECT store_key FROM store.store_dimension
    ->  WHERE store_state = 'MA')
    -> ORDER BY s.product_key;
    
  2. At the end of the query, the system returns a notice with resource usage:

    NOTICE:  Statement is being profiled.HINT:  select * from v_monitor.execution_engine_profiles where
    transaction_id=45035996273751349 and statement_id=6;
    NOTICE:  Initiator memory estimate for query: [on pool general: 1723648 KB,
    minimum: 355920 KB]
    
  3. Create a resource pool with MEMORYSIZE reported by the above hint to ensure that the CEO query has at least this memory reserved for it:

    => CREATE RESOURCE POOL ceo_pool MEMORYSIZE '1800M' PRIORITY 10;
    CREATE RESOURCE POOL
    => \x
    Expanded display is on.
    => SELECT * FROM resource_pools WHERE name = 'ceo_pool';
    -[ RECORD 1 ]-------+-------------
    name                | ceo_pool
    is_internal         | f
    memorysize          | 1800M
    maxmemorysize       |
    priority            | 10
    queuetimeout        | 300
    plannedconcurrency  | 4
    maxconcurrency      |
    singleinitiator     | f
    
  4. Assuming the CEO report user already exists, associate this user with the above resource pool using ALTER USER statement.

    => ALTER USER ceo_user RESOURCE POOL ceo_pool;
    
  5. Issue the following command to confirm that the ceo_user is associated with the ceo_pool:

    => SELECT * FROM users WHERE user_name ='ceo_user';
    -[ RECORD 1 ]-+------------------
    user_id       | 45035996273713548
    user_name     | ceo_user
    is_super_user | f
    resource_pool | ceo_pool
    memory_cap_kb | unlimited
    

If the CEO query memory usage is too large, you can ask the Resource Manager to reduce it to fit within a certain budget. See Query budgeting.

4.3 - Preventing runaway queries

Joe, a business analyst often runs big reports in the middle of the day that take up the whole machine's resources.You want to prevent Joe from using more than 100MB of memory, and you want to also limit Joe's queries to run for less than 2 hours.

Scenario

Joe, a business analyst often runs big reports in the middle of the day that take up the whole machine's resources.You want to prevent Joe from using more than 100MB of memory, and you want to also limit Joe's queries to run for less than 2 hours.

Solution

User resource allocation provides a solution to this scenario. To restrict the amount of memory Joe can use at one time, set a MEMORYCAP for Joe to 100MB using the ALTER USER command. To limit the amount of time that Joe's query can run, set a RUNTIMECAP to 2 hours using the same command. If any query run by Joe takes up more than its cap, Vertica rejects the query.

If you have a whole class of users whose queries you need to limit, you can also create a resource pool for them and set RUNTIMECAP for the resource pool. When you move these users to the resource pool, Vertica limits all queries for these users to the RUNTIMECAP you specified for the resource pool.

Example

=> ALTER USER analyst_user MEMORYCAP '100M' RUNTIMECAP '2 hours';

If Joe attempts to run a query that exceeds 100MB, the system returns an error that the request exceeds the memory session limit, such as the following example:

\i vmart_query_04.sqlvsql:vmart_query_04.sql:12: ERROR:  Insufficient resources to initiate plan
on pool general [Request exceeds memory session limit: 137669KB > 102400KB]

Only the system database administrator (dbadmin) can increase only the MEMORYCAP setting. Users cannot increase their own MEMORYCAP settings and will see an error like the following if they attempt to edit their MEMORYCAP or RUNTIMECAP settings:

ALTER USER analyst_user MEMORYCAP '135M';
ROLLBACK:  permission denied

4.4 - Restricting resource usage of ad hoc query application

You recently made your data warehouse available to a large group of users who are inexperienced with SQL.

Scenario

You recently made your data warehouse available to a large group of users who are inexperienced with SQL. Some users run reports that operate on a large number of rows and overwhelm the system. You want to throttle system usage by these users.

Solution

  1. Create a resource pool for ad hoc applications where MAXMEMORYSIZE is equal to MEMORYSIZE. This prevents queries in that resource pool from borrowing resources from the GENERAL pool. Also, set RUNTIMECAP to limit the maximum duration of ad hoc queries:

    => CREATE RESOURCE POOL adhoc_pool
        MEMORYSIZE '200M'
           MAXMEMORYSIZE '200M'
        RUNTIMECAP '20 seconds'
        PRIORITY 0
        QUEUETIMEOUT 300
        PLANNEDCONCURRENCY 4;
    => SELECT pool_name, memory_size_kb, queueing_threshold_kb
        FROM V_MONITOR.RESOURCE_POOL_STATUS WHERE pool_name='adhoc_pool';
     pool_name  | memory_size_kb | queueing_threshold_kb
    ------------+----------------+-----------------------
     adhoc_pool |         204800 |                153600
    (1 row)
    
  2. Associate this resource pool with database users who use the application to connect to the database.

    => ALTER USER app1_user RESOURCE POOL adhoc_pool;
    

4.5 - Setting a hard limit on concurrency for an application

For billing purposes, analyst Jane would like to impose a hard limit on concurrency for this application.

Scenario

For billing purposes, analyst Jane would like to impose a hard limit on concurrency for this application. How can she achieve this?

Solution

The simplest solution is to create a separate resource pool for the users of that application and set its MAXCONCURRENCY to the desired concurrency level. Any queries beyond MAXCONCURRENCY are queued.

Example

In this example, there are four billing users associated with the billing pool. The objective is to set a hard limit on the resource pool so a maximum of three concurrent queries can be executed at one time. All other queries will queue and complete as resources are freed.

=> CREATE RESOURCE POOL billing_pool MAXCONCURRENCY 3 QUEUETIMEOUT 2;
=> CREATE USER bill1_user RESOURCE POOL billing_pool;
=> CREATE USER bill2_user RESOURCE POOL billing_pool;
=> CREATE USER bill3_user RESOURCE POOL billing_pool;
=> CREATE USER bill4_user RESOURCE POOL billing_pool;
=> \x
Expanded display is on.

=> select maxconcurrency,queuetimeout from resource_pools where name = 'billing_pool';
maxconcurrency | queuetimeout
----------------+--------------
             3 |            2
(1 row)
> SELECT reason, resource_type, rejection_count  FROM RESOURCE_REJECTIONS
WHERE pool_name = 'billing_pool' AND node_name ilike '%node0001';
reason                                | resource_type | rejection_count
---------------------------------------+---------------+-----------------
Timedout waiting for resource request | Queries       |              16
(1 row)

If queries are running and do not complete in the allotted time (default timeout setting is 5 minutes), the next query requested gets an error similar to the following:

ERROR:  Insufficient resources to initiate plan on pool billing_pool [Timedout waiting for resource request: Request exceeds limits:
Queries Exceeded: Requested = 1, Free = 0 (Limit = 3, Used = 3)]

The table below shows that there are three active queries on the billing pool.

=> SELECT pool_name, thread_count, open_file_handle_count, memory_inuse_kb FROM RESOURCE_ACQUISITIONS
WHERE pool_name = 'billing_pool';
pool_name    | thread_count | open_file_handle_count | memory_inuse_kb
--------------+--------------+------------------------+-----------------
billing_pool |            4 |                      5 |          132870
billing_pool |            4 |                      5 |          132870
billing_pool |            4 |                      5 |          132870
(3 rows)

4.6 - Handling mixed workloads: batch versus interactive

You have a web application with an interactive portal.

Scenario

You have a web application with an interactive portal. Sometimes when IT is running batch reports, the web page takes a long time to refresh and users complain, so you want to provide a better experience to your web site users.

Solution

The principles learned from the previous scenarios can be applied to solve this problem. The basic idea is to segregate the queries into two groups associated with different resource pools. The prerequisite is that there are two distinct database users issuing the different types of queries. If this is not the case, do consider this a best practice for application design.

**Method 1
**Create a dedicated pool for the web page refresh queries where you:

  • Size the pool based on the average resource needs of the queries and expected number of concurrent queries issued from the portal.

  • Associate this pool with the database user that runs the web site queries. SeeCEO query for information about creating a dedicated pool.

This ensures that the web site queries always run and never queue behind the large batch jobs. Leave the batch jobs to run off the GENERAL pool.

For example, the following pool is based on the average resources needed for the queries running from the web and the expected number of concurrent queries. It also has a higher PRIORITY to the web queries over any running batch jobs and assumes the queries are being tuned to take 250M each:

=> CREATE RESOURCE POOL web_pool
     MEMORYSIZE '250M'
     MAXMEMORYSIZE NONE
     PRIORITY 10
     MAXCONCURRENCY 5
     PLANNEDCONCURRENCY 1;

**Method 2
**Create a resource pool with fixed memory size. This limits the amount of memory available to batch reports so memory is always left over for other purposes. For details, see Restricting resource usage of ad hoc query application.

For example:

=> CREATE RESOURCE POOL batch_pool
     MEMORYSIZE '4G'
     MAXMEMORYSIZE '4G'
     MAXCONCURRENCY 10;

The same principle can be applied if you have three or more distinct classes of workloads.

4.7 - Setting priorities on queries issued by different users

You want user queries from one department to have a higher priority than queries from another department.

Scenario

You want user queries from one department to have a higher priority than queries from another department.

Solution

The solution is similar to the mixed workload case. In this scenario, you do not limit resource usage; you set different priorities. To do so, create two different pools, each with MEMORYSIZE=0% and a different PRIORITY parameter. Both pools borrow from the GENERAL pool, however when competing for resources, the priority determine the order in which each pool's request is granted. For example:

=> CREATE RESOURCE POOL dept1_pool PRIORITY 5;
=> CREATE RESOURCE POOL dept2_pool PRIORITY 8;

If you find this solution to be insufficient, or if one department's queries continuously starves another department’s users, you can add a reservation for each pool by setting MEMORYSIZE so some memory is guaranteed to be available for each department.

For example, both resources use the GENERAL pool for memory, so you can allocate some memory to each resource pool by using ALTER RESOURCE POOL to change MEMORYSIZE for each pool:

=> ALTER RESOURCE POOL dept1_pool MEMORYSIZE '100M';
=> ALTER RESOURCE POOL dept2_pool MEMORYSIZE '150M';

4.8 - Continuous load and query

You want your application to run continuous load streams, but many have up concurrent query streams.

Scenario

You want your application to run continuous load streams, but many have up concurrent query streams. You want to ensure that performance is predictable.

Solution

The solution to this scenario depends on your query mix. In all cases, the following approach applies:

  1. Determine the number of continuous load streams required. This may be related to the desired load rate if a single stream does not provide adequate throughput, or may be more directly related to the number of sources of data to load. Create a dedicated resource pool for the loads, and associate it with the database user that will perform them. See CREATE RESOURCE POOL for details.

    In general, concurrency settings for the load pool should be less than the number of cores per node. Unless the source processes are slow, it is more efficient to dedicate more memory per load, and have additional loads queue. Adjust the load pool's QUEUETIMEOUT setting if queuing is expected.

  2. Run the load workload for a while and observe whether the load performance is as expected. If the Tuple Mover is not tuned adequately to cover the load behavior, see Managing the tuple mover.

  3. If there is more than one kind of query in the system—for example, some queries must be answered quickly for interactive users, while others are part of a batch reporting process—follow the guidelines in Handling mixed workloads: batch versus interactive.

  4. Let the queries run and observe performance. If some classes of queries do not perform as desired, then you might need to tune the GENERAL pool as outlined in Restricting resource usage of ad hoc query application, or create more dedicated resource pools for those queries. For more information, see CEO query and Handling mixed workloads: batch versus interactive.

See the sections on Managing workloads and CREATE RESOURCE POOL for information on obtaining predictable results in mixed workload environments.

4.9 - Prioritizing short queries at run time

You recently created a resource pool for users who are inexperienced with SQL and who frequently run ad hoc reports.

Scenario

You recently created a resource pool for users who are inexperienced with SQL and who frequently run ad hoc reports. Until now, you managed resource allocation by creating a resource pool where MEMORYSIZE and MAXMEMORYSIZE are equal. This prevented queries in that resource pool from borrowing resources from the GENERAL pool. Now you want to manage resources at run time and prioritize short queries so they are never queued as a result of limited run-time resources.

Solution

  • Set RUNTIMEPRIORITY for the resource pool to MEDIUM or LOW.

  • Set RUNTIMEPRIORITYTHRESHOLD for the resource pool to the duration of queries you want to ensure always run at a high priority.

For example:

=> ALTER RESOURCE POOL ad_hoc_pool RUNTIMEPRIORITY medium RUNTIMEPRIORITYTHRESHOLD 5;

Because RUNTIMEPRIORITYTHRESHOLD is set to 5, all queries in resource pool ad_hoc_pool that complete within 5 seconds run at high priority. Queries that exceeds 5 seconds drop down to the RUNTIMEPRIORITY assigned to the resource pool, MEDIUM.

4.10 - Dropping the runtime priority of long queries

You want most queries in a resource pool to run at a HIGH runtime priority; however, you'd like to be able to drop jobs longer than 1 hour to a lower priority.

Scenario

You want most queries in a resource pool to run at a HIGH runtime priority; however, you'd like to be able to drop jobs longer than 1 hour to a lower priority.

Solution

Set the RUNTIMEPRIORITY for the resource pool to LOW and set the RUNTIMEPRIORITYTHRESHOLD to a number that cuts off only the longest jobs.

Example

To ensure that all queries with a duration of more than 3600 seconds (1 hour) are assigned a low runtime priority, modify the resource pool as follows:

  • Set the RUNTIMEPRIORITY to LOW.

  • Set the RUNTIMETHRESHOLD to 3600

=> ALTER RESOURCE POOL ad_hoc_pool RUNTIMEPRIORITY low RUNTIMEPRIORITYTHRESHOLD 3600;

5 - Tuning built-in pools

The scenarios in this section describe how to tune built-in pools.

The scenarios in this section describe how to tune built-in pools.

5.1 - Restricting Vertica to take only 60% of memory

You have a single node application that embeds Vertica, and some portion of the RAM needs to be devoted to the application process.

Scenario

You have a single node application that embeds Vertica, and some portion of the RAM needs to be devoted to the application process. In this scenario, you want to limit Vertica to use only 60% of the available RAM.

Solution

Set the MAXMEMORYSIZE parameter of the GENERAL pool to the desired memory size. See Resource pool architecture for a discussion on resource limits.

5.2 - Tuning for recovery

You have a large database that contains a single large table with two projections, and with default settings, recovery is taking too long.

Scenario

You have a large database that contains a single large table with two projections, and with default settings, recovery is taking too long. You want to give recovery more memory to improve speed.

Solution

Set PLANNEDCONCURRENCY and MAXCONCURRENCY in the RECOVERY pool to 1, so recovery can take as much memory as possible from the GENERAL pool and run only one thread at once.

5.3 - Tuning for refresh

When a operation is running, system performance is affected and user queries are rejected.

Scenario

When a refresh operation is running, system performance is affected and user queries are rejected. You want to reduce the memory usage of the refresh job.

Solution

Set MEMORYSIZE in the REFRESH pool to a fixed value. The Resource Manager then tunes the refresh query to only use this amount of memory.

5.4 - Tuning tuple mover pool settings

During heavy load operations, you occasionally notice spikes in the number of ROS containers.

Scenario 1

During heavy load operations, you occasionally notice spikes in the number of ROS containers. You would like the Tuple Mover to perform mergeout more aggressively to consolidate ROS containers, and avoid ROS pushback.

Solution

Use ALTER RESOURCE POOL to increase the setting of MAXCONCURRENCY in the TM resource pools. This setting determines how many threads are available for mergeout. By default , this parameter is set to 7. Vertica allocates half the threads to active partitions, and the remaining half to active and inactive partitions as needed. If MAXCONCURRENCY is set to an uneven integer, Vertica rounds up to favor active partitions.

For example, if you increase MAXCONCURRENCY to 9, then Vertica allocates five threads exclusively to active partitions, and allocates the remaining four threads to active and inactive partitions.

Scenario 2

You have a secondary subcluster that is dedicated to time-sensitive analytic queries. You want to limit any other workloads on this subcluster that could interfere with it processing queries while also freeing up memory to perform queries.

By default, each subcluster has a built-in TM resource pool for Tuple Mover operations that makes it eligible to execute Tuple Mover mergeout operations. The TM pool consumes memory that could be used for queries. In addition, the mergeout operation could add a slight overhead to your subcluster's processing. You want to reallocate the memory consumed by the TM pool, and prevent the subcluster from running mergeout operations.

Solution

Use ALTER RESOURCE POOL to override the global TM resource pool for the secondary subcluster, and set both its MAXMEMORYSIZE and MEMORYSIZE to 0. This allows you to use the memory consumed by the global TM pool for use running analytic queries and prevents the subcluster being assigned TM mergeout operations to execute.

5.5 - Tuning for machine learning

A large number of machine learning functions are running, and you want to give them more memory to improve performance.

Scenario

A large number of machine learning functions are running, and you want to give them more memory to improve performance.

Solution

Vertica executes machine learning functions in the BLOBDATA resource pool. To improve performance of machine learning functions and avoid spilling queries to disk, increase the pool's MAXMEMORYSIZE setting with ALTER RESOURCE POOL.

For more about tuning query budgets, see Query budgeting.

See also

6 - Reducing query run time

Query run time depends on the complexity of the query, the number of operators in the plan, data volumes, and projection design.

Query run time depends on the complexity of the query, the number of operators in the plan, data volumes, and projection design. I/O or CPU bottlenecks can cause queries to run slower than expected. You can often remedy high CPU usage with better projection design. High I/O can often be traced to contention caused by joins and sorts that spill to disk. However, no single solution addresses all queries that incur high CPU or I/O usage. You must analyze and tune each queryindividually.

You can evaluate a slow-running query in two ways:

Examining the query plan can reveal one or more of the following:

  • Suboptimal projection sort order

  • Predicate evaluation on an unsorted or unencoded column

  • Use of GROUPBY HASH instead of GROUPBY PIPE

Profiling

Vertica provides profiling mechanisms that help you evaluate database performance at different levels. For example, you can collect profiling data for a single statement, a single session, or for all sessions on all nodes. For details, see Profiling database performance.

7 - Managing workload resources in an Eon Mode database

You primarily control workloads in an Eon Mode database using subclusters.

You primarily control workloads in an Eon Mode database using subclusters. For example, you can create subclusters for specific use cases, such as ETL or query workloads, or you can create subclusters for different groups of users to isolate workloads. Within each subcluster, you can create individual resource pools to optimize resource allocation according to workload. See Managing subclusters for more information about how Vertica uses subclusters.

Global and subcluster-specific resource pools

You can define global resource pool allocations that affect all nodes in the database. You can also create resource pool allocations at the subcluster level. If you create both, the subcluster-level settings override the global settings.

You can use this feature to remove global resource pools that the subcluster does not need. Additionally, you can create a resource pool with settings that are adequate for most subclusters, and then tailor the settings for specific subclusters as needed.

Optimizing ETL and query subclusters

Overriding resource pool settings at the subcluster level allows you to isolate built-in and user-defined resource pools and optimize them by workload. You often assign specific roles to different subclusters:

  • Subclusters dedicated to ETL workloads and DDL statements that alter the database.

  • Subclusters dedicated to running in-depth, long-running analytics queries. These queries need more resources allocated for the best performance.

  • Subclusters that run many short-running "dashboard" queries that you want to finish quickly and run in parallel.

After you define the type of queries executed by each subcluster, you can create a subcluster-specific resource pool that is optimized to improve efficiency for that workload.

The following scenario optimizes 3 subclusters by workload:

  • etl: A subcluster that performs ETL that you want to optimize for Tuple Mover operations.

  • dashboard: A subcluster that you want to designate for short-running queries executed by a large number of users to refresh a web page.

  • analytics: A subcluster that you want to designate for long-running queries.

See Best practices for managing workload resources for additional scenarios about resource pool tuning.

Configure an ETL subcluster to improve TM performance

Vertica chooses the subcluster that has the most ROS containers involved in a mergeout operation in its depot to execute a mergeout (see The Tuple Mover in Eon Mode Databases). Often, a subcluster performing ETL will be the best candidate to perform a mergeout because the data it loaded is involved in the mergeout. You can choose to improve the performance of mergeout operations on a subcluster by altering the TM pool's MAXCONCURRENCY setting to increase the number of threads available for mergeout operations. You cannot change this setting at the subcluster level, so you must set it globally:

=> ALTER RESOURCE POOL TM MAXCONCURRENCY 10;

See Tuning tuple mover pool settings for additional information about Tuple Mover resources.

Configure the dashboard query subcluster

By default, secondary subclusters have memory allocated to Tuple Mover resource pools. This pool setting allows Vertica to assign mergeout operations to the subcluster, which can add a small overhead. If you primarily use a secondary subcluster for queries, the best practice is to reclaim the memory used by the TM pool and prevent mergeout operations being assigned to the subcluster.

To optimize your dashboard query secondary subcluster, set their TM pool's MEMORYSIZE and MAXMEMORYSIZE settings to 0:

=> ALTER RESOURCE POOL TM FOR SUBCLUSTER dashboard MEMORYSIZE '0%'
   MAXMEMORYSIZE '0%';

To confirm the overrides, query the SUBCLUSTER_RESOURCE_POOL_OVERRIDES table:

=> SELECT pool_oid, name, subcluster_name, memorysize, maxmemorysize
          FROM SUBCLUSTER_RESOURCE_POOL_OVERRIDES;

     pool_oid      | name | subcluster_name | memorysize | maxmemorysize
-------------------+------+-----------------+------------+---------------
 45035996273705046 | tm   | dashboard       | 0%         | 0%
(1 row)

To optimize the dashboard subcluster for short-running queries on a web page, create a dash_pool subcluster-level resource pool that uses 70% of the subcluster's memory. Additionally, increase PLANNEDCONCURRENCY to use all of the machine's logical cores, and limit EXECUTIONPARALLELISM to no more than half of the machine's available cores:

=> CREATE RESOURCE POOL dash_pool FOR SUBCLUSTER dashboard
     MEMORYSIZE '70%'
     PLANNEDCONCURRENCY 16
     EXECUTIONPARALLELISM 8;

Configure the analytic query subcluster

To optimize the analytics subcluster for long-running queries, create an analytics_pool subcluster-level resource pool that uses 60% of the subcluster's memory. In this scenario, you cannot allocate more memory to this pool because the nodes in this subcluster still have memory assigned to their TM pools. Additionally, set EXECUTIONPARALLELISM to AUTO to use all cores available on the node to process a query, and limit PLANNEDCONCURRENCY to no more than 8 concurrent queries:

=> CREATE RESOURCE POOL analytics_pool FOR SUBCLUSTER analytics
      MEMORYSIZE '60%'
      EXECUTIONPARALLELISM AUTO
      PLANNEDCONCURRENCY 8;