This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Analyzing workloads

If queries perform suboptimally, use Workload Analyzer to get tuning recommendations for them and hints about optimizing database objects.

If queries perform suboptimally, use Workload Analyzer to get tuning recommendations for them and hints about optimizing database objects. Workload Analyzer is a Vertica utility that analyzes system information in Vertica system tables.

Workload Analyzer identifies the root causes of poor query performance through intelligent monitoring of query execution, workload history, resources, and configurations. It then returns a set of tuning recommendations based on statistics, system and data collector events, and database/table/projection design. Use these recommendations to tune query performance, quickly and easily.

You can run Workload Analyzer in two ways:

See Workload analyzer recommendations for common issues that Workload Analyzer finds, and recommendations.

1 - Getting tuning recommendations

Call the function ANALYZE_WORKLOAD to get tuning recommendations for queries and database objects.

Call the function ANALYZE_WORKLOAD to get tuning recommendations for queries and database objects. The function arguments specify what events to analyze and when.

Setting scope and time span

ANALYZE_WORKLOAD's scope argument determines what to analyze:

This argument... Returns Workload Analyzer recommendations for...
'' (empty string) All database objects
Table name A specific table
Schema name All objects in the specified schema

The optional since-time argument specifies to return values from all in -scope events starting from since-time and continuing to the current system status. If you omit since_time, ANALYZE_WORKLOAD returns recommendations for events since the last recorded time that you called the function. You must explicitly cast the since-time string value to either TIMESTAMP or TIMESTAMPTZ.

The following examples show four ways to express the since-time argument with different formats. All queries return the same result for workloads on table t1 since October 4, 2012:

=> SELECT ANALYZE_WORKLOAD('t1', TIMESTAMP '2012-10-04 11:18:15');
=> SELECT ANALYZE_WORKLOAD('t1', '2012-10-04 11:18:15'::TIMESTAMPTZ);
=> SELECT ANALYZE_WORKLOAD('t1', 'October 4, 2012'::TIMESTAMP);
=> SELECT ANALYZE_WORKLOAD('t1', '10-04-12'::TIMESTAMPTZ);

Saving function results

Instead of analyzing events since a specific time, you can save results from ANALYZE_WORKLOAD, by setting the function's second argument to true. The default is false, and no results are saved. After saving function results, subsequent calls to ANALYZE_WORKLOAD analyze only events since you last saved returned data, and ignore all previous events.

For example, the following statement returns recommendations for all database objects in all schemas and records this analysis invocation.

=> SELECT ANALYZE_WORKLOAD('', true);

The next invocation of ANALYZE_WORKLOAD analyzes events from this point forward.

Observation count and time

The observation_count column returns an integer that represents the total number of events Workload Analyzerobserved for this tuning recommendation. In each case above, Workload Analyzer is making its first recommendation. Null results in observation_time only mean that the recommendations are from the current system status instead of from a prior event.

Tuning targets

The tuning_parameter column returns the object on which Workload Analyzer recommends that you apply the tuning action. The parameter of release in the example above notifies the DBA to set a password for user release.

Tuning recommendations and costs

Workload Analyzer's output returns a brief description of tasks you should consider in the tuning_description column, along with a SQL command you can run, where appropriate, in the tuning_command column. In records 1 and 2 above, Workload Analyzer recommends that you run the Database Designer on two tables, and in record 3 recommends setting a user's password. Record 3 also provides the ALTER USER command to run because the tuning action is a SQL command.

Output in the tuning_cost column indicates the cost of running the recommended tuning command:

  • LOW: Running the tuning command has minimal impact on resources. You can perform the tuning operation at any time, like changing the user's password in Record 3 above.

  • MEDIUM: Running the tuning command has moderate impact on resources.

  • HIGH: Running the tuning command has maximum impact on resources. Depending on the size of your database or table, consider running high-cost operations during off-peak load times.

Examples

The following statement tells Workload Analyzer to analyze all events for the locations table:

=> SELECT ANALYZE_WORKLOAD('locations');

Workload Analyzer returns with a recommendation that you run the Database Designer on the table, an operation that, depending on the size of locations, might incur a high cost:

-[ RECORD 1 ]----------+------------------------------------------------
observation_count      | 1
first_observation_time |
last_observation_time  |
tuning_parameter       | public.locations
tuning_description     | run database designer on table public.locations
tuning_command         |
tuning_cost            | HIGH

The following statement analyzes workloads on all tables in the VMart example database since one week before today:

=> SELECT ANALYZE_WORKLOAD('', NOW() - INTERVAL '1 week');

Workload Analyzer returns with the following results:

-[ RECORD 1 ]----------+------------------------------------------------------
observation_count      | 4
first_observation_time | 2012-02-17 13:57:17.799003-04
last_observation_time  | 2011-04-22 12:05:26.856456-04
tuning_parameter       | store.store_orders_fact.date_ordered
tuning_description     | analyze statistics on table column store.store_orders_fact.date_ordered
tuning_command         | select analyze_statistics('store.store_orders_fact.date_ordered');
tuning_cost            | MEDIUM
-[ RECORD 2 ]---------+------------------------------------------------------
...
-[ RECORD 14 ]---------+-----------------------------------------------------
observation_count      | 2
first_observation_time | 2012-02-19 17:52:03.022644-04
last_observation_time  | 2012-02-19 17:52:03.02301-04
tuning_parameter       | SELECT x FROM t WHERE x > (SELECT SUM(DISTINCT x) FROM
                       | t GROUP BY y) OR x < 9;
tuning_description     | consider incremental design on query
tuning_command         |
tuning_cost            | HIGH

Workload Analyzer finds two issues:

  • In record 1, the date_ordered column in the store.store_orders_fact table likely has stale statistics, so Workload Analyzer suggests running ANALYZE_STATISTICS on that column. The function output also returns the query to run. For example:

    => SELECT ANALYZE_STATISTICS('store.store_orders_fact.date_ordered');
    
  • In record 14, Workload Analyzer identifies an under-performing query in the tuning_parameter column. It recommends to use the Database Designer to run an incremental design. Workload Analyzer rates the potential cost as HIGH.

System table recommendations

You can also get tuning recommendations by querying system table TUNING_RECOMMENDATIONS, which returns tuning recommendation results from the last ANALYZE_WORKLOAD call.

=> SELECT * FROM tuning_recommendations;

System information that Workload Analyzer uses for its recommendations is held in SQL system tables, so querying the TUNING_RECOMMENDATIONS system table does not run Workload Analyzer.

See also

Collecting database statistics

2 - Workload analyzer recommendations

Workload Analyzer monitors database activity and logs recommendations as needed in system table TUNING_RECOMMENDATIONS.

Workload Analyzer monitors database activity and logs recommendations as needed in system table TUNING_RECOMMENDATIONS. When you run Workload Analyzer, the utility returns the following information:

  • Description of the object that requires tuning

  • Recommended action

  • SQL command to implement the recommendation

Common issues and recommendations

Issue Recommendation
No custom resource pools, user queries are typically handled by the GENERAL resource pool. Create custom resource pools to handle queries from specific users.
A projection is identified as rarely or never used to execute queries: Remove the projection with DROP PROJECTION
User with admin privileges has empty password. Set the password for user with ALTER USER...IDENTIFIED BY.
Table has too many partitions. Alter the table's partition expression with ALTER TABLE. Also consider grouping partitions and hierarchical partitioning.
Partitioned table data is not fully reorganized after repartitioning. Reorganize data in the partitioned table with ALTER TABLE...REORGANIZE.
Table has multiple partition keys within the same ROS container.
Tuple Mover's MoveOutInterval parameter setting is greater than the default value. Decrease the parameter setting, or reset the parameter to its default setting.
Average CPU usage exceeds 95% for 20 minutes. Check system processes, or change resource pool settings of parameters PLANNEDCONCURRENCY and/or MAXCONCURRENCY. For details, see ALTER RESOURCE POOL and Built-in resource pools configuration.
Excessive swap activity; average memory usage exceeds 99% for 10 minutes. Check system processes
A table does not have any Database Designer-designed projections. Run database designer on the table. For details, see Incremental design.
Table statistics are stale. Run ANALYZE_STATISTICS on table columns. See also Collecting database statistics.
Data distribution in segmented projection is skewed. Resegment projection on high-cardinality columns. For details, see Designing for segmentation.
Attempts to execute a query generated a GROUP BY spill event. Consider running an incremental design on the query.
Internal configuration parameter is not the same across nodes. Reset configuration parameter with ALTER DATABASE...SET
LGE threshold setting is lower than the default setting. Workload Analyzer does not trigger a tuning recommendation for this scenario unless you altered settings and/or services under the guidance of technical support.
Tuple Mover is disabled.
Too many ROS containers since the last mergeout operation; configuration parameters are set lower than the default.
Too many ROS containers since the last mergeout operation; the TM Mergeout service is disabled.