ANALYZE_STATISTICS

Collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table.

Collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table. The function skips columns of complex data types. You can set the scope of the collection at several levels:

  • Database

  • Table

  • Table column

By default, Vertica analyzes multiple columns in a single-query execution plan, depending on resource limits. This multi-column analysis reduces plan execution latency and speeds up analysis of relatively small tables with many columns.

Vertica writes statistics to the database catalog. The query optimizer uses this collected data to create query plans. Without this data, the query optimizer assumes uniform distribution of data values and equal storage usage for all projections.

You can cancel statistics collection with CTRL+C or by calling INTERRUPT_STATEMENT.

ANALYZE_STATISTICS is an alias of the function ANALYZE_HISTOGRAM, which is no longer documented.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ANALYZE_STATISTICS ('[[[database.]schema.]table]' [, 'column-list' [, percent ]]  )

Returns

0—Success

If an error occurs, refer to vertica.log for details.

Arguments

[[database.]schema].table
Table on which to collect data. If set to an empty string, Vertica collects statistics for all database tables and their projections. The default schema is public. If you specify a database, it must be the current database.
column-list
Comma-delimited list of columns in table, typically predicate columns. Vertica narrows the scope of the data collection to the specified columns. Columns of complex types are not supported.

If you alter a table to add a column and populate its contents with either default or other values, call ANALYZE_STATISTICS on this column to get the most current statistics.

percent
Percentage of data to read from disk (not the amount to analyze), a float between 0 and 100. The default value is 10.

Analyzing a higher percentage takes proportionally longer to process, but produces a higher level of sampling accuracy.

Privileges

Non-superuser:

  • Schema: USAGE

  • Table: One of INSERT, DELETE, or UPDATE

Restrictions

  • Vertica supports ANALYZE_STATISTICS on local and global temporary tables. In both cases, you can obtain statistics only on tables that are created with the option ON COMMIT PRESERVE ROWS. Otherwise, Vertica deletes table content when committing the current transaction, so no table data is available for analysis.

  • Vertica collects no statistics from the following projections:

    • Live aggregate and Top-K projections
    • Projections that are defined to include a SQL function within an expression
  • Vertica collects no statistics on columns of ARRAY, SET, or ROW types.

Examples

See Collecting table statistics.

See also

ANALYZE_STATISTICS_PARTITION