Initial process for improving query performance

To optimize query performance, begin by performing the following tasks:.

To optimize query performance, begin by performing the following tasks:

  1. Run Database Designer.

  2. Check query events proactively.

  3. Review the query plan.

Run Database Designer

Database Designer creates a physical schema for your database that provides optimal query performance. The first time you run Database Designer, you should create a comprehensive design that includes relevant sample queries and data on which to base the design. If you develop performance issues later, consider loading additional queries that you run frequently and then rerunning Database Designer to create an incremental design.

When you run Database Designer, choose the option, Update Statistics. The Vertica query optimizer uses statistics about the data to create a query plan. Statistics help the optimizer determine:

  • Multiple eligible projections to answer the query

  • The best order in which to perform joins

  • Data distribution algorithms, such as broadcast and resegmentation

If your statistics become out of date, run the Vertica function ANALYZE_STATISTICS function to update statistics for a schema, table, or columns. For more information, see Collecting database statistics.

Check query events proactively

The QUERY_EVENTS system table returns information on query planning, optimization, and execution events.

The EVENT_TYPE column provides various event types:

Review the query plan

A query plan is a sequence of step-like paths that the Vertica query optimizer selects to access or alter information in your Vertica database. There are two ways to get information about the query plan:

  • Run the EXPLAIN command. Each step (path) represents a single operation that the optimizer uses for its execution strategy.

  • Query the QUERY_PLAN_PROFILES system table. This table provides detailed execution status for currently running queries. Output from the QUERY_PLAN_PROFILES table shows the real-time flow of data and the time and resources consumed for each path in each query plan.

See also