Profiling query resource consumption

Vertica collects data on resource usage of all queries—including those that fail—and summarizes this data in system table QUERY_CONSUMPTION.

Vertica collects data on resource usage of all queries—including those that fail—and summarizes this data in system table QUERY_CONSUMPTION. This data includes the following information about each query:

  • Wall clock duration

  • CPU cycles consumed

  • Memory reserved and allocated

  • Network bytes sent and received

  • Disk bytes read and written

  • Bytes spilled

  • Threads allocated

  • Rows output to client

  • Rows read and written

You can obtain information about individual queries through their transaction and statement IDs. Columns TRANSACTION_ID and STATEMENT_ID provide a unique key to each query statement.

For example, the following query is profiled:

=> PROFILE SELECT pd.category_description AS 'Category', SUM(sf.sales_quantity*sf.sales_dollar_amount) AS 'Total Sales'
     FROM store.store_sales_fact sf
     JOIN public.product_dimension pd ON pd.product_version=sf.product_version AND pd.product_key=sf.product_key
     GROUP BY pd.category_description;
NOTICE 4788:  Statement is being profiled
HINT:  Select * from v_monitor.execution_engine_profiles where transaction_id=45035996274751822 and statement_id=1;
NOTICE 3557:  Initiator memory for query: [on pool general: 256160 KB, minimum: 256160 KB]
NOTICE 5077:  Total memory required by query: [256160 KB]
             Category             | Total Sales
----------------------------------+-------------
 Non-food                         |  1147919813
 Misc                             |  1158328131
 Medical                          |  1155853990
 Food                             |  4038220327
(4 rows)

You can use the transaction and statement IDs that Vertica returns to get profiling data from QUERY_CONSUMPTION—for example, the total number of bytes sent over the network for a given query:

=> SELECT NETWORK_BYTES_SENT FROM query_consumption WHERE transaction_id=45035996274751822 AND statement_id=1;
 NETWORK_BYTES_SENT
--------------------
             757745
(1 row)

QUERY_CONSUMPTION versus EXECUTION _ENGINE_PROFILES

QUERY_CONSUMPTION includes data that it rolls up from counters in EXECUTION_ENGINE_PROFILES. In the previous example, NETWORK_BYTES_SENT rolls up data that is accessible through multiple counters in EXECUTION_ENGINE_PROFILES. The equivalent query on EXECUTION_ENGINE_PROFILES looks like this:

=> SELECT operator_name, counter_name, counter_tag, SUM(counter_value) FROM execution_engine_profiles
     WHERE transaction_id=45035996274751822 AND statement_id=1 AND counter_name='bytes sent'
     GROUP BY ROLLUP (operator_name, counter_name, counter_tag) ORDER BY 1,2,3, GROUPING_ID();
 operator_name | counter_name |          counter_tag           |  SUM
---------------+--------------+--------------------------------+--------
 NetworkSend   | bytes sent   | Net id 1000 - v_vmart_node0001 | 252471
 NetworkSend   | bytes sent   | Net id 1000 - v_vmart_node0002 | 251076
 NetworkSend   | bytes sent   | Net id 1000 - v_vmart_node0003 | 253717
 NetworkSend   | bytes sent   | Net id 1001 - v_vmart_node0001 |    192
 NetworkSend   | bytes sent   | Net id 1001 - v_vmart_node0002 |    192
 NetworkSend   | bytes sent   | Net id 1001 - v_vmart_node0003 |      0
 NetworkSend   | bytes sent   | Net id 1002 - v_vmart_node0001 |     97
 NetworkSend   | bytes sent   |                                | 757745
 NetworkSend   |              |                                | 757745
               |              |                                | 757745
(10 rows)

QUERY_CONSUMPTION and EXECUTION_ENGINE_PROFILES also differ as follows:

  • QUERY_CONSUMPTION saves data from all queries, no matter their duration or whether they are explicitly profiled. It also includes data on unsuccessful queries.

  • EXECUTION_ENGINE_PROFILES only includes data from queries whose length of execution exceeds a set threshold, or that you explicitly profile. It also excludes data of unsuccessful queries.