Data collector functions

The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events. It gathers and retains monitoring information about your database cluster and makes that information available in system tables. It requires little configuration and has negligible impact on performance.

Collected data is stored on disk in the DataCollector directory under the Vertica /catalog path. You can use the information the Data Collector retains to query the past state of system tables and extract aggregate information, as well as do the following:

  • See what actions users have taken

  • Locate performance bottlenecks

  • Identify potential improvements to Vertica configuration
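As a rough illustration of querying retained history, the sketch below lists the available components and then reads recent records from one Data Collector table. It assumes that system table DATA_COLLECTOR exposes a table_name column next to component and description, and that Data Collector tables such as dc_tuple_mover_events carry a time column. Verify both against your Vertica version before relying on them.

```sql
-- List each component together with the table that stores its records.
-- Assumption: DATA_COLLECTOR has a table_name column.
SELECT DISTINCT component, table_name, description
FROM data_collector
ORDER BY component;

-- Read the most recent Tuple Mover events that the Data Collector retained.
-- Assumption: the table has a "time" column that records when each event occurred.
SELECT *
FROM dc_tuple_mover_events
ORDER BY time DESC
LIMIT 10;
```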

Data Collector works in conjunction with an advisor tool called Workload Analyzer, which intelligently monitors the performance of SQL queries and workloads and recommends tuning actions based on observations of the actual workload history.

By default, Data Collector is on and retains information for all sessions. If performance issues arise, a superuser can disable Data Collector by setting configuration parameter EnableDataCollector to 0.
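A minimal sketch of toggling the parameter, assuming your Vertica version supports setting database-level configuration parameters with ALTER DATABASE DEFAULT SET (older releases may require the SET_CONFIG_PARAMETER function instead):

```sql
-- Disable the Data Collector for the whole database (superuser only).
ALTER DATABASE DEFAULT SET EnableDataCollector = 0;

-- Re-enable it once the performance issue has been investigated.
ALTER DATABASE DEFAULT SET EnableDataCollector = 1;
```

While the parameter is 0, no new monitoring data is collected, so tools that depend on that history have less to work with for that period.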

1 - CLEAR_DATA_COLLECTOR

Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DATA_COLLECTOR( [ 'component' ] )

Parameters

component
Clears memory and disk records for the specified component. If you provide no argument, the function clears memory and disk records for all components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

Superuser

Examples

The following command clears memory and disk records for the ResourceAcquisitions component:

=> SELECT clear_data_collector('ResourceAcquisitions');
 clear_data_collector
----------------------
 CLEAR
(1 row)

The following command clears data collection for all components:

=> SELECT clear_data_collector();
 clear_data_collector
----------------------
 CLEAR
(1 row)

See also

Data collector utility

2 - DATA_COLLECTOR_HELP

Returns online usage instructions about the Data Collector, the DATA_COLLECTOR system table, and the Data Collector control functions.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DATA_COLLECTOR_HELP()

Privileges

None

Returns

The DATA_COLLECTOR_HELP() function returns the following information:

=> SELECT DATA_COLLECTOR_HELP();

-----------------------------------------------------------------------------
Usage Data Collector
The data collector retains history of important system activities.
   This data can be used as a reference of what actions have been taken
      by users, but it can also be used to locate performance bottlenecks,
      or identify potential improvements to the Vertica configuration.
   This data is queryable via Vertica system tables.
Acccess a list of data collector components, and some statistics, by running:
   SELECT * FROM v_monitor.data_collector;

The amount of data retained by size and time can be controlled with several
functions.
   To just set the size amount:
      set_data_collector_policy(<component>,
                                <memory retention (KB)>,
                                <disk retention (KB)>);

   To set both the size and time amounts (the smaller one will dominate):
      set_data_collector_policy(<component>,
                                <memory retention (KB)>,
                                <disk retention (KB)>,
                                <interval>);

   To set just the time amount:
      set_data_collector_time_policy(<component>,
                                     <interval>);

   To set the time amount for all tables:
      set_data_collector_time_policy(<interval>);

The current retention policy for a component can be queried with:
   get_data_collector_policy(<component>);

Data on disk is kept in the "DataCollector" directory under the Vertica
\catalog path. This directory also contains instructions on how to load
the monitoring data into another Vertica database.

To move the data collector logs and instructions to other storage locations,
create labeled storage locations using add_location and then use:

   set_data_collector_storage_location(<storage_label>);

Additional commands can be used to configure the data collection logs.
The log can be cleared with:
clear_data_collector([<optional component>]);
The log can be synchronized with the disk storage using:
flush_data_collector([<optional component>]);

3 - FLUSH_DATA_COLLECTOR

Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

FLUSH_DATA_COLLECTOR( [ 'component' ] )

Parameters

component
Flushes data for the specified component. If you omit this argument, the function flushes data for all components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

Superuser

Examples

The following command flushes the Data Collector for the ResourceAcquisitions component:

=> SELECT flush_data_collector('ResourceAcquisitions');
 flush_data_collector
----------------------
 FLUSH
(1 row)

The following command flushes data collection for all components:

=> SELECT flush_data_collector();
 flush_data_collector
----------------------
 FLUSH
(1 row)

See also

Data collector utility

4 - GET_DATA_COLLECTOR_POLICY

Retrieves a brief statement about the retention policy for the specified component.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_DATA_COLLECTOR_POLICY( 'component' )

Parameters

component
Specifies the component whose retention policy is returned.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

None

Examples

The following query returns the retention policy for the ResourceAcquisitions component:

=> SELECT get_data_collector_policy('ResourceAcquisitions');
          get_data_collector_policy
----------------------------------------------
 1000KB kept in memory, 10000KB kept on disk.
(1 row)

5 - SET_DATA_COLLECTOR_POLICY

Updates the following retention policy properties for the specified component:

  • MEMORY_BUFFER_SIZE_KB

  • DISK_SIZE_KB

  • INTERVAL_TIME

Before you change a retention policy, you can view its current settings by querying system table DATA_COLLECTOR or by calling meta-function GET_DATA_COLLECTOR_POLICY.
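The two ways of inspecting the current settings could look like the following sketch. The column names are taken from the retention policy properties named on this page. The component ResourceAcquisitions is only an example.

```sql
-- Option 1: read the policy properties directly from the system table.
-- DISTINCT collapses the per-node rows when all nodes share the same policy.
SELECT DISTINCT component,
       memory_buffer_size_kb,
       disk_size_kb,
       interval_set,
       interval_time
FROM data_collector
WHERE component = 'ResourceAcquisitions';

-- Option 2: get a one-line summary from the meta-function.
SELECT get_data_collector_policy('ResourceAcquisitions');
```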

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DATA_COLLECTOR_POLICY('component', 'memory-buffer-size', 'disk-size' [,'interval-time']  )

Parameters

component
Specifies the component whose retention policy is updated.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

memory-buffer-size
Specifies in kilobytes the maximum amount of data that is buffered in memory before moving it to disk. The retention policy property MEMORY_BUFFER_SIZE_KB is set from this value.

Consider setting this parameter to a high value in the following cases:

  • Unusually high levels of data collection. If memory-buffer-size is set too low, the Data Collector might be unable to flush buffered data to disk fast enough to keep up with the activity level, which can lead to loss of in-memory data.

  • Very large data collector records—for example, records with very long query strings. The Data Collector uses double-buffering, so it cannot retain in memory records that are more than 50 percent larger than memory-buffer-size.
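One way to spot components affected by either case is to look for loss counters in system table DATA_COLLECTOR. The sketch below assumes that the table exposes the columns node_name, record_too_big_errors, lost_buffers, and lost_records. These names do not appear elsewhere on this page, so confirm them in your version first.

```sql
-- Find components that have dropped records or rejected oversized records.
-- Assumption: the loss-counter columns below exist in DATA_COLLECTOR.
SELECT node_name,
       component,
       memory_buffer_size_kb,
       record_too_big_errors,
       lost_buffers,
       lost_records
FROM data_collector
WHERE lost_records > 0
   OR record_too_big_errors > 0
ORDER BY lost_records DESC;
```

Components that show up in this result are candidates for a larger memory-buffer-size.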

disk-size
Specifies in kilobytes the maximum disk space allocated for this component's Data Collector table. The retention policy property DISK_SIZE_KB is set from this value. If set to 0, the Data Collector retains only as much component data as it can buffer in memory, as specified by memory-buffer-size.

interval-time
INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).

For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days ('2 days'::interval), the Data Collector table dc_tuple_mover_events retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.

To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:

  • INTERVAL_SET: f

  • INTERVAL_TIME: 0

With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.

Privileges

Superuser

Examples

See Configuring data retention policies.
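As a quick orientation, the calls below sketch both forms of the syntax shown above. The sizes and the interval are arbitrary illustrative values, not recommendations.

```sql
-- Set only the size limits: 1500 KB buffered in memory, 25000 KB kept on disk.
SELECT set_data_collector_policy('ResourceAcquisitions', '1500', '25000');

-- Set size limits and a time limit: records older than 3 days are dropped,
-- even if the disk limit has not been reached.
SELECT set_data_collector_policy('TupleMoverEvents', '2000', '15000', '3 days');

-- Confirm the result.
SELECT get_data_collector_policy('TupleMoverEvents');
```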

6 - SET_DATA_COLLECTOR_TIME_POLICY

Updates the retention policy property INTERVAL_TIME for the specified component. Calling this function has no effect on other properties of the same component. To update the INTERVAL_TIME property of all component retention policies at once, omit the component argument.

To set other retention policy properties, call SET_DATA_COLLECTOR_POLICY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DATA_COLLECTOR_TIME_POLICY( ['component',] 'interval-time' )

Parameters

component
Specifies the component whose retention policy is updated. If you omit this argument, Vertica updates the retention policy of all Data Collector components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

interval-time
INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).

For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days ('2 days'::interval), the Data Collector table dc_tuple_mover_events retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.

To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:

  • INTERVAL_SET: f

  • INTERVAL_TIME: 0

With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.

Privileges

Superuser

Examples

See Configuring data retention policies.
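The following sketch illustrates the three cases described above: setting the interval for one component, setting it for all components, and disabling it again. The component name and intervals are illustrative values only.

```sql
-- Keep two days of Tuple Mover history; also sets INTERVAL_SET to t.
SELECT set_data_collector_time_policy('TupleMoverEvents', '2 days'::interval);

-- Apply a one-day retention interval to every component.
SELECT set_data_collector_time_policy('1 day'::interval);

-- Disable the time policy for one component by passing a negative integer.
-- INTERVAL_SET reverts to f, and retention is again bounded by DISK_SIZE_KB.
SELECT set_data_collector_time_policy('TupleMoverEvents', '-1');
```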