This is the multi-page printable view of this section.
Click here to print.
Return to the regular view of this page.
Data collector functions
The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events.
The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events. It gathers and retains monitoring information about your database cluster and makes that information available in system tables, requiring few configuration parameter tweaks, and having negligible impact on performance.
Collected data is stored on disk in the DataCollector
directory under the Vertica /catalog path. You can use the information the Data Collector retains to query the past state of system tables and extract aggregate information, as well as do the following:
-
See what actions users have taken
-
Locate performance bottlenecks
-
Identify potential improvements to Vertica configuration
Data Collector works in conjunction with an advisor tool called Workload Analyzer, which intelligently monitors the performance of SQL queries and workloads and recommends tuning actions based on observations of the actual workload history.
By default, Data Collector is on and retains information for all sessions. If performance issues arise, a superuser can disable Data Collector by setting set configuration parameter EnableDataCollector to 0.
1 - CLEAR_DATA_COLLECTOR
Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR.
Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
CLEAR_DATA_COLLECTOR( [ 'component' ] )
Parameters
component
- Clears memory and disk records for the specified component. If you provide no argument, the function clears memory and disk records for all components.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
component | description
----------------+-------------------------------
DepotEvictions | Files evicted from the Depot
DepotFetches | Files fetched to the Depot
DepotUploads | Files Uploaded from the Depot
(3 rows)
Privileges
Superuser
Examples
The following command clears memory and disk records for the ResourceAcquisitions component:
=> SELECT clear_data_collector('ResourceAcquisitions');
clear_data_collector
----------------------
CLEAR
(1 row)
The following command clears data collection for all components:
=> SELECT clear_data_collector();
clear_data_collector
----------------------
CLEAR
(1 row)
See also
Data collector utility
2 - DATA_COLLECTOR_HELP
Returns online usage instructions about the Data Collector, the V_MONITOR.DATA_COLLECTOR system table, and the Data Collector control functions.
Returns online usage instructions about the Data Collector, the
DATA_COLLECTOR
system table, and the Data Collector control functions.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
DATA_COLLECTOR_HELP()
Privileges
None
Returns
The DATA_COLLECTOR_HELP()
function returns the following information:
=> SELECT DATA_COLLECTOR_HELP();
-----------------------------------------------------------------------------
Usage Data Collector
The data collector retains history of important system activities.
This data can be used as a reference of what actions have been taken
by users, but it can also be used to locate performance bottlenecks,
or identify potential improvements to the Vertica configuration.
This data is queryable via Vertica system tables.
Acccess a list of data collector components, and some statistics, by running:
SELECT * FROM v_monitor.data_collector;
The amount of data retained by size and time can be controlled with several
functions.
To just set the size amount:
set_data_collector_policy(<component>,
<memory retention (KB)>,
<disk retention (KB)>);
To set both the size and time amounts (the smaller one will dominate):
set_data_collector_policy(<component>,
<memory retention (KB)>,
<disk retention (KB)>,
<interval>);
To set just the time amount:
set_data_collector_time_policy(<component>,
<interval>);
To set the time amount for all tables:
set_data_collector_time_policy(<interval>);
The current retention policy for a component can be queried with:
get_data_collector_policy(<component>);
Data on disk is kept in the "DataCollector" directory under the Vertica
\catalog path. This directory also contains instructions on how to load
the monitoring data into another Vertica database.
To move the data collector logs and instructions to other storage locations,
create labeled storage locations using add_location and then use:
set_data_collector_storage_location(<storage_label>);
Additional commands can be used to configure the data collection logs.
The log can be cleared with:
clear_data_collector([<optional component>]);
The log can be synchronized with the disk storage using:
flush_data_collector([<optional component>]);
See also
3 - FLUSH_DATA_COLLECTOR
Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage.
Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
FLUSH_DATA_COLLECTOR( [ 'component' ] )
Parameters
component
- Flushes data for the specified component. If you omit this argument, the function flushes data for all components.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
component | description
----------------+-------------------------------
DepotEvictions | Files evicted from the Depot
DepotFetches | Files fetched to the Depot
DepotUploads | Files Uploaded from the Depot
(3 rows)
Privileges
Superuser
Examples
The following command flushes the Data Collector for the ResourceAcquisitions component:
=> SELECT flush_data_collector('ResourceAcquisitions');
flush_data_collector
----------------------
FLUSH
(1 row)
The following command flushes data collection for all components:
=> SELECT flush_data_collector();
flush_data_collector
----------------------
FLUSH
(1 row)
See also
Data collector utility
4 - GET_DATA_COLLECTOR_POLICY
Retrieves a brief statement about the retention policy for the specified component.
Retrieves a brief statement about the retention policy for the specified component.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
GET_DATA_COLLECTOR_POLICY( 'component' )
Parameters
component
- Returns the retention policy of the specified component.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
component | description
----------------+-------------------------------
DepotEvictions | Files evicted from the Depot
DepotFetches | Files fetched to the Depot
DepotUploads | Files Uploaded from the Depot
(3 rows)
Privileges
None
Examples
The following query returns the history of all resource acquisitions by specifying the ResourceAcquisitions
component:
=> SELECT get_data_collector_policy('ResourceAcquisitions');
get_data_collector_policy
----------------------------------------------
1000KB kept in memory, 10000KB kept on disk.
(1 row)
See also
5 - SET_DATA_COLLECTOR_POLICY
Updates the following retention policy properties for the specified component:.
Updates the following retention policy properties for the specified component:
-
MEMORY_BUFFER_SIZE_KB
-
DISK_SIZE_KB
-
INTERVAL_TIME
Before you change a retention policy, you can view its current settings by querying system table DATA_COLLECTOR or by calling meta-function GET_DATA_COLLECTOR_POLICY.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SET_DATA_COLLECTOR_POLICY('component', 'memory-buffer-size', 'disk-size' [,'interval-time'] )
Parameters
component
- Specifies the retention policy to update.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
component | description
----------------+-------------------------------
DepotEvictions | Files evicted from the Depot
DepotFetches | Files fetched to the Depot
DepotUploads | Files Uploaded from the Depot
(3 rows)
memory-buffer-size
- Specifies in kilobytes the maximum amount of data that is buffered in memory before moving it to disk. The policy retention policy property MEMORY_BUFFER_SIZE_KB is set from this value.
Caution
If you set this parameter to 0, the function returns with a warning that the Data Collector cannot retain any data for this component in memory or on disk.
Consider setting this parameter to a high value in the following cases:
-
Unusually high levels of data collection. If memory-buffer-size
is set too low, the Data Collector might be unable to flush buffered data to disk fast enough to keep up with the activity level, which can lead to loss of in-memory data.
-
Very large data collector records—for example, records with very long query strings. The Data Collector uses double-buffering, so it cannot retain in memory records that are more than 50 percent larger than memory-buffer-size
.
disk-size
- Specifies in kilobytes the maximum disk space allocated for this component's Data Collector table. The policy retention policy property DISK_SIZE_KB is set from this value. If set to 0, the Data Collector retains only as much component data as it can buffer in memory, as specified by
memory-buffer-size
.
interval-time
INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).
For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days ('2 days'::interval
), the Data Collector table dc_tuple_mover_events
retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.
Note
Setting a component's policy's INTERVAL_TIME property has no effect on how much data storage the Data Collector retains on disk for that component. Maximum disk storage capacity is determined by the DISK_SIZE_KB property. Setting the INTERVAL_TIME property only affects how long data is retained by the component's Data Collector table. For details, see
Configuring data retention policies.
To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:
-
INTERVAL_SET: f
-
INTERVAL_TIME: 0
With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.
Privileges
Superuser
Examples
See Configuring data retention policies.
6 - SET_DATA_COLLECTOR_TIME_POLICY
Updates the retention policy property INTERVAL_TIME for the specified component.
Updates the retention policy property INTERVAL_TIME for the specified component. Calling this function has no effect on other properties of the same component. You can use this function to update the INTERVAL_TIME property of all component retention policies.
To set other retention policy properties, call SET_DATA_COLLECTOR_POLICY.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SET_DATA_COLLECTOR_TIME_POLICY( ['component',] 'interval-time' )
Parameters
component
- Specifies the retention policy to update. If you omit this argument, Vertica updates the retention policy of all Data Collector components.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
component | description
----------------+-------------------------------
DepotEvictions | Files evicted from the Depot
DepotFetches | Files fetched to the Depot
DepotUploads | Files Uploaded from the Depot
(3 rows)
interval-time
INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).
For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days ('2 days'::interval
), the Data Collector table dc_tuple_mover_events
retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.
Note
Setting a component's policy's INTERVAL_TIME property has no effect on how much data storage the Data Collector retains on disk for that component. Maximum disk storage capacity is determined by the DISK_SIZE_KB property. Setting the INTERVAL_TIME property only affects how long data is retained by the component's Data Collector table. For details, see
Configuring data retention policies.
To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:
-
INTERVAL_SET: f
-
INTERVAL_TIME: 0
With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.
Privileges
Superuser
Examples
See Configuring data retention policies.