Data collection scope

scrutinize options let you control the scope of the data collection.

scrutinize options let you control the scope of the data collection. You can specify the scope of the data collection according to the following criteria:

You can use these options singly or in combination, to achieve the desired level of granularity.

Amount of collected data

Several options let you limit how much data scrutinize collects:

--by-second
Collect data every second. This is the highest level of granularity when collecting from Data Collector tables.
--by-minute=boolean-value
Collect data every minute (if the value is true) or every hour (if the value is false).
--get-files file-list
Collect the specified additional files, including globs, where file-list is a semicolon-delimited list of files.
--include_gzlogs=num-files
-z num-files
Number of rotated log files (vertica.log*.gz) to include in the scrutinize output, or all.

By default, scrutinize includes three rotated log files.

--log-limit=limit
-l limit
How much data to collect from Vertica logs, in gigabytes, starting from the most recent log entry. By default, scrutinize collects unlimited log data.

Node-specific collection

By default, scrutinize collects data from all cluster nodes. You can specify that scrutinize collect from individual nodes in two ways:

--local_diags
-s
Collect diagnostics only from the host on which scrutinize was invoked. To collect data from multiple nodes in the cluster, use the --hosts option.
--hosts=host-list
-n host-list
Collect diagnostics only from the hosts specified in host-list, a comma-separated list of IP addresses or host names.

For example:

$ scrutinize --hosts=127.0.0.1,host_3,host_1

Types of data to include

scrutinize provides several options that let you specify the type of data to collect:

--debug
Collects debug information for the log.
--diag-dump
Limits the collection to database design, system tables, and Data Collector tables. Use this option to collect data to analyze system performance.
--diagnostics
Limits the collection to log file data and output from commands that are run against Vertica and its host system. Use this option to collect data to evaluate unexpected behavior in your Vertica system.
--include-ros-info
Includes ROS related information from system tables.
--no-active-queries | --with-active-queries
Whether to exclude diagnostic information from system tables and Data Collector tables about currently running queries. By default, scrutinize collects this information (--with-active-queries).
--tasks=tasks
-T tasks
Gathers diagnostics on one or more tasks, as specified in a file or JSON list. This option is typically used together with --exclude.
--type=type
-t type
Type of diagnostics collection to perform, one of the following:
  • profiling: Gather profiling data.

  • context: Gather summary information.

Types of data to exclude

scrutinize options also let you specify the types of data to exclude from its collection:

--exclude=tasks
-X tasks
Excludes one or more types of tasks from the diagnostics collection, where tasks is a comma-separated list of the tasks to exclude:
  • all: All default tasks

  • DC: Data Collector tables

  • File: Log files from the installation process, the database, and Administration Tools, such as vertica.log, dbLog, and adminTools.log

  • VerticaLog: Vertica logs

  • CatalogObject: Vertica catalog metadata, such as system configuration parameters

  • SystemTable: Vertica system tables that contain information about system, resources, workload, and performance

  • Query: Vertica meta-functions that use vsql to connect to the database, such as EXPORT_CATALOG()

  • Command: Operating system information, such as the length of time that a node has been up

--no-active-queries
Omits diagnostic information from system tables and Data Collector tables about currently running queries. By default, scrutinize always collects active query information (--with-active-queries).
--vsql-off
-v
Excludes Query and SystemTable tasks, which are used to connect to the database. This option can help you deal with problems that occur during an upgrade, and is typically used in the following cases:
  • Vertica is running but is slow to respond.

  • You haven't yet created a database but need help troubleshooting other cluster issues.