Data collection scope

scrutinize options let you control the scope of the data collection. You can specify the scope according to the following criteria:

  • Amount of collected data

  • Node-specific collection

  • Types of data to include

  • Types of data to exclude

You can use these options singly or in combination, to achieve the desired level of granularity.

Amount of collected data

Several options let you limit how much data scrutinize collects:

--by-second --by-minute=boolean-value
Specifies the granularity of information that is collected from Data Collector tables with one of the following options:
  • --by-second: Highest level of granularity, specifies to collect data down to the second.

  • --by-minute=boolean-value

    where boolean-value is set to one of the following:

    • {yes|on}: Default setting, specifies to collect data down to the minute.

    • {no|off}: Lowest level of granularity, specifies to collect data down to the hour.

For example, the following command collects data down to the hour:

$ scrutinize --by-minute=no

This command collects data down to the second:

  
$ scrutinize --by-second
--get-files file-list
Specifies extra files to collect, including globs, where file-list is a semicolon-delimited list of files.
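Because the shell treats a semicolon as a command separator and expands globs itself, the file-list generally needs to be quoted so that it reaches scrutinize intact. The following is a minimal sketch that joins paths into a semicolon-delimited list and prints the resulting command instead of running it. The helper name join_file_list and the example paths are hypothetical.

```shell
#!/bin/sh
# join_file_list (hypothetical helper): joins its arguments with semicolons.
join_file_list() {
    list=""
    for f in "$@"; do
        if [ -z "$list" ]; then
            list="$f"
        else
            list="$list;$f"
        fi
    done
    printf '%s\n' "$list"
}

# Placeholder paths. Single quotes keep the shell from expanding the glob,
# so the pattern itself ends up in the list.
file_list=$(join_file_list '/tmp/extra.log' '/home/dbadmin/configs/*.conf')

# Print the command rather than run it. The double quotes around the list
# stop the shell from treating ';' as a command separator.
printf 'scrutinize --get-files "%s"\n' "$file_list"
```

Without the quotes, the shell would end the scrutinize command at the first semicolon and try to run the next path as a separate command.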
--include_gzlogs=num-files
-z num-files
Specifies to include num-files rotated log files (vertica.log*.gz) in the scrutinize output, where num-files can be one of the following:
  • An integer specifies the number of rotated log files to collect.

  • all specifies to collect all rotated log files.

By default, scrutinize includes three rotated log files.

For example, the following command specifies to collect two rotated log files:

$ scrutinize --include_gzlogs=2
--log-limit=limit
-l limit
Specifies how much data to collect from Vertica logs, where limit specifies, in gigabytes, how much log data to collect, starting from the most recent log entry. By default, scrutinize collects 1 GB of log data.

For example, the following command specifies to collect 4 GB of log data:

$ scrutinize --log-limit=4
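
These options can be combined in a single invocation. The following sketch reuses the values from the examples above, checks that the num-files value is either an integer or all, and prints the combined command instead of running it. The helper name valid_gzlogs is hypothetical, and the check covers only the syntax described in this section.

```shell
#!/bin/sh
# valid_gzlogs (hypothetical helper): succeeds if the value consists only
# of digits or is the keyword "all".
valid_gzlogs() {
    case "$1" in
        all) return 0 ;;
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

num_files=2     # rotated log files to include
log_limit=4     # gigabytes of log data to collect

if valid_gzlogs "$num_files"; then
    # Print the command instead of running it.
    echo "scrutinize --by-minute=no --include_gzlogs=$num_files --log-limit=$log_limit"
else
    echo "invalid num-files value: $num_files" >&2
fi
```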

Node-specific collection

By default, scrutinize collects data from all cluster nodes. You can specify that scrutinize collect from individual nodes in two ways:

--local_diags
-s
Specifies to collect diagnostics only from the host on which scrutinize was invoked.
--hosts=host-list
-n host-list
Specifies to collect diagnostics only from the hosts specified in host-list, where host-list is a comma-separated list of IP addresses or host names.

For example:

$ scrutinize --hosts=127.0.0.1,host_3,host_1
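
If the hosts are kept one per line, for example in a file, they can be joined into the comma-separated form that --hosts expects. The following sketch uses the standard paste utility and prints the resulting command instead of running it. The helper name hosts_to_list and the host values are hypothetical.

```shell
#!/bin/sh
# hosts_to_list (hypothetical helper): reads one host per line on stdin
# and prints them as a single comma-separated list.
hosts_to_list() {
    paste -s -d, -
}

# Placeholder hosts; in practice these lines might come from a file.
host_list=$(printf '%s\n' 192.0.2.11 192.0.2.12 node03 | hosts_to_list)

# Print the command instead of running it.
echo "scrutinize --hosts=$host_list"
```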

Types of data to include

scrutinize provides several options that let you specify the type of data to collect:

--debug
Collects debug information for the log.
--diag-dump
Limits the collection to database design, system tables, and Data Collector tables. Use this option to collect data to analyze system performance.
--diagnostics
Limits the collection to log file data and output from commands that are run against Vertica and its host system. Use this option to collect data to evaluate unexpected behavior in your Vertica system.
--include-ros-info
Includes ROS-related information from system tables.
--no-active-queries --with-active-queries
Specifies to exclude diagnostic information from system tables and Data Collector tables about currently running queries. By default, scrutinize collects this information (--with-active-queries).
--tasks=tasks
-T tasks
Specifies that scrutinize gather diagnostics on one or more tasks, as specified in a file or JSON list. This option is typically used together with --exclude.
--type=type
-t type
Specifies the type of diagnostics collection to perform, where type can be one of the following arguments:
  • profiling: Gather profiling data.

  • context: Gather summary information.

--with-active-queries
The default setting, specifies to include diagnostic information from system tables and Data Collector tables about currently running queries. To omit this data, use --no-active-queries.
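
As described above, --diag-dump targets performance analysis and --diagnostics targets unexpected behavior. The following sketch maps an investigation goal to the matching option and prints the command instead of running it. The helper name scope_flag and the goal names are hypothetical and serve only as an illustration.

```shell
#!/bin/sh
# scope_flag (hypothetical helper): maps an investigation goal to the
# option described above. The goal names are illustrative only.
scope_flag() {
    case "$1" in
        performance) echo "--diag-dump" ;;
        behavior)    echo "--diagnostics" ;;
        *)           return 1 ;;
    esac
}

goal=performance
if flag=$(scope_flag "$goal"); then
    # Print the command instead of running it.
    echo "scrutinize $flag"
else
    echo "unknown goal: $goal" >&2
fi
```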

Types of data to exclude

scrutinize options also let you specify the types of data to exclude from its collection:

--exclude=tasks
-X tasks
Excludes one or more types of tasks from the diagnostics collection, where tasks is a comma-separated list of the tasks to exclude.

Specify the tasks to exclude with the following case-insensitive arguments:

  • all: All default tasks

  • DC: Data Collector tables

  • File: Log files from the installation process, the database, and Administration Tools, such as vertica.log, dbLog, and adminTools.log

  • VerticaLog: Vertica logs

  • CatalogObject: Vertica catalog metadata, such as system configuration parameters

  • SystemTable: Vertica system tables that contain information about system, resources, workload, and performance

  • Query: Vertica meta-functions that use vsql to connect to the database, such as EXPORT_CATALOG()

  • Command: Operating system information, such as the length of time that a node has been up
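
The following sketch checks a comma-separated task list against the arguments listed above, ignoring case, and prints the resulting --exclude command instead of running it. The helper name is_known_task and the selected tasks are hypothetical.

```shell
#!/bin/sh
# is_known_task (hypothetical helper): checks a name, ignoring case,
# against the task arguments listed above.
is_known_task() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        all|dc|file|verticalog|catalogobject|systemtable|query|command) return 0 ;;
        *) return 1 ;;
    esac
}

tasks="DC,File"   # example selection

# Split the list on commas and validate each name.
ok=1
old_ifs=$IFS
IFS=,
for t in $tasks; do
    is_known_task "$t" || { echo "unknown task: $t" >&2; ok=0; }
done
IFS=$old_ifs

# Print the command instead of running it.
if [ "$ok" -eq 1 ]; then
    echo "scrutinize --exclude=$tasks"
fi
```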

--no-active-queries
Specifies to omit diagnostic information from system tables and Data Collector tables about currently running queries. By default, scrutinize collects active query information (--with-active-queries).
--vsql-off
-v
Excludes Query and SystemTable tasks, which are used to connect to the database. This option can help you deal with problems that occur during an upgrade, and is typically used in the following cases:
  • Vertica is running but is slow to respond.

  • You haven't yet created a database but need help troubleshooting other cluster issues.