统计信息工具选项

统计信息工具可用于访问调度程序已运行的微批处理的历史记录。该工具以 JSON 格式将微批处理的日志输出到标准输出。可以使用其选项来筛选微批处理列表,以获得您感兴趣的微批处理。

语法

vkconfig statistics [options]
\--cluster "cluster"[,"cluster2"...]
仅返回从名称与您提供的列表中的名称相匹配的群集中检索数据的微批处理。
--dump
返回 vkconfig 为了从调度程序表中提取数据而会执行的 SQL 查询,而不是返回微批处理数据。如果您希望使用 Vertica 客户端应用程序来获取微批处理日志,而不是使用 vkconfig 的 JSON 输出,则可以使用此选项。
\--from-timestamp "timestamp"
仅返回在 timestamp 之后开始的微批处理。timestamp 值采用 yyyy-[m]m-[d]d hh:mm:ss 格式。

不能与 --last 结合使用。

\--last number
返回满足所有其他筛选器的 number 个最近的微批处理。不能与 --from-timestamp--to-timestamp 结合使用。
\--microbatch "name"[,"name2"...]
仅返回名称与逗号分隔列表中的名称之一相匹配的微批处理。
\--partition partition#[,partition#2...]
仅返回从与分区列表中的值之一相匹配的主题分区访问数据的微批处理。
\--source "source"[,"source2"...]
仅返回从名称与您提供给此实参的列表中的名称之一相匹配的源访问数据的微批处理。
--target-schema "schema"[,"schema2"...]
仅返回将数据写入名称与目标架构列表实参中的名称之一相匹配的 Vertica 架构的微批处理。
--target-table "table"[,"table2"...]
仅返回将数据写入名称与目标表列表实参中的名称之一相匹配的 Vertica 表的微批处理。
\--to-timestamp "timestamp"
仅返回在 timestamp 之前开始的微批处理。timestamp 值采用 yyyy-[m]m-[d]d hh:mm:ss 格式。

不能与 --last 结合使用。

请参阅常用 vkconfig 脚本选项以了解所有 vkconfig 工具中提供的选项。

用法注意事项

  • 可以在提供给 --cluster--microbatch--source--target-schema--target-table 实参的值中使用 LIKE 通配符。此功能可用于匹配微批处理数据中的部分字符串。有关使用通配符的详细信息,请参阅 LIKE 谓词

  • --cluster--microbatch--source--target-schema--target-table 实参的字符串比较不区分大小写。

  • 提供给 --from-timestamp--to-timestamp 实参的日期和时间值使用 java.sql.timestamp 格式以便解析该值。此格式的解析可以接受您可能认为无效并希望它拒绝的值。例如,如果提供的时间戳为 01-01-2018 24:99:99,Java 时间戳解析器会静默地将日期转换为 2018-01-02 01:40:39,而不是返回错误。

示例

以下示例获取 weblog.conf 文件中定义的调度程序运行的最后一个微批处理:

$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --conf weblog.conf
{"microbatch":"weblog", "target_schema":"public", "target_table":"web_hits",
"source_name":"web_hits", "source_cluster":"kafka_weblog", "source_partition":0,
"start_offset":80000, "end_offset":79999, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:09.793000", "batch_start":"2018-11-06 09:42:00.176747",
"batch_end":"2018-11-06 09:42:00.437787", "source_duration":"00:00:00.214314",
"consecutive_error_count":null, "transaction_id":45035996274513069,
"frame_start":"2018-11-06 09:41:59.949", "frame_end":null}

如果调度程序正在从多个分区读取,则 --last 1 选项会列出来自每个分区的最后一个微批处理:

$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --conf iot.conf
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":0,
"start_offset":-2, "end_offset":-2, "end_reason":"DEADLINE",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:09.950127",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":1,
"start_offset":1604, "end_offset":1653, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4387, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.220329",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":2,
"start_offset":1603, "end_offset":1652, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4383, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.318997",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":3,
"start_offset":1604, "end_offset":1653, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4375, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.219543",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}

可以使用 --partition 实参只获取所需的分区:

$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --partition 2 --conf iot.conf
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":2,
"start_offset":1603, "end_offset":1652, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4383, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.318997",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}

如果调度程序从多个源读取,则 --last 1 选项会输出来自每个源的最后一个微批处理:

$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --conf weblog.conf
{"microbatch":"weberrors", "target_schema":"public", "target_table":"web_errors",
"source_name":"web_errors", "source_cluster":"kafka_weblog",
"source_partition":0, "start_offset":10000, "end_offset":9999,
"end_reason":"END_OF_STREAM", "end_reason_message":null,
"partition_bytes":0, "partition_messages":0, "timeslice":"00:00:04.909000",
"batch_start":"2018-11-06 10:58:02.632624",
"batch_end":"2018-11-06 10:58:03.058663", "source_duration":"00:00:00.220618",
"consecutive_error_count":null, "transaction_id":45035996274523991,
"frame_start":"2018-11-06 10:58:02.394", "frame_end":null}
{"microbatch":"weblog", "target_schema":"public", "target_table":"web_hits",
"source_name":"web_hits", "source_cluster":"kafka_weblog", "source_partition":0,
"start_offset":80000, "end_offset":79999, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:09.128000", "batch_start":"2018-11-06 10:58:03.322852",
"batch_end":"2018-11-06 10:58:03.63047", "source_duration":"00:00:00.226493",
"consecutive_error_count":null, "transaction_id":45035996274524004,
"frame_start":"2018-11-06 10:58:02.394", "frame_end":null}

可以使用通配符来启用部分匹配。以下示例演示了如何获取名称以“log”结尾的所有微批处理的最后一个微批处理:

~$ /opt/vertica/packages/kafka/bin/vkconfig statistics --microbatch "%log" \
                                            --last 1 --conf weblog.conf
{"microbatch":"weblog", "target_schema":"public", "target_table":"web_hits",
"source_name":"web_hits", "source_cluster":"kafka_weblog", "source_partition":0,
"start_offset":80000, "end_offset":79999, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:04.874000", "batch_start":"2018-11-06 11:37:16.17198",
"batch_end":"2018-11-06 11:37:16.460844", "source_duration":"00:00:00.213129",
"consecutive_error_count":null, "transaction_id":45035996274529932,
"frame_start":"2018-11-06 11:37:15.877", "frame_end":null}

要获取特定时间段的微批处理,请使用 --from-timestamp--to-timestamp 实参。以下示例获取 iot.conf 中定义的调度程序在 2018-11-06 12:52:30 到 12:53:00 之间从分区 #2 进行读取的微批处理。

$ /opt/vertica/packages/kafka/bin/vkconfig statistics  --partition 1 \
                        --from-timestamp "2018-11-06 12:52:30" \
                        --to-timestamp "2018-11-06 12:53:00" --conf iot.conf
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":1,
"start_offset":1604, "end_offset":1653, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4387, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.220329",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":1,
"start_offset":1554, "end_offset":1603, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4371, "partition_messages":50,
"timeslice":"00:00:09.788000", "batch_start":"2018-11-06 12:52:38.930428",
"batch_end":"2018-11-06 12:52:48.932604", "source_duration":"00:00:00.231709",
"consecutive_error_count":null, "transaction_id":45035996274536981,
"frame_start":"2018-11-06 12:52:38.685", "frame_end":null}

以下示例演示了如何使用 --dump 实参获取 vkconfig 为检索上一个示例的输出而执行的 SQL 语句:

$ /opt/vertica/packages/kafka/bin/vkconfig statistics  --dump --partition 1 \
                       --from-timestamp "2018-11-06 12:52:30" \
                       --to-timestamp "2018-11-06 12:53:00" --conf iot.conf
SELECT microbatch, target_schema, target_table, source_name, source_cluster,
source_partition, start_offset, end_offset, end_reason, end_reason_message,
partition_bytes, partition_messages, timeslice, batch_start, batch_end,
last_batch_duration AS source_duration, consecutive_error_count, transaction_id,
frame_start, frame_end FROM "iot_sched".stream_microbatch_history WHERE
(source_partition = '1') AND (frame_start >= '2018-11-06 12:52:30.0') AND
(frame_start < '2018-11-06 12:53:00.0') ORDER BY frame_start DESC, microbatch,
source_cluster, source_name, source_partition;