Statistics tool options
The statistics tool lets you access the history of microbatches that your scheduler has run. The tool outputs the microbatch log in JSON format to standard output. You can use its options to filter the list of microbatches down to the ones you are interested in.
Note

If you have changed your scheduler's configuration over time, the statistics tool can sometimes produce confusing output. For example, suppose microbatch A targets a table. Later, you change the scheduler's configuration so that microbatch B targets that table. You then run the statistics tool and filter the microbatch log on the target table. The log output shows entries from both microbatch A and microbatch B.

Syntax
vkconfig statistics [options]
--cluster "cluster"[,"cluster2"...]
- Only return microbatches that retrieved data from clusters whose names match one of the names in the list you supply.

--dump
- Instead of returning microbatch data, return the SQL query that vkconfig would execute to extract that data from the scheduler tables. Use this option if you want to use a Vertica client application to get the microbatch log instead of using vkconfig's JSON output.

--from-timestamp "timestamp"
- Only return microbatches that started after timestamp. The timestamp value is in yyyy-[m]m-[d]d hh:mm:ss format. Cannot be used with --last.

--last number
- Return the number most recent microbatches that satisfy all other filters. Cannot be used with --from-timestamp or --to-timestamp.

--microbatch "name"[,"name2"...]
- Only return microbatches whose names match one of the names in the comma-separated list.

--partition partition#[,partition#2...]
- Only return microbatches that accessed data from topic partitions matching one of the values in the partition list.

--source "source"[,"source2"...]
- Only return microbatches that accessed data from sources whose names match one of the names in the list you supply to this argument.

--target-schema "schema"[,"schema2"...]
- Only return microbatches that wrote data to Vertica schemas whose names match one of the names in the target schema list argument.

--target-table "table"[,"table2"...]
- Only return microbatches that wrote data to Vertica tables whose names match one of the names in the target table list argument.

--to-timestamp "timestamp"
- Only return microbatches that started before timestamp. The timestamp value is in yyyy-[m]m-[d]d hh:mm:ss format. Cannot be used with --last.
See Common vkconfig script options for options that are available in all of the vkconfig tools.

Usage considerations
- You can use LIKE wildcards in the values you supply to the --cluster, --microbatch, --source, --target-schema, and --target-table arguments. This lets you match partial strings in the microbatch data. See LIKE predicate for more information about using wildcards.
- String comparisons for the --cluster, --microbatch, --source, --target-schema, and --target-table arguments are case-insensitive.
- The date and time values you supply to the --from-timestamp and --to-timestamp arguments are parsed using the java.sql.Timestamp format. This parsing can accept values that you might consider invalid and expect it to reject. For example, if you supply a timestamp of 01-01-2018 24:99:99, the Java timestamp parser silently converts the date to 2018-01-02 01:40:39 instead of returning an error.
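The rollover in that last note is ordinary lenient date arithmetic: each out-of-range field is carried into the next larger unit. A minimal Python sketch reproduces the documented conversion (illustrative only; the scheduler itself relies on Java's java.sql.Timestamp parsing):

```python
from datetime import datetime, timedelta

def lenient_timestamp(year, month, day, hour, minute, second):
    """Normalize out-of-range time fields by rolling them over,
    the way a lenient timestamp parser does."""
    return datetime(year, month, day) + timedelta(
        hours=hour, minutes=minute, seconds=second)

# 24:99:99 on 2018-01-01 rolls over to the next day:
print(lenient_timestamp(2018, 1, 1, 24, 99, 99))
# 2018-01-02 01:40:39
```

Because of this behavior, double-check the values you pass to --from-timestamp and --to-timestamp rather than relying on the parser to reject mistakes.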
Examples

This example gets the last microbatch that the scheduler defined in the weblog.conf file ran:
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --conf weblog.conf
{"microbatch":"weblog", "target_schema":"public", "target_table":"web_hits",
"source_name":"web_hits", "source_cluster":"kafka_weblog", "source_partition":0,
"start_offset":80000, "end_offset":79999, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:09.793000", "batch_start":"2018-11-06 09:42:00.176747",
"batch_end":"2018-11-06 09:42:00.437787", "source_duration":"00:00:00.214314",
"consecutive_error_count":null, "transaction_id":45035996274513069,
"frame_start":"2018-11-06 09:41:59.949", "frame_end":null}
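Because each microbatch record is a JSON object, the output is easy to post-process with any JSON-aware tool. The sketch below parses a trimmed-down record (field names taken from the output above; treating each record as a standalone JSON document is an assumption) and derives the number of messages consumed from its offsets:

```python
import json

record = json.loads('{"microbatch":"weblog", "start_offset":80000, '
                    '"end_offset":79999, "end_reason":"END_OF_STREAM"}')

# end_offset is the last offset read, so an end_offset one less than
# start_offset means no messages were consumed in this frame.
consumed = record["end_offset"] - record["start_offset"] + 1
print(record["microbatch"], consumed)  # weblog 0
```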
When the scheduler is reading from multiple partitions, the --last 1 option lists the last microbatch from each partition:
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --conf iot.conf
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":0,
"start_offset":-2, "end_offset":-2, "end_reason":"DEADLINE",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:09.950127",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":1,
"start_offset":1604, "end_offset":1653, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4387, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.220329",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":2,
"start_offset":1603, "end_offset":1652, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4383, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.318997",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":3,
"start_offset":1604, "end_offset":1653, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4375, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.219543",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
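Because every partition emits its own record, per-topic totals have to be computed client-side. A small sketch summing partition_messages across the four records above (the one-record-per-line layout is an assumption for illustration):

```python
import json

lines = [
    '{"source_partition":0, "partition_messages":0}',
    '{"source_partition":1, "partition_messages":50}',
    '{"source_partition":2, "partition_messages":50}',
    '{"source_partition":3, "partition_messages":50}',
]

# Sum the message counts reported by each partition's record.
total = sum(json.loads(line)["partition_messages"] for line in lines)
print(total)  # 150
```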
You can use the --partition argument to get just the partitions you want:
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --partition 2 --conf iot.conf
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":2,
"start_offset":1603, "end_offset":1652, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4383, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.318997",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
If the scheduler is reading from multiple sources, the --last 1 option outputs the last microbatch from each source:
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --last 1 --conf weblog.conf
{"microbatch":"weberrors", "target_schema":"public", "target_table":"web_errors",
"source_name":"web_errors", "source_cluster":"kafka_weblog",
"source_partition":0, "start_offset":10000, "end_offset":9999,
"end_reason":"END_OF_STREAM", "end_reason_message":null,
"partition_bytes":0, "partition_messages":0, "timeslice":"00:00:04.909000",
"batch_start":"2018-11-06 10:58:02.632624",
"batch_end":"2018-11-06 10:58:03.058663", "source_duration":"00:00:00.220618",
"consecutive_error_count":null, "transaction_id":45035996274523991,
"frame_start":"2018-11-06 10:58:02.394", "frame_end":null}
{"microbatch":"weblog", "target_schema":"public", "target_table":"web_hits",
"source_name":"web_hits", "source_cluster":"kafka_weblog", "source_partition":0,
"start_offset":80000, "end_offset":79999, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:09.128000", "batch_start":"2018-11-06 10:58:03.322852",
"batch_end":"2018-11-06 10:58:03.63047", "source_duration":"00:00:00.226493",
"consecutive_error_count":null, "transaction_id":45035996274524004,
"frame_start":"2018-11-06 10:58:02.394", "frame_end":null}
You can use wildcards to enable partial matches. This example demonstrates getting the last microbatch for all microbatches whose names end with "log":
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --microbatch "%log" \
--last 1 --conf weblog.conf
{"microbatch":"weblog", "target_schema":"public", "target_table":"web_hits",
"source_name":"web_hits", "source_cluster":"kafka_weblog", "source_partition":0,
"start_offset":80000, "end_offset":79999, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":0, "partition_messages":0,
"timeslice":"00:00:04.874000", "batch_start":"2018-11-06 11:37:16.17198",
"batch_end":"2018-11-06 11:37:16.460844", "source_duration":"00:00:00.213129",
"consecutive_error_count":null, "transaction_id":45035996274529932,
"frame_start":"2018-11-06 11:37:15.877", "frame_end":null}
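The matching behaves like a case-insensitive SQL LIKE: % matches any run of characters and _ matches exactly one. A hedged Python sketch of that semantics, useful for predicting which microbatch names a pattern will select (this simplified translation does not handle escaped wildcards):

```python
import re

def sql_like(pattern, value):
    """Case-insensitive SQL LIKE match: '%' = any run of characters,
    '_' = exactly one character. Escaped wildcards are not handled."""
    regex = ''.join(
        '.*' if ch == '%' else '.' if ch == '_' else re.escape(ch)
        for ch in pattern)
    return re.fullmatch(regex, value, re.IGNORECASE) is not None

print(sql_like('%log', 'weblog'))     # True
print(sql_like('%log', 'iotlog'))     # True
print(sql_like('%log', 'weberrors'))  # False
```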
To get microbatches from a specific period of time, use the --from-timestamp and --to-timestamp arguments. This example gets the microbatches that read from partition #1 between 2018-11-06 12:52:30 and 12:53:00, for the scheduler defined in iot.conf:
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --partition 1 \
--from-timestamp "2018-11-06 12:52:30" \
--to-timestamp "2018-11-06 12:53:00" --conf iot.conf
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":1,
"start_offset":1604, "end_offset":1653, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4387, "partition_messages":50,
"timeslice":"00:00:09.842000", "batch_start":"2018-11-06 12:52:49.387567",
"batch_end":"2018-11-06 12:52:59.400219", "source_duration":"00:00:00.220329",
"consecutive_error_count":null, "transaction_id":45035996274537015,
"frame_start":"2018-11-06 12:52:49.213", "frame_end":null}
{"microbatch":"iotlog", "target_schema":"public", "target_table":"iot_data",
"source_name":"iot_data", "source_cluster":"kafka_iot", "source_partition":1,
"start_offset":1554, "end_offset":1603, "end_reason":"END_OF_STREAM",
"end_reason_message":null, "partition_bytes":4371, "partition_messages":50,
"timeslice":"00:00:09.788000", "batch_start":"2018-11-06 12:52:38.930428",
"batch_end":"2018-11-06 12:52:48.932604", "source_duration":"00:00:00.231709",
"consecutive_error_count":null, "transaction_id":45035996274536981,
"frame_start":"2018-11-06 12:52:38.685", "frame_end":null}
This example demonstrates using the --dump argument to get the SQL statement that vkconfig executed to retrieve the output of the previous example:
$ /opt/vertica/packages/kafka/bin/vkconfig statistics --dump --partition 1 \
--from-timestamp "2018-11-06 12:52:30" \
--to-timestamp "2018-11-06 12:53:00" --conf iot.conf
SELECT microbatch, target_schema, target_table, source_name, source_cluster,
source_partition, start_offset, end_offset, end_reason, end_reason_message,
partition_bytes, partition_messages, timeslice, batch_start, batch_end,
last_batch_duration AS source_duration, consecutive_error_count, transaction_id,
frame_start, frame_end FROM "iot_sched".stream_microbatch_history WHERE
(source_partition = '1') AND (frame_start >= '2018-11-06 12:52:30.0') AND
(frame_start < '2018-11-06 12:53:00.0') ORDER BY frame_start DESC, microbatch,
source_cluster, source_name, source_partition;