Every time you create a scheduler (--create), Vertica creates a schema for that scheduler with the name you specify or the default stream_config. Each schema has the following tables:
Vertica recommends that you do not alter these tables except in consultation with support.
1 - stream_clusters
This table lists clusters and hosts. You change settings in this table using the vkconfig cluster tool. See Cluster tool options for more information.
Column
Data Type
Description
id
INTEGER
The identification number assigned to the cluster.
cluster
VARCHAR
The name of the cluster.
hosts
VARCHAR
A comma-separated list of hosts associated with the cluster.
Examples
This example shows a cluster and its associated hosts.
=> SELECT * FROM stream_config.stream_clusters;
id | cluster | hosts
---------+----------------+-----------------------------------
2250001 | streamcluster1 | 10.10.10.10:9092,10.10.10.11:9092
(1 row)
2 - stream_events
This table logs microbatches and other important events from the scheduler in an internal log table.
This table was renamed from kafka_config.kafka_events.
Column
Data Type
Description
event_time
TIMESTAMP
The time the event was logged.
log_level
VARCHAR
The type of event that was logged.
Valid Values:
TRACE
DEBUG
FATAL
ERROR
WARN
INFO
Default: INFO
frame_start
TIMESTAMP
The time when the frame executed.
frame_end
TIMESTAMP
The time when the frame completed.
microbatch
INTEGER
The identification number of the associated microbatch.
message
VARCHAR
A description of the event.
exception
VARCHAR
If this log is in the form of a stack trace, this column lists the exception.
Examples
This example shows typical rows from the stream_events table.
=> SELECT * FROM stream_config.stream_events;
-[ RECORD 1 ]-+-------------
event_time | 2016-07-17 13:28:35.548-04
log_level | INFO
frame_start |
frame_end |
microbatch |
message | New leader registered for schema stream_config. New ID: 0, new Host: 10.20.30.40
exception |
-[ RECORD 2 ]-+-------------
event_time | 2016-07-17 13:28:45.643-04
log_level | INFO
frame_start | 2015-07-17 12:28:45.633
frame_end | 2015-07-17 13:28:50.701-04
microbatch |
message | Generated tuples: test3|2|-2,test3|1|-2,test3|0|-2
exception |
-[ RECORD 3 ]-+----------------
event_time | 2016-07-17 14:28:50.701-04
log_level | INFO
frame_start | 2016-07-17 13:28:45.633
frame_end | 2016-07-17 14:28:50.701-04
microbatch |
message | Total rows inserted: 0
exception |
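Because the scheduler logs problems to this table, you can query it when troubleshooting a load. The following is a minimal sketch, assuming the scheduler schema is named stream_config:

```sql
-- List the most recent warnings and errors logged by the scheduler,
-- using the log_level values documented above.
SELECT event_time, log_level, message, exception
    FROM stream_config.stream_events
    WHERE log_level IN ('WARN', 'ERROR', 'FATAL')
    ORDER BY event_time DESC
    LIMIT 10;
```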
3 - stream_load_specs
This table describes user-created load specs. You change the entries in this table using the vkconfig utility's load spec tool.
Column
Data Type
Description
id
INTEGER
The identification number assigned to the load spec.
load_spec
VARCHAR
The name of the load spec.
filters
VARCHAR
A comma-separated list of UDFilters for the scheduler to include in the COPY statement it uses to load data from Kafka.
parser
VARCHAR
A Vertica UDParser to use with the specified target. If you are using a Vertica native parser, the parser parameters serve as COPY statement parameters.
parser_parameters
VARCHAR
A list of parameters to provide to the parser.
load_method
VARCHAR
The COPY load method to use for all loads with this scheduler.
Deprecated
In Vertica 10.0, load methods are no longer used due to the removal of the WOS. The value shown in this column has no effect.
message_max_bytes
INTEGER
The maximum size, in bytes, of a message.
uds_kv_parameters
VARCHAR
A list of parameters that are supplied to the KafkaSource statement. If the value in this column is in the format key=value, the scheduler passes it to the COPY statement's KafkaSource call.
Examples
This example shows the load specs defined in a Vertica instance.
SELECT * FROM stream_config.stream_load_specs;
-[ RECORD 1 ]-----+------------
id | 1
load_spec | loadspec2
filters |
parser | KafkaParser
parser_parameters |
load_method | direct
message_max_bytes | 1048576
uds_kv_parameters |
-[ RECORD 2 ]-----+------------
id | 750001
load_spec | streamspec1
filters |
parser | KafkaParser
parser_parameters |
load_method | TRICKLE
message_max_bytes | 1048576
uds_kv_parameters |
4 - stream_lock
This table is locked by the scheduler. This lock prevents multiple schedulers from running at the same time. The scheduler that locks this table updates it with its own information.
Important
Do not lock this table in a serializable transaction. Locking this table can interfere with the operation of the scheduler.
Column
Data Type
Description
scheduler_id
INTEGER
A unique ID for the scheduler instance that is currently running.
update_time
TIMESTAMP
The time the scheduler took ownership of writing to the schema.
process_info
VARCHAR
Information about the scheduler process. Currently unused.
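A plain query of this table shows which scheduler instance currently owns the schema. A minimal sketch, assuming the scheduler schema is named stream_config:

```sql
-- Check which scheduler instance currently holds the lock for this schema.
SELECT scheduler_id, update_time
    FROM stream_config.stream_lock;
```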
5 - stream_microbatch_history
This table contains a history of every microbatch executed within this scheduler configuration.
Column
Data Type
Description
source_name
VARCHAR
The name of the source.
source_cluster
VARCHAR
The name of the source cluster. The clusters are defined in stream_clusters.
source_partition
INTEGER
The number of the data streaming partition.
start_offset
INTEGER
The starting offset of the microbatch.
end_offset
INTEGER
The ending offset of the microbatch.
end_reason
VARCHAR
An explanation of why the batch ended. The following are valid end reasons:
DEADLINE - The batch ran out of time.
END_OFFSET - The load reached the ending offset specified in the KafkaSource. This reason is never used by the scheduler, as it does not specify an end offset.
END_OF_STREAM - There are no messages available to the scheduler or the eof_timeout has been reached.
NETWORK_ERROR - The scheduler could not connect to Kafka.
RESET_OFFSET - The start offset was changed using the --update and --offset parameters to the KafkaSource. This state does not occur during normal scheduler operations.
SOURCE_ISSUE - The Kafka service returned an error.
UNKNOWN - The batch ended for an unknown reason.
end_reason_message
VARCHAR
If the end reason is a network or source issue, this column contains a brief description of the issue.
partition_bytes
INTEGER
The number of bytes transferred from a source partition to a Vertica target table.
partition_messages
INTEGER
The number of messages transferred from a source partition to a Vertica target table.
microbatch_id
INTEGER
The Vertica transaction ID for the batch session.
microbatch
VARCHAR
The name of the microbatch.
target_schema
VARCHAR
The name of the target schema.
target_table
VARCHAR
The name of the target table.
timeslice
INTERVAL
The amount of time spent in the KafkaSource operator.
batch_start
TIMESTAMP
The time the batch executed.
batch_end
TIMESTAMP
The time the batch completed.
last_batch_duration
INTERVAL
The length of time required to run the complete COPY statement.
last_batch_parallelism
INTEGER
The number of parallel COPY statements generated to process the microbatch during the last frame.
microbatch_sub_id
INTEGER
The identifier for the COPY statement that processed the microbatch.
consecutive_error_count
INTEGER
(Currently not used.) The number of times a microbatch has encountered an error on an attempt to load. This value increases over multiple attempts.
transaction_id
INTEGER
The identifier for the transaction within the session.
frame_start
TIMESTAMP
The time the frame started. A frame can contain multiple microbatches.
frame_end
TIMESTAMP
The time the frame completed.
Examples
This example shows typical rows from the stream_microbatch_history table.
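A query along the following lines summarizes the volume loaded per microbatch. This is a sketch that assumes the scheduler schema is named stream_config:

```sql
-- Total bytes and messages loaded per microbatch and source,
-- most recently completed frames first.
SELECT microbatch, source_name,
       SUM(partition_bytes)    AS total_bytes,
       SUM(partition_messages) AS total_messages
    FROM stream_config.stream_microbatch_history
    GROUP BY microbatch, source_name
    ORDER BY MAX(frame_end) DESC;
```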
7 - stream_microbatches
This table contains configuration data related to microbatches.
Column
Data Type
Description
id
INTEGER
The identification number of the microbatch.
microbatch
VARCHAR
The name of the microbatch.
target
INTEGER
The identification number of the target associated with the microbatch.
load_spec
INTEGER
The identification number of the load spec associated with the microbatch.
target_columns
VARCHAR
The table columns associated with the microbatch.
rejection_schema
VARCHAR
The schema that contains the rejection table.
rejection_table
VARCHAR
The table where Vertica stores messages that are rejected by the database.
max_parallelism
INTEGER
The number of parallel COPY statements the scheduler uses to process the microbatch.
enabled
BOOLEAN
When TRUE, the microbatch is enabled for use.
consumer_group_id
VARCHAR
The name of the Kafka consumer group to report loading progress to. This value is NULL if the microbatch reports its progress to the default consumer group for the scheduler. See Monitoring Vertica message consumption with consumer groups for more information.
Examples
This example shows a row from a typical stream_microbatches table.
=> select * from weblog_sched.stream_microbatches;
-[ RECORD 1 ]-----+----------
id | 750001
microbatch | weberrors
target | 750001
load_spec | 2250001
target_columns |
rejection_schema |
rejection_table |
max_parallelism | 1
enabled | t
consumer_group_id |
-[ RECORD 2 ]-----+----------
id | 1
microbatch | weblog
target | 1
load_spec | 1
target_columns |
rejection_schema |
rejection_table |
max_parallelism | 1
enabled | t
consumer_group_id | weblog_group
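Because the target and load_spec columns hold identification numbers, you can join this table to stream_targets and stream_load_specs to see a microbatch's full configuration. A sketch, assuming the scheduler schema is named stream_config:

```sql
-- Show each microbatch with its target table and load spec by joining
-- on the id columns described above.
SELECT mb.microbatch, t.target_schema, t.target_table, ls.load_spec
    FROM stream_config.stream_microbatches mb
    JOIN stream_config.stream_targets t ON mb.target = t.id
    JOIN stream_config.stream_load_specs ls ON mb.load_spec = ls.id;
```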
8 - stream_scheduler
This table contains metadata related to a single scheduler.
This table was renamed from kafka_config.kafka_scheduler. The eof_timeout_ms column that it previously contained has been removed.
Column
Data Type
Description
version
VARCHAR
The version of the scheduler.
frame_duration
INTERVAL
The length of time of the frame. The default is 00:00:10.
resource_pool
VARCHAR
The resource pool associated with this scheduler.
config_refresh
INTERVAL
The interval of time that the scheduler runs before applying any changes to its metadata, such as changes made using the --update option.
consumer_group_id
VARCHAR
The name of the Kafka consumer group to which the scheduler reports its progress in consuming messages. This value is NULL if the scheduler reports to the default consumer group named vertica-database_name. See Monitoring Vertica message consumption with consumer groups for more information.
Examples
This example shows a typical row in the stream_scheduler table.
=> SELECT * FROM weblog_sched.stream_scheduler;
-[ RECORD 1 ]------+-----------------------
version | v9.2.1
frame_duration | 00:05:00
resource_pool | weblog_pool
config_refresh | 00:05
new_source_policy | FAIR
pushback_policy | LINEAR
pushback_max_count | 5
auto_sync | t
consumer_group_id | vertica-consumer-group
9 - stream_scheduler_history
This table shows the history of launched scheduler instances.
This table was renamed from kafka_config.kafka_scheduler_history.
Column
Data Type
Description
elected_leader_time
TIMESTAMP
The time when this instance began scheduling operations.
host
VARCHAR
The host name of the machine running the scheduler instance.
launcher
VARCHAR
The name of the currently active scheduler instance.
Default: NULL
scheduler_id
INTEGER
The identification number of the scheduler.
version
VARCHAR
The version of the scheduler.
Examples
This example shows typical rows from the stream_scheduler_history table.
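A query such as the following sketch lists the most recently elected scheduler instances, assuming the scheduler schema is named stream_config:

```sql
-- Show the most recent scheduler instances, newest election first.
SELECT elected_leader_time, host, version
    FROM stream_config.stream_scheduler_history
    ORDER BY elected_leader_time DESC
    LIMIT 5;
```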
10 - stream_sources
This table contains metadata related to data streaming sources.
This table was formerly named kafka_config.kafka_sources.
Column
Data Type
Description
id
INTEGER
The identification number of the source.
source
VARCHAR
The name of the source.
cluster
INTEGER
The identification number of the cluster associated with the source.
partitions
INTEGER
The number of partitions in the source.
enabled
BOOLEAN
When TRUE, the source is enabled for use.
Examples
This example shows a typical row from the stream_sources table.
select * from stream_config.stream_sources;
-[ RECORD 1 ]--------------
id | 1
source | SourceFeed1
cluster | 1
partitions | 1
enabled | t
-[ RECORD 2 ]--------------
id | 250001
source | SourceFeed2
cluster | 1
partitions | 1
enabled | t
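Because the cluster column holds the identification number of a row in stream_clusters, you can join the two tables to list each source together with its cluster's hosts. A sketch, assuming the scheduler schema is named stream_config:

```sql
-- List each enabled source with the name and hosts of its cluster.
SELECT s.source, s.partitions, c.cluster, c.hosts
    FROM stream_config.stream_sources s
    JOIN stream_config.stream_clusters c ON s.cluster = c.id
    WHERE s.enabled;
```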
11 - stream_targets
This table contains the metadata for all Vertica target tables.
The table was formerly named kafka_config.kafka_targets.
Column
Data Type
Description
id
INTEGER
The identification number of the target table.
target_schema
VARCHAR
The name of the schema for the target table.
target_table
VARCHAR
The name of the target table.
Examples
This example shows typical rows from the stream_targets table.
=> SELECT * FROM stream_config.stream_targets;
-[ RECORD 1 ]-----+---------------------
id | 1
target_schema | public
target_table | stream_flex1
-[ RECORD 2 ]-----+---------------------
id | 2
target_schema | public
target_table | stream_flex2