KafkaJSONParser

The KafkaJSONParser parses JSON-formatted Kafka messages and loads them into a regular Vertica table or a Vertica flex table.

Syntax

KafkaJSONParser(
        [enforce_length=Boolean]
        [, flatten_maps=Boolean]
        [, flatten_arrays=Boolean]
        [, start_point=string]
        [, start_point_occurrence=integer]
        [, omit_empty_keys=Boolean]
        [, reject_on_duplicate=Boolean]
        [, reject_on_materialized_type_error=Boolean]
        [, reject_on_empty_key=Boolean]
        [, key_separator=char]
        [, suppress_nonalphanumeric_key_chars=Boolean]
        [, enable_chunker=Boolean]
        )
enforce_length
If set to TRUE, rejects the row if data being loaded is too wide to fit into its column. Defaults to FALSE, which truncates any data that is too wide to fit into its column.
flatten_maps
If set to TRUE, flattens all JSON maps.
flatten_arrays
If set to TRUE, flattens JSON arrays.
start_point
Specifies the key in the JSON data that the parser should parse. The parser only extracts data that is within the value associated with the start_point key. It parses the values of all instances of the start_point key within the data.
start_point_occurrence
Integer value indicating the occurrence of the key specified by the start_point parameter at which the parser should begin parsing. For example, if you set this value to 4, the parser only begins loading data from the fifth occurrence of the start_point key. This parameter has an effect only if you also supply the start_point parameter.
omit_empty_keys
If set to TRUE, omits any key from the load data that does not have a value set.
reject_on_duplicate
If set to TRUE, rejects any row that contains duplicate key names. Key names are case-insensitive, so the keys "mykey" and "MyKey" are considered duplicates.
reject_on_materialized_type_error
If set to TRUE, rejects the row if it contains a key that matches an existing materialized column but whose value cannot be mapped into that column's data type.
reject_on_empty_key
If set to TRUE, rejects any row containing a key without a value.
key_separator
A single character to use as the separator between key values instead of the default period (.) character.
suppress_nonalphanumeric_key_chars
If set to TRUE, replaces all non-alphanumeric characters in JSON key values with an underscore (_) character.
enable_chunker
If set to TRUE, can improve parsing performance, especially when parsing large or complex Kafka messages.

See JSON data for more information.
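As an illustration of how start_point, flatten_maps, and key_separator might be combined, the following sketch assumes a hypothetical message layout in which the useful data is nested under a payload key. The logs table, the server_log stream, and the kafka01 broker are the ones used in the example on this page. The message shape and the resulting key names are assumptions, not documented behavior.

```sql
-- Assumed shape of each Kafka message (hypothetical):
--   {"meta": {"version": 1},
--    "payload": {"host": "web01",
--                "request": {"path": "/index", "status": 200}}}

-- Parse only the value of the "payload" key, flatten nested maps, and
-- join nested key names with an underscore instead of the default period.
COPY logs SOURCE KafkaSource(stream='server_log|0|0',
                             stop_on_eof=true,
                             brokers='kafka01:9092')
   PARSER KafkaJSONParser(start_point='payload',
                          flatten_maps=true,
                          key_separator='_');
```

Under these assumptions, a nested key such as request.path would be expected to load as request_path, so a regular target table would need a column with that name to receive the value.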

The following example demonstrates loading JSON data from Kafka. The parameters in the statement configure the load as follows:

  • The data is loaded into the pre-existing table named logs.

  • KafkaSource streams the data from a single partition of the topic named server_log.

  • The Kafka broker for the data load is running on the host named kafka01 on port 9092.

  • KafkaSource stops loading data after either 10 seconds or on reaching the end of the stream, whichever happens first.

  • The KafkaJSONParser flattens any arrays or maps in the JSON data.

=> COPY logs SOURCE KafkaSource(stream='server_log|0|0',
                                stop_on_eof=true,
                                duration=interval '10 seconds',
                                brokers='kafka01:9092')
   PARSER KafkaJSONParser(flatten_arrays=True, flatten_maps=True);
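Because the parser can also load into a flex table, a variant of the statement above might look like the following sketch. The table names logs_flex and logs_rejections are hypothetical. The statement reuses the source settings from the example above, and it assumes that rows rejected by reject_on_empty_key can be captured with the standard COPY clause REJECTED DATA AS TABLE.

```sql
-- Hypothetical flex table; a flex table needs no column definitions.
CREATE FLEX TABLE logs_flex();

-- Same source settings as the example above. Rows that contain a key
-- without a value are rejected and saved to a hypothetical rejection table.
COPY logs_flex SOURCE KafkaSource(stream='server_log|0|0',
                                  stop_on_eof=true,
                                  duration=interval '10 seconds',
                                  brokers='kafka01:9092')
   PARSER KafkaJSONParser(flatten_arrays=true, reject_on_empty_key=true)
   REJECTED DATA AS TABLE logs_rejections;

-- Compute the keys found in the loaded messages, then list them.
SELECT compute_flextable_keys('logs_flex');
SELECT key_name, frequency, data_type_guess FROM logs_flex_keys;
```

Listing the keys after the load is one way to check which key names the parser produced before deciding which of them to materialize as real columns.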