FREGEXPARSER
Parses a regular expression, matching columns to the contents of the named regular expression groups.
This parser is for use in Flex tables only. All flex parsers store the data as a single VMap in the LONG VARBINAR_raw__
column. If a data row is too large to fit in the column, it is rejected. Vertica supports null values for loading data with NULL-specified columns.
Syntax
FREGEXPARSER ( pattern=[parameter-name='value'[,...]] )
Parameters
pattern
- Specifies the regular expression of data to match.
Default: Empty string (
""
) use_jit
- Boolean, specifies whether to use just-in-time compiling when parsing the regular expression.
Default: false
record_terminator
- Specifies the character used to separate input records.
Default:
\n
logline_column
- A string that captures the destination column containing the full string that the regular expression matched.
Default: Empty string (
""
)
Example
These examples use the following regular expression, which searches for information that includes the timestamp
, date
, thread_name
, and thread_id
strings.
Caution
For display purposes, this sample regular expression adds new line characters to split long lines of text. To use this expression in a query, first copy and edit the example to remove any new line characters.This example expression loads any thread_id
hex value, regardless of whether it has a 0x
prefix, (<thread_id>(?:0x)?[0-9a-f]+)
.
'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
(?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
-?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)]
\<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
?(?<text>.*)'
-
Create a flex table (
vlog
) to contain the results of a Vertica log file. For this example, we made a copy of a log file in the directory/home/dbadmin/data/vertica.log
:=> CREATE FLEX TABLE vlog1(); CREATE TABLE
-
Use the
fregexparser
with the sample regular expression to load data from the log file. Be sure to remove any line characters before using this expression shown here:=> COPY vlog1 FROM '/home/dbadmin/tempdat/KMvertica.log' PARSER FREGEXPARSER(pattern=
'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+) (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+) -?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: ) ?(?<text>.*)'
-
After successfully loading data, use the MAPTOSTRING() function with the table's
__raw__
column. The four rows (limt 4
) that the query returns are regular expression results of theKMvertica.log
file, parsed withfregexparser
. The output showsthread_id
values with a preceding0x
or without:=> SELECT maptostring(__raw__) FROM vlog1 LIMIT 4; maptostring ------------------------------------------------------------------------------------- { "text" : " [Init] <INFO> Log /home/dbadmin/VMart/v_vmart_node0001_catalog/vertica.log opened; #2", "thread_id" : "0x7f2157e287c0", "thread_name" : "Main", "time" : "2017-03-21 23:30:01.704" } { "text" : " [Init] <INFO> Processing command line: /opt/vertica/bin/vertica -D /home/dbadmin/VMart/v_vmart_node0001_catalog -C VMart -n v_vmart_node0001 -h 10.20.100.247 -p 5433 -P 4803 -Y ipv4", "thread_id" : "0x7f2157e287c0", "thread_name" : "Main", "time" : "2017-03-21 23:30:01.704" } { "text" : " [Init] <INFO> Starting up Vertica Analytic Database v8.1.1-20170321", "thread_id" : "7f2157e287c0", "thread_name" : "Main", "time" : "2017-03-21 23:30:01.704" } { "text" : " [Init] <INFO> Compiler Version: 4.8.2 20140120 (Red Hat 4.8.2-15)", "thread_id" : "7f2157e287c0", "thread_name" : "Main", "time" : "2017-03-21 23:30:01.704" } (4 rows)