MAPREGEXEXTRACTOR

Extracts data with a regular expression and returns results as a VMap.

Extracts data with a regular expression and returns results as a VMap.

Syntax

MAPREGEXEXTRACTOR (record-value [ USING PARAMETERS param=value[,...] ])

Arguments

record-value
String containing a JSON or delimited format record on which to apply the regular expression.

Parameters

pattern
Regular expression used to extract the desired data.

Default: Empty string ('')

use_jit
Boolean, use just-in-time compiling when parsing the regular expression.

Default: false

record_terminator
Character used to separate input records.

Default: \n

logline_column
Destination column containing the full string that the regular expression matched.

Default: Empty string ('')

Examples

These examples use the following regular expression, which searches for information that includes the timestamp, date, thread_name, and thread_id strings.

This example expression loads any thread_id hex value, regardless of whether it has a 0x prefix, (<thread_id>(?:0x)?[0-9a-f]+).

'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
 (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
-?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)]
\<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
?(?<text>.*)'

The following examples may include newline characters for display purposes.

  1. Create a flex table, flogs:

    => CREATE FLEX TABLE flogs();
    CREATE TABLE
    
  2. Use COPY to load a sample log file (vertica.log), using the flex table fregexparser. Note that this example includes added line characters for displaying long text lines.

    => COPY flogs FROM '/home/dbadmin/tempdat/vertica.log' PARSER FREGEXPARSER(pattern='
    ^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+) (?<thread_name>[A-Za-z ]+):
    (?<thread_id>(?:0x)?[0-9a-f])-?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)]
    \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )?(?<text>.*)');
    Rows Loaded
    -------------
    81399
    (1 row)
    
  3. Use to return the results from calling MAPREGEXEXTRACTOR with a regular expression. The output returns the results of the function in string format.

    => SELECT MAPTOSTRING(MapregexExtractor(E'2014-04-02 04:02:51.011
    TM Moveout:0x2aab9000f860-a0000000002067 [Txn] <INFO>
    Begin Txn: a0000000002067 \'Moveout: Tuple Mover\'' using PARAMETERS
    pattern=
    '^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)
     (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+)
    -?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)]
    \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: )
    ?(?<text>.*)'
    
    )) FROM flogs where __identity__=13; maptostring -------------------------------------------------------------------------------------------------- { "component" : "Txn", "level" : "INFO", "text" : "Begin Txn: a0000000002067 'Moveout: Tuple Mover'", "thread_id" : "0x2aab9000f860", "thread_name" : "TM Moveout", "time" : "2014-04-02 04:02:51.011", "transaction_id" : "a0000000002067" } (1 row)

See also