Matches from regular expressions
You can load flex or columnar tables with the matched results of a regular expression, using the fregexparser
. This section describes some examples of using the options that the flex parsers support.
Sample regular expression
These examples use the following regular expression, which searches information that includes the timestamp
, date
, thread_name
, and thread_id
strings.
Caution
For display purposes, this sample regular expression adds new line characters to split long lines of text. To use this expression in a query, first copy and edit the example to remove any new line characters.This example expression loads any thread_id
hex value, regardless of whether it has a 0x
prefix, (<thread_id>(?:0x)?[0-9a-f]+).
Using regular expression matches for a flex table
You can load the results from a regular expression into a flex table, using the fregexparser
. For a complete example of doing so, see FREGEXPARSER.
Using fregexparser for columnar tables
This section illustrates how to load the results of a regular expression used with a sample log file for a Vertica database. By using an external table definition, the section presents an example of using fregexparser to load data into a columnar table. Using a flex table parser for a columnar tables gives you the capability to mix data loads in one table. For example, you can load the results of a regular expression in one session, and JSON data in another.
The following basic examples illustrate this usage.
-
Create a columnar table,
vlog
, with the following columns:=> CREATE TABLE vlog ( "text" varchar(2322), thread_id varchar(28), thread_name varchar(44), "time" varchar(46), component varchar(30), level varchar(20), transaction_id varchar(32), elevel varchar(20), enode varchar(34) );
-
Use COPY to load parts of a log file using the sample regular expression presented above, with the
fregexparser
. Be sure to remove any line characters from this expression example before trying it yourself:=> COPY v_log FROM '/home/dbadmin/data/flex/vertica.log' PARSER FRegexParser(pattern=
'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+) (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+) -?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: ) ?(?<text>.*)'
-
Query the
time
column:=> SELECT time FROM flogs limit 10; time ------------------------- 2014-04-02 04:02:02.613 2014-04-02 04:02:02.613 2014-04-02 04:02:02.614 2014-04-02 04:02:51.008 2014-04-02 04:02:51.010 2014-04-02 04:02:51.012 2014-04-02 04:02:51.012 2014-04-02 04:02:51.013 2014-04-02 04:02:51.014 2014-04-02 04:02:51.017 (10 rows)
Using external tables with fregexparser
By creating an external columnar table for your Vertica log file, querying the table will return updated log information. The following basic example illustrate this usage.
-
Create a columnar table,
vertica_log
, using theAS COPY
clause andfregexparser
to load matched results from the regular expression. For illustrative purposes, this regular expression has new line characters to split long text lines. Remove any line returns before testing with this expression:=> CREATE EXTERNAL TABLE public.vertica_log ( "text" varchar(2322), thread_id varchar(28), thread_name varchar(44), "time" varchar(46), component varchar(30), level varchar(20), transaction_id varchar(32), elevel varchar(20), enode varchar(34) ) AS COPY FROM '/home/dbadmin/data/vertica.log' PARSER FRegexParser(pattern=
'^(?<time>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d+) (?<thread_name>[A-Za-z ]+):(?<thread_id>(?:0x)?[0-9a-f]+) -?(?<transaction_id>[0-9a-f])?(?:[(?<component>\w+)] \<(?<level>\w+)\> )?(?:<(?<elevel>\w+)> @[?(?<enode>\w+)]?: ) ?(?<text>.*)'
-
Query from the external table to get updated results:
=> SELECT component, thread_id, time FROM vertica_log limit 10; component | thread_id | time -----------+------------+------------------------- Init | 0x16321430 | 2014-04-02 04:02:02.613 Init | 0x16321430 | 2014-04-02 04:02:02.613 Init | 0x16321430 | 2014-04-02 04:02:02.613 Init | 0x16321430 | 2014-04-02 04:02:02.613 Init | 0x16321430 | 2014-04-02 04:02:02.613 Init | 0x16321430 | 2014-04-02 04:02:02.613 Init | 0x16321430 | 2014-04-02 04:02:02.613 | 0x16321430 | 2014-04-02 04:02:02.614 | 0x16321430 | 2014-04-02 04:02:02.614 | 0x16321430 | 2014-04-02 04:02:02.614 (10 rows)