高级日志分词器

返回可包含二级分隔符的令牌。在令牌由空格或多种标点分隔的情况下可以使用此分词器。在通过添加二级分隔符定义分隔符方面,高级日志分词器能比基本日志分词器提供更大的粒度。这一方法通常适用于分析日志文件。

参数

示例

以下示例显示了如何在没有词干分析器的情况下使用高级日志分词器从表 foo 创建文本索引。

=> CREATE TABLE foo (id INT PRIMARY KEY NOT NULL,text VARCHAR(250));
=> COPY foo FROM STDIN;
End with a backslash and a period on a line by itself.
>> 1|2014-05-10 00:00:05.700433 %ASA-6-302013: Built outbound TCP connection 9986454 for outside:101.123.123.111/443 (101.123.123.111/443)
>> \.
=> CREATE PROJECTION foo_projection AS SELECT * FROM foo ORDER BY id
                                    SEGMENTED BY HASH(id) ALL NODES KSAFE;
=> CREATE TEXT INDEX indexfoo_AdvancedLogTokenizer ON foo (id, text)
                  TOKENIZER v_txtindex.AdvancedLogTokenizer(LONG VARCHAR) STEMMER NONE;
=> SELECT * FROM indexfoo_AdvancedLogTokenizer;
            token            | doc_id
-----------------------------+--------
 %ASA-6-302013:              |      1
 00                          |      1
 00:00:05.700433             |      1
 05                          |      1
 10                          |      1
 101                         |      1
 101.123.123.111/443         |      1
 111                         |      1
 123                         |      1
 2014                        |      1
 2014-05-10                  |      1
 302013                      |      1
 443                         |      1
 700433                      |      1
 9986454                     |      1
 ASA                         |      1
 Built                       |      1
 TCP                         |      1
 connection                  |      1
 for                         |      1
 outbound                    |      1
 outside                     |      1
 outside:101.123.123.111/443 |      1
(23 rows)