FDELIMITEDPAIRPARSER
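Parses delimited data files. This parser provides a subset of the functionality of the fdelimitedparser parser. Use fdelimitedpairparser when the data you want to load specifies pairs of column names with data in each row.
This parser is for use in Flex tables only. All Flex parsers store the data as a single VMap in the LONG VARBINARY __raw__ column. If a data row is too large to fit in the column, it is rejected. Vertica supports NULL values when loading data with NULL-specified columns.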
Syntax
FDELIMITEDPAIRPARSER ( [parameter-name='value'[,...]] )
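Parameters
delimiter
- Specifies a single-character delimiter.
  Default: ' '
record_terminator
- Specifies a single-character record terminator.
  Default: newline
trim
- Boolean, specifies whether to trim white space from header names and key values.
  Default: true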
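To make the interaction of the three parameters concrete, the following is a minimal Python sketch of pair parsing. It is not Vertica's implementation: the function name parse_pairs, the pair_separator argument, and the choice of '=' to separate a key from its value are assumptions made for illustration, and a plain dict stands in for the VMap that the real parser stores in __raw__.

```python
def parse_pairs(data, delimiter=" ", record_terminator="\n", trim=True,
                pair_separator="="):
    """Illustrative only: turn delimited key/value text into one dict per record.

    pair_separator is a hypothetical argument; the '=' default is an assumption.
    """
    # The documented parameters are single characters, so enforce that here.
    for name, value in (("delimiter", delimiter),
                        ("record_terminator", record_terminator)):
        if len(value) != 1:
            raise ValueError(f"{name} must be a single character")

    rows = []
    for record in data.split(record_terminator):
        if not record.strip():
            continue  # skip empty records, such as the one after a final terminator
        row = {}
        for field in record.split(delimiter):
            if pair_separator not in field:
                continue  # assumed policy: ignore tokens that are not key/value pairs
            key, value = field.split(pair_separator, 1)
            if trim:
                # trim=True removes surrounding white space from keys and values
                key, value = key.strip(), value.strip()
            row[key] = value
        rows.append(row)
    return rows


sample = "eventId = 1219383333|priority = 8\neventId = 1218325417|priority = 5\n"
print(parse_pairs(sample, delimiter="|")[0])
# prints {'eventId': '1219383333', 'priority': '8'}
```

With trim=False, the same input would keep the surrounding spaces, so the first key becomes 'eventId ' rather than 'eventId'. In this sketch that difference decides whether a key matches a column name exactly.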
Examples
The following example shows how to create a sample Flex table for simple delimited data, with two real columns, eventId and priority.
- Create a table:
  => create flex table CEFData(eventId int default(eventId::int), priority int default(priority::int) );
  CREATE TABLE
- Load the sample delimited Micro Focus ArcSight log file into the CEFData table, using fdelimitedpairparser:
  => copy CEFData from '/home/release/kmm/flextables/sampleArcSight.txt' parser fdelimitedpairparser();
  Rows Loaded | 200
- After loading the sample data file, use maptostring() to display the virtual columns in the __raw__ column of CEFData:
  => select maptostring(__raw__) from CEFData limit 1;
                          maptostring
  -----------------------------------------------------------
   {
     "agentassetid" : "4-WwHuD0BABCCQDVAeX21vg==",
     "agentzone" : "3083",
     "agt" : "265723237",
     "ahost" : "svsvm0176",
     "aid" : "3tGoHuD0BABCCMDVAeX21vg==",
     "art" : "1099267576901",
     "assetcriticality" : "0",
     "at" : "snort_db",
     "atz" : "America/Los_Angeles",
     "av" : "5.3.0.19524.0",
     "cat" : "attempted-recon",
     "categorybehavior" : "/Communicate/Query",
     "categorydevicegroup" : "/IDS/Network",
     "categoryobject" : "/Host",
     "categoryoutcome" : "/Attempt",
     "categorysignificance" : "/Recon",
     "categorytechnique" : "/Scan",
     "categorytupledescription" : "An IDS observed a scan of a host.",
     "cnt" : "1",
     "cs2" : "3",
     "destinationgeocountrycode" : "US",
     "destinationgeolocationinfo" : "Richardson",
     "destinationgeopostalcode" : "75082",
     "destinationgeoregioncode" : "TX",
     "destinationzone" : "3133",
     "device product" : "Snort",
     "device vendor" : "Snort",
     "device version" : "1.8",
     "deviceseverity" : "2",
     "dhost" : "198.198.121.200",
     "dlat" : "329913940429",
     "dlong" : "-966644973754",
     "dst" : "3334896072",
     "dtz" : "America/Los_Angeles",
     "dvchost" : "unknown:eth1",
     "end" : "1364676323451",
     "eventid" : "1219383333",
     "fdevice product" : "Snort",
     "fdevice vendor" : "Snort",
     "fdevice version" : "1.8",
     "fdtz" : "America/Los_Angeles",
     "fdvchost" : "unknown:eth1",
     "lblstring2label" : "sig_rev",
     "locality" : "0",
     "modelconfidence" : "0",
     "mrt" : "1364675789222",
     "name" : "ICMP PING NMAP",
     "oagentassetid" : "4-WwHuD0BABCCQDVAeX21vg==",
     "oagentzone" : "3083",
     "oagt" : "265723237",
     "oahost" : "svsvm0176",
     "oaid" : "3tGoHuD0BABCCMDVAeX21vg==",
     "oat" : "snort_db",
     "oatz" : "America/Los_Angeles",
     "oav" : "5.3.0.19524.0",
     "originator" : "0",
     "priority" : "8",
     "proto" : "ICMP",
     "relevance" : "10",
     "rt" : "1099267573000",
     "severity" : "8",
     "shost" : "198.198.104.10",
     "signature id" : "[1:469]",
     "slat" : "329913940429",
     "slong" : "-966644973754",
     "sourcegeocountrycode" : "US",
     "sourcegeolocationinfo" : "Richardson",
     "sourcegeopostalcode" : "75082",
     "sourcegeoregioncode" : "TX",
     "sourcezone" : "3133",
     "src" : "3334891530",
     "start" : "1364676323451",
     "type" : "0"
   }
  (1 row)
- Select the eventID and priority real columns, along with two virtual columns, atz and destinationgeoregioncode:
  => select eventID, priority, atz, destinationgeoregioncode from CEFData limit 10;
    eventID   | priority |         atz         | destinationgeoregioncode
  ------------+----------+---------------------+--------------------------
   1218325417 |        5 | America/Los_Angeles |
   1219383333 |        8 | America/Los_Angeles | TX
   1219533691 |        9 | America/Los_Angeles | TX
   1220034458 |        5 | America/Los_Angeles | TX
   1220034578 |        9 | America/Los_Angeles |
   1220067119 |        5 | America/Los_Angeles | TX
   1220106960 |        5 | America/Los_Angeles | TX
   1220142122 |        5 | America/Los_Angeles | TX
   1220312009 |        5 | America/Los_Angeles | TX
   1220321355 |        5 | America/Los_Angeles | CA
  (10 rows)