FDELIMITEDPAIRPARSER

Parses delimited data files.

Parses delimited data files. This parser provides a subset of the functionality in the parser fdelimitedparser. Use the fdelimitedpairparser when the data you are loading specifies pairs of column names with data in each row.

This parser is for use in Flex tables only. All flex parsers store the data as a single VMap in the LONG VARBINAR_raw__ column. If a data row is too large to fit in the column, it is rejected. Vertica supports null values for loading data with NULL-specified columns.

Syntax

FDELIMITEDPAIRPARSER ( [parameter-name='value'[,...]] )

Parameters

delimiter
Specifies a single-character delimiter.

Default: ' '

record_terminator
Specifies a single-character record terminator.

Default: newline

trim
Boolean specifies whether to trim white space from header names and key values.

Default: true

Examples

The following example illustrates creating a sample flex table for simple delimited data, with two real columns, eventId and priority.

  1. Create a table:

    => create flex table CEFData(eventId int default(eventId::int), priority int default(priority::int) );
    CREATE TABLE
    
  2. Load a sample delimited OpenText ArcSight log file into the CEFData table, using the fcefparser:

    => copy CEFData from '/home/release/kmm/flextables/sampleArcSight.txt' parser fdelimitedpairparser();
    Rows Loaded | 200
    
  3. After loading the sample data file, use maptostring() to display the virtual columns in the __raw__ column of CEFData:

    => select maptostring(__raw__) from CEFData limit 1;                                                                                                                                                                                                                                        maptostring
    -----------------------------------------------------------
       "agentassetid" : "4-WwHuD0BABCCQDVAeX21vg==",
       "agentzone" : "3083",
       "agt" : "265723237",
       "ahost" : "svsvm0176",
       "aid" : "3tGoHuD0BABCCMDVAeX21vg==",
       "art" : "1099267576901",
       "assetcriticality" : "0",
       "at" : "snort_db",
       "atz" : "America/Los_Angeles",
       "av" : "5.3.0.19524.0",
       "cat" : "attempted-recon",
       "categorybehavior" : "/Communicate/Query",
       "categorydevicegroup" : "/IDS/Network",
       "categoryobject" : "/Host",
       "categoryoutcome" : "/Attempt",
       "categorysignificance" : "/Recon",
       "categorytechnique" : "/Scan",
       "categorytupledescription" : "An IDS observed a scan of a host.",
       "cnt" : "1",
       "cs2" : "3",
       "destinationgeocountrycode" : "US",
       "destinationgeolocationinfo" : "Richardson",
       "destinationgeopostalcode" : "75082",
       "destinationgeoregioncode" : "TX",
       "destinationzone" : "3133",
       "device product" : "Snort",
       "device vendor" : "Snort",
       "device version" : "1.8",
       "deviceseverity" : "2",
       "dhost" : "198.198.121.200",
       "dlat" : "329913940429",
       "dlong" : "-966644973754",
       "dst" : "3334896072",
       "dtz" : "America/Los_Angeles",
       "dvchost" : "unknown:eth1",
       "end" : "1364676323451",
       "eventid" : "1219383333",
       "fdevice product" : "Snort",
       "fdevice vendor" : "Snort",
       "fdevice version" : "1.8",
       "fdtz" : "America/Los_Angeles",
       "fdvchost" : "unknown:eth1",
       "lblstring2label" : "sig_rev",
       "locality" : "0",
       "modelconfidence" : "0",
       "mrt" : "1364675789222",
       "name" : "ICMP PING NMAP",
       "oagentassetid" : "4-WwHuD0BABCCQDVAeX21vg==",
       "oagentzone" : "3083",
       "oagt" : "265723237",
       "oahost" : "svsvm0176",
       "oaid" : "3tGoHuD0BABCCMDVAeX21vg==",
       "oat" : "snort_db",
       "oatz" : "America/Los_Angeles",
       "oav" : "5.3.0.19524.0",
       "originator" : "0",
       "priority" : "8",
       "proto" : "ICMP",
       "relevance" : "10",
       "rt" : "1099267573000",
       "severity" : "8",
       "shost" : "198.198.104.10",
       "signature id" : "[1:469]",
       "slat" : "329913940429",
       "slong" : "-966644973754",
       "sourcegeocountrycode" : "US",
       "sourcegeolocationinfo" : "Richardson",
       "sourcegeopostalcode" : "75082",
       "sourcegeoregioncode" : "TX",
       "sourcezone" : "3133",
       "src" : "3334891530",
       "start" : "1364676323451",
       "type" : "0"
    }
    
    (1 row)
    
  4. Select the eventID and priority real columns, along with two virtual columns, atz and destinationgeoregioncode:

    
    =>  select eventID, priority, atz, destinationgeoregioncode from CEFData limit 10;
      eventID   | priority |         atz         | destinationgeoregioncode
    ------------+----------+---------------------+--------------------------
     1218325417 |        5 | America/Los_Angeles |
     1219383333 |        8 | America/Los_Angeles | TX
     1219533691 |        9 | America/Los_Angeles | TX
     1220034458 |        5 | America/Los_Angeles | TX
     1220034578 |        9 | America/Los_Angeles |
     1220067119 |        5 | America/Los_Angeles | TX
     1220106960 |        5 | America/Los_Angeles | TX
     1220142122 |        5 | America/Los_Angeles | TX
     1220312009 |        5 | America/Los_Angeles | TX
     1220321355 |        5 | America/Los_Angeles | CA
    (10 rows)
    

See also