Used with the MATCH clause, the functions in this section return additional data about the patterns found or returned. For example, you can use these functions to return values representing the name of the event or pattern that matched the input row, the sequential number of the match, or a partition-wide unique identifier for the instance of the pattern that matched.
Pattern matching is particularly useful for clickstream analysis where you might want to identify users' actions based on their Web browsing behavior (page clicks). A typical online clickstream funnel is:
Company home page -> product home page -> search -> results -> purchase online
Using the above clickstream funnel, you can search for a match on the user's sequence of web clicks and identify that the user:
GROUP BY and PARTITION BY expressions do not support window functions.
1 - EVENT_NAME
Returns a VARCHAR value representing the name of the event that matched the row.
Returns a VARCHAR value representing the name of the event that matched the row.
Syntax
EVENT_NAME()
Notes
Pattern matching functions must be used in MATCH clause syntax; for example, if you call EVENT_NAME() on its own, Vertica returns the following error message:
=> SELECT event_name();
ERROR: query with pattern matching function event_name must include a MATCH clause
The following statement analyzes users' browsing history on website2.com and identifies patterns where the user landed on website2.com from another Web site (Entry) and browsed to any number of other pages (Onsite) before making a purchase (Purchase). The query also outputs the values for EVENT_NAME(), which is the name of the event that matched the row.
SELECT uid,
sid,
ts,
refurl,
pageurl,
action,
event_name()
FROM clickstream_log
MATCH
(PARTITION BY uid, sid ORDER BY ts
DEFINE
Entry AS RefURL NOT ILIKE '%website2.com%' AND PageURL ILIKE '%website2.com%',
Onsite AS PageURL ILIKE '%website2.com%' AND Action='V',
Purchase AS PageURL ILIKE '%website2.com%' AND Action = 'P'
PATTERN
P AS (Entry Onsite* Purchase)
ROWS MATCH FIRST EVENT);
uid | sid | ts | refurl | pageurl | action | event_name
-----+-----+----------+----------------------+----------------------+--------+------------
1 | 100 | 12:00:00 | website1.com | website2.com/home | V | Entry
1 | 100 | 12:01:00 | website2.com/home | website2.com/floby | V | Onsite
1 | 100 | 12:02:00 | website2.com/floby | website2.com/shamwow | V | Onsite
1 | 100 | 12:03:00 | website2.com/shamwow | website2.com/buy | P | Purchase
2 | 100 | 12:10:00 | website1.com | website2.com/home | V | Entry
2 | 100 | 12:11:00 | website2.com/home | website2.com/forks | V | Onsite
2 | 100 | 12:13:00 | website2.com/forks | website2.com/buy | P | Purchase
(7 rows)
Returns a successful pattern match as an INTEGER value.
Returns a successful pattern match as an INTEGER value. The returned value is the ordinal position of a match within a partition.
Syntax
MATCH_ID()
Notes
Pattern matching functions must be used in MATCH clause syntax; for example, if you call MATCH_ID() on its own, Vertica returns the following error message:
=> SELECT match_id();
ERROR: query with pattern matching function match_id must include a MATCH clause
The following statement analyzes users' browsing history on a site called website2.com and identifies patterns where the user reached website2.com from another Web site (Entry in the MATCH clause) and browsed to any number of other pages (Onsite) before making a purchase (Purchase). The query also outputs values for the MATCH_ID(), which represents a sequential number of the match.
SELECT uid,
sid,
ts,
refurl,
pageurl,
action,
match_id()
FROM clickstream_log
MATCH
(PARTITION BY uid, sid ORDER BY ts
DEFINE
Entry AS RefURL NOT ILIKE '%website2.com%' AND PageURL ILIKE '%website2.com%',
Onsite AS PageURL ILIKE '%website2.com%' AND Action='V',
Purchase AS PageURL ILIKE '%website2.com%' AND Action = 'P'
PATTERN
P AS (Entry Onsite* Purchase)
ROWS MATCH FIRST EVENT);
uid | sid | ts | refurl | pageurl | action | match_id
----+-----+----------+----------------------+----------------------+--------+------------
1 | 100 | 12:00:00 | website1.com | website2.com/home | V | 1
1 | 100 | 12:01:00 | website2.com/home | website2.com/floby | V | 2
1 | 100 | 12:02:00 | website2.com/floby | website2.com/shamwow | V | 3
1 | 100 | 12:03:00 | website2.com/shamwow | website2.com/buy | P | 4
2 | 100 | 12:10:00 | website1.com | website2.com/home | V | 1
2 | 100 | 12:11:00 | website2.com/home | website2.com/forks | V | 2
2 | 100 | 12:13:00 | website2.com/forks | website2.com/buy | P | 3
(7 rows)
Returns an integer value that is a partition-wide unique identifier for the instance of the pattern that matched.
Returns an integer value that is a partition-wide unique identifier for the instance of the pattern that matched.
Syntax
PATTERN_ID()
Notes
Pattern matching functions must be used in MATCH clause syntax; for example, if call PATTERN_ID() on its own, Vertica returns the following error message:
=> SELECT pattern_id();
ERROR: query with pattern matching function pattern_id must include a MATCH clause
The following statement analyzes users' browsing history on website2.com and identifies patterns where the user landed on website2.com from another Web site (Entry) and browsed to any number of other pages (Onsite) before making a purchase (Purchase). The query also outputs values for PATTERN_ID(), which represents the partition-wide identifier for the instance of the pattern that matched.
SELECT uid,
sid,
ts,
refurl,
pageurl,
action,
pattern_id()
FROM clickstream_log
MATCH
(PARTITION BY uid, sid ORDER BY ts
DEFINE
Entry AS RefURL NOT ILIKE '%website2.com%' AND PageURL ILIKE '%website2.com%',
Onsite AS PageURL ILIKE '%website2.com%' AND Action='V',
Purchase AS PageURL ILIKE '%website2.com%' AND Action = 'P'
PATTERN
P AS (Entry Onsite* Purchase)
ROWS MATCH FIRST EVENT);
uid | sid | ts | refurl | pageurl | action | pattern_id
----+-----+----------+----------------------+----------------------+--------+------------
1 | 100 | 12:00:00 | website1.com | website2.com/home | V | 1
1 | 100 | 12:01:00 | website2.com/home | website2.com/floby | V | 1
1 | 100 | 12:02:00 | website2.com/floby | website2.com/shamwow | V | 1
1 | 100 | 12:03:00 | website2.com/shamwow | website2.com/buy | P | 1
2 | 100 | 12:10:00 | website1.com | website2.com/home | V | 1
2 | 100 | 12:11:00 | website2.com/home | website2.com/forks | V | 1
2 | 100 | 12:13:00 | website2.com/forks | website2.com/buy | P | 1
(7 rows)