A Top-K query returns the top k
rows from partitions of selected rows. Top-K projections can significantly improve performance of Top-K queries. For example, you can define a table that stores gas meter readings with three columns: gas meter ID, time of meter reading, and the read value:
=> CREATE TABLE readings (
meter_id INT,
reading_date TIMESTAMP,
reading_value FLOAT);
Given this table, the following Top-K query returns the five most recent meter readings for a given meter:
SELECT meter_id, reading_date, reading_value FROM readings
LIMIT 5 OVER (PARTITION BY meter_id ORDER BY reading_date DESC);
To improve the performance of this query, you can create a Top-K projection, which is a special type of live aggregate projection:
=> CREATE PROJECTION readings_topk (meter_id, recent_date, recent_value)
AS SELECT meter_id, reading_date, reading_value FROM readings
LIMIT 5 OVER (PARTITION BY meter_id ORDER BY reading_date DESC);
After you create this Top-K projection and load its data (through START_REFRESH or REFRESH), Vertica typically redirects the query to the projection and returns with the pre-aggregated data.