Top-k projections

A Top-K query returns the top k rows from partitions of selected rows.

A Top-K query returns the top k rows from partitions of selected rows. Top-K projections can significantly improve performance of Top-K queries. For example, you can define a table that stores gas meter readings with three columns: gas meter ID, time of meter reading, and the read value:


=> CREATE TABLE readings (
    meter_id INT,
    reading_date TIMESTAMP,
    reading_value FLOAT);

Given this table, the following Top-K query returns the five most recent meter readings for a given meter:


SELECT meter_id, reading_date, reading_value FROM readings
    LIMIT 5 OVER (PARTITION BY meter_id ORDER BY reading_date DESC);

To improve the performance of this query, you can create a Top-K projection, which is a special type of live aggregate projection:


=> CREATE PROJECTION readings_topk (meter_id, recent_date, recent_value)
    AS SELECT meter_id, reading_date, reading_value FROM readings
    LIMIT 5 OVER (PARTITION BY meter_id ORDER BY reading_date DESC);

After you create this Top-K projection and load its data (through START_REFRESH or REFRESH), Vertica typically redirects the query to the projection and returns with the pre-aggregated data.