STV_Create_Index

Creates a spatial index on a set of polygons to speed up spatial intersection with a set of points.

Creates a spatial index on a set of polygons to speed up spatial intersection with a set of points.

A spatial index is created from an input polygon set, which can be the result of a query. Spatial indexes are created in a global name space. Vertica uses a distributed plan whenever the input table or projection is segmented across nodes of the cluster.

The OVER() clause must be empty.

Behavior type

Immutable

Syntax

STV_Create_Index( gid, g
                   USING PARAMETERS index='index_name'
                                    [, overwrite={ true | false } ]
                                    [, max_mem_mb=maxmem_value]
                     [, skip_nonindexable_polygons={true | false } ] )
                 OVER()
                    [ AS (polygons, srid, min_x, min_y, max_x, max_y, info) ]

Arguments

gid
Name of an integer column that uniquely identifies the polygon. The gid cannot be NULL.
g
Name of a geometry or geography (WGS84) column or expression that contains polygons and multipolygons. Only polygon and multipolygon can be indexed. Other shape types are excluded from the index.

Parameters

index = 'index_name'
Name of the index, type VARCHAR. Index names cannot exceed 110 characters. The slash, backslash, and tab characters are not allowed in index names.
overwrite = [ true | false ]

Boolean, whether to overwrite the index, if an index exists. This parameter cannot be NULL.

Default: False

max_mem_mb = maxmem_value
A positive integer that assigns a limit to the amount of memory in megabytes that STV_Create_Index can allocate during index construction. On a multi-node database this is the memory limit per node. The default value is 256. Do not assign a value higher than the amount of memory in the GENERAL resource pool. For more information about this pool, see Monitoring resource pools.

Setting a value for max_mem_mb that is at or near the maximum memory available on the node can negatively affect your system's performance. For example, it could cause other queries to time out waiting for memory resources during index construction.

skip_nonindexable_polygons = [ true | false ]

(Optional) BOOLEAN

In rare cases, intricate polygons (for instance, with too high resolution or anomalous spikes) cannot be indexed. These polygons are considered non-indexable. When set to False, non-indexable polygons cause the index creation to fail. When set to True, index creation can succeed by excluding non-indexable polygons from the index.

To review the polygons that were not able to be indexed, use STV_Describe_Index with the parameter list_polygon.

Default: False

Returns

polygons
Number of polygons indexed.
SRID
Spatial reference system identifier.
min_x, min_y, max_x, max_y
Coordinates of the minimum bounding rectangle (MBR) of the indexed geometries. (min_x, min_y) are the south-west coordinates, and (max_x, max_y) are the north-east coordinates.
info
Lists the number of excluded spatial objects as well as their type that were excluded from the index.

Supported data types

Data Type GEOMETRY GEOGRAPHY (WGS84)
Point No No
Multipoint No No
Linestring No No
Multilinestring No No
Polygon Yes Yes
Multipolygon Yes No
GeometryCollection No No

Privileges

Any user with access to the STV_*_Index functions can describe, rename, or drop indexes created by any other user.

Recommendations

  • Segment large polygon tables across multiple nodes. Table segmentation causes index creation to run in parallel, leveraging the Massively Parallel Processing (MPP) architecture in Vertica. This significantly reduces execution time on large tables.

    Vertica recommends that you segment the table from which you are building the index when the total number of polygons is large.

  • STV_Create_Index can consume large amounts of processing time and memory.

    Vertica recommends that when indexing new data for the first time, you monitor memory usage to be sure it stays within safe limits. Memory usage depends on number of polygons, number of vertices, and the amount of overlap among polygons.

  • STV_Create_Index tries to allocate memory before it starts creating the index. If it cannot allocate enough memory, the function fails. If not enough memory is available, try the following:

    • Create the index at a time of less load on the system.

    • Avoid concurrent index creation.

    • Try segmenting the input table across the nodes of the cluster.

  • Ensure that all of the polygons you plan to index are valid polygons. STV_Create_Index and STV_Refresh_Index do not check polygon validity when building an index.

    For more information, see Ensuring polygon validity before creating or refreshing an index.

Limitations

  • Any indexes created prior to 23.4.x need to re-created.

  • Index creation fails if there are WGS84 polygons with vertices on the International Date Line (IDL) or the North and South Poles.

  • The backslash or tab characters are not allowed in index names.

  • Indexes cannot have names greater than 110 characters.

  • The following geometries are excluded from the index:

    • Non-polygons

    • Geometries with NULL identifiers

    • NULL (multi) polygon

    • EMPTY (multi) polygon

    • Invalid (multi) polygon

  • The following geographies are excluded from the index:

    • Polygons with holes
    • Polygons crossing the International Date Line
    • Polygons covering the north or south pole
    • Antipodal polygons

Usage tips

  • To cancel an STV_Create_Index run, use Ctrl + C.

  • If there are no valid polygons in the geom column, STV_Create_Index reports an error in vertica.log and stops index creation.

  • If index creation uses a large amount of memory, consider segmenting your data to utilize parallel index creation.

Examples

The following examples show how to use STV_Create_Index.

Create an index with a single literal argument:


=> SELECT STV_Create_Index(1, ST_GeomFromText('POLYGON((0 0,0 15.2,3.9 15.2,3.9 0,0 0))')
     USING PARAMETERS index='my_polygon') OVER();
polygons  | SRID | min_x | min_y | max_x | max_y | info
----------+------+-------+-------+-------+-------+------
        1 |    0 |     0 |     0 |   3.9 |  15.2 |
(1 row)

Create an index from a table:

=> CREATE TABLE pols (gid INT, geom GEOMETRY(1000));
CREATE TABLE
=> COPY pols(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1|POLYGON((-31 74,8 70,8 50,-36 53,-31 74))
>> 2|POLYGON((-38 50,4 13,11 45,0 65,-38 50))
>> 3|POLYGON((10 20,15 60,20 45,46 15,10 20))
>> 4|POLYGON((5 20,9 30,20 45,36 35,5 20))
>> 5|POLYGON((12 23,9 30,20 45,36 35,37 67,45 80,50 20,12 23))
>> \.
=> SELECT STV_Create_Index(gid, geom USING PARAMETERS index='my_polygons_1', overwrite=true,
    max_mem_mb=256) OVER() FROM pols;
 polygons | SRID | min_x | min_y | max_x | max_y | info
----------+------+-------+-------+-------+-------+------
        5 |    0 |   -38 |    13 |    50 |    80 |
(1 row)

Create an index in parallel from a partitioned table:

=> CREATE TABLE pols (p INT, gid INT, geom GEOMETRY(1000)) SEGMENTED BY HASH(p) ALL NODES;
CREATE TABLE
=> COPY pols (p, gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1|10|POLYGON((-31 74,8 70,8 50,-36 53,-31 74))
>> 1|11|POLYGON((-38 50,4 13,11 45,0 65,-38 50))
>> 3|12|POLYGON((-12 42,-12 42,27 48,14 26,-12 42))
>> \.
=> SELECT STV_Create_Index(gid, geom USING PARAMETERS index='my_polygons', overwrite=true,
    max_mem_mb=256) OVER() FROM pols;
 polygons | SRID | min_x | min_y | max_x | max_y | info
----------+------+-------+-------+-------+-------+------
        3 |    0 |   -38 |    13 |    27 |    74 |
(1 row)

See also