This is the multi-page printable view of this section.
Click here to print.
Return to the regular view of this page.
Geospatial analytics
Vertica provides functions that allows you to manipulate complex two- and three-dimensional spatial objects.
Vertica provides functions that allows you to manipulate complex two- and three-dimensional spatial objects. These functions follow the Open Geospatial Consortium (OGC) standards. Vertica also provides data types and SQL functions that allow you to specify and store spatial objects in a database according to OGC standards.
Convert well-known text (WKT) and well-known binary (WKB)
Convert WKT and WKB.
Optimized spatial joins
Perform fast spatial joins using ST_Intersects and STV_Intersects.
Load and export spatial data from shapefiles
Easily load and export shapefiles.
Store and retrieve objects
Determine if:
-
An object contains self-intersection or self-tangency points.
-
One object is entirely within another object, such as a point within a polygon.
Test the relationships between objects
For example, if they intersect or touch:
Calculate
-
Shortest distance between two objects.
-
Size of an object (length, area).
-
Centroid for one or more objects.
-
Buffer around one or more objects.
1 - Best practices for geospatial analytics
Vertica recommends the following best practices when performing geospatial analytics in Vertica.
Vertica recommends the following best practices when performing geospatial analytics in Vertica.
Recommendation |
Details |
Use the minimum column size for spatial data. |
Performance degrades as column widths increase. When creating columns for your spatial data, use the smallest size column that can accommodate your data. For example, use GEOMETRY(85) for point data. |
Use GEOMETRY types where possible. |
Performance of functions on GEOGRAPHY types is slower than functions that support GEOMETRY types. Use GEOMETRY types where possible. |
To improve the performance of the following functions, sort projections on spatial columns:
|
You may improve the pruning efficiency of these functions by sorting the projection on the GEOMETRY column. However, sorting on a large GEOMETRY column may slow down data load. |
Spatial joins with points and polygons
Vertica provides two ways to identify whether a set of points intersect with a set of polygons. Depending on the size of your data set, choose the approach that gives the best performance.
For a detailed example of best practices with spatial joins, see Best practices for spatial joins.
Recommendation |
Details |
Use ST_Intersects to intersect a constant geometry with a set of geometries. |
When you intersect a set of geometries with a geometry, use the ST_Intersects function. Express the constant geometry argument as a WKT or WKB. For example:
ST_Intersects(geom,ST_GeomFromText( 'POLYGON((43.1 50.1,43.1 59.0, 48.9 59.0,43.1 50.1))'))
For more information, see Performing spatial joins with ST_Intersects.
|
Create a spatial index only when performing spatial joins with STV_Intersect. |
Spatial indexes should only be used with STV_Intersect. Creating a spatial index and then performing spatial joins with ST_Intersects will not improve performance. |
Use the STV_Intersect function when you intersect a set of points with a set of polygons. |
Determine if a set of points intersects with a set of polygons in a medium to large data set. First, create a spatial index using STV_Create_Index. Then, use one of the STV_Intersect functions to return the set of pairs that intersect.
Spatial indexes provide the best performance for accessing a large number of polygons.
|
When using the STV_Intersect transform function, partition the data and use an OVER(PARTITION BEST) clause. |
The STV_Intersect transform function does not require that you partition the data. However, you may improve performance by partitioning the data and using an OVER(PARTITION BEST) clause. |
Spatial indexes
The STV_Create_Index function can consume large amounts of processing time and memory. When you index new data for the first time, monitor memory usage to be sure it stays within safe limits. Memory usage depends on:
Recommendation |
Details |
Segment polygon data when a table contains a large number of polygons. |
Segmenting the data allows the index creation process to run in parallel. This is advantageous because sometimes STV_Create_Index tasks cannot be completed when large tables that are not segmented prior to index creation. |
Adjust STV_Create_Index parameters as needed for memory allocation and CPU usage. |
The max_mem_mb parameter can affect the resource usage of STV_Create_Index. max_mem_mb assigns a limit to the amount of memory that STV_Create_Index can allocate.
Default: 256
Valid values: Any value less than or equal to the amount of memory in the GENERAL resource pool. Assigning a higher value results in an error.
|
Make changes if STV_Create_Index cannot allocate 300 MB memory. |
Before STV_Create_Index starts creating the index, it tries to allocate about 300 MB of memory. If that much memory is not available, the function fails. If you get a failure message, try these solutions:
-
Create the index at a time of less load on the system.
-
Avoid concurrent index creation.
-
Add more memory to your system.
|
Create misplaced indexes again, if needed. |
When you back up your Vertica database, spatial index files are not included. If you misplace an index, use STV_Create_Index to re-create it. |
Use STV_Refresh_Index to add new or updated polygons to an existing index. |
Instead of rebuilding your spatial index each time you add new or updated polygons to a table, you can use STV_Refresh_Index to append the polygons to your existing spatial index. |
Checking polygon validity
2 - Spatial objects
Vertica implements several data types for storing spatial objects, Well-Known Text (WKT) strings, and Well-Known Binary (WKB) representations.
Vertica implements several data types for storing spatial objects, Well-Known Text (WKT) strings, and Well-Known Binary (WKB) representations. These data types include:
2.1 - Supported spatial objects
Vertica supports two spatial data types. These data types store two- and three-dimensional spatial objects in a table column:
The maximum size of a GEOMETRY
or GEOGRAPHY
data type is 10,000,000 bytes (10 MB). You cannot use either data type as a table's primary key.
2.2 - Spatial reference identifiers (SRIDs)
A spatial reference identifier (SRID) is an integer value that represents a method for projecting coordinates on the plane.
A spatial reference identifier (SRID) is an integer value that represents a method for projecting coordinates on the plane. A SRID is metadata that indicates the coordinate system in which a spatial object is defined.
Geospatial functions using Geometry arguments must contain the same SRID. If the functions do not contain the same SRID, then the query returns an error.
For example, in this query the two points have different SRIDs. As a result the query returns an error:
=> SELECT ST_Distance(ST_GeomFromText('POINT(34 9)',2749), ST_GeomFromText('POINT(70 12)', 3359));
ERROR 5861: Error calling processBlock() in User Function ST_Distance at [src/Distance.cpp:65],
error code: 0, message: Geometries with different SRIDs found: 2749, 3359
Supported SRIDs
Vertica supports SRIDs derived from the EPSG standards. Geospatial functions using Geometry arguments must use supported SRIDs when performing calculations. SRID values of 0 to 232-1 are valid. Queries with SRID values outside of this range will return an error.
3 - Working with spatial objects in tables
3.1 - Defining table columns for spatial data
To define columns to contain GEOMETRY and GEOGRAPHY data, use this command:.
To define columns to contain GEOMETRY and GEOGRAPHY data, use this command:
=> CREATE TABLE [[db-name.]schema.]table-name (
column-name GEOMETRY[(length)],
column-name GEOGRAPHY[(length)]);
If you omit the length specification, the default column size is 1 MB. The maximum column size is 10 MB. The upper limit is not enforced, but the geospatial functions can only accept or return spatial data up to 10 MB.
You cannot modify the size or data type of a GEOMETRY or GEOGRAPHY column after creation. If the column size you created is not sufficient, create a new column with the desired size. Then copy the data from the old column, and drop the old column from the table.
You cannot import data to or export data from tables that contain spatial data from another Vertica database.
Important
A column width that is too large could impact performance. Use a column width that fits the data without being excessively large. See
STV_MemSize.
3.2 - Exporting spatial data from a table
You can export spatial data from a table in your Vertica database to a shapefile.
You can export spatial data from a table in your Vertica database to a shapefile.
To export spatial data from a table to a shapefile:
-
As the superuser., set the shapefile export directory.
=> SELECT STV_SetExportShapefileDirectory(USING PARAMETERS path = '/home/geo/temp');
STV_SetExportShapefileDirectory
------------------------------------------------------------
SUCCESS. Set shapefile export directory: [/home/geo/temp]
(1 row)
-
Export your spatial data to a shapefile.
=> SELECT STV_Export2Shapefile(*
USING PARAMETERS shapefile = 'visualizations/city-data.shp',
shape = 'Polygon') OVER() FROM spatial_data;
Rows Exported | File Path
---------------+----------------------------------------------------------------
185873 | v_geo-db_node0001: /home/geo/temp/visualizations/city-data.shp
(1 row)
-
The value asterisk (*) is the equivalent to listing all columns in the FROM clause.
-
You can specify sub-directories when exporting your shapefile.
-
Your shapefile must end with the file extension .shp.
-
Verify that three files now appear in the shapefile export directory.
$ ls
city-data.dbf city-data.shp city-data.shx
3.3 - Identifying null spatial objects
You can identify null GEOMETRY and GEOGRAPHY objects using the Vertica IS NULL and IS NOT NULL constructs.
You can identify null GEOMETRY and GEOGRAPHY objects using the Vertica IS NULL and IS NOT NULL constructs.
This example uses the following table, where the row with id
=2 has a null value in the geog
field.
=> SELECT id, ST_AsText(geom), ST_AsText(geog) FROM locations
ORDER BY 1 ASC;
id | ST_AsText | ST_AsText
----+----------------------------------+--------------------------------------
1 | POINT (2 3) | POINT (-85 15)
2 | POINT (4 5) |
3 | POLYGON ((-1 2, 0 3, 1 2, -1 2)) | POLYGON ((-24 12, -15 23, -20 27, -24 12))
4 | LINESTRING (-1 2, 1 5) | LINESTRING (-42.74 23.98, -62.19 23.78)
(4 rows)
Identify all the rows that have a null geog
value:
=> SELECT id, ST_AsText(geom), (ST_AsText(geog) IS NULL) FROM locations
ORDER BY 1 ASC;
id | ST_AsText | ?column?
----+----------------------------------+----------
1 | POINT (2 3) | f
2 | POINT (4 5) | t
3 | POLYGON ((-1 2, 0 3, 1 2, -1 2)) | f
4 | LINESTRING (-1 2, 1 5) | f
(4 rows)
Identify the rows where the geog
value is not null:
=> SELECT id, ST_AsText(geom), (ST_AsText(geog) IS NOT NULL) FROM locations
ORDER BY 1 ASC;
id | st_astext | ?column?
----+----------------------------------+----------
1 | POINT (2 3) | t
2 | POINT (4 5) | f
3 | LINESTRING (-1 2, 1 5) | t
4 | POLYGON ((-1 2, 0 3, 1 2, -1 2)) | t
(4 rows)
3.4 - Loading spatial data from shapefiles
Vertica provides the capability to load and parse spatial data that is stored in shapefiles.
Vertica provides the capability to load and parse spatial data that is stored in shapefiles. Shapefiles describe points, lines, and polygons. A shapefile is made up of three required files; all three files must be present and in the same directory to define the geometries:
-
.shp—Contains the geometry data.
-
.shx—Contains the positional index of the geometry.
-
.dbf—Contains the attributes for each geometry.
To load spatial data from a shapefile:
-
Use STV_ShpCreateTable to generate a CREATE TABLE statement.
=> SELECT STV_ShpCreateTable ( USING PARAMETERS file = '/home/geo/temp/shp-files/spatial_data.shp')
OVER() AS spatial_data;
spatial_data
----------------------------------
CREATE TABLE spatial_data(
gid IDENTITY(64) PRIMARY KEY,
uniq_id INT8,
geom GEOMETRY(85)
);
(5 rows)
-
Create the table.
=> CREATE TABLE spatial_data(
gid IDENTITY(64) PRIMARY KEY,
uniq_id INT8,
geom GEOMETRY(85));
-
Load the shapefile.
=> COPY spatial_data WITH SOURCE STV_ShpSource(file='/home/geo/temp/shp-files/spatial_data.shp')
PARSER STV_ShpParser();
Rows Loaded
-------------
10
(1 row)
Supported shapefile shape types
The following table lists the shapefile shape types that Vertica supports.
Shapefile Shape Type |
Supported |
Null shape |
Yes |
Point |
Yes |
Polyline |
Yes |
Polygon |
Yes |
MultiPoint |
Yes |
PointZ |
No |
PolylineZ |
No |
PolygonZ |
No |
MultiPointZ |
No |
PointM |
No |
PolylineM |
No |
PolygonM |
No |
MultiPointM |
No |
MultiPatch |
No |
3.5 - Loading spatial data into tables using COPY
You can load spatial data into a table in Vertica using a COPY statement.
You can load spatial data into a table in Vertica using a COPY statement.
To load data into Vertica using a COPY statement:
-
Create a table.
=> CREATE TABLE spatial_data (id INTEGER, geom GEOMETRY(200));
CREATE TABLE
-
Create a text file named spatial.dat
with the following data.
1|POINT(2 3)
2|LINESTRING(-1 2, 1 5)
3|POLYGON((-1 2, 0 3, 1 2, -1 2))
-
Use COPY to load the data into the table.
=> COPY spatial_data (id, gx FILLER LONG VARCHAR(605), geom AS ST_GeomFromText(gx)) FROM LOCAL 'spatial.dat';
Rows Loaded
-------------
3
(1 row)
The statement specifies a LONG VARCHAR(32000000) filler, which is the maximum size of WKT. You must specify a filler value large enough to hold the largest WKT you want to insert into the table.
3.6 - Retrieving spatial data from a table as well-known text (WKT)
GEOMETRY and GEOGRAPHY data is stored in Vertica tables as LONG VARBINARY, which isn't human readable.
GEOMETRY and GEOGRAPHY data is stored in Vertica tables as LONG VARBINARY, which isn't human readable. You can use ST_AsText to return the spatial data as Well-Known Text (WKT).
To return spatial data as WKT:
=> SELECT id, ST_AsText(geom) AS WKT FROM spatial_data;
id | WKT
----+----------------------------------
1 | POINT (2 3)
2 | LINESTRING (-1 2, 1 5)
3 | POLYGON ((-1 2, 0 3, 1 2, -1 2))
(3 rows)
3.7 - Working with GeoHash data
Vertica supports GeoHashes.
Vertica supports GeoHashes. A GeoHash is a geocoding system for hierarchically encoding increasingly granular spatial references. Each additional character in a GeoHash drills down to a smaller section of a map.
You can use Vertica to generate spatial data from GeoHashes and GeoHashes from spatial data. Vertica supports the following functions for use with GeoHashes:
For example, to generate a full precision and partial precision GeoHash from a single point.
=> SELECT ST_GeoHash(ST_GeographyFromText('POINT(3.14 -1.34)')), LENGTH(ST_GeoHash(ST_GeographyFromText('POINT(3.14 -1.34)'))),
ST_GeoHash(ST_GeographyFromText('POINT(3.14 -1.34)') USING PARAMETERS numchars=5) partial_hash;
ST_GeoHash | LENGTH | partial_hash
----------------------+--------+--------------
kpf0rkn3zmcswks75010 | 20 | kpf0r
(1 row)
This example shows how to generate a GeoHash from a multipoint point object. The returned polygon is a geometry object of the smallest tile that encloses that GeoHash.
=> SELECT ST_AsText(ST_GeomFromGeoHash(ST_GeoHash(ST_GeomFromText('MULTIPOINT(0 0, 0.0002 0.0001)')))) AS region_1,
ST_AsText(ST_GeomFromGeoHash(ST_GeoHash(ST_GeomFromText('MULTIPOINT(0.0001 0.0001, 0.0003 0.0002)')))) AS region_2;
-[ RECORD 1 ]---------------------------------------------------------------------------------------------
region_1 | POLYGON ((0 0, 0.00137329101562 0, 0.00137329101562 0.00137329101562, 0 0.00137329101562, 0 0))
region_2 | POLYGON ((0 0, 0.010986328125 0, 0.010986328125 0.0054931640625, 0 0.0054931640625, 0 0))
3.8 - Spatial joins with ST_Intersects and STV_Intersect
Spatial joins allow you to identify spatial relationships between two sets of spatial data.
Spatial joins allow you to identify spatial relationships between two sets of spatial data. For example, you can use spatial joins to:
-
Calculate the density of mobile calls in various regions to determine the location of a new cell phone tower.
-
Identify homes that fall within the impact zone of a hurricane.
-
Calculate the number of users who live within a certain ZIP code.
-
Calculate the number of customers in a retail store at any given time.
3.8.1 - Best practices for spatial joins
Use these best practices to improve overall performance and optimize your spatial queries.
Use these best practices to improve overall performance and optimize your spatial queries.
Best practices for using spatial joins in Vertica include:
-
Table segmentation to speed up index creation
-
Adequately sizing a geometry column to store point data
-
Loading Well-Known Text (WKT) directly into a Geometry column, using STV_GeometryPoint in a COPY statement
-
Using OVER (PARTITION BEST) with STV_Intersect transform queries
Best practices example
Before performing the steps in the following example, download place_output.csv.zip
from the Vertica Place GitHub repository (https://github.com/vertica/Vertica-Geospatial). You need to use the data set from this repository.
-
Create the table for the polygons. Use a GEOMETRY column width that fits your data without being excessively large. A good column-width fit improves performance. In addition, segmenting the table by HASH provides the advantages of parallel computation.
=> CREATE TABLE artworks (gid int, g GEOMETRY(700)) SEGMENTED BY HASH(gid) ALL NODES;
-
Use a copy statement with ST_Buffer to create and load the polygons on which to run the intersect. By using ST_Buffer in your copy statement, you can use that function to create the polygons.
=> COPY artworks(gid, gx FILLER LONG VARCHAR, g AS ST_Buffer(ST_GeomFromText(gx),8)) FROM STDIN DELIMITER ',';
>> 1, POINT(10 45)
>> 2, POINT(25 45)
>> 3, POINT(35 45)
>> 4, POINT(35 15)
>> 5, POINT(30 5)
>> 6, POINT(15 5)
>> \.
-
Create a table for the location data, represented by points. You can store point data in a GEOMETRY column of 100 bytes. Avoid over-fitting your GEOMETRY column. Doing so can significantly degrade spatial intersection performance. Also, segment this table by HASH, to take advantage of parallel computation.
=> CREATE TABLE usr_data (gid identity, usr_id int, date_time timestamp, g GEOMETRY(100))
SEGMENTED BY HASH(gid) ALL NODES;
-
During the copy statement, transform the raw location data to GEOMETRY data. You must perform this transformation because your location data needs to use the GEOMETRY data type. Use the function STV_GeometryPoint to transform the x and y columns of the source table.
=> COPY usr_data (usr_id, date_time, x FILLER LONG VARCHAR,
y FILLER LONG VARCHAR, g AS STV_GeometryPoint(x, y))
FROM LOCAL 'place_output.csv' DELIMITER ',' ENCLOSED BY '';
-
Create the spatial index for the polygons. This index helps you speed up intersection calculations.
=> SELECT STV_Create_Index(gid, g USING PARAMETERS index='art_index', overwrite=true) OVER() FROM artworks;
-
Write an analytic query that returns the number of intersections per polygon. Specify that Vertica ignore any usr_id
that intersects less than 20 times with a given polygon.
=> SELECT pol_gid,
COUNT(DISTINCT(usr_id)) AS count_user_visit
FROM
(SELECT pol_gid,
usr_id,
COUNT(usr_id) AS user_points_in
FROM
(SELECT STV_Intersect(usr_id, g USING PARAMETERS INDEX='art_index') OVER(PARTITION BEST) AS (usr_id,
pol_gid)
FROM usr_data
WHERE date_time BETWEEN '2014-07-02 09:30:20' AND '2014-07-02 17:05:00') AS c
GROUP BY pol_gid,
usr_id HAVING COUNT(usr_id) > 20) AS real_visits
GROUP BY pol_gid
ORDER BY count_user_visit DESC;
Optimizations in the example query
This query has the following optimizations:
-
The time predicated appears in the subquery.
-
Using the location data table avoids the need for an expensive join.
-
The query uses OVER (PARTITION BEST), to improve performance by partitioning the data.
-
The user_points_in
provides an estimate of the combined time spent intersecting with the artwork by all visitors.
3.8.2 - Ensuring polygon validity before creating or refreshing an index
When Vertica creates or updates a spatial index it does not check polygon validity.
When Vertica creates or updates a spatial index it does not check polygon validity. To prevent getting invalid results when you query your spatial index, you should check the validity of your polygons prior to creating or updating your spatial index.
The following example shows you how to check the validity of polygons.
-
Create a table and load spatial data.
=> CREATE TABLE polygon_validity_test (gid INT, geom GEOMETRY);
CREATE TABLE
=> COPY polygon_validity_test (gid, gx FILLER LONG VARCHAR, geom AS St_GeomFromText(gx)) FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 2|POLYGON((-31 74,8 70,8 50,-36 53,-31 74))
>> 3|POLYGON((-38 50,4 13,11 45,0 65,-38 50))
>> 4|POLYGON((-12 42,-12 42,27 48,14 26,-12 42))
>> 5|POLYGON((0 0,1 1,0 0,2 1,1 1,0 0))
>> 6|POLYGON((3 3,2 2,2 1,2 3,3 3))
>> \.
-
Use ST_IsValid and STV_IsValidReason to find any invalid polygons.
=> SELECT gid, ST_IsValid(geom), STV_IsValidReason(geom) FROM polygon_validity_test;
gid | ST_IsValid | STV_IsValidReason
-----+------------+------------------------------------------
4 | t |
6 | f | Self-intersection at or near POINT (2 1)
2 | t |
3 | t |
5 | f | Self-intersection at or near POINT (0 0)
(5 rows)
Now that we have identifed the invalid polygons in our table, there are a couple different ways you can handle the invalid polygons when creating or refreshing a spatial index.
Filtering invalid polygons using a WHERE clause
This method is slower than filtering before creating an index because it checks the validity of each polygon at execution time.
The following example shows you how to exclude invalid polygons using a WHERE clause.
```
=> SELECT STV_Create_Index(gid, geom USING PARAMETERS index = 'valid_polygons') OVER()
FROM polygon_validity_test
WHERE ST_IsValid(geom) = 't';
```
Filtering invalid polygons before creating or refreshing an index
This method is faster than filtering using a WHERE clause because you incur the performance cost prior to building the index.
The following example shows you how to exclude invalid polygons by creating a new table excluding invalid polygons.
```
=> CREATE TABLE polygon_validity_clean AS
SELECT *
FROM polygon_validity_test
WHERE ST_IsValid(geom) = 't';
CREATE TABLE
=> SELECT STV_Create_Index(gid, geom USING PARAMETERS index = 'valid_polygons') OVER()
FROM polygon_validity_clean;
```
3.8.3 - STV_Intersect: scalar function vs. transform function
The STV_Intersect functions are similar in purpose, but you use them differently.
The STV_Intersect
functions are similar in purpose, but you use them differently.
STV_Intersect Function Type |
Description |
Performance |
Scalar |
Matches a point to a polygon. If several polygons contain the point, this function returns a gid value. The result is a polygon gid or, if no polygon contains the point, the result is NULL. |
Eliminates points that do not intersect with any indexed polygons, avoiding unnecessary comparisons. |
Transform |
Matches a point to all the polygons that contain it. When a point does not intersect with any polygon in the index, the function returns no rows. |
Processes all input points regardless of whether or not they intersect with the indexed polygons. |
In the following example, the STV_Intersect
scalar function compares the points in the points
table to the polygons in a spatial index named my_polygons
. STV_Intersect
returns all points and polygons that match exactly:
=> SELECT gid AS pt_gid
STV_Intersect(geom USING PARAMETERS index='my_polygons') AS pol_gid
FROM points ORDER BY pt_gid;
pt_gid | pol_gid
--------+---------
100 | 2
101 |
102 | 2
103 |
104 |
105 | 3
106 |
107 |
(8 rows)
The following example shows how to use the STV_Intersect
transform function to return information about the three point-polygon pairs that match and each of the polygons they match:
=> SELECT STV_Intersect(gid, geom
USING PARAMETERS index='my_polygons')
OVER (PARTITION BEST) AS (pt_gid, pol_id)
FROM points;
pt_gid | pol_id
--------+--------
100 | 1
100 | 2
100 | 3
102 | 2
105 | 3
(3 rows)
See also
3.8.4 - Performing spatial joins with STV_Intersect functions
Suppose you want to process a medium-to-large spatial data set and determine which points intersect with which polygons.
Suppose you want to process a medium-to-large spatial data set and determine which points intersect with which polygons. In that case, first create a spatial index using STV_Create_Index
. A spatial index provides efficient access to the set of polygons.
Then, use the STV_Intersect
scalar or transform function to identify which point-polygon pairs match.
3.8.4.1 - Spatial indexes and STV_Intersect
Before performing a spatial join using one of the STV_Intersect functions, you must first run STV_Create_Index to create a database object that contains information about polygons.
Before performing a spatial join using one of the STV_Intersect
functions, you must first run STV_Create_Index
to create a database object that contains information about polygons. This object is called a spatial index of the set of polygons. The spatial index improves the time it takes for the STV_Intersect
functions to access the polygon data.
Vertica creates spatial indexes in a global space. Thus, any user with access to the STV_*_Index
functions can describe, rename, or drop indexes created by any other user.
Vertica provides functions that work with spatial indexes:
3.8.5 - When to use ST_Intersects vs. STV_Intersect
Vertica provides two capabilities to identify whether a set of points intersect with a set of polygons.
Vertica provides two capabilities to identify whether a set of points intersect with a set of polygons. Depending on the size of your data set, choose the approach that gives the best performance:
-
When comparing a set of geometries to a single geometry to see if they intersect, use the ST_Intersects function.
-
To determine if a set of points intersects with a set of polygons in a medium-to-large data set, first create a spatial index using STV_Create_Index
. Then, use one of the STV_Intersect
functions to return the set of pairs that intersect.
Note
You can only perform spatial joins on GEOMETRY data.
3.8.5.1 - Performing spatial joins with ST_Intersects
The ST_Intersects function determines if two GEOMETRY objects intersect or touch at a single point.
The ST_Intersects function determines if two GEOMETRY objects intersect or touch at a single point.
Use ST_Intersects when you want to identify if a small set of geometries in a column intersect with a given geometry.
Example
The following example uses ST_Intersects to compare a column of point geometries to a single polygon. The table that contains the points has 1 million rows.
ST_Intersects returns only the points that intersect with the polygon. Those points represent about 0.01% of the points in the table:
=> CREATE TABLE points_1m(gid IDENTITY, g GEOMETRY(100)) ORDER BY g;
=> COPY points_1m(wkt FILLER LONG VARCHAR(100), g AS ST_GeomFromText(wkt))
FROM LOCAL '/data/points.dat';
Rows Loaded
-------------
1000000
(1 row)
=> SELECT ST_AsText(g) FROM points_1m WHERE
ST_Intersects
(
g,
ST_GeomFromText('POLYGON((-71 42, -70.9 42, -70.9 42.1, -71 42.1, -71 42))')
);
st_astext
----------------------------
POINT (-70.97532 42.03538)
POINT (-70.97421 42.0376)
POINT (-70.99004 42.07538)
POINT (-70.99477 42.08454)
POINT (-70.99088 42.08177)
POINT (-70.98643 42.07593)
POINT (-70.98032 42.07982)
POINT (-70.95921 42.00982)
POINT (-70.95115 42.02177)
...
(116 rows)
Vertica recommends that you test the intersections of two columns of geometries by creating a spatial index. Use one of the STV_Intersect functions as described in STV_Intersect: scalar function vs. transform function.
4 - Working with spatial objects from client applications
The Vertica client driver libraries provide interfaces for connecting your client applications to your Vertica database.
The Vertica client driver libraries provide interfaces for connecting your client applications to your Vertica database. The drivers simplify exchanging data for loading, report generation, and other common database tasks.
There are three separate client drivers:
-
Open Database Connectivity (ODBC)—The most commonly-used interface for third-party applications and clients written in C, Python, PHP, Perl, and most other languages.
-
Java Database Connectivity (JDBC)—Used by clients written in the Java programming language.
-
ActiveX Data Objects for .NET (ADO.NET)—Used by clients developed using Microsoft's .NET Framework and written in C#, Visual Basic .NET, and other .NET languages.
Vertica Place supports the following new data types:
-
LONG VARCHAR
-
LONG VARBINARY
-
GEOMETRY
-
GEOGRAPHY
The client driver libraries support these data types; the following sections describe that support and provide examples.
4.1 - Using LONG VARCHAR and LONG VARBINARY data types with ODBC
The ODBC drivers support the LONG VARCHAR and LONG VARBINARY data types similarly to VARCHAR and VARBINARY data types.
The ODBC drivers support the LONG VARCHAR and LONG VARBINARY data types similarly to VARCHAR and VARBINARY data types. When binding input or output parameters to a LONG VARCHAR or LONG VARBINARY column in a query, use the SQL_LONGVARCHAR and SQL_LONGVARBINARY constants to set the column's data type. For example, to bind an input parameter to a LONG VARCHAR column, you would use a statement that looks like this:
rc = SQLBindParameter(hdlStmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_LONGVARCHAR,
80000, 0, (SQLPOINTER)myLongString, sizeof(myLongString), NULL);
Note
Do not use inefficient encoding formats for LONG VARBINARY and LONG VARCHAR values. Vertica cannot load encoded values larger than 32MB, even if the decoded value is less than 32 MB in size. For example, Vertica returns an error if you attempt to load a 32MB LONG VARBINARY value encoded in octal format, since the octal encoding quadruples the size of the value (each byte is converted into a backslash followed by three digits).
4.2 - Using LONG VARCHAR and LONG VARBINARY data types with JDBC
Using LONG VARCHAR and LONG VARBINARY data types in a JDBC client application is similar to using VARCHAR and VARBINARY data types.
Using LONG VARCHAR and LONG VARBINARY data types in a JDBC client application is similar to using VARCHAR and VARBINARY data types. The JDBC driver transparently handles the conversion (for example, between a Java String
object and a LONG VARCHAR).
The following example code demonstrates inserting and retrieving a LONG VARCHAR string. It uses the JDBC Types class to determine the data type of the string returned by Vertica, although it does not actually need to know whether the database column is a LONG VARCHAR or just a VARCHAR in order to retrieve the value.
import java.sql.*;
import java.util.Properties;
public class LongVarcharExample {
public static void main(String[] args) {
try {
Class.forName("com.vertica.jdbc.Driver");
} catch (ClassNotFoundException e) {
System.err.println("Could not find the JDBC driver class.");
e.printStackTrace();
return;
}
Properties myProp = new Properties();
myProp.put("user", "ExampleUser");
myProp.put("password", "password123");
Connection conn;
try {
conn = DriverManager.getConnection(
"jdbc:vertica://VerticaHost:5433/ExampleDB",
myProp);
// establish connection and make a table for the data.
Statement stmt = conn.createStatement();
// How long we want the example string to be. This is
// larger than can fit into a traditional VARCHAR (which is limited
// to 65000.
int length = 100000;
// Create a table with a LONG VARCHAR column that can store
// the string we want to insert.
stmt.execute("DROP TABLE IF EXISTS longtable CASCADE");
stmt.execute("CREATE TABLE longtable (text LONG VARCHAR(" + length
+ "))");
// Build a long string by appending an integer to a string builder
// until we hit the size limit. Will result in a string
// containing 01234567890123....
StringBuilder sb = new StringBuilder(length);
for (int i = 0; i < length; i++)
{
sb.append(i % 10);
}
String value = sb.toString();
System.out.println("String value is " + value.length() +
" characters long.");
// Create the prepared statement
PreparedStatement pstmt = conn.prepareStatement(
"INSERT INTO longtable (text)" +
" VALUES(?)");
try {
// Insert LONG VARCHAR value
System.out.println("Inserting LONG VARCHAR value");
pstmt.setString(1, value);
pstmt.addBatch();
pstmt.executeBatch();
// Query the table we created to get the value back.
ResultSet rs = null;
rs = stmt.executeQuery("SELECT * FROM longtable");
// Get metadata about the result set.
ResultSetMetaData rsmd = rs.getMetaData();
// Print the type of the first column. Should be
// LONG VARCHAR. Also check it against the Types class, to
// recognize it programmatically.
System.out.println("Column #1 data type is: " +
rsmd.getColumnTypeName(1));
if (rsmd.getColumnType(1) == Types.LONGVARCHAR) {
System.out.println("It is a LONG VARCHAR");
} else {
System.out.println("It is NOT a LONG VARCHAR");
}
// Print out the string length of the returned value.
while (rs.next()) {
// Use the same getString method to get the value that you
// use to get the value of a VARCHAR.
System.out.println("Returned string length: " +
rs.getString(1).length());
}
} catch (SQLException e) {
System.out.println("Error message: " + e.getMessage());
return; // Exit if there was an error
}
// Cleanup
conn.close();
} catch (SQLException e) {
e.printStackTrace();
}
}
}
Note
Do not use inefficient encoding formats for LONG VARBINARY and LONG VARCHAR values. Vertica cannot load encoded values larger than 32MB, even if the decoded value is less than 32 MB in size. For example, Vertica returns an error if you attempt to load a 32MB LONG VARBINARY value encoded in octal format, since the octal encoding quadruples the size of the value (each byte is converted into a backslash followed by three digits).
4.3 - Using GEOMETRY and GEOGRAPHY data types in ODBC
Vertica GEOMETRY and GEOGRAPHY data types are backed by LONG VARBINARY native types and ODBC client applications treat them as binary data.
Vertica GEOMETRY and GEOGRAPHY data types are backed by LONG VARBINARY native types and ODBC client applications treat them as binary data. However, these data types have a format that is unique to Vertica. To manipulate this data in your C++ application, you must use the functions in Vertica that convert them to a recognized format.
To convert a WKT or WKB to the GEOMETRY or GEOGRAPHY format, use one of the following SQL functions:
To convert a GEOMETRY or GEOGRAPHY object to its corresponding WKT or WKB, use one of the following SQL functions:
-
ST_AsText—Converts a GEOMETRY or GEOGRAPHY object to a WKT, returns a LONGVARCHAR.
-
ST_AsBinary—Converts a GEOMETRY or GEOGRAPHY object to a WKB, returns a LONG VARBINARY.
The following code example converts WKT data into GEOMETRY data using ST_GeomFromText
and stores it in a table. Later, this example retrieves the GEOMETRY data from the table and converts it to WKT and WKB format using ST_AsText
and ST_AsBinary
.
// Compile on Linux using:
// g++ -g -I/opt/vertica/include -L/opt/vertica/lib64 -lodbc -o SpatialData SpatialData.cpp
// Some standard headers
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <sstream>
// Only needed for Windows clients
// #include <windows.h>
// Standard ODBC headers
#include <sql.h>
#include <sqltypes.h>
#include <sqlext.h>
// Helper function to print SQL error messages.
template <typename HandleT>
void reportError(int handleTypeEnum, HandleT hdl)
{
// Get the status records.
SQLSMALLINT i, MsgLen;
SQLRETURN ret2;
SQLCHAR SqlState[6], Msg[SQL_MAX_MESSAGE_LENGTH];
SQLINTEGER NativeError;
i = 1;
printf("\n");
while ((ret2 = SQLGetDiagRec(handleTypeEnum, hdl, i, SqlState, &NativeError,
Msg, sizeof(Msg), &MsgLen)) != SQL_NO_DATA) {
printf("error record %d\n", i);
printf("sqlstate: %s\n", SqlState);
printf("detailed msg: %s\n", Msg);
printf("native error code: %d\n\n", NativeError);
i++;
}
exit(EXIT_FAILURE); // bad form... but Ok for this demo
}
int main()
{
// Set up the ODBC environment
SQLRETURN ret;
SQLHENV hdlEnv;
ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hdlEnv);
assert(SQL_SUCCEEDED(ret));
// Tell ODBC that the application uses ODBC 3.
ret = SQLSetEnvAttr(hdlEnv, SQL_ATTR_ODBC_VERSION,
(SQLPOINTER) SQL_OV_ODBC3, SQL_IS_UINTEGER);
assert(SQL_SUCCEEDED(ret));
// Allocate a database handle.
SQLHDBC hdlDbc;
ret = SQLAllocHandle(SQL_HANDLE_DBC, hdlEnv, &hdlDbc);
assert(SQL_SUCCEEDED(ret));
// Connect to the database
printf("Connecting to database.\n");
const char *dsnName = "ExampleDB";
const char* userID = "dbadmin";
const char* passwd = "password123";
ret = SQLConnect(hdlDbc, (SQLCHAR*)dsnName,
SQL_NTS,(SQLCHAR*)userID,SQL_NTS,
(SQLCHAR*)passwd, SQL_NTS);
if(!SQL_SUCCEEDED(ret)) {
printf("Could not connect to database.\n");
reportError<SQLHDBC>(SQL_HANDLE_DBC, hdlDbc);
} else {
printf("Connected to database.\n");
}
// Disable AUTOCOMMIT
ret = SQLSetConnectAttr(hdlDbc, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF,
SQL_NTS);
// Set up a statement handle
SQLHSTMT hdlStmt;
SQLAllocHandle(SQL_HANDLE_STMT, hdlDbc, &hdlStmt);
// Drop any previously defined table.
ret = SQLExecDirect(hdlStmt, (SQLCHAR*)"DROP TABLE IF EXISTS polygons",
SQL_NTS);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
// Run query to create a table to hold a geometry.
ret = SQLExecDirect(hdlStmt,
(SQLCHAR*)"CREATE TABLE polygons(id INTEGER PRIMARY KEY, poly GEOMETRY);",
SQL_NTS);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
// Create the prepared statement. This will insert data into the
// table we created above. It uses the ST_GeomFromText function to convert the
// string-formatted polygon definition to a GEOMETRY datat type.
printf("Creating prepared statement\n");
ret = SQLPrepare (hdlStmt,
(SQLTCHAR*)"INSERT INTO polygons(id, poly) VALUES(?, ST_GeomFromText(?))",
SQL_NTS) ;
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
SQLINTEGER id = 0;
int numBatches = 5;
int rowsPerBatch = 10;
// Polygon definition as a string.
char polygon[] = "polygon((1 1, 1 2, 2 2, 2 1, 1 1))";
// Bind variables to the parameters in the prepared SQL statement
ret = SQLBindParameter(hdlStmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
0, 0, &id, 0 , NULL);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT,hdlStmt);}
// Bind polygon string to the geometry column
SQLBindParameter(hdlStmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_LONGVARCHAR,
strlen(polygon), 0, (SQLPOINTER)polygon, strlen(polygon), NULL);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT,hdlStmt);}
// Execute the insert
ret = SQLExecute(hdlStmt);
if(!SQL_SUCCEEDED(ret)) {
reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);
} else {
printf("Executed batch.\n");
}
// Commit the transaction
printf("Committing transaction\n");
ret = SQLEndTran(SQL_HANDLE_DBC, hdlDbc, SQL_COMMIT);
if(!SQL_SUCCEEDED(ret)) {
reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);
} else {
printf("Committed transaction\n");
}
// Now, create a query to retrieve the geometry.
ret = SQLAllocHandle(SQL_HANDLE_STMT, hdlDbc, &hdlStmt);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
printf("Getting data from table.\n");
// Execute a query to get the id, raw geometry data, and
// the geometry data as a string. Uses the ST_AsText SQL function to
// format raw data back into a string polygon definition
ret = SQLExecDirect(hdlStmt,
(SQLCHAR*)"select id,ST_AsBinary(poly),ST_AsText(poly) from polygons ORDER BY id;",
SQL_NTS);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT,hdlStmt);}
SQLINTEGER idval;
// 10MB buffer to hold the raw data from the geometry (10Mb is the maximum
// length of a GEOMETRY)
SQLCHAR* polygonval = (SQLCHAR*)malloc(10485760);
SQLLEN polygonlen, polygonstrlen;
// Buffer to hold a LONGVARCHAR that can result from converting the
// geometry to a string.
SQLTCHAR* polygonstr = (SQLTCHAR*)malloc(33554432);
// Get the results of the query and print each row.
do {
ret = SQLFetch(hdlStmt);
if (SQL_SUCCEEDED(ret)) {
// ID column
ret = SQLGetData(hdlStmt, 1, SQL_C_LONG, &idval, 0, NULL);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
printf("id: %d\n",idval);
// The WKB format geometry data
ret = SQLGetData(hdlStmt, 2, SQL_C_BINARY, polygonval, 10485760,
&polygonlen);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
printf("Polygon in WKB format: ");
// Print each byte of polygonval buffer in hex format.
for (int z = 0; z < polygonlen; z++)
printf("%02x ",polygonval[z]);
printf("\n");
// Geometry data formatted as a string.
ret = SQLGetData(hdlStmt, 3, SQL_C_TCHAR, polygonstr, 33554432, &polygonstrlen);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
printf("Polygon in WKT format: %s\n", polygonstr);
}
} while(SQL_SUCCEEDED(ret));
free(polygonval);
free(polygonstr);
// Clean up
printf("Free handles.\n");
ret = SQLFreeHandle(SQL_HANDLE_STMT, hdlStmt);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
ret = SQLFreeHandle(SQL_HANDLE_DBC, hdlDbc);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
ret = SQLFreeHandle(SQL_HANDLE_ENV, hdlEnv);
if (!SQL_SUCCEEDED(ret)) {reportError<SQLHDBC>(SQL_HANDLE_STMT, hdlStmt);}
exit(EXIT_SUCCESS);
}
The output of running the above example is:
Connecting to database.
Connected to database.
Creating prepared statement
Executed batch.
Committing transaction
Committed transaction
Getting data from table.
id: 0
Polygon in WKB format: 01 03 00 00 00 01 00 00 00 05 00 00 00 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 f0 3f
Polygon in WKT format: POLYGON ((1 1, 1 2, 2 2, 2 1, 1 1))
Free handles.
Note
Do not use inefficient encoding formats for LONG VARBINARY and LONG VARCHAR values. Vertica cannot load encoded values larger than 32 MB, even if the decoded value is less than 32 MB in size. For example, Vertica returns an error if you attempt to load a 32 MB LONG VARBINARY value encoded in octal format, since the octal encoding quadruples the size of the value (each byte is converted into a backslash followed by three digits).
4.4 - Using GEOMETRY and GEOGRAPHY data types in JDBC
Vertica GEOMETRY and GEOGRAPHY data types are backed by LONG VARBINARY native types and JDBC client applications treat them as binary data.
Vertica GEOMETRY and GEOGRAPHY data types are backed by LONG VARBINARY native types and JDBC client applications treat them as binary data. However, these data types have a format that is unique to Vertica. To manipulate this data in your Java application, you must use the functions in Vertica that convert them to a recognized format.
To convert a WKT or WKB to the GEOMETRY or GEOGRAPHY format, use one of the following SQL functions:
To convert a GEOMETRY or GEOGRAPHY object to its corresponding WKT or WKB, use one of the following SQL functions:
-
ST_AsText—Converts a GEOMETRY or GEOGRAPHY object to a WKT, returns a LONGVARCHAR.
-
ST_AsBinary—Converts a GEOMETRY or GEOGRAPHY object to a WKB, returns a LONG VARBINARY.
The following code example converts WKT and WKB data into GEOMETRY data using ST_GeomFromText
and ST_GeomFromWKB
and stores it in a table. Later, this example retrieves the GEOMETRY data from the table and converts it to WKT and WKB format using ST_AsText
and ST_AsBinary
.
import java.io.InputStream;
import java.io.Reader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
public class GeospatialDemo
{
public static void main(String [] args) throws Exception
{
Class.forName("com.vertica.jdbc.Driver");
Connection conn =
DriverManager.getConnection("jdbc:vertica://localhost:5433/db",
"user", "password");
conn.setAutoCommit(false);
Statement stmt = conn.createStatement();
stmt.execute("CREATE TABLE polygons(id INTEGER PRIMARY KEY, poly GEOMETRY)");
int id = 0;
int numBatches = 5;
int rowsPerBatch = 10;
//batch inserting WKT data
PreparedStatement pstmt = conn.prepareStatement("INSERT INTO polygons
(id, poly) VALUES(?, ST_GeomFromText(?))");
for(int i = 0; i < numBatches; i++)
{
for(int j = 0; j < rowsPerBatch; j++)
{
//Insert your own WKT data here
pstmt.setInt(1, id++);
pstmt.setString(2, "polygon((1 1, 1 2, 2 2, 2 1, 1 1))");
pstmt.addBatch();
}
pstmt.executeBatch();
}
conn.commit();
pstmt.close();
//batch insert WKB data
pstmt = conn.prepareStatement("INSERT INTO polygons(id, poly)
VALUES(?, ST_GeomFromWKB(?))");
for(int i = 0; i < numBatches; i++)
{
for(int j = 0; j < rowsPerBatch; j++)
{
//Insert your own WKB data here
byte [] wkb = getWKB();
pstmt.setInt(1, id++);
pstmt.setBytes(2, wkb);
pstmt.addBatch();
}
pstmt.executeBatch();
}
conn.commit();
pstmt.close();
//selecting data as WKT
ResultSet rs = stmt.executeQuery("select ST_AsText(poly) from polygons");
while(rs.next())
{
String wkt = rs.getString(1);
Reader wktReader = rs.getCharacterStream(1);
//process the wkt as necessary
}
rs.close();
//selecting data as WKB
rs = stmt.executeQuery("select ST_AsBinary(poly) from polygons");
while(rs.next())
{
byte [] wkb = rs.getBytes(1);
InputStream wkbStream = rs.getBinaryStream(1);
//process the wkb as necessary
}
rs.close();
//binding parameters in predicates
pstmt = conn.prepareStatement("SELECT id FROM polygons WHERE
ST_Contains(ST_GeomFromText(?), poly)");
pstmt.setString(1, "polygon((1 1, 1 2, 2 2, 2 1, 1 1))");
rs = pstmt.executeQuery();
while(rs.next())
{
int pk = rs.getInt(1);
//process the results as necessary
}
rs.close();
conn.close();
}
}
4.5 - Using GEOMETRY and GEOGRAPHY data types in ADO.NET
Vertica GEOMETRY and GEOGRAPHY data types are backed by LONG VARBINARY native types and ADO.NET client applications treat them as binary data.
Vertica GEOMETRY and GEOGRAPHY data types are backed by LONG VARBINARY native types and ADO.NET client applications treat them as binary data. However, these data types have a format that is unique to Vertica. To manipulate this data in your C# application, you must use the functions in Vertica that convert them to a recognized format.
To convert a WKT or WKB to the GEOMETRY or GEOGRAPHY format, use one of the following SQL functions:
To convert a GEOMETRY or GEOGRAPHY object to its corresponding WKT or WKB, use one of the following SQL functions:
-
ST_AsText—Converts a GEOMETRY or GEOGRAPHY object to a WKT, returns a LONGVARCHAR.
-
ST_AsBinary—Converts a GEOMETRY or GEOGRAPHY object to a WKB, returns a LONG VARBINARY.
The following C# code example converts WKT data into GEOMETRY data using ST_GeomFromText
and stores it in a table. Later, this example retrieves the GEOMETRY data from the table and converts it to WKT and WKB format using ST_AsText
and ST_AsBinary
.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using Vertica.Data.VerticaClient;
namespace ConsoleApplication
{
class Program
{
static void Main(string[] args)
{
VerticaConnectionStringBuilder builder =
new VerticaConnectionStringBuilder();
builder.Host = "VerticaHost";
builder.Database = "VMart";
builder.User = "ExampleUser";
builder.Password = "password123";
VerticaConnection _conn = new
VerticaConnection(builder.ToString());
_conn.Open();
VerticaCommand command = _conn.CreateCommand();
command.CommandText = "DROP TABLE IF EXISTS polygons";
command.ExecuteNonQuery();
command.CommandText =
"CREATE TABLE polygons (id INTEGER PRIMARY KEY, poly GEOMETRY)";
command.ExecuteNonQuery();
// Prepare to insert a polygon using a prepared statement. Use the
// ST_GeomFromtText SQl function to convert from WKT to GEOMETRY.
VerticaTransaction txn = _conn.BeginTransaction();
command.CommandText =
"INSERT into polygons VALUES(@id, ST_GeomFromText(@polygon))";
command.Parameters.Add(new
VerticaParameter("id", VerticaType.BigInt));
command.Parameters.Add(new
VerticaParameter("polygon", VerticaType.VarChar));
command.Prepare();
// Set the values for the parameters
command.Parameters["id"].Value = 0;
//
command.Parameters["polygon"].Value =
"polygon((1 1, 1 2, 2 2, 2 1, 1 1))";
// Execute the query to insert the value
command.ExecuteNonQuery();
// Now query the table
VerticaCommand query = _conn.CreateCommand();
query.CommandText =
"SELECT id, ST_AsText(poly), ST_AsBinary(poly) FROM polygons;";
VerticaDataReader dr = query.ExecuteReader();
while (dr.Read())
{
Console.WriteLine("ID: " + dr[0]);
Console.WriteLine("Polygon WKT format data type: "
+ dr.GetDataTypeName(1) +
" Value: " + dr[1]);
// Get the WKB format of the polygon and print it out as hex.
Console.Write("Polygon WKB format data type: "
+ dr.GetDataTypeName(2));
Console.WriteLine(" Value: "
+ BitConverter.ToString((byte[])dr[2]));
}
_conn.Close();
}
}
}
The example code prints the following on the system console:
ID: 0
Polygon WKT format data type: LONG VARCHAR Value: POLYGON ((1 1, 1 2,
2 2, 2 1,1 1))
Polygon WKB format data type: LONG VARBINARY Value: 01-03-00-00-00-01
-00-00-00-05-00-00-00-00-00-00-00-00-00-F0-3F-00-00-00-00-00-00-F0-3F
-00-00-00-00-00-00-F0-3F-00-00-00-00-00-00-00-40-00-00-00-00-00-00-00
-40-00-00-00-00-00-00-00-40-00-00-00-00-00-00-00-40-00-00-00-00-00-00
-F0-3F-00-00-00-00-00-00-F0-3F-00-00-00-00-00-00-F0-3F
5 - OGC spatial definitions
Using Vertica requires an understanding of the Open Geospatial Consortium (OGC) concepts and capabilities.
Using Vertica requires an understanding of the Open Geospatial Consortium (OGC) concepts and capabilities. For more information, see the OGC Simple Feature Access Part 1 - Common Architecture specification.
5.1 - Spatial classes
Vertica supports several classes of objects, as defined in the OGC standards.
Vertica supports several classes of objects, as defined in the OGC standards.
5.1.1 - Point
A location in two-dimensional space that is identified by one of the following:.
A location in two-dimensional space that is identified by one of the following:
A point has dimension 0 and no boundary.
Examples
The following example uses a GEOMETRY point:
=> CREATE TABLE point_geo (gid int, geom GEOMETRY(100));
CREATE TABLE
=> COPY point_geo(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter ',';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1, POINT(3 5)
>>\.
=> SELECT gid, ST_AsText(geom) FROM point_geo;
gid | ST_AsText
-----+-------------
1 | POINT (3 5)
(1 row)
The following example uses a GEOGRAPHY point:
=> CREATE TABLE point_geog (gid int, geog geography(100));
CREATE TABLE
=> COPY point_geog(gid, gx filler LONG VARCHAR, geog AS ST_GeographyFromText(gx)) FROM stdin delimiter ',';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1, POINT(42 71)
>>\.
=> SELECT gid, ST_AsText(geog) FROM point_geog;
gid | ST_AsText
-----+---------------
1 | POINT (42 71)
(1 row)
5.1.2 - Multipoint
A set of one or more points.
A set of one or more points. A multipoint object has dimension 0 and no boundary.
Examples
The following example uses a GEOMETRY multipoint:
=> CREATE TABLE mpoint_geo (gid int, geom GEOMETRY(1000));
CREATE TABLE
=> COPY mpoint_geo(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|MULTIPOINT(4 7, 8 10)
>>\.
=> SELECT gid, ST_AsText(geom) FROM mpoint_geo;
gid | st_astext
-----+-----------------------
1 | MULTIPOINT (7 8, 6 9)
(1 row)
The following example uses a GEOGRAPHY multipoint:
=> CREATE TABLE mpoint_geog (gid int, geog GEOGRAPHY(1000));
CREATE TABLE
=> COPY mpoint_geog(gid, gx filler LONG VARCHAR, geog AS ST_GeographyFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|MULTIPOINT(42 71, 41.4 70)
>>\.
=> SELECT gid, ST_AsText(geom) FROM mpoint_geo;
gid | st_astext
-----+-----------------------
1 | MULTIPOINT (42 71, 41.4 70)
(1 row)
5.1.3 - Linestring
One or more connected lines, identified by pairs of consecutive points.
One or more connected lines, identified by pairs of consecutive points. A linestring has dimension 1. The boundary of a linestring is a multipoint object containing its start and end points.
The following are examples of linestrings:
Examples
The following example uses the GEOMETRY type to create a table, use copy to load a linestring to the table, and then queries the table to view the linestring:
=> CREATE TABLE linestring_geom (gid int, geom GEOMETRY(1000));
CREATE TABLE
=> COPY linestring_geom(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|LINESTRING(0 0, 1 1, 2 2, 3 4, 2 4, 1 5)
>>\.
=> SELECT gid, ST_AsText(geom) FROM linestring_geom;
gid | ST_AsText
-----+-------------------------------------------
1 | LINESTRING (0 0, 1 1, 2 2, 3 4, 2 4, 1 5)
(1 row)
The following example uses the GEOGRAPHY type to create a table, use copy to load a linestring to the table, and then queries the table to view the linestring:
=> CREATE TABLE linestring_geog (gid int, geog GEOGRAPHY(1000));
CREATE TABLE
=> COPY linestring_geog(gid, gx filler LONG VARCHAR, geog AS ST_GeographyFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|LINESTRING(42.1 71, 41.4 70, 41.3 72.9, 42.99 71.46, 44.47 73.21)
>>\.
=> SELECT gid, ST_AsText(geog) FROM linestring_geog;
gid | ST_AsText
-----+--------------------------------------------------------------------
1 | LINESTRING (42.1 71, 41.4 70, 41.3 72.9, 42.99 71.46, 44.47 73.21)
(1 row)
5.1.4 - Multilinestring
A collection of zero or more linestrings.
A collection of zero or more linestrings. A multilinestring has no dimension. The boundary of a multilinestring is a multipoint object containing the start and end points of all the linestrings.
The following are examples of multilinestrings:
Examples
The following example uses the GEOMETRY type to create a table, use copy to load a multilinestring to the table, and then queries the table to view the multilinestring:
=> CREATE TABLE multilinestring_geom (gid int, geom GEOMETRY(1000));
CREATE TABLE
=> COPY multilinestring_geom(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|MULTILINESTRING((1 5, 2 4, 5 3, 6 6),(3 5, 3 7))
>>\.
=> SELECT gid, ST_AsText(geom) FROM multilinestring_geom;
gid | ST_AsText
-----+----------------------------------------------------
1 | MULTILINESTRING ((1 5, 2 4, 5 3, 6 6), (3 5, 3 7))
(1 row)
The following example uses the GEOGRAPHY type to create a table, use copy to load a multilinestring to the table, and then queries the table to view the multilinestring:
=> CREATE TABLE multilinestring_geog (gid int, geog GEOGRAPHY(1000));
CREATE TABLE
=> COPY multilinestring_geog(gid, gx filler LONG VARCHAR, geog AS ST_GeographyFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|MULTILINESTRING((42.1 71, 41.4 70, 41.3 72.9), (42.99 71.46, 44.47 73.21))
>>\.
=> SELECT gid, ST_AsText(geog) FROM multilinestring_geog;
gid | ST_AsText
-----+----------------------------------------------------------------------------
1 | MULTILINESTRING((42.1 71, 41.4 70, 41.3 72.9), (42.99 71.46, 44.47 73.21))
(1 row)
5.1.5 - Polygon
An object identified by a set of closed linestrings.
An object identified by a set of closed linestrings. A polygon can have one or more holes, as defined by interior boundaries, but all points must be connected. Two examples of polygons are:
Inclusive and exclusive polygons
Polygons that include their points in clockwise order include all space inside the perimeter of the polygon and exclude all space outside that perimeter. Polygons that include their points in counterclockwise order exclude all space inside the perimeter and include all space outside that perimeter.
Examples
The following example uses the GEOMETRY type to create a table, use copy to load a polygon into the table, and then queries the table to view the polygon:
=> CREATE TABLE polygon_geom (gid int, geom GEOMETRY(1000));
CREATE TABLE
=> COPY polygon_geom(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|POLYGON(( 2 6, 2 9, 6 9, 7 7, 4 6, 2 6))
>>\.
=> SELECT gid, ST_AsText(geom) FROM polygon_geom;
gid | ST_AsText
-----+------------------------------------------
1 | POLYGON((2 6, 2 9, 6 9, 7 7, 4 6, 2 6))
(1 row)
The following example uses the GEOGRAPHY type to create a table, use copy to load a polygon into the table, and then queries the table to view the polygon:
=> CREATE TABLE polygon_geog (gid int, geog GEOGRAPHY(1000));
CREATE TABLE
=> COPY polygon_geog(gid, gx filler LONG VARCHAR, geog AS ST_GeographyFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|POLYGON((42.1 71, 41.4 70, 41.3 72.9, 44.47 73.21, 42.99 71.46, 42.1 71))
>>\.
=> SELECT gid, ST_AsText(geog) FROM polygon_geog;
gid | ST_AsText
-----+---------------------------------------------------------------------------
1 | POLYGON((42.1 71, 41.4 70, 41.3 72.9, 44.47 73.21, 42.99 71.46, 42.1 71))
(1 row)
5.1.6 - Multipolygon
A collection of zero or more polygons that do not overlap.
A collection of zero or more polygons that do not overlap.
Examples
The following example uses the GEOMETRY type to create a table, use copy to load a multipolygon into the table, and then queries the table to view the polygon:
=> CREATE TABLE multipolygon_geom (gid int, geom GEOMETRY(1000));
CREATE TABLE
=> COPY multipolygon_geom(gid, gx filler LONG VARCHAR, geom AS ST_GeomFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>9|MULTIPOLYGON(((2 6, 2 9, 6 9, 7 7, 4 6, 2 6)),((0 0, 0 5, 1 0, 0 0)),((0 2, 2 5, 4 5, 0 2)))
>>\.
=> SELECT gid, ST_AsText(geom) FROM polygon_geom;
gid | ST_AsText
-----+----------------------------------------------------------------------------------------------
9 | MULTIPOLYGON(((2 6, 2 9, 6 9, 7 7, 4 6, 2 6)),((0 0, 0 5, 1 0, 0 0)),((0 2, 2 5, 4 5, 0 2)))
(1 row)
The following example uses the GEOGRAPHY type to create a table, use copy to load a multipolygon into the table, and then queries the table to view the polygon:
=> CREATE TABLE multipolygon_geog (gid int, geog GEOGRAPHY(1000));
CREATE TABLE
=> COPY polygon_geog(gid, gx filler LONG VARCHAR, geog AS ST_GeographyFromText(gx)) FROM stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>>1|POLYGON((42.1 71, 41.4 70, 41.3 72.9, 44.47 73.21, 42.99 71.46, 42.1 71))
>>\.
=> SELECT gid, ST_AsText(geog) FROM polygon_geog;
gid | ST_AsText
-----+---------------------------------------------------------------------------
1 | POLYGON(((42.1 71, 41.4 70, 41.3 72.9, 42.1 71)),((44.47 73.21, 42.99 71.46, 42.1 71, 44.47 73.21)))
(1 row)
5.2 - Spatial object representations
The OGC defines two ways to represent spatial objects:.
The OGC defines two ways to represent spatial objects:
5.2.1 - Well-known text (WKT)
Well-Known Text (WKT) is an ASCII representation of a spatial object.
Well-Known Text (WKT) is an ASCII representation of a spatial object.
WKTs are not case sensitive; Vertica recognizes any combination of lowercase and uppercase letters.
Some examples of valid WKTs are:
WKT Example |
Description |
POINT(1 2) |
The point (1,2) |
MULTIPOINT(0 0,1 1) |
A set made up of the points (0,0) and (1,1) |
LINESTRING(1.5 2.45,3.21 4) |
The line from the point (1.5,2.45) to the point (3.21,4) |
MULTILINESTRING((0 0,–1 –2,–3 –4),(2 3,3 4,6 7)) |
Two linestrings, one that passes through (0,0), (–1,–2), and (–3,–4), and one that passes through (2,3), (3,4), and (6,7). |
POLYGON((1 2,1 4,3 4,3 2,1 2)) |
The rectangle whose four corners are indicated by (1,2), (1,4), (3,4), and (3,2). A polygon must be closed, so the first and last points in the WKT must match. |
POLYGON((0.5 0.5,5 0,5 5,0 5,0.5 0.5), (1.5 1,4 3,4 1,1.5 1)) |
A polygon (0.5 0.5,5 0,5 5,0 5,0.5 0.5) with a hole in it (1.5 1,4 3,4 1,1.5 1). |
MULTIPOLYGON(((0 1,3 0,4 3,0 4,0 1)), ((3 4,6 3,5 5,3 4)), ((0 0,–1 –2,–3 –2,–2 –1,0 0))) |
A set of three polygons |
GEOMETRYCOLLECTION(POINT(5 8), LINESTRING(–1 3,1 4)) |
A set containing the point (5,8) and the line from (–1,3) to (1,4) |
POINT EMPTY
MULTIPOINT EMPTY
LINESTRING EMPTY
MULTILINESTRING EMPTY
MULTILINESTRING(EMPTY)
POLYGON EMPTY
POLYGON(EMPTY)
MULTIPOLYGON EMPTY
MULTIPOLYGON(EMPTY)
|
Empty spatial objects; empty objects have no points. |
Invalid WKTs are:
-
POINT(1 NAN), POINT(1 INF)—Coordinates must be numbers.
-
POLYGON((1 2, 1 4, 3 4, 3 2))—A polygon must be closed.
-
POLYGON((1 4, 2 4))—A linestring is not a valid polygon.
5.2.2 - Well-known binary (WKB)
Well-Known Binary (WKB) is a binary representation of a spatial object.
Well-Known Binary (WKB) is a binary representation of a spatial object. This format is primarily used to port spatial data between applications.
5.3 - Spatial definitions
The OGC defines properties that describe.
The OGC defines properties that describe
Vertica provides functions that test for and analyze the following properties and relationships.
5.3.1 - Boundary
The set of points that define the limit of a spatial object:.
The set of points that define the limit of a spatial object:
-
Points, multipoints, and geometrycollections do not have boundaries.
-
The boundary of a linestring is a multipoint object. This object contains its start and end points.
-
The boundary of a multilinestring is a multipoint object. This object contains the start and end points of all the linestrings that make up the multilinestring.
-
The boundary of a polygon is a linestring that begins and ends at the same point. If the polygon has one or more holes, the boundary is a multilinestring that contains the boundaries of the exterior polygon and any interior polygons.
-
The boundary of a multipolygon is a multilinestring that contains the boundaries of all the polygons that make up the multipolygon.
5.3.2 - Buffer
The set of all points that are within or equal to a specified distance from the boundary of a spatial object.
The set of all points that are within or equal to a specified distance from the boundary of a spatial object. The distance can be positive or negative.
Positive buffer:
Negative buffer:
5.3.3 - Contains
One spatial object contains another spatial object if its interior includes all points of the other object.
One spatial object contains another spatial object if its interior includes all points of the other object. If an object such as a point or linestring only exists along a polygon's boundary, the polygon does not contain it. If a point is on a linestring, the linestring contains it; the interior of a linestring is all the points on the linestring except the start and end points.
Contains(a, b) is spatially equivalent to within(b, a).
5.3.4 - Convex hull
The smallest convex polygon that contains one or more spatial objects.
The smallest convex polygon that contains one or more spatial objects.
In the following figure, the dotted lines represent the convex hull for a linestring and a triangle.
5.3.5 - Crosses
Two spatial objects cross if both of the following are true:.
Two spatial objects cross if both of the following are true:
5.3.6 - Disjoint
Two spatial objects have no points in common; they do not intersect or touch.
Two spatial objects have no points in common; they do not intersect or touch.
5.3.7 - Envelope
The minimum bounding rectangle that contains a spatial object.
The minimum bounding rectangle that contains a spatial object.
The envelope for the following polygon is represented by the dotted lines in the following figure.
5.3.8 - Equals
Two spatial objects are equal when their coordinates match exactly.
Two spatial objects are equal when their coordinates match exactly. Synonymous with spatially equivalent.
The order of the points do not matter in determining spatial equivalence:
-
LINESTRING(1 2, 4 3) equals LINESTRING(4 3, 1 2).
-
POLYGON ((0 0, 1 1, 1 2, 2 2, 2 1, 3 0, 1.5 –1.5, 0 0)) equals POLYGON((1 1 , 1 2, 2 2, 2 1, 3 0, 1.5 –1.5, 0 0, 1 1)).
-
MULTILINESTRING((1 2, 4 3),(0 0, –1 –4)) equals MULTILINESTRING((0 0, –1 –4),(1 2, 4 3)).
5.3.9 - Exterior
The set of points not contained within a spatial object nor on its boundary.
The set of points not contained within a spatial object nor on its boundary.
5.3.10 - GeometryCollection
A set of zero or more objects from any of the supported classes of spatial objects.
A set of zero or more objects from any of the supported classes of spatial objects.
5.3.11 - Interior
The set of points contained in a spatial object, excluding its boundary.
The set of points contained in a spatial object, excluding its boundary.
5.3.12 - Intersection
The set of points that two or more spatial objects have in common.
The set of points that two or more spatial objects have in common.
5.3.13 - Overlaps
If a spatial object shares space with another object, but is not contained within that object, the objects overlap.
If a spatial object shares space with another object, but is not contained within that object, the objects overlap. The objects must overlap at their interiors; if two objects touch at a single point or intersect only along a boundary, they do not overlap.
5.3.14 - Relates
When a spatial object is spatially related to another object as defined by a DE-9IM pattern matrix string.
When a spatial object is spatially related to another object as defined by a DE-9IM pattern matrix string.
A DE-9IM pattern matrix string identifies how two spatial objects are spatially related to each other. For more information about the DE-9IM standard, see Understanding Spatial Relations.
5.3.15 - Simple
For points, multipoints, linestrings, or multilinestrings, a spatial object is simple if it does not intersect itself or has no self-tangency points.
For points, multipoints, linestrings, or multilinestrings, a spatial object is simple if it does not intersect itself or has no self-tangency points.
Polygons, multipolygons, and geometrycollections are always simple.
5.3.16 - Symmetric difference
The set of all points of a pair of spatial objects where the objects do not intersect.
The set of all points of a pair of spatial objects where the objects do not intersect. This difference is spatially equivalent to the union of the two objects less their intersection. The symmetric difference contains the boundaries of the intersections.
In the following figure, the shaded areas represent the symmetric difference of the two rectangles.
The following figure shows the symmetric difference of two overlapping linestrings.
5.3.17 - Union
For two or more spatial objects, the set of all points in all the objects.
For two or more spatial objects, the set of all points in all the objects.
5.3.18 - Validity
For a polygon or multipolygon, when all of the following are true:.
For a polygon or multipolygon, when all of the following are true:
-
It is closed; its start point is the same as its end point.
-
Its boundary is a set of linestrings.
-
No two linestrings in the boundary cross. The linestrings in the boundary may touch at a point but they cannot cross.
-
Any polygons in the interior must be completely contained; they cannot touch the boundary of the exterior polygon except at a vertex.
Valid polygons:
Invalid polygon:
5.3.19 - Within
A spatial object is considered within another spatial object when all its points are inside the other object's interior.
A spatial object is considered within another spatial object when all its points are inside the other object's interior. Thus, if a point or linestring only exists along a polygon's boundary, it is not considered within the polygon. The polygon boundary is not part of its interior.
If a point is on a linestring, it is considered within the linestring. The interior of a linestring is all the points along the linestring, except the start and end points.
Within(a, b) is spatially equivalent to contains(b, a).
6 - Spatial data type support limitations
Vertica does not support all types of GEOMETRY and GEOGRAPHY objects.
Vertica does not support all types of GEOMETRY and GEOGRAPHY objects. See the respective function page for a list of objects that function supports. Spherical geometry is generally more complex than Euclidean geometry. Thus, there are fewer spatial functions that support the GEOGRAPHY data type.
Limitations of spatial data type support:
-
A non-WGS84 GEOGRAPHY object is a spatial object defined on the surface of a perfect sphere of radius 6371 kilometers. This sphere approximates the shape of the earth. Other spatial programs may use an ellipsoid to model the earth, resulting in slightly different data.
-
You cannot modify the size or data type of a GEOMETRY or GEOGRAPHY column after creation.
-
You cannot import data to or export data from tables that contain spatial data from another Vertica database.
-
You can only use the STV_Intersect functions with points and polygons.
-
GEOGRAPHY objects of type GEOMETRYCOLLECTION are not supported.
-
Values for longitude must be between -180 and +180 degrees. Values for latitude must be between –90 and +90 degrees. The Vertica geospatial functions do not validate these values.
-
GEOMETRYCOLLECTION objects cannot contain empty objects. For example, you cannot specify GEOMETRYCOLLECTION (LINESTRING(1 2, 3 4), POINT(5 6)
, POINT EMPTY)
.
-
If you pass a spatial function a NULL geometry, the function returns NULL, unless otherwise specified. A result of NULL has no value.
-
Polymorphic functions, such as NVL
and GREATEST
, do not accept GEOMETRY and GEOGRAPHY arguments.