Python SDK

The Vertica SDK supports writing UDxs of some types in Python 3.

The Python SDK does not require any additional system configuration or header files. This low overhead lets you develop and deploy new capabilities to your Vertica cluster quickly.

The following workflow is typical for the Python SDK:

Because Python is an interpreted language, you do not have to compile your program before loading the UDx in Vertica. However, you should expect to do some debugging of your code after you create your function and begin testing it in Vertica.

When Vertica calls your UDx, it starts a side process that manages the interaction between the server and the Python interpreter.

This section covers Python-specific topics that apply to all UDx types. For information that applies to all languages, see Arguments and return values; UDx parameters; Errors, warnings, and logging; Handling cancel requests; and the sections for specific UDx types. For full API documentation, see the Python SDK.

1 - Setting up a Python development environment

To avoid problems when loading and executing your UDxs, develop your UDxs using the same version of Python that Vertica uses. To do this without changing your environment for projects that might require other Python versions, you can use a Python virtual environment (venv). You can install libraries that your UDx depends on into your venv and use that path when you create your UDx library with CREATE LIBRARY.

Setting up venv

Set up venv using the Python version bundled with Vertica. If you have direct access to a database node, you can use that Python binary directly to create your venv:

$ /opt/vertica/sbin/python3 -m venv /path/to/new/environment

The result is a directory with a default environment, including a site-packages directory. In the following listings, venv stands for the environment directory you created:

$ ls venv/lib/
python3.9
$ ls venv/lib/python3.9/
site-packages

If your UDx depends on libraries that are not packaged with Vertica, install them into this directory:

$ source venv/bin/activate
(venv) $ pip install numpy
...

The lib/python3.9/site-packages directory now contains the installed library. The change affects only your virtual environment.
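One way to confirm that the activated environment really uses the same Python version as Vertica is to check the interpreter version from inside it. The following is a minimal sketch, assuming the bundled version is 3.9 (inferred from the lib/python3.9 directory in the listing); the names EXPECTED_VERSION and version_matches are illustrative and not part of any Vertica tool.

```python
import sys

# Assumption: Vertica bundles Python 3.9, as the lib/python3.9 path suggests.
# Adjust this value to whatever /opt/vertica/sbin/python3 reports on your system.
EXPECTED_VERSION = (3, 9)


def version_matches(expected=EXPECTED_VERSION, actual=None):
    """Return True if the (major, minor) version equals `expected`."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(expected)


running = ".".join(str(part) for part in sys.version_info[:2])
status = "matches" if version_matches() else "differs from"
print(
    f"Running Python {running}, which {status} the expected "
    f"{EXPECTED_VERSION[0]}.{EXPECTED_VERSION[1]}"
)
```

Run the script with the venv activated. If it reports a difference, the environment was probably created with a Python binary other than the one Vertica uses.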

UDx imports

In addition to any libraries you add, your UDx code must import the vertica_sdk library:

# always required:
import vertica_sdk
# other libs:
import numpy as np
# ...

The vertica_sdk library is included as a part of the Vertica server. You do not need to add it to site-packages or declare it as a dependency.
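To show what typically follows the import, here is a rough sketch of a scalar UDx that adds two integers. The class and method names (ScalarFunction, ScalarFunctionFactory, processBlock, getPrototype, getReturnType, and the reader and writer calls) are assumed from the SDK's scalar-function interface and should be checked against the Python SDK reference. The try/except fallback is a hypothetical convenience for importing the file outside Vertica and is not part of the SDK.

```python
try:
    import vertica_sdk  # supplied by the Vertica server when the UDx runs
except ImportError:
    # Hypothetical stand-in so the module can be imported and unit-tested
    # outside Vertica. It is NOT part of the SDK.
    import types

    vertica_sdk = types.SimpleNamespace(
        ScalarFunction=object, ScalarFunctionFactory=object
    )


class add2ints(vertica_sdk.ScalarFunction):
    """Scalar function that returns the sum of two INTEGER arguments."""

    def processBlock(self, server_interface, arg_reader, res_writer):
        # Rows arrive in blocks; process every row in the current block.
        while True:
            total = arg_reader.getInt(0) + arg_reader.getInt(1)
            res_writer.setInt(total)
            res_writer.next()
            if not arg_reader.next():
                break


class add2ints_factory(vertica_sdk.ScalarFunctionFactory):
    """Factory that Vertica uses to describe and instantiate add2ints."""

    def createScalarFunction(self, srv):
        return add2ints()

    def getPrototype(self, srv_interface, arg_types, return_type):
        # Two INTEGER arguments in, one INTEGER out.
        arg_types.addInt()
        arg_types.addInt()
        return_type.addInt()

    def getReturnType(self, srv_interface, arg_types, return_type):
        return_type.addInt()
```

Keeping the arithmetic inside processBlock and the type declarations inside the factory means the function logic can be exercised locally with simple fake reader and writer objects before the library is loaded into Vertica.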

Deployment

For libraries you add, you must declare dependencies when using CREATE LIBRARY. This declaration allows Vertica to find the libraries and distribute them to all database nodes. You can supply a path instead of enumerating the libraries:

=> CREATE OR REPLACE LIBRARY pylib AS
   '/path/to/udx/add2ints.py'
   DEPENDS '/path/to/new/environment/lib/python3.9/site-packages/*'
   LANGUAGE 'Python';

=> CREATE OR REPLACE FUNCTION add2ints AS LANGUAGE 'Python'
   NAME 'add2ints_factory' LIBRARY pylib;

CREATE LIBRARY copies the UDx and the contents of the DEPENDS path and stores them with the database. Vertica then distributes copies to all database nodes.
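Because the contents of the DEPENDS path are copied along with the UDx, it can be worth checking what the site-packages directory holds before running CREATE LIBRARY. The following is a minimal sketch, assuming the venv layout shown earlier; the function name depends_entries and the python3.9 default are illustrative choices, not part of Vertica.

```python
from pathlib import Path


def depends_entries(venv_root, python_dir="python3.9"):
    """Return the sorted names of top-level entries in a venv's site-packages.

    Assumes the layout <venv_root>/lib/<python_dir>/site-packages.
    """
    site_packages = Path(venv_root) / "lib" / python_dir / "site-packages"
    if not site_packages.is_dir():
        raise FileNotFoundError(f"no site-packages directory at {site_packages}")
    return sorted(entry.name for entry in site_packages.iterdir())
```

For example, calling depends_entries("/path/to/new/environment") lists the packages that a DEPENDS path ending in site-packages/* would match, which makes it easier to spot unneeded packages before they are distributed to every node.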

2 - Python and Vertica data types

The Vertica Python SDK converts native Vertica data types into the appropriate Python data types. The following table describes some of the data type conversions. Consult the Python SDK for a complete list, as well as lists of helper functions to convert and manipulate these data types.

For information about SDK support for complex data types, see Complex Types as Arguments and Return Values.

Vertica Data Type                    Python Data Type
INTEGER                              int
FLOAT                                float
NUMERIC                              decimal.Decimal
DATE                                 datetime.date
CHAR, VARCHAR, LONG VARCHAR          string (UTF-8 encoded)
BINARY, VARBINARY, LONG VARBINARY    binary
TIMESTAMP                            datetime.datetime
TIME                                 datetime.time
ARRAY                                list (nested ARRAY types are also converted into lists)
ROW                                  collections.OrderedDict (nested ROW types are also converted into collections.OrderedDicts)
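To make the mapping concrete, the sketch below builds one Python value of the kind listed for several of the Vertica types, including a nested ARRAY and a nested ROW. The values and field names are hypothetical and written by hand to mirror the table; they are not output captured from Vertica.

```python
import datetime
import decimal
from collections import OrderedDict

# Hypothetical sample values, keyed by the Vertica type they would correspond to.
examples = {
    "INTEGER": 42,
    "FLOAT": 3.5,
    "NUMERIC": decimal.Decimal("12.34"),
    "DATE": datetime.date(2024, 1, 31),
    "VARCHAR": "hello",
    "TIMESTAMP": datetime.datetime(2024, 1, 31, 13, 45, 0),
    "TIME": datetime.time(13, 45, 0),
    # A nested ARRAY becomes a list of lists.
    "ARRAY": [[1, 2], [3]],
    # A nested ROW becomes an OrderedDict that contains another OrderedDict.
    "ROW": OrderedDict(
        [
            ("name", "Ada"),
            ("address", OrderedDict([("city", "London"), ("zip", "N1")])),
        ]
    ),
}

# Print each Vertica type next to the Python type of its sample value.
for vertica_type, value in examples.items():
    print(f"{vertica_type:<10} {type(value).__name__}")
```

Fields of a ROW value are then read by name, for example examples["ROW"]["address"]["city"], and elements of a nested ARRAY by index, for example examples["ARRAY"][0][1].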