Python SDK
The Vertica SDK supports writing UDxs of some types in Python 3.
The Python SDK does not require any additional system configuration or header files. This low overhead allows you to develop and deploy new capabilities to your Vertica cluster in a short amount of time.
The following workflow is typical for the Python SDK:
Because Python has an interpreter, you do not have to compile your program before loading the UDx in Vertica. However, you should expect to do some debugging of your code after you create your function and begin testing it in Vertica.
When Vertica calls your UDx, it starts a side process that manages the interaction between the server and the Python interpreter.
This section covers Python-specific topics that apply to all UDx types. For information that applies to all languages, see Arguments and return values, UDx parameters, Errors, warnings, and logging, Handling cancel requests, and the sections for specific UDx types. For full API documentation, see the Python SDK.
Important
Your UDx must be able to run with the version of Python bundled with Vertica. You can find this version by running /opt/vertica/sbin/python3 --version. You cannot change the version used by the Vertica Python interpreter.
1 - Setting up a Python development environment
To avoid problems when loading and executing your UDxs, develop your UDxs using the same version of Python that Vertica uses. To do this without changing your environment for projects that might require other Python versions, you can use a Python virtual environment (venv). You can install libraries that your UDx depends on into your venv and use that path when you create your UDx library with CREATE LIBRARY.
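As a rough illustration of the version check, the following sketch parses the text printed by /opt/vertica/sbin/python3 --version and compares it with the interpreter you develop with. The function names are hypothetical, and comparing only the major and minor numbers is an assumption about what "same version" needs to mean in practice; the version strings in the usage note are example values only.

```python
import re
import sys


def parse_python_version(version_output):
    """Parse text such as 'Python 3.9.7' into a (major, minor, micro) tuple."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return tuple(int(part) for part in match.groups())


def same_major_minor(bundled, local=None):
    """Return True when the local interpreter matches the bundled major.minor.

    `bundled` is the tuple parsed from the Vertica interpreter's output.
    `local` defaults to the interpreter that is running this script.
    """
    if local is None:
        local = sys.version_info[:3]
    return tuple(bundled[:2]) == tuple(local[:2])
```

For example, you might paste the output of the --version command into parse_python_version and pass the result to same_major_minor before you start developing in a new environment.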
Setting up venv
Set up venv using the Python version bundled with Vertica. If you have direct access to a database node, you can use that Python binary directly to create your venv:
$ /opt/vertica/sbin/python3 -m venv /path/to/new/environment
The result is a directory with a default environment, including a site-packages directory:
$ ls venv/lib/
python3.9
$ ls venv/lib/python3.9/
site-packages
If your UDx depends on libraries that are not packaged with Vertica, install them into this directory:
$ source venv/bin/activate
(venv) $ pip install numpy
...
The lib/python3.9/site-packages directory now contains the installed library. The change affects only your virtual environment.
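Because the python3.9 part of the path depends on the bundled interpreter, it can help to locate site-packages programmatically rather than typing it. The following is a minimal sketch, assuming the POSIX venv layout shown above (lib/pythonX.Y/site-packages); the helper names are hypothetical and not part of the Vertica SDK.

```python
from pathlib import Path


def find_site_packages(env_dir):
    """Return every lib/python*/site-packages directory under a venv."""
    return sorted(Path(env_dir).glob("lib/python*/site-packages"))


def depends_pattern(env_dir):
    """Build a wildcard path covering everything in the venv's site-packages.

    Raises ValueError unless exactly one site-packages directory is found,
    so a wrong or incomplete venv path is noticed early.
    """
    matches = find_site_packages(env_dir)
    if len(matches) != 1:
        raise ValueError(
            f"expected one site-packages directory under {env_dir}, "
            f"found {len(matches)}"
        )
    return f"{matches[0]}/*"
```

The string returned by depends_pattern has the same shape as the wildcard path used in a DEPENDS clause of CREATE LIBRARY.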
UDx imports
Your UDx code must import, in addition to any libraries you add, the vertica_sdk library:
# always required:
import vertica_sdk
# other libs:
import numpy as np
# ...
The vertica_sdk library is included as a part of the Vertica server. You do not need to add it to site-packages or declare it as a dependency.
Deployment
For libraries you add, you must declare dependencies when using CREATE LIBRARY. This declaration allows Vertica to find the libraries and distribute them to all database nodes. You can supply a path instead of enumerating the libraries:
=> CREATE OR REPLACE LIBRARY pylib AS
'/path/to/udx/add2ints.py'
DEPENDS '/path/to/new/environment/lib/python3.9/site-packages/*'
LANGUAGE 'Python';
=> CREATE OR REPLACE FUNCTION add2ints AS LANGUAGE 'Python'
NAME 'add2ints_factory' LIBRARY pylib;
CREATE LIBRARY copies the UDx and the contents of the DEPENDS path and stores them with the database. Vertica then distributes copies to all database nodes.
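The CREATE FUNCTION statement above names a factory class, add2ints_factory, in add2ints.py. The following is a sketch of what such a file might contain, not a definitive implementation. The class and method names (ScalarFunction, ScalarFunctionFactory, processBlock, getInt, setInt, getPrototype, getReturnType) reflect one reading of the SDK and should be checked against the Python SDK API documentation. The ImportError fallback is an addition for illustration only: it lets the logic run in a local test outside Vertica, where vertica_sdk is not available.

```python
try:
    import vertica_sdk  # provided by the Vertica server at run time

    ScalarFunction = vertica_sdk.ScalarFunction
    ScalarFunctionFactory = vertica_sdk.ScalarFunctionFactory
except ImportError:
    # Assumption for local testing only: outside Vertica there is no
    # vertica_sdk module, so fall back to plain base classes.
    ScalarFunction = object
    ScalarFunctionFactory = object


class add2ints(ScalarFunction):
    """Adds two INTEGER arguments, one row at a time."""

    def processBlock(self, server_interface, arg_reader, res_writer):
        while True:
            # Read both arguments of the current row and write their sum.
            res_writer.setInt(arg_reader.getInt(0) + arg_reader.getInt(1))
            res_writer.next()
            # Advance to the next input row; stop when none remain.
            if not arg_reader.next():
                break


class add2ints_factory(ScalarFunctionFactory):
    """Factory named in the CREATE FUNCTION statement."""

    def createScalarFunction(self, srv):
        return add2ints()

    def getPrototype(self, srv_interface, arg_types, return_type):
        # Two INTEGER arguments, one INTEGER result.
        arg_types.addInt()
        arg_types.addInt()
        return_type.addInt()

    def getReturnType(self, srv_interface, arg_types, return_type):
        return_type.addInt()
```

Keeping the row loop free of anything Vertica-specific beyond the reader and writer calls makes it possible to exercise processBlock locally with simple stand-in objects before loading the library.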
2 - Python and Vertica data types
The Vertica Python SDK converts native Vertica data types into the appropriate Python data types. The following table describes some of the data type conversions. Consult the Python SDK for a complete list, as well as lists of helper functions to convert and manipulate these data types.
For information about SDK support for complex data types, see Complex Types as Arguments and Return Values.
Vertica Data Type | Python Data Type
INTEGER | int
FLOAT | float
NUMERIC | decimal.Decimal
DATE | datetime.date
CHAR, VARCHAR, LONG VARCHAR | string (UTF-8 encoded)
BINARY, VARBINARY, LONG VARBINARY | binary
TIMESTAMP | datetime.datetime
TIME | datetime.time
ARRAY | list (nested ARRAY types are also converted into lists)
ROW | collections.OrderedDict (nested ROW types are also converted into collections.OrderedDicts)
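To make the conversions more concrete, the following sketch builds the kind of Python value a UDx might see for a ROW that contains an ARRAY and a nested ROW. It uses only the standard library. The sample column names and values are hypothetical, and reading "string" as str and "binary" as bytes is an assumption rather than something the table states.

```python
import collections
import datetime
import decimal

# Python types from the table above. str and bytes are assumed readings of
# "string (UTF-8 encoded)" and "binary".
PYTHON_TYPE_FOR = {
    "INTEGER": int,
    "FLOAT": float,
    "NUMERIC": decimal.Decimal,
    "DATE": datetime.date,
    "VARCHAR": str,
    "VARBINARY": bytes,
    "TIMESTAMP": datetime.datetime,
    "TIME": datetime.time,
    "ARRAY": list,
    "ROW": collections.OrderedDict,
}

# Hypothetical value for a column of type
# ROW(id INTEGER, price NUMERIC, tags ARRAY[VARCHAR], shipped ROW(day DATE)).
sample_row = collections.OrderedDict([
    ("id", 42),
    ("price", decimal.Decimal("19.99")),
    ("tags", ["new", "sale"]),
    ("shipped", collections.OrderedDict([("day", datetime.date(2024, 1, 15))])),
])


def type_tree(value):
    """Describe a converted value by the Python type name at each level."""
    if isinstance(value, collections.OrderedDict):
        return {name: type_tree(field) for name, field in value.items()}
    if isinstance(value, list):
        return [type_tree(item) for item in value]
    return type(value).__name__
```

Calling type_tree(sample_row) walks the nested structure, which shows how nested ROW and ARRAY values stay OrderedDicts and lists all the way down.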
Note
Some Vertica data types are not supported in Python. For a list of all Vertica data types, see Data types.