Vertica provides several functions for managing models.
This is the multi-page printable view of this section. Click here to print.
Model management
- 1: CHANGE_MODEL_STATUS
- 2: EXPORT_MODELS
- 3: GET_MODEL_ATTRIBUTE
- 4: GET_MODEL_SUMMARY
- 5: IMPORT_MODELS
- 6: REGISTER_MODEL
- 7: UPGRADE_MODEL
1 - CHANGE_MODEL_STATUS
Changes the status of a registered model. Only dbadmin and users with the MLSUPERVISOR role can call this function.
The following diagram depicts the valid status transitions:
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
StableSyntax
CHANGE_MODEL_STATUS( 'registered_name', registered_version, 'new_status' )
Arguments
registered_name
- Identifies the abstract name to which the model is registered. This
registered_name
can represent a group of models for a higher-level application, where each model in the group has a unique version number. registered_version
- Unique version number of the model under the specified
registered_name
.If there is no registered model with the given
registered_name
andregistered_version
, the function errors. new_status
- New status of the registered model. Must be one of the following strings and adhere to the valid status transitions depicted in the above diagram:
-
under_review
: Status assigned to newly registered models. -
staging
: Model is targeted for A/B testing against the model currently in production. -
production
: Model is in production for its specified application. Only one model can be in production for a givenregistered_name
at one time. -
archived
: Status of models that were previously in production. Archived models can be returned to production at any time. -
declined
: Model is no longer in consideration for production. -
unregistered
: Model is removed from the versioning environment. The model does not appear in the REGISTERED_MODELS system table.
If you change the status of a model to 'production' and there is already a model in production under the given
registered_name
, the status of the model in production is set to 'archived' and the status of the new model is set to 'production'. -
Privileges
One of the following:
-
Superuser
- MLSUPERVISOR
Examples
In the following example, the linear_reg_spark1
model, which is uniquely identified by the registered_name
'linear_reg_app' and the registered_version
of two, is set to 'production' status:
=> SELECT * FROM REGISTERED_MODELS;
registered_name | registered_version | status | registered_time | model_id | schema_name | model_name | model_type | category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
linear_reg_app | 2 | STAGING | 2023-01-29 05:49:00.082166-04 | 45035996273714020 | public | linear_reg_spark1 | PMML_REGRESSION_MODEL | PMML
linear_reg_app | 1 | PRODUCTION | 2023-01-24 09:19:04.553102-05 | 45035996273850350 | public | native_linear_reg | LINEAR_REGRESSION | VERTICA_MODELS
logistic_reg_app | 1 | DECLINED | 2023-01-11 02:47:25.990626-02 | 45035996273853740 | public | log_reg_bfgs | LOGISTIC_REGRESSION | VERTICA_MODELS
(3 rows)
=> SELECT CHANGE_MODEL_STATUS('linear_reg_app', 2, 'production');
CHANGE_MODEL_STATUS
-----------------------------------------------------------------------------
The status of model [linear_reg_app] - version [2] is changed to [production]
(1 row)
You can query the REGISTERED_MODELS system table to confirm that the linear_reg_spark1
model is now in 'production' and the native_linear_reg
model, which was currently in 'production', is moved to 'archived':
=> SELECT * FROM REGISTERED_MODELS;
registered_name | registered_version | status | registered_time | model_id | schema_name | model_name | model_type | category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
linear_reg_app | 2 | PRODUCTION | 2023-01-29 05:49:00.082166-04 | 45035996273714020 | public | linear_reg_spark1 | PMML_REGRESSION_MODEL | PMML
linear_reg_app | 1 | ARCHIVED | 2023-01-24 09:19:04.553102-05 | 45035996273850350 | public | native_linear_reg | LINEAR_REGRESSION | VERTICA_MODELS
logistic_reg_app | 1 | DECLINED | 2023-01-11 02:47:25.990626-02 | 45035996273853740 | public | log_reg_bfgs | LOGISTIC_REGRESSION | VERTICA_MODELS
(2 rows)
If you change a model's status to 'unregistered', the model is removed from the model versioning environment and no longer appears in the REGISTERED_MODELS system table:
=> SELECT CHANGE_MODEL_STATUS('logistic_reg_app', 1, 'unregistered');
CHANGE_MODEL_STATUS
----------------------------------------------------------------------------------
The status of model [logistic_reg_app] - version [1] is changed to [unregistered]
(1 row)
=> SELECT * FROM REGISTERED_MODELS;
registered_name | registered_version | status | registered_time | model_id | schema_name | model_name | model_type | category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
linear_reg_app | 2 | STAGING | 2023-01-29 05:49:00.082166-04 | 45035996273714020 | public | linear_reg_spark1 | PMML_REGRESSION_MODEL | PMML
linear_reg_app | 1 | PRODUCTION | 2023-01-24 09:19:04.553102-05 | 45035996273850350 | public | native_linear_reg | LINEAR_REGRESSION | VERTICA_MODELS
(2 rows)
See also
2 - EXPORT_MODELS
Exports machine learning models. Vertica supports three model formats:
-
Native Vertica (VERTICA_MODELS)
-
PMML
-
TensorFlow
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
EXPORT_MODELS ( 'output-dir', 'export-target' [ USING PARAMETERS category = 'model-category' ] )
Arguments
output-dir
- Absolute path of an output directory to store the exported models, either an absolute path on the initiator node file system or a URI for a supported file system or object store.
export-target
- Models to export:
[schema.]{model-name | * }
schema
specifies the schema from which models are exported. If omitted, EXPORT_MODELS uses the default schema. Supply*
(asterisk) to batch export all models from the schema.If a model in a batch fails to export, the function issues a warning and then continues to export any remaining models in the batch. Details about any failed model exports are available in the log file generated at the
output-dir
location.
Parameters
category
- The category of models to export, one of the following:
-
VERTICA_MODELS
-
PMML
-
TENSORFLOW
EXPORT_MODELS exports models of the specified category according to the scope of the export operation—that is, whether it applies to a single model, or to all models within a schema. See Export Scope and Category Processing below.
Exported Files below describes the files that EXPORT_MODELS exports for each category.
If you omit this parameter, EXPORT_MODELS exports the model, or models in the specified schema, according to their model type.
-
Privileges
Superuser or MLSUPERVISOR
Export scope and category processing
EXPORT_MODELS executes according to the following parameter settings:
-
Scope of the export operation: single model, or all models within a given schema
-
Category specified or omitted
The following table shows how these two parameters control the export process:
Export scope | If category specified... | If category omitted... |
---|---|---|
Single model | Convert the model to the specified category, provided the model and category are compatible; otherwise, return with a mismatch error. | Export the model according to model type. |
All models in schema | Export only models that are compatible with the specified category and issue mismatch warnings on all other models in the schema. | Export all models in the schema according to model type. |
Exported files
EXPORT_MODELS exports the following files for each model category:
VERTICA_MODELS
-
Multiple binary files (exact number dependent on model type)
-
metadata.json
: Metadata file with model information, including model name, category, type, and Vertica version on export. -
crc.json
: Used on import to validate other files of this model.
-
PMML
-
XML file with the same name as the model and complying with PMML standard.
-
metadata.json
: Metadata file with model information, including model name, category, type, and Vertica version on export. -
crc.json
: Used on import to validate other files of this model.
-
TENSORFLOW
-
model-name
.pb
: Contains the TensorFlow model, saved in 'frozen graph' format. -
metadata.json
: Metadata file with model information, including model name, category, type, and Vertica version on export. -
tf_model_desc.json
: Summary model description. -
model.json
: Verbose model description. -
crc.json
: Used on import to validate other files of this model.
-
Categories and compatible models
If EXPORT_MODELS specifies a single model and also sets the category
parameter, the function succeeds if the model type and category are compatible; otherwise, it returns with an error. The following model types are compatible with the listed categories:
PMML
PMML
TensorFlow
TENSORFLOW
VERTICA_MODELS
PMML
VERTICA_MODELS
If EXPORT_MODELS specifies to export all models from a schema and sets a category, it issues a warning message on each model that is incompatible with that category. The function then continues to process remaining models in that schema.
EXPORT_MODELS logs all errors and warnings in output-dir
/export_log.json
.
Examples
Export models without changing their category
Export model myschema.mykmeansmodel
without changing its category:
=> SELECT EXPORT_MODELS ('/home/dbadmin', 'myschema.mykmeansmodel');
EXPORT_MODELS
----------------
Success
(1 row)
Export all models in schema myschema
without changing their categories:
=> SELECT EXPORT_MODELS ('/home/dbadmin', 'myschema.*');
EXPORT_MODELS
----------------
Success
(1 row)
Export all the models in schema models
to an S3 bucket without changing the model categories:
SELECT export_models('s3://vertica/ml_models', 'models.*');
EXPORT_MODELS
---------------
Success
(1 row)
Export models that are compatible with the specified category
Note
When you import a model of category VERTICA_MODELS trained in a different version of Vertica, Vertica automatically upgrades the model version to match that of the database. If this fails, you must run UPGRADE_MODEL.
If both methods fail, the model cannot be used for in-database scoring and cannot be exported as a PMML model.
The category is set to PMML. Models of type PMML and VERTICA_MODELS are compatible with the PMML category, so the export operation succeeds if my_keans
is of either type:
=> SELECT EXPORT_MODELS ('/tmp/', 'my_kmeans' USING PARAMETERS category='PMML');
The category is set to VERTICA_MODELS. Only models of type VERTICA_MODELS are compatible with the VERTICA_MODELS category, so the export operation succeeds only if my_keans
is of that type:
=> SELECT EXPORT_MODELS ('/tmp/', 'public.my_kmeans' USING PARAMETERS category='VERTICA_MODELS');
The category is set to TENSORFLOW. Only models of type TensorFlow are compatible with the TENSORFLOW category, so the model tf_mnist_keras
must be of type TensorFlow:
=> SELECT EXPORT_MODELS ('/tmp/', 'tf_mnist_keras', USING PARAMETERS category='TENSORFLOW');
export_models
---------------
Success
(1 row)
After exporting the TensorFlow model tf_mnist_keras
, list the exported files:
$ ls tf_mnist_keras/
crc.json metadata.json mnist_keras.pb model.json tf_model_desc.json
See also
IMPORT_MODELS3 - GET_MODEL_ATTRIBUTE
Extracts either a specific attribute from a model or all attributes from a model. Use this function to view a list of attributes and row counts or view detailed information about a single attribute. The output of GET_MODEL_ATTRIBUTE is a table format where users can select particular columns or rows.
Syntax
GET_MODEL_ATTRIBUTE ( USING PARAMETERS model_name = 'model-name' [, attr_name = 'attribute' ] )
Parameters
model_name
Name of the model (case-insensitive).
attr_name
- Name of the model attribute to extract. If omitted, the function shows all available attributes. Attribute names are case-sensitive.
Privileges
Non-superusers: model owner, or USAGE privileges on the model
Examples
This example returns a summary of all model attributes.
=> SELECT GET_MODEL_ATTRIBUTE ( USING PARAMETERS model_name='myLinearRegModel');
attr_name | attr_fields | #_of_rows
-------------------+---------------------------------------------------+-----------
details | predictor, coefficient, std_err, t_value, p_value | 2
regularization | type, lambda | 1
iteration_count | iteration_count | 1
rejected_row_count | rejected_row_count | 1
accepted_row_count | accepted_row_count | 1
call_string | call_string | 1
(6 rows)
This example extracts the details
attribute from the myLinearRegModel
model.
=> SELECT GET_MODEL_ATTRIBUTE ( USING PARAMETERS model_name='myLinearRegModel', attr_name='details');
coeffNames | coeff | stdErr | zValue | pValue
-----------+--------------------+---------------------+-------------------+-----------------------
Intercept | -1.87401598641074 | 0.160143331525544 | -11.7021169008952 | 7.3592939615234e-26
waiting | 0.0756279479518627 | 0.00221854185633525 | 34.0890336307608 | 8.13028381124448e-100
(2 rows)
4 - GET_MODEL_SUMMARY
Returns summary information of a model.
Syntax
GET_MODEL_SUMMARY ( USING PARAMETERS model_name = 'model-name' )
Parameters
- model_name
Name of the model (case-insensitive).
Privileges
Non-superusers: model owner, or USAGE privileges on the model
Examples
This example shows how you can view the summary of a linear regression model.
=> SELECT GET_MODEL_SUMMARY( USING PARAMETERS model_name='myLinearRegModel');
--------------------------------------------------------------------------------
=======
details
=======
predictor|coefficient|std_err |t_value |p_value
---------+-----------+--------+--------+--------
Intercept| -2.06795 | 0.21063|-9.81782| 0.00000
waiting | 0.07876 | 0.00292|26.96925| 0.00000
==============
regularization
==============
type| lambda
----+--------
none| 1.00000
===========
call_string
===========
linear_reg('public.linear_reg_faithful', 'faithful_training', '"eruptions"', 'waiting'
USING PARAMETERS optimizer='bfgs', epsilon=1e-06, max_iterations=100,
regularization='none', lambda=1)
===============
Additional Info
===============
Name |Value
------------------+-----
iteration_count | 3
rejected_row_count| 0
accepted_row_count| 162
(1 row)
5 - IMPORT_MODELS
Imports models into Vertica, either Vertica models that were exported with EXPORT_MODELS, or models in Predictive Model Markup Language (PMML) or TensorFlow format. You can use this function to move models between Vertica clusters, or to import PMML and TensorFlow models trained elsewhere.
Other Vertica model management operations such as GET_MODEL_SUMMARY and GET_MODEL_ATTRIBUTE support imported models.
Caution
Changing the exported model files causes the import functionality to fail on attempted re-import.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
IMPORT_MODELS ( 'source'
[ USING PARAMETERS [ new_schema = 'schema-name' ] [, category = 'model-category' ] ] )
Arguments
source
- Path from which to import models, either an absolute path on the initiator node file system or a URI for a supported file system or object store. The path format depends on whether you are importing a single model or a batch of models:
-
To import a single model, provide the path to the model's directory:
path/model-directory
-
To import a batch of models, provide the path to a parent directory that contains the model directories for each model in the batch:
parent-dir-path/*
If a model in a batch fails to import, the function issues a warning and then continues to import any remaining models in the batch. Details about any failed model import are available in the log file generated at the
source
location. -
Parameters
new_schema
- An existing schema where the machine learning models are imported. If omitted, models are imported to the default schema.
IMPORT_MODELS extracts the name of the imported model from its
metadata.json
file, if it exists. Otherwise, the function uses the name of the model directory. category
- Specifies the category of the model to import, one of the following:
-
VERTICA_MODELS
-
PMML
-
TENSORFLOW
This parameter is required if the model directory has no
metadata.json
file. IMPORT_MODELS returns with an error if one of the following cases is true:-
No category is specified and the model directory has no
metadata.json
. -
The specified category does not match the model type.
Note
If the category is TENSORFLOW, IMPORT_MODELS only imports the following files from the model directory:
-
model-name
.pb
-
model-name
.json
-
model-name
.pbtxt
(optional)
-
Privileges
Superuser or MLSUPERVISOR
Requirements and restrictions
The following requirements and restrictions apply:
-
If you export a model, then import it again, the export and import model directory names must match. If naming conflicts occur, import the model to a different schema by using the
new_schema
parameter, and then rename the model. -
The machine learning configuration parameter MaxModelSizeKB sets the maximum size of a model that can be imported into Vertica.
-
Some PMML features and attributes are not currently supported. See PMML features and attributes for details.
-
If you import a PMML model with both
metadata.json
andcrc.json
files, the CRC file must contain the metadata file's CRC value. Otherwise, the import operation returns with an error.
Examples
If no model category is specified, IMPORT_MODELS uses the model's metadata.json
file to determine its category.
Import a single model mykmeansmodel
into the newschema
schema:
=> SELECT IMPORT_MODELS ('/home/dbadmin/myschema/mykmeansmodel' USING PARAMETERS new_schema='newschema')
IMPORT_MODELS
----------------
Success
(1 row)
Import all models in the myschema
directory into the newschema
schema:
=> SELECT IMPORT_MODELS ('/home/dbadmin/myschema/*' USING PARAMETERS new_schema='newschema')
IMPORT_MODELS
----------------
Success
(1 row)
Import the model tf_mnsit_estimator
from an S3 bucket into the ml_models
schema:
=> SELECT IMPORT_MODELS ('s3://ml-models/tensorflow/mnist' USING PARAMETERS new_schema='ml_models')
IMPORT MODELS
---------------
Success
(1 row)
When you set the category
parameter, the specified category must match the model type of the imported models; otherwise, the function returns an error.
Import kmeans_pmml
as a PMML model:
SELECT IMPORT_MODELS ('/root/user/kmeans_pmml' USING PARAMETERS category='PMML')
import_models
---------------
Success
(1 row)
Attempt to import kmeans_pmml
, a PMML model, as a TENSORFLOW model:
SELECT IMPORT_MODELS ('/root/user/kmeans_pmml' USING PARAMETERS category='TENSORFLOW')
import_models
-------------------------------------------------
Has failure. Please check import_log.json file
(1 row)
Import tf_mnist_estimator
as a TensorFlow model:
=> SELECT IMPORT_MODELS ( '/path/tf_models/tf_mnist_estimator' USING PARAMETERS category='TENSORFLOW');
import_models
---------------
Success
(1 row)
Import all TensorFlow models from the specified directory:
=> SELECT IMPORT_MODELS ( '/path/tf_models/*' USING PARAMETERS category='TENSORFLOW');
import_models
---------------
Success
(1 row)
See also
EXPORT_MODELS6 - REGISTER_MODEL
Registers a trained model and adds it to Model versioning environment with a status of 'under_review'. The model must be registered by the owner of the model, dbadmin, or MLSUPERVISOR
.
After a model is registered, the model owner is automatically changed to Superuser and the previous owner is given USAGE privileges. Users with the MLSUPERVISOR
role or dbamin can call the CHANGE_MODEL_STATUS function to alter the status of registered models.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
StableSyntax
REGISTER_MODEL( 'model_name', 'registered_name' )
Arguments
model_name
- Identifies the model to register. If the model has already been registered, the function throws an error.
registered_name
- Identifies an abstract name to which the model is registered. This
registered_name
can represent a group of models for a higher-level application, where each model in the group has a unique version number.If a model is the first to be registered to a given
registered_name
, the model is assigned aregistered_version
of one. Otherwise, newly registered models are assigned an incrementedregistered_version
of n + 1, where n is the number of models already registered to the givenregistered_name
. Each registered model can be uniquely identified by the combination ofregistered_name
andregistered_version
.
Privileges
Non-superusers: model owner
Examples
In the following example, the model log_reg_bfgs
is registered to the logistic_reg_app
application:
=> SELECT REGISTER_MODEL('log_reg_bfgs', 'logistic_reg_app');
REGISTER_MODEL
----------------------------------------------------------------------
Model [log_reg_bfgs] is registered as [logistic_reg_app], version [1]
(1 row)
You can query the REGISTERED_MODELS system table to view details about the newly registered model:
=> SELECT * FROM REGISTERED_MODELS;
registered_name | registered_version | status | registered_time | model_id | schema_name | model_name | model_type | category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
logistic_reg_app | 1 | UNDER_REVIEW | 2023-01-22 09:49:25.990626-02 | 45035996273853740 | public | log_reg_bfgs | LOGISTIC_REGRESSION | VERTICA_MODELS
(1 row)
See also
7 - UPGRADE_MODEL
Upgrades a model from a previous Vertica version. Vertica automatically runs this function during a database upgrade and if you run the IMPORT_MODELS function. Manually call this function to upgrade models after a backup or restore.
If UPGRADE_MODEL fails to upgrade the model and the model is of category VERTICA_MODELS, it cannot be used for in-database scoring and cannot be exported as a PMML model.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
UPGRADE_MODEL ( [ USING PARAMETERS [model_name = 'model-name'] ] )
Parameters
model_name
- Name of the model to upgrade. If you omit this parameter, Vertica upgrades all models on which you have privileges.
Privileges
Non-superuser: Upgrades only models that the user owns.
Examples
Upgrade model myLogisticRegModel
:
=> SELECT UPGRADE_MODEL( USING PARAMETERS model_name = 'myLogisticRegModel');
UPGRADE_MODEL
----------------------------
1 model(s) upgrade
(1 row)
Upgrade all models that the user owns:
=> SELECT UPGRADE_MODEL();
UPGRADE_MODEL
----------------------------
20 model(s) upgrade
(1 row)