Model versioning

Model versioning provides an infrastructure to track and manage the status of registered models in a database.

Model versioning provides an infrastructure to track and manage the status of registered models in a database. The versioning infrastructure supports a collaborative environment where multiple users can submit candidate models for individual applications, which are identified by their registered_name. Models in Vertica are by default unregistered, but any user with sufficient privileges can register a model to an application and add it to the versioning environment.

Register models

When a user has a candidate model ready to submit to an application, they can register the model using the REGISTER_MODEL function, which accepts a model_name and registered_name. The registered model is assigned an initial status of 'under_review' and is visible in the REGISTERED_MODELS system table to users with USAGE privileges.

All registered models under a given registered_name are considered different versions of the same model, regardless of trainer, algorithm type, or production status. If a model is the first to be registered to a given registered_name, the model is assigned a registered_version of one. Otherwise, newly registered models are assigned an incremented registered_version of n + 1, where n is the number of models already registered to the given registered_name. Each registered model can be uniquely identified by the combination of registered_name and registered_version.

Change model status

After a model is registered, the model owner is automatically changed to superuser and the previous owner is given USAGE privileges. The MLSUPERVISOR role has full privileges over all registered models, as does dbadmin. Only users with these two roles can call the CHANGE_MODEL_STATUS function to change the status of registered models. There are six possible statuses for registered models:

  • under_review: Status assigned to newly registered models.

  • staging: Model is targeted for A/B testing against the model currently in production.

  • production: Model is in production for its specified application. Only one model can be in production for a given registered_name at one time.

  • archived: Status of models that were previously in production. Archived models can be returned to production at any time.

  • declined: Model is no longer in consideration for production.

  • unregistered: Model is removed from the versioning environment. The model does not appear in the REGISTERED_MODELS system table.

The following diagram depicts the valid status transitions:

You can view the status history of registered models with the MODEL_STATUS_HISTORY system table, which includes models that have been unregistered or dropped. Only superusers or users to whom they have granted sufficient privileges can query the table. Vertica recommends granting access on the table to the MLSUPERVISOR role.

Managing registered models

Only users with the MLSUPERVISOR role, or those granted equivalent privileges, can drop or alter registered models. As with unregistered models, you can drop and alter registered models with the DROP MODEL and ALTER MODEL commands. Dropped models no longer appear in the REGISTERED_MODELS system table.

To make predictions with a registered model, you must use the [schema_name.]model_name and predict function for the appropriate model type. You can find all necessary information in the REGISTERED_MODELS system table.

Model registration does not effect the process of exporting models. However, model registration information, such as registered_version, is not captured in exported models. All imported models are initially unregistered, but each category of imported model—native Vertica, PMML, and TensorFlow—is compatible with model versioning.

Examples

The following example demonstrates a possible model versioning workflow. This workflow begins with registering multiple models to an application, continues with an MLSUPERVISOR managing and changing the status of those models, and concludes with one of the models in production.

First, the models must be registered to an application. In this case, the models, native_linear_reg and linear_reg_spark1, are registered as the linear_reg_app application:

=> SELECT REGISTER_MODEL('native_linear_reg', 'linear_reg_app');
                           REGISTER_MODEL
-------------------------------------------------------------------------
Model [native_linear_reg] is registered as [linear_reg_app], version [2]
(1 row)

=> SELECT REGISTER_MODEL('linear_reg_spark1', 'linear_reg_app');
                           REGISTER_MODEL
-------------------------------------------------------------------------
Model [linear_reg_spark1] is registered as [linear_reg_app], version [3]
(1 row)

You can query the REGISTERED_MODELS system table to view details about the newly registered models, such as the version number and model status:

=> SELECT * FROM REGISTERED_MODELS;
  registered_name | registered_version |    status    |        registered_time        |      model_id     | schema_name |    model_name     |      model_type       |    category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
 linear_reg_app   |                  3 | UNDER_REVIEW | 2023-01-26 09:52:00.082166-04 | 45035996273714020 | public      | linear_reg_spark1 | PMML_REGRESSION_MODEL | PMML
 linear_reg_app   |                  2 | UNDER_REVIEW | 2023-01-26 09:51:04.553102-05 | 45035996273850350 | public      | native_linear_reg | LINEAR_REGRESSION     | VERTICA_MODELS
 linear_reg_app   |                  1 | PRODUCTION   | 2023-01-24 05:29:25.990626-02 | 45035996273853740 | public      | linear_reg_newton | LINEAR_REGRESSION     | VERTICA_MODELS
 logistic_reg_app |                  1 | PRODUCTION   | 2023-01-23 08:49:25.990626-02 | 45035996273853740 | public      | log_reg_cgd       | LOGISTIC_REGRESSION   | VERTICA_MODELS
(4 rows)

If you query the MODELS system table, you can see that the model owner has automatically been changed to superuser, dbadmin in this case:

=> SELECT model_name, owner_name FROM MODELS WHERE model_name IN ('native_linear_reg', 'linear_reg_spark1');
     model_name    | owner_name
-------------------+------------
 native_linear_reg | dbadmin
 linear_reg_spark1 | dbadmin
(2 rows)

As a user with the MLSUPERVISOR role enabled, you can then change the status of both models to 'staging' with the CHANGE_MODEL_STATUS function, which accepts a registered_name and registered_version:

=> SELECT CHANGE_MODEL_STATUS('linear_reg_app', 2, 'staging');
                          CHANGE_MODEL_STATUS
-----------------------------------------------------------------------------
The status of model [linear_reg_app] - version [2] is changed to [staging]
(1 row)

=> SELECT CHANGE_MODEL_STATUS('linear_reg_app', 3, 'staging');
                          CHANGE_MODEL_STATUS
-----------------------------------------------------------------------------
The status of model [linear_reg_app] - version [3] is changed to [staging]
(1 row)

After comparing the evaluation metrics of the two staged models against the model currently in production, you can put the better performing model into production. In this case, the linear_reg_spark1 model is moved into production and the linear_reg_spark1 model is declined:

=> SELECT CHANGE_MODEL_STATUS('linear_reg_app', 2, 'declined');
CHANGE_MODEL_STATUS
-----------------------------------------------------------------------------
The status of model [linear_reg_app] - version [2] is changed to [declined]
(1 row)

=> SELECT CHANGE_MODEL_STATUS('linear_reg_app', 3, 'production');
CHANGE_MODEL_STATUS
-----------------------------------------------------------------------------
The status of model [linear_reg_app] - version [3] is changed to [production]
(1 row)

You can then query the REGISTERED_MODELS system table to confirm that the linear_reg_spark1 model is now in 'production', the native_linear_reg model has been set to 'declined', and the previously in production linear_reg_spark1 model has been automatically moved to 'archived':

=> SELECT * FROM REGISTERED_MODELS;
  registered_name | registered_version |    status    |        registered_time        |      model_id     | schema_name |    model_name     |      model_type       |    category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
 linear_reg_app   |                  3 | PRODUCTION   | 2023-01-26 09:52:00.082166-04 | 45035996273714020 | public      | linear_reg_spark1 | PMML_REGRESSION_MODEL | PMML
 linear_reg_app   |                  2 | DECLINED     | 2023-01-26 09:51:04.553102-05 | 45035996273850350 | public      | native_linear_reg | LINEAR_REGRESSION     | VERTICA_MODELS
 linear_reg_app   |                  1 | ARCHIVED     | 2023-01-24 05:29:25.990626-02 | 45035996273853740 | public      | linear_reg_newton | LINEAR_REGRESSION     | VERTICA_MODELS
 logistic_reg_app |                  1 | PRODUCTION   | 2023-01-23 08:49:25.990626-02 | 45035996273853740 | public      | log_reg_cgd       | LOGISTIC_REGRESSION   | VERTICA_MODELS
(4 rows)

To remove the declined native_linear_reg model from the versioning environment, you can set the status to 'unregistered':

=> SELECT CHANGE_MODEL_STATUS('linear_reg_app', 2, 'unregistered');
                            CHANGE_MODEL_STATUS
----------------------------------------------------------------------------------
The status of model [linear_reg_app] - version [2] is changed to [unregistered]
(1 row)

=> SELECT * FROM REGISTERED_MODELS;
  registered_name | registered_version |    status    |        registered_time        |      model_id     | schema_name |    model_name     |      model_type       |    category
------------------+--------------------+--------------+-------------------------------+-------------------+-------------+-------------------+-----------------------+----------------
 linear_reg_app   |                  3 | PRODUCTION   | 2023-01-26 09:52:00.082166-04 | 45035996273714020 | public      | linear_reg_spark1 | PMML_REGRESSION_MODEL | PMML
 linear_reg_app   |                  1 | ARCHIVED     | 2023-01-24 05:29:25.990626-02 | 45035996273853740 | public      | linear_reg_newton | LINEAR_REGRESSION     | VERTICA_MODELS
 logistic_reg_app |                  1 | PRODUCTION   | 2023-01-23 08:49:25.990626-02 | 45035996273853740 | public      | log_reg_cgd       | LOGISTIC_REGRESSION   | VERTICA_MODELS
(3 rows)

You can see that the native_linear_reg model no longer appears in the REGISTERED_MODELS system table. However, you can still query the MODEL_STATUS_HISTORY system table to view the status history of native_linear_reg:


=> SELECT * FROM MODEL_STATUS_HISTORY WHERE model_id=45035996273850350;
 registered_name | registered_version |  new_status  |  old_status  |       status_change_time      |    operator_id    | operator_name |      model_id     | schema_name |    model_name
-----------------+--------------------+--------------+------------- +-------------------------------+-------------------+---------------+-------------------+-------------+-------------------
linear_reg_app   |                  2 | UNDER_REVIEW | UNREGISTERED | 2023-01-26 09:51:04.553102-05 | 45035996273964824 | user1         | 45035996273850350 | public      | native_linear_reg
linear_reg_app   |                  2 | STAGING      | UNDER_REVIEW | 2023-01-29 11:33:02.052464-05 | 45035996273704962 | supervisor1   | 45035996273850350 | public      | native_linear_reg
linear_reg_app   |                  2 | DECLINED     | STAGING      | 2023-01-30 04:12:30.481136-05 | 45035996273704962 | supervisor1   | 45035996273850350 | public      | native_linear_reg
linear_reg_app   |                  2 | UNREGISTERED | DECLINED     | 2023-02-02 03:25:32.332132-05 | 45035996273704962 | supervisor1   | 45035996273850350 | public      | native_linear_reg
(4 rows)

See also