APPLY_BISECTING_KMEANS
Applies a trained bisecting k-means model to an input relation, and assigns each new data point to the closest matching cluster in the trained model.
Applies a trained bisecting k-means model to an input relation, and assigns each new data point to the closest matching cluster in the trained model.
Note
If the input relation is defined in Hive, useSYNC_WITH_HCATALOG_SCHEMA
to sync the hcatalog
schema, and then run the machine learning function.
Syntax
SELECT APPLY_BISECTING_KMEANS( 'input-columns'
USING PARAMETERS model_name = 'model-name'
[, num_clusters = 'num-clusters']
[, match_by_pos = match-by-position] ] )
Arguments
input-columns
- Comma-separated list of columns to use from the input relation, or asterisk (*) to select all columns. Input columns must be of data type numeric.
Parameters
model_name
Name of the model (case-insensitive).
num_clusters
- Integer between 1 and
k
inclusive, wherek
is the number of centers in the model, specifies the number of clusters to use for prediction.Default: Value that the model specifies for
k
match_by_pos
Boolean value that specifies how input columns are matched to model features:
-
true
: Match by the position of columns in the input columns list. -
false
(default): Match by name.
-
Privileges
Non-superusers: model owner, or USAGE privileges on the model