KPROTOTYPES
Executes the k-prototypes algorithm on an input relation. The result is a model with a list of cluster centers.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Syntax
SELECT KPROTOTYPES ('model-name', 'input-relation', 'input-columns', num-clusters
                [USING PARAMETERS [exclude_columns = 'exclude-columns']
                [, max_iterations = 'max-iterations']
                [, epsilon = epsilon]
                [, {[init_method = 'init-method'] } | { initial_centers_table = 'init-table' } ]
                [, gamma = 'gamma']
                [, output_view = 'output-view']
                [, key_columns = 'key-columns']]);
Behavior type
Volatile
Arguments
- model-name
- Name of the model resulting from the training.
- input-relation
- Name of the table or view containing the training samples.
- input-columns
- String containing a comma-separated list of columns to use from the input-relation, or asterisk (*) to select all columns.
- num-clusters
- Integer ≤ 10,000 representing the number of clusters to create. This argument represents the k in k-prototypes.
Parameters
- exclude_columns
- String containing a comma-separated list of column names from input-columns to exclude from processing.
Default: (empty) 
- max_iterations
- Integer ≤ 1M representing the maximum number of iterations the algorithm performs.
Default: Integer ≤ 1M 
- epsilon
- Floating-point threshold used to determine whether the algorithm has converged.
Default: 1e-4 
- init_method
- String specifying the method used to find the initial k-prototypes cluster centers.
Default: "random" 
- initial_centers_table
- The table with the initial cluster centers to use.
- gamma
- Float between 0 and 10000 specifying the weighting factor for categorical columns. It determines the relative importance of numerical and categorical attributes.
Default: Inferred from data. 
- output_view
- Name of the view in which to save the assignment of each point to its cluster.
- key_columns
- Comma-separated list of column names that identify the output rows. Columns must be in the input-columns argument list.
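To make the role of gamma concrete, the following Python sketch uses the textbook k-prototypes dissimilarity: squared Euclidean distance over the numeric columns plus gamma times the number of categorical mismatches. This is an illustration only. Whether the database function computes distances in exactly this way is an assumption, and the names mixed_distance, nearest_center, and the sample centers are hypothetical.

```python
from typing import List, Sequence, Tuple

Center = Tuple[Sequence[float], Sequence[str]]


def mixed_distance(
    point_num: Sequence[float],
    point_cat: Sequence[str],
    center_num: Sequence[float],
    center_cat: Sequence[str],
    gamma: float,
) -> float:
    """Textbook k-prototypes dissimilarity (assumed form, not confirmed
    for any specific database implementation)."""
    # Numeric part: squared Euclidean distance.
    numeric_part = sum((p - c) ** 2 for p, c in zip(point_num, center_num))
    # Categorical part: number of columns whose values differ.
    categorical_part = sum(1 for p, c in zip(point_cat, center_cat) if p != c)
    return numeric_part + gamma * categorical_part


def nearest_center(
    point_num: Sequence[float],
    point_cat: Sequence[str],
    centers: List[Center],
    gamma: float,
) -> int:
    """Return the index of the center with the smallest mixed distance."""
    distances = [
        mixed_distance(point_num, point_cat, c_num, c_cat, gamma)
        for c_num, c_cat in centers
    ]
    return min(range(len(centers)), key=distances.__getitem__)


# Hypothetical centers over one numeric column (x0) and one categorical
# column (country).
centers: List[Center] = [([20.0], ["US"]), ([30.0], ["China"])]

# Small gamma: the numeric column dominates the assignment.
low = nearest_center([24.0], ["China"], centers, gamma=1.0)
# Large gamma: a categorical mismatch outweighs the numeric gap.
high = nearest_center([24.0], ["China"], centers, gamma=100.0)
```

With these hypothetical centers, the point (24, 'China') is assigned to center 0 when gamma is 1 and to center 1 when gamma is 100, which shows how a larger gamma shifts weight toward the categorical columns.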
Examples
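The following example creates a k-prototypes model named small_model from input table small_test_mixed, using initial cluster centers from table small_test_mixed_centers, and then applies the model to the same table: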
=> SELECT KPROTOTYPES('small_model', 'small_test_mixed', 'x0, country', 3 USING PARAMETERS initial_centers_table='small_test_mixed_centers', key_columns='pid');
      KPROTOTYPES
---------------------------
Finished in 2 iterations
(1 row)
=> SELECT country, x0, APPLY_KPROTOTYPES(country, x0
USING PARAMETERS model_name='small_model')
FROM small_test_mixed;
  country   | x0  | apply_kprototypes
------------+-----+-------------------
 'China'    |  20 |                 0
 'US'       |  85 |                 2
 'Russia'   |  80 |                 1
 'Brazil'   |  78 |                 1
 'US'       |  23 |                 0
 'US'       |  50 |                 0
 'Canada'   |  24 |                 0
 'Canada'   |  18 |                 0
 'Russia'   |  90 |                 2
 'Russia'   |  98 |                 2
 'Brazil'   |  89 |                 2
...
(45 rows)