Cardinality

Refers to the number of unique values for a given column in a relational table. Cardinality is commonly described at three levels:

  • High cardinality: Refers to columns in which most or all values are unique, such as a customer ID or an employee e-mail address. For example, in the Vertica VMart schema, the employee_dimension table contains an employee_key column whose values uniquely identify each employee. Because every value in this column is unique and the number of values grows with the number of rows, the column is said to have high cardinality.

  • Normal cardinality: Refers to columns in which values repeat but many distinct values still exist, such as job titles and street addresses. Examples in the employee_dimension table are job_title and employee_first_name, since many employees can share the same job title or the same first name.

  • Low cardinality: Refers to columns with few unique values relative to the overall number of records in the table. For example, in the employee_dimension table, the employee_gender column contains two unique values: 'Male' and 'Female'. Because only two values are possible in this column, its cardinality is low.
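
The three levels above can be illustrated by comparing the number of distinct values in a column to the number of rows. The following is a minimal sketch in Python, not a Vertica feature: the sample rows are invented for illustration (they are not actual VMart data), and the helper names distinct_ratio and classify, as well as the 0.9 and 0.3 cutoffs, are assumptions made for this example only.

```python
# Illustrative rows only; these are NOT actual VMart data.
rows = [
    {"employee_key": 1, "job_title": "Cashier", "employee_gender": "Female"},
    {"employee_key": 2, "job_title": "Cashier", "employee_gender": "Male"},
    {"employee_key": 3, "job_title": "Manager", "employee_gender": "Female"},
    {"employee_key": 4, "job_title": "Manager", "employee_gender": "Male"},
    {"employee_key": 5, "job_title": "Analyst", "employee_gender": "Female"},
    {"employee_key": 6, "job_title": "Analyst", "employee_gender": "Male"},
    {"employee_key": 7, "job_title": "Greeter", "employee_gender": "Female"},
    {"employee_key": 8, "job_title": "Greeter", "employee_gender": "Male"},
]


def distinct_ratio(rows, column):
    """Return (distinct count, distinct count / row count) for one column."""
    if not rows:
        raise ValueError("rows must not be empty")
    values = [row[column] for row in rows]
    distinct = len(set(values))
    return distinct, distinct / len(values)


def classify(ratio, high=0.9, low=0.3):
    """Map a distinct-value ratio to a cardinality label.

    The cutoffs are hypothetical values chosen for this example;
    they are not defined by Vertica or by any standard.
    """
    if ratio >= high:
        return "high"
    if ratio <= low:
        return "low"
    return "normal"


for column in ("employee_key", "job_title", "employee_gender"):
    distinct, ratio = distinct_ratio(rows, column)
    print(f"{column}: {distinct} distinct values, {classify(ratio)} cardinality")
```

With only eight sample rows the ratios are coarse. In a table with many thousands of records, a two-value column such as employee_gender would have a ratio close to zero, which is why low cardinality is described relative to the overall number of records.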