Functions
Functions return information from the database. This section describes functions that Vertica supports. Except for meta-functions, you can use a function anywhere an expression is allowed.
Meta-functions usually access the internal state of Vertica. They can be used in a top-level SELECT statement only, and the statement cannot contain other clauses such as FROM or WHERE. Meta-functions are labeled on their reference pages.
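As a minimal sketch of the restriction described above, the statements below contrast an ordinary function with a meta-function. The table name `my_table` and column `balance` are hypothetical, and the example assumes that GET_CURRENT_EPOCH is labeled as a meta-function on its reference page.

```sql
-- An ordinary function can appear anywhere an expression is allowed,
-- including in the select list and in a WHERE clause.
-- (my_table and balance are hypothetical names.)
SELECT ABS(balance) FROM my_table WHERE ABS(balance) > 100;

-- A meta-function is called from a top-level SELECT with no other clauses.
SELECT GET_CURRENT_EPOCH();

-- Not allowed: combining a meta-function with FROM or WHERE.
-- SELECT GET_CURRENT_EPOCH() FROM my_table;
```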
The Behavior Type section on each reference page categorizes the function's return behavior as one or more of the following:
- Immutable (invariant): When run with a given set of arguments, immutable functions always produce the same result, regardless of environment or session settings such as locale.
- Stable: When run with a given set of arguments, stable functions produce the same result within a single query or scan operation. However, a stable function can produce different results when it runs in a different environment or at a different time, for example after a change of locale or time zone. SYSDATE is one such function.
- Volatile: Regardless of their arguments or environment, volatile functions can return a different result with each invocation—for example, UUID_GENERATE.
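The three behavior types can be illustrated with functions named in this section. This is only a sketch: ABS is assumed here to be categorized as immutable, and the actual category of any function should be checked in the Behavior Type section of its reference page.

```sql
-- Immutable (assumed for ABS): the same argument always gives the same result.
SELECT ABS(-5);

-- Stable: SYSDATE gives one consistent value within a single query,
-- but a later query can return a different value.
SELECT SYSDATE(), SYSDATE();

-- Volatile: each invocation of UUID_GENERATE can return a different value,
-- even within the same statement.
SELECT UUID_GENERATE(), UUID_GENERATE();
```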
List of all functions
The following list contains all Vertica SQL functions.
-
ABS
- Returns the absolute value of the argument. [Mathematical functions]
-
ACOS
- Returns a DOUBLE PRECISION value representing the trigonometric inverse cosine of the argument. [Mathematical functions]
-
ACOSH
- Returns a DOUBLE PRECISION value that represents the inverse (arc) hyperbolic cosine of the function argument. [Mathematical functions]
-
ACTIVE_SCHEDULER_NODE
- Returns the active scheduler node. [Stored procedure functions]
-
ADD_MONTHS
- Adds the specified number of months to a date and returns the sum as a DATE. [Date/time functions]
-
ADVANCE_EPOCH
- Manually closes the current epoch and begins a new epoch. [Epoch functions]
-
AGE_IN_MONTHS
- Returns the difference in months between two dates, expressed as an integer. [Date/time functions]
-
AGE_IN_YEARS
- Returns the difference in years between two dates, expressed as an integer. [Date/time functions]
-
ALTER_LOCATION_LABEL
- Adds a label to a storage location, or changes or removes an existing label. [Storage functions]
-
ALTER_LOCATION_SIZE
- Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database. [Eon Mode functions]
-
ALTER_LOCATION_USE
- Alters the type of data that a storage location holds. [Storage functions]
-
ANALYZE_CONSTRAINTS
- Analyzes and reports on constraint violations within the specified scope. [Table functions]
-
ANALYZE_CORRELATIONS
- This function is deprecated and will be removed in a future release. [Table functions]
-
ANALYZE_EXTERNAL_ROW_COUNT
- Calculates the exact number of rows in an external table. [Statistics management functions]
-
ANALYZE_STATISTICS
- Collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table. [Statistics management functions]
-
ANALYZE_STATISTICS_PARTITION
- Collects and aggregates data samples and storage information for a range of partitions in the specified table. [Statistics management functions]
-
ANALYZE_WORKLOAD
- Runs Workload Analyzer, a utility that analyzes system information held in system tables. [Workload management functions]
-
APPLY_AVG
- Returns the average of all elements in a collection with numeric values. [Collection functions]
-
APPLY_BISECTING_KMEANS
- Applies a trained bisecting k-means model to an input relation, and assigns each new data point to the closest matching cluster in the trained model. [Transformation functions]
-
APPLY_COUNT (ARRAY_COUNT)
- Returns the total number of non-null elements in a collection. [Collection functions]
-
APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)
- Returns the total number of elements in a collection, including NULLs. [Collection functions]
-
APPLY_IFOREST
- Applies an isolation forest (iForest) model to an input relation. [Transformation functions]
-
APPLY_INVERSE_PCA
- Inverts the APPLY_PCA-generated transform back to the original coordinate system. [Transformation functions]
-
APPLY_INVERSE_SVD
- Transforms the data back to the original domain. [Transformation functions]
-
APPLY_KMEANS
- Assigns each row of an input relation to a cluster center from an existing k-means model. [Transformation functions]
-
APPLY_KPROTOTYPES
- Assigns each row of an input relation to a cluster center from an existing k-prototypes model. [Transformation functions]
-
APPLY_MAX
- Returns the largest non-null element in a collection. [Collection functions]
-
APPLY_MIN
- Returns the smallest non-null element in a collection. [Collection functions]
-
APPLY_NORMALIZE
- A user-defined transform function (UDTF) that applies the normalization parameters saved in a model to a set of specified input columns. [Transformation functions]
-
APPLY_ONE_HOT_ENCODER
- A user-defined transform function (UDTF) that loads the one hot encoder model and writes out a table that contains the encoded columns. [Transformation functions]
-
APPLY_PCA
- Transforms the data using a PCA model. [Transformation functions]
-
APPLY_SUM
- Computes the sum of all elements in a collection of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL). [Collection functions]
-
APPLY_SVD
- Transforms the data using an SVD model. [Transformation functions]
-
APPROXIMATE_COUNT_DISTINCT
- Returns the number of distinct non-NULL values in a data set. [Aggregate functions]
-
APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS
- Calculates the number of distinct non-NULL values from the synopsis objects created by APPROXIMATE_COUNT_DISTINCT_SYNOPSIS. [Aggregate functions]
-
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS
- Summarizes the information of distinct non-NULL values and materializes the result set in a VARBINARY or LONG VARBINARY synopsis object. [Aggregate functions]
-
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS_MERGE
- Aggregates multiple synopses into one new synopsis. [Aggregate functions]
-
APPROXIMATE_MEDIAN [aggregate]
- Computes the approximate median of an expression over a group of rows. [Aggregate functions]
-
APPROXIMATE_PERCENTILE [aggregate]
- Computes the approximate percentile of an expression over a group of rows. [Aggregate functions]
-
APPROXIMATE_QUANTILES
- Computes an array of weighted, approximate percentiles of a column within some user-specified error. [Aggregate functions]
-
ARGMAX [analytic]
- This function is patterned after the mathematical function argmax(f(x)), which returns the value of x that maximizes f(x). [Analytic functions]
-
ARGMAX_AGG
- Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. [Aggregate functions]
-
ARGMIN [analytic]
- This function is patterned after the mathematical function argmin(f(x)), which returns the value of x that minimizes f(x). [Analytic functions]
-
ARGMIN_AGG
- Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. [Aggregate functions]
-
ARIMA
- Creates and trains an autoregressive integrated moving average (ARIMA) model from a time series with consistent timesteps. [Machine learning algorithms]
-
ARRAY_CAT
- Concatenates two arrays of the same element type and dimensionality. [Collection functions]
-
ARRAY_CONTAINS
- Returns true if the specified element is found in the array and false if not. [Collection functions]
-
ARRAY_DIMS
- Returns the dimensionality of the input array. [Collection functions]
-
ARRAY_FIND
- Returns the ordinal position of a specified element in an array, or -1 if not found. [Collection functions]
-
ASCII
- Converts the first character of a VARCHAR datatype to an INTEGER. [String functions]
-
ASIN
- Returns a DOUBLE PRECISION value representing the trigonometric inverse sine of the argument. [Mathematical functions]
-
ASINH
- Returns a DOUBLE PRECISION value that represents the inverse (arc) hyperbolic sine of the function argument. [Mathematical functions]
-
ATAN
- Returns a DOUBLE PRECISION value representing the trigonometric inverse tangent of the argument. [Mathematical functions]
-
ATAN2
- Returns a DOUBLE PRECISION value representing the trigonometric inverse tangent of the arithmetic dividend of the arguments. [Mathematical functions]
-
ATANH
- Returns a DOUBLE PRECISION value that represents the inverse hyperbolic tangent of the function argument. [Mathematical functions]
-
AUDIT
- Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. [License functions]
-
AUDIT_FLEX
- Returns the estimated ROS size of __raw__ columns, equivalent to the export size of the flex data in the audited objects. [License functions]
-
AUDIT_LICENSE_SIZE
- Triggers an immediate audit of the database size to determine if it is in compliance with the raw data storage allowance included in your Vertica licenses. [License functions]
-
AUDIT_LICENSE_TERM
- Triggers an immediate audit to determine if the Vertica license has expired. [License functions]
-
AUTOREGRESSOR
- Creates an autoregressive (AR) model from a stationary time series with consistent timesteps that can then be used for prediction via PREDICT_AR. [Machine learning algorithms]
-
AVG [aggregate]
- Computes the average (arithmetic mean) of an expression over a group of rows. [Aggregate functions]
-
AVG [analytic]
- Computes an average of an expression in a group within a window. [Analytic functions]
-
AZURE_TOKEN_CACHE_CLEAR
- Clears the cached access token for Azure. [Cloud functions]
-
BACKGROUND_DEPOT_WARMING
- Vertica version 10.0.0 removes support for foreground depot warming. [Eon Mode functions]
-
BALANCE
- Returns a view with an equal distribution of the input data based on the response_column. [Data preparation]
-
BISECTING_KMEANS
- Executes the bisecting k-means algorithm on an input relation. [Machine learning algorithms]
-
BIT_AND
- Takes the bitwise AND of all non-null input values. [Aggregate functions]
-
BIT_LENGTH
- Returns the length of the string expression in bits (bytes * 8) as an INTEGER. [String functions]
-
BIT_OR
- Takes the bitwise OR of all non-null input values. [Aggregate functions]
-
BIT_XOR
- Takes the bitwise XOR of all non-null input values. [Aggregate functions]
-
BITCOUNT
- Returns the number of one-bits (sometimes referred to as set-bits) in the given VARBINARY value. [String functions]
-
BITSTRING_TO_BINARY
- Translates the given VARCHAR bitstring representation into a VARBINARY value. [String functions]
-
BOOL_AND [aggregate]
- Processes Boolean values and returns a Boolean value result. [Aggregate functions]
-
BOOL_AND [analytic]
- Returns the Boolean value of an expression within a window. [Analytic functions]
-
BOOL_OR [aggregate]
- Processes Boolean values and returns a Boolean value result. [Aggregate functions]
-
BOOL_OR [analytic]
- Returns the Boolean value of an expression within a window. [Analytic functions]
-
BOOL_XOR [aggregate]
- Processes Boolean values and returns a Boolean value result. [Aggregate functions]
-
BOOL_XOR [analytic]
- Returns the Boolean value of an expression within a window. [Analytic functions]
-
BTRIM
- Removes the longest string consisting only of specified characters from the start and end of a string. [String functions]
-
BUILD_FLEXTABLE_VIEW
- Creates, or re-creates, a view for a default or user-defined keys table, ignoring any empty keys. [Flex data functions]
-
CALENDAR_HIERARCHY_DAY
- Specifies to group DATE partition keys into a hierarchy of years, months, and days. [Partition functions]
-
CANCEL_DEPOT_WARMING
- Cancels depot warming on a node. [Eon Mode functions]
-
CANCEL_DRAIN_SUBCLUSTER
- Cancels the draining of a subcluster or subclusters. [Eon Mode functions]
-
CANCEL_REBALANCE_CLUSTER
- Stops any rebalance task that is currently in progress or is waiting to execute. [Cluster functions]
-
CANCEL_REFRESH
- Cancels refresh-related internal operations initiated by START_REFRESH and REFRESH. [Session functions]
-
CBRT
- Returns the cube root of the argument. [Mathematical functions]
-
CEILING
- Rounds the returned value up to the next whole number. [Mathematical functions]
-
CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY
- Changes the run-time priority of an active query. [Workload management functions]
-
CHANGE_MODEL_STATUS
- Changes the status of a registered model. [Model management]
-
CHANGE_RUNTIME_PRIORITY
- Changes the run-time priority of a query that is actively running. [Workload management functions]
-
CHARACTER_LENGTH
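- Returns the length of a string: in UTF-8 characters for CHAR and VARCHAR columns, and in bytes (octets) for BINARY and VARBINARY columns. [String functions]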
-
CHR
- Converts the first character of an INTEGER datatype to a VARCHAR. [String functions]
-
CLEAN_COMMUNAL_STORAGE
- Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed. [Eon Mode functions]
-
CLEAR_CACHES
- Clears the Vertica internal cache files. [Storage functions]
-
CLEAR_DATA_COLLECTOR
- Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR. [Data collector functions]
-
CLEAR_DATA_DEPOT
- Deletes the specified depot data. [Eon Mode functions]
-
CLEAR_DEPOT_PIN_POLICY_PARTITION
- Clears a depot pinning policy from the specified table or projection partitions. [Eon Mode functions]
-
CLEAR_DEPOT_PIN_POLICY_PROJECTION
- Clears a depot pinning policy from the specified projection. [Eon Mode functions]
-
CLEAR_DEPOT_PIN_POLICY_TABLE
- Clears a depot pinning policy from the specified table. [Eon Mode functions]
-
CLEAR_FETCH_QUEUE
- Removes all entries or entries for a specific transaction from the queue of fetch requests of data from the communal storage. [Eon Mode functions]
-
CLEAR_HDFS_CACHES
- Clears the configuration information copied from HDFS and any cached connections. [Hadoop functions]
-
CLEAR_OBJECT_STORAGE_POLICY
- Removes a user-defined storage policy from the specified database, schema or table. [Storage functions]
-
CLEAR_PROFILING
- Clears data for the specified profiling type from memory. [Profiling functions]
-
CLEAR_PROJECTION_REFRESHES
- Clears projection refresh history from system table PROJECTION_REFRESHES. [Projection functions]
-
CLEAR_RESOURCE_REJECTIONS
- Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables. [Database functions]
-
CLOCK_TIMESTAMP
- Returns a value of type TIMESTAMP WITH TIMEZONE that represents the current system-clock time. [Date/time functions]
-
CLOSE_ALL_RESULTSETS
- Closes all result set sessions within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets. [Client connection functions]
-
CLOSE_ALL_SESSIONS
- Closes all external sessions except the one that issues this function. [Session functions]
-
CLOSE_RESULTSET
- Closes a specific result set within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets. [Client connection functions]
-
CLOSE_SESSION
- Interrupts the specified external session, rolls back the current transaction if any, and closes the socket. [Session functions]
-
CLOSE_USER_SESSIONS
- Stops the session for a user, rolls back any transaction currently running, and closes the connection. [Session functions]
-
COALESCE
- Returns the value of the first non-null expression in the list. [NULL-handling functions]
-
COLLATION
- Applies a collation to two or more strings. [String functions]
-
COMPACT_STORAGE
- Bundles existing data (.fdb) and index (.pidx) files into the .gt file format. [Database functions]
-
COMPUTE_FLEXTABLE_KEYS
- Computes the virtual columns (keys and values) from flex table VMap data. [Flex data functions]
-
COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW
- Combines the functionality of BUILD_FLEXTABLE_VIEW and COMPUTE_FLEXTABLE_KEYS to compute virtual columns (keys) from the VMap data of a flex table and construct a view. [Flex data functions]
-
CONCAT
- Concatenates two strings and returns a varchar data type. [String functions]
-
CONDITIONAL_CHANGE_EVENT [analytic]
- Assigns an event window number to each row, starting from 0, and increments by 1 when the result of evaluating the argument expression on the current row differs from that on the previous row. [Analytic functions]
-
CONDITIONAL_TRUE_EVENT [analytic]
- Assigns an event window number to each row, starting from 0, and increments the number by 1 when the result of the boolean argument expression evaluates true. [Analytic functions]
-
CONFUSION_MATRIX
- Computes the confusion matrix of a table with observed and predicted values of a response variable. [Model evaluation]
-
CONTAINS
- Returns true if the specified element is found in the collection and false if not. [Collection functions]
-
COPY_PARTITIONS_TO_TABLE
- Copies partitions from one table to another. [Partition functions]
-
COPY_TABLE
- Copies one table to another. [Table functions]
-
CORR
- Returns the DOUBLE PRECISION coefficient of correlation of a set of expression pairs, as per the Pearson correlation coefficient. [Aggregate functions]
-
CORR_MATRIX
- Takes an input relation with numeric columns, and calculates the Pearson Correlation Coefficient between each pair of its input columns. [Data preparation]
-
COS
- Returns a DOUBLE PRECISION value that represents the trigonometric cosine of the passed parameter. [Mathematical functions]
-
COSH
- Returns a DOUBLE PRECISION value that represents the hyperbolic cosine of the passed parameter. [Mathematical functions]
-
COT
- Returns a DOUBLE PRECISION value representing the trigonometric cotangent of the argument. [Mathematical functions]
-
COUNT [aggregate]
- Returns as a BIGINT the number of rows in each group where the expression is not NULL. [Aggregate functions]
-
COUNT [analytic]
- Counts occurrences within a group within a window. [Analytic functions]
-
COVAR_POP
- Returns the population covariance for a set of expression pairs. [Aggregate functions]
-
COVAR_SAMP
- Returns the sample covariance for a set of expression pairs. [Aggregate functions]
-
CROSS_VALIDATE
- Performs k-fold cross validation on a learning algorithm using an input relation, and grid search for hyperparameters. [Model evaluation]
-
CUME_DIST [analytic]
- Calculates the cumulative distribution, or relative rank, of the current row with regard to other rows in the same partition within a window. [Analytic functions]
-
CURRENT_DATABASE
- Returns the name of the current database, equivalent to DBNAME. [System information functions]
-
CURRENT_DATE
- Returns the date (date-type value) on which the current transaction started. [Date/time functions]
-
CURRENT_LOAD_SOURCE
- When called within the scope of a COPY statement, returns the file name used for the load. [System information functions]
-
CURRENT_SCHEMA
- Returns the name of the current schema. [System information functions]
-
CURRENT_SESSION
- Returns the ID of the current client session. [System information functions]
-
CURRENT_TIME
- Returns a value of type TIME WITH TIMEZONE that represents the start of the current transaction. [Date/time functions]
-
CURRENT_TIMESTAMP
- Returns a value of type TIMESTAMP WITH TIMEZONE that represents the start of the current transaction. [Date/time functions]
-
CURRENT_TRANS_ID
- Returns the ID of the transaction currently in progress. [System information functions]
-
CURRENT_USER
- Returns a VARCHAR containing the name of the user who initiated the current database connection. [System information functions]
-
CURRVAL
- Returns the last value across all nodes that was set by NEXTVAL on this sequence in the current session. [Sequence functions]
-
DATA_COLLECTOR_HELP
- Returns online usage instructions about the Data Collector, the V_MONITOR.DATA_COLLECTOR system table, and the Data Collector control functions. [Data collector functions]
-
DATE
- Converts the input value to a DATE data type. [Date/time functions]
-
DATE_PART
- Extracts a sub-field such as year or hour from a date/time expression, equivalent to the SQL-standard function EXTRACT. [Date/time functions]
-
DATE_TRUNC
- Truncates date and time values to the specified precision. [Date/time functions]
-
DATEDIFF
- Returns the time span between two dates, in the intervals specified. [Date/time functions]
-
DAY
- Returns as an integer the day of the month from the input value. [Date/time functions]
-
DAYOFMONTH
- Returns the day of the month as an integer. [Date/time functions]
-
DAYOFWEEK
- Returns the day of the week as an integer, where Sunday is day 1. [Date/time functions]
-
DAYOFWEEK_ISO
- Returns the ISO 8601 day of the week as an integer, where Monday is day 1. [Date/time functions]
-
DAYOFYEAR
- Returns the day of the year as an integer, where January 1 is day 1. [Date/time functions]
-
DAYS
- Returns the integer value of the specified date, where 1 AD is 1. [Date/time functions]
-
DBNAME (function)
- Returns the name of the current database, equivalent to CURRENT_DATABASE. [System information functions]
-
DECODE
- Compares expression to each search value one by one. [String functions]
-
DEGREES
- Converts an expression from radians to fractional degrees, or from degrees, minutes, and seconds to fractional degrees. [Mathematical functions]
-
DELETE_TOKENIZER_CONFIG_FILE
- Deletes a tokenizer configuration file. [Text search functions]
-
DEMOTE_SUBCLUSTER_TO_SECONDARY
- Converts a primary subcluster to a secondary subcluster. [Eon Mode functions]
-
DENSE_RANK [analytic]
- Within each window partition, ranks all rows in the query results set according to the order specified by the window's ORDER BY clause. [Analytic functions]
-
DESCRIBE_LOAD_BALANCE_DECISION
- Evaluates whether any load balancing routing rules apply to a given IP address. This function is useful when you are evaluating connection load balancing policies you have created, to ensure that they work the way you expect. [Client connection functions]
-
DESIGNER_ADD_DESIGN_QUERIES
- Reads and evaluates queries from an input file, and adds the queries that it accepts to the specified design. [Database Designer functions]
-
DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS
- Executes the specified query and evaluates its results as candidate design queries. [Database Designer functions]
-
DESIGNER_ADD_DESIGN_QUERY
- Reads and parses the specified query, and if accepted, adds it to the design. [Database Designer functions]
-
DESIGNER_ADD_DESIGN_TABLES
- Adds the specified tables to a design. [Database Designer functions]
-
DESIGNER_CANCEL_POPULATE_DESIGN
- Cancels the population or deployment operation for the specified design if it is currently running. [Database Designer functions]
-
DESIGNER_CREATE_DESIGN
- Creates a design with the specified name. [Database Designer functions]
-
DESIGNER_DESIGN_PROJECTION_ENCODINGS
- Analyzes encoding in the specified projections, creates a script to implement encoding recommendations, and optionally deploys the recommendations. [Database Designer functions]
-
DESIGNER_DROP_ALL_DESIGNS
- Removes all Database Designer-related schemas associated with the current user. [Database Designer functions]
-
DESIGNER_DROP_DESIGN
- Removes the schema associated with the specified design and all its contents. [Database Designer functions]
-
DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS
- Displays the DDL statements that define the design projections to standard output. [Database Designer functions]
-
DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT
- Displays the deployment script for the specified design to standard output. [Database Designer functions]
-
DESIGNER_RESET_DESIGN
- Discards all run-specific information of the previous Database Designer build or deployment of the specified design but keeps its configuration. [Database Designer functions]
-
DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY
- Populates the design and creates the design and deployment scripts. [Database Designer functions]
-
DESIGNER_SET_DESIGN_KSAFETY
- Sets K-safety for a comprehensive design and stores the K-safety value in the DESIGNS table. [Database Designer functions]
-
DESIGNER_SET_DESIGN_TYPE
- Specifies whether Database Designer creates a comprehensive or incremental design. [Database Designer functions]
-
DESIGNER_SET_OPTIMIZATION_OBJECTIVE
- Valid only for comprehensive database designs, specifies the optimization objective Database Designer uses. [Database Designer functions]
-
DESIGNER_SET_PROPOSE_UNSEGMENTED_PROJECTIONS
- Specifies whether a design can include unsegmented projections. [Database Designer functions]
-
DESIGNER_SINGLE_RUN
- Evaluates all queries that completed execution within the specified timespan, and returns with a design that is ready for deployment. [Database Designer functions]
-
DESIGNER_WAIT_FOR_DESIGN
- Waits for completion of operations that are populating and deploying the design. [Database Designer functions]
-
DETECT_OUTLIERS
- Returns the outliers in a data set based on the outlier threshold. [Data preparation]
-
DISABLE_DUPLICATE_KEY_ERROR
- Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled). [Table functions]
-
DISABLE_LOCAL_SEGMENTS
- Disables local data segmentation, which breaks projection segments on nodes into containers that can be easily moved to other nodes. [Cluster functions]
-
DISABLE_PROFILING
- Disables collection of profiling data of the specified type for the current session. [Profiling functions]
-
DISPLAY_LICENSE
- Returns the terms of your Vertica license. [License functions]
-
DISTANCE
- Returns the distance (in kilometers) between two points. [Mathematical functions]
-
DISTANCEV
- Returns the distance (in kilometers) between two points using the Vincenty formula. [Mathematical functions]
-
DO_TM_TASK
- Runs a Tuple Mover (TM) operation and commits current transactions. [Storage functions]
-
DROP_EXTERNAL_ROW_COUNT
- Removes external table row count statistics compiled by ANALYZE_EXTERNAL_ROW_COUNT. [Statistics management functions]
-
DROP_LICENSE
- Drops a license key from the global catalog. [Catalog functions]
-
DROP_LOCATION
- Permanently removes a retired storage location. [Storage functions]
-
DROP_PARTITIONS
- Drops the specified table partition keys. [Partition functions]
-
DROP_STATISTICS
- Removes statistical data on database projections previously generated by ANALYZE_STATISTICS. [Statistics management functions]
-
DROP_STATISTICS_PARTITION
- Removes statistical data on database projections previously generated by ANALYZE_STATISTICS_PARTITION. [Statistics management functions]
-
DUMP_CATALOG
- Returns an internal representation of the Vertica catalog. [Catalog functions]
-
DUMP_LOCKTABLE
- Returns information about deadlocked clients and the resources they are waiting for. [Database functions]
-
DUMP_PARTITION_KEYS
- Dumps the partition keys of all projections in the system. [Database functions]
-
DUMP_PROJECTION_PARTITION_KEYS
- Dumps the partition keys of the specified projection. [Partition functions]
-
DUMP_TABLE_PARTITION_KEYS
- Dumps the partition keys of all projections for the specified table. [Partition functions]
-
EDIT_DISTANCE
- Calculates and returns the Levenshtein distance between two strings. [String functions]
-
EMPTYMAP
- Constructs a new VMap with one row but without keys or data. [Flex map functions]
-
ENABLE_ELASTIC_CLUSTER
- Enables elastic cluster scaling, which makes enlarging or reducing the size of your database cluster more efficient by segmenting a node's data into chunks that can be easily moved to other hosts. [Cluster functions]
-
ENABLE_LOCAL_SEGMENTS
- Enables local storage segmentation, which breaks projection segments on nodes into containers that can be easily moved to other nodes. [Cluster functions]
-
ENABLE_PROFILING
- Enables collection of profiling data of the specified type for the current session. [Profiling functions]
-
ENABLE_SCHEDULE
- Enables or disables a schedule. [Stored procedure functions]
-
ENABLE_TRIGGER
- Enables or disables a trigger. [Stored procedure functions]
-
ENABLED_ROLE
- Checks whether a Vertica user role is enabled, and returns true or false. [Privileges and access functions]
-
ENFORCE_OBJECT_STORAGE_POLICY
- Applies storage policies of the specified object immediately. [Storage functions]
-
ERROR_RATE
- Using an input table, returns a table that calculates the rate of incorrect classifications and displays them as FLOAT values. [Model evaluation]
-
EVALUATE_DELETE_PERFORMANCE
- Evaluates projections for potential DELETE and UPDATE performance issues. [Projection functions]
-
EVENT_NAME
- Returns a VARCHAR value representing the name of the event that matched the row. [MATCH clause functions]
-
EXECUTE_TRIGGER
- Manually executes the stored procedure attached to a trigger. [Stored procedure functions]
-
EXP
- Returns the exponential function, e to the power of a number. [Mathematical functions]
-
EXPLODE
- Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element. [Collection functions]
-
EXPONENTIAL_MOVING_AVERAGE [analytic]
- Calculates the exponential moving average (EMA) of expression E with smoothing factor X. [Analytic functions]
-
EXPORT_CATALOG
- This function and EXPORT_OBJECTS return equivalent output. [Catalog functions]
-
EXPORT_DIRECTED_QUERIES
- Generates SQL for creating directed queries from a set of input queries. [Directed queries functions]
-
EXPORT_MODELS
- Exports machine learning models. [Model management]
-
EXPORT_OBJECTS
- This function and EXPORT_CATALOG return equivalent output. [Catalog functions]
-
EXPORT_STATISTICS
- Generates statistics in XML format from data previously collected by ANALYZE_STATISTICS. [Statistics management functions]
-
EXPORT_STATISTICS_PARTITION
- Generates partition-level statistics in XML format from data previously collected by ANALYZE_STATISTICS_PARTITION. [Statistics management functions]
-
EXPORT_TABLES
- Generates a SQL script that can be used to recreate a logical schema—schemas, tables, constraints, and views—on another cluster. [Catalog functions]
-
EXTERNAL_CONFIG_CHECK
- Tests the Hadoop configuration of a Vertica cluster. [Hadoop functions]
-
EXTRACT
- Retrieves sub-fields such as year or hour from date/time values and returns values of type NUMERIC. [Date/time functions]
-
FILTER
- Takes an input array and returns an array containing only elements that meet a specified condition. [Collection functions]
-
FINISH_FETCHING_FILES
- Fetches to the depot all files that are queued for download from communal storage. [Eon Mode functions]
-
FIRST_VALUE [analytic]
- Lets you select the first value of a table or partition (determined by the window-order-clause) without having to use a self join. [Analytic functions]
-
FLOOR
- Rounds down the returned value to the previous whole number. [Mathematical functions]
-
FLUSH_DATA_COLLECTOR
- Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage. [Data collector functions]
-
FLUSH_REAPER_QUEUE
- Deletes all data marked for deletion in the database. [Eon Mode functions]
-
GET_AHM_EPOCH
- Returns the number of the epoch in which the Ancient History Mark (AHM) is located. [Epoch functions]
-
GET_AHM_TIME
- Returns a TIMESTAMP value representing the Ancient History Mark (AHM). [Epoch functions]
-
GET_AUDIT_TIME
- Reports the time when the automatic audit of database size occurs. [License functions]
-
GET_CLIENT_LABEL
- Returns the client connection label for the current session. [Client connection functions]
-
GET_COMPLIANCE_STATUS
- Displays whether your database is in compliance with your Vertica license agreement. [License functions]
-
GET_CONFIG_PARAMETER
- Gets the value of a configuration parameter at the specified level. [Database functions]
-
GET_CURRENT_EPOCH
- Returns the number of the current epoch. [Epoch functions]
-
GET_DATA_COLLECTOR_NOTIFY_POLICY
- Lists any notification policies set on a component. [Notifier functions]
-
GET_DATA_COLLECTOR_POLICY
- Retrieves a brief statement about the retention policy for the specified component. [Data collector functions]
-
GET_LAST_GOOD_EPOCH
- Returns the last good epoch number. [Epoch functions]
-
GET_METADATA
- Returns the metadata of a Parquet file. [Hadoop functions]
-
GET_MODEL_ATTRIBUTE
- Extracts either a specific attribute from a model or all attributes from a model. [Model management]
-
GET_MODEL_SUMMARY
- Returns summary information of a model. [Model management]
-
GET_NUM_ACCEPTED_ROWS
- Returns the number of rows loaded into the database for the last completed load for the current session. [Session functions]
-
GET_NUM_REJECTED_ROWS
- Returns the number of rows that were rejected during the last completed load for the current session. [Session functions]
-
GET_PRIVILEGES_DESCRIPTION
- Returns the effective privileges the current user has on an object, including explicit, implicit, inherited, and role-based privileges. [Privileges and access functions]
-
GET_PROJECTION_SORT_ORDER
- Returns the order of columns in a projection's ORDER BY clause. [Projection functions]
-
GET_PROJECTION_STATUS
- Returns information relevant to the status of a projection. [Projection functions]
-
GET_PROJECTIONS
- Returns contextual and projection information about projections of the specified anchor table. [Projection functions]
-
GET_TOKENIZER_PARAMETER
- Returns the configuration parameter for a given tokenizer. [Text search functions]
-
GETDATE
- Returns the current statement's start date and time as a TIMESTAMP value. [Date/time functions]
-
GETUTCDATE
- Returns the current statement's start date and time as a TIMESTAMP value. [Date/time functions]
-
GREATEST
- Returns the largest value in a list of expressions of any data type. [String functions]
-
GREATESTB
- Returns the largest value in a list of expressions of any data type, using binary ordering. [String functions]
-
GROUP_ID
- Uniquely identifies duplicate sets for GROUP BY queries that return duplicate grouping sets. [Aggregate functions]
-
GROUPING
- Disambiguates the use of NULL values when GROUP BY queries with multilevel aggregates generate NULL values to identify subtotals in grouping columns. [Aggregate functions]
-
GROUPING_ID
- Concatenates the set of Boolean values generated by the GROUPING function into a bit vector. [Aggregate functions]
-
HADOOP_IMPERSONATION_CONFIG_CHECK
- Reports the delegation tokens Vertica will use when accessing Kerberized data in HDFS. [Hadoop functions]
-
HAS_ROLE
- Checks whether a Vertica user role is granted to the specified user or role, and returns true or false. [Privileges and access functions]
-
HAS_TABLE_PRIVILEGE
- Returns true or false to verify whether a user has the specified privilege on a table. [System information functions]
-
HASH
- Calculates a hash value over the function arguments, producing a value in the range 0 <= x < 2^63. [Mathematical functions]
-
HASH_EXTERNAL_TOKEN
- Returns a hash of a string token, for use with HADOOP_IMPERSONATION_CONFIG_CHECK. [Hadoop functions]
-
HCATALOGCONNECTOR_CONFIG_CHECK
- Tests the configuration of a Vertica cluster that uses the HCatalog Connector to access Hive data. [Hadoop functions]
-
HDFS_CLUSTER_CONFIG_CHECK
- Tests the configuration of a Vertica cluster that uses HDFS. [Hadoop functions]
-
HEX_TO_BINARY
- Translates the given VARCHAR hexadecimal representation into a VARBINARY value. [String functions]
-
HEX_TO_INTEGER
- Translates the given VARCHAR hexadecimal representation into an INTEGER value. [String functions]
-
HOUR
- Returns the hour portion of the specified date as an integer, where 0 is 00:00 to 00:59. [Date/time functions]
-
IFNULL
- Returns the value of the first non-null expression in the list. [NULL-handling functions]
-
IFOREST
- Trains and returns an isolation forest (iForest) model. [Data preparation]
-
IMPLODE
- Takes a column of any scalar type and returns an unbounded array. [Collection functions]
-
IMPORT_DIRECTED_QUERIES
- Imports to the database catalog directed queries from a SQL file that was generated by EXPORT_DIRECTED_QUERIES. [Directed queries functions]
-
IMPORT_MODELS
- Imports models into Vertica, either Vertica models that were exported with EXPORT_MODELS, or models in Predictive Model Markup Language (PMML) or TensorFlow format. [Model management]
-
IMPORT_STATISTICS
- Imports statistics from the XML file that was generated by EXPORT_STATISTICS. [Statistics management functions]
-
IMPUTE
- Imputes missing values in a data set with either the mean or the mode, based on observed values for a variable in each column. [Data preparation]
-
INET_ATON
- Converts a string that contains a dotted-quad representation of an IPv4 network address to an INTEGER. [IP address functions]
-
INET_NTOA
- Converts an INTEGER value into a VARCHAR dotted-quad representation of an IPv4 network address. [IP address functions]
-
INFER_EXTERNAL_TABLE_DDL
- This function is deprecated and will be removed in a future release. [Table functions]
-
INFER_TABLE_DDL
- Inspects a file in Parquet, ORC, JSON, or Avro format and returns a CREATE TABLE or CREATE EXTERNAL TABLE statement based on its contents. [Table functions]
-
INITCAP
- Capitalizes first letter of each alphanumeric word and puts the rest in lowercase. [String functions]
-
INITCAPB
- Capitalizes first letter of each alphanumeric word and puts the rest in lowercase. [String functions]
-
INSERT
- Inserts a character string into a specified location in another character string. [String functions]
-
INSTALL_LICENSE
- Installs the license key in the global catalog. [Catalog functions]
-
INSTR
- Searches string for substring and returns an integer indicating the position of the character in string that is the first character of this occurrence. [String functions]
-
INSTRB
- Searches string for substring and returns an integer indicating the octet position within string that is the first occurrence. [String functions]
-
INTERRUPT_STATEMENT
- Interrupts the specified statement in a user session, rolls back the current transaction, and writes a success or failure message to the log file. [Session functions]
-
ISFINITE
- Tests for the special TIMESTAMP constant INFINITY and returns a value of type BOOLEAN. [Date/time functions]
-
ISNULL
- Returns the value of the first non-null expression in the list. [NULL-handling functions]
-
ISUTF8
- Tests whether a string is a valid UTF-8 string. [String functions]
-
JARO_DISTANCE
- Calculates and returns the Jaro similarity, an edit distance between two sequences. [String functions]
-
JARO_WINKLER_DISTANCE
- Calculates and returns the Jaro-Winkler similarity, an edit distance between two sequences. [String functions]
-
JULIAN_DAY
- Returns the integer value of the specified day according to the Julian calendar, where day 1 is the first day of the Julian period, January 1, 4713 BC (on the Gregorian calendar, November 24, 4714 BC). [Date/time functions]
-
KERBEROS_CONFIG_CHECK
- Tests the Kerberos configuration of a Vertica cluster. [Database functions]
-
KERBEROS_HDFS_CONFIG_CHECK
- This function is deprecated and will be removed in a future release. [Hadoop functions]
-
KMEANS
- Executes the k-means algorithm on an input relation. [Machine learning algorithms]
-
KPROTOTYPES
- Executes the k-prototypes algorithm on an input relation. [Machine learning algorithms]
-
LAG [analytic]
- Returns the value of the input expression at the given offset before the current row within a window. [Analytic functions]
-
LAST_DAY
- Returns the last day of the month in the specified date. [Date/time functions]
-
LAST_INSERT_ID
- Returns the last value of an IDENTITY column. [Table functions]
-
LAST_VALUE [analytic]
- Lets you select the last value of a table or partition (determined by the window-order-clause) without having to use a self join. [Analytic functions]
-
LDAP_LINK_DRYRUN_CONNECT
- Takes a set of LDAP Link connection parameters as arguments and begins a dry run connection between the LDAP server and Vertica. [LDAP link functions]
-
LDAP_LINK_DRYRUN_SEARCH
- Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run search for users and groups that would get imported from the LDAP server. [LDAP link functions]
-
LDAP_LINK_DRYRUN_SYNC
- Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run synchronization between the database and the LDAP server, which maps and synchronizes the LDAP server's users and groups with their equivalents in Vertica. [LDAP link functions]
-
LDAP_LINK_SYNC_CANCEL
- Cancels in-progress LDAP Link synchronizations (including those started by LDAP_LINK_DRYRUN_SYNC) between the LDAP server and Vertica. [LDAP link functions]
-
LDAP_LINK_SYNC_START
- Begins the synchronization between the LDAP server and Vertica immediately rather than waiting for the interval set in LDAPLinkInterval. [LDAP link functions]
-
LEAD [analytic]
- Returns values from the row after the current row within a window, letting you access more than one row in a table at the same time. [Analytic functions]
-
LEAST
- Returns the smallest value in a list of expressions of any data type. [String functions]
-
LEASTB
- Returns the smallest value in a list of expressions of any data type, using binary ordering. [String functions]
-
LEFT
- Returns the specified characters from the left side of a string. [String functions]
-
LENGTH
- Returns the length of a string. [String functions]
-
LIFT_TABLE
- Returns a table that compares the predictive quality of a machine learning model. [Model evaluation]
-
LINEAR_REG
- Executes linear regression on an input relation, and returns a linear regression model. [Machine learning algorithms]
-
LIST_ENABLED_CIPHERS
- Returns a list of enabled cipher suites, which are sets of algorithms used to secure TLS/SSL connections. [System information functions]
-
LISTAGG
- Transforms non-null values from a group of rows into a list of values that are delimited by commas (default) or a configurable separator. [Aggregate functions]
-
LN
- Returns the natural logarithm of the argument. [Mathematical functions]
-
LOCALTIME
- Returns a value of type TIME that represents the start of the current transaction. [Date/time functions]
-
LOCALTIMESTAMP
- Returns a value of type TIMESTAMP/TIMESTAMPTZ that represents the start of the current transaction, and remains unchanged until the transaction is closed. [Date/time functions]
-
LOG
- Returns the logarithm to the specified base of the argument. [Mathematical functions]
-
LOG10
- Returns the base 10 logarithm of the argument, also known as the common logarithm. [Mathematical functions]
-
LOGISTIC_REG
- Executes logistic regression on an input relation. [Machine learning algorithms]
-
LOWER
- Takes a string value and returns a VARCHAR value converted to lowercase. [String functions]
-
LOWERB
- Returns a character string with each ASCII character converted to lowercase. [String functions]
-
LPAD
- Returns a VARCHAR value representing a string of a specific length filled on the left with specific characters. [String functions]
-
LTRIM
- Returns a VARCHAR value representing a string with leading blanks removed from the left side (beginning). [String functions]
-
MAKE_AHM_NOW
- Sets the Ancient History Mark (AHM) to the greatest allowable value. [Epoch functions]
-
MAKEUTF8
- Coerces a string to UTF-8 by removing or replacing non-UTF-8 characters. [String functions]
-
MAPAGGREGATE
- Returns a LONG VARBINARY VMap with key and value pairs supplied from two VARCHAR input columns. [Flex map functions]
-
MAPCONTAINSKEY
- Determines whether a VMap contains a virtual column (key). [Flex map functions]
-
MAPCONTAINSVALUE
- Determines whether a VMap contains a specific value. [Flex map functions]
-
MAPDELIMITEDEXTRACTOR
- Extracts data with a delimiter character and other optional arguments, returning a single VMap value. [Flex extractor functions]
-
MAPITEMS
- Returns information about items in a VMap. [Flex map functions]
-
MAPJSONEXTRACTOR
- Extracts content of repeated JSON data objects, including nested maps, or data with an outer list of JSON elements. [Flex extractor functions]
-
MAPKEYS
- Returns the virtual columns (and values) present in any VMap data. [Flex map functions]
-
MAPKEYSINFO
- Returns virtual column information from a given map. [Flex map functions]
-
MAPLOOKUP
- Returns single-key values from VMap data. [Flex map functions]
-
MAPPUT
- Accepts a VMap and one or more key/value pairs and returns a new VMap with the key/value pairs added. [Flex map functions]
-
MAPREGEXEXTRACTOR
- Extracts data with a regular expression and returns results as a VMap. [Flex extractor functions]
-
MAPSIZE
- Returns the number of virtual columns present in any VMap data. [Flex map functions]
-
MAPTOSTRING
- Recursively builds a string representation of VMap data, including nested JSON maps. [Flex map functions]
-
MAPVALUES
- Returns a string representation of the top-level values from a VMap. [Flex map functions]
-
MAPVERSION
- Returns the version or invalidity of any map data. [Flex map functions]
-
MARK_DESIGN_KSAFE
- Enables or disables high availability in your environment, in case of a failure. [Catalog functions]
-
MATCH_COLUMNS
- Specified as an element in a SELECT list, returns all columns in queried tables that match the specified pattern. [Regular expression functions]
-
MATCH_ID
- Returns a successful pattern match as an INTEGER value. [MATCH clause functions]
-
MATERIALIZE_FLEXTABLE_COLUMNS
- Materializes virtual columns listed as key_names in the flextable_keys table you compute using either COMPUTE_FLEXTABLE_KEYS or COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW. [Flex data functions]
-
MAX [aggregate]
- Returns the greatest value of an expression over a group of rows. [Aggregate functions]
-
MAX [analytic]
- Returns the maximum value of an expression within a window. [Analytic functions]
-
MD5
- Calculates the MD5 hash of string, returning the result as a VARCHAR string in hexadecimal. [String functions]
-
MEASURE_LOCATION_PERFORMANCE
- Measures a storage location's disk performance. [Storage functions]
-
MEDIAN [analytic]
- For each row, returns the median value of a value set within each partition. [Analytic functions]
-
MEMORY_TRIM
- Calls glibc function malloc_trim() to reclaim free memory from malloc and return it to the operating system. [Database functions]
-
MICROSECOND
- Returns the microsecond portion of the specified date as an integer. [Date/time functions]
-
MIDNIGHT_SECONDS
- Within the specified date, returns the number of seconds between midnight and the date's time portion. [Date/time functions]
-
MIGRATE_ENTERPRISE_TO_EON
- Migrates an Enterprise database to an Eon Mode database. [Eon Mode functions]
-
MIN [aggregate]
- Returns the smallest value of an expression over a group of rows. [Aggregate functions]
-
MIN [analytic]
- Returns the minimum value of an expression within a window. [Analytic functions]
-
MINUTE
- Returns the minute portion of the specified date as an integer. [Date/time functions]
-
MOD
- Returns the remainder of a division operation. [Mathematical functions]
-
MONTH
- Returns the month portion of the specified date as an integer. [Date/time functions]
-
MONTHS_BETWEEN
- Returns the number of months between two dates. [Date/time functions]
-
MOVE_PARTITIONS_TO_TABLE
- Moves partitions from one table to another. [Partition functions]
-
MOVE_RETIRED_LOCATION_DATA
- Moves all data from the specified retired storage location or from all retired storage locations in the database. [Storage functions]
-
MOVE_STATEMENT_TO_RESOURCE_POOL
- Attempts to move the specified query to the specified target pool. [Workload management functions]
-
MOVING_AVERAGE
- Creates a moving-average (MA) model from a stationary time series with consistent timesteps that can then be used for prediction via PREDICT_MOVING_AVERAGE. [Machine learning algorithms]
-
MSE
- Returns a table that displays the mean squared error of the prediction and response columns in a machine learning model. [Model evaluation]
-
NAIVE_BAYES
- Executes the Naive Bayes algorithm on an input relation and returns a Naive Bayes model. [Machine learning algorithms]
-
NEW_TIME
- Converts a timestamp value from one time zone to another and returns a TIMESTAMP. [Date/time functions]
-
NEXT_DAY
- Returns the date of the first instance of a particular day of the week that follows the specified date. [Date/time functions]
-
NEXTVAL
- Returns the next value in a sequence. [Sequence functions]
-
NORMALIZE
- Runs a normalization algorithm on an input relation. [Data preparation]
-
NORMALIZE_FIT
- Stores normalization parameters in a model for later use, unlike NORMALIZE, which directly outputs a view with normalized results. [Data preparation]
-
NOTIFY
- Sends a specified message to a NOTIFIER. [Notifier functions]
-
NOW [date/time]
- Returns a value of type TIMESTAMP WITH TIME ZONE representing the start of the current transaction. [Date/time functions]
-
NTH_VALUE [analytic]
- Returns the value evaluated at the row that is the nth row of the window (counting from 1). [Analytic functions]
-
NTILE [analytic]
- Divides an ordered data set (partition) into a specified number of equal subsets within a window, where the subsets are numbered 1 through the value in parameter constant-value. [Analytic functions]
-
NULLIF
- Compares two expressions. If the expressions are equal, returns NULL; otherwise, returns the first expression. [NULL-handling functions]
-
NULLIFZERO
- Evaluates to NULL if the value in the column is 0. [NULL-handling functions]
-
NVL
- Returns the value of the first non-null expression in the list. [NULL-handling functions]
-
NVL2
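- Takes three arguments. If the first argument is not NULL, returns the second argument; otherwise, returns the third argument. [NULL-handling functions]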
-
OCTET_LENGTH
- Takes one argument as an input and returns the string length in octets for all string types. [String functions]
-
ONE_HOT_ENCODER_FIT
- Generates a sorted list of each of the category levels for each feature to be encoded, and stores the model. [Data preparation]
-
OVERLAPS
- Evaluates two time periods and returns true when they overlap, false otherwise. [Date/time functions]
-
OVERLAY
- Replaces part of a string with another string and returns the new string value as a VARCHAR. [String functions]
-
OVERLAYB
- Replaces part of a string with another string and returns the new string as an octet value. [String functions]
-
PARTITION_PROJECTION
- Splits containers for a specified projection. [Partition functions]
-
PARTITION_TABLE
- Invokes the Tuple Mover to reorganize ROS storage containers as needed to conform with the current partitioning policy. [Partition functions]
-
PATTERN_ID
- Returns an integer value that is a partition-wide unique identifier for the instance of the pattern that matched. [MATCH clause functions]
-
PCA
- Computes principal components from the input table/view. [Data preparation]
-
PERCENT_RANK [analytic]
- Calculates the relative rank of a given row in a group within a window by dividing that row’s rank less 1 by the number of rows in the partition, also less 1. [Analytic functions]
-
PERCENTILE_CONT [analytic]
- An inverse distribution function where, for each row, PERCENTILE_CONT returns the value that would fall into the specified percentile among a set of values in each partition within a window. [Analytic functions]
-
PERCENTILE_DISC [analytic]
- An inverse distribution function where, for each row, PERCENTILE_DISC returns the value that would fall into the specified percentile among a set of values in each partition within a window. [Analytic functions]
-
PI
- Returns the constant pi (Π), the ratio of any circle's circumference to its diameter in Euclidean geometry. The return type is DOUBLE PRECISION. [Mathematical functions]
-
POISSON_REG
- Executes Poisson regression on an input relation, and returns a Poisson regression model. [Machine learning algorithms]
-
POSITION
- Returns an INTEGER value representing the character location of a specified substring within a string (counting from one). [String functions]
-
POSITIONB
- Returns an INTEGER value representing the octet location of a specified substring within a string (counting from one). [String functions]
-
POWER
- Returns a DOUBLE PRECISION value representing one number raised to the power of another number. [Mathematical functions]
-
PRC
- Returns a table that displays the points on a precision-recall (PR) curve. [Model evaluation]
-
PREDICT_ARIMA
- Applies an autoregressive integrated moving average (ARIMA) model to an input relation or makes predictions using the in-sample data. [Transformation functions]
-
PREDICT_AUTOREGRESSOR
- Applies an autoregressor (AR) model to an input relation. [Transformation functions]
-
PREDICT_LINEAR_REG
- Applies a linear regression model on an input relation and returns the predicted value as a FLOAT. [Transformation functions]
-
PREDICT_LOGISTIC_REG
- Applies a logistic regression model on an input relation. [Transformation functions]
-
PREDICT_MOVING_AVERAGE
- Applies a moving-average (MA) model, created by MOVING_AVERAGE, to an input relation. [Transformation functions]
-
PREDICT_NAIVE_BAYES
- Applies a Naive Bayes model on an input relation. [Transformation functions]
-
PREDICT_NAIVE_BAYES_CLASSES
- Applies a Naive Bayes model on an input relation and returns the probabilities of classes. [Transformation functions]
-
PREDICT_PMML
- Applies an imported PMML model on an input relation. [Transformation functions]
-
PREDICT_POISSON_REG
- Applies a Poisson regression model on an input relation and returns the predicted value as a FLOAT. [Transformation functions]
-
PREDICT_RF_CLASSIFIER
- Applies a random forest model on an input relation. [Transformation functions]
-
PREDICT_RF_CLASSIFIER_CLASSES
- Applies a random forest model on an input relation and returns the probabilities of classes. [Transformation functions]
-
PREDICT_RF_REGRESSOR
- Applies a random forest model on an input relation and returns the predicted value as a FLOAT, which is the average of the predictions of the trees in the forest. [Transformation functions]
-
PREDICT_SVM_CLASSIFIER
- Uses an SVM model to predict class labels for samples in an input relation, and returns the predicted value as a FLOAT data type. [Transformation functions]
-
PREDICT_SVM_REGRESSOR
- Uses an SVM model to perform regression on samples in an input relation, and returns the predicted value as a FLOAT data type. [Transformation functions]
-
PREDICT_TENSORFLOW
- Applies a TensorFlow model on an input relation and returns the result expected for the encoded model type. [Transformation functions]
-
PREDICT_TENSORFLOW_SCALAR
- Applies a TensorFlow model on an input relation and returns the result expected for the encoded model type. This function supports 1D complex types as input and output. [Transformation functions]
-
PREDICT_XGB_CLASSIFIER
- Applies an XGBoost classifier model on an input relation. [Transformation functions]
-
PREDICT_XGB_CLASSIFIER_CLASSES
- Applies an XGBoost classifier model on an input relation and returns the probabilities of classes. [Transformation functions]
-
PREDICT_XGB_REGRESSOR
- Applies an XGBoost regressor model on an input relation. [Transformation functions]
-
PROMOTE_SUBCLUSTER_TO_PRIMARY
- Converts a secondary subcluster to a primary subcluster. [Eon Mode functions]
-
PURGE
- Permanently removes delete vectors from ROS storage containers so disk space can be reused. [Database functions]
-
PURGE_PARTITION
- Purges a table partition of deleted rows. [Partition functions]
-
PURGE_PROJECTION
- Permanently removes deleted data from physical storage so disk space can be reused. PURGE_PROJECTION can use significant disk space while purging the data. [Projection functions]
-
PURGE_TABLE
- Permanently removes deleted data from physical storage so disk space can be reused. This function was formerly named PURGE_TABLE_PROJECTIONS(). [Table functions]
-
QUARTER
- Returns the calendar quarter of the specified date as an integer, where the January-March quarter is 1. [Date/time functions]
-
QUOTE_IDENT
- Returns the specified string argument in the format required to use the string as an identifier in an SQL statement. [String functions]
-
QUOTE_LITERAL
- Returns the given string suitably quoted for use as a string literal in a SQL statement string. [String functions]
-
QUOTE_NULLABLE
- Returns the given string suitably quoted for use as a string literal in an SQL statement string; or if the argument is null, returns the unquoted string NULL. [String functions]
-
RADIANS
- Returns a DOUBLE PRECISION value representing an angle expressed in radians. [Mathematical functions]
-
RANDOM
- Returns a uniformly-distributed random DOUBLE PRECISION value x, where 0 <= x < 1. [Mathematical functions]
-
RANDOMINT
- Accepts an integer argument and returns a random integer between 0 and the argument minus 1. [Mathematical functions]
-
RANDOMINT_CRYPTO
- Accepts an INTEGER argument and returns a random INTEGER value from the set of values between 0 and the argument minus 1. [Mathematical functions]
-
RANK [analytic]
- Within each window partition, ranks all rows in the query results set according to the order specified by the window's ORDER BY clause. [Analytic functions]
-
READ_CONFIG_FILE
- Reads and returns the key-value pairs of all the parameters of a given tokenizer. [Text search functions]
-
READ_TREE
- Reads the contents of trees within the random forest or XGBoost model. [Model evaluation]
-
REALIGN_CONTROL_NODES
- Causes Vertica to re-evaluate which nodes in the cluster or subcluster are control nodes and which nodes are assigned to them as dependents when large cluster is enabled. [Cluster functions]
-
REBALANCE_CLUSTER
- Rebalances the database cluster synchronously as a session foreground task. [Cluster functions]
-
REBALANCE_SHARDS
- Rebalances shard assignments in a subcluster or across the entire cluster in Eon Mode. [Eon Mode functions]
-
REBALANCE_TABLE
- Synchronously rebalances data in the specified table. [Table functions]
-
REENABLE_DUPLICATE_KEY_ERROR
- Restores the default behavior of error reporting by reversing the effects of DISABLE_DUPLICATE_KEY_ERROR. [Table functions]
-
REFRESH
- Synchronously refreshes one or more table projections in the foreground, and updates the PROJECTION_REFRESHES system table. [Projection functions]
-
REFRESH_COLUMNS
- Refreshes table columns that are defined with the constraint SET USING or DEFAULT USING. [Projection functions]
-
REGEXP_COUNT
- Returns the number of times a regular expression matches a string. [Regular expression functions]
-
REGEXP_ILIKE
- Returns true if the string contains a match for the regular expression. [Regular expression functions]
-
REGEXP_INSTR
- Returns the starting or ending position in a string where a regular expression matches. [Regular expression functions]
-
REGEXP_LIKE
- Returns true if the string matches the regular expression. [Regular expression functions]
-
REGEXP_NOT_ILIKE
- Returns true if the string does not match the case-insensitive regular expression. [Regular expression functions]
-
REGEXP_NOT_LIKE
- Returns true if the string does not contain a match for the regular expression. [Regular expression functions]
-
REGEXP_REPLACE
- Replaces all occurrences of a substring that match a regular expression with another substring. [Regular expression functions]
-
REGEXP_SUBSTR
- Returns the substring that matches a regular expression within a string. [Regular expression functions]
-
REGISTER_MODEL
- Registers a trained model and adds it to the Model Versioning environment with a status of 'under_review'. [Model management]
-
REGR_AVGX
- Returns the DOUBLE PRECISION average of the independent expression in an expression pair. [Aggregate functions]
-
REGR_AVGY
- Returns the DOUBLE PRECISION average of the dependent expression in an expression pair. [Aggregate functions]
-
REGR_COUNT
- Returns the count of all rows in an expression pair. [Aggregate functions]
-
REGR_INTERCEPT
- Returns the y-intercept of the regression line determined by a set of expression pairs. [Aggregate functions]
-
REGR_R2
- Returns the square of the correlation coefficient of a set of expression pairs. [Aggregate functions]
-
REGR_SLOPE
- Returns the slope of the regression line, determined by a set of expression pairs. [Aggregate functions]
-
REGR_SXX
- Returns the sum of squares of the difference between the independent expression (expression2) and its average. [Aggregate functions]
-
REGR_SXY
- Returns the sum of products of the difference between the dependent expression (expression1) and its average and the difference between the independent expression (expression2) and its average. [Aggregate functions]
-
REGR_SYY
- Returns the sum of squares of the difference between the dependent expression (expression1) and its average. [Aggregate functions]
-
RELEASE_ALL_JVM_MEMORY
- Forces all sessions to release the memory consumed by their Java Virtual Machines (JVM). [Session functions]
-
RELEASE_JVM_MEMORY
- Terminates a Java Virtual Machine (JVM), making available the memory the JVM was using. [Session functions]
-
RELEASE_SYSTEM_TABLES_ACCESS
- Enables non-superuser access to all system tables. [Privileges and access functions]
-
RELOAD_ADMINTOOLS_CONF
- Updates the admintools.conf on each UP node in the cluster. [Catalog functions]
-
RELOAD_SPREAD
- Updates cluster changes to the catalog's Spread configuration file. [Cluster functions]
-
REPEAT
- Replicates a string the specified number of times and concatenates the replicated values as a single string. [String functions]
-
REPLACE
- Replaces all occurrences of characters in a string with another set of characters. [String functions]
-
RESERVE_SESSION_RESOURCE
- Reserves memory resources from the general resource pool for the exclusive use of the Vertica backup and restore process. [Session functions]
-
RESET_LOAD_BALANCE_POLICY
- Resets the counter each host in the cluster maintains, to track which host it will refer a client to when the native connection load balancing scheme is set to ROUNDROBIN. [Client connection functions]
-
RESET_SESSION
- Applies your default connection string configuration settings to your current session. [Session functions]
-
RESHARD_DATABASE
- Changes the number of shards in a database. [Eon Mode functions]
-
RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW
- Restores the keys table and the view. [Flex data functions]
-
RESTORE_LOCATION
- Restores a storage location that was previously retired with RETIRE_LOCATION. [Storage functions]
-
RESTRICT_SYSTEM_TABLES_ACCESS
- Checks system table SYSTEM_TABLES to determine which system tables non-superusers can access. [Privileges and access functions]
-
RETIRE_LOCATION
- Deactivates the specified storage location. [Storage functions]
-
REVERSE_NORMALIZE
- Reverses the normalization transformation on normalized data, thereby de-normalizing the normalized data. [Transformation functions]
-
RF_CLASSIFIER
- Trains a random forest model for classification on an input relation. [Machine learning algorithms]
-
RF_PREDICTOR_IMPORTANCE
- Measures the importance of the predictors in a random forest model using the Mean Decrease Impurity (MDI) approach. [Model evaluation]
-
RF_REGRESSOR
- Trains a random forest model for regression on an input relation. [Machine learning algorithms]
-
RIGHT
- Returns the specified characters from the right side of a string. [String functions]
-
ROC
- Returns a table that displays the points on a receiver operating characteristic curve. [Model evaluation]
-
ROUND
- Rounds the specified date or time. [Date/time functions]
-
ROUND
- Rounds a value to a specified number of decimal places, retaining the original precision and scale. [Mathematical functions]
-
ROW_NUMBER [analytic]
- Assigns a sequence of unique numbers to each row in a partition, starting with 1. [Analytic functions]
-
RPAD
- Returns a VARCHAR value representing a string of a specific length filled on the right with specific characters. [String functions]
-
RSQUARED
- Returns a table with the R-squared value of the predictions in a regression model. [Model evaluation]
-
RTRIM
- Returns a VARCHAR value representing a string with trailing blanks removed from the right side (end). [String functions]
-
RUN_INDEX_TOOL
- Runs the Index tool on a Vertica database. [Database functions]
-
SANDBOX_SUBCLUSTER
- Creates a sandbox for a secondary subcluster. [Eon Mode functions]
-
SAVE_PLANS
- Creates optimizer-generated directed queries from the most frequently executed queries, up to the maximum specified. [Directed queries functions]
-
SECOND
- Returns the seconds portion of the specified date as an integer. [Date/time functions]
-
SECURITY_CONFIG_CHECK
- Returns the status of various security-related parameters. [Database functions]
-
SESSION_USER
- Returns a VARCHAR containing the name of the user who initiated the current database session. [System information functions]
-
SET_AHM_EPOCH
- Sets the Ancient History Mark (AHM) to the specified epoch. [Epoch functions]
-
SET_AHM_TIME
- Sets the Ancient History Mark (AHM) to the epoch corresponding to the specified time on the initiator node. [Epoch functions]
-
SET_AUDIT_TIME
- Sets the time that Vertica performs automatic database size audit to determine if the size of the database is compliant with the raw data allowance in your Vertica license. [License functions]
-
SET_CLIENT_LABEL
- Assigns a label to a client connection for the current session. [Client connection functions]
-
SET_CONFIG_PARAMETER
- Sets or clears a configuration parameter at the specified level. [Database functions]
-
SET_CONTROL_SET_SIZE
- Sets the number of control nodes that participate in the spread service when large cluster is enabled. [Cluster functions]
-
SET_DATA_COLLECTOR_NOTIFY_POLICY
- Creates/enables notification policies for a component. [Notifier functions]
-
SET_DATA_COLLECTOR_POLICY
- Updates retention policy properties for the specified component. [Data collector functions]
-
SET_DATA_COLLECTOR_TIME_POLICY
- Updates the retention policy property INTERVAL_TIME for the specified component. [Data collector functions]
-
SET_DEPOT_PIN_POLICY_PARTITION
- Pins the specified partitions of a table or projection to a subcluster depot, or all database depots, to reduce exposure to depot eviction. [Eon Mode functions]
-
SET_DEPOT_PIN_POLICY_PROJECTION
- Pins a projection to a subcluster depot, or all database depots, to reduce its exposure to depot eviction. [Eon Mode functions]
-
SET_DEPOT_PIN_POLICY_TABLE
- Pins a table to a subcluster depot, or all database depots, to reduce its exposure to depot eviction. [Eon Mode functions]
-
SET_LOAD_BALANCE_POLICY
- Sets how native connection load balancing chooses a host to handle a client connection. [Client connection functions]
-
SET_LOCATION_PERFORMANCE
- Sets disk performance for a storage location. [Storage functions]
-
SET_OBJECT_STORAGE_POLICY
- Creates or changes the storage policy of a database object by assigning it a labeled storage location. [Storage functions]
-
SET_SCALING_FACTOR
- Sets the scaling factor that determines the number of storage containers used when rebalancing the database and when local data segmentation is enabled. [Cluster functions]
-
SET_SPREAD_OPTION
- Changes spread daemon settings. [Database functions]
-
SET_TOKENIZER_PARAMETER
- Configures the tokenizer parameters. [Text search functions]
-
SET_UNION
- Returns a SET containing all elements of two input sets. [Collection functions]
-
SHA1
- Uses the US Secure Hash Algorithm 1 to calculate the SHA1 hash of string. [String functions]
-
SHA224
- Uses the US Secure Hash Algorithm 2 to calculate the SHA224 hash of string. [String functions]
-
SHA256
- Uses the US Secure Hash Algorithm 2 to calculate the SHA256 hash of string. [String functions]
-
SHA384
- Uses the US Secure Hash Algorithm 2 to calculate the SHA384 hash of string. [String functions]
-
SHA512
- Uses the US Secure Hash Algorithm 2 to calculate the SHA512 hash of string. [String functions]
-
SHOW_PROFILING_CONFIG
- Shows whether profiling is enabled. [Profiling functions]
-
SHUTDOWN
- Shuts down a Vertica database. [Database functions]
-
SHUTDOWN_SUBCLUSTER
- Shuts down a subcluster. [Eon Mode functions]
-
SHUTDOWN_WITH_DRAIN
- Gracefully shuts down a subcluster or subclusters. [Eon Mode functions]
-
SIGN
- Returns a DOUBLE PRECISION value of -1, 0, or 1 representing the arithmetic sign of the argument. [Mathematical functions]
-
SIN
- Returns a DOUBLE PRECISION value that represents the trigonometric sine of the passed parameter. [Mathematical functions]
-
SINH
- Returns a DOUBLE PRECISION value that represents the hyperbolic sine of the passed parameter. [Mathematical functions]
-
SLEEP
- Waits a specified number of seconds before executing another statement or command. [Workload management functions]
-
SOUNDEX
- Takes a VARCHAR argument and returns a four-character code that enables comparison of that argument with other SOUNDEX-encoded strings that are spelled differently in English, but are phonetically similar. [String functions]
-
SOUNDEX_MATCHES
- Compares the Soundex encodings of two strings. [String functions]
-
SPACE
- Returns the specified number of blank spaces, typically for insertion into a character string. [String functions]
-
SPLIT_PART
- Splits string on the delimiter and returns the string at the location of the beginning of the specified field (counting from 1). [String functions]
-
SPLIT_PARTB
- Divides an input string on a delimiter character and returns the Nth segment, counting from 1. [String functions]
-
SQRT
- Returns a DOUBLE PRECISION value representing the arithmetic square root of the argument. [Mathematical functions]
-
ST_Area
- Calculates the area of a spatial object. [Geospatial functions]
-
ST_AsBinary
- Creates the Well-Known Binary (WKB) representation of a spatial object. [Geospatial functions]
-
ST_AsText
- Creates the Well-Known Text (WKT) representation of a spatial object. [Geospatial functions]
-
ST_Boundary
- Calculates the boundary of the specified GEOMETRY object. [Geospatial functions]
-
ST_Buffer
- Creates a GEOMETRY object greater than or equal to a specified distance from the boundary of a spatial object. [Geospatial functions]
-
ST_Centroid
- Calculates the geometric center—the centroid—of a spatial object. [Geospatial functions]
-
ST_Contains
- Determines if a spatial object is entirely inside another spatial object without existing only on its boundary. [Geospatial functions]
-
ST_ConvexHull
- Calculates the smallest convex GEOMETRY object that contains a GEOMETRY object. [Geospatial functions]
-
ST_Crosses
- Determines if one GEOMETRY object spatially crosses another GEOMETRY object. [Geospatial functions]
-
ST_Difference
- Calculates the part of a spatial object that does not intersect with another spatial object. [Geospatial functions]
-
ST_Disjoint
- Determines if two GEOMETRY objects do not intersect or touch. [Geospatial functions]
-
ST_Distance
- Calculates the shortest distance between two spatial objects. [Geospatial functions]
-
ST_Envelope
- Calculates the minimum bounding rectangle that contains the specified GEOMETRY object. [Geospatial functions]
-
ST_Equals
- Determines if two spatial objects are spatially equivalent. [Geospatial functions]
-
ST_GeographyFromText
- Converts a Well-Known Text (WKT) string into its corresponding GEOGRAPHY object. [Geospatial functions]
-
ST_GeographyFromWKB
- Converts a Well-Known Binary (WKB) value into its corresponding GEOGRAPHY object. [Geospatial functions]
-
ST_GeoHash
- Returns a GeoHash in the shape of the specified geometry. [Geospatial functions]
-
ST_GeometryN
- Returns the nth geometry within a geometry object. [Geospatial functions]
-
ST_GeometryType
- Determines the class of a spatial object. [Geospatial functions]
-
ST_GeomFromGeoHash
- Returns a polygon in the shape of the specified GeoHash. [Geospatial functions]
-
ST_GeomFromGeoJSON
- Converts the geometry portion of a GeoJSON record in the standard format into a GEOMETRY object. [Geospatial functions]
-
ST_GeomFromText
- Converts a Well-Known Text (WKT) string into its corresponding GEOMETRY object. [Geospatial functions]
-
ST_GeomFromWKB
- Converts the Well-Known Binary (WKB) value to its corresponding GEOMETRY object. [Geospatial functions]
-
ST_Intersection
- Calculates the set of points shared by two GEOMETRY objects. [Geospatial functions]
-
ST_Intersects
- Determines if two GEOMETRY or GEOGRAPHY objects intersect or touch at a single point. [Geospatial functions]
-
ST_IsEmpty
- Determines if a spatial object represents the empty set. [Geospatial functions]
-
ST_IsSimple
- Determines if a spatial object does not intersect itself or touch its own boundary at any point. [Geospatial functions]
-
ST_IsValid
- Determines if a spatial object is well formed or valid. [Geospatial functions]
-
ST_Length
- Calculates the length of a spatial object. [Geospatial functions]
-
ST_NumGeometries
- Returns the number of geometries contained within a spatial object. [Geospatial functions]
-
ST_NumPoints
- Calculates the number of vertices of a spatial object. Empty objects return NULL. [Geospatial functions]
-
ST_Overlaps
- Determines if a GEOMETRY object shares space with another GEOMETRY object, but is not completely contained within that object. [Geospatial functions]
-
ST_PointFromGeoHash
- Returns the center point of the specified GeoHash. [Geospatial functions]
-
ST_PointN
- Finds the nth point of a spatial object. [Geospatial functions]
-
ST_Relate
- Determines if a given GEOMETRY object is spatially related to another GEOMETRY object, based on the specified DE-9IM pattern matrix string. [Geospatial functions]
-
ST_SRID
- Identifies the spatial reference system identifier (SRID) stored with a spatial object. [Geospatial functions]
-
ST_SymDifference
- Calculates all the points in two GEOMETRY objects except for the points they have in common, but including the boundaries of both objects. [Geospatial functions]
-
ST_Touches
- Determines if two GEOMETRY objects touch at a single point or along a boundary, but do not have interiors that intersect. [Geospatial functions]
-
ST_Transform
- Returns a new GEOMETRY with its coordinates converted to the spatial reference system identifier (SRID) used by the srid argument. [Geospatial functions]
-
ST_Union
- Calculates the union of all points in two spatial objects. [Geospatial functions]
-
ST_Within
- If spatial object g1 is completely inside of spatial object g2, then ST_Within returns true. [Geospatial functions]
-
ST_X
- Determines the x-coordinate for a GEOMETRY point or the longitude value for a GEOGRAPHY point. [Geospatial functions]
-
ST_XMax
- Returns the maximum x-coordinate of the minimum bounding rectangle of the GEOMETRY or GEOGRAPHY object. [Geospatial functions]
-
ST_XMin
- Returns the minimum x-coordinate of the minimum bounding rectangle of the GEOMETRY or GEOGRAPHY object. [Geospatial functions]
-
ST_Y
- Determines the y-coordinate for a GEOMETRY point or the latitude value for a GEOGRAPHY point. [Geospatial functions]
-
ST_YMax
- Returns the maximum y-coordinate of the minimum bounding rectangle of the GEOMETRY or GEOGRAPHY object. [Geospatial functions]
-
ST_YMin
- Returns the minimum y-coordinate of the minimum bounding rectangle of the GEOMETRY or GEOGRAPHY object. [Geospatial functions]
-
START_DRAIN_SUBCLUSTER
- Drains a subcluster or subclusters. [Eon Mode functions]
-
START_REAPING_FILES
- Starts the disk file deletion in the background as an asynchronous function. [Eon Mode functions]
-
START_REBALANCE_CLUSTER
- Asynchronously rebalances the database cluster as a background task. [Cluster functions]
-
START_REFRESH
- Refreshes projections in the current schema with the latest data of their respective anchor tables. [Projection functions]
-
STATEMENT_TIMESTAMP
- Similar to TRANSACTION_TIMESTAMP, returns a value of type TIMESTAMP WITH TIME ZONE that represents the start of the current statement. [Date/time functions]
-
STDDEV [aggregate]
- Evaluates the statistical sample standard deviation for each member of the group. [Aggregate functions]
-
STDDEV [analytic]
- Computes the statistical sample standard deviation of the current row with respect to the group within a window. [Analytic functions]
-
STDDEV_POP [aggregate]
- Evaluates the statistical population standard deviation for each member of the group. [Aggregate functions]
-
STDDEV_POP [analytic]
- Evaluates the statistical population standard deviation for each member of the group. [Analytic functions]
-
STDDEV_SAMP [aggregate]
- Evaluates the statistical sample standard deviation for each member of the group. [Aggregate functions]
-
STDDEV_SAMP [analytic]
- Computes the statistical sample standard deviation of the current row with respect to the group within a window. [Analytic functions]
-
STRING_TO_ARRAY
- Splits a string containing array values and returns a native one-dimensional array. [Collection functions]
-
STRPOS
- Returns an INTEGER value that represents the location of a specified substring within a string (counting from one). [String functions]
-
STRPOSB
- Returns an INTEGER value representing the location of a specified substring within a string, counting from one, where each octet in the string is counted (as opposed to characters). [String functions]
-
STV_AsGeoJSON
- Returns the geometry or geography argument as a Geometry JavaScript Object Notation (GeoJSON) object. [Geospatial functions]
-
STV_Create_Index
- Creates a spatial index on a set of polygons to speed up spatial intersection with a set of points. [Geospatial functions]
-
STV_Describe_Index
- Retrieves information about an index that contains a set of polygons. [Geospatial functions]
-
STV_Drop_Index
- Deletes a spatial index. [Geospatial functions]
-
STV_DWithin
- Determines if the shortest distance from the boundary of one spatial object to the boundary of another object is within a specified distance. [Geospatial functions]
-
STV_Export2Shapefile
- Exports GEOGRAPHY or GEOMETRY data from a database table or a subquery to a shapefile. [Geospatial functions]
-
STV_Extent
- Returns a bounding box containing all of the input data. [Geospatial functions]
-
STV_ForceLHR
- Alters the order of the vertices of a spatial object to follow the left-hand-rule. [Geospatial functions]
-
STV_Geography
- Casts a GEOMETRY object into a GEOGRAPHY object. [Geospatial functions]
-
STV_GeographyPoint
- Returns a GEOGRAPHY point based on the input values. [Geospatial functions]
-
STV_Geometry
- Casts a GEOGRAPHY object into a GEOMETRY object. [Geospatial functions]
-
STV_GeometryPoint
- Returns a GEOMETRY point, based on the input values. [Geospatial functions]
-
STV_GetExportShapefileDirectory
- Returns the path of the export directory. [Geospatial functions]
-
STV_Intersect scalar function
- Spatially intersects a point or points with a set of polygons. [Geospatial functions]
-
STV_Intersect transform function
- Spatially intersects points and polygons. [Geospatial functions]
-
STV_IsValidReason
- Determines if a spatial object is well formed or valid. [Geospatial functions]
-
STV_LineStringPoint
- Retrieves the vertices of a linestring or multilinestring. [Geospatial functions]
-
STV_MemSize
- Returns the length of the spatial object in bytes as an INTEGER. [Geospatial functions]
-
STV_NN
- Calculates the distance of spatial objects from a reference object and returns (object, distance) pairs in ascending order by distance from the reference object. [Geospatial functions]
-
STV_PolygonPoint
- Retrieves the vertices of a polygon as individual points. [Geospatial functions]
-
STV_Refresh_Index
- Appends newly added or updated polygons and removes deleted polygons from an existing spatial index. [Geospatial functions]
-
STV_Rename_Index
- Renames a spatial index. [Geospatial functions]
-
STV_Reverse
- Reverses the order of the vertices of a spatial object. [Geospatial functions]
-
STV_SetExportShapefileDirectory
- Specifies the directory to export GEOMETRY or GEOGRAPHY data to a shapefile. [Geospatial functions]
-
STV_ShpCreateTable
- Returns a CREATE TABLE statement with the columns and types of the attributes found in the specified shapefile. [Geospatial functions]
-
STV_ShpSource and STV_ShpParser
- These two functions work with COPY to parse and load geometries and attributes from a shapefile into a Vertica table, and convert them to the appropriate GEOMETRY data type. [Geospatial functions]
-
SUBSTR
- Returns VARCHAR or VARBINARY value representing a substring of a specified string. [String functions]
-
SUBSTRB
- Returns an octet value representing the substring of a specified string. [String functions]
-
SUBSTRING
- Returns a value representing a substring of the specified string at the given position, given a value, a position, and an optional length. [String functions]
-
SUM [aggregate]
- Computes the sum of an expression over a group of rows. [Aggregate functions]
-
SUM [analytic]
- Computes the sum of an expression over a group of rows within a window. [Analytic functions]
-
SUM_FLOAT [aggregate]
- Computes the sum of an expression over a group of rows and returns a DOUBLE PRECISION value. [Aggregate functions]
-
SUMMARIZE_CATCOL
- Returns a statistical summary of categorical data input, in three columns. [Data preparation]
-
SUMMARIZE_NUMCOL
- Returns a statistical summary of columns in a Vertica table. [Data preparation]
-
SVD
- Computes singular values (the diagonal of the S matrix) and right singular vectors (the V matrix) of an SVD decomposition of the input relation. [Data preparation]
-
SVM_CLASSIFIER
- Trains the SVM model on an input relation. [Machine learning algorithms]
-
SVM_REGRESSOR
- Trains the SVM model on an input relation. [Machine learning algorithms]
-
SWAP_PARTITIONS_BETWEEN_TABLES
- Swaps partitions between two tables. [Partition functions]
-
SYNC_CATALOG
- Synchronizes the catalog to communal storage to enable reviving the current catalog version in the case of an imminent crash. [Eon Mode functions]
-
SYNC_WITH_HCATALOG_SCHEMA
- Copies the structure of a Hive database schema available through the HCatalog Connector to a Vertica schema. [Hadoop functions]
-
SYNC_WITH_HCATALOG_SCHEMA_TABLE
- Copies the structure of a single table in a Hive database schema available through the HCatalog Connector to a Vertica table. [Hadoop functions]
-
SYSDATE
- Returns the current statement's start date and time as a TIMESTAMP value. [Date/time functions]
-
TAN
- Returns a DOUBLE PRECISION value that represents the trigonometric tangent of the passed parameter. [Mathematical functions]
-
TANH
- Returns a DOUBLE PRECISION value that represents the hyperbolic tangent of the passed parameter. [Mathematical functions]
-
Template patterns for date/time formatting
- In an output template string (for TO_CHAR), certain patterns are recognized and replaced with appropriately formatted data from the value to format. [Formatting functions]
-
Template patterns for numeric formatting
- A sign formatted using SG, PL, or MI is not anchored to the number. [Formatting functions]
-
THROW_ERROR
- Returns a user-defined error message. [Error-handling functions]
-
TIME_SLICE
- Aggregates data by different fixed-time intervals and returns a rounded-up input TIMESTAMP value to a value that corresponds with the start or end of the time slice interval. [Date/time functions]
-
TIMEOFDAY
- Returns the wall-clock time as a text string. [Date/time functions]
-
TIMESTAMP_ROUND
- Rounds the specified TIMESTAMP. [Date/time functions]
-
TIMESTAMP_TRUNC
- Truncates the specified TIMESTAMP. [Date/time functions]
-
TIMESTAMPADD
- Adds the specified number of intervals to a TIMESTAMP or TIMESTAMPTZ value and returns a result of the same data type. [Date/time functions]
-
TIMESTAMPDIFF
- Returns the time span between two TIMESTAMP or TIMESTAMPTZ values, in the intervals specified. [Date/time functions]
-
TO_BITSTRING
- This topic is shared in two locations: Formatting Functions and String Functions. [Formatting functions]
-
TO_CHAR
- Converts date/time and numeric values into text strings. [Formatting functions]
-
TO_DATE
- This topic is shared in two places: Date/Time functions and Formatting Functions. [Formatting functions]
-
TO_HEX
- This topic is shared in two locations: Formatting Functions and String Functions. [Formatting functions]
-
TO_JSON
- Returns the JSON representation of a complex-type argument, including mixed and nested complex types. [Collection functions]
-
TO_NUMBER
- Converts a string value to DOUBLE PRECISION. [Formatting functions]
-
TO_TIMESTAMP
- Converts a string value or a UNIX/POSIX epoch value to a TIMESTAMP type. [Formatting functions]
-
TO_TIMESTAMP_TZ
- Converts a string value or a UNIX/POSIX epoch value to a TIMESTAMP WITH TIME ZONE type. [Formatting functions]
-
TRANSACTION_TIMESTAMP
- Returns a value of type TIMESTAMP WITH TIME ZONE that represents the start of the current transaction. [Date/time functions]
-
TRANSLATE
- Replaces individual characters in string_to_replace with other characters. [String functions]
-
TRIM
- Combines the BTRIM, LTRIM, and RTRIM functions into a single function. [String functions]
-
TRUNC
- Truncates the specified date or time. [Date/time functions]
-
TRUNC
- Returns the expression value fully truncated (toward zero). [Mathematical functions]
-
TS_FIRST_VALUE
- Processes the data that belongs to each time slice. [Aggregate functions]
-
TS_LAST_VALUE
- Processes the data that belongs to each time slice. [Aggregate functions]
-
UNNEST
- Expands the elements of one or more collection columns (ARRAY or SET) into individual rows. [Collection functions]
-
UNSANDBOX_SUBCLUSTER
- Removes a subcluster from a sandbox. [Eon Mode functions]
-
UPGRADE_MODEL
- Upgrades a model from a previous Vertica version. [Model management]
-
UPPER
- Returns a VARCHAR value containing the argument converted to uppercase letters. [String functions]
-
UPPERB
- Returns a character string with each ASCII character converted to uppercase. [String functions]
-
URI_PERCENT_DECODE
- Decodes a percent-encoded Universal Resource Identifier (URI) according to the RFC 3986 standard. [URI functions]
-
URI_PERCENT_ENCODE
- Encodes a Universal Resource Identifier (URI) according to the RFC 3986 standard for percent encoding. [URI functions]
-
USER
- Returns a VARCHAR containing the name of the user who initiated the current database connection. [System information functions]
-
USERNAME
- Returns a VARCHAR containing the name of the user who initiated the current database connection. [System information functions]
-
UUID_GENERATE
- Returns a new universally unique identifier (UUID) that is generated based on high-quality randomness from /dev/urandom. [UUID functions]
-
V6_ATON
- Converts a string containing a colon-delimited IPv6 network address into a VARBINARY string. [IP address functions]
-
V6_NTOA
- Converts an IPv6 address represented as varbinary to a character string. [IP address functions]
-
V6_SUBNETA
- Returns a VARCHAR containing a subnet address in CIDR (Classless Inter-Domain Routing) format from a binary or alphanumeric IPv6 address. [IP address functions]
-
V6_SUBNETN
- Calculates a subnet address in CIDR (Classless Inter-Domain Routing) format from a varbinary or alphanumeric IPv6 address. [IP address functions]
-
V6_TYPE
- Returns an INTEGER value that classifies the type of the network address passed to it as defined in IETF RFC 4291 section 2.4. [IP address functions]
-
VALIDATE_STATISTICS
- Validates statistics in the XML file generated by EXPORT_STATISTICS. [Statistics management functions]
-
VAR_POP [aggregate]
- Evaluates the population variance for each member of the group. [Aggregate functions]
-
VAR_POP [analytic]
- Returns the statistical population variance of a non-NULL set of numbers (NULL values are ignored) in a group within a window. [Analytic functions]
-
VAR_SAMP [aggregate]
- Evaluates the sample variance for each row of the group. [Aggregate functions]
-
VAR_SAMP [analytic]
- Returns the sample variance of a non-NULL set of numbers (NULL values in the set are ignored) for each row of the group within a window. [Analytic functions]
-
VARIANCE [aggregate]
- Evaluates the sample variance for each row of the group. [Aggregate functions]
-
VARIANCE [analytic]
- Returns the sample variance of a non-NULL set of numbers (NULL values in the set are ignored) for each row of the group within a window. [Analytic functions]
-
VERIFY_HADOOP_CONF_DIR
- Verifies that the Hadoop configuration that is used to access HDFS is valid on all Vertica nodes. [Hadoop functions]
-
VERSION
- Returns a VARCHAR containing a Vertica node's version information. [System information functions]
-
WEEK
- Returns the week of the year for the specified date as an integer, where the first week begins on the first Sunday on or preceding January 1. [Date/time functions]
-
WEEK_ISO
- Returns the week of the year for the specified date as an integer, where the first week starts on Monday and contains January 4. [Date/time functions]
-
WIDTH_BUCKET
- Constructs equiwidth histograms, in which the histogram range is divided into intervals (buckets) of identical sizes. [Mathematical functions]
-
WITHIN GROUP ORDER BY clause
- Specifies how to sort rows that are grouped by aggregate functions. [Aggregate functions]
-
XGB_CLASSIFIER
- Trains an XGBoost model for classification on an input relation. [Machine learning algorithms]
-
XGB_PREDICTOR_IMPORTANCE
- Measures the importance of the predictors in an XGBoost model. [Model evaluation]
-
XGB_REGRESSOR
- Trains an XGBoost model for regression on an input relation. [Machine learning algorithms]
-
YEAR
- Returns an integer that represents the year portion of the specified date. [Date/time functions]
-
YEAR_ISO
- Returns an integer that represents the year portion of the specified date. [Date/time functions]
-
ZEROIFNULL
- Evaluates to 0 if the column is NULL. [NULL-handling functions]
1 - Aggregate functions
Note
All functions in this section that have an analytic function counterpart are appended with [Aggregate] to avoid confusion between the two.
Aggregate functions summarize data over groups of rows from a query result set. The groups are specified using the GROUP BY clause. They are allowed only in the select list and in the HAVING and ORDER BY clauses of a SELECT statement (as described in Aggregate expressions).
Notes
-
Except for COUNT, these functions return a null value when no rows are selected. In particular, SUM of no rows returns NULL, not zero.
-
In some cases you can replace an expression that includes multiple aggregates with a single aggregate of an expression. For example, SUM(x) + SUM(y) can be expressed as SUM(x+y) (where x and y are NOT NULL).
-
Vertica does not support nested aggregate functions.
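The first two notes can be checked with a small sketch. It is written in Python and uses the standard-library SQLite engine as a stand-in for Vertica, which is an assumption made only so that the example runs anywhere; the tables t and n and their contents are hypothetical. It shows SUM returning NULL and COUNT returning 0 when no rows are selected, and that rewriting SUM(x) + SUM(y) as SUM(x+y) is safe only when x and y contain no NULL values.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in; table names and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER NOT NULL, y INTEGER NOT NULL)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# No rows selected: SUM yields NULL (None in Python), COUNT yields 0.
empty_sum, empty_count = conn.execute(
    "SELECT SUM(x), COUNT(x) FROM t WHERE x > 100"
).fetchone()

# With NOT NULL columns, the two-aggregate and single-aggregate forms agree.
two_aggs, one_agg = conn.execute(
    "SELECT SUM(x) + SUM(y), SUM(x + y) FROM t"
).fetchone()

# With NULL values present, the two forms diverge, which is why the
# rewrite is stated only for NOT NULL inputs.
conn.execute("CREATE TABLE n (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO n VALUES (?, ?)", [(1, None), (None, 2)])
null_two_aggs, null_one_agg = conn.execute(
    "SELECT SUM(x) + SUM(y), SUM(x + y) FROM n"
).fetchone()

print(empty_sum, empty_count)
print(two_aggs, one_agg)
print(null_two_aggs, null_one_agg)
```

In the last query, every x + y is NULL because one operand is NULL in each row, so the single aggregate has nothing to add up, while the separate sums each skip their own NULL values.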
You can also use some of the simple aggregate functions as analytic (window) functions. See Analytic functions for details. See also SQL analytics.
1.1 - APPROXIMATE_COUNT_DISTINCT
Returns the number of distinct non-NULL values in a data set.
Behavior type
Immutable
Syntax
APPROXIMATE_COUNT_DISTINCT ( expression[, error-tolerance ] )
Parameters
expression
- Value to be evaluated using any data type that supports equality comparison.
error-tolerance
- Numeric value that represents the desired percentage of error tolerance, distributed around the value returned by this function. The smaller the error tolerance, the closer the approximation.
You can set error-tolerance to a minimum value of 0.88. Vertica imposes no maximum restriction, but any value greater than 5 is implemented with 5% error tolerance.
If you omit this argument, Vertica uses an error tolerance of 1.25 percent.
Restrictions
APPROXIMATE_COUNT_DISTINCT and DISTINCT aggregates cannot be in the same query block.
Error tolerance
APPROXIMATE_COUNT_DISTINCT(x, error-tolerance) returns a value equal to COUNT(DISTINCT x), with an error that is lognormally distributed with a standard deviation derived from error-tolerance.
Parameter error-tolerance is optional. Supply this argument to specify the desired standard deviation. error-tolerance is defined as 2.17 standard deviations, which corresponds to a 97 percent confidence interval:
standard-deviation = error-tolerance / 2.17
For example:
-
error-tolerance = 1
Corresponds to a standard deviation of 1 / 2.17 ≈ 0.46 percent. 97 percent of the time, APPROXIMATE_COUNT_DISTINCT(x,1) returns a value between:
-
COUNT(DISTINCT x) * 0.99
-
COUNT(DISTINCT x) * 1.01
-
error-tolerance = 5
97 percent of the time, APPROXIMATE_COUNT_DISTINCT(x,5) returns a value between:
-
COUNT(DISTINCT x) * 0.95
-
COUNT(DISTINCT x) * 1.05
A 99 percent confidence interval corresponds to 2.58 standard deviations. To set error-tolerance for a confidence level of 99 percent (instead of 97 percent), multiply error-tolerance by 2.17 / 2.58 = 0.841.
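The arithmetic in this section can be reproduced with a short Python sketch. It is not part of Vertica; the function names are hypothetical, the constants 2.17 and 2.58 are the ones quoted above, and the bounds use the simple multiplicative form shown in this section (count times 1 minus or plus the tolerance). The exact count 19982 is taken from this topic's examples.

```python
# Constants quoted from the text: standard deviations spanned by each interval.
Z_97 = 2.17  # 97 percent confidence interval
Z_99 = 2.58  # 99 percent confidence interval


def standard_deviation(error_tolerance):
    """Standard deviation (in percent) implied by an error tolerance."""
    return error_tolerance / Z_97


def bounds(exact_count, error_tolerance):
    """Range expected to contain the estimate 97 percent of the time."""
    fraction = error_tolerance / 100.0
    return exact_count * (1 - fraction), exact_count * (1 + fraction)


def tolerance_for_99_percent(desired_tolerance):
    """error-tolerance to supply so the same bounds hold 99 percent of the time."""
    return desired_tolerance * Z_97 / Z_99


exact = 19982  # COUNT(DISTINCT product_key) reported in this topic's examples
low, high = bounds(exact, 5)

print(round(standard_deviation(5), 3))
print(round(low, 1), round(high, 1))
print(round(tolerance_for_99_percent(5), 1))
```

The five-percent estimate of 19431 reported in this topic's examples falls inside the computed range.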
For example, if you specify error-tolerance as 5 * 0.841 = 4.2, APPROXIMATE_COUNT_DISTINCT(x,4.2) returns values 99 percent of the time between:
-
COUNT(DISTINCT x) * 0.95
-
COUNT(DISTINCT x) * 1.05
Examples
Count the total number of distinct values in column product_key from table store.store_sales_fact:
=> SELECT COUNT(DISTINCT product_key) FROM store.store_sales_fact;
COUNT
-------
19982
(1 row)
Count the approximate number of distinct values in product_key with various error tolerances. The smaller the error tolerance, the closer the approximation:
=> SELECT APPROXIMATE_COUNT_DISTINCT(product_key,5) AS five_pct_accuracy,
APPROXIMATE_COUNT_DISTINCT(product_key,1) AS one_pct_accuracy,
APPROXIMATE_COUNT_DISTINCT(product_key,.88) AS point_eighteight_pct_accuracy
FROM store.store_sales_fact;
five_pct_accuracy | one_pct_accuracy | point_eighteight_pct_accuracy
-------------------+------------------+-------------------------------
19431 | 19921 | 19921
(1 row)
See also
Approximate count distinct functions
1.2 - APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS
Calculates the number of distinct non-NULL values from the synopsis objects created by APPROXIMATE_COUNT_DISTINCT_SYNOPSIS.
Behavior type
Immutable
Syntax
APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS ( synopsis-obj[, error-tolerance ] )
Parameters
synopsis-obj
- A synopsis object created by APPROXIMATE_COUNT_DISTINCT_SYNOPSIS.
error-tolerance
- Numeric value that represents the desired percentage of error tolerance, distributed around the value returned by this function. The smaller the error tolerance, the closer the approximation.
You can set error-tolerance to a minimum value of 0.88. Vertica imposes no maximum restriction, but any value greater than 5 is implemented with 5% error tolerance.
If you omit this argument, Vertica uses an error tolerance of 1.25 percent.
For more details, see APPROXIMATE_COUNT_DISTINCT.
Restrictions
APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS and DISTINCT aggregates cannot be in the same query block.
Examples
The following examples review and compare different ways to obtain a count of unique values in a table column:
Return an exact count of unique values in column product_key, from table store.store_sales_fact:
=> \timing
Timing is on.
=> SELECT COUNT(DISTINCT product_key) from store.store_sales_fact;
count
-------
19982
(1 row)
Time: First fetch (1 row): 553.033 ms. All rows formatted: 553.075 ms
Return an approximate count of unique values in column product_key:
=> SELECT APPROXIMATE_COUNT_DISTINCT(product_key) as unique_product_keys
FROM store.store_sales_fact;
unique_product_keys
---------------------
19921
(1 row)
Time: First fetch (1 row): 394.562 ms. All rows formatted: 394.600 ms
Create a synopsis object that represents a set of store.store_sales_fact data with unique product_key values, and store the synopsis in the new table my_summary:
=> CREATE TABLE my_summary AS SELECT APPROXIMATE_COUNT_DISTINCT_SYNOPSIS (product_key) syn
FROM store.store_sales_fact;
CREATE TABLE
Time: First fetch (0 rows): 582.662 ms. All rows formatted: 582.682 ms
Return a count from the saved synopsis:
=> SELECT APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS(syn) FROM my_summary;
ApproxCountDistinctOfSynopsis
-------------------------------
19921
(1 row)
Time: First fetch (1 row): 105.295 ms. All rows formatted: 105.335 ms
See also
Approximate count distinct functions
1.3 - APPROXIMATE_COUNT_DISTINCT_SYNOPSIS
Summarizes the information of distinct non-NULL values and materializes the result set in a VARBINARY or LONG VARBINARY synopsis object. The calculated result is within a specified range of error tolerance. You save the synopsis object in a Vertica table for use by APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS.
Behavior type
Immutable
Syntax
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS ( expression[, error-tolerance] )
Parameters
expression
- Value to evaluate using any data type that supports equality comparison.
error-tolerance
Numeric value that represents the desired percentage of error tolerance, distributed around the value returned by this function. The smaller the error tolerance, the closer the approximation.
You can set error-tolerance to a minimum value of 0.88. Vertica imposes no maximum restriction, but any value greater than 5 is implemented with 5% error tolerance.
If you omit this argument, Vertica uses an error tolerance of 1.25%.
For more details, see APPROXIMATE_COUNT_DISTINCT.
Restrictions
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS and DISTINCT aggregates cannot be in the same query block.
Examples
See APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS.
See also
Approximate count distinct functions
1.4 - APPROXIMATE_COUNT_DISTINCT_SYNOPSIS_MERGE
Aggregates multiple synopses into one new synopsis. This function is similar to APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS but returns one synopsis instead of the count estimate. The benefit of this function is that it speeds up final estimation when calling APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS.
For example, if you need to regularly estimate the distinct count of users over a long period of time (such as several years), you can pre-accumulate daily synopses into one synopsis per year.
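The merge-then-count flow can be pictured with a Python sketch. It uses exact Python sets as a stand-in for synopses purely to show the data flow; a real synopsis is a compact approximate summary stored as VARBINARY, and the data and helper names below are hypothetical.

```python
from functools import reduce

# Hypothetical per-day data: the set of user IDs seen on each day.
# An exact set stands in for a synopsis only to illustrate the flow.
daily_synopses = {
    "2024-01-01": {"u1", "u2", "u3"},
    "2024-01-02": {"u2", "u3", "u4"},
    "2024-01-03": {"u5"},
}


def merge_synopses(synopses):
    """Combine many synopses into one (the role of the SYNOPSIS_MERGE step)."""
    return reduce(set.union, synopses, set())


def count_of_synopsis(synopsis):
    """Produce the distinct count from one synopsis (the OF_SYNOPSIS step)."""
    return len(synopsis)


# Merge once, store the result, and reuse it for later estimates.
merged = merge_synopses(daily_synopses.values())
print(count_of_synopsis(merged))  # 5
```

Merging ahead of time means the final estimate reads one stored object instead of re-aggregating every daily row.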
Behavior type
Immutable
Syntax
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS_MERGE ( synopsis-obj [, error-tolerance] )
Parameters
synopsis-obj
- An expression that can be evaluated to one or more synopses. Typically a synopsis-obj is generated as a binary string by either the APPROXIMATE_COUNT_DISTINCT_SYNOPSIS or APPROXIMATE_COUNT_DISTINCT_SYNOPSIS_MERGE function and is stored in a table column of type VARBINARY or LONG VARBINARY.
error-tolerance
Numeric value that represents the desired percentage of error tolerance, distributed around the value returned by this function. The smaller the error tolerance, the closer the approximation.
You can set error-tolerance to a minimum value of 0.88. Vertica imposes no maximum restriction, but any value greater than 5 is implemented with 5% error tolerance.
If you omit this argument, Vertica uses an error tolerance of 1.25%.
For more details, see APPROXIMATE_COUNT_DISTINCT.
Examples
See Approximate count distinct functions.
1.5 - APPROXIMATE_MEDIAN [aggregate]
Computes the approximate median of an expression over a group of rows. The function returns a FLOAT value.
APPROXIMATE_MEDIAN is an alias of APPROXIMATE_PERCENTILE [aggregate] with a parameter of 0.5.
Note
This function is best suited for large groups of data. If you have a small group of data, use the exact MEDIAN [analytic] function.
Behavior type
Immutable
Syntax
APPROXIMATE_MEDIAN ( expression )
Parameters
expression
- Any FLOAT or INTEGER data type. The function returns the approximate middle value or an interpolated value that would be the approximate middle value once the values are sorted. Null values are ignored in the calculation.
Examples
Tip
For optimal performance when using GROUP BY in your query, verify that your table is sorted on the GROUP BY column.
The following examples use this table:
CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT) ORDER BY state;
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;
Calculate the approximate median of all sales in this table:
=> SELECT APPROXIMATE_MEDIAN (sales) FROM allsales;
APPROXIMATE_MEDIAN
--------------------
20
(1 row)
Modify the query to group sales by state, and obtain the approximate median for each one:
=> SELECT state, APPROXIMATE_MEDIAN(sales) FROM allsales GROUP BY state;
state | APPROXIMATE_MEDIAN
-------+--------------------
MA | 35
NY | 20
(2 rows)
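As a cross-check, the exact medians of the same sample data can be computed outside the database. This Python sketch uses the standard library's exact median, not Vertica's approximation algorithm; on a table this small the two agree with the output shown above.

```python
from collections import defaultdict
from statistics import median

# The rows inserted into allsales above: (state, name, sales).
allsales = [
    ("MA", "A", 60), ("NY", "B", 20), ("NY", "C", 15), ("MA", "D", 20),
    ("MA", "E", 50), ("NY", "F", 40), ("MA", "G", 10),
]

# Median over the whole table.
overall = median(sales for _, _, sales in allsales)

# Median per state, mirroring GROUP BY state.
by_state = defaultdict(list)
for state, _, sales in allsales:
    by_state[state].append(sales)
per_state = {state: median(values) for state, values in sorted(by_state.items())}

print(overall)    # 20
print(per_state)  # {'MA': 35.0, 'NY': 20}
```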
1.6 - APPROXIMATE_PERCENTILE [aggregate]
Computes the approximate percentile of an expression over a group of rows. This function returns a FLOAT value.
Note
Use this function when many rows are aggregated into groups. If the number of aggregated rows is small, use the analytic function PERCENTILE_CONT.
Behavior type
Immutable
Syntax
APPROXIMATE_PERCENTILE ( column-expression USING PARAMETERS percentiles='percentile-values' )
Arguments
column-expression
- A column of FLOAT or INTEGER data types whose percentiles will be calculated. NULL values are ignored.
Parameters
percentiles
- One or more (up to 1000) comma-separated FLOAT constants ranging from 0 to 1 inclusive, specifying the percentile values to be calculated.
Note
The deprecated parameter percentile, which takes only a single float, continues to be supported for backward compatibility.
Examples
Tip
For optimal performance when using GROUP BY in your query, verify that your table is sorted on the GROUP BY column.
The following examples use this table:
=> CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT) ORDER BY state;
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;
=> SELECT * FROM allsales;
state | name | sales
-------+------+-------
MA | A | 60
NY | B | 20
NY | C | 15
NY | F | 40
MA | D | 20
MA | E | 50
MA | G | 10
(7 rows)
Calculate the approximate percentile for sales in each state:
=> SELECT state, APPROXIMATE_PERCENTILE(sales USING PARAMETERS percentiles='0.5') AS median
FROM allsales GROUP BY state;
state | median
-------+--------
MA | 35
NY | 20
(2 rows)
Calculate multiple approximate percentiles for sales in each state:
=> SELECT state, APPROXIMATE_PERCENTILE(sales USING PARAMETERS percentiles='0.5,1.0')
FROM allsales GROUP BY state;
state | APPROXIMATE_PERCENTILE
-------+--------
MA | [35.0,60.0]
NY | [20.0,40.0]
(2 rows)
Calculate multiple approximate percentiles for sales in each state and show results for each percentile in separate columns:
=> SELECT ps[0] as q0, ps[1] as q1, ps[2] as q2, ps[3] as q3, ps[4] as q4
FROM (SELECT APPROXIMATE_PERCENTILE(sales USING PARAMETERS percentiles='0, 0.25, 0.5, 0.75, 1')
AS ps FROM allsales GROUP BY state) as s1;
q0 | q1 | q2 | q3 | q4
------+------+------+------+------
10.0 | 17.5 | 35.0 | 52.5 | 60.0
15.0 | 17.5 | 20.0 | 30.0 | 40.0
(2 rows)
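The interpolated values in the last result (for example, 17.5 and 52.5) can be reproduced with ordinary linear interpolation between sorted values. The Python sketch below is an exact calculation under that assumption; it is not a description of how Vertica computes its approximation, and the function name is hypothetical.

```python
import math


def percentile_linear(values, p):
    """Exact percentile using linear interpolation between sorted values."""
    data = sorted(v for v in values if v is not None)  # NULLs are ignored
    if not data:
        return None
    pos = p * (len(data) - 1)          # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    return float(data[lo] + (data[hi] - data[lo]) * (pos - lo))


ma_sales = [60, 20, 50, 10]
ny_sales = [20, 15, 40]
percentiles = [0, 0.25, 0.5, 0.75, 1]

print([percentile_linear(ma_sales, p) for p in percentiles])
# [10.0, 17.5, 35.0, 52.5, 60.0]
print([percentile_linear(ny_sales, p) for p in percentiles])
# [15.0, 17.5, 20.0, 30.0, 40.0]
```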
1.7 - APPROXIMATE_QUANTILES
Computes an array of weighted, approximate percentiles of a column within some user-specified error. This algorithm is similar to APPROXIMATE_PERCENTILE [aggregate], which instead returns a single percentile.
The performance of this function depends entirely on the specified epsilon and the size of the provided array.
The OVER clause for this function must be empty.
Behavior type
Immutable
Syntax
APPROXIMATE_QUANTILES ( column USING PARAMETERS [nquantiles=n], [epsilon=error] ) OVER() FROM table
Parameters
column
- The INTEGER or FLOAT column for which to calculate the percentiles. NULL values are ignored.
n
- An integer that specifies the number of desired quantiles in the returned array.
Default: 11
error
- The allowed error for any returned percentile. Specifically, for an array of size N, the specified error ε (epsilon) for the φ-quantile guarantees that the rank r of the return value with respect to the rank ⌊φN⌋ of the exact value is such that:
⌊(φ-ε)N⌋ ≤ r ≤ ⌊(φ+ε)N⌋
For n quantiles, if the error ε is specified such that ε > 1/n, this function will return non-deterministic results.
Default: 0.001
table
- The table containing column.
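To make the error bound on the returned rank concrete, the following Python sketch evaluates ⌊(φ-ε)N⌋ and ⌊(φ+ε)N⌋ for given inputs. The function name is hypothetical, and exact fractions are used so that floating-point rounding does not shift a floor by one.

```python
from fractions import Fraction
from math import floor


def rank_bounds(phi, eps, n):
    """Return (lowest, highest) rank allowed for the phi-quantile.

    phi and eps are passed as decimal strings so the arithmetic is exact.
    """
    phi, eps = Fraction(phi), Fraction(eps)
    return floor((phi - eps) * n), floor((phi + eps) * n)


# Median of 1000 values with the default epsilon of 0.001.
print(rank_bounds("0.5", "0.001", 1000))  # (499, 501)

# 0.9-quantile of 250 values with a looser epsilon of 0.01.
print(rank_bounds("0.9", "0.01", 250))    # (222, 227)
```

A smaller epsilon narrows the range of acceptable ranks, which is why it drives the cost of the calculation.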
Examples
The following example uses this table:
=> CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT) ORDER BY state;
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;
=> SELECT * FROM allsales;
state | name | sales
-------+------+-------
MA | A | 60
NY | B | 20
NY | C | 15
NY | F | 40
MA | D | 20
MA | E | 50
MA | G | 10
(7 rows)
This call to APPROXIMATE_QUANTILES returns a 6-element array of approximate percentiles, one for each quantile. Each quantile relates to the percentile by a factor of 100. For example, the second entry in the output indicates that 15 is the 0.2-quantile of the input column, so 15 is the 20th percentile of the input column.
=> SELECT APPROXIMATE_QUANTILES(sales USING PARAMETERS nquantiles=6) OVER() FROM allsales;
Quantile | Value
----------+-------
0 | 10
0.2 | 15
0.4 | 20
0.6 | 40
0.8 | 50
1 | 60
(6 rows)
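The quantile positions in this output are evenly spaced between 0 and 1, so nquantiles=6 yields steps of 0.2. The Python sketch below computes those positions and picks an exact value for each one using a simple rank rule (index ⌊φN⌋ in the sorted data, capped at the last element). That rule is an assumption chosen for illustration; it happens to reproduce the sample output but is not taken from the Vertica implementation.

```python
from fractions import Fraction
from math import floor

sales = [60, 20, 15, 20, 50, 40, 10]  # the sales column of allsales


def exact_quantiles(values, nquantiles):
    """Return (quantile, value) pairs for evenly spaced quantiles.

    Requires nquantiles >= 2 so that both 0 and 1 are included.
    """
    data = sorted(values)
    n = len(data)
    pairs = []
    for i in range(nquantiles):
        phi = Fraction(i, nquantiles - 1)    # 0, 1/5, 2/5, ... for 6 quantiles
        index = min(floor(phi * n), n - 1)   # illustrative rank rule
        pairs.append((float(phi), data[index]))
    return pairs


for quantile, value in exact_quantiles(sales, 6):
    print(quantile, value)
```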
1.8 - ARGMAX_AGG
Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. ARGMAX_AGG finds the row with the highest non-null value in target and returns the value of arg in that row. If multiple rows contain the highest target value, ARGMAX_AGG returns arg from the first row that it finds. Use the WITHIN GROUP ORDER BY clause to control which row ARGMAX_AGG finds first.
Behavior type
Immutable if the WITHIN GROUP ORDER BY clause specifies a column or set of columns that resolves to unique values within the group; otherwise Volatile.
Syntax
ARGMAX_AGG ( target, arg ) [ within-group-order-by-clause ]
Arguments
target, arg
- Columns in the queried dataset.
Note
The target argument cannot reference a spatial data type column, GEOMETRY or GEOGRAPHY.
within-group-order-by-clause
- Sorts target values within each group of rows:
WITHIN GROUP (ORDER BY { column-expression[ sort-qualifiers ] }[,...])
sort-qualifiers:
{ ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] }
Use this clause to determine which row is returned when multiple rows contain the highest target value; otherwise, results are likely to vary with each iteration of the same query.
Tip
WITHIN GROUP ORDER BY can consume a large amount of memory per group. To minimize memory consumption, create projections that support GROUPBY PIPELINED.
Examples
The following example calls ARGMAX_AGG in a WITH clause to find which employees in each region are at or near retirement age. If multiple employees within a region have the same age, ARGMAX_AGG chooses the employee with the highest salary and returns that employee's ID. The primary query returns details on the employee selected from each region:
=> WITH r AS (SELECT employee_region, ARGMAX_AGG(employee_age, employee_key)
WITHIN GROUP (ORDER BY annual_salary DESC) emp_id
FROM employee_dim GROUP BY employee_region ORDER BY employee_region)
SELECT r.employee_region, ed.annual_salary AS highest_salary, employee_key,
ed.employee_first_name||' '||ed.employee_last_name AS employee_name, ed.employee_age
FROM r JOIN employee_dim ed ON r.emp_id = ed.employee_key ORDER BY ed.employee_region;
employee_region | highest_salary | employee_key | employee_name | employee_age
----------------------------------+----------------+--------------+------------------+--------------
East | 927335 | 70 | Sally Gauthier | 65
MidWest | 177716 | 869 | Rebecca McCabe | 65
NorthWest | 100300 | 7597 | Kim Jefferson | 65
South | 196454 | 275 | Alexandra Harris | 65
SouthWest | 198669 | 1043 | Seth Stein | 65
West | 197203 | 681 | Seth Jones | 65
(6 rows)
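The tie-breaking role of WITHIN GROUP ORDER BY can be modeled in a few lines of Python. This is a sketch of the semantics described above, not Vertica's implementation; the rows and the helper name are hypothetical.

```python
def argmax_agg(rows, target, arg, order_key=None, descending=False):
    """Return rows[arg] from the first row holding the highest non-null target."""
    candidates = [row for row in rows if row[target] is not None]
    if not candidates:
        return None
    if order_key is not None:
        # Models WITHIN GROUP (ORDER BY ...): fixes which tied row comes first.
        candidates.sort(key=lambda row: row[order_key], reverse=descending)
    # max() keeps the first maximal element it meets, so ordering decides ties.
    best = max(candidates, key=lambda row: row[target])
    return best[arg]


# Hypothetical employees in one region; keys 1 and 2 tie on age.
employees = [
    {"employee_key": 1, "employee_age": 65, "annual_salary": 90000},
    {"employee_key": 2, "employee_age": 65, "annual_salary": 120000},
    {"employee_key": 3, "employee_age": 40, "annual_salary": 200000},
    {"employee_key": 4, "employee_age": None, "annual_salary": 300000},
]

# With ORDER BY annual_salary DESC, the higher-paid 65-year-old wins the tie.
print(argmax_agg(employees, "employee_age", "employee_key",
                 order_key="annual_salary", descending=True))  # 2

# Without an ordering, the answer depends on the order rows happen to arrive in.
print(argmax_agg(employees, "employee_age", "employee_key"))   # 1
```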
See also
ARGMIN_AGG
1.9 - ARGMIN_AGG
Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. ARGMIN_AGG finds the row with the lowest non-null value in target and returns the value of arg in that row. If multiple rows contain the lowest target value, ARGMIN_AGG returns arg from the first row that it finds. Use the WITHIN GROUP ORDER BY clause to control which row ARGMIN_AGG finds first.
Behavior type
Immutable if the WITHIN GROUP ORDER BY clause specifies a column or set of columns that resolves to unique values within the group; otherwise Volatile.
Syntax
ARGMIN_AGG ( target, arg ) [ within-group-order-by-clause ]
Arguments
target, arg
- Columns in the queried dataset.
Note
The target argument cannot reference a spatial data type column, GEOMETRY or GEOGRAPHY.
within-group-order-by-clause
- Sorts target values within each group of rows:
WITHIN GROUP (ORDER BY { column-expression[ sort-qualifiers ] }[,...])
sort-qualifiers:
{ ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] }
Use this clause to determine which row is returned when multiple rows contain the lowest target value; otherwise, results are likely to vary with each iteration of the same query.
Tip
WITHIN GROUP ORDER BY can consume a large amount of memory per group. To minimize memory consumption, create projections that support GROUPBY PIPELINED.
Examples
The following example calls ARGMIN_AGG in a WITH clause to find the lowest salary among all employees in each region, and returns the IDs of the lowest-paid employees. The primary query returns the salary amounts and employee names:
=> WITH msr (employee_region, emp_id) AS
(SELECT employee_region, argmin_agg(annual_salary, employee_key) lowest_paid_employee FROM employee_dim GROUP BY employee_region)
SELECT msr.employee_region, ed.annual_salary AS lowest_salary, ed.employee_first_name||' '||ed.employee_last_name AS employee_name
FROM msr JOIN employee_dim ed ON msr.emp_id = ed.employee_key ORDER BY annual_salary DESC;
employee_region | lowest_salary | employee_name
----------------------------------+---------------+-----------------
NorthWest | 20913 | Raja Garnett
SouthWest | 20750 | Seth Moore
West | 20443 | Midori Taylor
South | 20363 | David Bauer
East | 20306 | Craig Jefferson
MidWest | 20264 | Dean Vu
(6 rows)
See also
ARGMAX_AGG
1.10 - AVG [aggregate]
Computes the average (arithmetic mean) of an expression over a group of rows. AVG always returns a DOUBLE PRECISION value.
The AVG aggregate function differs from the AVG analytic function, which computes the average of an expression over a group of rows within a window.
Behavior type
Immutable
Syntax
AVG ( [ ALL | DISTINCT ] expression )
Parameters
ALL
- Invokes the aggregate function for all rows in the group (default).
DISTINCT
- Invokes the aggregate function for all distinct non-null values of the expression found in the group.
expression
- The value whose average is calculated over a set of rows. This can be any expression that can have a DOUBLE PRECISION result.
Overflow handling
By default, Vertica allows silent numeric overflow when you call this function on numeric data types. For more information on this behavior and how to change it, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.
Examples
The following query returns the average income from the customer_dimension table:
=> SELECT AVG(annual_income) FROM customer_dimension;
AVG
--------------
2104270.6485
(1 row)
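The difference between ALL and DISTINCT is easiest to see on a small input. The Python sketch below models the semantics on hypothetical values; it assumes the standard SQL rule that NULL inputs do not contribute to an average.

```python
def avg(values, distinct=False):
    """Average of non-null values; with distinct=True each value counts once."""
    non_null = [v for v in values if v is not None]
    if distinct:
        non_null = list(set(non_null))
    if not non_null:
        return None  # no non-null input: the result is NULL
    return sum(non_null) / len(non_null)


incomes = [100, 100, 400, None]  # hypothetical annual_income values

print(avg(incomes))                 # 200.0  (100 + 100 + 400) / 3
print(avg(incomes, distinct=True))  # 250.0  (100 + 400) / 2
```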
1.11 - BIT_AND
Takes the bitwise AND of all non-null input values. If the input parameter is NULL, the return value is also NULL.
Behavior type
Immutable
Syntax
BIT_AND ( expression )
Parameters
expression
- The BINARY or VARBINARY input value to evaluate. BIT_AND operates on VARBINARY types explicitly and on BINARY types implicitly through casts.
Returns
BIT_AND returns:
- The same value as the argument data type.
- 1 for each bit compared, if all bits are 1; otherwise 0.
If the columns are different lengths, the return values are treated as though they are all equal in length and are right-extended with zero bytes. For example, given a group containing hex values ff, null, and f, BIT_AND ignores the null value and extends the value f to f0.
Examples
The example that follows uses table t with a single column of VARBINARY data type:
=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));
Query table t to see column c output:
=> SELECT TO_HEX(c) FROM t;
TO_HEX
--------
ff00
ffff
f00f
(3 rows)
Query table t to get the AND value for column c:
=> SELECT TO_HEX(BIT_AND(c)) FROM t;
TO_HEX
--------
f000
(1 row)
The function is applied pairwise to all values in the group, resulting in f000, which is determined as follows:
- ff00 (record 1) is compared with ffff (record 2), which results in ff00.
- The result from the previous comparison is compared with f00f (record 3), which results in f000.
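The pairwise evaluation described above is a left fold. This Python sketch models it on byte strings, including the skipping of NULLs and the right-extension of shorter values with zero bytes; the helper name is hypothetical and the code illustrates the semantics rather than Vertica internals.

```python
from functools import reduce


def bit_and(values):
    """Bitwise AND over all non-null byte strings, as a pairwise fold."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    width = max(len(v) for v in non_null)
    # Right-extend shorter values with zero bytes so all lengths match.
    padded = [v.ljust(width, b"\x00") for v in non_null]
    return reduce(lambda a, b: bytes(x & y for x, y in zip(a, b)), padded)


column_c = [bytes.fromhex("ff00"), bytes.fromhex("ffff"), bytes.fromhex("f00f")]
print(bit_and(column_c).hex())  # f000

# A NULL is skipped, and the one-byte value f0 is treated as f000.
print(bit_and([bytes.fromhex("ffff"), None, bytes.fromhex("f0")]).hex())  # f000
```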
See also
Binary data types (BINARY and VARBINARY)
1.12 - BIT_OR
Takes the bitwise OR of all non-null input values. If the input parameter is NULL, the return value is also NULL.
Behavior type
Immutable
Syntax
BIT_OR ( expression )
Parameters
expression
- The BINARY or VARBINARY input value to evaluate. BIT_OR operates on VARBINARY types explicitly and on BINARY types implicitly through casts.
Returns
BIT_OR returns:
- The same value as the argument data type.
- 1 for each bit compared, if at least one bit is 1; otherwise 0.
If the columns are different lengths, the return values are treated as though they are all equal in length and are right-extended with zero bytes. For example, given a group containing hex values ff, null, and f, the function ignores the null value and extends the value f to f0.
Examples
The example that follows uses table t with a single column of VARBINARY data type:
=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));
Query table t to see column c output:
=> SELECT TO_HEX(c) FROM t;
TO_HEX
--------
ff00
ffff
f00f
(3 rows)
Query table t to get the OR value for column c:
=> SELECT TO_HEX(BIT_OR(c)) FROM t;
TO_HEX
--------
ffff
(1 row)
The function is applied pairwise to all values in the group, resulting in ffff, which is determined as follows:
- ff00 (record 1) is compared with ffff (record 2), which results in ffff.
- The ffff result from the previous comparison is compared with f00f (record 3), which results in ffff.
See also
Binary data types (BINARY and VARBINARY)
1.13 - BIT_XOR
Takes the bitwise XOR of all non-null input values. If the input parameter is NULL, the return value is also NULL.
Behavior type
Immutable
Syntax
BIT_XOR ( expression )
Parameters
expression
- The BINARY or VARBINARY input value to evaluate. BIT_XOR operates on VARBINARY types explicitly and on BINARY types implicitly through casts.
Returns
BIT_XOR returns:
- The same value as the argument data type.
- 1 for each bit compared, if there are an odd number of arguments with set bits; otherwise 0.
If the columns are different lengths, the return values are treated as though they are all equal in length and are right-extended with zero bytes. For example, given a group containing hex values ff, null, and f, the function ignores the null value and extends the value f to f0.
Examples
The example that follows uses table t with a single column of VARBINARY data type:
=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));
Query table t to see column c output:
=> SELECT TO_HEX(c) FROM t;
TO_HEX
--------
ff00
ffff
f00f
(3 rows)
Query table t to get the XOR value for column c:
=> SELECT TO_HEX(BIT_XOR(c)) FROM t;
TO_HEX
--------
f0f0
(1 row)
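The f0f0 result follows from the parity rule stated under Returns: a result bit is 1 when an odd number of inputs have that bit set. The Python sketch below applies that rule bit by bit; it is an illustration of the semantics with a hypothetical helper name, not Vertica's implementation.

```python
def bit_xor(values):
    """Bitwise XOR over non-null byte strings, computed with the parity rule."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    width = max(len(v) for v in non_null)
    # Right-extend shorter values with zero bytes so all lengths match.
    padded = [v.ljust(width, b"\x00") for v in non_null]
    out = bytearray(width)
    for byte_index in range(width):
        for bit in range(8):
            mask = 1 << bit
            set_count = sum(1 for v in padded if v[byte_index] & mask)
            if set_count % 2 == 1:  # odd number of inputs have this bit set
                out[byte_index] |= mask
    return bytes(out)


column_c = [bytes.fromhex("ff00"), bytes.fromhex("ffff"), bytes.fromhex("f00f")]
print(bit_xor(column_c).hex())  # f0f0
```

In the first byte, the high four bits are set in all three values (odd), while the low four bits are set in only two (even), which gives f0.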
See also
Binary data types (BINARY and VARBINARY)
1.14 - BOOL_AND [aggregate]
Processes Boolean values and returns a Boolean value result. If all input values are true, BOOL_AND returns t (true). Otherwise, it returns f (false).
Behavior type
Immutable
Syntax
BOOL_AND ( expression )
Parameters
expression
- A Boolean data type or any non-Boolean data type that can be implicitly coerced to a Boolean data type.
Examples
The following example shows how to use aggregate functions BOOL_AND, BOOL_OR, and BOOL_XOR. The sample table mixers includes columns for models and colors.
=> CREATE TABLE mixers(model VARCHAR(20), colors VARCHAR(20));
CREATE TABLE
Insert sample data into the table. The sample adds two color fields for each model.
=> INSERT INTO mixers
SELECT 'beginner', 'green'
UNION ALL
SELECT 'intermediate', 'blue'
UNION ALL
SELECT 'intermediate', 'blue'
UNION ALL
SELECT 'advanced', 'green'
UNION ALL
SELECT 'advanced', 'blue'
UNION ALL
SELECT 'professional', 'blue'
UNION ALL
SELECT 'professional', 'green'
UNION ALL
SELECT 'beginner', 'green';
OUTPUT
--------
8
(1 row)
Query the table. The result shows models that have two blue (BOOL_AND), one or two blue (BOOL_OR), and exactly one blue (BOOL_XOR) mixer.
=> SELECT model,
BOOL_AND(colors= 'blue')AS two_blue,
BOOL_OR(colors= 'blue')AS one_or_two_blue,
BOOL_XOR(colors= 'blue')AS specifically_not_more_than_one_blue
FROM mixers
GROUP BY model;
model | two_blue | one_or_two_blue | specifically_not_more_than_one_blue
--------------+----------+-----------------+-------------------------------------
advanced | f | t | t
beginner | f | f | f
intermediate | t | t | f
professional | f | t | t
(4 rows)
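The three columns of this result can be reproduced with a short Python sketch over the same rows. It models BOOL_AND as "all true", BOOL_OR as "at least one true", and BOOL_XOR as "only one true", following the function descriptions in this section; NULL handling is not modeled.

```python
from collections import defaultdict

# The rows inserted into mixers above: (model, colors).
mixers = [
    ("beginner", "green"), ("intermediate", "blue"), ("intermediate", "blue"),
    ("advanced", "green"), ("advanced", "blue"), ("professional", "blue"),
    ("professional", "green"), ("beginner", "green"),
]

# Evaluate colors = 'blue' for every row, grouped by model.
is_blue = defaultdict(list)
for model, color in mixers:
    is_blue[model].append(color == "blue")

results = {
    model: (all(flags), any(flags), sum(flags) == 1)  # AND, OR, XOR
    for model, flags in sorted(is_blue.items())
}

for model, (bool_and, bool_or, bool_xor) in results.items():
    print(model, bool_and, bool_or, bool_xor)
```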
1.15 - BOOL_OR [aggregate]
Processes Boolean values and returns a Boolean value result. If at least one input value is true, BOOL_OR returns t. Otherwise, it returns f.
Behavior type
Immutable
Syntax
BOOL_OR ( expression )
Parameters
expression
- A Boolean data type or any non-Boolean data type that can be implicitly coerced to a Boolean data type.
Examples
The following example shows how to use aggregate functions BOOL_AND, BOOL_OR, and BOOL_XOR. The sample table mixers includes columns for models and colors.
=> CREATE TABLE mixers(model VARCHAR(20), colors VARCHAR(20));
CREATE TABLE
Insert sample data into the table. The sample adds two color fields for each model.
=> INSERT INTO mixers
SELECT 'beginner', 'green'
UNION ALL
SELECT 'intermediate', 'blue'
UNION ALL
SELECT 'intermediate', 'blue'
UNION ALL
SELECT 'advanced', 'green'
UNION ALL
SELECT 'advanced', 'blue'
UNION ALL
SELECT 'professional', 'blue'
UNION ALL
SELECT 'professional', 'green'
UNION ALL
SELECT 'beginner', 'green';
OUTPUT
--------
8
(1 row)
Query the table. The result shows models that have two blue (BOOL_AND), one or two blue (BOOL_OR), and exactly one blue (BOOL_XOR) mixer.
=> SELECT model,
BOOL_AND(colors= 'blue')AS two_blue,
BOOL_OR(colors= 'blue')AS one_or_two_blue,
BOOL_XOR(colors= 'blue')AS specifically_not_more_than_one_blue
FROM mixers
GROUP BY model;
model | two_blue | one_or_two_blue | specifically_not_more_than_one_blue
--------------+----------+-----------------+-------------------------------------
advanced | f | t | t
beginner | f | f | f
intermediate | t | t | f
professional | f | t | t
(4 rows)
1.16 - BOOL_XOR [aggregate]
Processes Boolean values and returns a Boolean value result. If exactly one input value is true, BOOL_XOR returns t. Otherwise, it returns f.
Behavior type
Immutable
Syntax
BOOL_XOR ( expression )
Parameters
expression
- A Boolean data type or any non-Boolean data type that can be implicitly coerced to a Boolean data type.
Examples
The following example shows how to use aggregate functions BOOL_AND, BOOL_OR, and BOOL_XOR. The sample table mixers includes columns for models and colors.
=> CREATE TABLE mixers(model VARCHAR(20), colors VARCHAR(20));
CREATE TABLE
Insert sample data into the table. The sample adds two color fields for each model.
=> INSERT INTO mixers
SELECT 'beginner', 'green'
UNION ALL
SELECT 'intermediate', 'blue'
UNION ALL
SELECT 'intermediate', 'blue'
UNION ALL
SELECT 'advanced', 'green'
UNION ALL
SELECT 'advanced', 'blue'
UNION ALL
SELECT 'professional', 'blue'
UNION ALL
SELECT 'professional', 'green'
UNION ALL
SELECT 'beginner', 'green';
OUTPUT
--------
8
(1 row)
Query the table. The result shows models that have two blue (BOOL_AND), one or two blue (BOOL_OR), and exactly one blue (BOOL_XOR) mixer.
=> SELECT model,
BOOL_AND(colors= 'blue')AS two_blue,
BOOL_OR(colors= 'blue')AS one_or_two_blue,
BOOL_XOR(colors= 'blue')AS specifically_not_more_than_one_blue
FROM mixers
GROUP BY model;
model | two_blue | one_or_two_blue | specifically_not_more_than_one_blue
--------------+----------+-----------------+-------------------------------------
advanced | f | t | t
beginner | f | f | f
intermediate | t | t | f
professional | f | t | t
(4 rows)
1.17 - CORR
Returns the DOUBLE PRECISION coefficient of correlation of a set of expression pairs, as per the Pearson correlation coefficient. CORR eliminates expression pairs where either expression in the pair is NULL. If no rows remain, the function returns NULL.
Syntax
CORR ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT CORR (Annual_salary, Employee_age) FROM employee_dimension;
CORR
----------------------
-0.00719153413192422
(1 row)
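For readers who want to see the calculation, here is a minimal Python sketch of the Pearson coefficient with the NULL-pair elimination described above. The sample pairs are hypothetical, and returning None when one side has zero variance is an assumption made to avoid a division by zero, not documented Vertica behavior.

```python
from math import sqrt


def corr(pairs):
    """Pearson correlation over pairs, dropping any pair that contains None."""
    kept = [(x, y) for x, y in pairs if x is not None and y is not None]
    n = len(kept)
    if n == 0:
        return None  # no rows remain
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    sxx = sum((x - mean_x) ** 2 for x, _ in kept)
    syy = sum((y - mean_y) ** 2 for _, y in kept)
    if sxx == 0 or syy == 0:
        return None  # assumption: undefined when either side is constant
    return sxy / sqrt(sxx * syy)


print(corr([(1, 2), (2, 4), (3, 6), (None, 5)]))  # perfectly correlated pairs
print(corr([(1, 3), (2, 2), (3, 1)]))             # perfectly anti-correlated pairs
print(corr([(None, 1), (2, None)]))               # None
```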
1.18 - COUNT [aggregate]
Returns as a BIGINT the number of rows in each group where the expression is not NULL. If the query has no GROUP BY clause, COUNT returns the number of table rows.
The COUNT aggregate function differs from the COUNT analytic function, which returns the count over a group of rows within a window.
Behavior type
Immutable
Syntax
COUNT ( [ * ] [ ALL | DISTINCT ] expression )
Parameters
*
- Specifies to count all rows in the specified table or each group.
ALL | DISTINCT
- Specifies how to count rows where expression has a non-null value: ALL (default) counts every such row, and DISTINCT counts each distinct non-null value once.
expression
- The column or expression whose non-null values are counted.
Examples
The following query returns the number of distinct values in a column:
=> SELECT COUNT (DISTINCT date_key) FROM date_dimension;
COUNT
-------
1826
(1 row)
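The three counting modes differ only in how they treat NULLs and duplicates. The Python sketch below models them on a hypothetical column; the helper names are illustrative.

```python
def count_star(rows):
    """COUNT(*): every row counts, including rows where the value is NULL."""
    return len(rows)


def count(values, distinct=False):
    """COUNT(expression) or COUNT(DISTINCT expression): NULLs are skipped."""
    non_null = [v for v in values if v is not None]
    return len(set(non_null)) if distinct else len(non_null)


date_keys = [1, 1, 2, None, 3, 3]  # hypothetical column values

print(count_star(date_keys))            # 6
print(count(date_keys))                 # 5
print(count(date_keys, distinct=True))  # 3
```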
This example returns the number of distinct return values from an expression:
=> SELECT COUNT (DISTINCT date_key + product_key) FROM inventory_fact;
COUNT
-------
21560
(1 row)
The following query counts the non-null values of the same expression within each date_key group, and uses the LIMIT keyword to restrict the number of rows returned:
=> SELECT COUNT(date_key + product_key) FROM inventory_fact GROUP BY date_key LIMIT 10;
COUNT
-------
173
31
321
113
286
84
244
238
145
202
(10 rows)
The following query uses GROUP BY to count distinct values within groups:
=> SELECT product_key, COUNT (DISTINCT date_key) FROM inventory_fact
GROUP BY product_key LIMIT 10;
product_key | count
-------------+-------
1 | 12
2 | 18
3 | 13
4 | 17
5 | 11
6 | 14
7 | 13
8 | 17
9 | 15
10 | 12
(10 rows)
The following query returns the number of distinct products and the total inventory within each date key:
=> SELECT date_key, COUNT (DISTINCT product_key), SUM(qty_in_stock) FROM inventory_fact
GROUP BY date_key LIMIT 10;
date_key | count | sum
----------+-------+--------
1 | 173 | 88953
2 | 31 | 16315
3 | 318 | 156003
4 | 113 | 53341
5 | 285 | 148380
6 | 84 | 42421
7 | 241 | 119315
8 | 238 | 122380
9 | 142 | 70151
10 | 202 | 95274
(10 rows)
This query selects each distinct product_key value and then counts the number of distinct date_key values for all records with the specific product_key value. It also counts the number of distinct warehouse_key values in all records with the specific product_key value:
=> SELECT product_key, COUNT (DISTINCT date_key), COUNT (DISTINCT warehouse_key) FROM inventory_fact
GROUP BY product_key LIMIT 15;
product_key | count | count
-------------+-------+-------
1 | 12 | 12
2 | 18 | 18
3 | 13 | 12
4 | 17 | 18
5 | 11 | 9
6 | 14 | 13
7 | 13 | 13
8 | 17 | 15
9 | 15 | 14
10 | 12 | 12
11 | 11 | 11
12 | 13 | 12
13 | 9 | 7
14 | 13 | 13
15 | 18 | 17
(15 rows)
This query selects each distinct product_key value, counts the number of distinct date_key and warehouse_key values for all records with the specific product_key value, and then sums all qty_in_stock values in records with the specific product_key value. It then returns the number of product_version values in records with the specific product_key value:
=> SELECT product_key, COUNT (DISTINCT date_key),
COUNT (DISTINCT warehouse_key),
SUM (qty_in_stock),
COUNT (product_version)
FROM inventory_fact GROUP BY product_key LIMIT 15;
product_key | count | count | sum | count
-------------+-------+-------+-------+-------
1 | 12 | 12 | 5530 | 12
2 | 18 | 18 | 9605 | 18
3 | 13 | 12 | 8404 | 13
4 | 17 | 18 | 10006 | 18
5 | 11 | 9 | 4794 | 11
6 | 14 | 13 | 7359 | 14
7 | 13 | 13 | 7828 | 13
8 | 17 | 15 | 9074 | 17
9 | 15 | 14 | 7032 | 15
10 | 12 | 12 | 5359 | 12
11 | 11 | 11 | 6049 | 11
12 | 13 | 12 | 6075 | 13
13 | 9 | 7 | 3470 | 9
14 | 13 | 13 | 5125 | 13
15 | 18 | 17 | 9277 | 18
(15 rows)
1.19 - COVAR_POP
Returns the population covariance for a set of expression pairs. The return value is of type DOUBLE PRECISION. COVAR_POP eliminates expression pairs where either expression in the pair is NULL. If no rows remain, the function returns NULL.
Syntax
SELECT COVAR_POP ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT COVAR_POP (Annual_salary, Employee_age)
FROM employee_dimension;
COVAR_POP
-------------------
-9032.34810730019
(1 row)
1.20 - COVAR_SAMP
Returns the sample covariance for a set of expression pairs. The return value is of type DOUBLE PRECISION. COVAR_SAMP eliminates expression pairs where either expression in the pair is NULL. If no rows remain, the function returns NULL.
Syntax
SELECT COVAR_SAMP ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT COVAR_SAMP (Annual_salary, Employee_age)
FROM employee_dimension;
COVAR_SAMP
-------------------
-9033.25143244343
(1 row)
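Population and sample covariance differ only in the divisor: n for the population form and n-1 for the sample form. The Python sketch below shows both on hypothetical pairs, with NULL pairs eliminated as described above. Returning None for a single remaining pair in the sample form is an assumption made to avoid dividing by zero.

```python
def _sum_of_products(pairs):
    """Drop pairs containing None; return (n, sum of centered products)."""
    kept = [(x, y) for x, y in pairs if x is not None and y is not None]
    n = len(kept)
    if n == 0:
        return 0, 0.0
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    return n, sum((x - mean_x) * (y - mean_y) for x, y in kept)


def covar_pop(pairs):
    n, sxy = _sum_of_products(pairs)
    return None if n == 0 else sxy / n


def covar_samp(pairs):
    n, sxy = _sum_of_products(pairs)
    # Assumption: with fewer than two pairs the sample form is undefined.
    return None if n < 2 else sxy / (n - 1)


pairs = [(1, 2), (2, 4), (3, 6), (4, 8), (None, 1)]
print(covar_pop(pairs))   # 2.5
print(covar_samp(pairs))  # 10/3, about 3.33
```

The two sample outputs in this section differ by a factor of about 1.0001, which is the ratio n/(n-1) for roughly 10,000 non-null pairs.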
1.21 - GROUP_ID
Uniquely identifies duplicate sets for GROUP BY queries that return duplicate grouping sets. This function returns one or more integers, starting with zero (0), as identifiers.
For the number of duplicates n for a particular grouping, GROUP_ID returns a range of sequential numbers, 0 to n-1. For the first occurrence of each unique group it encounters, GROUP_ID returns the value 0. If GROUP_ID finds the same grouping again, the function returns 1, then returns 2 for the next found grouping, and so on.
Behavior type
Immutable
Syntax
GROUP_ID ()
Examples
This example shows how GROUP_ID creates unique identifiers when a query produces duplicate groupings. For an expenses table, the following query groups the results by category of expense and year and rolls up the sum for those two columns. The results have duplicate groupings for category and NULL. The first grouping has a GROUP_ID of 0, and the second grouping has a GROUP_ID of 1.
=> SELECT Category, Year, SUM(Amount), GROUPING_ID(Category, Year),
GROUP_ID() FROM expenses GROUP BY Category, ROLLUP(Category,Year)
ORDER BY Category, Year, GROUPING_ID();
Category | Year | SUM | GROUPING_ID | GROUP_ID
-------------+------+--------+-------------+----------
Books | 2005 | 39.98 | 0 | 0
Books | 2007 | 29.99 | 0 | 0
Books | 2008 | 29.99 | 0 | 0
Books | | 99.96 | 1 | 0
Books | | 99.96 | 1 | 1
Electricity | 2005 | 109.99 | 0 | 0
Electricity | 2006 | 109.99 | 0 | 0
Electricity | 2007 | 229.98 | 0 | 0
Electricity | | 449.96 | 1 | 1
Electricity | | 449.96 | 1 | 0
1.22 - GROUPING
Disambiguates the use of NULL values when GROUP BY queries with multilevel aggregates generate NULL values to identify subtotals in grouping columns. NULL values from the original data can also occur in rows. GROUPING returns one of the following:
- 1 if the value of expression is NULL, representing an aggregated value
- 0 for any other value, including NULL values in rows
Behavior type
Immutable
Syntax
GROUPING ( expression )
Parameters
expression
- An expression in the GROUP BY clause
Examples
The following query uses the GROUPING function, taking one of the GROUP BY expressions as an argument. For each row, GROUPING returns 1 if the row aggregates over that expression and 0 otherwise.
The 1 in the GROUPING(Year) column for electricity and books indicates that these values are subtotals. In the final row, the values of both GROUPING(Category) and GROUPING(Year) are 1. This indicates that neither column contributed to the GROUP BY, so the final row represents the total sales.
=> SELECT Category, Year, SUM(Amount),
GROUPING(Category), GROUPING(Year) FROM expenses
GROUP BY ROLLUP(Category, Year) ORDER BY Category, Year, GROUPING_ID();
Category | Year | SUM | GROUPING | GROUPING
-------------+------+--------+----------+----------
Books | 2005 | 39.98 | 0 | 0
Books | 2007 | 29.99 | 0 | 0
Books | 2008 | 29.99 | 0 | 0
Books | | 99.96 | 0 | 1
Electricity | 2005 | 109.99 | 0 | 0
Electricity | 2006 | 109.99 | 0 | 0
Electricity | 2007 | 229.98 | 0 | 0
Electricity | | 449.96 | 0 | 1
| | 549.92 | 1 | 1
1.23 - GROUPING_ID
Concatenates the set of Boolean values generated by the GROUPING function into a bit vector. GROUPING_ID treats the bit vector as a binary number and returns it as a base-10 value that identifies the grouping set combination.
By using GROUPING_ID you avoid the need for multiple, individual GROUPING functions. GROUPING_ID simplifies row-filtering conditions, because rows of interest are identified using a single return from GROUPING_ID = n. Use GROUPING_ID to identify grouping combinations.
Behavior type
Immutable
Syntax
GROUPING_ID ( [expression[,...] ] )
expression
- An expression that matches one of the expressions in the GROUP BY clause.
If the GROUP BY clause includes a list of expressions, GROUPING_ID returns a number corresponding to the GROUPING bit vector associated with a row.
Examples
This example shows how calling GROUPING_ID without an expression returns the GROUPING bit vector associated with a full set of multilevel aggregate expressions. The GROUPING_ID value is comparable to GROUPING_ID(a,b) because GROUPING_ID() includes all columns in the GROUP BY ROLLUP:
=> SELECT a,b,COUNT(*), GROUPING_ID() FROM T GROUP BY ROLLUP(a,b);
In the following query, the GROUPING(Category) and GROUPING(Year) columns have three combinations:
=> SELECT Category, Year, SUM(Amount),
GROUPING(Category), GROUPING(Year) FROM expenses
GROUP BY ROLLUP(Category, Year) ORDER BY Category, Year, GROUPING_ID();
Category | Year | SUM | GROUPING | GROUPING
-------------+------+--------+----------+----------
Books | 2005 | 39.98 | 0 | 0
Books | 2007 | 29.99 | 0 | 0
Books | 2008 | 29.99 | 0 | 0
Books | | 99.96 | 0 | 1
Electricity | 2005 | 109.99 | 0 | 0
Electricity | 2006 | 109.99 | 0 | 0
Electricity | 2007 | 229.98 | 0 | 0
Electricity | | 449.96 | 0 | 1
| | 549.92 | 1 | 1
GROUPING_ID converts these values as follows:
Binary Set Values | Decimal Equivalents
00 | 0
01 | 1
11 | 3
The following query returns the single number for each GROUP BY level that appears in the gr_id column:
=> SELECT Category, Year, SUM(Amount),
GROUPING(Category),GROUPING(Year),GROUPING_ID(Category,Year) AS gr_id
FROM expenses GROUP BY ROLLUP(Category, Year);
Category | Year | SUM | GROUPING | GROUPING | gr_id
-------------+------+--------+----------+----------+-------
Books | 2008 | 29.99 | 0 | 0 | 0
Books | 2005 | 39.98 | 0 | 0 | 0
Electricity | 2007 | 229.98 | 0 | 0 | 0
Books | 2007 | 29.99 | 0 | 0 | 0
Electricity | 2005 | 109.99 | 0 | 0 | 0
Electricity | | 449.96 | 0 | 1 | 1
| | 549.92 | 1 | 1 | 3
Electricity | 2006 | 109.99 | 0 | 0 | 0
Books | | 99.96 | 0 | 1 | 1
The gr_id value determines the GROUP BY level for each row:
GROUP BY Level | GROUP BY Row Level
3 | Total sum
1 | Category
0 | Category, year
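The conversion from GROUPING bits to a single GROUPING_ID value can be sketched in Python. This is only an illustration of the binary-to-decimal arithmetic described above, not Vertica code; the function name grouping_id and the LEVELS mapping are hypothetical.

```python
def grouping_id(grouping_bits):
    """Read the GROUPING results, left to right, as one binary number."""
    value = 0
    for bit in grouping_bits:
        value = (value << 1) | bit  # shift left, then append the next bit
    return value


# Hypothetical labels that mirror the GROUP BY levels listed above.
LEVELS = {3: "Total sum", 1: "Category", 0: "Category, year"}

# (GROUPING(Category), GROUPING(Year)) combinations produced by the ROLLUP.
for bits in [(0, 0), (0, 1), (1, 1)]:
    gid = grouping_id(bits)
    print(bits, gid, LEVELS[gid])
```

The combination (1, 0) would map to 2, but ROLLUP(Category, Year) never produces it, because Year is always rolled up before Category.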
You can also use the DECODE function to give the values more meaning by comparing each search value individually:
=> SELECT Category, Year, SUM(AMOUNT), DECODE(GROUPING_ID(Category, Year),
3, 'Total',
1, 'Category',
0, 'Category,Year')
AS GROUP_NAME FROM expenses GROUP BY ROLLUP(Category, Year);
Category | Year | SUM | GROUP_NAME
-------------+------+--------+---------------
Electricity | 2006 | 109.99 | Category,Year
Books | | 99.96 | Category
Electricity | 2007 | 229.98 | Category,Year
Books | 2007 | 29.99 | Category,Year
Electricity | 2005 | 109.99 | Category,Year
Electricity | | 449.96 | Category
| | 549.92 | Total
Books | 2005 | 39.98 | Category,Year
Books | 2008 | 29.99 | Category,Year
1.24 - LISTAGG
Transforms non-null values from a group of rows into a list of values that are delimited by commas (default) or a configurable separator. LISTAGG can be used to denormalize rows into a string of concatenated values.
Behavior type
Immutable if the WITHIN GROUP ORDER BY clause specifies a column or set of columns that resolves to unique values within the aggregated list; otherwise Volatile.
Syntax
LISTAGG ( aggregate-expression [ USING PARAMETERS parameter=value[,...] ] ) [ within-group-order-by-clause ]
Arguments
aggregate-expression
- Aggregation of one or more columns or column expressions to select from the source table or view.
LISTAGG does not support spatial data types directly. In order to pass column data of this type, convert the data to strings with the geospatial function ST_AsText.
Caution
Converted spatial data frequently contains commas. LISTAGG uses comma as the default separator character. To avoid ambiguous output, override this default by setting the function's separator parameter to another character.
within-group-order-by-clause
- Sorts aggregated values within each group of rows, where column-expression is typically a column in aggregate-expression:
WITHIN GROUP (ORDER BY { column-expression[ sort-qualifiers ] }[,...])
sort-qualifiers:
{ ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] }
Tip
WITHIN GROUP ORDER BY can consume a large amount of memory per group. Including wide strings in the aggregate expression can also adversely affect performance. To minimize memory consumption, create projections that support GROUPBY PIPELINED.
Parameters
Parameter name | Set to...
max_length | An integer or integer expression that specifies in bytes the maximum length of the result, up to 32M. Default: 1024
separator | Separator string of length 0 to 80, inclusive. A length of 0 concatenates the output with no separators. Default: comma (,)
on_overflow | Specifies behavior when the result overflows the max_length setting, one of the following strings:
- ERROR (default): Return an error when overflow occurs.
- TRUNCATE: Remove any characters that exceed max_length setting from the query result, and return the truncated string.
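The interaction of the three parameters can be sketched in Python. This is a rough model under stated assumptions, not the LISTAGG implementation: the function name listagg is hypothetical, UTF-8 encoding is assumed for the byte count, and dropping a partially cut trailing character on truncation is an assumption.

```python
def listagg(values, separator=",", max_length=1024, on_overflow="ERROR"):
    """Join non-null values; max_length is measured in bytes (UTF-8 assumed)."""
    joined = separator.join(str(v) for v in values if v is not None)
    encoded = joined.encode("utf-8")
    if len(encoded) <= max_length:
        return joined
    if on_overflow == "ERROR":
        raise OverflowError("result exceeds max_length")
    if on_overflow == "TRUNCATE":
        # Cut at the byte limit; errors="ignore" drops a split trailing character.
        return encoded[:max_length].decode("utf-8", errors="ignore")
    raise ValueError("on_overflow must be 'ERROR' or 'TRUNCATE'")


# None (NULL) is skipped, and a custom separator avoids clashing with embedded commas.
print(listagg(["Boston, MA", None, "Lowell, MA"], separator=" | "))
print(listagg(["aaa", "bbb"], max_length=5, on_overflow="TRUNCATE"))
```

Choosing a separator that cannot occur in the data, as in the first call, is the same precaution that the Caution above recommends for converted spatial data.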
Privileges
None
Examples
In the following query, the aggregated results in the CityAndState column use the string " | " as a separator. The outer GROUP BY clause groups the output rows according to their Region values. Within each group, the aggregated list items are sorted according to their city values, as per the WITHIN GROUP ORDER BY clause:
=> \x
Expanded display is on.
=> WITH cd AS (SELECT DISTINCT (customer_city) city, customer_state, customer_region FROM customer_dimension)
SELECT customer_region Region, LISTAGG(city||', '||customer_state USING PARAMETERS separator=' | ')
WITHIN GROUP (ORDER BY city) CityAndState FROM cd GROUP BY region ORDER BY region;
-[ RECORD 1 ]+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Region | East
CityAndState | Alexandria, VA | Allentown, PA | Baltimore, MD | Boston, MA | Cambridge, MA | Charlotte, NC | Clarksville, TN | Columbia, SC | Elizabeth, NJ | Erie, PA | Fayetteville, NC | Hartford, CT | Lowell, MA | Manchester, NH | Memphis, TN | Nashville, TN | New Haven, CT | New York, NY | Philadelphia, PA | Portsmouth, VA | Stamford, CT | Sterling Heights, MI | Washington, DC | Waterbury, CT
-[ RECORD 2 ]+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Region | MidWest
CityAndState | Ann Arbor, MI | Cedar Rapids, IA | Chicago, IL | Columbus, OH | Detroit, MI | Evansville, IN | Flint, MI | Gary, IN | Green Bay, WI | Indianapolis, IN | Joliet, IL | Lansing, MI | Livonia, MI | Milwaukee, WI | Naperville, IL | Peoria, IL | Sioux Falls, SD | South Bend, IN | Springfield, IL
-[ RECORD 3 ]+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Region | NorthWest
CityAndState | Bellevue, WA | Portland, OR | Seattle, WA
-[ RECORD 4 ]+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Region | South
CityAndState | Abilene, TX | Athens, GA | Austin, TX | Beaumont, TX | Cape Coral, FL | Carrollton, TX | Clearwater, FL | Coral Springs, FL | Dallas, TX | El Paso, TX | Fort Worth, TX | Grand Prairie, TX | Houston, TX | Independence, MS | Jacksonville, FL | Lafayette, LA | McAllen, TX | Mesquite, TX | San Antonio, TX | Savannah, GA | Waco, TX | Wichita Falls, TX
-[ RECORD 5 ]+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Region | SouthWest
CityAndState | Arvada, CO | Denver, CO | Fort Collins, CO | Gilbert, AZ | Las Vegas, NV | North Las Vegas, NV | Peoria, AZ | Phoenix, AZ | Pueblo, CO | Topeka, KS | Westminster, CO
-[ RECORD 6 ]+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Region | West
CityAndState | Berkeley, CA | Burbank, CA | Concord, CA | Corona, CA | Costa Mesa, CA | Daly City, CA | Downey, CA | El Monte, CA | Escondido, CA | Fontana, CA | Fullerton, CA | Inglewood, CA | Lancaster, CA | Los Angeles, CA | Norwalk, CA | Orange, CA | Palmdale, CA | Pasadena, CA | Provo, UT | Rancho Cucamonga, CA | San Diego, CA | San Francisco, CA | San Jose, CA | Santa Clara, CA | Simi Valley, CA | Sunnyvale, CA | Thousand Oaks, CA | Vallejo, CA | Ventura, CA | West Covina, CA | West Valley City, UT
1.25 - MAX [aggregate]
Returns the greatest value of an expression over a group of rows. The return value has the same type as the expression data type.
The MAX analytic function differs from the aggregate function, in that it returns the maximum value of an expression over a group of rows within a window.
Aggregate functions MIN and MAX can operate with Boolean values. MAX can act upon a Boolean data type or a value that can be implicitly converted to a Boolean. If at least one input value is true, MAX returns t (true). Otherwise, it returns f (false). In the same scenario, MIN returns t (true) if all input values are true. Otherwise it returns f (false).
Behavior type
Immutable
Syntax
MAX ( expression )
Parameters
expression
- Any expression for which the maximum value is calculated, typically a column reference.
Examples
The following query returns the largest value in column sales_dollar_amount.
=> SELECT MAX(sales_dollar_amount) AS highest_sale FROM store.store_sales_fact;
highest_sale
--------------
600
(1 row)
The following example shows you the difference between the MIN and MAX aggregate functions when you use them with a Boolean value. The sample creates a table, adds two rows of data, and shows sample output for MIN and MAX.
=> CREATE TABLE min_max_functions (torf BOOL);
=> INSERT INTO min_max_functions VALUES (1);
=> INSERT INTO min_max_functions VALUES (0);
=> SELECT * FROM min_max_functions;
torf
------
t
f
(2 rows)
=> SELECT min(torf) FROM min_max_functions;
min
-----
f
(1 row)
=> SELECT max(torf) FROM min_max_functions;
max
-----
t
(1 row)
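The Boolean behavior of MIN and MAX corresponds to a logical AND and a logical OR over the input values. The following Python sketch shows that correspondence; the names bool_min and bool_max are hypothetical, and skipping NULL (None) inputs is an assumption that this page does not confirm.

```python
def bool_max(values):
    """True if at least one non-null input value is true."""
    present = [v for v in values if v is not None]
    return any(present) if present else None


def bool_min(values):
    """True only if all non-null input values are true."""
    present = [v for v in values if v is not None]
    return all(present) if present else None


torf = [True, False]  # same rows as the min_max_functions table
print(bool_min(torf), bool_max(torf))
```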
See also
Data aggregation
1.26 - MIN [aggregate]
Returns the smallest value of an expression over a group of rows. The return value has the same type as the expression data type.
The MIN analytic function differs from the aggregate function, in that it returns the minimum value of an expression over a group of rows within a window.
Aggregate functions MIN and MAX can operate with Boolean values. MAX can act upon a Boolean data type or a value that can be implicitly converted to a Boolean. If at least one input value is true, MAX returns t (true). Otherwise, it returns f (false). In the same scenario, MIN returns t (true) if all input values are true. Otherwise it returns f (false).
Behavior type
Immutable
Syntax
MIN ( expression )
Parameters
expression
- Any expression for which the minimum value is calculated, typically a column reference.
Examples
The following query returns the lowest salary from the employee dimension table.
=> SELECT MIN(annual_salary) AS lowest_paid FROM employee_dimension;
lowest_paid
-------------
1200
(1 row)
The following example shows you the difference between the MIN and MAX aggregate functions when you use them with a Boolean value. The sample creates a table, adds two rows of data, and shows sample output for MIN and MAX.
=> CREATE TABLE min_max_functions (torf BOOL);
=> INSERT INTO min_max_functions VALUES (1);
=> INSERT INTO min_max_functions VALUES (0);
=> SELECT * FROM min_max_functions;
torf
------
t
f
(2 rows)
=> SELECT min(torf) FROM min_max_functions;
min
-----
f
(1 row)
=> SELECT max(torf) FROM min_max_functions;
max
-----
t
(1 row)
See also
Data aggregation
1.27 - REGR_AVGX
Returns the DOUBLE PRECISION average of the independent expression in an expression pair. REGR_AVGX eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_AVGX returns NULL.
Syntax
SELECT REGR_AVGX ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_AVGX (Annual_salary, Employee_age)
FROM employee_dimension;
REGR_AVGX
-----------
39.321
(1 row)
1.28 - REGR_AVGY
Returns the DOUBLE PRECISION average of the dependent expression in an expression pair. The function eliminates expression pairs where either expression in the pair is NULL. If no rows remain, the function returns NULL.
Syntax
REGR_AVGY ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_AVGY (Annual_salary, Employee_age)
FROM employee_dimension;
REGR_AVGY
------------
58354.4913
(1 row)
1.29 - REGR_COUNT
Returns the count of all rows in an expression pair. The function eliminates expression pairs where either expression in the pair is NULL. If no rows remain, the function returns 0.
Syntax
SELECT REGR_COUNT ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_COUNT (Annual_salary, Employee_age) FROM employee_dimension;
REGR_COUNT
------------
10000
(1 row)
1.30 - REGR_INTERCEPT
Returns the y-intercept of the regression line determined by a set of expression pairs. The return value is of type DOUBLE PRECISION. REGR_INTERCEPT eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_INTERCEPT returns NULL.
Syntax
SELECT REGR_INTERCEPT ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_INTERCEPT (Annual_salary, Employee_age) FROM employee_dimension;
REGR_INTERCEPT
------------------
59929.5490163437
(1 row)
1.31 - REGR_R2
Returns the square of the correlation coefficient of a set of expression pairs. The return value is of type DOUBLE PRECISION. REGR_R2 eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_R2 returns NULL.
Syntax
SELECT REGR_R2 ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_R2 (Annual_salary, Employee_age) FROM employee_dimension;
REGR_R2
----------------------
5.17181631706311e-05
(1 row)
1.32 - REGR_SLOPE
Returns the slope of the regression line, determined by a set of expression pairs. The return value is of type DOUBLE PRECISION. REGR_SLOPE eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_SLOPE returns NULL.
Syntax
SELECT REGR_SLOPE ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_SLOPE (Annual_salary, Employee_age) FROM employee_dimension;
REGR_SLOPE
------------------
-40.056400303749
(1 row)
1.33 - REGR_SXX
Returns the sum of squares of the difference between the independent expression (expression2) and its average.
That is, REGR_SXX returns: ∑[(expression2 - average(expression2))(expression2 - average(expression2))]
The return value is of type DOUBLE PRECISION. REGR_SXX eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_SXX returns NULL.
Syntax
SELECT REGR_SXX ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_SXX (Annual_salary, Employee_age) FROM employee_dimension;
REGR_SXX
------------
2254907.59
(1 row)
1.34 - REGR_SXY
Returns the sum of products of the difference between the dependent expression (expression1) and its average and the difference between the independent expression (expression2) and its average.
That is, REGR_SXY returns: ∑[(expression1 - average(expression1))(expression2 - average(expression2))]
The return value is of type DOUBLE PRECISION. REGR_SXY eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_SXY returns NULL.
Syntax
SELECT REGR_SXY ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_SXY (Annual_salary, Employee_age) FROM employee_dimension;
REGR_SXY
-------------------
-90323481.0730019
(1 row)
1.35 - REGR_SYY
Returns the sum of squares of the difference between the dependent expression (expression1) and its average.
That is, REGR_SYY returns: ∑[(expression1 - average(expression1))(expression1 - average(expression1))]
The return value is of type DOUBLE PRECISION. REGR_SYY eliminates expression pairs where either expression in the pair is NULL. If no rows remain, REGR_SYY returns NULL.
Syntax
SELECT REGR_SYY ( expression1, expression2 )
Parameters
expression1
- The dependent DOUBLE PRECISION expression
expression2
- The independent DOUBLE PRECISION expression
Examples
=> SELECT REGR_SYY (Annual_salary, Employee_age) FROM employee_dimension;
REGR_SYY
------------------
69956728794707.2
(1 row)
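The REGR_* quantities are related through the standard least-squares formulas. The following Python sketch computes them together for a small, made-up data set. It illustrates the textbook definitions (slope = SXY / SXX, intercept = AVGY - slope * AVGX, R2 = SXY^2 / (SXX * SYY)) and is not a description of Vertica internals; the function name regr_stats and the data are hypothetical.

```python
def regr_stats(pairs):
    """pairs holds (expression1, expression2) = (dependent y, independent x)."""
    # Eliminate pairs where either value is None (SQL NULL).
    rows = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(rows)
    if n == 0:
        return {"count": 0}  # every other statistic would be NULL
    avgy = sum(y for y, _ in rows) / n
    avgx = sum(x for _, x in rows) / n
    sxx = sum((x - avgx) ** 2 for _, x in rows)
    syy = sum((y - avgy) ** 2 for y, _ in rows)
    sxy = sum((y - avgy) * (x - avgx) for y, x in rows)
    stats = {"count": n, "avgx": avgx, "avgy": avgy,
             "sxx": sxx, "sxy": sxy, "syy": syy}
    if sxx != 0:  # the slope is undefined when x never varies
        stats["slope"] = sxy / sxx
        stats["intercept"] = avgy - stats["slope"] * avgx
        if syy != 0:
            stats["r2"] = sxy ** 2 / (sxx * syy)
    return stats


# y = 2x + 1 for the complete pairs; the last two pairs are eliminated.
sample = [(3.0, 1.0), (5.0, 2.0), (7.0, 3.0), (None, 4.0), (9.0, None)]
print(regr_stats(sample))
```

The example outputs above appear consistent with these relationships: dividing the REGR_SXY result by the REGR_SXX result gives approximately the REGR_SLOPE result of -40.0564.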
1.36 - STDDEV [aggregate]
Evaluates the statistical sample standard deviation for each member of the group. The return value is the same as the square root of VAR_SAMP:
STDDEV(expression) = SQRT(VAR_SAMP(expression))
Behavior type
Immutable
Syntax
STDDEV ( expression )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. STDDEV returns the same data type as expression.
- Nonstandard function STDDEV is provided for compatibility with other databases. It is semantically identical to STDDEV_SAMP.
- This aggregate function differs from analytic function STDDEV, which computes the statistical sample standard deviation of the current row with respect to the group of rows within a window.
- When VAR_SAMP returns NULL, STDDEV returns NULL.
Examples
The following example returns the statistical sample standard deviation for each household ID from the customer_dimension table of the VMart example database:
=> SELECT STDDEV(household_id) FROM customer_dimension;
STDDEV
-----------------
8651.5084240071
1.37 - STDDEV_POP [aggregate]
Evaluates the statistical population standard deviation for each member of the group.
Behavior type
Immutable
Syntax
STDDEV_POP ( expression )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. STDDEV_POP returns the same data type as expression.
- This function differs from the analytic function STDDEV_POP, which evaluates the statistical population standard deviation for each member of the group of rows within a window.
- STDDEV_POP returns the same value as the square root of VAR_POP:
STDDEV_POP(expression) = SQRT(VAR_POP(expression))
- When VAR_POP returns NULL, this function returns NULL.
Examples
The following example returns the statistical population standard deviation for each household ID in the customer_dimension table.
=> SELECT STDDEV_POP(household_id) FROM customer_dimension;
STDDEV_POP
------------------
8651.41895973367
(1 row)
1.38 - STDDEV_SAMP [aggregate]
Evaluates the statistical sample standard deviation for each member of the group. The return value is the same as the square root of VAR_SAMP:
STDDEV_SAMP(expression) = SQRT(VAR_SAMP(expression))
Behavior type
Immutable
Syntax
STDDEV_SAMP ( expression )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. STDDEV_SAMP returns the same data type as expression.
- STDDEV_SAMP is semantically identical to nonstandard function STDDEV, which is provided for compatibility with other databases.
- This aggregate function differs from analytic function STDDEV_SAMP, which computes the statistical sample standard deviation of the current row with respect to the group of rows within a window.
- When VAR_SAMP returns NULL, STDDEV_SAMP returns NULL.
Examples
The following example returns the statistical sample standard deviation for each household ID from the customer_dimension table.
=> SELECT STDDEV_SAMP(household_id) FROM customer_dimension;
stddev_samp
------------------
8651.50842400771
(1 row)
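The relationship between the standard deviation functions and the variance functions can be sketched in Python. The sketch assumes the usual definitions: sample variance divides by n - 1 and population variance divides by n. The function names mirror the SQL functions but are hypothetical Python helpers, and the handling of empty and single-value inputs is an assumption rather than documented Vertica behavior.

```python
import math


def _non_null(values):
    return [v for v in values if v is not None]


def var_samp(values):
    rows = _non_null(values)
    if len(rows) < 2:
        return None  # assumption: undefined for fewer than two values
    mean = sum(rows) / len(rows)
    return sum((v - mean) ** 2 for v in rows) / (len(rows) - 1)


def var_pop(values):
    rows = _non_null(values)
    if not rows:
        return None  # assumption: undefined for an empty group
    mean = sum(rows) / len(rows)
    return sum((v - mean) ** 2 for v in rows) / len(rows)


def stddev_samp(values):
    """STDDEV_SAMP(expression) = SQRT(VAR_SAMP(expression))."""
    variance = var_samp(values)
    return None if variance is None else math.sqrt(variance)


def stddev_pop(values):
    """STDDEV_POP(expression) = SQRT(VAR_POP(expression))."""
    variance = var_pop(values)
    return None if variance is None else math.sqrt(variance)


household = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
print(stddev_samp(household), stddev_pop(household))
```

For large groups the two results are close, which matches the nearly identical STDDEV_SAMP and STDDEV_POP outputs shown for customer_dimension.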
1.39 - SUM [aggregate]
Computes the sum of an expression over a group of rows. SUM returns a DOUBLE PRECISION value for a floating-point expression. Otherwise, the return value is the same as the expression data type.
The SUM aggregate function differs from the SUM analytic function, which computes the sum of an expression over a group of rows within a window.
Behavior type
Immutable
Syntax
SUM ( [ ALL | DISTINCT ] expression )
Parameters
ALL
- Invokes the aggregate function for all rows in the group (default)
DISTINCT
- Invokes the aggregate function for all distinct non-null values of the expression found in the group
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
Overflow handling
If you encounter data overflow when using SUM(), use SUM_FLOAT, which converts the data to a floating point.
By default, Vertica allows silent numeric overflow when you call this function on numeric data types. For more information on this behavior and how to change it, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.
Examples
The following query returns the total sum of the product_cost column.
=> SELECT SUM(product_cost) AS cost FROM product_dimension;
cost
---------
9042850
(1 row)
1.40 - SUM_FLOAT [aggregate]
Computes the sum of an expression over a group of rows and returns a DOUBLE PRECISION value.
Behavior type
Immutable
Syntax
SUM_FLOAT ( [ ALL | DISTINCT ] expression )
Parameters
ALL
- Invokes the aggregate function for all rows in the group (default).
DISTINCT
- Invokes the aggregate function for all distinct non-null values of the expression found in the group.
expression
- Any expression whose result is type DOUBLE PRECISION.
Overflow handling
By default, Vertica allows silent numeric overflow when you call this function on numeric data types. For more information on this behavior and how to change it, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.
Examples
The following query returns the floating-point sum of the average price from the product table:
=> SELECT SUM_FLOAT(average_competitor_price) AS cost FROM product_dimension;
cost
----------
18181102
(1 row)
1.41 - TS_FIRST_VALUE
Processes the data that belongs to each time slice. A time series aggregate (TSA) function, TS_FIRST_VALUE returns the value at the start of the time slice. If the time slice is missing, an interpolation scheme is applied: the value is determined by the values corresponding to the previous (and next) time slices, based on the interpolation scheme of const (linear).
TS_FIRST_VALUE returns one output row per time slice, or one output row per partition per time slice if partition expressions are specified.
Behavior type
Immutable
Syntax
TS_FIRST_VALUE ( expression [ IGNORE NULLS ] [, { 'CONST' | 'LINEAR' } ] )
Parameters
expression
- An INTEGER or FLOAT expression on which to aggregate and interpolate.
IGNORE NULLS
- The IGNORE NULLS behavior changes depending on a CONST or LINEAR interpolation scheme. See When Time Series Data Contains Nulls in Analyzing Data for details.
'CONST' | 'LINEAR'
- Specifies the interpolation value as constant or linear.
Requirements
You must use an ORDER BY clause with a TIMESTAMP column.
Multiple time series aggregate functions
The same query can call multiple time series aggregate functions. They share the same gap-filling policy as defined by the TIMESERIES clause; however, each time series aggregate function can specify its own interpolation policy. For example:
=> SELECT slice_time, symbol,
TS_FIRST_VALUE(bid, 'const') fv_c,
TS_FIRST_VALUE(bid, 'linear') fv_l,
TS_LAST_VALUE(bid, 'const') lv_c
FROM TickStore
TIMESERIES slice_time AS '3 seconds'
OVER(PARTITION BY symbol ORDER BY ts);
Examples
See Gap Filling and Interpolation in Analyzing Data.
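As a rough illustration of how the two interpolation schemes differ, the following Python sketch (plain Python, not Vertica code) estimates a value at a slice boundary from the surrounding input records. The function name `value_at` and the sample records are hypothetical. The model is a simplification: it assumes `const` carries forward the most recent record at or before the boundary, and `linear` interpolates between the previous and next records, returning `None` when either neighbor is missing.

```python
def value_at(t, points, scheme="const"):
    """Estimate the series value at time t (simplified model, not Vertica's code).

    points: list of (time, value) pairs sorted by time.
    """
    prev = None  # last record at or before t
    nxt = None   # first record after t
    for time, value in points:
        if time <= t:
            prev = (time, value)
        elif nxt is None:
            nxt = (time, value)
    if prev is None:
        return None          # nothing precedes t in this sketch
    if scheme == "const" or prev[0] == t:
        return prev[1]       # carry the previous value forward
    if nxt is None:
        return None          # assumption: linear needs both neighbors
    (t0, v0), (t1, v1) = prev, nxt
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical bids: 10.0 at second 0 and 10.5 at second 5.
ticks = [(0, 10.0), (5, 10.5)]
print(value_at(3, ticks, "const"))   # previous value carried forward
print(value_at(3, ticks, "linear"))  # three fifths of the way from 10.0 to 10.5
```

In this model, a slice that starts at second 3 gets the earlier bid under `const` and a value between the two bids under `linear`.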
1.42 - TS_LAST_VALUE
Processes the data that belongs to each time slice. A time series aggregate (TSA) function, TS_LAST_VALUE returns the value at the end of the time slice. If the time slice is missing, an interpolation scheme is applied, and the value is determined from the values of the previous (and next) time slices according to the const (linear) interpolation scheme.
TS_LAST_VALUE returns one output row per time slice, or one output row per partition per time slice if partition expressions are specified.
Behavior type
Immutable
Syntax
TS_LAST_VALUE ( expression [ IGNORE NULLS ] [, { 'CONST' | 'LINEAR' } ] )
Parameters
expression
- An INTEGER or FLOAT expression on which to aggregate and interpolate.
IGNORE NULLS
- The IGNORE NULLS behavior changes depending on a CONST or LINEAR interpolation scheme. See When Time Series Data Contains Nulls in Analyzing Data for details.
'CONST' | 'LINEAR'
- Specifies the interpolation scheme as constant or linear. CONST interpolates from the previous time slice; LINEAR interpolates from the previous and next time slices.
Requirements
You must use an ORDER BY clause with a TIMESTAMP column.
Multiple time series aggregate functions
The same query can call multiple time series aggregate functions. They share the same gap-filling policy as defined by the TIMESERIES clause; however, each time series aggregate function can specify its own interpolation policy. For example:
=> SELECT slice_time, symbol,
TS_FIRST_VALUE(bid, 'const') fv_c,
TS_FIRST_VALUE(bid, 'linear') fv_l,
TS_LAST_VALUE(bid, 'const') lv_c
FROM TickStore
TIMESERIES slice_time AS '3 seconds'
OVER(PARTITION BY symbol ORDER BY ts);
Examples
See Gap Filling and Interpolation in Analyzing Data.
1.43 - VAR_POP [aggregate]
Evaluates the population variance for each member of the group. This is defined as the sum of squares of the difference of expression from the mean of expression, divided by the number of remaining rows:
(SUM(expression*expression) - SUM(expression)*SUM(expression) / COUNT(expression)) / COUNT(expression)
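The formula above can be checked by hand with a short Python sketch (plain Python, not Vertica code). The function name `var_pop` and the sample data are hypothetical. Skipping `None` values stands in for the "remaining rows" after nulls are discarded, and returning `None` for empty input is an assumption of this sketch.

```python
import statistics


def var_pop(values):
    """Population variance via the sum-of-squares formula, ignoring None."""
    xs = [x for x in values if x is not None]
    n = len(xs)                      # COUNT(expression)
    if n == 0:
        return None                  # assumption: no rows, no result
    s = sum(xs)                      # SUM(expression)
    ss = sum(x * x for x in xs)      # SUM(expression*expression)
    return (ss - s * s / n) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(var_pop(data))                 # 4.0
print(statistics.pvariance(data))    # cross-check against the standard library
```

Here SUM is 40, the sum of squares is 232, and COUNT is 8, so the result is (232 - 1600/8) / 8 = 4.0.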
Behavior type
Immutable
Syntax
VAR_POP ( expression )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. VAR_POP returns the same data type as expression.
This aggregate function differs from analytic function VAR_POP, which computes the population variance of the current row with respect to the group of rows within a window.
Examples
The following example returns the population variance of the household IDs in the customer_dimension table.
=> SELECT VAR_POP(household_id) FROM customer_dimension;
var_pop
------------------
74847050.0168393
(1 row)
1.44 - VAR_SAMP [aggregate]
Evaluates the sample variance for each row of the group. This is defined as the sum of squares of the difference of expression from the mean of expression, divided by the number of remaining rows minus 1:
(SUM(expression*expression) - SUM(expression)*SUM(expression) / COUNT(expression)) / (COUNT(expression) - 1)
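A short Python sketch (plain Python, not Vertica code) shows the sample variance formula at work. The function name `var_samp` and the data are hypothetical. Returning `None` when fewer than two non-null values remain is an assumption made here to avoid dividing by zero.

```python
import statistics


def var_samp(values):
    """Sample variance via the sum-of-squares formula, ignoring None."""
    xs = [x for x in values if x is not None]
    n = len(xs)                      # COUNT(expression)
    if n < 2:
        return None                  # assumption: undefined for fewer than 2 rows
    s = sum(xs)                      # SUM(expression)
    ss = sum(x * x for x in xs)      # SUM(expression*expression)
    return (ss - s * s / n) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(var_samp(data))                # 32/7, about 4.571
print(statistics.variance(data))     # cross-check against the standard library
```

The only difference from the population formula is the final divisor, COUNT - 1 instead of COUNT, which is why the sample variance of the same data is slightly larger.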
Behavior type
Immutable
Syntax
VAR_SAMP ( expression )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. VAR_SAMP returns the same data type as expression.
- VAR_SAMP is semantically identical to nonstandard function VARIANCE, which is provided for compatibility with other databases.
- This aggregate function differs from analytic function VAR_SAMP, which computes the sample variance of the current row with respect to the group of rows within a window.
Examples
The following example returns the sample variance of the household IDs in the customer_dimension table.
=> SELECT VAR_SAMP(household_id) FROM customer_dimension;
var_samp
------------------
74848598.0106764
(1 row)
See also
VARIANCE [aggregate]
1.45 - VARIANCE [aggregate]
Evaluates the sample variance for each row of the group. This is defined as the sum of squares of the difference of expression from the mean of expression, divided by the number of remaining rows minus 1:
(SUM(expression*expression) - SUM(expression)*SUM(expression) / COUNT(expression)) / (COUNT(expression) - 1)
Behavior type
Immutable
Syntax
VARIANCE ( expression )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. VARIANCE returns the same data type as expression.
The nonstandard function VARIANCE is provided for compatibility with other databases. It is semantically identical to VAR_SAMP.
This aggregate function differs from analytic function VARIANCE, which computes the sample variance of the current row with respect to the group of rows within a window.
Examples
The following example returns the sample variance of the household IDs in the customer_dimension table.
=> SELECT VARIANCE(household_id) FROM customer_dimension;
variance
------------------
74848598.0106764
(1 row)
1.46 - WITHIN GROUP ORDER BY clause
Specifies how to sort rows that are grouped by aggregate functions, one of the following:
This clause is also supported for user-defined aggregate functions.
The order clause specifies order only within the result set of each group. The query can have its own ORDER BY clause, which orders the final result set and takes precedence over the order specified by WITHIN GROUP ORDER BY.
Syntax
WITHIN GROUP (ORDER BY
{ column-expression [ ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] ]
}[,...])
Parameters
column-expression
- A column, constant, or arbitrary expression formed on columns, on which to sort grouped rows.
ASC | DESC
- Specifies the ordering sequence as ascending (default) or descending.
NULLS {FIRST | LAST | AUTO}
- Specifies whether to position null values first or last. Default positioning depends on whether the sort order is ascending or descending:
If you specify NULLS AUTO, Vertica chooses the positioning that is most efficient for this query, either NULLS FIRST or NULLS LAST.
If you omit all sort qualifiers, Vertica uses ASC NULLS LAST.
Examples
For usage examples, see these functions:
2 - Analytic functions
Note
All analytic functions in this section with an aggregate counterpart are appended with [Analytics] in the heading to avoid confusion between the two function types.
Vertica analytics are SQL functions based on the ANSI 99 standard. These functions handle complex analysis and reporting tasks—for example:
- Rank the longest-standing customers in a particular state.
- Calculate the moving average of retail volume over a specified time.
- Find the highest score among all students in the same grade.
- Compare the current sales bonus that salespersons received against their previous bonus.
Analytic functions return aggregate results but they do not group the result set. They return the group value multiple times, once per record. You can sort group values, or partitions, using a window ORDER BY
clause, but the order affects only the function result set, not the entire query result set.
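To make the contrast with grouping concrete, the following Python sketch (plain Python, not Vertica code) computes a per-group sum twice: once collapsed to one value per group, as an aggregate with GROUP BY would, and once repeated on every input row, as an analytic function with a partition would. The function names and sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical (group key, value) rows.
rows = [("a", 10), ("a", 30), ("b", 5)]


def aggregate_sum(rows):
    """One result per group, like SUM(...) with GROUP BY."""
    totals = defaultdict(int)
    for key, val in rows:
        totals[key] += val
    return dict(totals)


def analytic_sum(rows):
    """One result per input row, like SUM(...) OVER (PARTITION BY ...)."""
    totals = aggregate_sum(rows)
    return [(key, val, totals[key]) for key, val in rows]


print(aggregate_sum(rows))  # {'a': 40, 'b': 5}
print(analytic_sum(rows))   # [('a', 10, 40), ('a', 30, 40), ('b', 5, 5)]
```

The analytic form keeps all three input rows and attaches the group total to each one.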
Syntax
General
analytic-function(arguments) OVER(
[ window-partition-clause ]
[ window-order-clause [ window-frame-clause ] ]
)
With named window
analytic-function(arguments) OVER(
[ named-window [ window-frame-clause ] ]
)
Parameters
analytic-function(arguments)
- A Vertica analytic function and its arguments.
OVER
- Specifies how to partition, sort, and window frame function input with respect to the current row. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses.
An empty OVER clause provides the best performance for single-threaded queries on a single node.
- window-partition-clause
- Groups input rows according to one or more columns or expressions.
If you omit this clause, no grouping occurs and the analytic function processes all input rows as a single partition.
- window-order-clause
- Optionally specifies how to sort rows that are supplied to the analytic function. If the OVER clause also includes a partition clause, rows are sorted within each partition.
- window-frame-clause
- Only valid for some analytic functions, specifies as input a set of rows relative to the row that is currently being evaluated by the analytic function. After the function processes that row and its window, Vertica advances the current row and adjusts the window boundaries accordingly.
named-window
- The name of a window that you define in the same query with a window-name-clause. This definition encapsulates window partitioning and sorting. Named windows are useful when the query invokes multiple analytic functions with similar OVER clauses.
A window name clause cannot specify a window frame clause. However, you can qualify the named window in an OVER clause with a window frame clause.
Requirements
The following requirements apply to analytic functions:
- All require an OVER clause. Each function has its own OVER clause requirements. For example, you can supply an empty OVER clause for some analytic aggregate functions such as SUM. For other functions, window frame and order clauses might be required, or might be invalid.
- Analytic functions can be invoked only in a query's SELECT and ORDER BY clauses.
- Analytic functions cannot be nested. For example, the following query is not allowed:
=> SELECT MEDIAN(RANK() OVER(ORDER BY sal)) OVER();
- WHERE, GROUP BY, and HAVING operators are technically not part of the analytic function. However, they determine input to that function.
2.1 - ARGMAX [analytic]
This function is patterned after the mathematical function argmax(f(x)), which returns the value of x that maximizes f(x). Similarly, ARGMAX takes two arguments target and arg, where both are columns or column expressions in the queried dataset. ARGMAX finds the row with the largest non-null value in target and returns the value of arg in that row. If multiple rows contain the largest target value, ARGMAX returns arg from the first row that it finds.
Behavior type
Immutable
Syntax
ARGMAX ( target, arg ) OVER ( [ PARTITION BY expression[,...] ] [ window-order-clause ] )
Arguments
target, arg
- Columns in the queried dataset.
OVER()
- Specifies the following window clauses:
- PARTITION BY expression: Groups (partitions) input rows according to the values in expression, which resolves to one or more columns in the queried dataset. If you omit this clause, ARGMAX processes all input rows as a single partition.
- window-order-clause: Specifies how to sort input rows. If the OVER clause also includes a partition clause, rows are sorted separately within each partition.
Important
To ensure consistent results when multiple rows contain the largest target value, include a window order clause that sorts on arg.
For details, see Analytic Functions.
Examples
Create and populate table service_info, which contains information on various services, their respective development groups, and their user base. A NULL in the users column indicates that the service has not been released, and so it cannot have users.
=> CREATE TABLE service_info(dev_group VARCHAR(10), product_name VARCHAR(30), users INT);
=> COPY service_info FROM stdin NULL AS 'null';
>> iris|chat|48193
>> aspen|trading|3000
>> orchid|cloud|990322
>> iris|video call| 10203
>> daffodil|streaming|44123
>> hydrangea|password manager|null
>> hydrangea|totp|1837363
>> daffodil|clip share|3000
>> hydrangea|e2e sms|null
>> rose|crypto|null
>> iris|forum|48193
>> \.
ARGMAX returns the value in the product_name column that maximizes the value in the users column. In this case, ARGMAX returns totp, which indicates that the totp service has the largest user base:
=> SELECT dev_group, product_name, users, ARGMAX(users, product_name) OVER (ORDER BY dev_group ASC) FROM service_info;
dev_group | product_name | users | ARGMAX
-----------+------------------+---------+--------
aspen | trading | 3000 | totp
daffodil | clip share | 3000 | totp
daffodil | streaming | 44123 | totp
hydrangea | e2e sms | | totp
hydrangea | password manager | | totp
hydrangea | totp | 1837363 | totp
iris | chat | 48193 | totp
iris | forum | 48193 | totp
iris | video call | 10203 | totp
orchid | cloud | 990322 | totp
rose | crypto | | totp
(11 rows)
The next query partitions the data on dev_group to identify the most popular service created by each development group. ARGMAX returns NULL if the partition's users column contains only NULL values and breaks ties using the first value in product_name from the top of the partition.
=> SELECT dev_group, product_name, users, ARGMAX(users, product_name) OVER (PARTITION BY dev_group ORDER BY product_name ASC) FROM service_info;
dev_group | product_name | users | ARGMAX
-----------+------------------+---------+-----------
iris | chat | 48193 | chat
iris | forum | 48193 | chat
iris | video call | 10203 | chat
orchid | cloud | 990322 | cloud
aspen | trading | 3000 | trading
daffodil | clip share | 3000 | streaming
daffodil | streaming | 44123 | streaming
rose | crypto | |
hydrangea | e2e sms | | totp
hydrangea | password manager | | totp
hydrangea | totp | 1837363 | totp
(11 rows)
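The tie-breaking and NULL behavior shown in the partitioned output can be modeled with a short Python sketch (plain Python, not Vertica code). The function name `argmax` and the dictionary layout are hypothetical. The sketch assumes that each partition's rows are already sorted by the window order clause and that `None` stands in for NULL.

```python
def argmax(rows, target, arg):
    """Return arg from the first row with the largest non-None target."""
    best = None
    for row in rows:
        if row[target] is None:
            continue                      # NULL targets are ignored
        if best is None or row[target] > best[target]:
            best = row                    # strict > keeps the first row on ties
    return None if best is None else best[arg]


# Partitions from the example, sorted by product_name.
iris = [
    {"product_name": "chat", "users": 48193},
    {"product_name": "forum", "users": 48193},
    {"product_name": "video call", "users": 10203},
]
rose = [{"product_name": "crypto", "users": None}]

print(argmax(iris, "users", "product_name"))  # chat
print(argmax(rose, "users", "product_name"))  # None
```

Because chat and forum tie at 48193 users, the sort on product_name decides which one is returned, which is why the Important note above recommends ordering on arg.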
See also
ARGMIN [analytic]
2.2 - ARGMIN [analytic]
This function is patterned after the mathematical function argmin(f(x)), which returns the value of x that minimizes f(x). Similarly, ARGMIN takes two arguments target and arg, where both are columns or column expressions in the queried dataset. ARGMIN finds the row with the smallest non-null value in target and returns the value of arg in that row. If multiple rows contain the smallest target value, ARGMIN returns arg from the first row that it finds.
Behavior type
Immutable
Syntax
ARGMIN ( target, arg ) OVER ( [ PARTITION BY expression[,...] ] [ window-order-clause ] )
Arguments
target, arg
- Columns in the queried dataset.
OVER()
- Specifies the following window clauses:
- PARTITION BY expression: Groups (partitions) input rows according to the values in expression, which resolves to one or more columns in the queried dataset. If you omit this clause, ARGMIN processes all input rows as a single partition.
- window-order-clause: Specifies how to sort input rows. If the OVER clause also includes a partition clause, rows are sorted separately within each partition.
Important
To ensure consistent results when multiple rows contain the smallest target value, include a window order clause that sorts on arg.
For details, see Analytic Functions.
Examples
Create and populate table service_info, which contains information on various services, their respective development groups, and their user base. A NULL in the users column indicates that the service has not been released, and so it cannot have users.
=> CREATE TABLE service_info(dev_group VARCHAR(10), product_name VARCHAR(30), users INT);
=> COPY service_info FROM stdin NULL AS 'null';
>> iris|chat|48193
>> aspen|trading|3000
>> orchid|cloud|990322
>> iris|video call| 10203
>> daffodil|streaming|44123
>> hydrangea|password manager|null
>> hydrangea|totp|1837363
>> daffodil|clip share|3000
>> hydrangea|e2e sms|null
>> rose|crypto|null
>> iris|forum|48193
>> \.
ARGMIN returns the value in the product_name column that minimizes the value in the users column. In this case, ARGMIN returns trading. The trading and clip share services tie for the smallest user base (3000 users), and trading is the first of the two that ARGMIN finds:
=> SELECT dev_group, product_name, users, ARGMIN(users, product_name) OVER (ORDER BY dev_group ASC) FROM service_info;
dev_group | product_name | users | ARGMIN
-----------+------------------+---------+---------
aspen | trading | 3000 | trading
daffodil | clip share | 3000 | trading
daffodil | streaming | 44123 | trading
hydrangea | e2e sms | | trading
hydrangea | password manager | | trading
hydrangea | totp | 1837363 | trading
iris | chat | 48193 | trading
iris | forum | 48193 | trading
iris | video call | 10203 | trading
orchid | cloud | 990322 | trading
rose | crypto | | trading
(11 rows)
The next query partitions the data on dev_group to identify the least popular service created by each development group. ARGMIN returns NULL if the partition's users column contains only NULL values and breaks ties using the first value in product_name from the top of the partition.
=> SELECT dev_group, product_name, users, ARGMIN(users, product_name) OVER (PARTITION BY dev_group ORDER BY product_name ASC) FROM service_info;
dev_group | product_name | users | ARGMIN
-----------+------------------+---------+------------
iris | chat | 48193 | video call
iris | forum | 48193 | video call
iris | video call | 10203 | video call
orchid | cloud | 990322 | cloud
aspen | trading | 3000 | trading
daffodil | clip share | 3000 | clip share
daffodil | streaming | 44123 | clip share
rose | crypto | |
hydrangea | e2e sms | | totp
hydrangea | password manager | | totp
hydrangea | totp | 1837363 | totp
(11 rows)
See also
ARGMAX [analytic]
2.3 - AVG [analytic]
Computes an average of an expression in a group within a window. AVG returns the same data type as the expression's numeric data type.
The AVG analytic function differs from the AVG aggregate function, which computes the average of an expression over a group of rows.
Behavior type
Immutable
Syntax
AVG ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any data that can be implicitly converted to a numeric data type.
OVER()
- See Analytic Functions.
Overflow handling
By default, Vertica allows silent numeric overflow when you call this function on numeric data types. For more information on this behavior and how to change it, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.
Examples
The following query finds the sales for each calendar month and returns a running (cumulative) average, sometimes called a moving average, using the default window of RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
=> SELECT calendar_month_number_in_year Mo, SUM(product_price) Sales,
AVG(SUM(product_price)) OVER (ORDER BY calendar_month_number_in_year)::INTEGER Average
FROM product_dimension pd, date_dimension dm, inventory_fact if
WHERE dm.date_key = if.date_key AND pd.product_key = if.product_key GROUP BY Mo;
Mo | Sales | Average
----+----------+----------
1 | 23869547 | 23869547
2 | 19604661 | 21737104
3 | 22877913 | 22117374
4 | 22901263 | 22313346
5 | 23670676 | 22584812
6 | 22507600 | 22571943
7 | 21514089 | 22420821
8 | 24860684 | 22725804
9 | 21687795 | 22610470
10 | 23648921 | 22714315
11 | 21115910 | 22569005
12 | 24708317 | 22747281
(12 rows)
To return a moving average that is not a running (cumulative) average, the window can specify ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING:
=> SELECT calendar_month_number_in_year Mo, SUM(product_price) Sales,
AVG(SUM(product_price)) OVER (ORDER BY calendar_month_number_in_year
ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)::INTEGER Average
FROM product_dimension pd, date_dimension dm, inventory_fact if
WHERE dm.date_key = if.date_key AND pd.product_key = if.product_key GROUP BY Mo;
Mo | Sales | Average
----+----------+----------
1 | 23869547 | 22117374
2 | 19604661 | 22313346
3 | 22877913 | 22584812
4 | 22901263 | 22312423
5 | 23670676 | 22694308
6 | 22507600 | 23090862
7 | 21514089 | 22848169
8 | 24860684 | 22843818
9 | 21687795 | 22565480
10 | 23648921 | 23204325
11 | 21115910 | 22790236
12 | 24708317 | 23157716
(12 rows)
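The two window definitions can be reproduced from the monthly Sales column with a short Python sketch (plain Python, not Vertica code). The function names are hypothetical, and `round()` stands in for the `::INTEGER` cast as an assumption. Because each month appears once, a row-by-row running total matches the default RANGE window here.

```python
# Monthly sales copied from the Sales column above.
sales = [23869547, 19604661, 22877913, 22901263, 23670676, 22507600,
         21514089, 24860684, 21687795, 23648921, 21115910, 24708317]


def running_avg(values):
    """Average of all rows from the first row through the current row."""
    out, total = [], 0
    for i, v in enumerate(values, start=1):
        total += v
        out.append(round(total / i))
    return out


def centered_avg(values, before=2, after=2):
    """Average over a frame of `before` rows back and `after` rows ahead.

    The frame is truncated at the edges, so the first row averages 3 values.
    """
    out = []
    for i in range(len(values)):
        window = values[max(0, i - before): i + after + 1]
        out.append(round(sum(window) / len(window)))
    return out


print(running_avg(sales)[:3])   # [23869547, 21737104, 22117374]
print(centered_avg(sales)[:3])  # [22117374, 22313346, 22584812]
```

The first centered value equals the third running value because both average months 1 through 3.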
2.4 - BOOL_AND [analytic]
Returns the Boolean value of an expression within a window. If all input values are true, BOOL_AND returns t. Otherwise, it returns f.
Behavior type
Immutable
Syntax
BOOL_AND ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- A Boolean data type or any non-Boolean data type that can be implicitly converted to a Boolean data type. The function returns a Boolean value.
OVER()
- See Analytic Functions.
Examples
The following example illustrates how you can use the BOOL_AND, BOOL_OR, and BOOL_XOR analytic functions. The sample table, employee, includes a column for type of employee and years paid.
=> CREATE TABLE employee(emptype VARCHAR, yearspaid VARCHAR);
CREATE TABLE
Insert sample data into the table to show years paid. In more than one case, an employee could be paid more than once within one year.
=> INSERT INTO employee
SELECT 'contractor1', '2014'
UNION ALL
SELECT 'contractor2', '2015'
UNION ALL
SELECT 'contractor3', '2014'
UNION ALL
SELECT 'contractor1', '2014'
UNION ALL
SELECT 'contractor2', '2014'
UNION ALL
SELECT 'contractor3', '2015'
UNION ALL
SELECT 'contractor4', '2014'
UNION ALL
SELECT 'contractor4', '2014'
UNION ALL
SELECT 'contractor5', '2015'
UNION ALL
SELECT 'contractor5', '2016';
OUTPUT
--------
10
(1 row)
Query the table. The result shows employees that were paid twice in 2014 (BOOL_AND), once or twice in 2014 (BOOL_OR), and exactly once in 2014 (BOOL_XOR).
=> SELECT DISTINCT emptype,
BOOL_AND(yearspaid='2014') OVER (PARTITION BY emptype) AS paidtwicein2014,
BOOL_OR(yearspaid='2014') OVER (PARTITION BY emptype) AS paidonceortwicein2014,
BOOL_XOR(yearspaid='2014') OVER (PARTITION BY emptype) AS paidjustoncein2014
FROM employee;
emptype | paidtwicein2014 | paidonceortwicein2014 | paidjustoncein2014
-------------+-----------------+-----------------------+--------------------
contractor1 | t | t | f
contractor2 | f | t | t
contractor3 | f | t | t
contractor4 | t | t | f
contractor5 | f | f | f
(5 rows)
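The result table can be reproduced with a short Python sketch (plain Python, not Vertica code) that applies the three definitions per partition. The function name `bool_summary` is hypothetical. BOOL_XOR is modeled as "exactly one input value is true", following its description in this section.

```python
from collections import defaultdict

# Same rows as the INSERT statement above.
payments = [
    ("contractor1", "2014"), ("contractor2", "2015"), ("contractor3", "2014"),
    ("contractor1", "2014"), ("contractor2", "2014"), ("contractor3", "2015"),
    ("contractor4", "2014"), ("contractor4", "2014"), ("contractor5", "2015"),
    ("contractor5", "2016"),
]


def bool_summary(rows, year="2014"):
    """Map each emptype to (bool_and, bool_or, bool_xor) of yearspaid = year."""
    flags = defaultdict(list)
    for emptype, paid in rows:
        flags[emptype].append(paid == year)
    return {
        emp: (all(f), any(f), sum(f) == 1)   # and, or, exactly-one-true
        for emp, f in sorted(flags.items())
    }


for emp, result in bool_summary(payments).items():
    print(emp, result)
```

For example, contractor2 has one 2014 payment and one 2015 payment, so only the OR and XOR columns are true.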
2.5 - BOOL_OR [analytic]
Returns the Boolean value of an expression within a window. If at least one input value is true, BOOL_OR returns t. Otherwise, it returns f.
Behavior type
Immutable
Syntax
BOOL_OR ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- A Boolean data type or any non-Boolean data type that can be implicitly converted to a Boolean data type. The function returns a Boolean value.
OVER()
- See Analytic Functions.
Examples
The following example illustrates how you can use the BOOL_AND, BOOL_OR, and BOOL_XOR analytic functions. The sample table, employee, includes a column for type of employee and years paid.
=> CREATE TABLE employee(emptype VARCHAR, yearspaid VARCHAR);
CREATE TABLE
Insert sample data into the table to show years paid. In more than one case, an employee could be paid more than once within one year.
=> INSERT INTO employee
SELECT 'contractor1', '2014'
UNION ALL
SELECT 'contractor2', '2015'
UNION ALL
SELECT 'contractor3', '2014'
UNION ALL
SELECT 'contractor1', '2014'
UNION ALL
SELECT 'contractor2', '2014'
UNION ALL
SELECT 'contractor3', '2015'
UNION ALL
SELECT 'contractor4', '2014'
UNION ALL
SELECT 'contractor4', '2014'
UNION ALL
SELECT 'contractor5', '2015'
UNION ALL
SELECT 'contractor5', '2016';
OUTPUT
--------
10
(1 row)
Query the table. The result shows employees that were paid twice in 2014 (BOOL_AND), once or twice in 2014 (BOOL_OR), and exactly once in 2014 (BOOL_XOR).
=> SELECT DISTINCT emptype,
BOOL_AND(yearspaid='2014') OVER (PARTITION BY emptype) AS paidtwicein2014,
BOOL_OR(yearspaid='2014') OVER (PARTITION BY emptype) AS paidonceortwicein2014,
BOOL_XOR(yearspaid='2014') OVER (PARTITION BY emptype) AS paidjustoncein2014
FROM employee;
emptype | paidtwicein2014 | paidonceortwicein2014 | paidjustoncein2014
-------------+-----------------+-----------------------+--------------------
contractor1 | t | t | f
contractor2 | f | t | t
contractor3 | f | t | t
contractor4 | t | t | f
contractor5 | f | f | f
(5 rows)
2.6 - BOOL_XOR [analytic]
Returns the Boolean value of an expression within a window. If only one input value is true, BOOL_XOR returns t. Otherwise, it returns f.
Behavior type
Immutable
Syntax
BOOL_XOR ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- A Boolean data type or any non-Boolean data type that can be implicitly converted to a Boolean data type. The function returns a Boolean value.
OVER()
- See Analytic Functions.
Examples
The following example illustrates how you can use the BOOL_AND, BOOL_OR, and BOOL_XOR analytic functions. The sample table, employee, includes a column for type of employee and years paid.
=> CREATE TABLE employee(emptype VARCHAR, yearspaid VARCHAR);
CREATE TABLE
Insert sample data into the table to show years paid. In more than one case, an employee could be paid more than once within one year.
=> INSERT INTO employee
SELECT 'contractor1', '2014'
UNION ALL
SELECT 'contractor2', '2015'
UNION ALL
SELECT 'contractor3', '2014'
UNION ALL
SELECT 'contractor1', '2014'
UNION ALL
SELECT 'contractor2', '2014'
UNION ALL
SELECT 'contractor3', '2015'
UNION ALL
SELECT 'contractor4', '2014'
UNION ALL
SELECT 'contractor4', '2014'
UNION ALL
SELECT 'contractor5', '2015'
UNION ALL
SELECT 'contractor5', '2016';
OUTPUT
--------
10
(1 row)
Query the table. The result shows employees that were paid twice in 2014 (BOOL_AND), once or twice in 2014 (BOOL_OR), and exactly once in 2014 (BOOL_XOR).
=> SELECT DISTINCT emptype,
BOOL_AND(yearspaid='2014') OVER (PARTITION BY emptype) AS paidtwicein2014,
BOOL_OR(yearspaid='2014') OVER (PARTITION BY emptype) AS paidonceortwicein2014,
BOOL_XOR(yearspaid='2014') OVER (PARTITION BY emptype) AS paidjustoncein2014
FROM employee;
emptype | paidtwicein2014 | paidonceortwicein2014 | paidjustoncein2014
-------------+-----------------+-----------------------+--------------------
contractor1 | t | t | f
contractor2 | f | t | t
contractor3 | f | t | t
contractor4 | t | t | f
contractor5 | f | f | f
(5 rows)
2.7 - CONDITIONAL_CHANGE_EVENT [analytic]
Assigns an event window number to each row, starting from 0, and increments by 1 when the result of evaluating the argument expression on the current row differs from that on the previous row.
Behavior type
Immutable
Syntax
CONDITIONAL_CHANGE_EVENT ( expression ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
expression
- SQL scalar expression that is evaluated on an input record. The result of expression can be of any data type.
OVER()
- See Analytic Functions.
Notes
The analytic window-order-clause is required but the window-partition-clause is optional.
Examples
=> SELECT CONDITIONAL_CHANGE_EVENT(bid)
OVER (PARTITION BY symbol ORDER BY ts) AS cce
FROM TickStore;
The system returns an error when no ORDER BY clause is present:
=> SELECT CONDITIONAL_CHANGE_EVENT(bid)
OVER (PARTITION BY symbol) AS cce
FROM TickStore;
ERROR: conditional_change_event must contain an
ORDER BY clause within its analytic clause
For more examples, see Event-based windows.
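The numbering rule can be modeled with a short Python sketch (plain Python, not Vertica code). The function name `conditional_change_event` and the sample bids are hypothetical. The sketch assumes a single partition whose values are already in window order, and it does not model NULL handling.

```python
def conditional_change_event(values):
    """Number rows from 0, adding 1 whenever a value differs from the previous one."""
    out = []
    counter = 0
    for i, v in enumerate(values):
        if i > 0 and v != values[i - 1]:
            counter += 1          # value changed relative to the previous row
        out.append(counter)
    return out


# Hypothetical bids for one symbol, ordered by timestamp.
bids = [10.0, 10.0, 10.5, 10.5, 10.0]
print(conditional_change_event(bids))  # [0, 0, 1, 1, 2]
```

Note that the last bid returns to 10.0 but still starts a new event window, because only the comparison with the immediately preceding row matters.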
2.8 - CONDITIONAL_TRUE_EVENT [analytic]
Assigns an event window number to each row, starting from 0, and increments the number by 1 when the result of the boolean argument expression evaluates true. For example, given a sequence of values for column a, as follows:
( 1, 2, 3, 4, 5, 6 )
CONDITIONAL_TRUE_EVENT(a > 3) returns 0, 0, 0, 1, 2, 3.
Behavior type
Immutable
Syntax
CONDITIONAL_TRUE_EVENT ( boolean-expression ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
boolean-expression
- SQL scalar expression that is evaluated on an input record, type BOOLEAN.
OVER()
- See Analytic functions.
Notes
The analytic window-order-clause is required but the window-partition-clause is optional.
Examples
=> SELECT CONDITIONAL_TRUE_EVENT(bid > 10.6)
OVER(PARTITION BY bid ORDER BY ts) AS cte
FROM Tickstore;
The system returns an error if the ORDER BY clause is omitted:
=> SELECT CONDITIONAL_TRUE_EVENT(bid > 10.6)
OVER(PARTITION BY bid) AS cte
FROM Tickstore;
ERROR: conditional_true_event must contain an ORDER BY
clause within its analytic clause
For more examples, see Event-based windows.
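The numbering rule, including the sequence given in the function description, can be modeled with a short Python sketch (plain Python, not Vertica code). The function name `conditional_true_event` is hypothetical. The sketch assumes a single partition whose values are already in window order.

```python
def conditional_true_event(values, predicate):
    """Number rows from 0, adding 1 on every row where the predicate is true."""
    out, counter = [], 0
    for v in values:
        if predicate(v):
            counter += 1          # a true result opens a new event window
        out.append(counter)
    return out


# The sequence for column a from the description, with the test a > 3.
print(conditional_true_event([1, 2, 3, 4, 5, 6], lambda a: a > 3))
# [0, 0, 0, 1, 2, 3]
```

Unlike CONDITIONAL_CHANGE_EVENT, the counter here advances on every true row, even when consecutive rows give the same result.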
2.9 - COUNT [analytic]
Counts occurrences within a group within a window. If you specify * or some non-null constant, COUNT() counts all rows.
Behavior type
Immutable
Syntax
COUNT ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Returns the number of rows in each group for which the expression is not null. Can be any expression resulting in BIGINT.
OVER()
- See Analytic Functions.
Examples
Using the schema defined in Window framing, the following COUNT function omits window order and window frame clauses; otherwise Vertica would treat it as a window aggregate. Think of the window of reporting aggregates as RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
=> SELECT deptno, sal, empno, COUNT(sal)
OVER (PARTITION BY deptno) AS count FROM emp;
deptno | sal | empno | count
--------+-----+-------+-------
10 | 101 | 1 | 2
10 | 104 | 4 | 2
20 | 110 | 10 | 6
20 | 110 | 9 | 6
20 | 109 | 7 | 6
20 | 109 | 6 | 6
20 | 109 | 8 | 6
20 | 109 | 11 | 6
30 | 105 | 5 | 3
30 | 103 | 3 | 3
30 | 102 | 2 | 3
Using ORDER BY sal creates a moving window query with default window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
=> SELECT deptno, sal, empno, COUNT(sal)
OVER (PARTITION BY deptno ORDER BY sal) AS count
FROM emp;
deptno | sal | empno | count
--------+-----+-------+-------
10 | 101 | 1 | 1
10 | 104 | 4 | 2
20 | 100 | 11 | 1
20 | 109 | 7 | 4
20 | 109 | 6 | 4
20 | 109 | 8 | 4
20 | 110 | 10 | 6
20 | 110 | 9 | 6
30 | 102 | 2 | 1
30 | 103 | 3 | 2
30 | 105 | 5 | 3
Using the VMart schema, the following query finds the number of employees whose hourly rate is less than or equal to that of the current employee. The query returns a running (cumulative) count using the default window of RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
=> SELECT employee_last_name AS "last_name", hourly_rate, COUNT(*)
OVER (ORDER BY hourly_rate) AS moving_count from employee_dimension;
last_name | hourly_rate | moving_count
------------+-------------+--------------
Gauthier | 6 | 4
Taylor | 6 | 4
Jefferson | 6 | 4
Nielson | 6 | 4
McNulty | 6.01 | 11
Robinson | 6.01 | 11
Dobisz | 6.01 | 11
Williams | 6.01 | 11
Kramer | 6.01 | 11
Miller | 6.01 | 11
Wilson | 6.01 | 11
Vogel | 6.02 | 14
Moore | 6.02 | 14
Vogel | 6.02 | 14
Carcetti | 6.03 | 19
...
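Rows with the same hourly rate share one count in the output above because a RANGE window ending at CURRENT ROW includes every peer of the current row, that is, every row with an equal sort value. The following Python sketch (plain Python, not Vertica code) models this; the function name `running_count_range` is hypothetical, and the input is assumed to be sorted already.

```python
from bisect import bisect_right


def running_count_range(sorted_values):
    """Count rows with a value <= the current row's value, peers included."""
    return [bisect_right(sorted_values, v) for v in sorted_values]


# Hourly rates from the first 14 rows of the output above.
rates = [6] * 4 + [6.01] * 7 + [6.02] * 3
counts = running_count_range(rates)
print(counts[0], counts[4], counts[11])  # 4 11 14
```

A ROWS-based frame would instead count physical rows, giving 1, 2, 3, and so on regardless of ties.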
To return a moving count that is not also a running (cumulative) count, the window should specify ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING:
=> SELECT employee_last_name AS "last_name", hourly_rate, COUNT(*)
OVER (ORDER BY hourly_rate ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
AS moving_count from employee_dimension;
2.10 - CUME_DIST [analytic]
Calculates the cumulative distribution, or relative rank, of the current row with regard to other rows in the same partition within a .
Calculates the cumulative distribution, or relative rank, of the current row with regard to other rows in the same partition within a window.
CUME_DIST() returns a number greater than 0 and less than or equal to 1, where the number represents the relative position of the specified row within a group of n rows. For a row x (assuming ASC ordering), the CUME_DIST of x is the number of rows with values lower than or equal to the value of x, divided by the number of rows in the partition. For example, in a group of three rows with distinct values, the cumulative distribution values returned would be 1/3, 2/3, and 3/3.
Note
Because the result for a given row depends on the number of rows preceding that row in the same partition, you should always specify a window-order-clause when you call this function.
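The definition above can be checked with a small Python sketch. This is an illustrative model of the calculation for one partition under ascending order, not Vertica's implementation; the function name is hypothetical.

```python
from fractions import Fraction


def cume_dist(values):
    """For each value x, return (rows with value <= x) / (rows in partition)."""
    n = len(values)
    return [Fraction(sum(1 for v in values if v <= x), n) for x in values]


# Three distinct values give 1/3, 2/3, and 3/3.
print(cume_dist([10, 20, 30]))

# Tied values share the same cumulative distribution.
print(cume_dist([1, 2, 2, 4]))
```

Exact fractions are used here only to keep the arithmetic easy to verify; the function itself returns a floating-point value, as in the query output below.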
Behavior type
Immutable
Syntax
CUME_DIST ( ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
OVER()
- See Analytic Functions.
Examples
The following example returns the cumulative distribution of sales for different transaction types within each month of the first quarter.
=> SELECT calendar_month_name AS month, tender_type, SUM(sales_quantity),
CUME_DIST()
OVER (PARTITION BY calendar_month_name ORDER BY SUM(sales_quantity)) AS
CUME_DIST
FROM store.store_sales_fact JOIN date_dimension
USING(date_key) WHERE calendar_month_name IN ('January','February','March')
AND tender_type NOT LIKE 'Other'
GROUP BY calendar_month_name, tender_type;
month | tender_type | SUM | CUME_DIST
----------+-------------+--------+-----------
March | Credit | 469858 | 0.25
March | Cash | 470449 | 0.5
March | Check | 473033 | 0.75
March | Debit | 475103 | 1
January | Cash | 441730 | 0.25
January | Debit | 443922 | 0.5
January | Check | 446297 | 0.75
January | Credit | 450994 | 1
February | Check | 425665 | 0.25
February | Debit | 426726 | 0.5
February | Credit | 430010 | 0.75
February | Cash | 430767 | 1
(12 rows)
2.11 - DENSE_RANK [analytic]
Within each window partition, ranks all rows in the query results set according to the order specified by the window's ORDER BY clause. A DENSE_RANK function returns a sequence of ranking numbers without any gaps.
DENSE_RANK executes as follows:
- Sorts partition rows as specified by the ORDER BY clause.
- Compares the ORDER BY values of the preceding row and current row and ranks the current row as follows:
  - If ORDER BY values are the same, the current row gets the same ranking as the preceding row.
    Note
    Null values are considered equal. For detailed information on how null values are sorted, see NULL sort order.
  - If the ORDER BY values are different, DENSE_RANK increments or decrements the current row's ranking by 1, depending on whether sort order is ascending or descending.
DENSE_RANK always changes the ranking by 1, so no gaps appear in the ranking sequence. The largest rank value is the number of unique ORDER BY values returned by the query.
Behavior type
Immutable
Syntax
DENSE_RANK() OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
OVER()
- See Analytic Functions.
Compared with RANK
RANK leaves gaps in the ranking sequence, while DENSE_RANK does not. The example below compares the behavior of the two functions.
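As a rough model of the difference, the following Python sketch computes both rankings for one partition. It is illustrative only: the function names are hypothetical, and the input is assumed to be already sorted in ascending order.

```python
def rank(sorted_values):
    """Ties share a rank; the next distinct value gets its row position."""
    ranks = []
    for i, value in enumerate(sorted_values):
        if i > 0 and value == sorted_values[i - 1]:
            ranks.append(ranks[-1])        # same value: reuse the previous rank
        else:
            ranks.append(i + 1)            # new value: 1-based row position
    return ranks


def dense_rank(sorted_values):
    """Ties share a rank; the next distinct value gets previous rank + 1."""
    ranks = []
    for i, value in enumerate(sorted_values):
        if i == 0:
            ranks.append(1)
        elif value == sorted_values[i - 1]:
            ranks.append(ranks[-1])
        else:
            ranks.append(ranks[-1] + 1)    # always step by exactly 1
    return ranks


salaries = [1200, 1204, 1214, 1218, 1218, 1220, 1222, 1222]
print(rank(salaries))
print(dense_rank(salaries))
```

In this model the last dense rank equals the number of distinct values, which matches the statement that the largest rank value is the number of unique ORDER BY values.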
Examples
The following query invokes RANK and DENSE_RANK to rank employees by annual salary. The two functions return different rankings, as follows:
- If annual_salary contains duplicate values, RANK() inserts duplicate rankings and then skips one or more values, for example from 4 to 6 and from 7 to 9.
- In the parallel column Dense Rank, DENSE_RANK() also inserts duplicate rankings, but leaves no gaps in the rankings sequence:
=> SELECT employee_region region, employee_key, annual_salary,
RANK() OVER (PARTITION BY employee_region ORDER BY annual_salary) Rank,
DENSE_RANK() OVER (PARTITION BY employee_region ORDER BY annual_salary) "Dense Rank"
FROM employee_dimension;
region | employee_key | annual_salary | Rank | Dense Rank
----------------------------------+--------------+---------------+------+------------
West | 5248 | 1200 | 1 | 1
West | 6880 | 1204 | 2 | 2
West | 5700 | 1214 | 3 | 3
West | 9857 | 1218 | 4 | 4
West | 6014 | 1218 | 4 | 4
West | 9221 | 1220 | 6 | 5
West | 7646 | 1222 | 7 | 6
West | 6621 | 1222 | 7 | 6
West | 6488 | 1224 | 9 | 7
West | 7659 | 1226 | 10 | 8
West | 7432 | 1226 | 10 | 8
West | 9905 | 1226 | 10 | 8
West | 9021 | 1228 | 13 | 9
...
West | 56 | 963104 | 2794 | 2152
West | 100 | 992363 | 2795 | 2153
East | 8353 | 1200 | 1 | 1
East | 9743 | 1202 | 2 | 2
East | 9975 | 1202 | 2 | 2
East | 9205 | 1204 | 4 | 3
East | 8894 | 1206 | 5 | 4
East | 7740 | 1206 | 5 | 4
East | 7324 | 1208 | 7 | 5
East | 6505 | 1208 | 7 | 5
East | 5404 | 1208 | 7 | 5
East | 5010 | 1208 | 7 | 5
East | 9114 | 1212 | 11 | 6
...
See also
SQL analytics
2.12 - EXPONENTIAL_MOVING_AVERAGE [analytic]
Calculates the exponential moving average (EMA) of expression E with smoothing factor X. An EMA differs from a simple moving average in that it provides a more stable picture of changes to data over time.
The EMA is calculated by adding the previous EMA value to the current data point scaled by the smoothing factor, as in the following formula:
EMA = EMA0 + (X * (E - EMA0))
where:
- E is the current data point.
- EMA0 is the previous row's EMA value.
- X is the smoothing factor.
This function also works at the row level. For example, EMA assumes the data in a given column is sampled at uniform intervals. If your data points are sampled at non-uniform intervals, run the time series gap filling and interpolation (GFI) operations before EMA().
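The recurrence above can be sketched in a few lines of Python. This is an illustrative model, not Vertica's implementation. It assumes the first row of a partition is seeded with its own value, which is consistent with the sample output in the Examples section; the function name is hypothetical.

```python
def exponential_moving_average(values, x):
    """Apply EMA = EMA0 + (X * (E - EMA0)) row by row over one partition."""
    emas = []
    for e in values:
        if not emas:
            emas.append(e)                      # assumption: first row seeds the EMA
        else:
            ema0 = emas[-1]                     # previous row's EMA value
            emas.append(ema0 + x * (e - ema0))
    return emas


# First-value column (fv) for symbol ABC from the example, smoothing factor 0.5.
print(exponential_moving_average([60.45, 57.78, 67.88], 0.5))
```

A smoothing factor close to 1 makes the result track the current data point closely, while a factor close to 0 keeps it near the previous EMA value.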
Behavior type
Immutable
Syntax
EXPONENTIAL_MOVING_AVERAGE ( E, X ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
E
- The value whose average is calculated over a set of rows. Can be INTEGER, FLOAT, or NUMERIC type and must be a constant.
X
- A positive FLOAT value between 0 and 1 that is used as the smoothing factor.
OVER()
- See Analytic Functions.
Examples
The following example uses time series gap filling and interpolation (GFI) first in a subquery, and then performs an EXPONENTIAL_MOVING_AVERAGE operation on the subquery result.
Create a simple four-column table:
=> CREATE TABLE ticker(
time TIMESTAMP,
symbol VARCHAR(8),
bid1 FLOAT,
bid2 FLOAT );
Insert some data, including nulls, so GFI can do its interpolation and gap filling:
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:00', 'ABC', 60.45, 60.44);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:01', 'ABC', 60.49, 65.12);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:02', 'ABC', 57.78, 59.25);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:03', 'ABC', null, 65.12);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:04', 'ABC', 67.88, null);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:00', 'XYZ', 47.55, 40.15);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:01', 'XYZ', 44.35, 46.78);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:02', 'XYZ', 71.56, 75.78);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:03', 'XYZ', 85.55, 70.21);
=> INSERT INTO ticker VALUES ('2009-07-12 03:00:04', 'XYZ', 45.55, 58.65);
=> COMMIT;
Note
During gap filling and interpolation, Vertica takes the closest non-null value on either side of the time slice and uses that value. For example, if you use a linear interpolation scheme and you do not specify IGNORE NULLS, and your data has one real value and one null, the result is null. If the value on either side is null, the result is null. See When Time Series Data Contains Nulls for details.
Query the table that you just created so you can see the output:
=> SELECT * FROM ticker;
time | symbol | bid1 | bid2
---------------------+--------+-------+-------
2009-07-12 03:00:00 | ABC | 60.45 | 60.44
2009-07-12 03:00:01 | ABC | 60.49 | 65.12
2009-07-12 03:00:02 | ABC | 57.78 | 59.25
2009-07-12 03:00:03 | ABC | | 65.12
2009-07-12 03:00:04 | ABC | 67.88 |
2009-07-12 03:00:00 | XYZ | 47.55 | 40.15
2009-07-12 03:00:01 | XYZ | 44.35 | 46.78
2009-07-12 03:00:02 | XYZ | 71.56 | 75.78
2009-07-12 03:00:03 | XYZ | 85.55 | 70.21
2009-07-12 03:00:04 | XYZ | 45.55 | 58.65
(10 rows)
The following query processes the first bid1 value and the last bid2 value that belong to each 2-second time slice in table ticker. The query then calculates the exponential moving average of expressions fv and lv with a smoothing factor of 50%:
=> SELECT symbol, slice_time, fv, lv,
EXPONENTIAL_MOVING_AVERAGE(fv, 0.5)
OVER (PARTITION BY symbol ORDER BY slice_time) AS ema_first,
EXPONENTIAL_MOVING_AVERAGE(lv, 0.5)
OVER (PARTITION BY symbol ORDER BY slice_time) AS ema_last
FROM (
SELECT symbol, slice_time,
TS_FIRST_VALUE(bid1 IGNORE NULLS) as fv,
TS_LAST_VALUE(bid2 IGNORE NULLS) AS lv
FROM ticker TIMESERIES slice_time AS '2 seconds'
OVER (PARTITION BY symbol ORDER BY time) ) AS sq;
symbol | slice_time | fv | lv | ema_first | ema_last
--------+---------------------+-------+-------+-----------+----------
ABC | 2009-07-12 03:00:00 | 60.45 | 65.12 | 60.45 | 65.12
ABC | 2009-07-12 03:00:02 | 57.78 | 65.12 | 59.115 | 65.12
ABC | 2009-07-12 03:00:04 | 67.88 | 65.12 | 63.4975 | 65.12
XYZ | 2009-07-12 03:00:00 | 47.55 | 46.78 | 47.55 | 46.78
XYZ | 2009-07-12 03:00:02 | 71.56 | 70.21 | 59.555 | 58.495
XYZ | 2009-07-12 03:00:04 | 45.55 | 58.65 | 52.5525 | 58.5725
(6 rows)
2.13 - FIRST_VALUE [analytic]
Lets you select the first value of a table or partition (determined by the window-order-clause) without having to use a self join. This function is useful when you want to use the first value as a baseline in calculations.
Use FIRST_VALUE() with the window-order-clause to produce deterministic results. If no window is specified for the current row, the default window is UNBOUNDED PRECEDING AND CURRENT ROW.
Behavior type
Immutable
Syntax
FIRST_VALUE ( expression [ IGNORE NULLS ] ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Expression to evaluate, for example a constant, column, nonanalytic function, function expression, or expressions involving any of these.
IGNORE NULLS
- Specifies to return the first non-null value in the set, or NULL if all values are NULL. If you omit this option and the first value in the set is null, the function returns NULL.
OVER()
- See Analytic Functions.
Examples
The following query asks for the first value in the partitioned day of week, and illustrates the potential nondeterministic nature of FIRST_VALUE():
=> SELECT calendar_year, date_key, day_of_week, full_date_description,
FIRST_VALUE(full_date_description)
OVER(PARTITION BY calendar_month_number_in_year ORDER BY day_of_week)
AS "first_value"
FROM date_dimension
WHERE calendar_year=2003 AND calendar_month_number_in_year=1;
The first value returned is January 31, 2003; however, the next time the same query is run, the first value might be January 24 or January 3, or the 10th or 17th. This is because the analytic ORDER BY column day_of_week returns rows that contain ties (multiple Fridays). These repeated values make the ORDER BY evaluation result nondeterministic, because rows that contain ties can be ordered in any way, and any one of those rows qualifies as being the first value of day_of_week.
calendar_year | date_key | day_of_week | full_date_description | first_value
--------------+----------+-------------+-----------------------+------------------
2003 | 31 | Friday | January 31, 2003 | January 31, 2003
2003 | 24 | Friday | January 24, 2003 | January 31, 2003
2003 | 3 | Friday | January 3, 2003 | January 31, 2003
2003 | 10 | Friday | January 10, 2003 | January 31, 2003
2003 | 17 | Friday | January 17, 2003 | January 31, 2003
2003 | 6 | Monday | January 6, 2003 | January 31, 2003
2003 | 27 | Monday | January 27, 2003 | January 31, 2003
2003 | 13 | Monday | January 13, 2003 | January 31, 2003
2003 | 20 | Monday | January 20, 2003 | January 31, 2003
2003 | 11 | Saturday | January 11, 2003 | January 31, 2003
2003 | 18 | Saturday | January 18, 2003 | January 31, 2003
2003 | 25 | Saturday | January 25, 2003 | January 31, 2003
2003 | 4 | Saturday | January 4, 2003 | January 31, 2003
2003 | 12 | Sunday | January 12, 2003 | January 31, 2003
2003 | 26 | Sunday | January 26, 2003 | January 31, 2003
2003 | 5 | Sunday | January 5, 2003 | January 31, 2003
2003 | 19 | Sunday | January 19, 2003 | January 31, 2003
2003 | 23 | Thursday | January 23, 2003 | January 31, 2003
2003 | 2 | Thursday | January 2, 2003 | January 31, 2003
2003 | 9 | Thursday | January 9, 2003 | January 31, 2003
2003 | 16 | Thursday | January 16, 2003 | January 31, 2003
2003 | 30 | Thursday | January 30, 2003 | January 31, 2003
2003 | 21 | Tuesday | January 21, 2003 | January 31, 2003
2003 | 14 | Tuesday | January 14, 2003 | January 31, 2003
2003 | 7 | Tuesday | January 7, 2003 | January 31, 2003
2003 | 28 | Tuesday | January 28, 2003 | January 31, 2003
2003 | 22 | Wednesday | January 22, 2003 | January 31, 2003
2003 | 29 | Wednesday | January 29, 2003 | January 31, 2003
2003 | 15 | Wednesday | January 15, 2003 | January 31, 2003
2003 | 1 | Wednesday | January 1, 2003 | January 31, 2003
2003 | 8 | Wednesday | January 8, 2003 | January 31, 2003
(31 rows)
Note
The day_of_week results are returned in alphabetical order because of lexical rules. The fact that each day does not appear ordered by the 7-day week cycle (for example, starting with Sunday followed by Monday, Tuesday, and so on) has no effect on results.
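The effect of ties on the first value can be modeled in Python. This sketch is illustrative only, with hypothetical names and a three-row sample. Python's sort is stable, so tied rows keep their input order; a database makes no such promise, and feeding the same rows in a different order stands in for that arbitrary tie ordering.

```python
def first_value(rows, order_key, expression):
    """Sort rows by order_key and evaluate expression on the first row."""
    ordered = sorted(rows, key=order_key)   # stable sort: ties keep input order
    return expression(ordered[0])


rows = [
    {"date_key": 3, "day_of_week": "Friday", "description": "January 3, 2003"},
    {"date_key": 31, "day_of_week": "Friday", "description": "January 31, 2003"},
    {"date_key": 6, "day_of_week": "Monday", "description": "January 6, 2003"},
]
shuffled = list(reversed(rows))             # same rows, different arrival order


def describe(row):
    return row["description"]


# Ordering by a column with ties: the answer depends on arrival order.
print(first_value(rows, lambda r: r["day_of_week"], describe))
print(first_value(shuffled, lambda r: r["day_of_week"], describe))

# Ordering by a unique column: the answer is the same either way.
print(first_value(rows, lambda r: r["date_key"], describe))
print(first_value(shuffled, lambda r: r["date_key"], describe))
```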
To return deterministic results, modify the query so that it performs its analytic ORDER BY operations on a unique field, such as date_key:
=> SELECT calendar_year, date_key, day_of_week, full_date_description,
FIRST_VALUE(full_date_description) OVER
(PARTITION BY calendar_month_number_in_year ORDER BY date_key) AS "first_value"
FROM date_dimension WHERE calendar_year=2003;
FIRST_VALUE() returns a first value of January 1 for the January partition and the first value of February 1 for the February partition. Also, the full_date_description column contains no ties:
calendar_year | date_key | day_of_week | full_date_description | first_value
---------------+----------+-------------+-----------------------+------------
2003 | 1 | Wednesday | January 1, 2003 | January 1, 2003
2003 | 2 | Thursday | January 2, 2003 | January 1, 2003
2003 | 3 | Friday | January 3, 2003 | January 1, 2003
2003 | 4 | Saturday | January 4, 2003 | January 1, 2003
2003 | 5 | Sunday | January 5, 2003 | January 1, 2003
2003 | 6 | Monday | January 6, 2003 | January 1, 2003
2003 | 7 | Tuesday | January 7, 2003 | January 1, 2003
2003 | 8 | Wednesday | January 8, 2003 | January 1, 2003
2003 | 9 | Thursday | January 9, 2003 | January 1, 2003
2003 | 10 | Friday | January 10, 2003 | January 1, 2003
2003 | 11 | Saturday | January 11, 2003 | January 1, 2003
2003 | 12 | Sunday | January 12, 2003 | January 1, 2003
2003 | 13 | Monday | January 13, 2003 | January 1, 2003
2003 | 14 | Tuesday | January 14, 2003 | January 1, 2003
2003 | 15 | Wednesday | January 15, 2003 | January 1, 2003
2003 | 16 | Thursday | January 16, 2003 | January 1, 2003
2003 | 17 | Friday | January 17, 2003 | January 1, 2003
2003 | 18 | Saturday | January 18, 2003 | January 1, 2003
2003 | 19 | Sunday | January 19, 2003 | January 1, 2003
2003 | 20 | Monday | January 20, 2003 | January 1, 2003
2003 | 21 | Tuesday | January 21, 2003 | January 1, 2003
2003 | 22 | Wednesday | January 22, 2003 | January 1, 2003
2003 | 23 | Thursday | January 23, 2003 | January 1, 2003
2003 | 24 | Friday | January 24, 2003 | January 1, 2003
2003 | 25 | Saturday | January 25, 2003 | January 1, 2003
2003 | 26 | Sunday | January 26, 2003 | January 1, 2003
2003 | 27 | Monday | January 27, 2003 | January 1, 2003
2003 | 28 | Tuesday | January 28, 2003 | January 1, 2003
2003 | 29 | Wednesday | January 29, 2003 | January 1, 2003
2003 | 30 | Thursday | January 30, 2003 | January 1, 2003
2003 | 31 | Friday | January 31, 2003 | January 1, 2003
2003 | 32 | Saturday | February 1, 2003 | February 1, 2003
2003 | 33 | Sunday | February 2, 2003 | February 1, 2003
...
(365 rows)
2.14 - LAG [analytic]
Returns the value of the input expression at the given offset before the current row within a window. This function lets you access more than one row in a table at the same time. This is useful for comparing values when the relative positions of rows can be reliably known. It also lets you avoid the more costly self join, which enhances query processing speed.
For information on getting the rows that follow, see LEAD.
Behavior type
Immutable
Syntax
LAG ( expression[, offset ] [, default ] ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
expression
- The expression to evaluate—for example, a constant, column, non-analytic function, function expression, or expressions involving any of these.
offset
- Indicates how great the lag is. The default value is 1 (the previous row). This parameter must evaluate to a constant positive integer.
default
- The value returned if offset falls outside the bounds of the table or partition. This value must be a constant value or an expression that can be evaluated to a constant; its data type must be coercible to that of the first argument.
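The way offset and default interact can be sketched in Python. This is an illustrative model over one partition that is already in window order, not Vertica code; the function names are hypothetical, and None stands in for NULL.

```python
def lag(values, offset=1, default=None):
    """Return, for each row, the value offset rows earlier, or default."""
    return [
        values[i - offset] if i - offset >= 0 else default
        for i in range(len(values))
    ]


def lead(values, offset=1, default=None):
    """Return, for each row, the value offset rows later, or default."""
    n = len(values)
    return [
        values[i + offset] if i + offset < n else default
        for i in range(n)
    ]


balances = [30, 20, 30]
print(lag(balances, 1, 0))   # first row has no previous row, so it gets 0
print(lead(balances))        # last row has no next row, so it gets None
```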
Examples
This example sums the current balance by date in a table and also sums the previous balance from the last day. Given the inputs that follow, the data satisfies the following conditions:
- For each some_id, there is exactly 1 row for each date represented by month_date.
- For each some_id, the set of dates is consecutive; that is, if there is a row for February 24 and a row for February 26, there would also be a row for February 25.
- Each some_id has the same set of dates.
=> CREATE TABLE balances (
month_date DATE,
current_bal INT,
some_id INT);
=> INSERT INTO balances values ('2009-02-24', 10, 1);
=> INSERT INTO balances values ('2009-02-25', 10, 1);
=> INSERT INTO balances values ('2009-02-26', 10, 1);
=> INSERT INTO balances values ('2009-02-24', 20, 2);
=> INSERT INTO balances values ('2009-02-25', 20, 2);
=> INSERT INTO balances values ('2009-02-26', 20, 2);
=> INSERT INTO balances values ('2009-02-24', 30, 3);
=> INSERT INTO balances values ('2009-02-25', 20, 3);
=> INSERT INTO balances values ('2009-02-26', 30, 3);
Now run LAG to sum the current balance for each date and sum the previous balance from the last day:
=> SELECT month_date,
SUM(current_bal) as current_bal_sum,
SUM(previous_bal) as previous_bal_sum FROM
(SELECT month_date, current_bal,
LAG(current_bal, 1, 0) OVER
(PARTITION BY some_id ORDER BY month_date)
AS previous_bal FROM balances) AS subQ
GROUP BY month_date ORDER BY month_date;
month_date | current_bal_sum | previous_bal_sum
------------+-----------------+------------------
2009-02-24 | 60 | 0
2009-02-25 | 50 | 60
2009-02-26 | 60 | 50
(3 rows)
Using the same example data, the following query would not be allowed because LAG is nested inside an aggregate function:
=> SELECT month_date,
SUM(current_bal) as current_bal_sum,
SUM(LAG(current_bal, 1, 0) OVER
(PARTITION BY some_id ORDER BY month_date)) AS previous_bal_sum
FROM balances GROUP BY month_date ORDER BY month_date;
The following example uses the VMart database. LAG first returns the annual income from the previous row, and then the query calculates the difference between the income in the current row and the income in the previous row:
=> SELECT occupation, customer_key, customer_name, annual_income,
LAG(annual_income, 1, 0) OVER (PARTITION BY occupation
ORDER BY annual_income) AS prev_income, annual_income -
LAG(annual_income, 1, 0) OVER (PARTITION BY occupation
ORDER BY annual_income) AS difference
FROM customer_dimension ORDER BY occupation, customer_key LIMIT 20;
occupation | customer_key | customer_name | annual_income | prev_income | difference
------------+--------------+----------------------+---------------+-------------+------------
Accountant | 15 | Midori V. Peterson | 692610 | 692535 | 75
Accountant | 43 | Midori S. Rodriguez | 282359 | 280976 | 1383
Accountant | 93 | Robert P. Campbell | 471722 | 471355 | 367
Accountant | 102 | Sam T. McNulty | 901636 | 901561 | 75
Accountant | 134 | Martha B. Overstreet | 705146 | 704335 | 811
Accountant | 165 | James C. Kramer | 376841 | 376474 | 367
Accountant | 225 | Ben W. Farmer | 70574 | 70449 | 125
Accountant | 270 | Jessica S. Lang | 684204 | 682274 | 1930
Accountant | 273 | Mark X. Lampert | 723294 | 722737 | 557
Accountant | 295 | Sharon K. Gauthier | 29033 | 28412 | 621
Accountant | 338 | Anna S. Jackson | 816858 | 815557 | 1301
Accountant | 377 | William I. Jones | 915149 | 914872 | 277
Accountant | 438 | Joanna A. McCabe | 147396 | 144482 | 2914
Accountant | 452 | Kim P. Brown | 126023 | 124797 | 1226
Accountant | 467 | Meghan K. Carcetti | 810528 | 810284 | 244
Accountant | 478 | Tanya E. Greenwood | 639649 | 639029 | 620
Accountant | 511 | Midori P. Vogel | 187246 | 185539 | 1707
Accountant | 525 | Alexander K. Moore | 677433 | 677050 | 383
Accountant | 550 | Sam P. Reyes | 735691 | 735355 | 336
Accountant | 577 | Robert U. Vu | 616101 | 615439 | 662
(20 rows)
The next example uses LEAD and LAG to return, for each row, the hire date of the employee hired just after and the employee hired just before the employee in the current row:
=> SELECT hire_date, employee_key, employee_last_name,
LEAD(hire_date, 1) OVER (ORDER BY hire_date) AS "next_hired" ,
LAG(hire_date, 1) OVER (ORDER BY hire_date) AS "last_hired"
FROM employee_dimension ORDER BY hire_date, employee_key;
hire_date | employee_key | employee_last_name | next_hired | last_hired
------------+--------------+--------------------+------------+------------
1956-04-11 | 2694 | Farmer | 1956-05-12 |
1956-05-12 | 5486 | Winkler | 1956-09-18 | 1956-04-11
1956-09-18 | 5525 | McCabe | 1957-01-15 | 1956-05-12
1957-01-15 | 560 | Greenwood | 1957-02-06 | 1956-09-18
1957-02-06 | 9781 | Bauer | 1957-05-25 | 1957-01-15
1957-05-25 | 9506 | Webber | 1957-07-04 | 1957-02-06
1957-07-04 | 6723 | Kramer | 1957-07-07 | 1957-05-25
1957-07-07 | 5827 | Garnett | 1957-11-11 | 1957-07-04
1957-11-11 | 373 | Reyes | 1957-11-21 | 1957-07-07
1957-11-21 | 3874 | Martin | 1958-02-06 | 1957-11-11
(10 rows)
2.15 - LAST_VALUE [analytic]
Lets you select the last value of a table or partition (determined by the window-order-clause) without having to use a self join. LAST_VALUE takes the last record from the partition after applying the window order clause. The function then computes the expression against the last record, and returns the results. This function is useful when you want to use the last value as a baseline in calculations.
Use LAST_VALUE() with the window-order-clause to produce deterministic results. If no window is specified for the current row, the default window is UNBOUNDED PRECEDING AND CURRENT ROW.
Tip
Due to default window semantics, LAST_VALUE does not always return the last value of a partition. If you omit window-frame-clause from the analytic clause, LAST_VALUE operates on this default window. Although results can seem non-intuitive by not returning the bottom of the current partition, it returns the bottom of the window, which continues to change along with the current input row being processed. If you want to return the last value of a partition, use UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. See examples below.
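The behavior described in this tip can be modeled in Python. The sketch is illustrative only and uses hypothetical names. It assumes the default window is a RANGE frame that ends at the last peer of the current row (rows with the same ORDER BY value), and that the partition is already sorted.

```python
def last_value_default_frame(order_keys, expressions):
    """Model the default frame: UNBOUNDED PRECEDING to CURRENT ROW (with peers)."""
    n = len(order_keys)
    results = []
    for i in range(n):
        end = i
        # Extend the frame over rows that tie with the current row.
        while end + 1 < n and order_keys[end + 1] == order_keys[i]:
            end += 1
        results.append(expressions[end])    # bottom of the window, not the partition
    return results


def last_value_full_partition(expressions):
    """Model UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: bottom of the partition."""
    return [expressions[-1]] * len(expressions)


# Salaries of department 20, ordered by salary.
sal = [100, 109, 109, 109, 110, 110]
print(last_value_default_frame(sal, sal))
print(last_value_full_partition(sal))
```

Because the expression and the ORDER BY column are the same here, the default frame simply echoes each row's own salary, which is why the first query in the Examples section does not show the highest salary per department.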
Behavior type
Immutable
Syntax
LAST_VALUE ( expression [ IGNORE NULLS ] ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Expression to evaluate—for example, a constant, column, nonanalytic function, function expression, or expressions involving any of these.
IGNORE NULLS
- Specifies to return the last non-null value in the set, or NULL if all values are NULL. If you omit this option and the last value in the set is null, the function returns NULL.
OVER()
- See Analytic Functions.
Examples
Using the schema defined in Window framing in Analyzing Data, the following query does not show the highest salary value by department; instead it shows the highest salary value by department by salary.
=> SELECT deptno, sal, empno, LAST_VALUE(sal)
OVER (PARTITION BY deptno ORDER BY sal) AS lv
FROM emp;
deptno | sal | empno | lv
--------+-----+-------+--------
10 | 101 | 1 | 101
10 | 104 | 4 | 104
20 | 100 | 11 | 100
20 | 109 | 7 | 109
20 | 109 | 6 | 109
20 | 109 | 8 | 109
20 | 110 | 10 | 110
20 | 110 | 9 | 110
30 | 102 | 2 | 102
30 | 103 | 3 | 103
30 | 105 | 5 | 105
If you include the window frame clause ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, LAST_VALUE() returns the highest salary by department, an accurate representation of the information:
=> SELECT deptno, sal, empno, LAST_VALUE(sal)
OVER (PARTITION BY deptno ORDER BY sal
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lv
FROM emp;
deptno | sal | empno | lv
--------+-----+-------+--------
10 | 101 | 1 | 104
10 | 104 | 4 | 104
20 | 100 | 11 | 110
20 | 109 | 7 | 110
20 | 109 | 6 | 110
20 | 109 | 8 | 110
20 | 110 | 10 | 110
20 | 110 | 9 | 110
30 | 102 | 2 | 105
30 | 103 | 3 | 105
30 | 105 | 5 | 105
For more examples, see FIRST_VALUE().
2.16 - LEAD [analytic]
Returns values from the row after the current row within a window, letting you access more than one row in a table at the same time. This is useful for comparing values when the relative positions of rows can be reliably known. It also lets you avoid the more costly self join, which enhances query processing speed.
Behavior type
Immutable
Syntax
LEAD ( expression[, offset ] [, default ] ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
expression
- The expression to evaluate—for example, a constant, column, non-analytic function, function expression, or expressions involving any of these.
offset
- Is an optional parameter that defaults to 1 (the next row). This parameter must evaluate to a constant positive integer.
default
- The value returned if offset falls outside the bounds of the table or partition. This value must be a constant value or an expression that can be evaluated to a constant; its data type must be coercible to that of the first argument.
Examples
LEAD finds the hire date of the employee hired just after the current row:
=> SELECT employee_region, hire_date, employee_key, employee_last_name,
LEAD(hire_date, 1) OVER (PARTITION BY employee_region ORDER BY hire_date) AS "next_hired"
FROM employee_dimension ORDER BY employee_region, hire_date, employee_key;
employee_region | hire_date | employee_key | employee_last_name | next_hired
-------------------+------------+--------------+--------------------+------------
East | 1956-04-08 | 9218 | Harris | 1957-02-06
East | 1957-02-06 | 7799 | Stein | 1957-05-25
East | 1957-05-25 | 3687 | Farmer | 1957-06-26
East | 1957-06-26 | 9474 | Bauer | 1957-08-18
East | 1957-08-18 | 570 | Jefferson | 1957-08-24
East | 1957-08-24 | 4363 | Wilson | 1958-02-17
East | 1958-02-17 | 6457 | McCabe | 1958-06-26
East | 1958-06-26 | 6196 | Li | 1958-07-16
East | 1958-07-16 | 7749 | Harris | 1958-09-18
East | 1958-09-18 | 9678 | Sanchez | 1958-11-10
(10 rows)
The next example uses LEAD and LAG to return, for each row, the hire date of the employee hired just after and the employee hired just before the employee in the current row.
=> SELECT hire_date, employee_key, employee_last_name,
LEAD(hire_date, 1) OVER (ORDER BY hire_date) AS "next_hired" ,
LAG(hire_date, 1) OVER (ORDER BY hire_date) AS "last_hired"
FROM employee_dimension ORDER BY hire_date, employee_key;
hire_date | employee_key | employee_last_name | next_hired | last_hired
------------+--------------+--------------------+------------+------------
1956-04-11 | 2694 | Farmer | 1956-05-12 |
1956-05-12 | 5486 | Winkler | 1956-09-18 | 1956-04-11
1956-09-18 | 5525 | McCabe | 1957-01-15 | 1956-05-12
1957-01-15 | 560 | Greenwood | 1957-02-06 | 1956-09-18
1957-02-06 | 9781 | Bauer | 1957-05-25 | 1957-01-15
1957-05-25 | 9506 | Webber | 1957-07-04 | 1957-02-06
1957-07-04 | 6723 | Kramer | 1957-07-07 | 1957-05-25
1957-07-07 | 5827 | Garnett | 1957-11-11 | 1957-07-04
1957-11-11 | 373 | Reyes | 1957-11-21 | 1957-07-07
1957-11-21 | 3874 | Martin | 1958-02-06 | 1957-11-11
(10 rows)
The following example returns employee name and salary, along with the next highest and lowest salaries.
=> SELECT employee_last_name, annual_salary,
NVL(LEAD(annual_salary) OVER (ORDER BY annual_salary),
MIN(annual_salary) OVER()) "Next Highest",
NVL(LAG(annual_salary) OVER (ORDER BY annual_salary),
MAX(annual_salary) OVER()) "Next Lowest"
FROM employee_dimension;
employee_last_name | annual_salary | Next Highest | Next Lowest
--------------------+---------------+--------------+-------------
Nielson | 1200 | 1200 | 995533
Lewis | 1200 | 1200 | 1200
Harris | 1200 | 1202 | 1200
Robinson | 1202 | 1202 | 1200
Garnett | 1202 | 1202 | 1202
Weaver | 1202 | 1202 | 1202
Nielson | 1202 | 1202 | 1202
McNulty | 1202 | 1204 | 1202
Farmer | 1204 | 1204 | 1202
Martin | 1204 | 1204 | 1204
(10 rows)
The next example returns, for each assistant director in the employee_dimension table, the hire date of the assistant director who follows the current row when rows are sorted by hire date in descending order. For example, Jackson was hired on 2016-12-28, and the next row in that order is Bauer, hired on 2016-12-26:
=> SELECT employee_last_name, hire_date,
LEAD(hire_date, 1) OVER (ORDER BY hire_date DESC) as "NextHired"
FROM employee_dimension WHERE job_title = 'Assistant Director';
employee_last_name | hire_date | NextHired
--------------------+------------+------------
Jackson | 2016-12-28 | 2016-12-26
Bauer | 2016-12-26 | 2016-12-11
Miller | 2016-12-11 | 2016-12-07
Fortin | 2016-12-07 | 2016-11-27
Harris | 2016-11-27 | 2016-11-15
Goldberg | 2016-11-15 |
(6 rows)
2.17 - MAX [analytic]
Returns the maximum value of an expression within a window. The return value has the same type as the expression data type.
The analytic functions MIN() and MAX() can operate with Boolean values. The MAX() function acts upon a Boolean data type or a value that can be implicitly converted to a Boolean value. If at least one input value is true, MAX() returns t (true). Otherwise, it returns f (false). In the same scenario, the MIN() function returns t (true) if all input values are true. Otherwise, it returns f.
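In other words, over Boolean input MAX behaves like a logical OR and MIN like a logical AND. The following Python sketch models that rule for non-null inputs only. It is an illustration with hypothetical function names, and it does not attempt to model how null values are handled.

```python
def max_bool(values):
    """True if at least one input value is true (logical OR)."""
    return any(values)


def min_bool(values):
    """True only if all input values are true (logical AND)."""
    return all(values)


torf = [True, False]          # the two rows inserted for emp1 in the example
print(max_bool(torf))         # corresponds to t
print(min_bool(torf))         # corresponds to f
```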
Behavior type
Immutable
Syntax
MAX ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any expression for which the maximum value is calculated, typically a column reference.
OVER()
- See Analytic Functions.
Examples
The following query computes the deviation between the employees' annual salary and the maximum annual salary in Massachusetts:
=> SELECT employee_state, annual_salary,
MAX(annual_salary)
OVER(PARTITION BY employee_state ORDER BY employee_key) max,
annual_salary- MAX(annual_salary)
OVER(PARTITION BY employee_state ORDER BY employee_key) diff
FROM employee_dimension
WHERE employee_state = 'MA';
employee_state | annual_salary | max | diff
----------------+---------------+--------+---------
MA | 1918 | 995533 | -993615
MA | 2058 | 995533 | -993475
MA | 2586 | 995533 | -992947
MA | 2500 | 995533 | -993033
MA | 1318 | 995533 | -994215
MA | 2072 | 995533 | -993461
MA | 2656 | 995533 | -992877
MA | 2148 | 995533 | -993385
MA | 2366 | 995533 | -993167
MA | 2664 | 995533 | -992869
(10 rows)
The following example shows you the difference between the MIN and MAX analytic functions when you use them with a Boolean value. The sample creates a table with two columns, adds two rows of data, and shows sample output for MIN and MAX.
CREATE TABLE min_max_functions (emp VARCHAR, torf BOOL);
INSERT INTO min_max_functions VALUES ('emp1', 1);
INSERT INTO min_max_functions VALUES ('emp1', 0);
SELECT DISTINCT emp,
MIN(torf) OVER (PARTITION BY emp) AS worksasbooleanand,
MAX(torf) OVER (PARTITION BY emp) AS worksasbooleanor
FROM min_max_functions;
emp | worksasbooleanand | worksasbooleanor
------+-------------------+------------------
emp1 | f | t
(1 row)
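As a rough analogy outside SQL, the Boolean behavior shown above corresponds to a logical AND for MIN and a logical OR for MAX. The Python sketch below only illustrates that correspondence. It does not call Vertica, and the list torf is a hypothetical stand-in for the column values in the example table.

```python
# Hypothetical stand-in for the torf column of min_max_functions
# (the inserted values 1 and 0 become True and False).
torf = [True, False]

# MIN over Booleans is true only if every value is true (logical AND).
works_as_boolean_and = min(torf)

# MAX over Booleans is true if at least one value is true (logical OR).
works_as_boolean_or = max(torf)

print(works_as_boolean_and, works_as_boolean_or)  # prints: False True
```

The two results line up with the f and t values in the example output.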
2.18 - MEDIAN [analytic]
For each row, returns the median value of a value set within each partition. MEDIAN determines the argument with the highest numeric precedence, implicitly converts the remaining arguments to that data type, and returns that data type.
MEDIAN is an alias of PERCENTILE_CONT [analytic] with an argument of 0.5 (50%).
Behavior type
Immutable
Syntax
MEDIAN ( expression ) OVER ( [ window-partition-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the middle value or an interpolated value that would be the middle value once the values are sorted. Null values are ignored in the calculation.
OVER()
- If the OVER clause specifies window-partition-clause, MEDIAN groups input rows according to one or more columns or expressions. If this clause is omitted, no grouping occurs and MEDIAN processes all input rows as a single partition.
Examples
See Calculating a median value
2.19 - MIN [analytic]
Returns the minimum value of an expression within a window. The return value has the same type as the expression data type.
The analytic functions MIN() and MAX() can operate with Boolean values. The MAX() function acts upon a Boolean data type or a value that can be implicitly converted to a Boolean value. If at least one input value is true, MAX() returns t (true). Otherwise, it returns f (false). In the same scenario, the MIN() function returns t (true) if all input values are true. Otherwise, it returns f (false).
Behavior type
Immutable
Syntax
MIN ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any expression for which the minimum value is calculated, typically a column reference.
OVER()
- See Analytic Functions.
Examples
The following query computes the deviation between the employees' annual salary and the minimum annual salary in Massachusetts:
=> SELECT employee_state, annual_salary,
MIN(annual_salary)
OVER(PARTITION BY employee_state ORDER BY employee_key) min,
annual_salary- MIN(annual_salary)
OVER(PARTITION BY employee_state ORDER BY employee_key) diff
FROM employee_dimension
WHERE employee_state = 'MA';
employee_state | annual_salary | min | diff
----------------+---------------+------+------
MA | 1918 | 1204 | 714
MA | 2058 | 1204 | 854
MA | 2586 | 1204 | 1382
MA | 2500 | 1204 | 1296
MA | 1318 | 1204 | 114
MA | 2072 | 1204 | 868
MA | 2656 | 1204 | 1452
MA | 2148 | 1204 | 944
MA | 2366 | 1204 | 1162
MA | 2664 | 1204 | 1460
(10 rows)
The following example shows the difference between the MIN and MAX analytic functions when you use them with a Boolean value. The sample creates a table with two columns, adds two rows of data, and shows sample output for MIN and MAX.
CREATE TABLE min_max_functions (emp VARCHAR, torf BOOL);
INSERT INTO min_max_functions VALUES ('emp1', 1);
INSERT INTO min_max_functions VALUES ('emp1', 0);
SELECT DISTINCT emp,
MIN(torf) OVER (PARTITION BY emp) AS worksasbooleanand,
MAX(torf) OVER (PARTITION BY emp) AS worksasbooleanor
FROM min_max_functions;
emp | worksasbooleanand | worksasbooleanor
------+-------------------+------------------
emp1 | f | t
(1 row)
2.20 - NTH_VALUE [analytic]
Returns the value evaluated at the row that is the nth row of the window (counting from 1). If the specified row does not exist, NTH_VALUE returns NULL.
Behavior type
Immutable
Syntax
NTH_VALUE ( expression, row-number [ IGNORE NULLS ] ) OVER (
[ window-frame-clause ]
[ window-order-clause ])
Parameters
expression
- Expression to evaluate. The expression can be a constant, column name, nonanalytic function, function expression, or expressions that include any of these.
row-number
- Specifies the row to evaluate, where row-number evaluates to an integer ≥ 1.
IGNORE NULLS
- Specifies to return the first non-NULL value in the set, or NULL if all values are NULL.
OVER()
- See Analytic Functions.
Examples
In the following example, for each tuple (current row) in table t1, the window frame clause defines the window as follows:
ORDER BY b ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
For each window, n for nth value is a+1. a is the value of column a in the tuple.
NTH_VALUE returns the result of the expression b+1, where b is the value of column b in the nth row, which is the a+1 row within the window.
=> SELECT * FROM t1 ORDER BY a;
a | b
---+----
1 | 10
2 | 20
2 | 21
3 | 30
4 | 40
5 | 50
6 | 60
(7 rows)
=> SELECT NTH_VALUE(b+1, a+1) OVER
(ORDER BY b ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) FROM t1;
?column?
----------
22
31
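(7 rows)
Only two of the seven rows return a value. For the other five rows, the window contains fewer than a+1 rows, so NTH_VALUE returns NULL, which the output displays as an empty value.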
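The window arithmetic in this example can be traced by hand. The Python sketch below is a simplified model, not Vertica code: the function name nth_value_rows and its preceding parameter are hypothetical, and it assumes the frame ROWS BETWEEN 3 PRECEDING AND CURRENT ROW over rows ordered by b.

```python
def nth_value_rows(rows, preceding=3):
    """Model NTH_VALUE(b+1, a+1) OVER (ORDER BY b ROWS BETWEEN
    `preceding` PRECEDING AND CURRENT ROW) for a list of (a, b) tuples."""
    ordered = sorted(rows, key=lambda row: row[1])  # ORDER BY b
    results = []
    for i, (a, _b) in enumerate(ordered):
        # Frame: up to `preceding` rows before the current row, plus the current row.
        frame = ordered[max(0, i - preceding): i + 1]
        n = a + 1  # row-number is evaluated against the current row
        if n <= len(frame):
            results.append(frame[n - 1][1] + 1)  # b+1 at the nth row of the frame
        else:
            results.append(None)  # the nth row does not exist, so the result is NULL
    return results

t1 = [(1, 10), (2, 20), (2, 21), (3, 30), (4, 40), (5, 50), (6, 60)]
result = nth_value_rows(t1)
print(result)  # prints: [None, None, 22, 31, None, None, None]
```

Under these assumptions, only the third and fourth rows have a frame large enough to contain row a+1, which agrees with the two values in the output.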
2.21 - NTILE [analytic]
Equally divides an ordered data set (partition) into a specified number of subsets within a window, where the subsets are numbered 1 through the value in parameter constant-value. For example, if constant-value = 4 and the partition contains 20 rows, NTILE divides the partition rows into four equal subsets of five rows. NTILE assigns each row to a subset by giving each row a number from 1 to 4. The five rows in the first subset are assigned 1, the next five are assigned 2, and so on.
If the number of partition rows is not evenly divisible by the number of subsets, the rows are distributed so no subset is more than one row larger than any other subset, and the lowest subsets have extra rows. For example, if constant-value = 4 and the number of rows = 21, the first subset has six rows, the second subset has five rows, and so on.
If the number of subsets is greater than the number of rows, then a number of subsets equal to the number of rows is filled, and the remaining subsets are empty.
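The distribution rules above can be modeled in a few lines. The Python sketch below is an illustration under the stated rules, not Vertica's implementation. The helper names ntile_labels and subset_sizes are hypothetical.

```python
def ntile_labels(num_rows, num_subsets):
    """Return the subset number (1..num_subsets) assigned to each ordered row."""
    base, extra = divmod(num_rows, num_subsets)
    labels = []
    for subset in range(1, num_subsets + 1):
        # The lowest-numbered subsets each take one of the leftover rows.
        size = base + (1 if subset <= extra else 0)
        labels.extend([subset] * size)
    return labels

def subset_sizes(labels, num_subsets):
    """Count how many rows landed in each subset."""
    return [labels.count(subset) for subset in range(1, num_subsets + 1)]

print(subset_sizes(ntile_labels(20, 4), 4))  # prints: [5, 5, 5, 5]
print(subset_sizes(ntile_labels(21, 4), 4))  # prints: [6, 5, 5, 5]
print(subset_sizes(ntile_labels(3, 5), 5))   # prints: [1, 1, 1, 0, 0]
```

The three calls correspond to the evenly divisible case, the case with one extra row, and the case with more subsets than rows.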
Behavior type
Immutable
Syntax
NTILE ( constant-value ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
constant-value
- Specifies the number of subsets, where constant-value must resolve to a positive constant for each partition.
OVER()
- See Analytic Functions.
Examples
The following query assigns each month's sales total into one of four subsets:
=> SELECT calendar_month_name AS MONTH, SUM(sales_quantity),
NTILE(4) OVER (ORDER BY SUM(sales_quantity)) AS NTILE
FROM store.store_sales_fact JOIN date_dimension
USING(date_key)
GROUP BY calendar_month_name
ORDER BY NTILE;
MONTH | SUM | NTILE
-----------+---------+-------
November | 2040726 | 1
June | 2088528 | 1
February | 2134708 | 1
April | 2181767 | 2
January | 2229220 | 2
October | 2316363 | 2
September | 2323914 | 3
March | 2354409 | 3
August | 2387017 | 3
July | 2417239 | 4
May | 2492182 | 4
December | 2531842 | 4
(12 rows)
2.22 - PERCENT_RANK [analytic]
Calculates the relative rank of a given row in a group within a window by dividing that row’s rank less 1 by the number of rows in the partition, also less 1. PERCENT_RANK always returns values from 0 to 1 inclusive. The first row in any set has a PERCENT_RANK of 0. The return value is NUMBER.
( rank - 1 ) / ( [ rows ] - 1 )
In the preceding formula, rank is the rank position of a row in the group and rows is the total number of rows in the partition defined by the OVER() clause.
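As a worked instance of the formula, the Python sketch below computes (rank - 1) / (rows - 1) for one partition. It is a simplified model, not Vertica code: the function name percent_rank is hypothetical, the sample values are the February totals from the first example on this page, and returning 0 for a single-row partition is an assumption based on the statement that the first row in any set has a PERCENT_RANK of 0.

```python
def percent_rank(partition):
    """Map each key in `partition` (key -> ordering value) to its percent rank."""
    ordered = sorted(partition.values())
    rows = len(ordered)
    result = {}
    for key, value in partition.items():
        # RANK: 1 plus the number of rows that sort strictly before this one.
        rank = ordered.index(value) + 1
        # Assumption: a single-row partition yields 0 (avoids dividing by zero).
        result[key] = 0.0 if rows == 1 else (rank - 1) / (rows - 1)
    return result

february = {"OR": 460588, "IA": 418490, "NV": 838039, "DC": 616553, "WI": 619204}
ranks = percent_rank(february)
print(ranks["IA"], ranks["OR"], ranks["DC"], ranks["WI"], ranks["NV"])
# prints: 0.0 0.25 0.5 0.75 1.0
```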
Behavior type
Immutable
Syntax
PERCENT_RANK ( ) OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
OVER()
- See Analytic Functions
Examples
The following example finds the percent rank of gross profit for different states within each month of the first quarter:
=> SELECT calendar_month_name AS MONTH, store_state,
SUM(gross_profit_dollar_amount),
PERCENT_RANK() OVER (PARTITION BY calendar_month_name
ORDER BY SUM(gross_profit_dollar_amount)) AS PERCENT_RANK
FROM store.store_sales_fact JOIN date_dimension
USING(date_key)
JOIN store.store_dimension
USING (store_key)
WHERE calendar_month_name IN ('January','February','March')
AND store_state IN ('OR','IA','DC','NV','WI')
GROUP BY calendar_month_name, store_state
ORDER BY calendar_month_name, PERCENT_RANK;
MONTH | store_state | SUM | PERCENT_RANK
----------+-------------+--------+--------------
February | IA | 418490 | 0
February | OR | 460588 | 0.25
February | DC | 616553 | 0.5
February | WI | 619204 | 0.75
February | NV | 838039 | 1
January | OR | 446528 | 0
January | IA | 474501 | 0.25
January | DC | 628496 | 0.5
January | WI | 679382 | 0.75
January | NV | 871824 | 1
March | IA | 460282 | 0
March | OR | 481935 | 0.25
March | DC | 716063 | 0.5
March | WI | 771575 | 0.75
March | NV | 970878 | 1
(15 rows)
The following example calculates, for each employee, the percent rank of the employee's salary by their job title:
=> SELECT job_title, employee_last_name, annual_salary,
PERCENT_RANK()
OVER (PARTITION BY job_title ORDER BY annual_salary DESC) AS percent_rank
FROM employee_dimension
ORDER BY percent_rank, annual_salary;
job_title | employee_last_name | annual_salary | percent_rank
--------------------+--------------------+---------------+---------------------
Cashier | Fortin | 3196 | 0
Delivery Person | Garnett | 3196 | 0
Cashier | Vogel | 3196 | 0
Customer Service | Sanchez | 3198 | 0
Shelf Stocker | Jones | 3198 | 0
Custodian | Li | 3198 | 0
Customer Service | Kramer | 3198 | 0
Greeter | McNulty | 3198 | 0
Greeter | Greenwood | 3198 | 0
Shift Manager | Miller | 99817 | 0
Advertising | Vu | 99853 | 0
Branch Manager | Jackson | 99858 | 0
Marketing | Taylor | 99928 | 0
Assistant Director | King | 99973 | 0
Sales | Kramer | 99973 | 0
Head of PR | Goldberg | 199067 | 0
Regional Manager | Gauthier | 199744 | 0
Director of HR | Moore | 199896 | 0
Head of Marketing | Overstreet | 199955 | 0
VP of Advertising | Meyer | 199975 | 0
VP of Sales | Sanchez | 199992 | 0
Founder | Gauthier | 927335 | 0
CEO | Taylor | 953373 | 0
Investor | Garnett | 963104 | 0
Co-Founder | Vu | 977716 | 0
CFO | Vogel | 983634 | 0
President | Sanchez | 992363 | 0
Delivery Person | Li | 3194 | 0.00114155251141553
Delivery Person | Robinson | 3194 | 0.00114155251141553
Custodian | McCabe | 3192 | 0.00126582278481013
Shelf Stocker | Moore | 3196 | 0.00128040973111396
Branch Manager | Moore | 99716 | 0.00186567164179104
...
2.23 - PERCENTILE_CONT [analytic]
An inverse distribution function where, for each row, PERCENTILE_CONT returns the value that would fall into the specified percentile among a set of values in each partition within a window. For example, if the argument to the function is 0.5, the result of the function is the median of the data set (50th percentile). PERCENTILE_CONT assumes a continuous distribution data model. NULL values are ignored.
PERCENTILE_CONT computes the percentile by first computing the row number where the percentile row would exist. For example:
row-number = 1 + percentile-value * (num-partition-rows -1)
If row-number is a whole number (within an error of 0.00001), the percentile is the value of row row-number.
Otherwise, Vertica interpolates the percentile value between the value of the CEILING(row-number) row and the value of the FLOOR(row-number) row. In other words, the percentile is calculated as follows:
( CEILING( row-number) - row-number ) * ( value of FLOOR(row-number) row )
+ ( row-number - FLOOR(row-number) ) * ( value of CEILING(row-number) row)
Note
If the percentile value is 0.5, PERCENTILE_CONT returns the same result set as the function MEDIAN.
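The row-number and interpolation formulas above can be checked with a short Python sketch. This is an illustration of the documented arithmetic only, not Vertica's implementation. The function name percentile_cont is hypothetical, it handles ascending order only, and returning None for an empty input is an assumption.

```python
import math

def percentile_cont(values, percentile):
    """Interpolated percentile over the non-NULL values, sorted ascending."""
    data = sorted(v for v in values if v is not None)  # NULL values are ignored
    if not data:
        return None  # assumption: no rows means no result
    row_number = 1 + percentile * (len(data) - 1)
    nearest = round(row_number)
    if abs(row_number - nearest) < 0.00001:
        # Whole row number: take that row's value directly (rows count from 1).
        return data[nearest - 1]
    low, high = math.floor(row_number), math.ceil(row_number)
    return ((high - row_number) * data[low - 1]
            + (row_number - low) * data[high - 1])

print(percentile_cont([168312, 798221], 0.5))  # prints: 483266.5
print(percentile_cont([283043, 472339], 0.5))  # prints: 377691.0
```

The two calls use the DC and WI income pairs from the first example in the Examples section and reproduce the values shown there.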
Behavior type
Immutable
Syntax
PERCENTILE_CONT ( percentile ) WITHIN GROUP ( ORDER BY expression [ ASC | DESC ] ) OVER ( [ window-partition-clause ] )
Parameters
percentile
- Percentile value, a FLOAT constant that ranges from 0 to 1 (inclusive).
WITHIN GROUP (ORDER BY expression)
- Specifies how to sort data within each group. ORDER BY takes only one column/expression that must be INTEGER, FLOAT, INTERVAL, or NUMERIC data type. NULL values are discarded.
The WITHIN GROUP(ORDER BY) clause does not guarantee the order of the SQL result. To order the final result set, use the SQL ORDER BY clause.
ASC | DESC
- Specifies the ordering sequence as ascending (default) or descending.
Specifying ASC or DESC in the WITHIN GROUP clause affects results as long as the percentile is not 0.5.
OVER()
- See Analytic Functions
Examples
This query computes the median annual income per group for the first 300 customers in Wisconsin and the District of Columbia.
=> SELECT customer_state, customer_key, annual_income, PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY annual_income)
OVER (PARTITION BY customer_state) AS PERCENTILE_CONT
FROM customer_dimension WHERE customer_state IN ('DC','WI') AND customer_key < 300
ORDER BY customer_state, customer_key;
customer_state | customer_key | annual_income | PERCENTILE_CONT
----------------+--------------+---------------+-----------------
DC | 52 | 168312 | 483266.5
DC | 118 | 798221 | 483266.5
WI | 62 | 283043 | 377691
WI | 139 | 472339 | 377691
(4 rows)
This query computes the median annual income per group for all customers in Wisconsin and the District of Columbia.
=> SELECT customer_state, customer_key, annual_income, PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY annual_income)
OVER (PARTITION BY customer_state) AS PERCENTILE_CONT
FROM customer_dimension WHERE customer_state IN ('DC','WI') ORDER BY customer_state, customer_key;
customer_state | customer_key | annual_income | PERCENTILE_CONT
----------------+--------------+---------------+-----------------
DC | 52 | 168312 | 483266.5
DC | 118 | 798221 | 483266.5
DC | 622 | 220782 | 555088
DC | 951 | 178453 | 555088
DC | 972 | 961582 | 555088
DC | 1286 | 760445 | 555088
DC | 1434 | 44836 | 555088
...
WI | 62 | 283043 | 377691
WI | 139 | 472339 | 377691
WI | 359 | 42242 | 517717
WI | 364 | 867543 | 517717
WI | 403 | 509031 | 517717
WI | 455 | 32000 | 517717
WI | 485 | 373129 | 517717
...
(1353 rows)
2.24 - PERCENTILE_DISC [analytic]
An inverse distribution function where, for each row, PERCENTILE_DISC returns the value that would fall into the specified percentile among a set of values in each partition within a window. PERCENTILE_DISC() assumes a discrete distribution data model. NULL values are ignored.
PERCENTILE_DISC examines the cumulative distribution values in each group until it finds one that is greater than or equal to the specified percentile. Vertica computes the percentile where, for each row, PERCENTILE_DISC outputs the first value of the WITHIN GROUP(ORDER BY) column whose CUME_DIST (cumulative distribution) value is >= the argument FLOAT value, for example 0.4:
PERCENTILE_DISC(0.4) WITHIN GROUP (ORDER BY salary) OVER(PARTITION BY deptno)...
Given the following query:
SELECT CUME_DIST() OVER(ORDER BY salary) FROM table-name;
The salary with the smallest CUME_DIST value that is greater than or equal to 0.4 is also the PERCENTILE_DISC result.
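The selection rule can be sketched in Python as follows. This is a simplified model of the description above, not Vertica code. The function name percentile_disc is hypothetical, it handles ascending order only, and returning None for an empty input is an assumption.

```python
def percentile_disc(values, percentile):
    """Return the first sorted value whose cumulative distribution
    is greater than or equal to `percentile`."""
    data = sorted(v for v in values if v is not None)  # NULL values are ignored
    rows = len(data)
    for value in data:
        # CUME_DIST: fraction of rows whose value is <= the current value.
        cume_dist = sum(1 for other in data if other <= value) / rows
        if cume_dist >= percentile:
            return value
    return None  # assumption: an empty input has no percentile

dc_income = [658383, 417092, 670205]
wi_income = [227279, 703889, 458607]
print(percentile_disc(dc_income, 0.2))  # prints: 417092
print(percentile_disc(wi_income, 0.2))  # prints: 227279
```

The sample incomes are taken from the example in the Examples section, where the 20th percentile for each state is the lowest income because its cumulative distribution (1/3) already exceeds 0.2.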
Behavior type
Immutable
Syntax
PERCENTILE_DISC ( percentile ) WITHIN GROUP (
ORDER BY expression [ ASC | DESC ] ) OVER (
[ window-partition-clause ] )
Parameters
percentile
- Percentile value, a FLOAT constant that ranges from 0 to 1 (inclusive).
WITHIN GROUP(ORDER BY expression)
- Specifies how to sort data within each group. ORDER BY takes only one column/expression that must be INTEGER, FLOAT, INTERVAL, or NUMERIC data type. NULL values are discarded.
The WITHIN GROUP(ORDER BY) clause does not guarantee the order of the SQL result. To order the final result set, use the SQL ORDER BY clause.
ASC | DESC
- Specifies the ordering sequence as ascending (default) or descending.
OVER()
- See Analytic Functions
Examples
This query computes the 20th percentile annual income by group for the first 300 customers in Wisconsin and the District of Columbia.
=> SELECT customer_state, customer_key, annual_income,
PERCENTILE_DISC(.2) WITHIN GROUP(ORDER BY annual_income)
OVER (PARTITION BY customer_state) AS PERCENTILE_DISC
FROM customer_dimension
WHERE customer_state IN ('DC','WI')
AND customer_key < 300
ORDER BY customer_state, customer_key;
customer_state | customer_key | annual_income | PERCENTILE_DISC
----------------+--------------+---------------+-----------------
DC | 104 | 658383 | 417092
DC | 168 | 417092 | 417092
DC | 245 | 670205 | 417092
WI | 106 | 227279 | 227279
WI | 127 | 703889 | 227279
WI | 209 | 458607 | 227279
(6 rows)
2.25 - RANK [analytic]
Within each window partition, ranks all rows in the query results set according to the order specified by the window's ORDER BY clause.
RANK executes as follows:
- Sorts partition rows as specified by the ORDER BY clause.
- Compares the ORDER BY values of the preceding row and current row and ranks the current row as follows:
  - If ORDER BY values are the same, the current row gets the same ranking as the preceding row.
    Note
    Null values are considered equal. For detailed information on how null values are sorted, see NULL sort order.
  - If the ORDER BY values are different, RANK increments or decrements the current row's ranking by 1, plus the number of consecutive duplicate values in the rows that precede it.
The largest rank value is equal to the total number of rows returned by the query.
Behavior type
Immutable
Syntax
RANK() OVER (
[ window-partition-clause ]
window-order-clause )
Parameters
OVER()
- See Analytic Functions
Compared with DENSE_RANK
RANK can leave gaps in the ranking sequence, while DENSE_RANK does not.
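The difference is easiest to see on a small set with ties. The Python sketch below models both rankings for one partition whose values are already sorted. It is an illustration only, not Vertica code. The function names and the sample dates are hypothetical.

```python
def rank(sorted_values):
    """RANK: 1 plus the number of rows that sort strictly before the row."""
    return [sorted_values.index(value) + 1 for value in sorted_values]

def dense_rank(sorted_values):
    """DENSE_RANK: position of the row's value among the distinct values."""
    distinct = sorted(set(sorted_values))
    return [distinct.index(value) + 1 for value in sorted_values]

# Hypothetical customer_since dates within one partition, with one tie.
dates = ["2007-01-05", "2007-02-05", "2007-02-05", "2007-02-09"]
print(rank(dates))        # prints: [1, 2, 2, 4]
print(dense_rank(dates))  # prints: [1, 2, 2, 3]
```

In this model the tied rows share rank 2 under both functions. RANK then skips to 4, while DENSE_RANK continues with 3.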
Examples
The following query ranks by state all company customers that have been customers since 2007. In rows where the customer_since dates are the same, RANK assigns the rows equal ranking. When the customer_since date changes, RANK skips one or more rankings (for example, within CA, from 12 to 14, and from 17 to 19).
=> SELECT customer_state, customer_name, customer_since,
RANK() OVER (PARTITION BY customer_state ORDER BY customer_since) AS rank
FROM customer_dimension WHERE customer_type='Company' AND customer_since > '01/01/2007'
ORDER BY customer_state;
customer_state | customer_name | customer_since | rank
----------------+---------------+----------------+------
AZ | Foodshop | 2007-01-20 | 1
AZ | Goldstar | 2007-08-11 | 2
CA | Metahope | 2007-01-05 | 1
CA | Foodgen | 2007-02-05 | 2
CA | Infohope | 2007-02-09 | 3
CA | Foodcom | 2007-02-19 | 4
CA | Amerihope | 2007-02-22 | 5
CA | Infostar | 2007-03-05 | 6
CA | Intracare | 2007-03-14 | 7
CA | Infocare | 2007-04-07 | 8
...
CO | Goldtech | 2007-02-19 | 1
CT | Foodmedia | 2007-02-11 | 1
CT | Metatech | 2007-02-20 | 2
CT | Infocorp | 2007-04-10 | 3
...
See also
SQL analytics
2.26 - ROW_NUMBER [analytic]
Assigns a sequence of unique numbers to each row in a window partition, starting with 1. ROW_NUMBER and RANK are generally interchangeable, with the following differences:
- ROW_NUMBER assigns a unique ordinal number to each row in the ordered set, starting with 1.
- ROW_NUMBER() is a Vertica extension, while RANK conforms to the SQL-99 standard.
Behavior type
Immutable
Syntax
ROW_NUMBER () OVER (
[ window-partition-clause ]
[ window-order-clause ] )
Parameters
OVER()
- See Analytic Functions
Examples
The following ROW_NUMBER query partitions customers in the VMart table customer_dimension by customer_region. Within each partition, the function ranks those customers in order of seniority, as specified by its window order clause:
=> SELECT * FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY customer_region ORDER BY customer_since) AS most_senior,
customer_region, customer_name, customer_since FROM public.customer_dimension WHERE customer_type = 'Individual') sq
WHERE most_senior <= 5;
most_senior | customer_region | customer_name | customer_since
-------------+-----------------+----------------------+----------------
1 | West | Jack Y. Perkins | 1965-01-01
2 | West | Linda Q. Winkler | 1965-01-02
3 | West | Marcus K. Li | 1965-01-03
4 | West | Carla R. Jones | 1965-01-07
5 | West | Seth P. Young | 1965-01-09
1 | East | Kim O. Vu | 1965-01-01
2 | East | Alexandra L. Weaver | 1965-01-02
3 | East | Steve L. Webber | 1965-01-04
4 | East | Thom Y. Li | 1965-01-05
5 | East | Martha B. Farmer | 1965-01-07
1 | SouthWest | Martha V. Gauthier | 1965-01-01
2 | SouthWest | Jessica U. Goldberg | 1965-01-07
3 | SouthWest | Robert O. Stein | 1965-01-07
4 | SouthWest | Emily I. McCabe | 1965-01-18
5 | SouthWest | Jack E. Miller | 1965-01-25
1 | NorthWest | Julie O. Greenwood | 1965-01-08
2 | NorthWest | Amy X. McNulty | 1965-01-25
3 | NorthWest | Kevin S. Carcetti | 1965-02-09
4 | NorthWest | Sam K. Carcetti | 1965-03-16
5 | NorthWest | Alexandra X. Winkler | 1965-04-05
1 | MidWest | Michael Y. Meyer | 1965-01-01
2 | MidWest | Joanna W. Bauer | 1965-01-06
3 | MidWest | Amy E. Harris | 1965-01-08
4 | MidWest | Julie W. McCabe | 1965-01-09
5 | MidWest | William . Peterson | 1965-01-09
1 | South | Dean . Martin | 1965-01-01
2 | South | Ruth U. Williams | 1965-01-02
3 | South | Steve Y. Farmer | 1965-01-03
4 | South | Mark V. King | 1965-01-08
5 | South | Lucas Y. Young | 1965-01-10
(30 rows)
2.27 - STDDEV [analytic]
Computes the statistical sample standard deviation of the current row with respect to the group within a window. STDDEV returns the same value as the square root of the variance defined for the VAR_SAMP function:
STDDEV( expression ) = SQRT(VAR_SAMP( expression ))
When VAR_SAMP returns NULL, this function returns NULL.
Note
The nonstandard function STDDEV is provided for compatibility with other databases. It is semantically identical to STDDEV_SAMP.
Behavior type
Immutable
Syntax
STDDEV ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions
Examples
The following example returns the running standard deviation of salaries, ordered by hire date, for employees in the employee dimension table whose job title is Assistant Director:
=> SELECT employee_last_name, annual_salary,
STDDEV(annual_salary) OVER (ORDER BY hire_date) as "stddev"
FROM employee_dimension
WHERE job_title = 'Assistant Director';
employee_last_name | annual_salary | stddev
--------------------+---------------+------------------
Bauer | 85003 | NaN
Reyes | 91051 | 4276.58181261624
Overstreet | 53296 | 20278.6923394976
Gauthier | 97216 | 19543.7184537642
Jones | 82320 | 16928.0764028285
Fortin | 56166 | 18400.2738421652
Carcetti | 71135 | 16968.9453554483
Weaver | 74419 | 15729.0709901852
Stein | 85689 | 15040.5909495309
McNulty | 69423 | 14401.1524291943
Webber | 99091 | 15256.3160166536
Meyer | 74774 | 14588.6126417355
Garnett | 82169 | 14008.7223268494
Roy | 76974 | 13466.1270356647
Dobisz | 83486 | 13040.4887828347
Martin | 99702 | 13637.6804131055
Martin | 73589 | 13299.2838158566
...
2.28 - STDDEV_POP [analytic]
Evaluates the statistical population standard deviation for each member of the group.
Computes the statistical population standard deviation and returns the square root of the population variance within a window. The STDDEV_POP() return value is the same as the square root of the VAR_POP() function:
STDDEV_POP( expression ) = SQRT(VAR_POP( expression ))
When VAR_POP returns null, STDDEV_POP returns null.
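The identity above can be checked numerically. The Python sketch below uses the standard library statistics module as a stand-in for VAR_POP and STDDEV_POP, which is an assumption made for illustration. It is not Vertica code. The two salaries are the first two rows of the example in the Examples section.

```python
import math
import statistics

# First two salaries, in hire_date order, from the example on this page.
salaries = [61859, 79582]

population_variance = statistics.pvariance(salaries)  # stand-in for VAR_POP
population_stddev = statistics.pstdev(salaries)       # stand-in for STDDEV_POP

# STDDEV_POP( expression ) = SQRT(VAR_POP( expression ))
print(math.sqrt(population_variance))  # prints: 8861.5
```

The printed value agrees with the stddev_pop shown for the second row of the example output, where the window covers exactly these two salaries.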
Behavior type
Immutable
Syntax
STDDEV_POP ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions.
Examples
The following example returns the running population standard deviation of salaries, ordered by hire date, for employees in the employee dimension table whose job title is Assistant Director:
=> SELECT employee_last_name, annual_salary,
STDDEV_POP(annual_salary) OVER (ORDER BY hire_date) as "stddev_pop"
FROM employee_dimension WHERE job_title = 'Assistant Director';
employee_last_name | annual_salary | stddev_pop
--------------------+---------------+------------------
Goldberg | 61859 | 0
Miller | 79582 | 8861.5
Goldberg | 74236 | 7422.74712548456
Campbell | 66426 | 6850.22125098891
Moore | 66630 | 6322.08223926257
Nguyen | 53530 | 8356.55480080699
Harris | 74115 | 8122.72288970008
Lang | 59981 | 8053.54776538731
Farmer | 60597 | 7858.70140687825
Nguyen | 78941 | 8360.63150784682
2.29 - STDDEV_SAMP [analytic]
Computes the statistical sample standard deviation of the current row with respect to the group within a window. STDDEV_SAMP returns the same value as the square root of the variance defined for the VAR_SAMP function:
STDDEV_SAMP( expression ) = SQRT(VAR_SAMP( expression ))
When VAR_SAMP returns NULL, STDDEV_SAMP returns NULL.
Note
STDDEV_SAMP() is semantically identical to the nonstandard function STDDEV().
Behavior type
Immutable
Syntax
STDDEV_SAMP ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions
Examples
The following example returns the running sample standard deviation of salaries, ordered by hire date, for employees in the employee dimension table whose job title is Assistant Director:
=> SELECT employee_last_name, annual_salary,
STDDEV_SAMP(annual_salary) OVER (ORDER BY hire_date) as "stddev_samp"
FROM employee_dimension WHERE job_title = 'Assistant Director';
employee_last_name | annual_salary | stddev_samp
--------------------+---------------+------------------
Bauer | 85003 | NaN
Reyes | 91051 | 4276.58181261624
Overstreet | 53296 | 20278.6923394976
Gauthier | 97216 | 19543.7184537642
Jones | 82320 | 16928.0764028285
Fortin | 56166 | 18400.2738421652
Carcetti | 71135 | 16968.9453554483
Weaver | 74419 | 15729.0709901852
Stein | 85689 | 15040.5909495309
McNulty | 69423 | 14401.1524291943
Webber | 99091 | 15256.3160166536
Meyer | 74774 | 14588.6126417355
Garnett | 82169 | 14008.7223268494
Roy | 76974 | 13466.1270356647
Dobisz | 83486 | 13040.4887828347
...
2.30 - SUM [analytic]
Computes the sum of an expression over a group of rows within a window. It returns a DOUBLE PRECISION value for a floating-point expression. Otherwise, the return value is the same as the expression data type.
Behavior type
Immutable
Syntax
SUM ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions
Overflow handling
If you encounter data overflow when using SUM, use SUM_FLOAT, which converts data to a floating point.
By default, Vertica allows silent numeric overflow when you call this function on numeric data types. For more information on this behavior and how to change it, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.
Examples
The following query returns the cumulative sum of all the returns made to stores in January:
=> SELECT calendar_month_name AS month, transaction_type, sales_quantity,
SUM(sales_quantity)
OVER (PARTITION BY calendar_month_name ORDER BY date_dimension.date_key) AS SUM
FROM store.store_sales_fact JOIN date_dimension
USING(date_key) WHERE calendar_month_name IN ('January')
AND transaction_type= 'return';
month | transaction_type | sales_quantity | SUM
---------+------------------+----------------+------
January | return | 7 | 651
January | return | 3 | 651
January | return | 7 | 651
January | return | 7 | 651
January | return | 7 | 651
January | return | 3 | 651
January | return | 7 | 651
January | return | 5 | 651
January | return | 1 | 651
January | return | 6 | 651
January | return | 6 | 651
January | return | 3 | 651
January | return | 9 | 651
January | return | 7 | 651
January | return | 6 | 651
January | return | 8 | 651
January | return | 7 | 651
January | return | 2 | 651
January | return | 4 | 651
January | return | 5 | 651
January | return | 7 | 651
January | return | 8 | 651
January | return | 4 | 651
January | return | 10 | 651
January | return | 6 | 651
...
2.31 - VAR_POP [analytic]
Returns the statistical population variance of a non-null set of numbers (nulls are ignored) in a group within a window. Results are calculated by the sum of squares of the difference of expression from the mean of expression, divided by the number of rows remaining:
(SUM( expression * expression ) - SUM( expression ) * SUM( expression ) / COUNT( expression )) / COUNT( expression )
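The formula above is an algebraic rearrangement of the mean-based definition in the preceding sentence. The Python sketch below evaluates both forms on a small hypothetical data set to show that they agree. It is an illustration, not Vertica code. The function names are hypothetical, and returning None for an empty input is an assumption.

```python
def var_pop(values):
    """Population variance using the SUM/COUNT form of the formula."""
    data = [v for v in values if v is not None]  # nulls are ignored
    count = len(data)
    if count == 0:
        return None  # assumption: no non-null rows means no result
    total = sum(data)
    total_of_squares = sum(v * v for v in data)
    return (total_of_squares - total * total / count) / count

def var_pop_from_mean(values):
    """Population variance as the mean squared difference from the mean."""
    data = [v for v in values if v is not None]
    mean = sum(data) / len(data)
    return sum((v - mean) ** 2 for v in data) / len(data)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(var_pop(sample))            # prints: 4.0
print(var_pop_from_mean(sample))  # prints: 4.0
```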
Behavior type
Immutable
Syntax
VAR_POP ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions
Examples
The following example calculates the cumulative population variance of daily order totals in the store orders fact table for January 2007:
=> SELECT date_ordered,
VAR_POP(SUM(total_order_cost))
OVER (ORDER BY date_ordered) "var_pop"
FROM store.store_orders_fact s
WHERE date_ordered BETWEEN '2007-01-01' AND '2007-01-31'
GROUP BY s.date_ordered;
date_ordered | var_pop
--------------+------------------
2007-01-01 | 0
2007-01-02 | 89870400
2007-01-03 | 3470302472
2007-01-04 | 4466755450.6875
2007-01-05 | 3816904780.80078
2007-01-06 | 25438212385.25
2007-01-07 | 22168747513.1016
2007-01-08 | 23445191012.7344
2007-01-09 | 39292879603.1113
2007-01-10 | 48080574326.9609
(10 rows)
2.32 - VAR_SAMP [analytic]
Returns the sample variance of a non-NULL set of numbers (NULL values in the set are ignored) for each row of the group within a window. Results are calculated as follows:
(SUM( expression * expression ) - SUM( expression ) * SUM( expression ) / COUNT( expression ) )
/ (COUNT( expression ) - 1 )
This function and VARIANCE differ in one way: given an input set of one element, VARIANCE returns 0 and VAR_SAMP returns NULL.
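The single-element difference described above can be sketched in Python (an illustration outside Vertica, not its implementation). The function names var_samp and variance are hypothetical stand-ins, NULL is modeled as None, and the sketch follows the prose only; it does not try to reproduce the output formatting of any particular Vertica version.

```python
def var_samp(values):
    """Sample variance: (SUM(x*x) - SUM(x)*SUM(x)/COUNT(x)) / (COUNT(x) - 1)."""
    xs = [x for x in values if x is not None]  # NULL values are discarded
    n = len(xs)
    if n < 2:
        return None  # empty set or one element: modeled as NULL
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    return (sum_xx - sum_x * sum_x / n) / (n - 1)

def variance(values):
    """Same as var_samp, except that a single non-null element yields 0."""
    xs = [x for x in values if x is not None]
    if len(xs) == 1:
        return 0
    return var_samp(xs)

print(var_samp([7]), variance([7]))
print(var_samp([2.0, 4.0, 4.0, 6.0]))
```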
Behavior type
Immutable
Syntax
VAR_SAMP ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions
Null handling
- VAR_SAMP returns the sample variance of a set of numbers after it discards the NULL values in the set.
- If the function is applied to an empty set, then it returns NULL.
Examples
The following example calculates the sample variance in the store orders fact table of sales in December 2007:
=> SELECT date_ordered,
VAR_SAMP(SUM(total_order_cost))
OVER (ORDER BY date_ordered) "var_samp"
FROM store.store_orders_fact s
WHERE date_ordered BETWEEN '2007-12-01' AND '2007-12-31'
GROUP BY s.date_ordered;
date_ordered | var_samp
--------------+------------------
2007-12-01 | NaN
2007-12-02 | 90642601088
2007-12-03 | 48030548449.3359
2007-12-04 | 32740062504.2461
2007-12-05 | 32100319112.6992
2007-12-06 | 26274166814.668
2007-12-07 | 23017490251.9062
2007-12-08 | 21099374085.1406
2007-12-09 | 27462205977.9453
2007-12-10 | 26288687564.1758
(10 rows)
2.33 - VARIANCE [analytic]
Returns the sample variance of a non-NULL set of numbers (NULL values in the set are ignored) for each row of the group within a window. Results are calculated as follows:
( SUM( expression * expression ) - SUM( expression ) * SUM( expression ) / COUNT( expression )) / (COUNT( expression ) - 1 )
VARIANCE returns the variance of expression, which is calculated as follows:
Note
The nonstandard function VARIANCE is provided for compatibility with other databases. It is semantically identical to VAR_SAMP.
Behavior type
Immutable
Syntax
VARIANCE ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Any NUMERIC data type or any non-numeric data type that can be implicitly converted to a numeric data type. The function returns the same data type as the numeric data type of the argument.
OVER()
- See Analytic Functions
Examples
The following example calculates the cumulative variance in the store orders fact table of sales in December 2007:
=> SELECT date_ordered,
VARIANCE(SUM(total_order_cost))
OVER (ORDER BY date_ordered) "variance"
FROM store.store_orders_fact s
WHERE date_ordered BETWEEN '2007-12-01' AND '2007-12-31'
GROUP BY s.date_ordered;
date_ordered | variance
--------------+------------------
2007-12-01 | NaN
2007-12-02 | 2259129762
2007-12-03 | 1809012182.33301
2007-12-04 | 35138165568.25
2007-12-05 | 26644110029.3003
2007-12-06 | 25943125234
2007-12-07 | 23178202223.9048
2007-12-08 | 21940268901.1431
2007-12-09 | 21487676799.6108
2007-12-10 | 21521358853.4331
(10 rows)
3 - Client connection functions
This section contains client connection management functions specific to Vertica.
3.1 - CLOSE_ALL_RESULTSETS
Closes all result set sessions within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SELECT CLOSE_ALL_RESULTSETS ('session_id')
Parameters
session_id
- A string that specifies the Multiple Active Result Sets session.
Privileges
None; however, without superuser privileges, you can only close your own session's results.
Examples
This example shows how you can view a MARS result set, then close the result set, and then confirm that the result set has been closed.
Query the MARS storage table. One session ID is open and three result sets appear in the output.
=> SELECT * FROM SESSION_MARS_STORE;
node_name | session_id | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
v_vmart_node0001 | server1.company.-83046:1y28gu9 | dbadmin | 7 | 777460 | 776460 | 89692848
v_vmart_node0001 | server1.company.-83046:1y28gu9 | dbadmin | 8 | 324349 | 323349 | 81862010
v_vmart_node0001 | server1.company.-83046:1y28gu9 | dbadmin | 9 | 277947 | 276947 | 32978280
(3 rows)
Close all result sets for session server1.company.-83046:1y28gu9:
=> SELECT CLOSE_ALL_RESULTSETS('server1.company.-83046:1y28gu9');
close_all_resultsets
-------------------------------------------------------------
Closing all result sets from server1.company.-83046:1y28gu9
(1 row)
Query the MARS storage table again for the current status. You can see that the session and result sets have been closed:
=> SELECT * FROM SESSION_MARS_STORE;
node_name | session_id | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
(0 rows)
3.2 - CLOSE_RESULTSET
Closes a specific result set within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SELECT CLOSE_RESULTSET ('session_id', ResultSetID)
Parameters
session_id
- A string that specifies the Multiple Active Result Sets session containing the ResultSetID to close.
ResultSetID
- An integer that specifies which result set to close.
Privileges
None; however, without superuser privileges, you can only close your own session's results.
Examples
Query the MARS storage table. One session ID is open, and one result set appears in the output.
=> SELECT * FROM SESSION_MARS_STORE;
node_name | session_id | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
v_vmart_node0001 | server1.company.-83046:1y28gu9 | dbadmin | 1 | 318718 | 312718 | 80441904
(1 row)
Close result set 1 in session server1.company.-83046:1y28gu9:
=> SELECT CLOSE_RESULTSET('server1.company.-83046:1y28gu9', 1);
close_resultset
-------------------------------------------------------------
Closing result set 1 from server1.company.-83046:1y28gu9
(1 row)
Query the MARS storage table again for current status. You can see that result set 1 is now closed:
=> SELECT * FROM SESSION_MARS_STORE;
node_name | session_id | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
(0 rows)
3.3 - DESCRIBE_LOAD_BALANCE_DECISION
Evaluates if any load balancing routing rules apply to a given IP address and describes how the client connection would be handled. This function is useful when you are evaluating connection load balancing policies you have created, to ensure they work the way you expect them to.
You pass this function an IP address of a client connection, and it uses the load balancing routing rules to determine how the connection will be handled. The logic this function uses is the same logic used when Vertica load balances client connections, including determining which nodes are available to handle the client connection.
This function assumes the client connection has opted into being load balanced. If actual clients have not opted into load balancing, the connections will not be redirected. See Load balancing in ADO.NET, Load balancing in JDBC, and Load balancing, for information on enabling load balancing on the client. For vsql, use the -C command-line option to enable load balancing.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
DESCRIBE_LOAD_BALANCE_DECISION('ip_address')
Arguments
'ip_address'
- An IP address of a client connection to be tested against the load balancing rules. This can be either an IPv4 or IPv6 address.
Return value
A step-by-step description of how the load balancing rules are being evaluated, including the final decision of which node in the database has been chosen to service the connection.
Privileges
None.
Examples
The following example calls DESCRIBE_LOAD_BALANCE_DECISION with three different IP addresses. Two of the addresses are handled by different routing rules, and one is not handled by any rule.
=> SELECT describe_load_balance_decision('192.168.1.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]
(1 row)
=> SELECT describe_load_balance_decision('192.168.2.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.2.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
matches this rule
Matched to load balance group [group_all] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
(2) LB Address: [10.20.100.249]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]
(1 row)
=> SELECT describe_load_balance_decision('1.2.3.4');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [1.2.3.4]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not match
any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Classic load balancing considered, but either
the policy was NONE or no target was available. Details: [NONE or invalid]
(1 row)
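The rule evaluation visible in this output can be approximated with a short Python sketch (an illustration outside Vertica, not its implementation). It assumes that rules are tried in the order the output lists them and that the first matching source filter wins; the RULES list and the function name first_matching_rule are hypothetical, with the rule names and filters copied from the example output.

```python
import ipaddress

# Hypothetical rule list copied from the example output: (rule name, source filter).
RULES = [
    ("etl_rule", "10.20.100.0/24"),
    ("internal_clients", "192.168.1.0/24"),
    ("subnet_192", "192.0.0.0/8"),
]

def first_matching_rule(client_ip, rules=RULES):
    """Return the name of the first rule whose filter contains client_ip, else None."""
    addr = ipaddress.ip_address(client_ip)
    for name, cidr in rules:
        if addr in ipaddress.ip_network(cidr):
            return name
    return None  # no rule matched

for ip in ("192.168.1.25", "192.168.2.25", "1.2.3.4"):
    print(ip, "->", first_matching_rule(ip))
```

Under these assumptions, the three addresses resolve to internal_clients, subnet_192, and no rule, which matches the decisions shown in the transcripts.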
The following example calls DESCRIBE_LOAD_BALANCE_DECISION repeatedly with the same IP address. The load balance group's ROUNDROBIN policy alternates between the two nodes in the group:
=> SELECT describe_load_balance_decision('192.168.1.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248]
port [5433]
(1 row)
=> SELECT describe_load_balance_decision('192.168.1.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [0]
Routing table decision: Success. Load balance redirect to: [10.20.100.247]
port [5433]
(1 row)
=> SELECT describe_load_balance_decision('192.168.1.25');
describe_load_balance_decision
--------------------------------------------------------------------------------
Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248]
port [5433]
(1 row)
3.4 - GET_CLIENT_LABEL
Returns the client connection label for the current session.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
GET_CLIENT_LABEL()
Privileges
None
Examples
Return the current client connection label:
=> SELECT GET_CLIENT_LABEL();
GET_CLIENT_LABEL
-----------------------
data_load_application
(1 row)
See also
Setting a client connection label
3.5 - RESET_LOAD_BALANCE_POLICY
Resets the counter that each host in the cluster maintains to track which host it will refer a client to when the native connection load balancing scheme is set to ROUNDROBIN. To reset the counter, run this function on all cluster nodes.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
RESET_LOAD_BALANCE_POLICY()
Privileges
Superuser
Examples
=> SELECT RESET_LOAD_BALANCE_POLICY();
RESET_LOAD_BALANCE_POLICY
-------------------------------------------------------------------------
Successfully reset stateful client load balance policies: "roundrobin".
(1 row)
3.6 - SET_CLIENT_LABEL
Assigns a label to a client connection for the current session. You can use this label to distinguish client connections.
Labels appear in the SESSIONS system table. However, only certain Data collector tables show new client labels set by SET_CLIENT_LABEL. For example, DC_REQUESTS_ISSUED reflects changes by SET_CLIENT_LABEL, while DC_SESSION_STARTS, which collects login data before SET_CLIENT_LABEL can be run, does not.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SET_CLIENT_LABEL('label-name')
Parameters
label-name
- VARCHAR name assigned to the client connection label.
Privileges
None
Examples
Assign label data_load_application to the current client connection:
=> SELECT SET_CLIENT_LABEL('data_load_application');
SET_CLIENT_LABEL
-------------------------------------------
client_label set to data_load_application
(1 row)
See also
Setting a client connection label
3.7 - SET_LOAD_BALANCE_POLICY
Sets how native connection load balancing chooses a host to handle a client connection.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SET_LOAD_BALANCE_POLICY('policy')
Parameters
policy
- The name of the load balancing policy to use, one of the following:
- NONE (default): Disables native connection load balancing.
- ROUNDROBIN: Chooses the next host from a circular list of hosts in the cluster that are up. For example, in a three-node cluster, it iterates over node1, node2, and node3, then wraps back to node1. Each host in the cluster maintains its own pointer to the next host in the circular list, rather than there being a single cluster-wide state.
- RANDOM: Randomly chooses a host from among all hosts in the cluster that are up.
Note
Even if the load balancing policy is set on the server to something other than NONE, clients must indicate they want their connections to be load balanced by setting a connection property.
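The per-host pointer behavior of ROUNDROBIN can be pictured with a small Python model (an illustration only, not Vertica's implementation). The Host class, its refer method, the starting pointer position, and the node names are all assumptions made for the sketch.

```python
class Host:
    """One cluster host that keeps its own round-robin pointer (hypothetical model)."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster  # shared list of host names in a fixed order
        self.pointer = 0        # per-host state; assumed to start at the first host

    def refer(self, up_hosts):
        """Return the next host that is up, advancing this host's pointer."""
        for _ in range(len(self.cluster)):
            candidate = self.cluster[self.pointer]
            self.pointer = (self.pointer + 1) % len(self.cluster)
            if candidate in up_hosts:
                return candidate
        return None  # no host is up

cluster = ["node1", "node2", "node3"]
a = Host("node1", cluster)
b = Host("node2", cluster)
up = set(cluster)

picks_a = [a.refer(up) for _ in range(4)]  # wraps back to node1 on the fourth call
first_b = b.refer(up)                      # b's pointer is unaffected by a's referrals
print(picks_a, first_b)
```

Because each host advances only its own pointer, two clients that connect through different hosts can be referred to the same node, which is the consequence of not keeping cluster-wide state.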
Privileges
Superuser
Examples
The following example demonstrates enabling native connection load balancing on the server by setting the load balancing scheme to ROUNDROBIN:
=> SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');
SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: roundrobin
(1 row)
See also
About native connection load balancing
4 - Data-type-specific functions
Vertica provides functions for use with specific data types, described in this section.
4.1 - Collection functions
The functions in this section apply to collection types (arrays and sets).
Some functions apply aggregation operations (such as sum) to collections. These function names all begin with APPLY.
Other functions in this section operate specifically on arrays or sets, as indicated on the individual reference pages. Array functions operate on both native array values and array values in external tables.
Notes
- Arrays are 0-indexed. The first element's ordinal position is 0, the second is 1, and so on. Indexes are not meaningful for sets.
- Unless otherwise stated, functions operate on one-dimensional (1D) collections only. To use multidimensional arrays, you must first dereference to a 1D array type. Sets can only be one-dimensional.
4.1.1 - APPLY_AVG
Returns the average of all elements in a collection (array or set) with numeric values.
Behavior type
Immutable
Syntax
APPLY_AVG(collection)
Arguments
collection
- Target collection
Null-handling
The following cases return NULL:
- if the input collection is NULL
- if the input collection contains only null values
- if the input collection is empty
If the input collection contains a mix of null and non-null elements, only the non-null values are considered in the calculation of the average.
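The null-handling rules above can be summarized in a short Python sketch (an illustration outside Vertica, not its implementation). The function name apply_avg mirrors the SQL function, and NULL is modeled as None.

```python
def apply_avg(collection):
    """Average of the non-null elements, or None for the NULL cases listed above."""
    if collection is None:
        return None  # the input collection is NULL
    values = [x for x in collection if x is not None]
    if not values:
        return None  # the collection is empty or contains only nulls
    return sum(values) / len(values)

print(apply_avg([1, None, 3]))   # only 1 and 3 contribute to the average
print(apply_avg([None, None]))
print(apply_avg([]))
```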
Examples
=> SELECT apply_avg(ARRAY[1,2.4,5,6]);
apply_avg
-----------
3.6
(1 row)
4.1.2 - APPLY_COUNT (ARRAY_COUNT)
Returns the total number of non-null elements in a collection (array or set). To count all elements including nulls, use APPLY_COUNT_ELEMENTS (ARRAY_LENGTH).
Behavior type
Immutable
Syntax
APPLY_COUNT(collection)
ARRAY_COUNT is a synonym of APPLY_COUNT.
Arguments
collection
- Target collection
Null-handling
Null values are not included in the count.
Examples
The array in this example contains six elements, one of which is null:
=> SELECT apply_count(ARRAY[1,NULL,3,7,8,5]);
apply_count
-------------
5
(1 row)
4.1.3 - APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)
Returns the total number of elements in a collection (array or set), including NULLs. To count only non-null values, use APPLY_COUNT (ARRAY_COUNT).
Behavior type
Immutable
Syntax
APPLY_COUNT_ELEMENTS(collection)
ARRAY_LENGTH is a synonym of APPLY_COUNT_ELEMENTS.
Arguments
collection
- Target collection
Null-handling
This function counts all members, including nulls.
An empty collection (ARRAY[] or SET[]) has a length of 0. A collection containing a single null (ARRAY[null] or SET[null]) has a length of 1.
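The contrast between counting all elements and counting only non-null elements can be sketched in Python (an illustration outside Vertica, not its implementation). The function names mirror the SQL functions, NULL is modeled as None, and a Python set stands in for a SET.

```python
def apply_count_elements(collection):
    """Count every member, nulls included."""
    return len(collection)

def apply_count(collection):
    """Count only the non-null members."""
    return sum(1 for x in collection if x is not None)

array = [1, None, 3, 7, 8, 5]
print(apply_count_elements(array), apply_count(array))

# A set drops duplicates on construction, so its length can be
# smaller than the number of values written in the literal.
print(apply_count_elements({1, 1, 3}))
```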
Examples
The following array has six elements including one null:
=> SELECT apply_count_elements(ARRAY[1,NULL,3,7,8,5]);
apply_count_elements
---------------------
6
(1 row)
As the previous example shows, a null element is an element. Thus, an array containing only a null element has one element:
=> SELECT apply_count_elements(ARRAY[null]);
apply_count_elements
---------------------
1
(1 row)
A set does not contain duplicates. If you construct a set and pass it directly to this function, the result could differ from the number of inputs:
=> SELECT apply_count_elements(SET[1,1,3]);
apply_count_elements
---------------------
2
(1 row)
4.1.4 - APPLY_MAX
Returns the largest non-null element in a collection (array or set). This function is similar to the MAX [aggregate] function; APPLY_MAX operates on elements of a collection and MAX operates on an expression such as a column selection.
Behavior type
Immutable
Syntax
APPLY_MAX(collection)
Arguments
collection
- Target collection
Null-handling
This function ignores null elements. If all elements are null or the collection is empty, this function returns null.
Examples
=> SELECT apply_max(ARRAY[1,3.4,15]);
apply_max
-----------
15.0
(1 row)
4.1.5 - APPLY_MIN
Returns the smallest non-null element in a collection (array or set). This function is similar to the MIN [aggregate] function; APPLY_MIN operates on elements of a collection and MIN operates on an expression such as a column selection.
Behavior type
Immutable
Syntax
APPLY_MIN(collection)
Arguments
collection
- Target collection
Null-handling
This function ignores null elements. If all elements are null or the collection is empty, this function returns null.
Examples
=> SELECT apply_min(ARRAY[1,3.4,15]);
apply_min
-----------
1.0
(1 row)
4.1.6 - APPLY_SUM
Computes the sum of all elements in a collection (array or set) of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL).
Behavior type
Immutable
Syntax
APPLY_SUM(collection)
Arguments
collection
- Target collection
Null-handling
The following cases return NULL:
- if the input collection is NULL
- if the input collection contains only null values
- if the input collection is empty
Examples
=> SELECT apply_sum(ARRAY[12.5,3,4,1]);
apply_sum
-----------
20.5
(1 row)
4.1.7 - ARRAY_CAT
Concatenates two arrays of the same element type and dimensionality. For example, ROW elements must have the same fields.
If the inputs are both bounded, the bound for the result is the sum of the bounds of the inputs.
If any input is unbounded, the result is unbounded with a binary size that is the sum of the sizes of the inputs.
Behavior type
Immutable
Syntax
ARRAY_CAT(array1,array2)
Arguments
array1, array2
- Arrays of matching dimensionality and element type
Null-handling
If either input is NULL, the function returns NULL.
Examples
Types are coerced if necessary, as shown in the second example.
=> SELECT array_cat(ARRAY[1,2], ARRAY[3,4,5]);
array_cat
-----------------------
[1,2,3,4,5]
(1 row)
=> SELECT array_cat(ARRAY[1,2], ARRAY[3,4,5.0]);
array_cat
-----------------------
["1.0","2.0","3.0","4.0","5.0"]
(1 row)
4.1.8 - ARRAY_CONTAINS
Returns true if the specified element is found in the array and false if not. Both arguments must be non-null, but the array may be empty.
Deprecated
This function has been renamed to CONTAINS.
4.1.9 - ARRAY_DIMS
Returns the dimensionality of the input array.
Behavior type
Immutable
Syntax
ARRAY_DIMS(array)
Arguments
array
- Target array
Examples
=> SELECT array_dims(ARRAY[[1,2],[2,3]]);
array_dims
------------
2
(1 row)
4.1.10 - ARRAY_FIND
Returns the ordinal position of a specified element in an array, or -1 if not found. This function uses null-safe equality checks when testing elements.
Behavior type
Immutable
Syntax
ARRAY_FIND(array, { value | lambda-expression })
Arguments
array
- Target array.
value
- Value to search for; type must match or be coercible to the element type of the array.
lambda-expression
- Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second argument is the index of the element.
Examples
The function returns the position of the first occurrence of the specified element. The value is not required to be unique in the array:
=> SELECT array_find(ARRAY[1,2,7,5,7],7);
array_find
------------
2
(1 row)
The function returns -1 if the specified element is not found:
=> SELECT array_find(ARRAY[1,3,5,7],4);
array_find
------------
-1
(1 row)
You can search for complex element types:
=> SELECT ARRAY_FIND(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,2,3]);
ARRAY_FIND
------------
0
(1 row)
=> SELECT ARRAY_FIND(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,null,4]);
ARRAY_FIND
------------
1
(1 row)
The second example, comparing arrays with null elements, finds a match because ARRAY_FIND uses a null-safe equality check when evaluating elements.
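A null-safe search of this kind can be sketched in Python (an illustration outside Vertica, not its implementation). The helper null_safe_equal and the function array_find are hypothetical names; NULL is modeled as None and nested arrays as lists. Python's == already treats None as equal to None, so the helper only spells out the rule that ordinary SQL equality does not follow.

```python
def null_safe_equal(a, b):
    """Treat None as equal to None; compare nested lists element by element."""
    if a is None or b is None:
        return a is None and b is None
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            null_safe_equal(x, y) for x, y in zip(a, b)
        )
    return a == b

def array_find(array, value):
    """Return the 0-based position of the first matching element, or -1."""
    for position, element in enumerate(array):
        if null_safe_equal(element, value):
            return position
    return -1

nested = [[1, 2, 3], [1, None, 4]]
print(array_find(nested, [1, None, 4]))
print(array_find([1, 2, 7, 5, 7], 7))
```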
Lambdas
Consider a table of departments where each department has an array of ROW elements representing employees. The following example searches for a specific employee name in those records. The results show that Alice works (or has worked) for two departments:
=> SELECT deptID, ARRAY_FIND(employees, e -> e.name = 'Alice Adams') AS 'has_alice'
FROM departments;
deptID | has_alice
--------+-----------
1 | 0
2 | -1
3 | 0
(3 rows)
In the following example, each person in the table has an array of email addresses, and the function locates fake addresses. The function takes one argument, the array element to test, and calls a regular-expression function that returns a Boolean:
=> SELECT name, ARRAY_FIND(email, e -> REGEXP_LIKE(e,'example.com','i'))
AS 'example.com'
FROM people;
name | example.com
----------------+-------------
Elaine Jackson | -1
Frank Adams | 0
Lee Jones | -1
M Smith | 0
(4 rows)
4.1.11 - CONTAINS
Returns true if the specified element is found in the collection and false if not. This function uses null-safe equality checks when testing elements.
Behavior type
Immutable
Syntax
CONTAINS(collection, { value | lambda-expression })
Arguments
collection
- Target collection (ARRAY or SET).
value
- Value to search for; type must match or be coercible to the element type of the collection.
lambda-expression
- Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second argument is the index of the element.
Examples
=> SELECT CONTAINS(SET[1,2,3,4],2);
contains
----------
t
(1 row)
You can search for NULL as an element value:
=> SELECT CONTAINS(ARRAY[1,null,2],null);
contains
----------
t
(1 row)
You can search for complex element types:
=> SELECT CONTAINS(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,2,3]);
CONTAINS
----------
t
(1 row)
=> SELECT CONTAINS(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,null,4]);
CONTAINS
----------
t
(1 row)
The second example, comparing arrays with null elements, returns true because CONTAINS uses a null-safe equality check when evaluating elements.
In the following example, the orders table has the following definition:
=> CREATE EXTERNAL TABLE orders(
orderid int,
accountid int,
shipments Array[
ROW(
shipid int,
address ROW(
street varchar,
city varchar,
zip int
),
shipdate date
)
]
) AS COPY FROM '...' PARQUET;
The following query tests for a specific order. When passing a ROW literal as the second argument, cast any ambiguous fields to ensure that the types match:
=> SELECT CONTAINS(shipments,
ROW(1,ROW('911 San Marcos St'::VARCHAR,
'Austin'::VARCHAR, 73344),
'2020-11-05'::DATE))
FROM orders;
CONTAINS
----------
t
f
f
(3 rows)
Lambdas
Consider a table of departments where each department has an array of ROW elements representing employees. The following query finds departments with early hires (low employee IDs):
=> SELECT deptID FROM departments
WHERE CONTAINS(employees, e -> e.id < 20);
deptID
--------
1
3
(2 rows)
In the following example, a schedules table includes an array of events, where each event is a ROW with several fields:
=> CREATE TABLE schedules
(guest VARCHAR,
events ARRAY[ROW(e_date DATE, e_name VARCHAR, price NUMERIC(8,2))]);
You can use the CONTAINS function with a lambda expression to find people who have more than one event on the same day. The second argument, idx, is the index of the current element:
=> SELECT guest FROM schedules
WHERE CONTAINS(events, (e, idx) ->
(idx < ARRAY_LENGTH(events) - 1)
AND (e.e_date = events[idx + 1].e_date));
guest
-------------
Alice Adams
(1 row)
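The index-aware lambda in this query can be mirrored in Python (an illustration outside Vertica, not its implementation). The functions contains and has_same_day_events and the sample events are hypothetical. Like the query, the check compares each event only with the next one, so it finds same-day events only when they are adjacent in the array, for example when the array is ordered by date.

```python
from datetime import date

def contains(collection, predicate):
    """True if predicate(element, index) holds for at least one element."""
    return any(predicate(e, idx) for idx, e in enumerate(collection))

def has_same_day_events(events):
    """True if some event shares its date with the event that follows it."""
    return contains(
        events,
        lambda e, idx: idx < len(events) - 1 and e[0] == events[idx + 1][0],
    )

# Hypothetical events for one guest: (e_date, e_name, price).
events = [
    (date(2024, 5, 1), "Breakfast", 12.00),
    (date(2024, 5, 2), "Workshop", 50.00),
    (date(2024, 5, 2), "Dinner", 35.00),
]
print(has_same_day_events(events))
```

The bound check on idx comes first so that the lookup of the following element never runs past the end of the array.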
4.1.12 - EXPLODE
Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element. For each exploded collection, the results include two columns, one for the element index, and one for the value at that position. If the function explodes a single collection, these columns are named position and value by default. If the function explodes two or more collections, the columns for each collection are named pos_column-name and val_column-name. You can use an AS clause in the SELECT to change these column names.
EXPLODE is similar to UNNEST, which returns values but not positions.
By default, EXPLODE requires an OVER clause. If you set the skip_partitioning parameter to true, an OVER clause is not required and is ignored if present.
Behavior type
Immutable
Syntax
EXPLODE (column[,...] [USING PARAMETERS param=value])
[ OVER ( [window-partition-clause] ) ]
Arguments
column
- Column in the table being queried. You must specify at least as many collection columns as the value of the explode_count parameter. Columns that are not collections are passed through without modification.
Passthrough columns are not needed if skip_partitioning is true.
OVER(...)
- How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses. For EXPLODE, use OVER() or OVER(PARTITION BEST).
This clause is ignored if skip_partitioning is true.
Parameters
explode_count
- The number of collection columns to explode. The function checks each column, up to this value, and either explodes it if it is a collection or passes it through if it is not a collection or if this limit has been reached. If the value of explode_count is greater than the number of collection columns specified, the function returns an error.
Default: 1
skip_partitioning
- Whether to skip partitioning and ignore the OVER clause if present. EXPLODE translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation. Even so, setting this parameter can negatively affect performance in rare cases.
Default: false
Null-handling
This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column:
=> SELECT EXPLODE(ARRAY[1,2,null,4]) OVER();
position | value
----------+-------
0 | 1
1 | 2
2 |
3 | 4
(4 rows)
=> SELECT EXPLODE(ARRAY[]::ARRAY[INT]) OVER();
position | value
----------+-------
(0 rows)
=> SELECT EXPLODE(NULL::ARRAY[INT]) OVER();
position | value
----------+-------
(0 rows)
Joining on results
To use JOIN with this function you must set the skip_partitioning parameter, either in the function call or as a session parameter.
You can use the output of this function as if it were a relation by using CROSS JOIN or LEFT JOIN LATERAL in a query. Other JOIN types are not supported.
Consider the following table of students and exam scores:
=> SELECT * FROM tests;
student | scores | questions
---------+---------------+-----------------
Bob | [92,78,79] | [20,20,100]
Lee | |
Pat | [] | []
Sam | [97,98,85] | [20,20,100]
Tom | [68,75,82,91] | [20,20,100,100]
(5 rows)
The following query finds the best test scores across all students who have scores:
=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning = true;
=> SELECT student, score FROM tests
CROSS JOIN EXPLODE(scores) AS t (pos, score)
ORDER BY score DESC;
student | score
---------+-------
Sam | 98
Sam | 97
Bob | 92
Tom | 91
Sam | 85
Tom | 82
Bob | 79
Bob | 78
Tom | 75
Tom | 68
(10 rows)
The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:
=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
CROSS JOIN EXPLODE(scores, questions USING PARAMETERS explode_count=2)
AS t(pos_s, score, pos_q, qcount)
GROUP BY student;
student | MAX | AVG
---------+----------------------+------------------
Bob | 4.600000000000000000 | 3.04333333333333
Sam | 4.900000000000000000 | 3.42222222222222
Tom | 4.550000000000000000 | 2.37
(3 rows)
These queries produce results for three of the five students. One student has a null value for scores and another has an empty array. These rows are not included in the function's output.
To include null and empty arrays in output, use LEFT JOIN LATERAL instead of CROSS JOIN:
=> SELECT student, MIN(score), AVG(score) FROM tests
LEFT JOIN LATERAL EXPLODE(scores) AS t (pos, score)
GROUP BY student;
student | MIN | AVG
---------+-----+------------------
Bob | 78 | 83
Lee | |
Pat | |
Sam | 85 | 93.3333333333333
Tom | 68 | 79
(5 rows)
The LATERAL keyword is required with LEFT JOIN. It is optional for CROSS JOIN.
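The join examples above set skip_partitioning for the whole session. As a sketch of the per-call alternative, the parameter can instead be passed in the function call. This assumes the same tests table and should return the same rows as the first query in this section:

```sql
-- Sketch: set skip_partitioning in the call rather than with ALTER SESSION.
SELECT student, score
FROM tests
CROSS JOIN EXPLODE(scores USING PARAMETERS skip_partitioning=true) AS t (pos, score)
ORDER BY score DESC;
```

Setting the parameter per call keeps the change local to one query, which can be preferable when other queries in the session rely on the default.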
Examples
Consider an orders table with the following contents:
=> SELECT orderkey, custkey, prodkey, orderprices, email_addrs
FROM orders LIMIT 5;
orderkey | custkey | prodkey | orderprices | email_addrs
------------+---------+-----------------------------------------------+-----------------------------------+----------------------------------------------------------------------------------------------------------------
113-341987 | 342799 | ["MG-7190 ","VA-4028 ","EH-1247 ","MS-7018 "] | ["60.00","67.00","22.00","14.99"] | ["bob@example,com","robert.jones@example.com"]
111-952000 | 342845 | ["ID-2586 ","IC-9010 ","MH-2401 ","JC-1905 "] | ["22.00","35.00",null,"12.00"] | ["br92@cs.example.edu"]
111-345634 | 342536 | ["RS-0731 ","SJ-2021 "] | ["50.00",null] | [null]
113-965086 | 342176 | ["GW-1808 "] | ["108.00"] | ["joe.smith@example.com"]
111-335121 | 342321 | ["TF-3556 "] | ["50.00"] | ["789123@example-isp.com","alexjohnson@example.com","monica@eng.example.com","sara@johnson.example.name",null]
(5 rows)
The following query explodes the order prices for a single customer. The other two columns are passed through and are repeated for each returned row:
=> SELECT EXPLODE(orderprices, custkey, email_addrs
USING PARAMETERS skip_partitioning=true)
AS (position, orderprices, custkey, email_addrs)
FROM orders WHERE custkey='342845' ORDER BY orderprices;
position | orderprices | custkey | email_addrs
----------+-------------+---------+-------------------------
2 | | 342845 | ["br92@cs.example.edu"]
3 | 12.00 | 342845 | ["br92@cs.example.edu"]
0 | 22.00 | 342845 | ["br92@cs.example.edu"]
1 | 35.00 | 342845 | ["br92@cs.example.edu"]
(4 rows)
The previous example uses the skip_partitioning parameter. Instead of setting it for each call to EXPLODE, you can set it as a session parameter. EXPLODE is part of the ComplexTypesLib UDx library. The following example returns the same results:
=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning=true;
=> SELECT EXPLODE(orderprices, custkey, email_addrs)
AS (position, orderprices, custkey, email_addrs)
FROM orders WHERE custkey='342845' ORDER BY orderprices;
You can explode more than one column by specifying the explode_count parameter:
=> SELECT EXPLODE(orderkey, prodkey, orderprices
USING PARAMETERS explode_count=2, skip_partitioning=true)
AS (orderkey,pk_idx,pk_val,ord_idx,ord_val)
FROM orders
WHERE orderkey='113-341987';
orderkey | pk_idx | pk_val | ord_idx | ord_val
------------+--------+----------+---------+---------
113-341987 | 0 | MG-7190 | 0 | 60.00
113-341987 | 0 | MG-7190 | 1 | 67.00
113-341987 | 0 | MG-7190 | 2 | 22.00
113-341987 | 0 | MG-7190 | 3 | 14.99
113-341987 | 1 | VA-4028 | 0 | 60.00
113-341987 | 1 | VA-4028 | 1 | 67.00
113-341987 | 1 | VA-4028 | 2 | 22.00
113-341987 | 1 | VA-4028 | 3 | 14.99
113-341987 | 2 | EH-1247 | 0 | 60.00
113-341987 | 2 | EH-1247 | 1 | 67.00
113-341987 | 2 | EH-1247 | 2 | 22.00
113-341987 | 2 | EH-1247 | 3 | 14.99
113-341987 | 3 | MS-7018 | 0 | 60.00
113-341987 | 3 | MS-7018 | 1 | 67.00
113-341987 | 3 | MS-7018 | 2 | 22.00
113-341987 | 3 | MS-7018 | 3 | 14.99
(16 rows)
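The previous query renames every output column with an AS clause. As a sketch of the default naming rule described at the top of this page, the same query without the AS clause should label the exploded columns pos_prodkey, val_prodkey, pos_orderprices, and val_orderprices, while the passthrough column keeps the name orderkey:

```sql
-- Sketch: same query as above, relying on the default pos_/val_ column names.
SELECT EXPLODE(orderkey, prodkey, orderprices
               USING PARAMETERS explode_count=2, skip_partitioning=true)
FROM orders
WHERE orderkey='113-341987';
```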
The following example uses a multi-dimensional array:
=> SELECT name, pingtimes FROM network_tests;
name | pingtimes
------+-------------------------------------------------------
eng1 | [[24.24,25.27,27.16,24.97],[23.97,25.01,28.12,29.5]]
eng2 | [[27.12,27.91,28.11,26.95],[29.01,28.99,30.11,31.56]]
qa1 | [[23.15,25.11,24.63,23.91],[22.85,22.86,23.91,31.52]]
(3 rows)
=> SELECT EXPLODE(name, pingtimes USING PARAMETERS explode_count=1) OVER()
FROM network_tests;
name | position | value
------+----------+---------------------------
eng1 | 0 | [24.24,25.27,27.16,24.97]
eng1 | 1 | [23.97,25.01,28.12,29.5]
eng2 | 0 | [27.12,27.91,28.11,26.95]
eng2 | 1 | [29.01,28.99,30.11,31.56]
qa1 | 0 | [23.15,25.11,24.63,23.91]
qa1 | 1 | [22.85,22.86,23.91,31.52]
(6 rows)
You can rewrite the previous query as follows to produce the same results:
=> SELECT name, EXPLODE(pingtimes USING PARAMETERS skip_partitioning=true)
FROM network_tests;
4.1.13 - FILTER
Takes an input array and returns an array containing only elements that meet a specified condition. This function uses null-safe equality checks when testing elements.
Behavior type
Immutable
Syntax
FILTER(array, lambda-expression )
Arguments
array
- Input array.
lambda-expression
- Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second argument is the index of the element.
Examples
Given a table that contains names and arrays of email addresses, the following query filters out fake email addresses and returns the rest:
=> SELECT name, FILTER(email, e -> NOT REGEXP_LIKE(e,'example.com','i')) AS 'real_email'
FROM people;
name | real_email
----------------+-------------------------------------------------
Elaine Jackson | ["ejackson@somewhere.org","elaine@jackson.com"]
Frank Adams | []
Lee Jones | ["lee.jones@somewhere.org"]
M Smith | ["ms@msmith.com"]
(4 rows)
You can use the results in a WHERE clause to exclude rows that no longer contain any email addresses:
=> SELECT name, FILTER(email, e -> NOT REGEXP_LIKE(e,'example.com','i')) AS 'real_email'
FROM people
WHERE ARRAY_LENGTH(real_email) > 0;
name | real_email
----------------+-------------------------------------------------
Elaine Jackson | ["ejackson@somewhere.org","elaine@jackson.com"]
Lee Jones | ["lee.jones@somewhere.org"]
M Smith | ["ms@msmith.com"]
(3 rows)
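The examples above use only the element argument. The following sketch uses the optional index argument to keep just the first address in each array. It assumes the same people table and assumes that indexes start at 0, as array positions do elsewhere in this reference; the alias first_email is illustrative:

```sql
-- Sketch: the second lambda argument (idx) is the element's index.
SELECT name, FILTER(email, (e, idx) -> idx = 0) AS first_email
FROM people;
```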
4.1.14 - IMPLODE
Takes a column of any scalar type and returns an unbounded array. Combined with GROUP BY, this function can be used to reverse an EXPLODE operation.
Behavior type
- Immutable if the WITHIN GROUP ORDER BY clause specifies a column or set of columns that resolves to unique element values within each output array group.
- Volatile otherwise because results are non-commutative.
Syntax
IMPLODE (input-column [ USING PARAMETERS param=value[,...] ] )
[ within-group-order-by-clause ]
Arguments
input-column
- Column of any scalar type from which to create the array.
within-group-order-by-clause
- Sorts elements within each output array group:
WITHIN GROUP (ORDER BY { column-expression[ sort-qualifiers ] }[,...])
sort-qualifiers: { ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] }
Tip
WITHIN GROUP ORDER BY can consume a large amount of memory per group. To minimize memory consumption, create projections that support GROUPBY PIPELINED.
Parameters
allow_truncate
- Boolean. If true, truncates results when the output length exceeds the column size. If false (the default), the function returns an error if the output array is too large.
Even if this parameter is set to true, IMPLODE returns an error if any single array element is too large. Truncation removes elements from the output array but does not alter individual elements.
max_binary_size
- The maximum binary size in bytes for the returned array. If you omit this parameter, IMPLODE uses the value of the configuration parameter DefaultArrayBinarySize.
Examples
Consider a table with the following contents:
=> SELECT * FROM filtered;
position | itemprice | itemkey
----------+-----------+---------
0 | 14.99 | 345
0 | 27.99 | 567
1 | 18.99 | 567
1 | 35.99 | 345
2 | 14.99 | 123
(5 rows)
The following query calls IMPLODE to assemble prices into arrays (grouped by keys):
=> SELECT itemkey AS key, IMPLODE(itemprice) AS prices
FROM filtered GROUP BY itemkey ORDER BY itemkey;
key | prices
-----+-------------------
123 | ["14.99"]
345 | ["35.99","14.99"]
567 | ["27.99","18.99"]
(3 rows)
You can modify this query by including a WITHIN GROUP ORDER BY clause, which specifies how to sort array elements within each group:
=> SELECT itemkey AS key, IMPLODE(itemprice) WITHIN GROUP (ORDER BY itemprice) AS prices
FROM filtered GROUP BY itemkey ORDER BY itemkey;
key | prices
-----+-------------------
123 | ["14.99"]
345 | ["14.99","35.99"]
567 | ["18.99","27.99"]
(3 rows)
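The examples above rely on the default parameter values. The following sketch passes both parameters explicitly, using the syntax shown earlier on this page. The 100-byte limit is an arbitrary value chosen for illustration:

```sql
-- Sketch: cap the size of each output array and truncate rather than
-- return an error when a group's array would exceed that size.
SELECT itemkey AS key,
       IMPLODE(itemprice USING PARAMETERS allow_truncate=true, max_binary_size=100)
           WITHIN GROUP (ORDER BY itemprice) AS prices
FROM filtered
GROUP BY itemkey
ORDER BY itemkey;
```

Truncation drops whole elements from the end of the array, so combining it with WITHIN GROUP ORDER BY makes it predictable which elements survive.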
See Arrays and sets (collections) for a fuller example.
4.1.15 - SET_UNION
Returns a SET containing all elements of two input sets.
If the inputs are both bounded, the bound for the result is the sum of the bounds of the inputs.
If any input is unbounded, the result is unbounded with a binary size that is the sum of the sizes of the inputs.
Behavior type
Immutable
Syntax
SET_UNION(set1,set2)
Arguments
set1, set2
- Sets of matching element type
Null-handling
- Null arguments are ignored. If one of the inputs is null, the function returns the non-null input. In other words, an argument of NULL is equivalent to SET[].
- If both inputs are null, the function returns null.
Examples
=> SELECT SET_UNION(SET[1,2,4], SET[2,3,4,5.9]);
set_union
-----------------------
["1.0","2.0","3.0","4.0","5.9"]
(1 row)
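The null-handling rules above can be sketched with two calls. The NULL::SET[INT] cast is an assumption, modeled on the NULL::ARRAY[INT] cast used elsewhere in this reference. Per the rules, the first call should return the non-null input and the second should return null:

```sql
-- Sketch: one null input, then two null inputs.
SELECT SET_UNION(SET[1,2,4], NULL::SET[INT]);
SELECT SET_UNION(NULL::SET[INT], NULL::SET[INT]);
```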
4.1.16 - STRING_TO_ARRAY
Splits a string containing array values and returns a native one-dimensional array. The output does not include the "ARRAY" keyword. This function does not support nested (multi-dimensional) arrays.
This function returns array elements as strings by default. You can cast to other types, as in the following example:
=> SELECT STRING_TO_ARRAY('[1,2,3]')::ARRAY[INT];
Behavior type
Immutable
Syntax
STRING_TO_ARRAY(string [USING PARAMETERS param=value[,...]])
The following syntax is deprecated:
STRING_TO_ARRAY(string, delimiter)
Arguments
string
- String representation of a one-dimensional array; can be a VARCHAR or LONG VARCHAR column, a literal string, or the string output of an expression.
Spaces in the string are removed unless elements are individually quoted. For example, ' a,b,c' is equivalent to 'a,b,c'. To preserve the space, use '" a","b","c"'.
Parameters
These parameters behave the same way as the corresponding options when loading delimited data (see DELIMITED).
No parameter may have the same value as any other parameter.
collection_delimiter
- The character or character sequence used to separate array elements (VARCHAR(8)). You can use any ASCII values in the range E'\000' to E'\177', inclusive.
Default: Comma (',').
collection_open, collection_close
- The characters that mark the beginning and end of the array (VARCHAR(8)). It is an error to use these characters elsewhere within the list of elements without escaping them. These characters can be omitted from the input string.
Default: Square brackets ('[' and ']').
collection_null_element
- The string representing a null element value (VARCHAR(65000)). You can specify a null value using any ASCII values in the range E'\001' to E'\177' inclusive (any ASCII value except NULL: E'\000').
Default: 'null'
collection_enclose
- An optional quote character within which to enclose individual elements, allowing delimiter characters to be embedded in string values. You can choose any ASCII value in the range E'\001' to E'\177' inclusive (any ASCII character except NULL: E'\000'). Elements do not need to be enclosed by this value.
Default: double quote ('"')
Examples
The function uses comma as the default delimiter. You can specify a different value:
=> SELECT STRING_TO_ARRAY('[1,3,5]');
STRING_TO_ARRAY
-----------------
["1","3","5"]
(1 row)
=> SELECT STRING_TO_ARRAY('[t|t|f|t]' USING PARAMETERS collection_delimiter = '|');
STRING_TO_ARRAY
-------------------
["t","t","f","t"]
(1 row)
The bounding brackets are optional:
=> SELECT STRING_TO_ARRAY('t|t|f|t' USING PARAMETERS collection_delimiter = '|');
STRING_TO_ARRAY
-------------------
["t","t","f","t"]
(1 row)
The input can use other characters for open and close:
=> SELECT STRING_TO_ARRAY('{NASA-1683,NASA-7867,SPX-76}' USING PARAMETERS collection_open = '{', collection_close = '}');
STRING_TO_ARRAY
------------------------------------
["NASA-1683","NASA-7867","SPX-76"]
(1 row)
By default the string 'null' in input is treated as a null value:
=> SELECT STRING_TO_ARRAY('{"us-1672",null,"darpa-1963"}' USING PARAMETERS collection_open = '{', collection_close = '}');
STRING_TO_ARRAY
-------------------------------
["us-1672",null,"darpa-1963"]
(1 row)
In the following example, the input comes from a column:
=> SELECT STRING_TO_ARRAY(name USING PARAMETERS collection_delimiter=' ') FROM employees;
STRING_TO_ARRAY
-----------------------
["Howard","Wolowitz"]
["Sheldon","Cooper"]
(2 rows)
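None of the examples above changes the null marker. The following sketch combines collection_delimiter with collection_null_element, using 'none' as a hypothetical null marker chosen for illustration. Per the parameter descriptions, the second element should be read as a null value rather than as the string 'none':

```sql
-- Sketch: semicolon-delimited input with a custom null marker.
-- The two parameter values differ, as the Parameters section requires.
SELECT STRING_TO_ARRAY('[10;none;30]'
       USING PARAMETERS collection_delimiter=';', collection_null_element='none');
```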
4.1.17 - TO_JSON
Returns the JSON representation of a complex-type argument, including mixed and nested complex types. This is the same format that queries of complex-type columns return.
Behavior type
Immutable
Syntax
TO_JSON(value)
Arguments
value
- Column or literal of a complex type
Examples
These examples query the following table:
=> SELECT name, contact FROM customers;
name | contact
--------------------+-----------------------------------------------------------------------------------------------------------------------
Missy Cooper | {"street":"911 San Marcos St","city":"Austin","zipcode":73344,"email":["missy@mit.edu","mcooper@cern.gov"]}
Sheldon Cooper | {"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001,"email":["shelly@meemaw.name","cooper@caltech.edu"]}
Leonard Hofstadter | {"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001,"email":["hofstadter@caltech.edu"]}
Leslie Winkle | {"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001,"email":[]}
Raj Koothrappali | {"street":null,"city":"Pasadena","zipcode":91001,"email":["raj@available.com"]}
Stuart Bloom |
(6 rows)
You can call TO_JSON on a column or on specific fields or array elements:
=> SELECT TO_JSON(contact) FROM customers;
to_json
-----------------------------------------------------------------------------------------------------------------------
{"street":"911 San Marcos St","city":"Austin","zipcode":73344,"email":["missy@mit.edu","mcooper@cern.gov"]}
{"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001,"email":["shelly@meemaw.name","cooper@caltech.edu"]}
{"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001,"email":["hofstadter@caltech.edu"]}
{"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001,"email":[]}
{"street":null,"city":"Pasadena","zipcode":91001,"email":["raj@available.com"]}
(6 rows)
=> SELECT TO_JSON(contact.email) FROM customers;
to_json
---------------------------------------------
["missy@mit.edu","mcooper@cern.gov"]
["shelly@meemaw.name","cooper@caltech.edu"]
["hofstadter@caltech.edu"]
[]
["raj@available.com"]
(6 rows)
When calling TO_JSON with a SET, note that duplicates are removed and elements can be reordered:
=> SELECT TO_JSON(SET[1683,7867,76,76]);
TO_JSON
----------------
[76,1683,7867]
(1 row)
4.1.18 - UNNEST
Expands the elements of one or more collection columns (ARRAY or SET) into individual rows. UNNEST is similar to EXPLODE, but UNNEST returns only the elements, while EXPLODE returns elements and their positions.
If called with a single array, UNNEST returns the elements in a column named value. If called with two or more arrays, it returns columns named val_column-name. You can use an AS clause in the SELECT to change these names.
By default, UNNEST does not partition its input and ignores an OVER() clause if present.
Behavior type
Immutable
Syntax
UNNEST (column[,...] [USING PARAMETERS param=value])
[ OVER ( [window-partition-clause] ) ]
Arguments
column
- Collection column in the table being queried.
OVER(...)
- How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses.
This clause only applies if skip_partitioning is false.
Parameters
skip_partitioning
- Whether to skip partitioning and ignore the OVER clause if present. UNNEST translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation.
Default: true
Null-handling
This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column:
=> SELECT UNNEST(ARRAY[1,2,null,4]) OVER();
value
-------
1
2
4
(4 rows)
=> SELECT UNNEST(ARRAY[]::ARRAY[INT]) OVER();
value
-------
(0 rows)
=> SELECT UNNEST(NULL::ARRAY[INT]) OVER();
value
-------
(0 rows)
Joining on results
You can use the output of this function as if it were a relation by using CROSS JOIN or LEFT JOIN LATERAL in a query. Other JOIN types are not supported.
Consider the following table of students and exam scores:
=> SELECT * FROM tests;
student | scores | questions
---------+---------------+-----------------
Bob | [92,78,79] | [20,20,100]
Lee | |
Pat | [] | []
Sam | [97,98,85] | [20,20,100]
Tom | [68,75,82,91] | [20,20,100,100]
(5 rows)
The following query finds the best test scores across all students who have scores:
=> SELECT student, score FROM tests
CROSS JOIN UNNEST(scores) AS t (score)
ORDER BY score DESC;
student | score
---------+-------
Sam | 98
Sam | 97
Bob | 92
Tom | 91
Sam | 85
Tom | 82
Bob | 79
Bob | 78
Tom | 75
Tom | 68
(10 rows)
The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:
=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
CROSS JOIN UNNEST(scores, questions) AS t(score, qcount)
GROUP BY student;
student | MAX | AVG
---------+----------------------+------------------
Bob | 4.600000000000000000 | 3.04333333333333
Sam | 4.900000000000000000 | 3.42222222222222
Tom | 4.550000000000000000 | 2.37
(3 rows)
These queries produce results for three of the five students. One student has a null value for scores and another has an empty array. These rows are not included in the function's output.
To include null and empty arrays in output, use LEFT JOIN LATERAL instead of CROSS JOIN:
=> SELECT student, MIN(score), AVG(score) FROM tests
LEFT JOIN LATERAL UNNEST(scores) AS t (score)
GROUP BY student;
student | MIN | AVG
---------+-----+------------------
Bob | 78 | 83
Lee | |
Pat | |
Sam | 85 | 93.3333333333333
Tom | 68 | 79
(5 rows)
The LATERAL keyword is required with LEFT JOIN. It is optional for CROSS JOIN.
Examples
Consider a table with the following definition:
=> CREATE TABLE orders (
orderkey VARCHAR, custkey INT,
prodkey ARRAY[VARCHAR], orderprices ARRAY[DECIMAL(12,2)],
email_addrs ARRAY[VARCHAR]);
The following query expands one of the array columns. One of the elements is null:
=> SELECT UNNEST(orderprices) AS price, custkey, email_addrs
FROM orders WHERE custkey='342845' ORDER BY price;
price | custkey | email_addrs
-------+---------+-------------------------
| 342845 | ["br92@cs.example.edu"]
12.00 | 342845 | ["br92@cs.example.edu"]
22.00 | 342845 | ["br92@cs.example.edu"]
35.00 | 342845 | ["br92@cs.example.edu"]
(4 rows)
UNNEST can expand more than one column:
=> SELECT orderkey, UNNEST(prodkey, orderprices)
FROM orders WHERE orderkey='113-341987';
orderkey | val_prodkey | val_orderprices
------------+-------------+-----------------
113-341987 | MG-7190 | 60.00
113-341987 | MG-7190 | 67.00
113-341987 | MG-7190 | 22.00
113-341987 | MG-7190 | 14.99
113-341987 | VA-4028 | 60.00
113-341987 | VA-4028 | 67.00
113-341987 | VA-4028 | 22.00
113-341987 | VA-4028 | 14.99
113-341987 | EH-1247 | 60.00
113-341987 | EH-1247 | 67.00
113-341987 | EH-1247 | 22.00
113-341987 | EH-1247 | 14.99
113-341987 | MS-7018 | 60.00
113-341987 | MS-7018 | 67.00
113-341987 | MS-7018 | 22.00
113-341987 | MS-7018 | 14.99
(16 rows)
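The previous query uses the default val_column-name labels. As a sketch of the AS clause mentioned at the top of this page, the following variant renames the two expanded columns. The names product and price are illustrative, and the parenthesized form of the alias list is modeled on the EXPLODE examples:

```sql
-- Sketch: rename the expanded columns instead of using val_prodkey
-- and val_orderprices.
SELECT orderkey, UNNEST(prodkey, orderprices) AS (product, price)
FROM orders
WHERE orderkey='113-341987';
```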
4.2 - Date/time functions
Date and time functions perform conversion, extraction, or manipulation operations on date and time data types and can return date and time information.
Usage
Functions that take TIME or TIMESTAMP inputs come in two variants:
- TIME WITH TIME ZONE or TIMESTAMP WITH TIME ZONE
- TIME WITHOUT TIME ZONE or TIMESTAMP WITHOUT TIME ZONE
For brevity, these variants are not shown separately.
The + and * operators come in commutative pairs; for example, both DATE + INTEGER and INTEGER + DATE. We show only one of each such pair.
Daylight saving time considerations
When adding an INTERVAL value to (or subtracting an INTERVAL value from) a TIMESTAMP WITH TIME ZONE value, the days component advances (or decrements) the date of the TIMESTAMP WITH TIME ZONE by the indicated number of days. Across daylight saving time changes (with the session time zone set to a time zone that recognizes DST), this means INTERVAL '1 day' does not necessarily equal INTERVAL '24 hours'.
For example, with the session time zone set to CST7CDT:
TIMESTAMP WITH TIME ZONE '2014-04-02 12:00-07' + INTERVAL '1 day'
produces
TIMESTAMP WITH TIME ZONE '2014-04-03 12:00-06'
Adding INTERVAL '24 hours' to the same initial TIMESTAMP WITH TIME ZONE produces
TIMESTAMP WITH TIME ZONE '2014-04-03 13:00-06'
This result occurs because there is a change in daylight saving time at 2014-04-03 02:00 in time zone CST7CDT.
Date/time functions in transactions
Certain date/time functions such as CURRENT_TIMESTAMP and NOW return the start time of the current transaction; for the duration of that transaction, they return the same value. Other date/time functions such as TIMEOFDAY always return the current time.
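The difference can be observed with a sketch like the following, which assumes a session where you can open an explicit transaction. CURRENT_TIMESTAMP should return the same value in both queries, while CLOCK_TIMESTAMP, which reads the system clock on every call, should advance:

```sql
-- Sketch: compare a transaction-start function with a wall-clock function.
BEGIN;
SELECT CURRENT_TIMESTAMP AS txn_start, CLOCK_TIMESTAMP() AS wall_clock;
-- Run other statements here, then repeat the query.
SELECT CURRENT_TIMESTAMP AS txn_start, CLOCK_TIMESTAMP() AS wall_clock;
COMMIT;
```

Use the transaction-start functions when every row written by a transaction should carry the same timestamp, and the wall-clock functions when measuring elapsed time.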
See also
Template patterns for date/time formatting
4.2.1 - ADD_MONTHS
Adds the specified number of months to a date and returns the sum as a DATE. In general, ADD_MONTHS returns a date with the same day component as the start date. For example:
=> SELECT ADD_MONTHS ('2015-09-15'::date, -2) "2 Months Ago";
2 Months Ago
--------------
2015-07-15
(1 row)
Two exceptions apply:
- If the start date's day component is greater than the last day of the result month, ADD_MONTHS returns the last day of the result month. For example:
=> SELECT ADD_MONTHS ('31-Jan-2016'::TIMESTAMP, 1) "Leap Month";
Leap Month
------------
2016-02-29
(1 row)
- If the start date's day component is the last day of that month, and the result month has more days than the start date month, ADD_MONTHS returns the last day of the result month. For example:
=> SELECT ADD_MONTHS ('2015-09-30'::date,-1) "1 Month Ago";
1 Month Ago
-------------
2015-08-31
(1 row)
Behavior type
Syntax
ADD_MONTHS ( start-date, num-months );
Parameters
start-date
- The date to process, an expression that evaluates to one of the following data types:
- DATE
- TIMESTAMP
- TIMESTAMPTZ
num-months
- An integer expression that specifies the number of months to add to or subtract from start-date.
Examples
Add one month to the current date:
=> SELECT CURRENT_DATE Today;
Today
------------
2016-05-05
(1 row)
=> SELECT ADD_MONTHS(CURRENT_TIMESTAMP,1);
ADD_MONTHS
------------
2016-06-05
(1 row)
Subtract four months from the current date:
=> SELECT ADD_MONTHS(CURRENT_TIMESTAMP, -4);
ADD_MONTHS
------------
2016-01-05
(1 row)
Add one month to January 31 2016:
=> SELECT ADD_MONTHS('31-Jan-2016'::TIMESTAMP, 1) "Leap Month";
Leap Month
------------
2016-02-29
(1 row)
The following example sets the time zone to EST. It then adds 24 months to a TIMESTAMPTZ that specifies a PST time zone, so ADD_MONTHS takes the time difference into account:
=> SET TIME ZONE 'America/New_York';
SET
=> SELECT ADD_MONTHS('2008-02-29 23:30 PST'::TIMESTAMPTZ, 24);
ADD_MONTHS
------------
2010-03-01
(1 row)
4.2.2 - AGE_IN_MONTHS
Returns the difference in months between two dates, expressed as an integer.
Behavior type
Syntax
AGE_IN_MONTHS ( [ date1,] date2 )
Parameters
date1, date2
- Specify the boundaries of the period to measure. If you supply only one argument, Vertica sets date1 to the current date. Both parameters must evaluate to one of the following data types:
- DATE
- TIMESTAMP
- TIMESTAMPTZ
If date1 < date2, AGE_IN_MONTHS returns a negative value.
Examples
Get the age in months of someone born March 2 1972, as of June 21 1990:
=> SELECT AGE_IN_MONTHS('1990-06-21'::TIMESTAMP, '1972-03-02'::TIMESTAMP);
AGE_IN_MONTHS
---------------
219
(1 row)
If the first date is earlier than the second date, AGE_IN_MONTHS returns a negative value:
=> SELECT AGE_IN_MONTHS('1972-03-02'::TIMESTAMP, '1990-06-21'::TIMESTAMP);
AGE_IN_MONTHS
---------------
-220
(1 row)
Get the age in months of someone who was born November 21 1939, as of today:
=> SELECT AGE_IN_MONTHS ('1939-11-21'::DATE);
AGE_IN_MONTHS
---------------
930
(1 row)
4.2.3 - AGE_IN_YEARS
Returns the difference in years between two dates, expressed as an integer.
Behavior type
Syntax
AGE_IN_YEARS( [ date1,] date2 )
Parameters
date1, date2
- Specify the boundaries of the period to measure. If you supply only one argument, Vertica sets date1 to the current date. Both parameters must evaluate to one of the following data types:
- DATE
- TIMESTAMP
- TIMESTAMPTZ
If date1 < date2, AGE_IN_YEARS returns a negative value.
Examples
Get the age of someone born March 2 1972, as of June 21 1990:
=> SELECT AGE_IN_YEARS('1990-06-21'::TIMESTAMP, '1972-03-02'::TIMESTAMP);
AGE_IN_YEARS
--------------
18
(1 row)
If the first date is earlier than the second date, AGE_IN_YEARS returns a negative number:
=> SELECT AGE_IN_YEARS('1972-03-02'::TIMESTAMP, '1990-06-21'::TIMESTAMP);
AGE_IN_YEARS
--------------
-19
(1 row)
Get the age of someone who was born November 21 1939, as of today:
=> SELECT AGE_IN_YEARS('1939-11-21'::DATE);
AGE_IN_YEARS
--------------
77
(1 row)
4.2.4 - CLOCK_TIMESTAMP
Returns a value of type TIMESTAMP WITH TIMEZONE that represents the current system-clock time.
CLOCK_TIMESTAMP uses the date and time supplied by the operating system on the server to which you are connected, which should be the same across all servers. The value changes each time you call it.
Behavior type
Volatile
Syntax
CLOCK_TIMESTAMP()
Examples
The following command returns the current time on your system:
=> SELECT CLOCK_TIMESTAMP() "Current Time";
Current Time
------------------------------
2010-09-23 11:41:23.33772-04
(1 row)
Each time you call the function, you get a different result. The difference in this example is in microseconds:
=> SELECT CLOCK_TIMESTAMP() "Time 1", CLOCK_TIMESTAMP() "Time 2";
Time 1 | Time 2
-------------------------------+-------------------------------
2010-09-23 11:41:55.369201-04 | 2010-09-23 11:41:55.369202-04
(1 row)
4.2.5 - CURRENT_DATE
Returns the date (date-type value) on which the current transaction started.
Behavior type
Stable
Syntax
CURRENT_DATE()
Note
You can call this function without parentheses.
Examples
=> SELECT CURRENT_DATE;
?column?
------------
2010-09-23
(1 row)
4.2.6 - CURRENT_TIME
Returns a value of type TIME WITH TIMEZONE that represents the start of the current transaction.
The return value does not change during the transaction. Thus, multiple calls to CURRENT_TIME within the same transaction return the same time.
Behavior type
Stable
Syntax
CURRENT_TIME [ ( precision ) ]
Note
If you specify a column label without precision, you must also omit parentheses.
Parameters
precision
- An integer between 0 and 6, inclusive, that specifies the number of digits to which the seconds fraction is rounded.
Examples
=> SELECT CURRENT_TIME(1) AS Time;
Time
---------------
06:51:45.2-07
(1 row)
=> SELECT CURRENT_TIME(5) AS Time;
Time
-------------------
06:51:45.18435-07
(1 row)
4.2.7 - CURRENT_TIMESTAMP
Returns a value of type TIMESTAMP WITH TIME ZONE that represents the start of the current transaction.
The return value does not change during the transaction. Thus, multiple calls to CURRENT_TIMESTAMP within the same transaction return the same timestamp.
Behavior type
Stable
Syntax
CURRENT_TIMESTAMP [ ( precision ) ]
Parameters
precision
- An integer between 0 and 6 that specifies the number of fractional digits to which the seconds field is rounded.
Examples
=> SELECT CURRENT_TIMESTAMP(1) AS time;
time
--------------------------
2017-03-27 06:50:49.7-07
(1 row)
=> SELECT CURRENT_TIMESTAMP(5) AS time;
time
------------------------------
2017-03-27 06:50:49.69967-07
(1 row)
4.2.8 - DATE
Converts the input value to a DATE data type.
Behavior type
- Immutable if the input value is a TIMESTAMP, DATE, VARCHAR, or integer
- Stable if the input value is a TIMESTAMPTZ
Syntax
DATE ( value )
Parameters
value
- The value to convert, one of the following:
- TIMESTAMP, TIMESTAMPTZ, VARCHAR, or another DATE.
- Integer: Vertica treats the integer as a day number, where day 1 is 01/01/0001, and returns the corresponding date.
Examples
=> SELECT DATE (1);
DATE
------------
0001-01-01
(1 row)
=> SELECT DATE (734260);
DATE
------------
2011-05-03
(1 row)
=> SELECT DATE('TODAY');
DATE
------------
2016-12-07
(1 row)
4.2.9 - DATE_PART
Extracts a sub-field such as year or hour from a date/time expression, equivalent to the SQL-standard function EXTRACT.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or INTERVAL
- Stable if the specified date is a TIMESTAMPTZ
Syntax
DATE_PART ( 'field', date )
Parameters
field
- A constant value that specifies the sub-field to extract from date (see Field values below).
date
- The date to process, an expression that evaluates to one of the following data types:
Field values
CENTURY
- The century number.
The first century starts at 0001-01-01 00:00:00 AD. This definition applies to all Gregorian calendar countries. There is no century number 0; the count goes from –1 to 1.
DAY
- The day (of the month) field (1–31).
DECADE
- The year field divided by 10.
DOQ
- The day within the current quarter. DOQ recognizes leap year days.
DOW
- Zero-based day of the week, where Sunday=0.
Note
EXTRACT's day of week numbering differs from the function TO_CHAR.
DOY
- The day of the year (1–365/366).
EPOCH
- Specifies to return one of the following:
- For DATE and TIMESTAMP values: the number of seconds before or since 1970-01-01 00:00:00-00 (if before, a negative number).
- For INTERVAL values: the total number of seconds in the interval.
HOUR
- The hour field (0–23).
ISODOW
- The ISO day of the week, an integer between 1 and 7 where Monday is 1.
ISOWEEK
- The ISO week of the year, an integer between 1 and 53.
ISOYEAR
- The ISO year.
MICROSECONDS
- The seconds field, including fractional parts, multiplied by 1,000,000. This includes full seconds.
MILLENNIUM
- The millennium number, where the first millennium is 1 and each millennium starts on 01-01-y001. For example, millennium 2 starts on 01-01-1001.
MILLISECONDS
- The seconds field, including fractional parts, multiplied by 1000. This includes full seconds.
MINUTE
- The minutes field (0 - 59).
MONTH
- For TIMESTAMP values, the number of the month within the year (1–12); for INTERVAL values, the number of months, modulo 12 (0–11).
QUARTER
- The calendar quarter of the specified date as an integer, where the January-March quarter is 1, valid only for TIMESTAMP values.
SECOND
- The seconds field, including fractional parts, 0–59, or 0-60 if the operating system implements leap seconds.
TIME ZONE
- The time zone offset from UTC, in seconds. Positive values correspond to time zones east of UTC, negative values to zones west of UTC.
TIMEZONE_HOUR
- The hour component of the time zone offset.
TIMEZONE_MINUTE
- The minute component of the time zone offset.
WEEK
- The number of the week of the calendar year that the day is in.
YEAR
- The year field. There is no 0 AD, so subtract BC years from AD years accordingly.
Notes
According to the ISO-8601 standard, the week starts on Monday, and the first week of a year contains January 4. Thus, an early January date can sometimes be in the week 52 or 53 of the previous calendar year. For example:
=> SELECT YEAR_ISO('01-01-2016'::DATE), WEEK_ISO('01-01-2016'), DAYOFWEEK_ISO('01-01-2016');
YEAR_ISO | WEEK_ISO | DAYOFWEEK_ISO
----------+----------+---------------
2015 | 53 | 5
(1 row)
Examples
Extract the day value:
SELECT DATE_PART('DAY', TIMESTAMP '2009-02-24 20:38:40') "Day";
Day
-----
24
(1 row)
Extract the month value:
SELECT DATE_PART('MONTH', '2009-02-24 20:38:40'::TIMESTAMP) "Month";
Month
-------
2
(1 row)
Extract the year value:
SELECT DATE_PART('YEAR', '2009-02-24 20:38:40'::TIMESTAMP) "Year";
Year
------
2009
(1 row)
Extract the hours:
SELECT DATE_PART('HOUR', '2009-02-24 20:38:40'::TIMESTAMP) "Hour";
Hour
------
20
(1 row)
Extract the minutes:
SELECT DATE_PART('MINUTES', '2009-02-24 20:38:40'::TIMESTAMP) "Minutes";
Minutes
---------
38
(1 row)
Extract the day of quarter (DOQ):
SELECT DATE_PART('DOQ', '2009-02-24 20:38:40'::TIMESTAMP) "DOQ";
DOQ
-----
55
(1 row)
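The day-of-quarter value can be cross-checked outside the database. The following Python sketch is an illustration only, not Vertica's implementation, and the helper name day_of_quarter is hypothetical. It counts days from the first day of the calendar quarter, which reproduces the value 55 shown for February 24, 2009:

```python
from datetime import date

def day_of_quarter(d: date) -> int:
    """Return the 1-based day number of d within its calendar quarter."""
    # First month of the quarter that contains d: 1, 4, 7, or 10.
    first_month = 3 * ((d.month - 1) // 3) + 1
    quarter_start = date(d.year, first_month, 1)
    # Date subtraction accounts for leap days automatically.
    return (d - quarter_start).days + 1

print(day_of_quarter(date(2009, 2, 24)))  # 55
```

Because the count is based on real calendar dates, a first-quarter date after February 29 in a leap year is one higher than in a non-leap year.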
See also
TO_CHAR
4.2.10 - DATE_TRUNC
Truncates date and time values to the specified precision.
Truncates date and time values to the specified precision. The return value is the same data type as the input value. All fields that are less than the specified precision are set to 0, or to 1 for day and month.
Behavior type
Stable
Syntax
DATE_TRUNC( precision, trunc-target )
Parameters
precision
- A string constant that specifies precision for the truncated value. See Precision field values below. The precision must be valid for the trunc-target date or time.
trunc-target
- Valid date/time expression.
Precision field values
MILLENNIUM
- The millennium number.
CENTURY
- The century number.
The first century starts at 0001-01-01 00:00:00 AD. This definition applies to all Gregorian calendar countries.
DECADE
- The year field divided by 10.
YEAR
- The year field. Keep in mind there is no 0 AD, so subtract BC years from AD years with care.
QUARTER
- The calendar quarter of the specified date as an integer, where the January-March quarter is 1.
MONTH
- For TIMESTAMP values, the number of the month within the year (1–12); for INTERVAL values, the number of months, modulo 12 (0–11).
WEEK
- The number of the week of the year that the day is in.
According to the ISO-8601 standard, the week starts on Monday, and the first week of a year contains January 4. Thus, an early January date can sometimes be in the week 52 or 53 of the previous calendar year. For example:
=> SELECT YEAR_ISO('01-01-2016'::DATE), WEEK_ISO('01-01-2016'), DAYOFWEEK_ISO('01-01-2016');
YEAR_ISO | WEEK_ISO | DAYOFWEEK_ISO
----------+----------+---------------
2015 | 53 | 5
(1 row)
DAY
- The day (of the month) field (1–31).
HOUR
- The hour field (0–23).
MINUTE
- The minutes field (0–59).
SECOND
- The seconds field, including fractional parts (0–59) (60 if leap seconds are implemented by the operating system).
MILLISECONDS
- The seconds field, including fractional parts, multiplied by 1000. Note that this includes full seconds.
MICROSECONDS
- The seconds field, including fractional parts, multiplied by 1,000,000. This includes full seconds.
Examples
The following example sets the field value as hour and returns the hour, truncating the minutes and seconds:
=> SELECT DATE_TRUNC('HOUR', TIMESTAMP '2012-02-24 13:38:40') AS HOUR;
HOUR
---------------------
2012-02-24 13:00:00
(1 row)
The following example returns the year from the input timestamptz '2012-02-24 13:38:40'. The function also defaults the month and day to January 1, truncates the hour:minute:second of the timestamp, and appends the time zone (-05):
=> SELECT DATE_TRUNC('YEAR', TIMESTAMPTZ '2012-02-24 13:38:40') AS YEAR;
YEAR
------------------------
2012-01-01 00:00:00-05
(1 row)
The following example returns the year and month and defaults day of month to 1, truncating the rest of the string:
=> SELECT DATE_TRUNC('MONTH', TIMESTAMP '2012-02-24 13:38:40') AS MONTH;
MONTH
---------------------
2012-02-01 00:00:00
(1 row)
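The truncation rule (fields below the chosen precision become 0, or 1 for day and month) can be modeled in a few lines. This Python sketch is an assumption-laden illustration, not Vertica code: the function name date_trunc and the table _RESET are hypothetical, and only six precisions are covered.

```python
from datetime import datetime

# For each precision, the finer fields and the values they are reset to.
_RESET = {
    "YEAR":   dict(month=1, day=1, hour=0, minute=0, second=0, microsecond=0),
    "MONTH":  dict(day=1, hour=0, minute=0, second=0, microsecond=0),
    "DAY":    dict(hour=0, minute=0, second=0, microsecond=0),
    "HOUR":   dict(minute=0, second=0, microsecond=0),
    "MINUTE": dict(second=0, microsecond=0),
    "SECOND": dict(microsecond=0),
}

def date_trunc(precision: str, ts: datetime) -> datetime:
    """Return ts with every field finer than `precision` reset."""
    return ts.replace(**_RESET[precision.upper()])

ts = datetime(2012, 2, 24, 13, 38, 40)
print(date_trunc("HOUR", ts))   # 2012-02-24 13:00:00
print(date_trunc("MONTH", ts))  # 2012-02-01 00:00:00
```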
4.2.11 - DATEDIFF
Returns the time span between two dates, in the intervals specified.
Returns the time span between two dates, in the intervals specified. DATEDIFF excludes the start date in its calculation.
Behavior type
- Immutable if start and end dates are TIMESTAMP, DATE, TIME, or INTERVAL
- Stable if start and end dates are TIMESTAMPTZ
Syntax
DATEDIFF ( datepart, start, end );
Parameters
datepart
- Specifies the type of date or time intervals that DATEDIFF returns. If datepart is an expression, it must be enclosed in parentheses:
DATEDIFF((expression), start, end);
datepart must evaluate to one of the following string literals, either quoted or unquoted:
start, end
- Specify the start and end dates, where start and end evaluate to one of the following data types:
If end < start, DATEDIFF returns a negative value.
Note
TIME and INTERVAL data types are invalid for start and end dates if datepart is set to year, quarter, or month.
Compatible start and end date data types
The following table shows which data types can be matched as start and end dates:
| | DATE | TIMESTAMP | TIMESTAMPTZ | TIME | INTERVAL |
|---|---|---|---|---|---|
| DATE | • | • | • | | |
| TIMESTAMP | • | • | • | | |
| TIMESTAMPTZ | • | • | • | | |
| TIME | | | | • | |
| INTERVAL | | | | | • |
For example, if you set the start date to an INTERVAL data type, the end date must also be an INTERVAL; otherwise, Vertica returns an error. The following statement supplies an INTERVAL for both arguments:
SELECT DATEDIFF(day, INTERVAL '26 days', INTERVAL '1 month');
datediff
----------
4
(1 row)
Date part intervals
DATEDIFF uses the datepart argument to calculate the number of intervals between two dates, rather than the actual amount of time between them. DATEDIFF uses the following cutoff points to calculate those intervals:
- year: January 1
- quarter: January 1, April 1, July 1, October 1
- month: the first day of the month
- week: Sunday at midnight (24:00)
For example, if datepart is set to year, DATEDIFF uses January 01 to calculate the number of years between two dates. The following DATEDIFF statement sets datepart to year, and specifies a time span 01/01/2005 - 12/31/2008:
SELECT DATEDIFF(year, '01-01-2005'::date, '12-31-2008'::date);
datediff
----------
3
(1 row)
DATEDIFF always excludes the start date when it calculates intervals, in this case 01/01/2005. DATEDIFF considers only calendar year starts in its calculation, so in this case it only counts years 2006, 2007, and 2008. The function returns 3, although the actual time span is nearly four years.
If you change the start and end dates to 12/31/2004 and 01/01/2009, respectively, DATEDIFF also counts years 2005 and 2009. This time, it returns 5, although the actual time span is just over four years:
=> SELECT DATEDIFF(year, '12-31-2004'::date, '01-01-2009'::date);
datediff
----------
5
(1 row)
Similarly, DATEDIFF uses month start dates when it calculates the number of months between two dates. Thus, given the following statement, DATEDIFF counts months February through September and returns 8:
=> SELECT DATEDIFF(month, '01-31-2005'::date, '09-30-2005'::date);
datediff
----------
8
(1 row)
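The boundary-counting behavior described in this section reduces to simple field arithmetic. The following Python sketch is a model of the documented results, not Vertica's source code; the function names are hypothetical and only the year, quarter, and month dateparts are covered.

```python
from datetime import date

def datediff_year(start: date, end: date) -> int:
    """Count the January 1 boundaries in the range (start, end]."""
    return end.year - start.year

def datediff_quarter(start: date, end: date) -> int:
    """Count the quarter-start boundaries in the range (start, end]."""
    quarter = lambda d: (d.month - 1) // 3
    return (end.year - start.year) * 4 + (quarter(end) - quarter(start))

def datediff_month(start: date, end: date) -> int:
    """Count the first-of-month boundaries in the range (start, end]."""
    return (end.year - start.year) * 12 + (end.month - start.month)

print(datediff_year(date(2005, 1, 1), date(2008, 12, 31)))   # 3
print(datediff_year(date(2004, 12, 31), date(2009, 1, 1)))   # 5
print(datediff_month(date(2005, 1, 31), date(2005, 9, 30)))  # 8
```

Because only boundaries are counted, two dates one day apart (December 31 and January 1) differ by one year, while two dates almost a year apart within the same calendar year differ by zero.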
See also
TIMESTAMPDIFF
4.2.12 - DAY
Returns as an integer the day of the month from the input value.
Behavior type
- Immutable if the input value is a TIMESTAMP, DATE, VARCHAR, or INTEGER
- Stable if the input value is a TIMESTAMPTZ
Syntax
DAY ( value )
Parameters
value
- The value to convert, one of the following:
TIMESTAMP, TIMESTAMPTZ, INTERVAL, VARCHAR, or INTEGER.
Examples
=> SELECT DAY (6);
DAY
-----
6
(1 row)
=> SELECT DAY(TIMESTAMP 'sep 22, 2011 12:34');
DAY
-----
22
(1 row)
=> SELECT DAY('sep 22, 2011 12:34');
DAY
-----
22
(1 row)
=> SELECT DAY(INTERVAL '35 12:34');
DAY
-----
35
(1 row)
4.2.13 - DAYOFMONTH
Returns the day of the month as an integer.
Behavior type
- Immutable if the target date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the target date is a TIMESTAMPTZ
Syntax
DAYOFMONTH ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT DAYOFMONTH (TIMESTAMP 'sep 22, 2011 12:34');
DAYOFMONTH
------------
22
(1 row)
4.2.14 - DAYOFWEEK
Returns the day of the week as an integer, where Sunday is day 1.
Behavior type
- Immutable if the target date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the target date is a TIMESTAMPTZ
Syntax
DAYOFWEEK ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT DAYOFWEEK (TIMESTAMP 'sep 17, 2011 12:34');
DAYOFWEEK
-----------
7
(1 row)
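Three day-of-week conventions appear in this reference: DAYOFWEEK (Sunday is 1), the DOW field of DATE_PART and EXTRACT (Sunday is 0), and the ISO numbering (Monday is 1). The Python sketch below shows how they relate; it is an illustration with hypothetical helper names, not Vertica code.

```python
from datetime import date

def dayofweek(d: date) -> int:
    """Sunday=1 ... Saturday=7, the DAYOFWEEK convention."""
    return d.isoweekday() % 7 + 1

def dow(d: date) -> int:
    """Sunday=0 ... Saturday=6, the DOW field convention."""
    return d.isoweekday() % 7

def dayofweek_iso(d: date) -> int:
    """Monday=1 ... Sunday=7, the ISO 8601 convention."""
    return d.isoweekday()

saturday = date(2011, 9, 17)
print(dayofweek(saturday), dow(saturday), dayofweek_iso(saturday))  # 7 6 6
```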
4.2.15 - DAYOFWEEK_ISO
Returns the ISO 8601 day of the week as an integer, where Monday is day 1.
Behavior type
- Immutable if the target date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the target date is a TIMESTAMPTZ
Syntax
DAYOFWEEK_ISO ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT DAYOFWEEK_ISO(TIMESTAMP 'Sep 22, 2011 12:34');
DAYOFWEEK_ISO
---------------
4
(1 row)
The following example shows how to combine the DAYOFWEEK_ISO, WEEK_ISO, and YEAR_ISO functions to find the ISO day of the week, week, and year:
=> SELECT DAYOFWEEK_ISO('Jan 1, 2000'), WEEK_ISO('Jan 1, 2000'), YEAR_ISO('Jan 1, 2000');
DAYOFWEEK_ISO | WEEK_ISO | YEAR_ISO
---------------+----------+----------
6 | 52 | 1999
(1 row)
4.2.16 - DAYOFYEAR
Returns the day of the year as an integer, where January 1 is day 1.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Syntax
DAYOFYEAR ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT DAYOFYEAR (TIMESTAMP 'SEPT 22,2011 12:34');
DAYOFYEAR
-----------
265
(1 row)
4.2.17 - DAYS
Returns the integer value of the specified date, where 1 AD is 1.
Returns the integer value of the specified date, where 1 AD is 1. If the date precedes 1 AD, DAYS returns a negative integer.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Syntax
DAYS ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT DAYS (DATE '2011-01-22');
DAYS
--------
734159
(1 row)
=> SELECT DAYS (DATE 'March 15, 0044 BC');
DAYS
--------
-15997
(1 row)
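For dates in 1 AD and later, the values shown for DAYS, and for DATE with an integer argument, agree with the proleptic Gregorian ordinal that Python's datetime module exposes. The sketch below relies on that observed agreement, which is an assumption rather than a documented guarantee. The helper names are hypothetical, and Python cannot represent BC dates, so the negative case is not covered.

```python
from datetime import date

def days(d: date) -> int:
    """Day number of d, where January 1 of 1 AD is day 1."""
    return d.toordinal()

def date_from_days(n: int) -> date:
    """Inverse of days(): the date whose day number is n (n >= 1)."""
    return date.fromordinal(n)

print(days(date(2011, 1, 22)))  # 734159
print(date_from_days(734260))   # 2011-05-03
print(date_from_days(1))        # 0001-01-01
```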
4.2.18 - EXTRACT
Retrieves sub-fields such as year or hour from date/time values and returns values of type NUMERIC.
Retrieves sub-fields such as year or hour from date/time values and returns values of type NUMERIC. EXTRACT is intended for computational processing, rather than for formatting date/time values for display.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or INTERVAL
- Stable if the specified date is a TIMESTAMPTZ
Syntax
EXTRACT ( field FROM date )
Parameters
field
- A constant value that specifies the sub-field to extract from date (see Field values below).
date
- The date to process, an expression that evaluates to one of the following data types:
Field values
CENTURY
- The century number.
The first century starts at 0001-01-01 00:00:00 AD. This definition applies to all Gregorian calendar countries. There is no century number 0; the count goes from –1 to 1.
DAY
- The day (of the month) field (1–31).
DECADE
- The year field divided by 10.
DOQ
- The day within the current quarter. DOQ recognizes leap year days.
DOW
- Zero-based day of the week, where Sunday=0.
Note
EXTRACT's day of week numbering differs from the function TO_CHAR.
DOY
- The day of the year (1–365/366).
EPOCH
- Specifies to return one of the following:
- For DATE and TIMESTAMP values: the number of seconds before or since 1970-01-01 00:00:00-00 (if before, a negative number).
- For INTERVAL values: the total number of seconds in the interval.
HOUR
- The hour field (0–23).
ISODOW
- The ISO day of the week, an integer between 1 and 7 where Monday is 1.
ISOWEEK
- The ISO week of the year, an integer between 1 and 53.
ISOYEAR
- The ISO year.
MICROSECONDS
- The seconds field, including fractional parts, multiplied by 1,000,000. This includes full seconds.
MILLENNIUM
- The millennium number, where the first millennium is 1 and each millennium starts on 01-01-y001. For example, millennium 2 starts on 01-01-1001.
MILLISECONDS
- The seconds field, including fractional parts, multiplied by 1000. This includes full seconds.
MINUTE
- The minutes field (0 - 59).
MONTH
- For TIMESTAMP values, the number of the month within the year (1–12); for INTERVAL values, the number of months, modulo 12 (0–11).
QUARTER
- The calendar quarter of the specified date as an integer, where the January-March quarter is 1, valid only for TIMESTAMP values.
SECOND
- The seconds field, including fractional parts, 0–59, or 0-60 if the operating system implements leap seconds.
TIME ZONE
- The time zone offset from UTC, in seconds. Positive values correspond to time zones east of UTC, negative values to zones west of UTC.
TIMEZONE_HOUR
- The hour component of the time zone offset.
TIMEZONE_MINUTE
- The minute component of the time zone offset.
WEEK
- The number of the week of the calendar year that the day is in.
YEAR
- The year field. There is no 0 AD, so subtract BC years from AD years accordingly.
Examples
Extract the day of the month and the day of the quarter from the current TIMESTAMP:
=> SELECT CURRENT_TIMESTAMP AS NOW;
NOW
-------------------------------
2016-05-03 11:36:08.829004-04
(1 row)
=> SELECT EXTRACT (DAY FROM CURRENT_TIMESTAMP);
date_part
-----------
3
(1 row)
=> SELECT EXTRACT (DOQ FROM CURRENT_TIMESTAMP);
date_part
-----------
33
(1 row)
Extract the timezone hour from the current time:
=> SELECT CURRENT_TIMESTAMP;
?column?
-------------------------------
2016-05-03 11:36:08.829004-04
(1 row)
=> SELECT EXTRACT(TIMEZONE_HOUR FROM CURRENT_TIMESTAMP);
date_part
-----------
-4
(1 row)
Extract the number of seconds since 01-01-1970 00:00:
=> SELECT EXTRACT(EPOCH FROM '2001-02-16 20:38:40-08'::TIMESTAMPTZ);
date_part
------------------
982384720.000000
(1 row)
Extract the number of seconds between 01-01-1970 00:00 and 5 days 3 hours before:
=> SELECT EXTRACT(EPOCH FROM -'5 days 3 hours'::INTERVAL);
date_part
----------------
-442800.000000
(1 row)
Convert the results from the last example to a TIMESTAMP:
=> SELECT 'EPOCH'::TIMESTAMPTZ -442800 * '1 second'::INTERVAL;
?column?
------------------------
1969-12-26 16:00:00-05
(1 row)
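The three EPOCH results above can be reproduced with ordinary epoch arithmetic. This Python sketch is a cross-check only; the variable names are illustrative, and the fixed UTC offsets of -08 and -05 are taken from the example literals and output.

```python
from datetime import datetime, timedelta, timezone

# Seconds since 1970-01-01 00:00:00 UTC for a timestamp at offset -08.
ts = datetime(2001, 2, 16, 20, 38, 40, tzinfo=timezone(timedelta(hours=-8)))
epoch_seconds = ts.timestamp()

# Total seconds in the negative interval '5 days 3 hours'.
interval_seconds = (-timedelta(days=5, hours=3)).total_seconds()

# Apply the interval to the epoch, then display it at offset -05.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
shifted = (epoch + timedelta(seconds=interval_seconds)).astimezone(
    timezone(timedelta(hours=-5))
)

print(epoch_seconds)     # 982384720.0
print(interval_seconds)  # -442800.0
print(shifted)           # 1969-12-26 16:00:00-05:00
```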
4.2.19 - GETDATE
Returns the current statement's start date and time as a TIMESTAMP value.
Returns the current statement's start date and time as a TIMESTAMP value. This function is identical to SYSDATE.
GETDATE uses the date and time supplied by the operating system on the server to which you are connected, which is the same across all servers. Internally, GETDATE converts STATEMENT_TIMESTAMP from TIMESTAMPTZ to TIMESTAMP.
Behavior type
Stable
Syntax
GETDATE()
Examples
=> SELECT GETDATE();
GETDATE
----------------------------
2011-03-07 13:21:29.497742
(1 row)
See also
Date/time expressions
4.2.20 - GETUTCDATE
Returns the current statement's start date and time as a TIMESTAMP value.
GETUTCDATE uses the date and time supplied by the operating system on the server to which you are connected, which is the same across all servers. Internally, GETUTCDATE converts STATEMENT_TIMESTAMP at TIME ZONE 'UTC'.
Behavior type
Stable
Syntax
GETUTCDATE()
Examples
=> SELECT GETUTCDATE();
GETUTCDATE
----------------------------
2011-03-07 20:20:26.193052
(1 row)
4.2.21 - HOUR
Returns the hour portion of the specified date as an integer, where 0 is 00:00 to 00:59.
Behavior type
Syntax
HOUR( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT HOUR (TIMESTAMP 'sep 22, 2011 12:34');
HOUR
------
12
(1 row)
=> SELECT HOUR (INTERVAL '35 12:34');
HOUR
------
12
(1 row)
=> SELECT HOUR ('12:34');
HOUR
------
12
(1 row)
4.2.22 - ISFINITE
Tests for the special TIMESTAMP constant INFINITY and returns a value of type BOOLEAN.
Behavior type
Immutable
Syntax
ISFINITE ( timestamp )
Parameters
timestamp
- Expression of type TIMESTAMP
Examples
SELECT ISFINITE(TIMESTAMP '2009-02-16 21:28:30');
ISFINITE
----------
t
(1 row)
SELECT ISFINITE(TIMESTAMP 'INFINITY');
ISFINITE
----------
f
(1 row)
4.2.23 - JULIAN_DAY
Returns the integer value of the specified day according to the Julian calendar, where day 1 is the first day of the Julian period, January 1, 4713 BC (on the Gregorian calendar, November 24, 4714 BC).
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Syntax
JULIAN_DAY ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT JULIAN_DAY (DATE 'MARCH 15, 0044 BC');
JULIAN_DAY
------------
1705428
(1 row)
=> SELECT JULIAN_DAY (DATE '2001-01-01');
JULIAN_DAY
------------
2451911
(1 row)
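For Gregorian dates in 1 AD and later, the Julian day number differs from the proleptic Gregorian ordinal by a fixed offset. The Python sketch below uses that relationship to reproduce the 2001-01-01 result. It is an illustration with a hypothetical function name, and it does not cover BC dates such as the first example, which Python's date type cannot represent.

```python
from datetime import date

# Julian day number of the day before 0001-01-01 (proleptic Gregorian).
JDN_OFFSET = 1721425

def julian_day(d: date) -> int:
    """Julian day number of the Gregorian date d (1 AD or later)."""
    return d.toordinal() + JDN_OFFSET

print(julian_day(date(2001, 1, 1)))  # 2451911
```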
4.2.24 - LAST_DAY
Returns the last day of the month in the specified date.
Behavior type
Syntax
LAST_DAY ( date )
Parameters
date
- The date to process, one of the following data types:
Calculating first day of month
SQL does not support any function that returns the first day in the month of a given date. You must use other functions to work around this limitation. For example:
=> SELECT DATE ('2022/07/04') - DAYOFMONTH ('2022/07/04') +1;
?column?
------------
2022-07-01
(1 row)
=> SELECT LAST_DAY('1929/06/06') - (SELECT DAY(LAST_DAY('1929/06/06'))-1);
?column?
------------
1929-06-01
(1 row)
Examples
The following example returns the last day of February as 29 because 2016 is a leap year:
=> SELECT LAST_DAY('2016-02-28 23:30 PST') "Last Day";
Last Day
------------
2016-02-29
(1 row)
The following example returns the last day of February in a non-leap year:
=> SELECT LAST_DAY('2017/02/03') "Last";
Last
------------
2017-02-28
(1 row)
The following example returns the last day of March, after converting the string value to the specified DATE type:
=> SELECT LAST_DAY('2003/03/15') "Last";
Last
------------
2003-03-31
(1 row)
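The last-day and first-day calculations in this section can be checked with the standard calendar module. This Python sketch is illustrative only; last_day and first_day are hypothetical helper names, not Vertica functions.

```python
import calendar
from datetime import date

def last_day(d: date) -> date:
    """Last day of the month that contains d."""
    # monthrange returns (weekday of day 1, number of days in the month).
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)

def first_day(d: date) -> date:
    """First day of the month that contains d."""
    return d.replace(day=1)

print(last_day(date(2016, 2, 28)))  # 2016-02-29
print(last_day(date(2017, 2, 3)))   # 2017-02-28
print(first_day(date(2022, 7, 4)))  # 2022-07-01
```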
4.2.25 - LOCALTIME
Returns a value of type TIME that represents the start of the current transaction.
The return value does not change during the transaction. Thus, multiple calls to LOCALTIME within the same transaction return the same time.
Behavior type
Stable
Syntax
LOCALTIME [ ( precision ) ]
Parameters
precision
- Rounds the result to the specified number of fractional digits in the seconds field.
Examples
=> CREATE TABLE t1 (a int, b int);
CREATE TABLE
=> INSERT INTO t1 VALUES (1,2);
OUTPUT
--------
1
(1 row)
=> SELECT LOCALTIME time;
time
-----------------
15:03:14.595296
(1 row)
=> INSERT INTO t1 VALUES (3,4);
OUTPUT
--------
1
(1 row)
=> SELECT LOCALTIME;
time
-----------------
15:03:14.595296
(1 row)
=> COMMIT;
COMMIT
=> SELECT LOCALTIME;
time
-----------------
15:03:49.738032
(1 row)
4.2.26 - LOCALTIMESTAMP
Returns a value of type TIMESTAMP/TIMESTAMPTZ that represents the start of the current transaction, and remains unchanged until the transaction is closed.
Returns a value of type TIMESTAMP/TIMESTAMPTZ that represents the start of the current transaction, and remains unchanged until the transaction is closed. Thus, multiple calls to LOCALTIMESTAMP within a given transaction return the same timestamp.
Behavior type
Stable
Syntax
LOCALTIMESTAMP [ ( precision ) ]
Parameters
precision
- Rounds the result to the specified number of fractional digits in the seconds field.
Examples
=> CREATE TABLE t1 (a int, b int);
CREATE TABLE
=> INSERT INTO t1 VALUES (1,2);
OUTPUT
--------
1
(1 row)
=> SELECT LOCALTIMESTAMP(2) AS 'local timestamp';
local timestamp
------------------------
2021-03-05 10:48:58.26
(1 row)
=> INSERT INTO t1 VALUES (3,4);
OUTPUT
--------
1
(1 row)
=> SELECT LOCALTIMESTAMP(2) AS 'local timestamp';
local timestamp
------------------------
2021-03-05 10:48:58.26
(1 row)
=> COMMIT;
COMMIT
=> SELECT LOCALTIMESTAMP(2) AS 'local timestamp';
local timestamp
------------------------
2021-03-05 10:50:08.99
(1 row)
4.2.27 - MICROSECOND
Returns the microsecond portion of the specified date as an integer.
Behavior type
- Immutable if the specified date is a TIMESTAMP, INTERVAL, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Syntax
MICROSECOND ( date )
Parameters
date
- The date to process, one of the following data types:
Examples
=> SELECT MICROSECOND (TIMESTAMP 'Sep 22, 2011 12:34:01.123456');
MICROSECOND
-------------
123456
(1 row)
4.2.28 - MIDNIGHT_SECONDS
Within the specified date, returns the number of seconds between midnight and the date's time portion.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Syntax
MIDNIGHT_SECONDS ( date )
Parameters
date
The date to process, one of the following data types:
Examples
Get the number of seconds since midnight:
=> SELECT MIDNIGHT_SECONDS(CURRENT_TIMESTAMP);
MIDNIGHT_SECONDS
------------------
36480
(1 row)
Get the number of seconds between midnight and noon on March 3 2016:
=> SELECT MIDNIGHT_SECONDS('3-3-2016 12:00'::TIMESTAMP);
MIDNIGHT_SECONDS
------------------
43200
(1 row)
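The value is the time portion expressed in seconds. A minimal Python sketch of that arithmetic follows; the function name midnight_seconds is hypothetical, and fractional seconds are ignored in this model.

```python
from datetime import datetime

def midnight_seconds(ts: datetime) -> int:
    """Whole seconds elapsed between midnight and the time portion of ts."""
    return ts.hour * 3600 + ts.minute * 60 + ts.second

print(midnight_seconds(datetime(2016, 3, 3, 12, 0)))  # 43200
```

Read in reverse, the first example's result of 36480 corresponds to a statement issued at 10:08:00.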
4.2.29 - MINUTE
Returns the minute portion of the specified date as an integer.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, VARCHAR, or INTERVAL
- Stable if the specified date is a TIMESTAMPTZ
Syntax
MINUTE ( date )
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT MINUTE('12:34:03.456789');
MINUTE
--------
34
(1 row)
=> SELECT MINUTE (TIMESTAMP 'sep 22, 2011 12:34');
MINUTE
--------
34
(1 row)
=> SELECT MINUTE(INTERVAL '35 12:34:03.456789');
MINUTE
--------
34
(1 row)
4.2.30 - MONTH
Returns the month portion of the specified date as an integer.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, VARCHAR, or INTERVAL
- Stable if the specified date is a TIMESTAMPTZ
Syntax
MONTH ( date )
Parameters
date
The date to process, one of the following data types:
Examples
In the following examples, Vertica returns the month portion of the specified string. For example, '6-9' represents September 6.
=> SELECT MONTH('6-9');
MONTH
-------
9
(1 row)
=> SELECT MONTH (TIMESTAMP 'sep 22, 2011 12:34');
MONTH
-------
9
(1 row)
=> SELECT MONTH(INTERVAL '2-35' year to month);
MONTH
-------
11
(1 row)
4.2.31 - MONTHS_BETWEEN
Returns the number of months between two dates.
Returns the number of months between two dates. MONTHS_BETWEEN can return an integer or a FLOAT:
- Integer: The day portions of date1 and date2 are the same, and neither date is the last day of the month. MONTHS_BETWEEN also returns an integer if both dates in date1 and date2 are the last days of their respective months. For example, MONTHS_BETWEEN calculates the difference between April 30 and March 31 as 1 month.
- FLOAT: The day portions of date1 and date2 are different and one or both dates are not the last day of their respective months. For example, the difference between April 2 and March 1 is 1.03225806451613. To calculate month fractions, MONTHS_BETWEEN assumes all months contain 31 days.
MONTHS_BETWEEN disregards timestamp time portions.
Behavior type
Syntax
MONTHS_BETWEEN ( date1 , date2 );
Parameters
date1, date2
- Specify the dates to evaluate, where date1 and date2 evaluate to one of the following data types:
- DATE
- TIMESTAMP
- TIMESTAMPTZ
If date1 < date2, MONTHS_BETWEEN returns a negative value.
Examples
Return the number of months between April 7 2016 and January 7 2015:
=> SELECT MONTHS_BETWEEN ('04-07-16'::TIMESTAMP, '01-07-15'::TIMESTAMP);
MONTHS_BETWEEN
----------------
15
(1 row)
Return the number of months between March 31 2016 and February 28 2016 (MONTHS_BETWEEN assumes both months contain 31 days):
=> SELECT MONTHS_BETWEEN ('03-31-16'::TIMESTAMP, '02-28-16'::TIMESTAMP);
MONTHS_BETWEEN
------------------
1.09677419354839
(1 row)
Return the number of months between March 31 2016 and February 29 2016:
=> SELECT MONTHS_BETWEEN ('03-31-16'::TIMESTAMP, '02-29-16'::TIMESTAMP);
MONTHS_BETWEEN
----------------
1
(1 row)
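The integer and FLOAT cases can be modeled as a whole-month difference plus a day difference divided by 31. The Python sketch below is an approximation built from the rules and examples in this section, not Vertica's implementation. The function names are hypothetical, and the handling of equal day portions where only one date is a month end is an assumption.

```python
import calendar
from datetime import date

def is_last_day(d: date) -> bool:
    """True if d is the last day of its month."""
    return d.day == calendar.monthrange(d.year, d.month)[1]

def months_between(date1: date, date2: date):
    """Months from date2 to date1; negative if date1 precedes date2."""
    whole = (date1.year - date2.year) * 12 + (date1.month - date2.month)
    # Integer result: same day portion, or both dates are month ends.
    if date1.day == date2.day or (is_last_day(date1) and is_last_day(date2)):
        return whole
    # Fractional result: every month is assumed to contain 31 days.
    return whole + (date1.day - date2.day) / 31

print(months_between(date(2016, 4, 7), date(2015, 1, 7)))    # 15
print(months_between(date(2016, 3, 31), date(2016, 2, 29)))  # 1
print(round(months_between(date(2016, 3, 31), date(2016, 2, 28)), 6))  # 1.096774
```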
4.2.32 - NEW_TIME
Converts a timestamp value from one time zone to another and returns a TIMESTAMP.
Behavior type
Immutable
Syntax
NEW_TIME( 'timestamp' , 'timezone1' , 'timezone2')
Parameters
timestamp
- The timestamp to convert, conforms to one of the following formats:
timezone1, timezone2
- Specify the source and target timezones, one of the strings defined in /opt/vertica/share/timezonesets. For example:
- GMT: Greenwich Mean Time
- AST / ADT: Atlantic Standard/Daylight Time
- EST / EDT: Eastern Standard/Daylight Time
- CST / CDT: Central Standard/Daylight Time
- MST / MDT: Mountain Standard/Daylight Time
- PST / PDT: Pacific Standard/Daylight Time
Examples
Convert the specified time from Eastern Standard Time (EST) to Pacific Standard Time (PST):
=> SELECT NEW_TIME('05-24-12 13:48:00', 'EST', 'PST');
NEW_TIME
---------------------
2012-05-24 10:48:00
(1 row)
Convert 1:00 AM on January 1, 2012 from EST to PST:
=> SELECT NEW_TIME('01-01-12 01:00:00', 'EST', 'PST');
NEW_TIME
---------------------
2011-12-31 22:00:00
(1 row)
Convert the current time from EDT to CDT:
=> SELECT NOW();
NOW
-------------------------------
2016-12-09 10:30:36.727307-05
(1 row)
=> SELECT NEW_TIME('NOW', 'EDT', 'CDT');
NEW_TIME
----------------------------
2016-12-09 09:30:36.727307
(1 row)
The following example returns the year 45 before the Common Era in Greenwich Mean Time and converts it to Newfoundland Standard Time:
=> SELECT NEW_TIME('April 1, 45 BC', 'GMT', 'NST')::DATE;
NEW_TIME
---------------
0045-03-31 BC
(1 row)
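Conversions between the abbreviations listed above amount to shifting by the difference between two fixed UTC offsets. The Python sketch below models this. The OFFSETS table is an assumption based on the conventional offsets for these abbreviations, not a reading of Vertica's timezonesets file, and new_time here is a hypothetical Python function.

```python
from datetime import datetime, timedelta, timezone

# Conventional UTC offsets in hours (assumed, see the note above).
OFFSETS = {
    "GMT": 0,
    "AST": -4, "ADT": -3,
    "EST": -5, "EDT": -4,
    "CST": -6, "CDT": -5,
    "MST": -7, "MDT": -6,
    "PST": -8, "PDT": -7,
}

def new_time(ts: datetime, tz_from: str, tz_to: str) -> datetime:
    """Reinterpret naive ts from tz_from in tz_to; returns a naive datetime."""
    source = timezone(timedelta(hours=OFFSETS[tz_from]))
    target = timezone(timedelta(hours=OFFSETS[tz_to]))
    return ts.replace(tzinfo=source).astimezone(target).replace(tzinfo=None)

print(new_time(datetime(2012, 5, 24, 13, 48), "EST", "PST"))  # 2012-05-24 10:48:00
print(new_time(datetime(2012, 1, 1, 1, 0), "EST", "PST"))     # 2011-12-31 22:00:00
```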
4.2.33 - NEXT_DAY
Returns the date of the first instance of a particular day of the week that follows the specified date.
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Syntax
NEXT_DAY( 'date', 'day-string')
Parameters
date
The date to process, one of the following data types:
day-string
- The day of the week to process, a CHAR or VARCHAR string or character constant. Supply the full English name such as Tuesday, or any conventional abbreviation, such as Tue or Tues. day-string is not case sensitive and trailing spaces are ignored.
Examples
Get the date of the first Monday that follows April 29 2016:
=> SELECT NEXT_DAY('4-29-2016'::TIMESTAMP,'Monday') "NEXT DAY" ;
NEXT DAY
------------
2016-05-02
(1 row)
Get the first Tuesday that follows today:
=> SELECT NEXT_DAY(CURRENT_TIMESTAMP,'tues') "NEXT DAY" ;
NEXT DAY
------------
2016-05-03
(1 row)
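The lookup that NEXT_DAY performs can be modeled outside the database. The following Python sketch is an illustration only, not Vertica's implementation. The helper name `next_day` and the rule of matching a day name on its first three letters are assumptions made for this example.

```python
from datetime import date, timedelta

# Weekday names in the order used by date.weekday() (Monday == 0).
_DAYS = ["monday", "tuesday", "wednesday", "thursday",
         "friday", "saturday", "sunday"]

def next_day(start: date, day_string: str) -> date:
    """Return the first date strictly after `start` that falls on `day_string`."""
    # Assumption: case and trailing spaces are ignored, and the first three
    # letters are enough to identify the day ("Tue", "Tues", "Tuesday").
    key = day_string.strip().lower()[:3]
    target = next(i for i, name in enumerate(_DAYS) if name.startswith(key))
    # Days to move forward; 0 becomes 7 so the result always follows `start`.
    delta = (target - start.weekday()) % 7 or 7
    return start + timedelta(days=delta)

print(next_day(date(2016, 4, 29), "Monday"))  # 2016-05-02
print(next_day(date(2016, 4, 29), "tues "))   # 2016-05-03
```

If the start date already falls on the requested weekday, this model returns the date one week later, which matches the "follows the specified date" wording.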
4.2.34 - NOW [date/time]
Returns a value of type TIMESTAMP WITH TIME ZONE representing the start of the current transaction. NOW is equivalent to CURRENT_TIMESTAMP except that it does not accept a precision parameter.
The return value does not change during the transaction. Thus, multiple calls to NOW within the same transaction return the same timestamp.
Behavior type
Stable
Syntax
NOW()
Examples
=> CREATE TABLE t1 (a int, b int);
CREATE TABLE
=> INSERT INTO t1 VALUES (1,2);
OUTPUT
--------
1
(1 row)
=> SELECT NOW();
NOW
------------------------------
2016-12-09 13:00:08.74685-05
(1 row)
=> INSERT INTO t1 VALUES (3,4);
OUTPUT
--------
1
(1 row)
=> SELECT NOW();
NOW
------------------------------
2016-12-09 13:00:08.74685-05
(1 row)
=> COMMIT;
COMMIT
=> SELECT NOW();
NOW
-------------------------------
2016-12-09 13:01:31.420624-05
(1 row)
4.2.35 - OVERLAPS
Evaluates two time periods and returns true when they overlap, false otherwise.
Behavior type
Syntax
( start, end ) OVERLAPS ( start, end )
( start, interval ) OVERLAPS ( start, interval )
Parameters
start
- DATE, TIME, or TIMESTAMP/TIMESTAMPTZ value that specifies the beginning of a time period.
end
- DATE, TIME, or TIMESTAMP/TIMESTAMPTZ value that specifies the end of a time period.
interval
- Value that specifies the length of the time period.
Examples
Evaluate whether date ranges Feb 16 - Dec 21, 2016 and Oct 30, 2008 - Oct 30, 2016 overlap:
=> SELECT (DATE '2016-02-16', DATE '2016-12-21') OVERLAPS (DATE '2008-10-30', DATE '2016-10-30');
overlaps
----------
t
(1 row)
Evaluate whether date ranges Feb 16 - Dec 21, 2016 and Jan 30 - Oct 30, 2008 overlap:
=> SELECT (DATE '2016-02-16', DATE '2016-12-21') OVERLAPS (DATE '2008-01-30', DATE '2008-10-30');
overlaps
----------
f
(1 row)
Evaluate whether date range Feb 16, 2016 + 1 week overlaps with date range Oct 16, 2016 - 8 months:
=> SELECT (DATE '2016-02-16', INTERVAL '1 week') OVERLAPS (DATE '2016-10-16', INTERVAL '-8 months');
overlaps
----------
t
(1 row)
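The three results above can be reproduced with a small model of the comparison. This Python sketch is a simplification under stated assumptions, not Vertica's implementation: each period is treated as half-open (`start <= t < end`), a period given in reverse order (as happens with a negative interval) is normalized by swapping its end points, and zero-length periods are not treated specially. The function name `overlaps` is hypothetical.

```python
from datetime import date

def overlaps(start1, end1, start2, end2) -> bool:
    """Model each period as half-open [start, end) and test for intersection."""
    # A negative interval yields an end point before the start; put them in order.
    s1, e1 = sorted((start1, end1))
    s2, e2 = sorted((start2, end2))
    return s1 < e2 and s2 < e1

a = (date(2016, 2, 16), date(2016, 12, 21))
print(overlaps(*a, date(2008, 10, 30), date(2016, 10, 30)))  # True
print(overlaps(*a, date(2008, 1, 30), date(2008, 10, 30)))   # False
# Feb 16 2016 + 1 week versus Oct 16 2016 - 8 months (Feb 16 2016):
print(overlaps(date(2016, 2, 16), date(2016, 2, 23),
               date(2016, 10, 16), date(2016, 2, 16)))       # True
```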
4.2.36 - QUARTER
Returns calendar quarter of the specified date as an integer, where the January-March quarter is 1.
Syntax
QUARTER ( date )
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR.
- Stable if the specified date is a TIMESTAMPTZ
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT QUARTER (TIMESTAMP 'sep 22, 2011 12:34');
QUARTER
---------
3
(1 row)
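The quarter numbering described above (January-March is 1) reduces to integer arithmetic on the month. The following Python sketch shows that arithmetic; the function name `quarter` is hypothetical and the code is only a model of the documented result.

```python
from datetime import datetime

def quarter(d) -> int:
    """Map months 1-3 to 1, 4-6 to 2, 7-9 to 3, and 10-12 to 4."""
    return (d.month - 1) // 3 + 1

print(quarter(datetime(2011, 9, 22, 12, 34)))  # 3
```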
4.2.37 - ROUND
Rounds the specified date or time. If you omit the precision argument, ROUND rounds to day (DD) precision.
Behavior type
Syntax
ROUND( rounding-target[, 'precision'] )
Parameters
rounding-target
- An expression that evaluates to one of the following data types:
precision
- A string constant that specifies precision for the rounded value, one of the following:
- Century: CC | SCC
- Year: SYYY | YYYY | YEAR | YYY | YY | Y
- ISO Year: IYYY | IYY | IY | I
- Quarter: Q
- Month: MONTH | MON | MM | RM
- Same weekday as first day of year: WW
- Same weekday as first day of ISO year: IW
- Same weekday as first day of month: W
- Day (default): DDD | DD | J
- First weekday: DAY | DY | D
- Hour: HH | HH12 | HH24
- Minute: MI
- Second: SS
Note
Hour, minute, and second rounding is not supported by DATE expressions.
Examples
Round to the nearest hour:
=> SELECT ROUND(CURRENT_TIMESTAMP, 'HH');
ROUND
---------------------
2016-04-28 15:00:00
(1 row)
Round to the nearest month:
=> SELECT ROUND('9-22-2011 12:34:00'::TIMESTAMP, 'MM');
ROUND
---------------------
2011-10-01 00:00:00
(1 row)
See also
TIMESTAMP_ROUND
4.2.38 - SECOND
Returns the seconds portion of the specified date as an integer.
Syntax
SECOND ( date )
Behavior type
Immutable, except for TIMESTAMPTZ arguments where it is stable.
Parameters
date
- The date to process, one of the following data types:
Examples
=> SELECT SECOND ('23:34:03.456789');
SECOND
--------
3
(1 row)
=> SELECT SECOND (TIMESTAMP 'sep 22, 2011 12:34');
SECOND
--------
0
(1 row)
=> SELECT SECOND (INTERVAL '35 12:34:03.456789');
SECOND
--------
3
(1 row)
4.2.39 - STATEMENT_TIMESTAMP
Similar to TRANSACTION_TIMESTAMP, returns a value of type TIMESTAMP WITH TIME ZONE that represents the start of the current statement.
The return value does not change during statement execution. Thus, different stages of statement execution always have the same timestamp.
Behavior type
Stable
Syntax
STATEMENT_TIMESTAMP()
Examples
=> SELECT foo, bar FROM (SELECT STATEMENT_TIMESTAMP() AS foo)foo, (SELECT STATEMENT_TIMESTAMP() as bar)bar;
foo | bar
-------------------------------+-------------------------------
2016-12-07 14:55:51.543988-05 | 2016-12-07 14:55:51.543988-05
(1 row)
See also
4.2.40 - SYSDATE
Returns the current statement's start date and time as a TIMESTAMP value. This function is identical to GETDATE.
SYSDATE uses the date and time supplied by the operating system on the server to which you are connected, which is the same across all servers. Internally, SYSDATE converts STATEMENT_TIMESTAMP from TIMESTAMPTZ to TIMESTAMP.
Behavior type
Stable
Syntax
SYSDATE()
Note
You can call this function with no parentheses.
Examples
=> SELECT SYSDATE;
sysdate
----------------------------
2016-12-12 06:11:10.699642
(1 row)
See also
Date/time expressions
4.2.41 - TIME_SLICE
Aggregates data by fixed-time intervals: rounds the input TIMESTAMP value to the start or end of the time slice interval that contains it.
Given an input TIMESTAMP value such as 2000-10-28 00:00:01, the start time of a 3-second time slice interval is 2000-10-28 00:00:00, and the end time of the same time slice is 2000-10-28 00:00:03.
Behavior type
Immutable
Syntax
TIME_SLICE( expression, slice-length [, 'time-unit' [, 'start-or-end' ] ] )
Parameters
expression
- One of the following:
Vertica evaluates expression on each row.
slice-length
- A positive integer that specifies the slice length.
time-unit
- Time unit of the slice, one of the following:
- HOUR
- MINUTE
- SECOND (default)
- MILLISECOND
- MICROSECOND
start-or-end
- Specifies whether the returned value corresponds to the start or end time with one of the following strings:
Note
This parameter can be included only if you also supply a non-null time-unit argument.
Null argument handling
TIME_SLICE handles null arguments as follows:
- TIME_SLICE returns an error when any one of the slice-length, time-unit, or start-or-end parameters is null.
- If expression is null and slice-length, time-unit, or start-or-end contain legal values, TIME_SLICE returns a NULL value instead of an error.
Usage
The following command returns the (default) start time of a 3-second time slice:
=> SELECT TIME_SLICE('2009-09-19 00:00:01', 3);
TIME_SLICE
---------------------
2009-09-19 00:00:00
(1 row)
The following command returns the end time of a 3-second time slice:
=> SELECT TIME_SLICE('2009-09-19 00:00:01', 3, 'SECOND', 'END');
TIME_SLICE
---------------------
2009-09-19 00:00:03
(1 row)
This command returns results in milliseconds, using a 3-millisecond time slice:
=> SELECT TIME_SLICE('2009-09-19 00:00:01', 3, 'ms');
TIME_SLICE
-------------------------
2009-09-19 00:00:00.999
(1 row)
This command returns results in microseconds, using a 3-microsecond time slice:
=> SELECT TIME_SLICE('2009-09-19 00:00:01', 3, 'us');
TIME_SLICE
----------------------------
2009-09-19 00:00:00.999999
(1 row)
The next example uses a 3-second interval with an input value of '00:00:01'. To focus specifically on seconds, the example omits the date, though all values are implied to be part of a full timestamp:
- '00:00:00' is the start of the 3-second time slice.
- '00:00:03' is the end of the 3-second time slice.
- '00:00:03' is also the start of the second 3-second time slice. In time slice boundaries, the end value of a time slice does not belong to that time slice; it starts the next one.
When the time slice interval is not a factor of 60 seconds, such as a given slice length of 9 in the following example, the slice does not always start or end on 00 seconds:
=> SELECT TIME_SLICE('2009-02-14 20:13:01', 9);
TIME_SLICE
---------------------
2009-02-14 20:12:54
(1 row)
This is expected behavior, as the following properties are true for all time slices:
To force the above example ('2009-02-14 20:13:01') to start at '2009-02-14 20:13:00', adjust the output timestamp values so that the remainder of 54 counts up to 60:
=> SELECT TIME_SLICE('2009-02-14 20:13:01', 9 )+'6 seconds'::INTERVAL AS time;
time
---------------------
2009-02-14 20:13:00
(1 row)
Alternatively, you could use a different slice length that divides evenly into 60, such as 5:
=> SELECT TIME_SLICE('2009-02-14 20:13:01', 5);
TIME_SLICE
---------------------
2009-02-14 20:13:00
(1 row)
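The boundary arithmetic in this section can be sketched as a floor division of the elapsed time by the slice width. The Python model below is illustrative only and makes two assumptions that the text above does not state: slices are counted from a fixed origin of 2000-01-01 00:00:00 (the name `ORIGIN` is hypothetical), and only the full time-unit names are accepted. For the slice lengths used in these examples, any midnight origin produces the same boundaries.

```python
from datetime import datetime, timedelta

# Assumed origin for counting slices; not taken from the documentation.
ORIGIN = datetime(2000, 1, 1)
UNITS = {
    "HOUR": timedelta(hours=1),
    "MINUTE": timedelta(minutes=1),
    "SECOND": timedelta(seconds=1),
    "MILLISECOND": timedelta(milliseconds=1),
    "MICROSECOND": timedelta(microseconds=1),
}

def time_slice(ts, slice_length, time_unit="SECOND", start_or_end="START"):
    """Return the start (or end) of the fixed-width slice that contains `ts`."""
    width = UNITS[time_unit] * slice_length
    # Floor division of timedeltas counts whole slices since the origin.
    index = (ts - ORIGIN) // width
    start = ORIGIN + index * width
    return start + width if start_or_end == "END" else start

print(time_slice(datetime(2009, 2, 14, 20, 13, 1), 9))  # 2009-02-14 20:12:54
print(time_slice(datetime(2009, 2, 14, 20, 13, 1), 5))  # 2009-02-14 20:13:00
```

Because the slice start is a floor, a timestamp that falls exactly on a boundary is assigned to the slice that begins there, which agrees with the rule that an end value starts the next slice.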
A TIMESTAMPTZ value is implicitly cast to TIMESTAMP. For example, the following two statements have the same effect.
=> SELECT TIME_SLICE('2009-09-23 11:12:01'::timestamptz, 3);
TIME_SLICE
---------------------
2009-09-23 11:12:00
(1 row)
=> SELECT TIME_SLICE('2009-09-23 11:12:01'::timestamptz::timestamp, 3);
TIME_SLICE
---------------------
2009-09-23 11:12:00
(1 row)
Examples
You can use the SQL analytic functions FIRST_VALUE and LAST_VALUE to find the first/last price within each time slice group (set of rows belonging to the same time slice). This structure can be useful if you want to sample input data by choosing one row from each time slice group.
=> SELECT date_key, transaction_time, sales_dollar_amount,TIME_SLICE(DATE '2000-01-01' + date_key + transaction_time, 3),
FIRST_VALUE(sales_dollar_amount)
OVER (PARTITION BY TIME_SLICE(DATE '2000-01-01' + date_key + transaction_time, 3)
ORDER BY DATE '2000-01-01' + date_key + transaction_time) AS first_value
FROM store.store_sales_fact
LIMIT 20;
date_key | transaction_time | sales_dollar_amount | time_slice | first_value
----------+------------------+---------------------+---------------------+-------------
1 | 00:41:16 | 164 | 2000-01-02 00:41:15 | 164
1 | 00:41:33 | 310 | 2000-01-02 00:41:33 | 310
1 | 15:32:51 | 271 | 2000-01-02 15:32:51 | 271
1 | 15:33:15 | 419 | 2000-01-02 15:33:15 | 419
1 | 15:33:44 | 193 | 2000-01-02 15:33:42 | 193
1 | 16:36:29 | 466 | 2000-01-02 16:36:27 | 466
1 | 16:36:44 | 250 | 2000-01-02 16:36:42 | 250
2 | 03:11:28 | 39 | 2000-01-03 03:11:27 | 39
3 | 03:55:15 | 375 | 2000-01-04 03:55:15 | 375
3 | 11:58:05 | 369 | 2000-01-04 11:58:03 | 369
3 | 11:58:24 | 174 | 2000-01-04 11:58:24 | 174
3 | 11:58:52 | 449 | 2000-01-04 11:58:51 | 449
3 | 19:01:21 | 201 | 2000-01-04 19:01:21 | 201
3 | 22:15:05 | 156 | 2000-01-04 22:15:03 | 156
4 | 13:36:57 | -125 | 2000-01-05 13:36:57 | -125
4 | 13:37:24 | -251 | 2000-01-05 13:37:24 | -251
4 | 13:37:54 | 353 | 2000-01-05 13:37:54 | 353
4 | 13:38:04 | 426 | 2000-01-05 13:38:03 | 426
4 | 13:38:31 | 209 | 2000-01-05 13:38:30 | 209
5 | 10:21:24 | 488 | 2000-01-06 10:21:24 | 488
(20 rows)
TIME_SLICE rounds the transaction time to the 3-second slice length.
The following example uses the analytic (window) OVER clause to return the last trading price (the last row ordered by TickTime) in each 3-second time slice partition:
=> SELECT DISTINCT TIME_SLICE(TickTime, 3), LAST_VALUE(price) OVER (PARTITION BY TIME_SLICE(TickTime, 3)
ORDER BY TickTime ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM tick_store;
Note
If you omit the windowing clause from an analytic clause, LAST_VALUE defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Results can seem non-intuitive, because instead of returning the value from the bottom of the current partition, the function returns the bottom of the window, which continues to change along with the current input row that is being processed. For more information, see Time series analytics and SQL analytics.
In the next example, FIRST_VALUE is evaluated once for each input record and the data is sorted by ascending values. Use SELECT DISTINCT to remove the duplicates and return only one output record per TIME_SLICE:
=> SELECT DISTINCT TIME_SLICE(TickTime, 3), FIRST_VALUE(price)OVER (PARTITION BY TIME_SLICE(TickTime, 3)
ORDER BY TickTime ASC)
FROM tick_store;
TIME_SLICE | ?column?
---------------------+----------
2009-09-21 00:00:06 | 20.00
2009-09-21 00:00:09 | 30.00
2009-09-21 00:00:00 | 10.00
(3 rows)
The above query can be extended to also return the MIN, MAX, and AVG of the trading prices within each time slice:
=> SELECT DISTINCT TIME_SLICE(TickTime, 3),FIRST_VALUE(Price) OVER (PARTITION BY TIME_SLICE(TickTime, 3)
ORDER BY TickTime ASC),
MIN(price) OVER (PARTITION BY TIME_SLICE(TickTime, 3)),
MAX(price) OVER (PARTITION BY TIME_SLICE(TickTime, 3)),
AVG(price) OVER (PARTITION BY TIME_SLICE(TickTime, 3))
FROM tick_store;
See also
4.2.42 - TIMEOFDAY
Returns the wall-clock time as a text string. Function results advance during transactions.
Behavior type
Volatile
Syntax
TIMEOFDAY()
Examples
=> SELECT TIMEOFDAY();
TIMEOFDAY
-------------------------------------
Mon Dec 12 08:18:01.022710 2016 EST
(1 row)
4.2.43 - TIMESTAMP_ROUND
Rounds the specified TIMESTAMP. If you omit the precision argument, TIMESTAMP_ROUND rounds to day (DD) precision.
Behavior type
Syntax
TIMESTAMP_ROUND ( rounding-target[, 'precision'] )
Parameters
rounding-target
- An expression that evaluates to one of the following data types:
precision
- A string constant that specifies precision for the rounded value, one of the following:
- Century: CC | SCC
- Year: SYYY | YYYY | YEAR | YYY | YY | Y
- ISO Year: IYYY | IYY | IY | I
- Quarter: Q
- Month: MONTH | MON | MM | RM
- Same weekday as first day of year: WW
- Same weekday as first day of ISO year: IW
- Same weekday as first day of month: W
- Day (default): DDD | DD | J
- First weekday: DAY | DY | D
- Hour: HH | HH12 | HH24
- Minute: MI
- Second: SS
Note
Hour, minute, and second rounding is not supported by DATE expressions.
Examples
Round to the nearest hour:
=> SELECT TIMESTAMP_ROUND(CURRENT_TIMESTAMP, 'HH');
ROUND
---------------------
2016-04-28 15:00:00
(1 row)
Round to the nearest month:
=> SELECT TIMESTAMP_ROUND('9-22-2011 12:34:00'::TIMESTAMP, 'MM');
ROUND
---------------------
2011-10-01 00:00:00
(1 row)
See also
ROUND
4.2.44 - TIMESTAMP_TRUNC
Truncates the specified TIMESTAMP. If you omit the precision argument, TIMESTAMP_TRUNC truncates to day (DD) precision.
Behavior type
Syntax
TIMESTAMP_TRUNC( trunc-target[, 'precision'] )
Parameters
trunc-target
- An expression that evaluates to one of the following data types:
precision
- A string constant that specifies precision for the truncated value, one of the following:
- Century: CC | SCC
- Year: SYYY | YYYY | YEAR | YYY | YY | Y
- ISO Year: IYYY | IYY | IY | I
- Quarter: Q
- Month: MONTH | MON | MM | RM
- Same weekday as first day of year: WW
- Same weekday as first day of ISO year: IW
- Same weekday as first day of month: W
- Day: DDD | DD | J
- First weekday: DAY | DY | D
- Hour: HH | HH12 | HH24
- Minute: MI
- Second: SS
Note
Hour, minute, and second truncating is not supported by DATE expressions.
Examples
Truncate to the current hour:
=> SELECT TIMESTAMP_TRUNC(CURRENT_TIMESTAMP, 'HH');
TIMESTAMP_TRUNC
---------------------
2016-04-29 08:00:00
(1 row)
Truncate to the month:
=> SELECT TIMESTAMP_TRUNC('9-22-2011 12:34:00'::TIMESTAMP, 'MM');
TIMESTAMP_TRUNC
---------------------
2011-09-01 00:00:00
(1 row)
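Truncation simply zeroes every field below the requested precision. The following Python sketch models that for a handful of the precision strings listed above; it is not Vertica's implementation, the function name `timestamp_trunc` is reused here only for readability, and the century, ISO, quarter, and week precisions are deliberately left out.

```python
from datetime import datetime

def timestamp_trunc(ts: datetime, precision: str = "DD") -> datetime:
    """Zero out all fields of `ts` that are finer than `precision`."""
    if precision in ("SYYY", "YYYY", "YEAR", "YYY", "YY", "Y"):
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if precision in ("MONTH", "MON", "MM", "RM"):
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if precision in ("DDD", "DD", "J"):
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if precision in ("HH", "HH12", "HH24"):
        return ts.replace(minute=0, second=0, microsecond=0)
    if precision == "MI":
        return ts.replace(second=0, microsecond=0)
    if precision == "SS":
        return ts.replace(microsecond=0)
    raise ValueError(f"precision not covered by this sketch: {precision}")

print(timestamp_trunc(datetime(2011, 9, 22, 12, 34), "MM"))  # 2011-09-01 00:00:00
```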
See also
TRUNC
4.2.45 - TIMESTAMPADD
Adds the specified number of intervals to a TIMESTAMP or TIMESTAMPTZ value and returns a result of the same data type.
Behavior type
Syntax
TIMESTAMPADD ( datepart, count, start-date );
Parameters
datepart
- Specifies the type of time intervals that TIMESTAMPADD adds to the specified start date. If datepart is an expression, it must be enclosed in parentheses:
TIMESTAMPADD((expression), interval, start);
datepart must evaluate to one of the following string literals, either quoted or unquoted:
count
- Integer or integer expression that specifies the number of datepart intervals to add to start-date.
start-date
- TIMESTAMP or TIMESTAMPTZ value.
Examples
Add two months to the current date:
=> SELECT CURRENT_TIMESTAMP AS Today;
Today
-------------------------------
2016-05-02 06:56:57.923045-04
(1 row)
=> SELECT TIMESTAMPADD (MONTH, 2, (CURRENT_TIMESTAMP)) AS TodayPlusTwoMonths;
TodayPlusTwoMonths
-------------------------------
2016-07-02 06:56:57.923045-04
(1 row)
Add 14 days to the beginning of the current month:
=> SELECT TIMESTAMPADD (DD, 14, (SELECT TRUNC((CURRENT_TIMESTAMP), 'MM')));
timestampadd
---------------------
2016-05-15 00:00:00
(1 row)
4.2.46 - TIMESTAMPDIFF
Returns the time span between two TIMESTAMP or TIMESTAMPTZ values, in the intervals specified. TIMESTAMPDIFF excludes the start date in its calculation.
Behavior type
Syntax
TIMESTAMPDIFF ( datepart, start, end );
Parameters
datepart
- Specifies the type of date or time intervals that TIMESTAMPDIFF returns. If datepart is an expression, it must be enclosed in parentheses:
TIMESTAMPDIFF((expression), start, end );
datepart must evaluate to one of the following string literals, either quoted or unquoted:
start, end
- Specify the start and end dates, where start and end evaluate to one of the following data types:
If end < start, TIMESTAMPDIFF returns a negative value.
Date part intervals
TIMESTAMPDIFF uses the datepart argument to calculate the number of intervals between two dates, rather than the actual amount of time between them. For detailed information, see DATEDIFF.
Examples
=> SELECT TIMESTAMPDIFF (YEAR,'1-1-2006 12:34:00', '1-1-2008 12:34:00');
timestampdiff
---------------
2
(1 row)
See also
DATEDIFF
4.2.47 - TRANSACTION_TIMESTAMP
Returns a value of type TIMESTAMP WITH TIME ZONE that represents the start of the current transaction.
The return value does not change during the transaction. Thus, multiple calls to TRANSACTION_TIMESTAMP within the same transaction return the same timestamp.
TRANSACTION_TIMESTAMP is equivalent to CURRENT_TIMESTAMP, except it does not accept a precision parameter.
Behavior type
Stable
Syntax
TRANSACTION_TIMESTAMP()
Examples
=> SELECT foo, bar FROM (SELECT TRANSACTION_TIMESTAMP() AS foo)foo, (SELECT TRANSACTION_TIMESTAMP() as bar)bar;
foo | bar
-------------------------------+-------------------------------
2016-12-12 08:18:00.988528-05 | 2016-12-12 08:18:00.988528-05
(1 row)
See also
4.2.48 - TRUNC
Truncates the specified date or time. If you omit the precision argument, TRUNC truncates to day (DD) precision.
Behavior type
Syntax
TRUNC( trunc-target[, 'precision'] )
Parameters
trunc-target
- An expression that evaluates to one of the following data types:
precision
- A string constant that specifies precision for the truncated value, one of the following:
- Century: CC | SCC
- Year: SYYY | YYYY | YEAR | YYY | YY | Y
- ISO Year: IYYY | IYY | IY | I
- Quarter: Q
- Month: MONTH | MON | MM | RM
- Same weekday as first day of year: WW
- Same weekday as first day of ISO year: IW
- Same weekday as first day of month: W
- Day (default): DDD | DD | J
- First weekday: DAY | DY | D
- Hour: HH | HH12 | HH24
- Minute: MI
- Second: SS
Note
Hour, minute, and second truncating is not supported by DATE expressions.
Examples
Truncate to the current hour:
=> SELECT TRUNC(CURRENT_TIMESTAMP, 'HH');
TRUNC
---------------------
2016-04-29 10:00:00
(1 row)
Truncate to the month:
=> SELECT TRUNC('9-22-2011 12:34:00'::TIMESTAMP, 'MM');
TIMESTAMP_TRUNC
---------------------
2011-09-01 00:00:00
(1 row)
See also
TIMESTAMP_TRUNC
4.2.49 - WEEK
Returns the week of the year for the specified date as an integer, where the first week begins on the first Sunday on or preceding January 1.
Syntax
WEEK ( date )
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Parameters
date
The date to process, one of the following data types:
Examples
January 2 is on Saturday, so WEEK returns 1:
=> SELECT WEEK ('1-2-2016'::DATE);
WEEK
------
1
(1 row)
January 3 is the first Sunday in 2016 and therefore starts the second week, so WEEK returns 2:
=> SELECT WEEK ('1-3-2016'::DATE);
WEEK
------
2
(1 row)
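The week numbering rule stated above (week 1 begins on the Sunday on or preceding January 1) can be checked with a short calculation. This Python sketch is only a model of that rule; the function name `week` is hypothetical.

```python
from datetime import date, timedelta

def week(d: date) -> int:
    """Week of the year, where week 1 starts on the Sunday on or before January 1."""
    jan1 = date(d.year, 1, 1)
    # date.weekday() is Monday == 0 ... Sunday == 6, so this counts days since Sunday.
    days_since_sunday = (jan1.weekday() + 1) % 7
    first_week_start = jan1 - timedelta(days=days_since_sunday)
    return (d - first_week_start).days // 7 + 1

print(week(date(2016, 1, 2)))  # 1
print(week(date(2016, 1, 3)))  # 2
```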
4.2.50 - WEEK_ISO
Returns the week of the year for the specified date as an integer, where the first week starts on Monday and contains January 4. This function conforms with the ISO 8601 standard.
Syntax
WEEK_ISO ( date )
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Parameters
date
The date to process, one of the following data types:
Examples
The first week of 2016 begins on Monday January 4:
=> SELECT WEEK_ISO ('1-4-2016'::DATE);
WEEK_ISO
----------
1
(1 row)
January 3 2016 returns week 53 of the previous year (2015):
=> SELECT WEEK_ISO ('1-3-2016'::DATE);
WEEK_ISO
----------
53
(1 row)
In 2015, January 4 is on Sunday, so the first week of 2015 begins on the preceding Monday (December 29 2014):
=> SELECT WEEK_ISO ('12-29-2014'::DATE);
WEEK_ISO
----------
1
(1 row)
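These results follow directly from the ISO 8601 week-date rules, so they can be cross-checked with any library that implements them. The sketch below uses Python's standard `date.isocalendar()`; the wrapper names `week_iso` and `year_iso` are chosen for this example and the code is a cross-check, not a description of Vertica internals.

```python
from datetime import date

def week_iso(d: date) -> int:
    """ISO 8601 week number (weeks start on Monday; week 1 contains January 4)."""
    return d.isocalendar()[1]

def year_iso(d: date) -> int:
    """ISO 8601 year that the date's week belongs to."""
    return d.isocalendar()[0]

print(week_iso(date(2016, 1, 4)))                            # 1
print(week_iso(date(2016, 1, 3)), year_iso(date(2016, 1, 3)))  # 53 2015
print(week_iso(date(2014, 12, 29)))                          # 1
```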
4.2.51 - YEAR
Returns an integer that represents the year portion of the specified date.
Syntax
YEAR( date )
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, VARCHAR, or INTERVAL
- Stable if the specified date is a TIMESTAMPTZ
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT YEAR(CURRENT_DATE::DATE);
YEAR
------
2016
(1 row)
See also
YEAR_ISO
4.2.52 - YEAR_ISO
Returns an integer that represents the year portion of the specified date. The return value is based on the ISO 8601 standard.
The first week of the ISO year is the week that contains January 4.
Syntax
YEAR_ISO ( date )
Behavior type
- Immutable if the specified date is a TIMESTAMP, DATE, or VARCHAR
- Stable if the specified date is a TIMESTAMPTZ
Parameters
date
The date to process, one of the following data types:
Examples
=> SELECT YEAR_ISO(CURRENT_DATE::DATE);
YEAR_ISO
----------
2016
(1 row)
See also
YEAR
4.3 - IP address functions
IP functions perform conversion, calculation, and manipulation operations on IP, network, and subnet addresses.
4.3.1 - INET_ATON
Converts a string that contains a dotted-quad representation of an IPv4 network address to an INTEGER. It trims any surrounding white space from the string. This function returns NULL if the string is NULL or contains anything other than a dotted-quad IPv4 address.
Behavior type
Immutable
Syntax
INET_ATON ( expression )
Arguments
expression
- the string to convert.
Examples
=> SELECT INET_ATON('209.207.224.40');
inet_aton
------------
3520061480
(1 row)
=> SELECT INET_ATON('1.2.3.4');
inet_aton
-----------
16909060
(1 row)
=> SELECT TO_HEX(INET_ATON('1.2.3.4'));
to_hex
---------
1020304
(1 row)
See also
4.3.2 - INET_NTOA
Converts an INTEGER value into a VARCHAR dotted-quad representation of an IPv4 network address. INET_NTOA returns NULL if the integer value is NULL, negative, or greater than 2^32 - 1 (4294967295).
Behavior type
Immutable
Syntax
INET_NTOA ( expression )
Arguments
expression
- The integer network address to convert.
Examples
=> SELECT INET_NTOA(16909060);
inet_ntoa
-----------
1.2.3.4
(1 row)
=> SELECT INET_NTOA(03021962);
inet_ntoa
-------------
0.46.28.138
(1 row)
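The conversion between dotted-quad text and an integer, in both directions, is base-256 positional arithmetic. The following Python sketch models INET_ATON and INET_NTOA as described in this section, returning `None` where the functions are documented to return NULL. It is an illustration under those stated rules; how Vertica treats edge cases such as leading zeros in an octet is not covered by the text and is an assumption here.

```python
def inet_aton(text):
    """Dotted-quad IPv4 string -> integer, or None for invalid input."""
    if text is None:
        return None
    parts = text.strip().split(".")
    if len(parts) != 4:
        return None
    value = 0
    for part in parts:
        # Each octet must be plain ASCII digits in the range 0-255.
        if not (part.isascii() and part.isdigit()) or int(part) > 255:
            return None
        value = value * 256 + int(part)
    return value

def inet_ntoa(value):
    """Integer -> dotted-quad IPv4 string, or None if out of range."""
    if value is None or not 0 <= value <= 0xFFFFFFFF:
        return None
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(inet_aton("209.207.224.40"))  # 3520061480
print(inet_ntoa(16909060))          # 1.2.3.4
print(inet_ntoa(3021962))           # 0.46.28.138
```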
See also
4.3.3 - V6_ATON
Converts a string containing a colon-delimited IPv6 network address into a VARBINARY string. Any spaces around the IPv6 address are trimmed. This function returns NULL if the input value is NULL or it cannot be parsed as an IPv6 address. This function relies on the Linux function inet_pton.
Behavior type
Immutable
Syntax
V6_ATON ( expression )
Arguments
expression
- (VARCHAR) the string containing an IPv6 address to convert.
Examples
=> SELECT V6_ATON('2001:DB8::8:800:200C:417A');
v6_aton
------------------------------------------------------
\001\015\270\000\000\000\000\000\010\010\000 \014Az
(1 row)
=> SELECT V6_ATON('1.2.3.4');
v6_aton
------------------------------------------------------------------
\000\000\000\000\000\000\000\000\000\000\377\377\001\002\003\004
(1 row)
=> SELECT TO_HEX(V6_ATON('2001:DB8::8:800:200C:417A'));
to_hex
----------------------------------
20010db80000000000080800200c417a
(1 row)
=> SELECT V6_ATON('::1.2.3.4');
v6_aton
------------------------------------------------------------------
\000\000\000\000\000\000\000\000\000\000\000\000\001\002\003\004
(1 row)
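The byte strings in these examples can be reproduced with Python's standard `ipaddress` module. The sketch below is a model, not Vertica's code path (which the text says relies on inet_pton). The fallback that maps a plain IPv4 string to the `::ffff:a.b.c.d` form is inferred from the `V6_ATON('1.2.3.4')` output above.

```python
import ipaddress

def v6_aton(text):
    """IPv6 (or IPv4) text -> 16 raw bytes, or None if it cannot be parsed."""
    if text is None:
        return None
    text = text.strip()
    try:
        return ipaddress.IPv6Address(text).packed
    except ValueError:
        pass
    try:
        v4 = ipaddress.IPv4Address(text)
    except ValueError:
        return None
    # Plain IPv4 input becomes an IPv4-mapped IPv6 address (::ffff:a.b.c.d).
    return bytes(10) + b"\xff\xff" + v4.packed

print(v6_aton("2001:DB8::8:800:200C:417A").hex())  # 20010db80000000000080800200c417a
print(v6_aton("1.2.3.4").hex())                    # 00000000000000000000ffff01020304
```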
See also
4.3.4 - V6_NTOA
Converts an IPv6 address represented as varbinary to a character string.
Behavior type
Immutable
Syntax
V6_NTOA ( expression )
Arguments
expression
- (VARBINARY) the binary string to convert.
Notes
The following syntax converts an IPv6 address represented as VARBINARY B to a string A. V6_NTOA right-pads B to 16 bytes with zeros, if necessary, and calls the Linux function inet_ntop.
=> V6_NTOA(VARBINARY B) -> VARCHAR A
If B is NULL or longer than 16 bytes, the result is NULL.
Vertica automatically converts the form '::ffff:1.2.3.4' to '1.2.3.4'.
Examples
=> SELECT V6_NTOA(' \001\015\270\000\000\000\000\000\010\010\000 \014Az');
v6_ntoa
---------------------------
2001:db8::8:800:200c:417a
(1 row)
=> SELECT V6_NTOA(V6_ATON('1.2.3.4'));
v6_ntoa
---------
1.2.3.4
(1 row)
=> SELECT V6_NTOA(V6_ATON('::1.2.3.4'));
v6_ntoa
-----------
::1.2.3.4
(1 row)
See also
4.3.5 - V6_SUBNETA
Returns a VARCHAR containing a subnet address in CIDR (Classless Inter-Domain Routing) format from a binary or alphanumeric IPv6 address. Returns NULL if either parameter is NULL, the address cannot be parsed as an IPv6 address, or the subnet value is outside the range of 0 to 128.
Behavior type
Immutable
Syntax
V6_SUBNETA ( address, subnet)
Arguments
address
- VARBINARY or VARCHAR containing the IPv6 address.
subnet
- The size of the subnet in bits as an INTEGER. This value must be greater than zero and less than or equal to 128.
Examples
=> SELECT V6_SUBNETA(V6_ATON('2001:db8::8:800:200c:417a'), 28);
v6_subneta
---------------
2001:db0::/28
(1 row)
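The subnet calculation keeps the leftmost `subnet` bits of the 128-bit address and clears the rest. The Python sketch below models that masking for a text address only. It is not Vertica's implementation, and it accepts subnet sizes from 0 to 128, following the range given in the function description.

```python
import ipaddress

def v6_subneta(address, subnet):
    """Return the CIDR subnet string for a textual IPv6 address, or None."""
    if address is None or subnet is None or not 0 <= subnet <= 128:
        return None
    try:
        value = int(ipaddress.IPv6Address(address.strip()))
    except ValueError:
        return None
    # Mask with `subnet` leading one-bits followed by zeros.
    mask = ((1 << subnet) - 1) << (128 - subnet)
    return f"{ipaddress.IPv6Address(value & mask)}/{subnet}"

print(v6_subneta("2001:db8::8:800:200c:417a", 28))  # 2001:db0::/28
```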
See also
4.3.6 - V6_SUBNETN
Calculates a subnet address in CIDR (Classless Inter-Domain Routing) format from a varbinary or alphanumeric IPv6 address.
Behavior type
Immutable
Syntax
V6_SUBNETN ( address, subnet-size)
Arguments
address
- The IPv6 address as a VARBINARY or VARCHAR. The format you pass in determines the data type of the output. If you pass in a VARBINARY address, V6_SUBNETN returns a VARBINARY value. If you pass in a VARCHAR value, it returns a VARCHAR.
subnet-size
- The size of the subnet as an INTEGER.
Notes
The following syntax masks a BINARY IPv6 address B so that the N left-most bits of S form a subnet address, while the remaining right-most bits are cleared. V6_SUBNETN right-pads B to 16 bytes with zeros, if necessary, and masks B, preserving its N-bit subnet prefix.
=> V6_SUBNETN(VARBINARY B, INT8 N) -> VARBINARY(16) S
If B is NULL or longer than 16 bytes, or if N is not between 0 and 128 inclusive, the result is NULL.
S = [B]/N in Classless Inter-Domain Routing notation (CIDR notation).
The following syntax masks an alphanumeric IPv6 address A so that the N leftmost bits form a subnet address, while the remaining rightmost bits are cleared.
=> V6_SUBNETN(VARCHAR A, INT8 N) -> V6_SUBNETN(V6_ATON(A), N) -> VARBINARY(16) S
Examples
This example returns VARBINARY, after using V6_ATON to convert the VARCHAR string to VARBINARY:
=> SELECT V6_SUBNETN(V6_ATON('2001:db8::8:800:200c:417a'), 28);
v6_subnetn
---------------------------------------------------------------
\001\015\260\000\000\000\000\000\000\000\000\000\000\000\000
See also
4.3.7 - V6_TYPE
Returns an INTEGER value that classifies the type of the network address passed to it as defined in IETF RFC 4291 section 2.4. For example, if you pass this function the string 127.0.0.1, it returns 2, which indicates the address is a loopback address. This function accepts both IPv4 and IPv6 addresses.
Behavior type
Immutable
Syntax
V6_TYPE ( address)
Arguments
address
- A VARBINARY or VARCHAR containing an IPv6 or IPv4 address to describe.
Returns
The values returned by this function are:
Return Value | Address Type | Description
0 | GLOBAL | Global unicast addresses
1 | LINKLOCAL | Link-Local unicast (and private-use) addresses
2 | LOOPBACK | Loopback addresses
3 | UNSPECIFIED | Unspecified addresses
4 | MULTICAST | Multicast addresses
The return value is based on the following table of IP address ranges:
Address Family | CIDR | Type
IPv4 | 0.0.0.0/8 | UNSPECIFIED
IPv4 | 10.0.0.0/8 | LINKLOCAL
IPv4 | 127.0.0.0/8 | LOOPBACK
IPv4 | 169.254.0.0/16 | LINKLOCAL
IPv4 | 172.16.0.0/12 | LINKLOCAL
IPv4 | 192.168.0.0/16 | LINKLOCAL
IPv4 | 224.0.0.0/4 | MULTICAST
IPv4 | All other addresses | GLOBAL
IPv6 | ::0/128 | UNSPECIFIED
IPv6 | ::1/128 | LOOPBACK
IPv6 | fe80::/10 | LINKLOCAL
IPv6 | ff00::/8 | MULTICAST
IPv6 | All other addresses | GLOBAL
This function returns NULL if you pass it a NULL value or an invalid address.
Examples
=> SELECT V6_TYPE(V6_ATON('192.168.2.10'));
v6_type
---------
1
(1 row)
=> SELECT V6_TYPE(V6_ATON('2001:db8::8:800:200c:417a'));
v6_type
---------
0
(1 row)
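The classification is a lookup of the address against the CIDR ranges in the table above. The following Python sketch encodes that table with the standard `ipaddress` module. It is a model for textual addresses only (it does not accept the binary form that V6_ATON produces), and the constant and function names are chosen for this example.

```python
import ipaddress

GLOBAL, LINKLOCAL, LOOPBACK, UNSPECIFIED, MULTICAST = range(5)

# CIDR ranges and types copied from the table in this section.
_RANGES = [
    ("0.0.0.0/8", UNSPECIFIED), ("10.0.0.0/8", LINKLOCAL),
    ("127.0.0.0/8", LOOPBACK), ("169.254.0.0/16", LINKLOCAL),
    ("172.16.0.0/12", LINKLOCAL), ("192.168.0.0/16", LINKLOCAL),
    ("224.0.0.0/4", MULTICAST),
    ("::/128", UNSPECIFIED), ("::1/128", LOOPBACK),
    ("fe80::/10", LINKLOCAL), ("ff00::/8", MULTICAST),
]
_NETWORKS = [(ipaddress.ip_network(cidr), kind) for cidr, kind in _RANGES]

def v6_type(address):
    """Classify a textual IPv4 or IPv6 address; None for NULL or invalid input."""
    if address is None:
        return None
    try:
        ip = ipaddress.ip_address(address.strip())
    except ValueError:
        return None
    for network, kind in _NETWORKS:
        if ip.version == network.version and ip in network:
            return kind
    return GLOBAL  # "All other addresses" in both families

print(v6_type("192.168.2.10"))               # 1
print(v6_type("2001:db8::8:800:200c:417a"))  # 0
print(v6_type("127.0.0.1"))                  # 2
```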
See also
4.4 - Sequence functions
The sequence functions provide simple, multiuser-safe methods for obtaining successive sequence values from sequence objects.
4.4.1 - CURRVAL
Returns the last value across all nodes that was set by NEXTVAL on this sequence in the current session. If NEXTVAL was never called on this sequence since its creation, Vertica returns an error.
Syntax
CURRVAL ('[[database.]schema.]sequence-name')
Parameters
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
sequence-name
- The target sequence
Privileges
Restrictions
In a SELECT statement, you cannot invoke CURRVAL in the following contexts:
- WHERE clause
- GROUP BY clause
- ORDER BY clause
- DISTINCT clause
- UNION
- Subquery
You also cannot invoke CURRVAL to act on a sequence in:
Examples
See Creating and using named sequences.
See also
NEXTVAL
4.4.2 - NEXTVAL
Returns the next value in a sequence. Call NEXTVAL after creating a sequence to initialize the sequence with its default value. Thereafter, call NEXTVAL to increment the sequence value for ascending sequences, or decrement its value for descending sequences.
Syntax
NEXTVAL ('[[database.]schema.]sequence')
Parameters
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
sequence
- Identifies the target sequence.
Privileges
Restrictions
In a SELECT statement, you cannot invoke NEXTVAL in the following contexts:
- WHERE clause
- GROUP BY clause
- ORDER BY clause
- DISTINCT clause
- UNION
- Subquery
You also cannot invoke NEXTVAL to act on a sequence in:
You can use subqueries to work around some of these restrictions. For example, to use sequences with a DISTINCT clause:
=> SELECT t.col1, shift_allocation_seq.NEXTVAL FROM (
SELECT DISTINCT col1 FROM av_temp1) t;
Examples
See Creating and using named sequences.
See also
CURRVAL
4.5 - String functions
String functions perform conversion, extraction, or manipulation operations on strings, or return information about strings.
This section describes functions and operators for examining and manipulating string values. Strings in this context include values of the types CHAR, VARCHAR, BINARY, and VARBINARY.
Unless otherwise noted, all of the functions listed in this section work on all four data types. As opposed to some other SQL implementations, Vertica keeps CHAR strings unpadded internally, padding them only on final output. So converting a CHAR(3) 'ab' to VARCHAR(5) results in a VARCHAR of length 2, not one with length 3 including a trailing space.
Some of the functions described here also work on data of non-string types by converting that data to a string representation first. Some functions work only on character strings, while others work only on binary strings. Many work for both. BINARY and VARBINARY functions ignore multibyte UTF-8 character boundaries.
Non-binary character string functions handle normalized multibyte UTF-8 characters, as specified by the Unicode Consortium. Unless otherwise specified, those character string functions for which it matters can optionally specify whether VARCHAR arguments should be interpreted as octet (byte) sequences, or as (locale-aware) sequences of UTF-8 characters. This is accomplished by adding "USING OCTETS" or "USING CHARACTERS" (default) as a parameter to the function.
Some character string functions are stable because in general UTF-8 case-conversion, searching and sorting can be locale dependent. Thus, LOWER is stable, while LOWERB is immutable. The USING OCTETS clause converts these functions into their "B" forms, so they become immutable. If the locale is set to collation=binary, which is the default, all string functions—except CHAR_LENGTH/CHARACTER_LENGTH, LENGTH, SUBSTR, and OVERLAY—are converted to their "B" forms and so are immutable.
BINARY implicitly converts to VARBINARY, so functions that take VARBINARY arguments work with BINARY.
For other functions that operate on strings (but not VARBINARY), see Regular expression functions.
4.5.1 - ASCII
Converts the first character of a VARCHAR datatype to an INTEGER. This function is the opposite of the CHR function.
ASCII operates on UTF-8 characters and single-byte ASCII characters. It returns the same results for the ASCII subset of UTF-8.
Behavior type
Immutable
Syntax
ASCII ( expression )
Arguments
expression
- VARCHAR (string) to convert.
Examples
This example returns employee last names that begin with L. The ASCII equivalent of L is 76:
=> SELECT employee_last_name FROM employee_dimension
WHERE ASCII(SUBSTR(employee_last_name, 1, 1)) = 76
LIMIT 5;
employee_last_name
--------------------
Lewis
Lewis
Lampert
Lampert
Li
(5 rows)
4.5.2 - BIT_LENGTH
Returns the length of the string expression in bits (bytes * 8) as an INTEGER. BIT_LENGTH applies to the contents of VARCHAR and VARBINARY fields.
Behavior type
Immutable
Syntax
BIT_LENGTH ( expression )
Arguments
expression
- (CHAR, VARCHAR, BINARY, or VARBINARY) The string to measure.
Examples
Expression | Result
SELECT BIT_LENGTH('abc'::varbinary); | 24
SELECT BIT_LENGTH('abc'::binary); | 8
SELECT BIT_LENGTH(''::varbinary); | 0
SELECT BIT_LENGTH(''::binary); | 8
SELECT BIT_LENGTH(null::varbinary); |
SELECT BIT_LENGTH(null::binary); |
SELECT BIT_LENGTH(VARCHAR 'abc'); | 24
SELECT BIT_LENGTH(CHAR 'abc'); | 24
SELECT BIT_LENGTH(CHAR(6) 'abc'); | 48
SELECT BIT_LENGTH(VARCHAR(6) 'abc'); | 24
SELECT BIT_LENGTH(BINARY(6) 'abc'); | 48
SELECT BIT_LENGTH(BINARY 'abc'); | 24
SELECT BIT_LENGTH(VARBINARY 'abc'); | 24
SELECT BIT_LENGTH(VARBINARY(6) 'abc'); | 24
4.5.3 - BITCOUNT
Returns the number of one-bits (sometimes referred to as set-bits) in the given VARBINARY value. This is also referred to as the population count.
Behavior type
Immutable
Syntax
BITCOUNT ( expression )
Arguments
expression
- (BINARY or VARBINARY) The value whose one-bits are counted.
Examples
=> SELECT BITCOUNT(HEX_TO_BINARY('0x10'));
BITCOUNT
----------
1
(1 row)
=> SELECT BITCOUNT(HEX_TO_BINARY('0xF0'));
BITCOUNT
----------
4
(1 row)
=> SELECT BITCOUNT(HEX_TO_BINARY('0xAB'));
BITCOUNT
----------
5
(1 row)
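The population count in these examples can be checked by hand. This Python sketch is an illustrative model, not Vertica code; the name bitcount is an assumption, and bytes.fromhex stands in for HEX_TO_BINARY.

```python
def bitcount(value: bytes) -> int:
    """Count the one-bits across every byte of a binary value."""
    return sum(bin(byte).count("1") for byte in value)

# 0x10 = 00010000, 0xF0 = 11110000, 0xAB = 10101011
for hex_digits in ("10", "F0", "AB"):
    print(hex_digits, bitcount(bytes.fromhex(hex_digits)))
```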
4.5.4 - BITSTRING_TO_BINARY
Translates the given VARCHAR bitstring representation into a VARBINARY value. This function is the inverse of TO_BITSTRING.
Behavior type
Immutable
Syntax
BITSTRING_TO_BINARY ( expression )
Arguments
expression
- The VARCHAR string to process.
Examples
If there are an odd number of characters in the hex value, the first character is treated as the low nibble of the first (furthest to the left) byte.
=> SELECT BITSTRING_TO_BINARY('0110000101100010');
BITSTRING_TO_BINARY
---------------------
ab
(1 row)
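To see why this bitstring yields ab, group the bits into bytes: 01100001 is 0x61 ('a') and 01100010 is 0x62 ('b'). The Python sketch below models only that case. The function name is hypothetical, and the restriction to bitstrings whose length is a multiple of 8 is an assumption of the sketch, not a documented rule.

```python
def bitstring_to_binary(bits: str) -> bytes:
    """Hypothetical model: pack a string of '0'/'1' characters into bytes."""
    if len(bits) % 8 != 0:
        # Assumption of this sketch: only whole bytes are handled.
        raise ValueError("bitstring length must be a multiple of 8")
    if not bits:
        return b""
    return int(bits, 2).to_bytes(len(bits) // 8, "big")

print(bitstring_to_binary("0110000101100010"))
```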
4.5.5 - BTRIM
Removes the longest string consisting only of specified characters from the start and end of a string.
Behavior type
Immutable
Syntax
BTRIM ( expression [ , characters-to-remove ] )
Arguments
expression
- (CHAR or VARCHAR) is the string to modify
characters-to-remove
- (CHAR or VARCHAR) specifies the characters to remove. The default is the space character.
Examples
=> SELECT BTRIM('xyxtrimyyx', 'xy');
BTRIM
-------
trim
(1 row)
4.5.6 - CHARACTER_LENGTH
The CHARACTER_LENGTH() function:
- Returns the string length in UTF-8 characters for CHAR and VARCHAR columns
- Returns the string length in bytes (octets) for BINARY and VARBINARY columns
- Strips the padding from CHAR expressions but not from VARCHAR expressions
- Is identical to LENGTH() for CHAR and VARCHAR. For binary types, CHARACTER_LENGTH() is identical to OCTET_LENGTH().
Behavior type
Immutable if USING OCTETS, stable otherwise.
Syntax
[ CHAR_LENGTH | CHARACTER_LENGTH ] ( expression ... [ USING { CHARACTERS | OCTETS } ] )
Arguments
expression
- (CHAR or VARCHAR) is the string to measure
USING CHARACTERS | OCTETS
- Determines whether the character length is expressed in characters (the default) or octets.
Examples
=> SELECT CHAR_LENGTH('1234 '::CHAR(10) USING OCTETS);
octet_length
--------------
4
(1 row)
=> SELECT CHAR_LENGTH('1234  '::VARCHAR(10));
char_length
-------------
6
(1 row)
=> SELECT CHAR_LENGTH(NULL::CHAR(10)) IS NULL;
?column?
----------
t
(1 row)
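The difference between counting characters and counting octets, and the stripping of CHAR padding, can be modeled as follows. This Python sketch is an illustration only: char_length and its is_char flag are hypothetical names, and the inputs are chosen for the sketch rather than taken from Vertica.

```python
def char_length(value, using="CHARACTERS", is_char=False):
    """Hypothetical model of the documented length rules."""
    if value is None:
        return None                      # NULL in, NULL out
    if is_char:
        value = value.rstrip(" ")        # CHAR padding is stripped
    if using == "OCTETS":
        return len(value.encode("utf-8"))  # length in bytes
    return len(value)                    # length in UTF-8 characters

print(char_length("1234  ", using="OCTETS", is_char=True))  # padded CHAR
print(char_length("1234  "))                                # VARCHAR keeps spaces
print(char_length("straße"), char_length("straße", using="OCTETS"))
```

The last line shows the multibyte case: ß is one character but two octets in UTF-8.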
4.5.7 - CHR
Converts an INTEGER value to the VARCHAR character that it represents.
Behavior type
Immutable
Syntax
CHR ( expression )
Arguments
expression
- (INTEGER) The value to convert. It is masked to a single character.
Notes
- CHR is the opposite of the ASCII function.
- CHR operates on UTF-8 characters, not only on single-byte ASCII characters. It continues to get the same results for the ASCII subset of UTF-8.
Examples
This example returns the VARCHAR datatype of the CHR expressions 65 and 97 from the employee table:
=> SELECT CHR(65), CHR(97) FROM employee;
CHR | CHR
-----+-----
A | a
A | a
A | a
A | a
A | a
A | a
A | a
A | a
A | a
A | a
A | a
A | a
(12 rows)
4.5.8 - COLLATION
Applies a collation to two or more strings. Use COLLATION with ORDER BY, GROUP BY, and equality clauses.
Syntax
COLLATION ( 'expression' [ , 'locale_or_collation_name' ] )
Arguments
'expression'
- Any expression that evaluates to a column name or to two or more values of type CHAR or VARCHAR.
'locale_or_collation_name'
- The ICU (International Components for Unicode) locale or collation name to use when collating the string. If you omit this parameter, COLLATION uses the collation associated with the session locale.
To determine the current session locale, enter the vsql meta-command \locale:
=> \locale
en_US@collation=binary
To set the locale and collation, use \locale as follows:
=> \locale en_US@collation=binary
INFO 2567: Canonical locale: 'en_US'
Standard collation: 'LEN_KBINARY'
English (United States)
Locales
The locale used for COLLATION can be one of the following:
For a list of valid ICU locales, go to Locale Explorer (ICU).
Binary and non-binary collations
The Vertica default locale is en_US@collation=binary, which uses binary collation. Binary collation compares binary representations of strings. Binary collation is fast, but it can result in a sort order where K precedes c because the binary representation of K is lower than c.
For non-binary collation, Vertica transforms the data according to the rules of the locale or the specified collation, and then applies the sorting rules. Suppose the locale collation is non-binary and you request a GROUP BY on string data. In this case, Vertica calls COLLATION, whether or not you specify the function in your query.
For information about collation naming, see Collator Naming Scheme.
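The binary-collation ordering described above, where K precedes c, follows directly from the byte values (K is 0x4B, c is 0x63). A small Python sketch, offered only as an illustration and not as Vertica's sort implementation:

```python
names = ["c", "K", "a", "B"]

# Binary collation: order by the raw UTF-8 bytes of each string.
binary_order = sorted(names, key=lambda s: s.encode("utf-8"))

print(binary_order)
```

All uppercase ASCII letters have lower byte values than lowercase letters, so they sort first under this comparison.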
Examples
Collating GROUP BY results
The following examples are based on a Premium_Customer table that contains the following data:
=> SELECT * FROM Premium_Customer;
ID | LName | FName
----+--------+---------
1 | Mc Coy | Bob
2 | Mc Coy | Janice
3 | McCoy | Jody
4 | McCoy | Peter
5 | McCoy | Brendon
6 | Mccoy | Cameron
7 | Mccoy | Lisa
The first statement shows how COLLATION applies the collation for the EN_US locale to the LName column. Vertica sorts the output as follows:
=> SELECT * FROM Premium_Customer ORDER BY COLLATION(LName, 'EN_US'), FName;
ID | LName | FName
----+--------+---------
1 | Mc Coy | Bob
2 | Mc Coy | Janice
6 | Mccoy | Cameron
7 | Mccoy | Lisa
5 | McCoy | Brendon
3 | McCoy | Jody
4 | McCoy | Peter
The next statement shows how COLLATION collates the LName column for the locale LEN_AS:
In the results, the last names in which "coy" starts with a lowercase letter precede the last names where "Coy" starts with an uppercase letter.
=> SELECT * FROM Premium_Customer ORDER BY COLLATION(LName, 'LEN_AS'), FName;
ID | LName | FName
----+--------+---------
6 | Mccoy | Cameron
7 | Mccoy | Lisa
1 | Mc Coy | Bob
5 | McCoy | Brendon
2 | Mc Coy | Janice
3 | McCoy | Jody
4 | McCoy | Peter
Comparing strings with an equality clause
In the following query, COLLATION removes spaces and punctuation when comparing two strings in English. It then determines whether the two strings still have the same value after the punctuation has been removed:
=> SELECT COLLATION ('U.S.A', 'LEN_AS') = COLLATION('USA', 'LEN_AS');
?column?
----------
t
Sorting strings in non-English languages
The following table contains data that uses the German character eszett, ß:
=> SELECT * FROM t1;
a | b | c
------------+---+----
ßstringß | 1 | 10
SSstringSS | 2 | 20
random1 | 3 | 30
random1 | 4 | 40
random2 | 5 | 50
When you specify the collation LDE_S1, the query returns the data in the following order:
=> SELECT a FROM t1 ORDER BY COLLATION(a, 'LDE_S1');
a
------------
random1
random1
random2
SSstringSS
ßstringß
4.5.9 - CONCAT
Concatenates two strings and returns a VARCHAR data type. If either argument is NULL, CONCAT returns NULL.
Syntax
CONCAT ( string-expression1, string-expression2 )
Behavior type
Immutable
Arguments
string-expression1, string-expression2
- The values to concatenate, of any data type that can be cast to a string value.
Examples
The following examples use a sample table named alphabet with two varchar columns:
=> CREATE TABLE alphabet (letter1 varchar(2), letter2 varchar(2));
CREATE TABLE
=> COPY alphabet FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> A|B
>> C|D
>> \.
=> SELECT * FROM alphabet;
letter1 | letter2
---------+---------
C | D
A | B
(2 rows)
Concatenate the contents of the first column with a character string:
=> SELECT CONCAT(letter1, ' is a letter') FROM alphabet;
CONCAT
---------------
A is a letter
C is a letter
(2 rows)
Concatenate the output of two nested CONCAT functions:
=> SELECT CONCAT(CONCAT(letter1, ' and '), CONCAT(letter2, ' are both letters')) FROM alphabet;
CONCAT
--------------------------
C and D are both letters
A and B are both letters
(2 rows)
Concatenate a date and string:
=> SELECT current_date today;
today
------------
2021-12-10
(1 row)
=> SELECT CONCAT('2021-12-31'::date - current_date, ' days until end of year 2021');
CONCAT
--------------------------------
21 days until end of year 2021
(1 row)
4.5.10 - DECODE
Compares expression to each search value one by one. If expression is equal to a search value, the function returns the corresponding result. If no match is found, the function returns default. If default is omitted, the function returns NULL.
DECODE is similar to the IF-THEN-ELSE and CASE expressions:
CASE expression
[WHEN search THEN result]
[WHEN search THEN result]
...
[ELSE default];
The arguments can have any data type supported by Vertica. The result types of individual results are promoted to the least common type that can be used to represent all of them. This leads to a character string type, an exact numeric type, an approximate numeric type, or a DATETIME type, where all the various result arguments must be of the same type grouping.
Behavior type
Immutable
Syntax
DECODE ( expression, search, result [ , search, result ]...[, default ] )
Arguments
expression
- The value to compare.
search
- The value compared against expression.
result
- The value returned if expression is equal to search.
default
- Optional. If no matches are found, DECODE returns default. If default is omitted, then DECODE returns NULL (if no matches are found).
Examples
The following example converts numeric values in the weight column from the product_dimension table to descriptive values in the output.
=> SELECT product_description, DECODE(weight,
2, 'Light',
50, 'Medium',
71, 'Heavy',
99, 'Call for help',
'N/A')
FROM product_dimension
WHERE category_description = 'Food'
AND department_description = 'Canned Goods'
AND sku_number BETWEEN 'SKU-#49750' AND 'SKU-#49999'
LIMIT 15;
product_description | case
-----------------------------------+---------------
Brand #499 canned corn | N/A
Brand #49900 fruit cocktail | Medium
Brand #49837 canned tomatoes | Heavy
Brand #49782 canned peaches | N/A
Brand #49805 chicken noodle soup | N/A
Brand #49944 canned chicken broth | N/A
Brand #49819 canned chili | N/A
Brand #49848 baked beans | N/A
Brand #49989 minestrone soup | N/A
Brand #49778 canned peaches | N/A
Brand #49770 canned peaches | N/A
Brand #4977 fruit cocktail | N/A
Brand #49933 canned olives | N/A
Brand #49750 canned olives | Call for help
Brand #49777 canned tomatoes | N/A
(15 rows)
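The search/result pairing and the optional trailing default can be modeled in a few lines. The Python sketch below is an illustration under stated assumptions: the function name decode is hypothetical, an odd number of trailing arguments is taken to mean that the last one is the default, and NULL comparison and result-type promotion are not modeled.

```python
def decode(expression, *args):
    """Hypothetical model of DECODE(expression, search, result, ..., [default])."""
    if len(args) % 2:                      # odd count: last argument is the default
        pairs, default = args[:-1], args[-1]
    else:
        pairs, default = args, None        # no default: fall back to NULL
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if expression == search:           # first matching search value wins
            return result
    return default

labels = (2, "Light", 50, "Medium", 71, "Heavy", 99, "Call for help", "N/A")
print(decode(50, *labels), decode(3, *labels), decode(3, 2, "Light"))
```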
4.5.11 - EDIT_DISTANCE
Calculates and returns the Levenshtein distance between two strings. The return value indicates the minimum number of single-character edits—insertions, deletions, or substitutions—that are required to change one string into the other. Compare to Jaro distance and Jaro-Winkler distance.
Behavior type
Immutable
Syntax
EDIT_DISTANCE ( string-expression1, string-expression2 )
Arguments
string-expression1, string-expression2
- The two VARCHAR expressions to compare.
Examples
The Levenshtein distance between kitten and knitting is 3:
=> SELECT EDIT_DISTANCE ('kitten', 'knitting');
EDIT_DISTANCE
---------------
3
(1 row)
EDIT_DISTANCE calculates that no fewer than three changes are required to transform kitten to knitting:
- kitten → knitten (insert n after k)
- knitten → knittin (substitute i for e)
- knittin → knitting (append g)
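The minimum edit count can be reproduced with the standard dynamic-programming formulation of Levenshtein distance. The Python sketch below is a textbook version given for illustration; it is not taken from Vertica, and the function name is an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "knitting"))
```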
4.5.12 - GREATEST
Returns the largest value in a list of expressions of any data type. All data types in the list must be the same or compatible. A NULL value in any one of the expressions returns NULL. Results can vary, depending on the locale's collation setting.
Behavior type
Stable
Syntax
GREATEST ( { * | expression[,...] } )
Arguments
* | expression[,...]
- The expressions to evaluate, one of the following:
Examples
GREATEST returns 10 as the largest value in the list:
=> SELECT GREATEST(7,5,10);
GREATEST
----------
10
(1 row)
If you put quotes around the integer expressions, GREATEST compares the values as strings and returns '7' as the greatest value:
=> SELECT GREATEST('7', '5', '10');
GREATEST
----------
7
(1 row)
The next example returns FLOAT 1.5 as the greatest because the integer is implicitly cast to float:
=> SELECT GREATEST(1, 1.5);
GREATEST
----------
1.5
(1 row)
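The three results above (numeric comparison, string comparison, and implicit promotion to float) have direct analogues in most languages. The following Python sketch is only a model of the documented behavior, including the rule that any NULL argument yields NULL; the function name greatest is an assumption, and locale-dependent collation is not modeled.

```python
def greatest(*values):
    """Hypothetical model: largest value, or None if any argument is None."""
    if any(v is None for v in values):
        return None
    return max(values)

print(greatest(7, 5, 10))        # numeric comparison
print(greatest("7", "5", "10"))  # string comparison: "7" sorts after "10"
print(greatest(1, 1.5))          # the integer compares as a float
```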
GREATEST queries all columns in a view based on the VMart table product_dimension, and returns the largest value in each row:
=> CREATE VIEW query1 AS SELECT shelf_width, shelf_height, shelf_depth FROM product_dimension;
CREATE VIEW
=> SELECT shelf_width, shelf_height, shelf_depth, greatest(*) FROM query1 WHERE shelf_width = 1;
shelf_width | shelf_height | shelf_depth | greatest
-------------+--------------+-------------+----------
1 | 3 | 1 | 3
1 | 3 | 3 | 3
1 | 5 | 4 | 5
1 | 2 | 2 | 2
1 | 1 | 3 | 3
1 | 2 | 2 | 2
1 | 2 | 3 | 3
1 | 1 | 5 | 5
1 | 1 | 4 | 4
1 | 5 | 3 | 5
1 | 4 | 2 | 4
1 | 4 | 5 | 5
1 | 5 | 3 | 5
1 | 2 | 5 | 5
1 | 4 | 2 | 4
1 | 4 | 4 | 4
1 | 1 | 2 | 2
1 | 4 | 3 | 4
...
See also
LEAST
4.5.13 - GREATESTB
Returns the largest value in a list of expressions of any data type, using binary ordering. All data types in the list must be the same or compatible. A NULL value in any one of the expressions returns NULL. Results can vary, depending on the locale's collation setting.
Behavior type
Immutable
Syntax
GREATESTB ( { * | expression[,...] } )
Arguments
* | expression[,...]
- The expressions to evaluate, one of the following:
Examples
The following command selects straße as the greatest in the series of inputs:
=> SELECT GREATESTB('straße', 'strasse');
GREATESTB
-----------
straße
(1 row)
GREATESTB returns 10 as the largest value in the list:
=> SELECT GREATESTB(7,5,10);
GREATESTB
-----------
10
(1 row)
If you put quotes around the integer expressions, GREATESTB compares the values as strings and returns '7' as the greatest value:
=> SELECT GREATESTB('7', '5', '10');
GREATESTB
-----------
7
(1 row)
The next example returns FLOAT 1.5 as the greatest because the integer is implicitly cast to float:
=> SELECT GREATESTB(1, 1.5);
GREATESTB
-----------
1.5
(1 row)
GREATESTB queries all columns in a view based on the VMart table product_dimension, and returns the largest value in each row:
=> CREATE VIEW query1 AS SELECT shelf_width, shelf_height, shelf_depth FROM product_dimension;
CREATE VIEW
=> SELECT shelf_width, shelf_height, shelf_depth, greatestb(*) FROM query1 WHERE shelf_width = 1;
shelf_width | shelf_height | shelf_depth | greatestb
-------------+--------------+-------------+-----------
1 | 3 | 1 | 3
1 | 3 | 3 | 3
1 | 5 | 4 | 5
1 | 2 | 2 | 2
1 | 1 | 3 | 3
1 | 2 | 2 | 2
1 | 2 | 3 | 3
1 | 1 | 5 | 5
1 | 1 | 4 | 4
1 | 5 | 3 | 5
1 | 4 | 2 | 4
1 | 4 | 5 | 5
1 | 5 | 3 | 5
1 | 2 | 5 | 5
1 | 4 | 2 | 4
...
See also
LEASTB
4.5.14 - HEX_TO_BINARY
Translates the given VARCHAR hexadecimal representation into a VARBINARY value.
Behavior type
Immutable
Syntax
HEX_TO_BINARY ( [ 0x ] expression )
Arguments
expression
- (VARCHAR) String to translate.
0x
- Optional prefix.
Notes
VARBINARY HEX_TO_BINARY(VARCHAR) converts data from character type in hexadecimal format to binary type. This function is the inverse of TO_HEX.
HEX_TO_BINARY(TO_HEX(x)) = x
TO_HEX(HEX_TO_BINARY(x)) = x
If there are an odd number of characters in the hexadecimal value, the first character is treated as the low nibble of the first (furthest to the left) byte.
Examples
If the given string begins with "0x" the prefix is ignored. For example:
=> SELECT HEX_TO_BINARY('0x6162') AS hex1, HEX_TO_BINARY('6162') AS hex2;
hex1 | hex2
------+------
ab | ab
(1 row)
If an invalid hex value is given, Vertica returns an "invalid binary representation" error; for example:
=> SELECT HEX_TO_BINARY('0xffgf');
ERROR: invalid hex string "0xffgf"
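The prefix handling and the odd-length rule from the Notes can be sketched as follows. This Python model is an illustration only: the function name is hypothetical, it assumes a lowercase 0x prefix, and it implements the odd-length rule by padding a leading zero so that the first character becomes the low nibble of the first byte.

```python
def hex_to_binary(text: str) -> bytes:
    """Hypothetical model of the documented conversion rules."""
    digits = text[2:] if text.startswith("0x") else text  # optional prefix is ignored
    if len(digits) % 2:
        digits = "0" + digits      # odd length: first character is a low nibble
    try:
        return bytes.fromhex(digits)
    except ValueError:
        raise ValueError(f'invalid hex string "{text}"') from None

print(hex_to_binary("0x6162"), hex_to_binary("6162"), hex_to_binary("abc"))
```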
4.5.15 - HEX_TO_INTEGER
Translates the given VARCHAR hexadecimal representation into an INTEGER value.
Vertica completes this conversion as follows:
- Adds the 0x prefix if it is not specified in the input
- Casts the VARCHAR string to a NUMERIC
- Casts the NUMERIC to an INTEGER
Behavior type
Immutable
Syntax
HEX_TO_INTEGER ( [ 0x ] expression )
Arguments
expression
- (VARCHAR) The string to translate.
0x
- Is the optional prefix.
Examples
You can enter the string with or without the 0x prefix. For example:
=> SELECT HEX_TO_INTEGER ('0xaedc')
AS hex1,HEX_TO_INTEGER ('aedc') AS hex2;
hex1 | hex2
-------+-------
44764 | 44764
(1 row)
If you pass the function an invalid hex value, Vertica returns an invalid input syntax error; for example:
=> SELECT HEX_TO_INTEGER ('0xffgf');
ERROR 3691: Invalid input syntax for numeric: "0xffgf"
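The value 44764 can be verified by hand: 0xAEDC = 10×4096 + 14×256 + 13×16 + 12. The Python sketch below models the conversion for illustration; the function name is an assumption, and Python's int() is used in place of Vertica's NUMERIC cast.

```python
def hex_to_integer(text: str) -> int:
    """Hypothetical model: base-16 parse; int() accepts an optional 0x prefix."""
    return int(text, 16)

print(hex_to_integer("0xaedc"), hex_to_integer("aedc"))
```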
4.5.16 - INITCAP
Capitalizes first letter of each alphanumeric word and puts the rest in lowercase.
Behavior type
Immutable
Syntax
INITCAP ( expression )
Arguments
expression
- (VARCHAR) is the string to format.
Notes
- Depends on collation setting of the locale.
- INITCAP is restricted to 32750 octet inputs, since it is possible for the UTF-8 representation of result to double in size.
Examples
Expression | Result
SELECT INITCAP('high speed database'); | High Speed Database
SELECT INITCAP('LINUX TUTORIAL'); | Linux Tutorial
SELECT INITCAP('abc DEF 123aVC 124Btd,lAsT'); | Abc Def 123Avc 124Btd,Last
SELECT INITCAP(''); |
SELECT INITCAP(null); |
4.5.17 - INITCAPB
Capitalizes first letter of each alphanumeric word and puts the rest in lowercase. Multibyte characters are not converted and are skipped.
Behavior type
Immutable
Syntax
INITCAPB ( expression )
Arguments
expression
- (VARCHAR) is the string to format.
Notes
Depends on collation setting of the locale.
Examples
Expression | Result
SELECT INITCAPB('étudiant'); | éTudiant
SELECT INITCAPB('high speed database'); | High Speed Database
SELECT INITCAPB('LINUX TUTORIAL'); | Linux Tutorial
SELECT INITCAPB('abc DEF 123aVC 124Btd,lAsT'); | Abc Def 123Avc 124Btd,Last
SELECT INITCAPB(''); |
SELECT INITCAPB(null); |
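The results in the table, including éTudiant and 123Avc, are consistent with a rule that capitalizes the first letter of each run of single-byte ASCII letters and lowercases the rest of the run. The Python sketch below implements that inferred rule. It is an assumption drawn from the documented outputs, not a description of Vertica's implementation, and the function name is hypothetical.

```python
import re

def initcapb(text):
    """Hypothetical model: title-case each run of ASCII letters, skip everything else."""
    if text is None:
        return None
    def fix(match):
        word = match.group(0)
        return word[0].upper() + word[1:].lower()
    # Digits, punctuation, and multibyte characters act as separators.
    return re.sub(r"[A-Za-z]+", fix, text)

print(initcapb("étudiant"))
print(initcapb("abc DEF 123aVC 124Btd,lAsT"))
```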
4.5.18 - INSERT
Inserts a character string into a specified location in another character string.
Syntax
INSERT( 'string1', n, m, 'string2' )
Behavior type
Immutable
Arguments
string1
- (VARCHAR) Is the string in which to insert the new string.
n
- An INTEGER that represents the starting point for the insertion within string1. You specify the number of characters from the first character in string1 as the starting point for the insertion. For example, to insert characters before "c" in the string "abcdef", enter 3.
m
- An INTEGER that represents the number of characters in string1 (if any) that should be replaced by the insertion. For example, if you want the insertion to replace the letters "cd" in the string "abcdef", enter 2.
string2
- (VARCHAR) Is the string to be inserted.
Examples
The following example changes the string Warehouse to Storehouse using the INSERT function:
=> SELECT INSERT ('Warehouse',1,3,'Stor');
INSERT
------------
Storehouse
(1 row)
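The roles of n (1-based start position) and m (number of characters replaced) can be modeled with string slicing. This Python sketch is an illustration only; the function name insert is an assumption, and out-of-range arguments are not modeled.

```python
def insert(string1: str, n: int, m: int, string2: str) -> str:
    """Hypothetical model: replace m characters of string1, starting at position n, with string2."""
    start = n - 1                      # convert the 1-based position to a 0-based index
    return string1[:start] + string2 + string1[start + m:]

print(insert("Warehouse", 1, 3, "Stor"))  # replaces "War"
print(insert("abcdef", 3, 2, "XY"))       # replaces "cd"
print(insert("abcdef", 3, 0, "XY"))       # m = 0 inserts before "c"
```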
4.5.19 - INSTR
Searches string for substring and returns an integer indicating the position of the character in string that is the first character of this occurrence. The return value is based on the character position of the identified character.
Behavior type
Immutable
Syntax
INSTR ( string , substring [, position [, occurrence ] ] )
Arguments
string
- (CHAR or VARCHAR, or BINARY or VARBINARY) Text expression to search.
substring
- (CHAR or VARCHAR, or BINARY or VARBINARY) String to search for.
position
- Nonzero integer indicating the character of string where Vertica begins the search. If position is negative, then Vertica counts backward from the end of string and then searches backward from the resulting position. The first character of string occupies the default position 1, and position cannot be 0.
occurrence
- Integer indicating which occurrence of substring Vertica searches for. The value of occurrence must be positive (greater than 0), and the default is 1.
Notes
Both position and occurrence must be of types that can resolve to an integer. The default values of both parameters are 1, meaning Vertica begins searching at the first character of string for the first occurrence of substring. The return value is relative to the beginning of string, regardless of the value of position, and is expressed in characters.
If the search is unsuccessful (that is, if substring does not appear occurrence times after the position character of string), the return value is 0.
Examples
The first example searches forward in string ‘abc’ for substring ‘b’. The search returns the position in ‘abc’ where ‘b’ occurs, or position 2. Because no position parameters are given, the default search starts at ‘a’, position 1.
=> SELECT INSTR('abc', 'b');
INSTR
-------
2
(1 row)
The following examples use character position to search backward to find the position of a substring.
Note
Although it might seem intuitive that the function returns a negative integer, the position of the nth occurrence is read left to right in the string, even though the search happens in reverse (from the end, or right side, of the string).
In the first example, the function counts backward one character from the end of the string, starting with character ‘c’. The function then searches backward for the first occurrence of ‘a’, which it finds in the first position in the search string.
=> SELECT INSTR('abc', 'a', -1);
INSTR
-------
1
(1 row)
In the second example, the function counts backward one byte from the end of the string, starting with character ‘c’. The function then searches backward for the first occurrence of ‘a’, which it finds in the first position in the search string.
=> SELECT INSTR(VARBINARY 'abc', VARBINARY 'a', -1);
INSTR
-------
1
(1 row)
In the third example, the function counts backward one character from the end of the string, starting with character ‘b’, and searches backward for substring ‘bc’, which it finds in the second position of the search string.
=> SELECT INSTR('abcb', 'bc', -1);
INSTR
-------
2
(1 row)
In the fourth example, the function counts backward one character from the end of the string, starting with character ‘b’, and searches backward for substring ‘bcef’, which it does not find. The result is 0.
=> SELECT INSTR('abcb', 'bcef', -1);
INSTR
-------
0
(1 row)
In the fifth example, the function counts backward one byte from the end of the string, starting with character ‘b’, and searches backward for substring ‘bcef’, which it does not find. The result is 0.
=> SELECT INSTR(VARBINARY 'abcb', VARBINARY 'bcef', -1);
INSTR
-------
0
(1 row)
Multibyte characters are treated as a single character:
=> SELECT INSTR('aébc', 'b');
INSTR
-------
3
(1 row)
Use INSTRB to treat multibyte characters as binary:
=> SELECT INSTRB('aébc', 'b');
INSTRB
--------
4
(1 row)
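The forward and backward searches in these examples can be modeled with one routine. The Python sketch below is an illustration, not Vertica's implementation: the function name is hypothetical, and it assumes that with a negative position a match only has to start at or before the computed starting character. Passing bytes instead of str gives the octet-based behavior of INSTRB.

```python
def instr(string, substring, position=1, occurrence=1):
    """Hypothetical model: 1-based position of the nth match, or 0 if not found."""
    if position == 0 or occurrence < 1:
        raise ValueError("position must be nonzero and occurrence positive")
    n = len(string)
    if position > 0:
        starts = range(position - 1, n - len(substring) + 1)  # scan left to right
    else:
        starts = range(n + position, -1, -1)                  # scan right to left
    found = 0
    for i in starts:
        if string.startswith(substring, i):
            found += 1
            if found == occurrence:
                return i + 1    # always counted from the left
    return 0

print(instr("abc", "b"), instr("abc", "a", -1), instr("abcb", "bc", -1))
print(instr("aébc", "b"), instr("aébc".encode("utf-8"), b"b"))
```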
4.5.20 - INSTRB
Searches string for substring and returns an integer indicating the octet position within string that is the first occurrence. The return value is based on the octet position of the identified byte.
Behavior type
Immutable
Syntax
INSTRB ( string , substring [, position [, occurrence ] ] )
Arguments
string
- Is the text expression to search.
substring
- Is the string to search for.
position
- Is a nonzero integer indicating the character of string where Vertica begins the search. If position is negative, then Vertica counts backward from the end of string and then searches backward from the resulting position. The first byte of string occupies the default position 1, and position cannot be 0.
occurrence
- Is an integer indicating which occurrence of substring Vertica searches for. The value of occurrence must be positive (greater than 0), and the default is 1.
Notes
Both position and occurrence must be of types that can resolve to an integer. The default values of both parameters are 1, meaning Vertica begins searching at the first byte of string for the first occurrence of substring. The return value is relative to the beginning of string, regardless of the value of position, and is expressed in octets.
If the search is unsuccessful (that is, if substring does not appear occurrence times after the position character of string), then the return value is 0.
Examples
=> SELECT INSTRB('straße', 'ß');
INSTRB
--------
5
(1 row)
4.5.21 - ISUTF8
Tests whether a string is a valid UTF-8 string. Returns true if the string conforms to UTF-8 standards, and false otherwise. This function is useful to test strings for UTF-8 compliance before passing them to one of the regular expression functions, such as REGEXP_LIKE, which expect UTF-8 characters by default.
ISUTF8 checks for invalid UTF-8 byte sequences according to UTF-8 rules. The presence of any invalid UTF-8 byte sequence results in a return value of false.
To coerce a string to UTF-8, use MAKEUTF8.
Syntax
ISUTF8( string );
Arguments
string
- The string to test for UTF-8 compliance.
Examples
=> SELECT ISUTF8(E'\xC2\xBF'); -- UTF-8 INVERTED QUESTION MARK
ISUTF8
--------
t
(1 row)
=> SELECT ISUTF8(E'\xC2\xC0'); -- UNDEFINED UTF-8 CHARACTER
ISUTF8
--------
f
(1 row)
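A common pattern is to locate non-compliant values before cleaning them. The sketch below assumes a hypothetical table names with a VARCHAR column name; both are illustrative and no output is shown:

```sql
-- List rows whose name is not valid UTF-8, alongside the coerced value.
-- MAKEUTF8 is called with its default settings here.
=> SELECT name, MAKEUTF8(name) AS cleaned
   FROM names
   WHERE NOT ISUTF8(name);
```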
4.5.22 - JARO_DISTANCE
Calculates and returns the Jaro similarity, an edit distance between two sequences. It is useful for queries designed for short strings, such as finding similar names. Also see Jaro-Winkler distance, which adds a prefix scale favoring strings that match in the beginning, and edit distance, which returns the Levenshtein distance between two strings.
Behavior type
Immutable
Syntax
JARO_DISTANCE (string-expression1, string-expression2)
Arguments
string-expression1, string-expression2
- The two VARCHAR expressions to compare. Neither can be NULL.
Example
Return only the names with a Jaro distance from 'rode' that is greater than 0.6:
=> SELECT name FROM names WHERE JARO_DISTANCE('rode', name) > 0.6;
name
---------
fred
frieda
rodgers
rogers
(4 rows)
4.5.23 - JARO_WINKLER_DISTANCE
Calculates and returns the Jaro-Winkler similarity, an edit distance between two sequences. It is useful for queries designed for short strings, such as finding similar names. It is a variant of the Jaro distance metric, to which it adds a prefix scale giving more favorable ratings for strings that match from the beginning. See also edit distance, which returns the Levenshtein distance between two strings.
Behavior type
Immutable
Syntax
JARO_WINKLER_DISTANCE (string-expression1 , string-expression2 [ USING PARAMETERS prefix_scale=scale, prefix_length=length])
Arguments
string-expression1, string-expression2
- The two VARCHAR expressions to compare. Neither can be NULL.
Parameters
scale
- A FLOAT specifying the scale value by which to weight the importance of matching prefixes. Optional. Default: 0.1
length
- A non-negative INT representing the maximum matching prefix length. Optional. Default: 4
Examples
Return only the names with a Jaro-Winkler distance from 'rode' that is greater than 0.6:
=> SELECT name FROM names WHERE JARO_WINKLER_DISTANCE('rode', name) > 0.6;
name
---------
fred
frieda
rodgers
rogers
(4 rows)
The Jaro-Winkler distance between 'help' and 'hello' given a prefix_scale of 0.1 and prefix_length of 0 is 0.783333333333333:
=> select JARO_WINKLER_DISTANCE('help', 'hello' USING PARAMETERS prefix_scale=0.1, prefix_length=0);
jaro_winkler_distance
-----------------------
0.783333333333333
(1 row)
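To see the effect of the prefix parameters, the two functions can be compared on the same pair of strings. This is a sketch with no captured output. Assuming the standard Jaro-Winkler definition, jw_no_prefix should equal jaro, while jw_default should be higher because 'help' and 'hello' share the prefix 'hel'. The column aliases are illustrative:

```sql
-- Compare plain Jaro with Jaro-Winkler, with and without prefix weighting.
=> SELECT JARO_DISTANCE('help', 'hello') AS jaro,
          JARO_WINKLER_DISTANCE('help', 'hello') AS jw_default,
          JARO_WINKLER_DISTANCE('help', 'hello' USING PARAMETERS prefix_length=0) AS jw_no_prefix;
```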
4.5.24 - LEAST
Returns the smallest value in a list of expressions of any data type. All data types in the list must be the same or compatible. A NULL value in any one of the expressions returns NULL. Results can vary, depending on the locale's collation setting.
Behavior type
Stable
Syntax
LEAST ( { * | expression[,...] } )
Arguments
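* | expression[,...]
- The expressions to evaluate: either * (asterisk), which evaluates all columns of the queried table or view, or one or more expressions of the same or compatible data types.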
Examples
LEAST returns 5 as the smallest value in the list:
=> SELECT LEAST(7, 5, 10);
LEAST
-------
5
(1 row)
If you put quotes around the integer expressions, LEAST compares the values as strings and returns '10' as the smallest value:
=> SELECT LEAST('7', '5', '10');
LEAST
-------
10
(1 row)
LEAST returns 1.5, as INTEGER 2 is implicitly cast to FLOAT:
=> SELECT LEAST(2, 1.5);
LEAST
-------
1.5
(1 row)
LEAST queries all columns in a view based on the VMart table product_dimension, and returns the smallest value in each row:
=> CREATE VIEW query1 AS SELECT shelf_width, shelf_height, shelf_depth FROM product_dimension;
CREATE VIEW
=> SELECT shelf_height, shelf_width, shelf_depth, least(*) FROM query1 WHERE shelf_height = 5;
shelf_height | shelf_width | shelf_depth | least
--------------+-------------+-------------+-------
5 | 3 | 4 | 3
5 | 4 | 3 | 3
5 | 1 | 4 | 1
5 | 4 | 1 | 1
5 | 2 | 4 | 2
5 | 2 | 3 | 2
5 | 1 | 3 | 1
5 | 1 | 3 | 1
5 | 5 | 1 | 1
5 | 2 | 4 | 2
5 | 4 | 5 | 4
5 | 2 | 4 | 2
5 | 4 | 4 | 4
5 | 3 | 4 | 3
...
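The NULL rule stated in the description can be checked directly. This is a minimal sketch; the expected result follows from that rule rather than from a captured session:

```sql
-- Any NULL argument makes the whole result NULL,
-- so this comparison is expected to return t.
=> SELECT LEAST(7, NULL::INT, 10) IS NULL;
```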
See also
GREATEST
4.5.25 - LEASTB
Returns the smallest value in a list of expressions of any data type, using binary ordering. All data types in the list must be the same or compatible. A NULL value in any one of the expressions returns NULL.
Behavior type
Immutable
Syntax
LEASTB ( { * | expression[,...] } )
Arguments
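* | expression[,...]
- The expressions to evaluate: either * (asterisk), which evaluates all columns of the queried table or view, or one or more expressions of the same or compatible data types.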
Examples
The following command selects strasse as the smallest value in the list:
=> SELECT LEASTB('straße', 'strasse');
LEASTB
---------
strasse
(1 row)
LEASTB returns 5 as the smallest value in the list:
=> SELECT LEASTB(7, 5, 10);
LEASTB
--------
5
(1 row)
If you put quotes around the integer expressions, LEASTB compares the values as strings and returns '10' as the smallest value:
=> SELECT LEASTB('7', '5', '10');
LEASTB
--------
10
(1 row)
The next example returns 1.5, as INTEGER 2 is implicitly cast to FLOAT:
=> SELECT LEASTB(2, 1.5);
LEASTB
--------
1.5
(1 row)
LEASTB queries all columns in a view based on the VMart table product_dimension, and returns the smallest value in each row:
=> CREATE VIEW query1 AS SELECT shelf_width, shelf_height, shelf_depth FROM product_dimension;
CREATE VIEW
=> SELECT shelf_height, shelf_width, shelf_depth, leastb(*) FROM query1 WHERE shelf_height = 5;
shelf_height | shelf_width | shelf_depth | leastb
--------------+-------------+-------------+--------
5 | 3 | 4 | 3
5 | 4 | 3 | 3
5 | 1 | 4 | 1
5 | 4 | 1 | 1
5 | 2 | 4 | 2
5 | 2 | 3 | 2
5 | 1 | 3 | 1
5 | 1 | 3 | 1
5 | 5 | 1 | 1
5 | 2 | 4 | 2
5 | 4 | 5 | 4
5 | 2 | 4 | 2
5 | 4 | 4 | 4
5 | 3 | 4 | 3
5 | 5 | 4 | 4
5 | 5 | 1 | 1
5 | 3 | 1 | 1
...
See also
GREATESTB
4.5.26 - LEFT
Returns the specified characters from the left side of a string.
Behavior type
Immutable
Syntax
LEFT ( string-expr, length )
Arguments
string-expr
- The string expression to return.
length
- An integer value that specifies how many characters to return.
Examples
=> SELECT LEFT('vertica', 3);
LEFT
------
ver
(1 row)
=> SELECT DISTINCT(LEFT(customer_name, 4)) FnameTruncated
   FROM customer_dimension ORDER BY FnameTruncated LIMIT 10;
FnameTruncated
----------------
Alex
Amer
Amy
Anna
Barb
Ben
Bett
Bria
Carl
Crai
(10 rows)
See also
SUBSTR
4.5.27 - LENGTH
Returns the length of a string. The behavior of LENGTH varies according to the input data type:
- CHAR and VARCHAR: Identical to CHARACTER_LENGTH, returns the string length in UTF-8 characters.
- CHAR: Strips padding.
- BINARY and VARBINARY: Identical to OCTET_LENGTH, returns the string length in bytes (octets).
Behavior type
Immutable
Syntax
LENGTH ( expression )
Arguments
expression
- String to evaluate, one of the following: CHAR, VARCHAR, BINARY or VARBINARY.
Examples
Statement | Returns
SELECT LENGTH('1234  '::CHAR(10)); | 4
SELECT LENGTH('1234  '::VARCHAR(10)); | 6
SELECT LENGTH('1234  '::BINARY(10)); | 10
SELECT LENGTH('1234  '::VARBINARY(10)); | 6
SELECT LENGTH(NULL::CHAR(10)) IS NULL; | t
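The difference between character and octet counts shows up with multi-byte data. A short sketch, assuming UTF-8 input in which ß occupies two bytes; the results are derived from the definitions above rather than captured from a session:

```sql
-- LENGTH counts UTF-8 characters for VARCHAR input; OCTET_LENGTH counts bytes.
=> SELECT LENGTH('straße'), OCTET_LENGTH('straße');
LENGTH | OCTET_LENGTH
--------+--------------
6 | 7
(1 row)
```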
See also
BIT_LENGTH
4.5.28 - LOWER
Takes a string value and returns a VARCHAR value converted to lowercase.
Behavior type
Stable
Syntax
LOWER ( expression )
Arguments
expression
- CHAR or VARCHAR string to convert, where the string width is ≤ 65000 octets.
Important
In practice, expression should not exceed 32,500 octets. LOWER does not use the locale's collation setting (for example, collation=binary) to identify its encoding; rather, it treats the input argument as a UTF-8 encoded string. The UTF-8 representation of the input value might be double its original width. As a result, LOWER returns an error if the input value exceeds 32,500 octets.
Note also that if expression is a table column, LOWER calculates its size from the column's defined width, and not from the column data. If the column width is greater than VARCHAR(32500), Vertica returns an error.
Examples
=> SELECT LOWER('AbCdEfG');
LOWER
---------
abcdefg
(1 row)
=> SELECT LOWER('The Bat In The Hat');
LOWER
--------------------
the bat in the hat
(1 row)
=> SELECT LOWER('ÉTUDIANT');
LOWER
----------
étudiant
(1 row)
4.5.29 - LOWERB
Returns a character string with each ASCII character converted to lowercase. Multi-byte characters are skipped and not converted.
Behavior type
Immutable
Syntax
LOWERB ( expression )
Arguments
expression
- CHAR or VARCHAR string to convert
Examples
In the following example, the multi-byte UTF-8 character É is not converted to lowercase:
=> SELECT LOWERB('ÉTUDIANT');
LOWERB
----------
Étudiant
(1 row)
=> SELECT LOWER('ÉTUDIANT');
LOWER
----------
étudiant
(1 row)
=> SELECT LOWERB('AbCdEfG');
LOWERB
---------
abcdefg
(1 row)
=> SELECT LOWERB('The Vertica Database');
LOWERB
----------------------
the vertica database
(1 row)
4.5.30 - LPAD
Returns a VARCHAR value representing a string of a specific length filled on the left with specific characters.
Behavior type
Immutable
Syntax
LPAD ( expression , length [ , fill ] )
Arguments
expression
- (CHAR OR VARCHAR) specifies the string to fill
length
- (INTEGER) specifies the number of characters to return
fill
- (CHAR OR VARCHAR) specifies the repeating string of characters with which to fill the output string. The default is the space character.
Examples
=> SELECT LPAD('database', 15, 'xzy');
LPAD
-----------------
xzyxzyxdatabase
(1 row)
If the string is already longer than the specified length, it is truncated on the right:
=> SELECT LPAD('establishment', 10, 'abc');
LPAD
------------
establishm
(1 row)
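A typical use is zero-padding a numeric string to a fixed width. The values below are illustrative, and the result is derived from the argument descriptions above rather than captured from a session:

```sql
-- Pad '42' on the left with '0' to a total width of 5 characters.
=> SELECT LPAD('42', 5, '0');
LPAD
-------
00042
(1 row)
```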
4.5.31 - LTRIM
Returns a VARCHAR value representing a string with leading blanks removed from the left side (beginning).
Behavior type
Immutable
Syntax
LTRIM ( expression [ , characters ] )
Arguments
expression
- (CHAR or VARCHAR) is the string to trim
characters
- (CHAR or VARCHAR) specifies the characters to remove from the left side of expression. The default is the space character.
Examples
=> SELECT LTRIM('zzzyyyyyyxxxxxxxxtrim', 'xyz');
LTRIM
-------
trim
(1 row)
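When characters is omitted, only leading spaces are removed and trailing spaces are kept. A small sketch that brackets the result to make the remaining whitespace visible; the expected value follows from the default described above:

```sql
-- Expected result: [trim  ]
=> SELECT '[' || LTRIM('   trim  ') || ']';
```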
4.5.32 - MAKEUTF8
Coerces a string to UTF-8 by removing or replacing non-UTF-8 characters.
MAKEUTF8 flags invalid UTF-8 characters byte by byte. For example, the byte sequence 0xE0 0x7F 0x80 is an invalid three-byte UTF-8 sequence, but the middle byte, 0x7F, is a valid one-byte UTF-8 character. In this example, 0x7F is preserved and the other two bytes are removed or replaced.
Syntax
MAKEUTF8( string-expression [USING PARAMETERS param=value] );
Arguments
string-expression
- The string expression to evaluate for non-UTF-8 characters
Parameters
replacement_string
- Specifies the VARCHAR(16) string that MAKEUTF8 uses to replace each non-UTF-8 character that it f