Binary data types (BINARY and VARBINARY)
Store raw-byte data, such as IP addresses, up to 65000 bytes. The BINARY and BINARY VARYING (VARBINARY) data types are collectively referred to as binary string types and the values of binary string types are referred to as binary strings. A binary string is a sequence of octets or bytes.
BYTEA and RAW are synonyms for VARBINARY.
Syntax
BINARY ( length )
{ VARBINARY | BINARY VARYING | BYTEA | RAW } ( max-length )
Arguments
length
,max-length
- The length of the string or column width, in bytes (octets).
BINARY and VARBINARY data types
BINARY and VARBINARY data types have the following attributes:
-
BINARY: A fixed-width string of
length
bytes, where the number of bytes is declared as an optional specifier to the type. Iflength
is omitted, the default is 1. Where necessary, values are right-extended to the full width of the column with the zero byte. For example:=> SELECT TO_HEX('ab'::BINARY(4)); to_hex ---------- 61620000
-
VARBINARY: A variable-width string up to a length of
max-length
bytes, where the maximum number of bytes is declared as an optional specifier to the type. The default is the default attribute size, which is 80, and the maximum length is 65000 bytes. VARBINARY values are not extended to the full width of the column. For example:=> SELECT TO_HEX('ab'::VARBINARY(4)); to_hex -------- 6162
Input formats
You can use several formats when working with binary values. The hexadecimal format is generally the most straightforward and is emphasized in Vertica documentation.
Binary values can also be represented in octal format by prefixing the value with a backslash '\'
.
Note
If you usevsql
, you must use the escape character (\
) when you insert another backslash on input; for example, input '\141'
as '\\141'
.
You can also input values represented by printable characters. For example, the hexadecimal value '0x61'
can also be represented by the symbol a
.
See Data load.
On input, strings are translated from:
-
Hexadecimal representation to a binary value using the function HEX_TO_BINARY.
-
Bitstring representation to a binary value using the function BITSTRING_TO_BINARY.
Both functions take a VARCHAR argument and return a VARBINARY value.
Output formats
Like the input format, the output format is a hybrid of octal codes and printable ASCII characters. A byte in the range of printable ASCII characters (the range [0x20, 0x7e]
) is represented by the corresponding ASCII character, with the exception of the backslash ('\'
), which is escaped as '\\'
. All other byte values are represented by their corresponding octal values. For example, the bytes {97,92,98,99}, which in ASCII are {a,\,b,c}
, are translated to text as 'a\\bc'
.
Binary operators and functions
The binary operators &
, ~
, |
, and #
have special behavior for binary data types, as described in Bitwise operators.
The following aggregate functions are supported for binary data types:
BIT_AND, BIT_OR, and BIT_XOR are bit-wise operations that are applied to each non-null value in a group, while MAX and MIN are byte-wise comparisons of binary values.
Like their binary operator counterparts, if the values in a group vary in length, the aggregate functions treat the values as though they are all equal in length by extending shorter values with zero bytes to the full width of the column. For example, given a group containing the values 'ff', null, and 'f'
, a binary aggregate ignores the null value and treats the value 'f'
as 'f0'
. Also, like their binary operator counterparts, these aggregate functions operate on VARBINARY types explicitly and operate on BINARY types implicitly through casts. See Data type coercion operators (CAST).
Binary versus character data types
The BINARY and VARBINARY binary types are similar to the CHAR and VARCHAR character data types, respectively. They differ as follows:
-
Binary data types contain byte strings (a sequence of octets or bytes).
-
Character data types contain character strings (text).
-
The lengths of binary data types are measured in bytes, while character data types are measured in characters.
Examples
The following example shows HEX_TO_BINARY and TO_HEX usage.
Table t
and its projection are created with binary columns:
=> CREATE TABLE t (c BINARY(1));
=> CREATE PROJECTION t_p (c) AS SELECT c FROM t;
Insert minimum byte and maximum byte values:
=> INSERT INTO t values(HEX_TO_BINARY('0x00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFF'));
Binary values can then be formatted in hex on output using the TO_HEX function:
=> SELECT TO_HEX(c) FROM t;
to_hex
--------
00
ff
(2 rows)
The BIT_AND, BIT_OR, and BIT_XOR functions are interesting when operating on a group of values. For example, create a sample table and projections with binary columns:
The example that follows uses table t
with a single column of VARBINARY
data type:
=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));
Query table t
to see column c
output:
=> SELECT TO_HEX(c) FROM t;
TO_HEX
--------
ff00
ffff
f00f
(3 rows)
Now issue the bitwise AND operation. Because these are aggregate functions, an implicit GROUP BY operation is performed on results using (ff00&(ffff)&f00f)
:
=> SELECT TO_HEX(BIT_AND(c)) FROM t;
TO_HEX
--------
f000
(1 row)
Issue the bitwise OR operation on (ff00|(ffff)|f00f)
:
=> SELECT TO_HEX(BIT_OR(c)) FROM t;
TO_HEX
--------
ffff
(1 row)
Issue the bitwise XOR operation on (ff00#(ffff)#f00f)
:
=> SELECT TO_HEX(BIT_XOR(c)) FROM t;
TO_HEX
--------
f0f0
(1 row)