Binary data types (BINARY and VARBINARY)

Store raw-byte data, such as IP addresses, up to bytes.

Store raw-byte data, such as IP addresses, up to 65000 bytes. The BINARY and BINARY VARYING (VARBINARY) data types are collectively referred to as binary string types and the values of binary string types are referred to as binary strings. A binary string is a sequence of octets or bytes.

BYTEA and RAW are synonyms for VARBINARY.

Syntax

BINARY ( length )
{ VARBINARY | BINARY VARYING | BYTEA | RAW } ( max-length )

Arguments

length, max-length
The length of the string or column width, in bytes (octets).

BINARY and VARBINARY data types

BINARY and VARBINARY data types have the following attributes:

  • BINARY: A fixed-width string of length bytes, where the number of bytes is declared as an optional specifier to the type. If length is omitted, the default is 1. Where necessary, values are right-extended to the full width of the column with the zero byte. For example:

    => SELECT TO_HEX('ab'::BINARY(4));
      to_hex
    ----------
     61620000
    
  • VARBINARY: A variable-width string up to a length of max-length bytes, where the maximum number of bytes is declared as an optional specifier to the type. The default is the default attribute size, which is 80, and the maximum length is 65000 bytes. VARBINARY values are not extended to the full width of the column. For example:

    => SELECT TO_HEX('ab'::VARBINARY(4));
     to_hex
    --------
     6162
    

Input formats

You can use several formats when working with binary values. The hexadecimal format is generally the most straightforward and is emphasized in Vertica documentation.

Binary values can also be represented in octal format by prefixing the value with a backslash '\'.

You can also input values represented by printable characters. For example, the hexadecimal value '0x61' can also be represented by the symbol a.

See Data load.

On input, strings are translated from:

Both functions take a VARCHAR argument and return a VARBINARY value.

Output formats

Like the input format, the output format is a hybrid of octal codes and printable ASCII characters. A byte in the range of printable ASCII characters (the range [0x20, 0x7e]) is represented by the corresponding ASCII character, with the exception of the backslash ('\'), which is escaped as '\\'. All other byte values are represented by their corresponding octal values. For example, the bytes {97,92,98,99}, which in ASCII are {a,\,b,c}, are translated to text as 'a\\bc'.

Binary operators and functions

The binary operators &, ~, |, and # have special behavior for binary data types, as described in Bitwise operators.

The following aggregate functions are supported for binary data types:

BIT_AND, BIT_OR, and BIT_XOR are bit-wise operations that are applied to each non-null value in a group, while MAX and MIN are byte-wise comparisons of binary values.

Like their binary operator counterparts, if the values in a group vary in length, the aggregate functions treat the values as though they are all equal in length by extending shorter values with zero bytes to the full width of the column. For example, given a group containing the values 'ff', null, and 'f', a binary aggregate ignores the null value and treats the value 'f' as 'f0'. Also, like their binary operator counterparts, these aggregate functions operate on VARBINARY types explicitly and operate on BINARY types implicitly through casts. See Data type coercion operators (CAST).

Binary versus character data types

The BINARY and VARBINARY binary types are similar to the CHAR and VARCHAR character data types, respectively. They differ as follows:

  • Binary data types contain byte strings (a sequence of octets or bytes).

  • Character data types contain character strings (text).

  • The lengths of binary data types are measured in bytes, while character data types are measured in characters.

Examples

The following example shows HEX_TO_BINARY and TO_HEX usage.

Table t and its projection are created with binary columns:

=> CREATE TABLE t (c BINARY(1));
=> CREATE PROJECTION t_p (c) AS SELECT c FROM t;

Insert minimum byte and maximum byte values:

=> INSERT INTO t values(HEX_TO_BINARY('0x00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFF'));

Binary values can then be formatted in hex on output using the TO_HEX function:

=> SELECT TO_HEX(c) FROM t;
 to_hex
 --------
  00
  ff
 (2 rows)

The BIT_AND, BIT_OR, and BIT_XOR functions are interesting when operating on a group of values. For example, create a sample table and projections with binary columns:

The example that follows uses table t with a single column of VARBINARY data type:

=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));

Query table t to see column c output:

=> SELECT TO_HEX(c) FROM t;
 TO_HEX
--------
 ff00
 ffff
 f00f
(3 rows)

Now issue the bitwise AND operation. Because these are aggregate functions, an implicit GROUP BY operation is performed on results using (ff00&(ffff)&f00f):

=> SELECT TO_HEX(BIT_AND(c)) FROM t;
 TO_HEX
--------
f000
(1 row)

Issue the bitwise OR operation on (ff00|(ffff)|f00f):

=> SELECT TO_HEX(BIT_OR(c)) FROM t;
 TO_HEX
--------
ffff
(1 row)

Issue the bitwise XOR operation on (ff00#(ffff)#f00f):

=> SELECT TO_HEX(BIT_XOR(c)) FROM t;
 TO_HEX
--------
f0f0
(1 row)