ISUTF8

Tests whether a string is a valid UTF-8 string.

Tests whether a string is a valid UTF-8 string. Returns true if the string conforms to UTF-8 standards, and false otherwise. This function is useful to test strings for UTF-8 compliance before passing them to one of the regular expression functions, such as REGEXP_LIKE, which expect UTF-8 characters by default.

ISUTF8 checks for invalid UTF8 byte sequences, according to UTF-8 rules:

  • invalid bytes

  • an unexpected continuation byte

  • a start byte not followed by enough continuation bytes

  • an Overload Encoding

The presence of an invalid UTF-8 byte sequence results in a return value of false.

To coerce a string to UTF-8, use MAKEUTF8.

Syntax

ISUTF8( string );

Arguments

string
The string to test for UTF-8 compliance.

Examples

=> SELECT ISUTF8(E'\xC2\xBF'); -- UTF-8 INVERTED QUESTION MARK ISUTF8
--------
 t
(1 row)

=> SELECT ISUTF8(E'\xC2\xC0'); -- UNDEFINED UTF-8 CHARACTER
 ISUTF8
--------
 f
(1 row)