Best practices for Safe Unicode FPE
Starting with Voltage SecureData Simple API 6.0, you can use Safe Unicode Format Preserving Encryption (FPE) formats to encrypt and decrypt Unicode strings.
When you encrypt Unicode with a Safe Unicode FPE format, the plaintext passed into the VoltageSecureProtect function is first encrypted over an alphabet of all Unicode code points, and then encoded with Base32K encoding to produce encrypted text in Unicode Normalization Form C (NFC). The encrypted text will generally consist of 3-byte Chinese and Japanese characters, as they account for a significant portion of the NFC-stable alphabet. VoltageSecureAccess reverses this process.
Unlike regular FPE, which handles ASCII strings consisting of single bytes, Safe Unicode FPE handles strings consisting of variable-length code points. This variability introduces an important complication: Safe Unicode FPE formats like UNICODE_BASE32K are not length-preserving. The encryption algorithm guarantees that the encrypted text will be larger than the original, and failing to account for this expansion can lead to truncated or otherwise improperly stored encrypted text, making decryption impossible.
Additionally, Unicode allows semantically equivalent characters to be encoded in different ways, which means that unnormalized, semantically equivalent plaintexts can have different forms when encrypted, compromising referential integrity. To prevent this, you should always normalize your plaintext strings into their NFC forms before encryption. For more information on Unicode normalization, see the Unicode Consortium's Normalization FAQ.
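The normalization step can be illustrated with a minimal Python sketch. It uses only the standard library's unicodedata module and does not call the Simple API; the helper name to_nfc is hypothetical and chosen for illustration.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (NFC)."""
    return unicodedata.normalize("NFC", text)


# Two semantically equivalent spellings of the same word.
composed = "caf\u00e9"      # 'é' as a single precomposed code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# The raw strings differ, so they would not encrypt to the same value.
assert composed != decomposed

# After NFC normalization both spellings are the identical string.
assert to_nfc(composed) == to_nfc(decomposed)
```

Normalizing every plaintext this way before it is passed to the encryption call means equivalent inputs reach the encryption step as the same sequence of code points.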
The encrypted text is guaranteed to be larger than the plaintext in the following ways:
- 16/15 times as long in characters (rounded up)
- up to 4 times larger in bytes
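As a rough worked example of these two bounds, the following Python sketch estimates the storage a given plaintext may need once encrypted. It is not part of the Simple API. The function name ciphertext_bounds is hypothetical, and the sketch assumes that "characters" means Unicode code points and that byte sizes are measured in UTF-8; check both assumptions against your own storage system.

```python
import unicodedata
from typing import Tuple


def ciphertext_bounds(plaintext: str) -> Tuple[int, int]:
    """Estimate (max_characters, max_bytes) for the encrypted text.

    Assumptions (not confirmed by the product documentation):
    - character length is counted in Unicode code points
    - byte size is measured on the UTF-8 encoding of the plaintext
    """
    # Measure the NFC form, since that is what should be encrypted.
    normalized = unicodedata.normalize("NFC", plaintext)

    # 16/15 of the character length, rounded up (integer ceiling division).
    max_chars = (16 * len(normalized) + 14) // 15

    # Up to 4 times the plaintext size in bytes.
    max_bytes = 4 * len(normalized.encode("utf-8"))

    return max_chars, max_bytes


# "Zoë Müller": 10 code points, 12 bytes in UTF-8.
chars, size = ciphertext_bounds("Zo\u00eb M\u00fcller")
print(chars, size)  # prints "11 48"
```

Applying the function to the longest and largest plaintext values expected in a field gives a starting point for sizing the column that will hold the encrypted text.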
For a list of predefined formats for Safe Unicode FPE, consult your copy of the Voltage SecureData Simple API documentation.
Checklist for Safe Unicode FPE
Before using Safe Unicode FPE:
- Normalize your plaintext to its NFC form before encryption.
- For fixed-length data types: remove any padding from the plaintext string.
- Ensure that the columns storing the encrypted text can handle:
  - strings 16/15 times as long in characters (rounded up) as your longest plaintext string
  - strings up to 4 times larger in bytes than your largest plaintext string
-