UTF-16 is supposed to encode the full Unicode character set.
- UTF-16 uses a pair of two-byte sequences (a surrogate pair) for characters outside the BMP
- Two reserved ranges of the Unicode code space are used for this
unit 1 (high surrogate) = 0xD800 - 0xDBFF
unit 2 (low surrogate) = 0xDC00 - 0xDFFF
𝌝
U+1D31D TETRAGRAM FOR JOY
0xD834 0xDF1D
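
The example above can be reproduced by hand. The following is a minimal sketch of the standard surrogate-pair arithmetic: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. The function name `to_surrogate_pair` is made up for illustration and is not part of any library.

```python
def to_surrogate_pair(code_point):
    """Split a code point outside the BMP into two UTF-16 values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000      # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the first value
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the second
    return high, low


high, low = to_surrogate_pair(0x1D31D)
print(hex(high), hex(low))  # 0xd834 0xdf1d
```

Python's built-in codec gives the same result: `"\U0001D31D".encode("utf-16-be")` yields the bytes D8 34 DF 1D.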
- Code point: the number Unicode assigns to a character, e.g. U+1D31D
- Code unit: a two-byte sequence in UTF-16; a code point takes one or two code units
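
The difference between the two terms shows up when measuring string length. The sketch below assumes Python 3, where `len()` on a `str` counts code points. The helper `count_utf16_code_units` is a hypothetical name that counts code units by encoding to UTF-16 and dividing the byte count by two.

```python
def count_utf16_code_units(text):
    """Number of two-byte sequences needed to store text in UTF-16."""
    # "utf-16-le" is used so that no byte order mark is prepended.
    return len(text.encode("utf-16-le")) // 2


tetragram = "\U0001D31D"                   # the character from the example
print(len(tetragram))                      # 1 (code points)
print(count_utf16_code_units(tetragram))   # 2 (code units)
print(count_utf16_code_units("A"))         # 1 (inside the BMP)
```

Languages whose strings are indexed by UTF-16 code unit, such as JavaScript and Java, report a length of 2 for this single character.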