Character encoding beyond the Basic Multilingual Plane


#1

Hello All!

I’m wondering whether the JUCE String class can handle characters outside the Basic Multilingual Plane, in other words, characters that require a surrogate pair (two 16-bit code units) in UTF-16. For example, I imported some UTF-8 data containing Supplementary Multilingual Plane (SMP) characters into a JUCE String, but JUCE printed two blank default characters when I drew the text to screen. Is this an OS/API-dependent issue, or does JUCE treat every Unicode character as a fixed-width 16-bit word (the deprecated UCS-2 way of doing things)?
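To make the surrogate-pair question concrete, here is a minimal sketch in plain C++ (no JUCE involved) of how a code point above U+FFFF maps to two 16-bit UTF-16 code units and back. The names `SurrogatePair`, `toSurrogatePair` and `fromSurrogatePair` are made up for illustration and are not part of any library. Two blank characters on screen would be consistent with each half of such a pair being drawn as a separate character, though that is only a guess.

```cpp
#include <cstdint>

// Hypothetical helper type: the two UTF-16 code units of one supplementary character.
struct SurrogatePair
{
    std::uint16_t high; // lead surrogate, in the range 0xD800..0xDBFF
    std::uint16_t low;  // trail surrogate, in the range 0xDC00..0xDFFF
};

// Splits a code point into its surrogate pair.
// Precondition (not checked here): 0x10000 <= codePoint <= 0x10FFFF.
constexpr SurrogatePair toSurrogatePair (std::uint32_t codePoint)
{
    // Subtracting 0x10000 leaves a 20-bit value: top 10 bits go to the
    // high surrogate, bottom 10 bits go to the low surrogate.
    return { static_cast<std::uint16_t> (0xD800 + ((codePoint - 0x10000) >> 10)),
             static_cast<std::uint16_t> (0xDC00 + ((codePoint - 0x10000) & 0x3FF)) };
}

// Recombines a valid high/low surrogate pair into the original code point.
constexpr std::uint32_t fromSurrogatePair (std::uint16_t high, std::uint16_t low)
{
    return 0x10000u
         + ((static_cast<std::uint32_t> (high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t> (low) - 0xDC00u);
}
```

For instance, U+1D11E (MUSICAL SYMBOL G CLEF) splits into the code units 0xD834 and 0xDD1E.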


#2

You’ll have to select a font that contains the glyph for that character in order to display it.
Windows does this automatically: when the current font lacks a glyph, it selects a font of the same family that has it. If no font installed on the system contains the glyph, a [] (SQUARE) is displayed instead.

In JUCE, you have to select the special font for the glyph yourself, and set the default font to a common font (such as Tahoma or Verdana), so that when the special font is missing the basic ASCII glyphs, the default font is used automatically.
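The fallback idea described above can be sketched in plain C++ as follows. This is only an illustration of the selection logic, not JUCE or Windows code: `FontCoverage` and `pickFontFor` are hypothetical names, and the coverage sets in the usage note are made-up sample data rather than the real contents of any font.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a font: a name plus the code points it has glyphs for.
struct FontCoverage
{
    std::string name;
    std::set<std::uint32_t> codePoints;
};

// Returns the name of the first font, in priority order, that has a glyph for
// codePoint. Returns an empty string when no font covers it, which is the
// case where a placeholder square would be drawn.
std::string pickFontFor (std::uint32_t codePoint,
                         const std::vector<FontCoverage>& fontsInPriorityOrder)
{
    for (const auto& font : fontsInPriorityOrder)
        if (font.codePoints.count (codePoint) != 0)
            return font.name;

    return {};
}
```

With a list such as `{ {"Special", {0x1D11E}}, {"Tahoma", {0x41, 0x42}} }`, the call `pickFontFor (0x1D11E, fonts)` picks "Special", `pickFontFor (0x41, fonts)` falls through to "Tahoma", and a code point covered by neither yields an empty string.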