UTF-8 characters in Audio Unit plugin descriptions?


I have a problem loading an (open-source) MDA AU plugin.
The call to FindNextComponent(comp, &desc) in juce_AudioUnitPluginFormat.mm:1511 fills desc with:

OSType componentType "aufx"
OSType componentSubType "mda\337"
OSType componentManufacturer "mdaX"

The strange escaped character in componentSubType (octal \337, i.e. byte 0xDF) then raises an assertion when JUCE tries to convert this OSType into a juce::String (String::String (const char* const t, const size_t maxChars)).

I suppose it’s a mistake on the plugin side, and that I should just ignore this plugin through the pedalFile mechanism. But in Debug mode I will hit this assertion every time, and my program will crash every time, unless I remove the faulty plugin or manually mark it as invalid. Is there a way to make this process more robust on the JUCE side? For example, by always treating the pointer as a UTF-8 pointer when doing the OSType->String conversion, so that invalid characters are simply ignored?
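To make the "just ignore the wrong characters" idea concrete, here is a minimal standalone sketch. It is not JUCE code: the function name fourCharCodeToPrintableAscii is hypothetical, and std::uint32_t stands in for OSType so that it compiles without the macOS headers.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of JUCE): converts a four-character code to a
// std::string, keeping only printable ASCII bytes and dropping everything else.
// std::uint32_t is used here as a stand-in for the macOS OSType.
inline std::string fourCharCodeToPrintableAscii (std::uint32_t type)
{
    std::string result;

    // Walk the four bytes from most significant to least significant.
    for (int shift = 24; shift >= 0; shift -= 8)
    {
        const auto byte = static_cast<unsigned char> ((type >> shift) & 0xff);

        // Keep the printable ASCII range (space to tilde); skip anything else.
        if (byte >= 0x20 && byte <= 0x7e)
            result += static_cast<char> (byte);
    }

    return result;
}
```

With the subtype above ("mda" followed by byte 0xDF) this would give "mda". The obvious downside is that two different subtypes could collapse to the same string, so it is only suitable for display, not for identifying the plugin.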

Where is the assertion? It uses wide chars:

    String osTypeToString (OSType type)
    {
        const juce_wchar s[4] = { (juce_wchar) ((type >> 24) & 0xff),
                                  (juce_wchar) ((type >> 16) & 0xff),
                                  (juce_wchar) ((type >> 8) & 0xff),
                                  (juce_wchar) (type & 0xff) };
        return String (s, 4);
    }

…so it’ll work fine for any character values (?)
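For anyone who wants to try the unpacking outside of JUCE, the same idea can be sketched with only the standard library. The names FourCharCode and unpackFourCharCode are made up for this example; char32_t plays the role of juce_wchar, and std::uint32_t is assumed as a stand-in for OSType.

```cpp
#include <array>
#include <cstdint>

// Assumption: a plain 32-bit integer standing in for the macOS OSType.
using FourCharCode = std::uint32_t;

// Hypothetical helper: unpacks the four bytes, most significant first, and
// widens each one to a 32-bit code point. Because each byte is masked to
// 0..255 before widening, values above 127 never go through an 8-bit
// char-to-string conversion.
inline std::array<char32_t, 4> unpackFourCharCode (FourCharCode type)
{
    return { static_cast<char32_t> ((type >> 24) & 0xff),
             static_cast<char32_t> ((type >> 16) & 0xff),
             static_cast<char32_t> ((type >> 8) & 0xff),
             static_cast<char32_t> (type & 0xff) };
}
```

Note that widening a byte like 0xDF this way effectively reads it as Latin-1 (U+00DF). Whether that is the character the plugin author intended depends on the encoding they used, which the OSType itself does not tell you.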

It’s here:

    String::String (const char* const t, const size_t maxChars)
        : text (StringHolder::createFromCharPointer (CharPointer_ASCII (t), maxChars))
    {
        /*  If you get an assertion here, then you're trying to create a string from 8-bit data
            that contains values greater than 127. These can NOT be correctly converted to unicode
            because there's no way for the String class to know what encoding was used to
            create them. The source data could be UTF-8, ASCII or one of many local code-pages.

            To get around this problem, you must be more explicit when you pass an ambiguous 8-bit
            string to the String class - so for example if your source data is actually UTF-8,
            you'd call String (CharPointer_UTF8 ("my utf8 string..")), and it would be able to
            correctly convert the multi-byte characters to unicode. It's *highly* recommended that
            you use UTF-8 with escape characters in your source code to represent extended characters,
            because there's no other way to represent these strings in a way that isn't dependent on
            the compiler, source code editor and platform.
        */
        jassert (t == nullptr || CharPointer_ASCII::isValidString (t, (int) maxChars));
    }
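As a rough illustration of what that jassert guards against, here is a standalone check for "no byte above 127". It is only a sketch of the condition described in the comment, not the actual implementation of CharPointer_ASCII::isValidString, and the name looksLikeAscii is invented for this example.

```cpp
#include <cstddef>

// Hypothetical helper: returns true if the first maxChars bytes of text (or
// everything up to the null terminator, whichever comes first) are all 7-bit
// values. A null pointer is treated as valid, mirroring the jassert condition.
inline bool looksLikeAscii (const char* text, std::size_t maxChars)
{
    if (text == nullptr)
        return true;

    for (std::size_t i = 0; i < maxChars && text[i] != '\0'; ++i)
    {
        // Go through unsigned char so the test also works where char is signed.
        if (static_cast<unsigned char> (text[i]) > 127)
            return false;
    }

    return true;
}
```

A buffer holding "mda" followed by byte 0xDF fails this check, which is why the char-based conversion trips the assertion for this plugin's subtype while "aufx" and "mdaX" pass.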

But my version uses chars there, not wchars, so I’ll just update to the latest tip. We had already started doing that anyway, but now I have an even better reason :smiley:
You’re too fast for me, Julian :wink: