Typeface glyphNumber and codePoint

Can somebody shed some light on the connection between the Unicode code point and the int glyphNumber?

I am writing a SMuFL-compliant score renderer, since it is hard to write music theory software if you can’t visualise it properly.

SMuFL defines a code point for every glyph in a JSON file, so the Unicode code point is known.
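For anyone following along, here is a minimal sketch of turning such a metadata entry into a number. It assumes the code points are stored as strings of the form "U+E050", as in the SMuFL glyphnames.json file; the function name parseSmuflCodepoint is made up for illustration and is not part of JUCE or SMuFL.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: parse a string such as "U+E050" into its numeric value.
// Returns false if the text is not "U+" followed by 1 to 6 hex digits,
// or if the value lies beyond the Unicode range (above 0x10FFFF).
bool parseSmuflCodepoint (const std::string& text, char32_t& result)
{
    if (text.size() < 3 || text.size() > 8 || text[0] != 'U' || text[1] != '+')
        return false;

    std::uint32_t value = 0;

    for (std::size_t i = 2; i < text.size(); ++i)
    {
        const char c = text[i];
        std::uint32_t digit = 0;

        if (c >= '0' && c <= '9')       digit = static_cast<std::uint32_t> (c - '0');
        else if (c >= 'A' && c <= 'F')  digit = static_cast<std::uint32_t> (c - 'A' + 10);
        else if (c >= 'a' && c <= 'f')  digit = static_cast<std::uint32_t> (c - 'a' + 10);
        else                            return false;

        value = value * 16 + digit;
    }

    if (value > 0x10FFFF)
        return false;

    result = static_cast<char32_t> (value);
    return true;
}
```

The resulting value is still a code point, not a glyph number, so it only covers the first half of the mapping asked about here.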

I am using the method:

Path p;
typeface->getOutlineForGlyph (0x1e050, p);

However, I realise that an additional mapping is necessary. By outputting tables of glyphs, I could figure out an offset, but it is not consistent, neither within one font nor across fonts (I am comparing Bravura and Petaluma, both publicly available).

I have now found the character map (cmap), but no information on how to access it, nor could I see anywhere in the JUCE code that it is used.

So how can I map from the SMuFL code point to the glyph number the Typeface expects?

Links I looked at:
SMuFL: https://w3c.github.io/smufl/gitbook/

cmap: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html / https://docs.microsoft.com/en-us/typography/opentype/spec/cmap

Forum post: Get glyph from typeface with unicode (UTF8 codepoint)

I don’t really get how that would work… any hints please?

Oof! You’re not having much luck getting any help with these troublesome glyphs :confused:

I hope someone can help out - you’re always so helpful to others.

I guess you would have something equivalent to this:

Font font(typeface);
font.getGlyphPositions(String::charToString(x), glyphNumbers, glyphOffsets);

Thank you @roeland. That is the workflow for characters and how to get the glyph numbers and the kerning of proportional fonts.

The positioning of the glyphs is well documented in the SMuFL definition and in the metadata for anchor points, so Daniel Spreadbury and the team did a great job here.
But the glyphs are identified by a number, the “codepoint”; they are not characters.

As I wrote above, I found blocks that match with an offset, but no consistent way to compute the glyph number from the code point other than looking at the table. I would have to verify every single glyph in every font I want to use, so this is not an option.

I dug a bit further: CustomTypeface has a private method called “findGlyph”, but it fills a lookup table with a size of 128 glyphs. I have a hunch that JUCE can actually only deal with Latin-1 characters. Can that be true? I can’t believe that…

I found the tool otfinfo from the lcdf typetools package, which can be used to look up the codes in an OTF file:

otfinfo -u Bravura/otf/Bravura.otf

So I know that the cmap has exactly the information needed and that it is specific to the OTF file.

That makes me think that the Typeface class should have a method to do this lookup, so I created a feature request for that:

Are you sure those aren’t Unicode code points? I thought your original post refers to code point U+1E050 (which is a code point in the private use area).

Yes, they are Unicode code points in the PUA, but I am missing the mapping from code point to glyph number, which is not the same thing. JUCE lets me get the glyph numbers for a text (and therefore for single characters), but not for code points, or am I missing something?

Text is a sequence of code points. String::charToString(0x1E050) should return a string consisting of one code point, U+1E050. That can be passed into getGlyphPositions.

What do you actually get if you try this out?

It returns glyph number 4, where the correct answer would be 72.
So it printed a 3/4 sign instead of the G clef.

I had hoped you were right…

Isn’t U+E050 G CLEF (from the specification)?

Wild guess - try using code points below 0xFFFF. Code points above 0xFFFF must be converted to UTF-16 surrogate pairs. SMuFL fonts must be able to handle UCS-2 (UTF-16 without surrogate pairs, i.e. the Unicode BMP) anyway, as per the specification.
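To illustrate the surrogate-pair point, here is a small standalone sketch in plain C++ with no JUCE involved. The function name toUtf16 is made up for this example, and it assumes the input is a valid code point (not itself a surrogate, not above 0x10FFFF).

```cpp
#include <cstdint>
#include <vector>

// Illustrative helper: encode one Unicode code point as UTF-16 code units.
// Assumes a valid code point; no validation is done here.
std::vector<char16_t> toUtf16 (char32_t codePoint)
{
    // Code points in the BMP fit into a single 16-bit unit.
    if (codePoint <= 0xFFFF)
        return { static_cast<char16_t> (codePoint) };

    // Everything above is split into a high and a low surrogate,
    // each carrying 10 bits of the offset from 0x10000.
    const char32_t offset = codePoint - 0x10000;

    return { static_cast<char16_t> (0xD800 + (offset >> 10)),
             static_cast<char16_t> (0xDC00 + (offset & 0x3FF)) };
}
```

With this, U+E050 stays a single unit (0xE050), while U+1E050 becomes the pair 0xD838 0xDC50. The two are different code points, so a lookup with one cannot be expected to find the glyph mapped to the other.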

Hope it helps.

Indeed, that yields the right result! Thank you so much @perob. Some of the code points are not in the PUA, so I can use them directly.

I also realised that I need to specify multiple bytes for the PUA, so I will have to look into encoding the actual bit patterns.

I would have preferred not to use getGlyphPositions, since it does additional work that is not necessary here, i.e. the layout of the individual characters, but it’s probably not too bad, since I will give it single characters anyway.
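For reference, this is roughly what the multi-byte bit patterns look like when a code point is encoded as UTF-8 by hand. It is a plain C++ sketch with a made-up function name (toUtf8), it assumes a valid code point, and it is only meant to show the encoding, not something the JUCE API requires you to do yourself.

```cpp
#include <string>

// Illustrative helper: encode one Unicode code point as UTF-8 bytes.
// Assumes a valid code point (at most 0x10FFFF, not a surrogate).
std::string toUtf8 (char32_t cp)
{
    std::string out;

    if (cp < 0x80)                  // 1 byte:  0xxxxxxx
    {
        out += static_cast<char> (cp);
    }
    else if (cp < 0x800)            // 2 bytes: 110xxxxx 10xxxxxx
    {
        out += static_cast<char> (0xC0 | (cp >> 6));
        out += static_cast<char> (0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)          // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    {
        out += static_cast<char> (0xE0 | (cp >> 12));
        out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char> (0x80 | (cp & 0x3F));
    }
    else                            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    {
        out += static_cast<char> (0xF0 | (cp >> 18));
        out += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char> (0x80 | (cp & 0x3F));
    }

    return out;
}
```

For the G clef at U+E050 this gives the three bytes EE 81 90.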

Thank you so much (and all the others who provided pieces of the puzzle).
I still hope my FR will get addressed, since it would make the API and my code much more concise, but at least it seems I am no longer deadlocked.

That is not expected. A fragment like this should yield one glyph, a smiling face in this case. Whether or not to generate multiple bytes is handled internally by JUCE.

        String s = String::charToString(0x1F603);
        // pick a font which actually has that character: 😃︎
        Font font("Segoe UI Symbol", 20.f, 0);
        Array<int> glyphNumbers;
        Array<float>  glyphOffsets;
        font.getGlyphPositions(s, glyphNumbers, glyphOffsets);
        juce::Path path;
        font.getTypeface()->getOutlineForGlyph(glyphNumbers[0], path);