Juce 8 Performance of getStringWidth

Font::getStringWidth() is deprecated and the suggested replacement is GlyphArrangement::getStringWidthInt().

I ran a simple test of 100,000 calls to each of these methods in JUCE 6 and JUCE 8, using a string of 51 characters.

JUCE 6 Font::getStringWidth() = 2.1 sec
JUCE 8 Font::getStringWidth() = 11.8 sec
JUCE 8 GlyphArrangement::getStringWidthInt() = 81.0 sec

Not super scientific but the difference is massive, and in a real world scenario the performance drop is definitely noticeable.

I’m aware that the new font system will be a little slower due to the new tricks it can pull, but is this what you would expect to see?

Do you have min/max/avg/SD data for that?

It would be useful to know if it’s the odd cache miss or something similar giving a massive hit.
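For anyone wanting to collect that kind of data, here is a minimal sketch in plain C++ that times each call individually and reduces the samples to min/max/mean/SD. The names TimingStats, summarise and measure are made up for this example and are not part of JUCE.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary statistics for a set of per-call timings, in milliseconds.
struct TimingStats
{
    double min = 0.0, max = 0.0, mean = 0.0, stdDev = 0.0;
};

// Reduces a non-empty list of samples to min, max, mean and population SD.
inline TimingStats summarise (const std::vector<double>& samplesMs)
{
    assert (! samplesMs.empty());

    TimingStats s;
    const auto extremes = std::minmax_element (samplesMs.begin(), samplesMs.end());
    s.min = *extremes.first;
    s.max = *extremes.second;

    double sum = 0.0;
    for (const auto v : samplesMs)
        sum += v;
    s.mean = sum / static_cast<double> (samplesMs.size());

    double squaredDiffs = 0.0;
    for (const auto v : samplesMs)
        squaredDiffs += (v - s.mean) * (v - s.mean);
    s.stdDev = std::sqrt (squaredDiffs / static_cast<double> (samplesMs.size()));

    return s;
}

// Times every call to fn separately, so a single slow outlier stays visible
// instead of being averaged away. numCalls must be at least 1.
template <typename Fn>
TimingStats measure (int numCalls, Fn&& fn)
{
    using Clock = std::chrono::steady_clock;

    std::vector<double> samples;
    samples.reserve (static_cast<std::size_t> (numCalls));

    for (int i = 0; i < numCalls; ++i)
    {
        const auto start = Clock::now();
        fn();
        const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
        samples.push_back (elapsed.count());
    }

    return summarise (samples);
}
```

In a JUCE project the lambda passed to measure() would wrap the call under test, for example GlyphArrangement::getStringWidthInt (font, text). A large gap between the mean and the maximum would point at occasional expensive calls rather than a uniformly slow path.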

I also found that the result is not correct: I had to add a +4 pixel offset for a font of height 14.f (no font face specified).

I think the problem here is the same as in the other thread. GlyphArrangement::getStringWidthInt() boils down to the static function GlyphArrangement::getStringBounds(), which in turn creates a GlyphArrangement and calls addLineOfText(). That has to shape the text, and there is no caching in this case. I’m trying to see if I can improve caching in general, but it’s tricky as there are no easy answers here.

You could probably do this manually for the time being with something like this (untested)…

static int getCachedStringWidth (const Font& font, const String& text)
{
    // Cache key: the width depends on both the font and the text.
    struct ArrangementArgs
    {
        auto tie() const noexcept { return std::tie (font, text); }
        bool operator< (const ArrangementArgs& other) const { return tie() < other.tie(); }

        const Font font;
        const String text;
    };

    constexpr auto maximumCacheSize = 1024;

    // static so the cache persists between calls; this assumes all calls
    // come from a single thread (e.g. the message thread).
    static LruCache<ArrangementArgs, int, maximumCacheSize> cache;

    return cache.get ({ font, text }, [] (const ArrangementArgs& args)
    {
        return GlyphArrangement::getStringWidthInt (args.font, args.text);
    });
}

The trick is getting that maximumCacheSize right but you could probably get away with a pretty large cache to be honest (much larger than I’ve shown here).

The other issue is text that is constantly changing. The cached version makes sense for strings that don’t change often. For something like a parameter value that changes a lot and has many possible values, it might be better to accept the shaping cost for that one, or to keep a separate, smaller cache for each string you want the width of.
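To make the eviction trade-off concrete, here is a small self-contained sketch of a least-recently-used cache in standard C++. SimpleLruCache is a hypothetical stand-in written for this example, not JUCE's LruCache, and the lookup uses std::map so the key only needs operator<.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

// Hypothetical bounded cache for illustration; not JUCE's LruCache.
template <typename Key, typename Value>
class SimpleLruCache
{
public:
    explicit SimpleLruCache (std::size_t maxSizeIn) : maxSize (maxSizeIn)
    {
        assert (maxSize > 0);
    }

    // Returns the cached value for key, calling compute (key) on a miss.
    template <typename Fn>
    const Value& get (const Key& key, Fn&& compute)
    {
        const auto found = lookup.find (key);

        if (found != lookup.end())
        {
            // Hit: move the entry to the front so it is evicted last.
            entries.splice (entries.begin(), entries, found->second);
            return found->second->second;
        }

        // Miss: compute and store at the front.
        entries.emplace_front (key, compute (key));
        lookup[key] = entries.begin();

        // Over capacity: drop the least recently used entry (the back).
        if (entries.size() > maxSize)
        {
            lookup.erase (entries.back().first);
            entries.pop_back();
        }

        return entries.front().second;
    }

    std::size_t size() const { return entries.size(); }

private:
    using Entry = std::pair<Key, Value>;

    std::size_t maxSize;
    std::list<Entry> entries;                                   // most recent first
    std::map<Key, typename std::list<Entry>::iterator> lookup;  // key -> list entry
};
```

With a capacity of 2, looking up "gain", "pan", "gain", "mix" evicts "pan", so a later "pan" lookup pays the compute cost again. That is the same effect a rapidly changing parameter string would have on a shared cache: it keeps pushing out the stable labels.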

Thanks. Equally we could manually stash the string width for any text or labels which we know ahead of time… but these are really just workarounds for what appears to be a much much slower text shaping process.

I’m out of my depth with the text shaping, but it seems like that is where the problem lies and where any real solutions might be found.

Our app is essentially like a DAW UI, so for me this is a really critical issue which I will be motivated to work around, but most people will not reach this point, and will instead just have a slightly sluggish UI.

I think for the vast majority of text (labels for dials etc.) using the font’s getStringWidth() is just fine, and as such the deprecation should be reverted.

But even the old getStringWidth() seemed to be about 5x slower in JUCE 8.

My real concern is the underlying “issue” responsible for the very slow text shaping. We can work around the getting of string widths, but this is no help at all when we need to drawText() or set size on a label.

A version of this getCachedStringWidth () idea is helping… but only with situations like undo/redo where all the strings exist. This new string behaviour is really a massive drag and we’re making all kinds of nasty hacks and workarounds to keep performance passable.

I’d love to know why the old Font::getStringWidth() is about 5x slower on juce 8…

@thecargocult can you try the following patch for me and see how you get on?

JUCE-dev-1b34805-Font - Improve caching.patch (10.2 KB)

Didn’t seem to help much, if at all, on our basic test from top of this thread.
Sorry.

For anyone else arriving here: we’ve refactored all our heaviest text rendering to use pre-computed GlyphArrangements, taking the left pixel value of the first glyph and the right pixel value of the last glyph to get the width, plus a custom getStringWidth() like the one above for any place we need it.
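A rough sketch of the first/last glyph idea, in plain C++ with stand-in types. PlacedGlyph and PrecomputedLine are hypothetical names for this example. In a real JUCE project the positions would come from a stored GlyphArrangement, and the sketch assumes the glyphs are kept in left-to-right order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one positioned glyph from a stored arrangement.
struct PlacedGlyph
{
    float left = 0.0f;   // x of the glyph's left edge, in pixels
    float right = 0.0f;  // x of the glyph's right edge, in pixels
};

// A line that was shaped once and is then reused for measuring.
struct PrecomputedLine
{
    std::vector<PlacedGlyph> glyphs;

    // Width from the first glyph's left edge to the last glyph's right edge.
    // No shaping happens here, only two lookups and a subtraction.
    float getWidth() const
    {
        if (glyphs.empty())
            return 0.0f;

        // Assumption: glyphs are stored in left-to-right order.
        assert (glyphs.back().right >= glyphs.front().left);
        return glyphs.back().right - glyphs.front().left;
    }
};
```

The point of the design is that the expensive step runs once when the text changes, and every later width query or repaint reuses the stored positions.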

Thanks for trying the patch. Could you please share an example that demonstrates the problem (with the patch applied)? That would help me concentrate on whatever you’re finding the problem area to be. These are the results I was getting with the patch applied.

I took 10,000 measurements of calling GlyphArrangement::getStringWidth() on the string “Hello, world!” in a release build:

           7.0.12      develop     This branch
Average    0.127 ms    0.573 ms    0.109 ms
Minimum    0.014 ms    0.147 ms    0.028 ms
Maximum    0.426 ms    6.582 ms    6.840 ms

I then tried using the “Lorem ipsum” text with 1,000 measurements:

           7.0.12      develop     This branch
Average    0.184 ms    3.845 ms    0.964 ms
Minimum    0.074 ms    3.355 ms    0.349 ms
Maximum    0.420 ms    10 ms       6.912 ms

I’m not caching the text, the only caching that is happening is in the font. One thing to point out is that if you’re re-creating the Font (say by storing a FontOptions) each time you call getStringWidth() you might be missing out on some of the optimisations in this patch. To resolve this make sure the Font is created and kept alive for the duration of the program.
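As a toy illustration of why the object's lifetime matters, here is a sketch in plain C++. ToyFont is a made-up class that does some expensive one-off setup on first use. It is not how juce::Font is implemented, and the width calculation is a placeholder rather than real shaping.

```cpp
#include <cassert>
#include <string>

// Toy model only: an object that performs expensive setup once per instance.
// This is not how juce::Font works internally.
class ToyFont
{
public:
    int getWidth (const std::string& text)
    {
        if (! dataLoaded)
        {
            ++numLoads;         // stands in for the expensive one-off work
            dataLoaded = true;
        }

        return static_cast<int> (text.size()) * 7;  // placeholder, not real shaping
    }

    int numLoads = 0;

private:
    bool dataLoaded = false;
};

// Recreates the object for every call, so the setup cost is paid every time.
inline int loadsWhenRecreated (int numCalls)
{
    assert (numCalls >= 0);
    int total = 0;

    for (int i = 0; i < numCalls; ++i)
    {
        ToyFont font;
        font.getWidth ("Hello, world!");
        total += font.numLoads;
    }

    return total;
}

// Keeps one object alive, so only the first call pays the setup cost.
inline int loadsWhenKeptAlive (int numCalls)
{
    assert (numCalls >= 0);
    ToyFont font;

    for (int i = 0; i < numCalls; ++i)
        font.getWidth ("Hello, world!");

    return font.numLoads;
}
```

In this model, 100 calls cost 100 setups when the object is rebuilt each time and 1 setup when it is kept alive. The real numbers in JUCE will differ, but the shape of the problem is the same.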

I’m still exploring more optimisations but the costly bits now are basically the shaping itself, anything where it spends time in HarfBuzz.

Hey! I take it that JUCE 8 uses harfbuzz for text shaping. Without having looked into how JUCE 8 applies it, here is something that helped a lot in another text layouting engine I’ve been working on:

Disabling/Enabling certain OpenType shaping features can have a massive impact on harfbuzz performance. One of the heaviest I found is “CALT” (contextual alternates). This is what produces these nice variants of “…”, “=>” etc in coding-fonts like Fira Code. In many cases, fancy features like this are not required, especially if we’re talking UI. Disabling contextual alternates alone gave me a 5x speedup of hb_shape.

That, plus good splitting into line runs and caching those, might help.
cheers

@jcomusic thanks, I’ll take a look. I guess we would want disabling those features to be optional for users, if at all possible. We do split things into runs, but it wasn’t clear to me that caching those runs was going to help much on a grander scale when we consider all the possible inputs.

It looks like you can already turn these features off in FontOptions using withFeatureDisabled ("calt") for example. Unfortunately I’m not finding it makes any difference in the example I have, but maybe @thecargocult this is something worth trying?

Unfortunately, CALT isn’t helping either.
@0Lek tried some testing on windows and got similar results, relatively speaking:

Performed the test described at the top of the thread, with the font explicitly made once.
GlyphArrangement::getStringWidthInt() is:
~18x slower for short strings compared to 6.1.6 (Font::getStringWidth)
~31x slower for long strings compared to 6.1.6 (Font::getStringWidth)
GlyphArrangement::getStringWidth() has a similar problem, being ~25x slower than 6.1.6 Font::getStringWidth()
Patch didn’t seem to make any difference to the speed of GlyphArrangement::getStringWidthInt().

GlyphArrangement::getStringWidthInt()
---------------------------------------------------------------------------------
JUCE version: JUCE v8.0.8 - Develop
Constant text (len=12): min=0.4745 ms, max=14.9273 ms, avg=0.492183 ms
Constant text (len=85): min=1.5266 ms, max=3.6693 ms, avg=1.5765 ms
Varying text (last len=15): min=0.4983 ms, max=5.2208 ms, avg=0.555916 ms

JUCE version: JUCE v8.0.8 - Patch applied
Constant text (len=12): min=0.519 ms, max=14.6867 ms, avg=0.534059 ms
Constant text (len=85): min=1.8553 ms, max=4.5235 ms, avg=1.90515 ms
Varying text (last len=15): min=0.5384 ms, max=2.5508 ms, avg=0.605749 ms

testTextScroll1 2.zip (24.6 KB)

@thecargocult thanks for reporting this. Sorry it’s been a while but some changes have just made it into the develop branch, the most important being this

A couple of things to note

  • The very first call made in an application is still quite expensive, there is some compressed data that needs to be uncompressed first. This penalty should only be paid once per process

  • It’s important that the Typeface isn’t constantly being recreated as that’s where the caching is happening

  • The maximum times for the strings can still be more expensive than JUCE 7 but the minimum and average times have in many cases improved significantly (even more so in release builds)

Let me know how you get on with it.
