Poor performance with OpenGL Renderer on Apple M1 machines

With the OpenGL Renderer in particular or just generally?

The OpenGL renderer in an app that I’m working on has become extremely slow since JUCE 6.1.0. The demo runner runs fine.

Got it. Will test with JUCE 6.0.8 to see if it makes any difference for us.

Okay, I can confirm that there has been a significant worsening of OpenGL performance between JUCE 6.0.8 (545e9f353) and 6.1.2 (d49d20397). Thanks for the tip @jellebakker!

Across 4 different M1 machines, on average, the framerates went down by 54% while CPU usage went up 58% for two different tests in the DemoRunner. Full results here.

For JUCE 6.1.2, none of the M1 machines could achieve 60 fps for the GraphicsDemo using the OpenGL Renderer. Even before JUCE 6.1.2, the framerates on M1 were not great, but now they are essentially unworkable for anything close to “smooth” animations, all while basically saturating the CPU & GPU.

Looks like one of the offending commits is:

See our comment here: Thread: Avoid setting realtime priority on Thread instances by defau… · juce-framework/JUCE@802f33b · GitHub

Setting the priority of the OpenGL ThreadPool to something high seems to improve things.

Thanks for reporting. I’ve added a commit bumping the priority of the OpenGL thread here, which seems to improve performance on my machine:

Hello.
Even with the last commit, I’m having trouble with our plugin dropping frames on an M1 Max running macOS 12.0.1. Using the Instruments profiler, I can see that under Rosetta (and on my last Intel Mac) the OpenGL render thread executes with high priority and at 60 fps. Running natively on M1, the render thread appears to spend time on both efficiency and performance cores. While on an E core, it’s often preempted by higher-priority tasks such as hardware events, so the render thread may not complete in time. I tried setting the render thread to priority 10 (instead of 9 as in the above commit), but this didn’t help much. In Rosetta mode, it’s always running on a performance core and isn’t interrupted by anything significant.

Apple recommends assigning QoS classes to threads instead of setting priorities (Apple Developer Documentation). I attempted to do this on creation of the render thread but it didn’t appear to be any different than running it with a high priority.
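
For reference, the QoS approach described above can be sketched roughly as follows. This is a minimal sketch and not the code from the post: `requestUserInteractiveQoS` and `launchRenderThread` are hypothetical helper names, and the choice of `QOS_CLASS_USER_INTERACTIVE` is an assumption, since the post doesn't say which class was tried. `pthread_set_qos_class_self_np` is the macOS call that sets the QoS class of the calling thread; on other platforms the helper does nothing.

```cpp
#include <thread>
#include <utility>

#if defined(__APPLE__)
 #include <pthread.h>
 #include <pthread/qos.h>
#endif

// Hypothetical helper: asks the OS to treat the *calling* thread as
// user-interactive work. Returns true only if the request was accepted.
inline bool requestUserInteractiveQoS() noexcept
{
#if defined(__APPLE__)
    // Assumption: USER_INTERACTIVE is the class of interest for a render thread.
    return pthread_set_qos_class_self_np (QOS_CLASS_USER_INTERACTIVE, 0) == 0;
#else
    return false; // QoS classes are a macOS/iOS concept
#endif
}

// Hypothetical helper: starts a thread that requests the QoS class first
// (best effort) and then runs the supplied render loop.
template <typename Fn>
std::thread launchRenderThread (Fn body)
{
    return std::thread ([body = std::move (body)]() mutable
    {
        requestUserInteractiveQoS();
        body();
    });
}
```

The request has to be made from the thread itself, which is why it sits at the top of the thread body. As noted above, in that test it didn't behave any differently from a high thread priority.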

We’re seeing the same type of performance issues even with the priority improvement.

I also wondered about QoS — our plugins always seem to mostly run on the E cores which seems very strange to me. Could it be because JUCE doesn’t directly use a CVDisplayLink?

I’m certainly no expert but it seems like JUCE is using CVDisplayLink sensibly: The CVDisplayLink callback signals the render thread for work to be done. For what it’s worth, calling renderFrame() directly from the CVDisplayLink callback (as it used to be done) doesn’t make any difference in terms of spending time on both E and P cores.

Still testing here, but it seems like something changed with 6.1.4.

M1 2D performance is now pretty good, but non-OpenGL rendering on an Intel MBP seems to have slowed way down.

Anyone else seeing this?

Just a little update: there were two more changes after the November fix (above) to address potential performance issues.

15 Dec 2021
OpenGL: Avoid querying the native view hierarchy from a background…
15 Dec 2021
Thread: Update macOS thread priority calculation

The current state is that the teapot demo (latest tip) still doesn’t run smoothly at all on Apple M1 Max (MacBook Pro 14 M1 Max).

Maybe the patch from here should be built into the demo in general, so it is easier to produce comparable measurements.
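
For anyone who wants comparable numbers without that patch, a frame-rate counter along these lines could be dropped into a render callback. This is a hedged sketch and not the patch referred to above: `FrameRateMeter` is a hypothetical class that uses only the standard library and averages over a sliding window (one second by default, which is an arbitrary choice).

```cpp
#include <chrono>
#include <cstddef>
#include <deque>

// Hypothetical helper: call frameRendered() once per rendered frame and read
// framesPerSecond() whenever a value is needed for display or logging.
class FrameRateMeter
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameRateMeter (std::chrono::milliseconds windowToUse = std::chrono::seconds (1))
        : windowLength (windowToUse)
    {
    }

    // Records one frame and discards timestamps older than the window.
    void frameRendered (Clock::time_point now = Clock::now())
    {
        timestamps.push_back (now);

        while (! timestamps.empty() && now - timestamps.front() > windowLength)
            timestamps.pop_front();
    }

    // Average frame rate over the frames currently inside the window.
    double framesPerSecond() const
    {
        if (timestamps.size() < 2)
            return 0.0;

        const std::chrono::duration<double> span = timestamps.back() - timestamps.front();
        const auto intervals = static_cast<double> (timestamps.size() - 1);

        return span.count() > 0.0 ? intervals / span.count() : 0.0;
    }

private:
    std::chrono::milliseconds windowLength;
    std::deque<Clock::time_point> timestamps;
};
```

Passing the timestamp in explicitly keeps the class testable; in a real render loop the default argument would be used.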

Experiencing this as well

@reuk why was the OpenGL priority change (which helped a good deal) reverted for the renderThread here?

Just a quick note - I am wondering if these thread priority related changes had anything to do with the audio performance drop I saw on Mac M1 (but not in Intel Mac, which seems weird) - as mentioned in:

I won’t speculate any further. I am going to test different versions until I can figure out which commit is causing my issue. But just thought I’d mention it here in case anybody has an opinion on that.

EDIT: I tracked my M1 performance issue down to a fairly old commit right after JUCE 6.0.7 - related to something specific about Mac OS drawing - see original thread for the details.

If memory serves, the OpenGL performance got worse in this commit:

Given that the commit you linked essentially reverts the above change on macOS, it seemed reasonable to revert the OpenGL “workaround” too. I’m sure I tested and verified this, but I just had another go and I’m not seeing massive differences in performance before and after either of those changes…

Are you seeing an actual performance regression on develop currently, or are you asking out of interest about the change? If you are seeing a performance regression, is there a JUCE demo that demonstrates the issue reliably?

Got it. Looks like I get the same performance on M1 with these:

So that change seems ok!

I’ve got an entry-level M1 coming in the mail next week, so I can help test this stuff. @goodhertz are you saying that if you revert those changes, performance goes back to what you’d expect?

Hard to summarize succinctly (probably just easier to look at / replicate our test numbers on Google Drive), but it looks like OpenGL performance has more or less been restored on latest JUCE 6.1.5.

It’s still not “good” on M1 by any stretch of the imagination, but that seems to have more to do with M1 machines being a very poor fit for JUCE’s 2D OpenGL rendering.

I’ve been testing this a lot in the past few days, and it seems that while OpenGL is still the more performant option on Intel, simply switching OpenGL off may be more performant on M1. The interesting thing is that the Time Profiler shows more “CPU” being used by the graphics, yet they appear smoother.

I haven’t used JUCE without OpenGL rendering for so long that analyzing the stack traces is a totally different world, but I’ll keep reporting back with findings.

I’ve tried various things like rendering animations in background threads and then displaying those, but then you hit the image cache bottlenecks. It seems there’s no easy win here if you were pushing animations to their limits on Intel!

So it’s very weird: higher CPU usage but smoother graphics with it off :sweat_smile:

Alright, my final note here is that JUCE_COREGRAPHICS_DRAW_ASYNC=1 is blowing GL out of the water. I’m not sure when this flag got introduced, but bye bye OpenGL.
