We’re in the process of testing our ARM plugin builds on the latest Apple M1 MBPs and have seen very poor performance for OpenGL rendering.
Testing with the JUCE GraphicsDemo shows an Actual FPS of around 14 when the renderer is set to OpenGL. Activity Monitor shows ~80% GPU utilization. Loading the DemoRunner app in Rosetta 2 shows similarly bad performance.
Tested with the tip of JUCE on master and develop on macOS 12.0.1. Obviously, this could just be an issue with the OS or the new graphics processors in these machines, but other non-JUCE OpenGL apps seem to do better on M1.
Okay, I can confirm that there has been a significant worsening of OpenGL performance between JUCE 6.0.8 (545e9f353) and 6.1.2 (d49d20397). Thanks for the tip @jellebakker!
Across 4 different M1 machines, on average, the framerates went down by 54% while CPU usage went up 58% for two different tests in the DemoRunner. Full results here.
For JUCE 6.1.2, none of the M1 machines could achieve 60 fps for the GraphicsDemo using the OpenGL Renderer. Even before JUCE 6.1.2, the framerates on M1 were not great, but now they are essentially unworkable for anything close to “smooth” animations, all while basically saturating the CPU & GPU.
Even with the last commit, I’m having trouble with our plugin dropping frames on M1 Max and macOS 12.0.1. Profiling with Instruments shows that under Rosetta (and on my last Intel Mac), the OpenGL render thread executes with high priority and at 60 fps. When running natively on M1, the render thread appears to spend time on both efficiency and performance cores. While on an E core, it is often preempted by higher-priority tasks such as hardware events, so the render thread may not finish its frame in time. I tried setting the render thread to priority 10 (instead of 9 as in the above commit) but this didn’t help much. In Rosetta mode, it’s always running on a performance core and isn’t interrupted by anything significant.
Apple recommends assigning QoS classes to threads instead of setting priorities (Apple Developer Documentation). I attempted to do this on creation of the render thread, but the behaviour didn’t appear to differ from running it with a high priority.
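For anyone who wants to repeat the QoS experiment, here is a minimal sketch of one way to do it. It assumes the request is made from inside the render thread itself using `pthread_set_qos_class_self_np`. The helper names `requestUserInteractiveQoS` and `launchRenderThread` are made up for illustration and are not part of JUCE. On non-Apple platforms the helper does nothing and returns false. As far as I understand it, a thread whose scheduling was already changed through `pthread_setschedparam` is opted out of the QoS system, so the call may fail in that case, and a QoS class is only a hint: it does not guarantee the thread stays on a performance core.

```cpp
#include <cassert>
#include <thread>

#if defined (__APPLE__)
 #include <pthread.h>
 #include <pthread/qos.h>
#endif

// Hypothetical helper (not JUCE API): asks the scheduler to treat the calling
// thread as user-interactive work. relativePriority must lie between 0 and -15.
// Returns true if the request was accepted, false if it was rejected or if we
// are not building for an Apple platform.
inline bool requestUserInteractiveQoS (int relativePriority = 0)
{
    assert (relativePriority <= 0 && relativePriority >= -15);

#if defined (__APPLE__)
    return pthread_set_qos_class_self_np (QOS_CLASS_USER_INTERACTIVE,
                                          relativePriority) == 0;
#else
    (void) relativePriority;
    return false;
#endif
}

// Hypothetical usage: the QoS request is the first thing the new thread does,
// before it enters whatever render loop the caller supplies.
template <typename RenderLoop>
std::thread launchRenderThread (RenderLoop renderLoop)
{
    return std::thread ([renderLoop]() mutable
    {
        (void) requestUserInteractiveQoS();
        renderLoop();
    });
}
```

Checking the return value in a debug build is a cheap way to find out whether the request was silently rejected, which would also explain seeing no difference.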
I’m certainly no expert but it seems like JUCE is using CVDisplayLink sensibly: The CVDisplayLink callback signals the render thread for work to be done. For what it’s worth, calling renderFrame() directly from the CVDisplayLink callback (as it used to be done) doesn’t make any difference in terms of spending time on both E and P cores.
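To make the pattern concrete, here is a rough, platform-neutral sketch of "a callback signals the render thread for work to be done". It is not JUCE's actual implementation: the class `SignalledRenderThread` and its methods are hypothetical, and the real CVDisplayLink callback is replaced by whoever calls `signalFrame()`.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the signalling pattern; not JUCE code.
class SignalledRenderThread
{
public:
    explicit SignalledRenderThread (std::function<void()> renderFrameToUse)
        : renderFrame (std::move (renderFrameToUse))
    {
        assert (renderFrame != nullptr);
        worker = std::thread ([this] { run(); });
    }

    ~SignalledRenderThread()
    {
        {
            std::lock_guard<std::mutex> lock (mutex);
            shouldExit = true;
        }

        condition.notify_one();
        worker.join();
    }

    // Would be called from the display-link callback, once per refresh.
    // It only sets a flag and wakes the worker, so it returns quickly.
    void signalFrame()
    {
        {
            std::lock_guard<std::mutex> lock (mutex);
            framePending = true;
        }

        condition.notify_one();
    }

    int getNumFramesRendered() const    { return framesRendered.load(); }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock (mutex);

        for (;;)
        {
            condition.wait (lock, [this] { return framePending || shouldExit; });

            if (shouldExit)
                return;

            framePending = false;

            // Render outside the lock so signalFrame() never waits on rendering.
            lock.unlock();
            renderFrame();
            ++framesRendered;
            lock.lock();
        }
    }

    std::function<void()> renderFrame;
    std::mutex mutex;
    std::condition_variable condition;
    bool framePending = false;
    bool shouldExit = false;
    std::atomic<int> framesRendered { 0 };
    std::thread worker;
};
```

Note that in this sketch signals coalesce: if the worker is preempted for longer than one refresh interval, two calls to `signalFrame()` produce a single rendered frame. Comparing the number of signals against `getNumFramesRendered()` is one simple way to count dropped frames.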
Just a quick note - I am wondering if these thread priority related changes had anything to do with the audio performance drop I saw on Mac M1 (but not in Intel Mac, which seems weird) - as mentioned in:
I won’t speculate any further. I am going to test different versions until I can figure out which commit is causing my issue. But just thought I’d mention it here in case anybody has an opinion on that.
EDIT: I tracked my M1 performance issue down to a fairly old commit right after JUCE 6.0.7, related to something specific about macOS drawing. See the original thread for the details.
If memory serves, the OpenGL performance got worse in this commit:
Given that the commit you linked essentially reverts the above change on macOS, it seemed reasonable to revert the OpenGL “workaround” too. I’m sure I tested and verified this, but I just had another go and I’m not seeing massive differences in performance before and after either of those changes…
Are you seeing an actual performance regression on develop currently, or are you asking out of interest about the change? If you are seeing a performance regression, is there a JUCE demo that demonstrates the issue reliably?