Mac M1 thread priority & Audio Workgroups

I just got a new MacBook Pro M1 and noticed that in a JUCE app we’re developing, CPU usage on the audio thread is way off when using the AudioIODeviceCombiner (i.e. when using the built-in speaker/mic). It is not only much higher than on a comparable Intel Mac, but also very inconsistent, jumping up and down. There also seems to be some correlation between the GUI/message thread and the audio thread: when the window is resized continuously, the audio thread CPU usage drops and becomes stable (probably because that pushes the core or workgroup into P mode).

After looking over the code base, profiling, searching, etc., I stumbled on this guide from Apple on using the newish (macOS 11) Audio Workgroups API.

I don’t see any mentions of this API in the JUCE CoreAudio code or the forums. Am I missing something obvious?

I tried hacking this into AudioIODeviceCombiner::run() myself and it makes a huge difference - the audio CPU meter becomes stable again and satisfyingly low.

Anyone else noticed this? It would be great if JUCE could look at this as I don’t want to have to use my hacky patch.

Rail

Perhaps the title of my post was misleading, because I’m not so much talking about thread priority as about the behavior of the audio thread, which is always set to a priority of 9.

I read that thread before writing this post and it doesn’t mention anything about Audio Workgroups or issues with audio thread performance.

This post…
Joining threads to CoreAudio HAL A… | Apple Developer Forums

and this guide…
adding_parallel_real-time_threads_to_audio_workgroups

… show the relevant code for adding the device combiner thread to the audio device workgroup.
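For readers who don't want to dig through the links, here is a minimal sketch of what joining a device's IO workgroup can look like. It is not the patch discussed in this thread and not JUCE code: `DeviceWorkgroupMembership`, `joinDeviceWorkgroup` and `leaveDeviceWorkgroup` are hypothetical names, the caller is assumed to already know the device's `AudioObjectID`, and the code assumes clang with plain C++ or non-ARC Objective-C++. On non-Apple platforms it compiles to a no-op.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__APPLE__)
 #include <CoreAudio/CoreAudio.h>
 #include <os/object.h>
 #include <os/workgroup.h>
#endif

// Hypothetical helper (not JUCE API): remembers what is needed to leave again.
struct DeviceWorkgroupMembership
{
    bool joined = false;
#if defined(__APPLE__)
    os_workgroup_t workgroup = nullptr;
    os_workgroup_join_token_s token {};
#endif
};

// Call this ON the thread that should join, e.g. at the top of its run loop.
// deviceID is the CoreAudio AudioObjectID of the device whose IO thread
// workgroup we want (assumption: the caller already knows it).
inline DeviceWorkgroupMembership joinDeviceWorkgroup (std::uint32_t deviceID)
{
    DeviceWorkgroupMembership m;

#if defined(__APPLE__)
    if (__builtin_available (macOS 11.0, *))
    {
        const AudioObjectPropertyAddress address { kAudioDevicePropertyIOThreadOSWorkgroup,
                                                   kAudioObjectPropertyScopeGlobal,
                                                   0 /* main element */ };
        os_workgroup_t workgroup = nullptr;
        UInt32 size = sizeof (workgroup);

        const OSStatus status = AudioObjectGetPropertyData (static_cast<AudioObjectID> (deviceID),
                                                            &address, 0, nullptr, &size, &workgroup);

        if (status == noErr && workgroup != nullptr)
        {
            if (os_workgroup_join (workgroup, &m.token) == 0)
            {
                m.workgroup = workgroup;   // keep our reference until we leave
                m.joined = true;
            }
            else
            {
                os_release (workgroup);    // the property getter handed us a retained object
            }
        }
    }
#else
    (void) deviceID;   // no workgroups on this platform: report "not joined"
#endif

    return m;
}

// Call on the same thread before it exits.
inline void leaveDeviceWorkgroup (DeviceWorkgroupMembership& m)
{
#if defined(__APPLE__)
    if (m.joined)
    {
        assert (m.workgroup != nullptr);

        if (__builtin_available (macOS 11.0, *))
        {
            os_workgroup_leave (m.workgroup, &m.token);
            os_release (m.workgroup);
        }
    }
#endif
    m = DeviceWorkgroupMembership {};
}
```

A failed join is deliberately non-fatal here: the thread simply keeps running outside the workgroup, which matches the behavior before any patching.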

We’re currently updating Thread to make use of modern performance classifications. These changes may well fix this issue you’re seeing, but I will check it out.

Thank you!

It’s simple to recreate: take the AudioSettingsDemo and add some heavy processing load, like in the AudioPerformanceTest, with a relatively small buffer size, then check for xruns and fluctuating CPU usage when using AudioIODeviceCombiner vs. using a single device for IO.

But the problem seems most apparent on the faster M1 MacBook Pros with 10 cores. It seems to be much less of a problem (if one at all) on the M1 Airs for some reason.

The M1 performance issue is a known one: the higher-core-count M1 variants have fewer efficiency cores. Which version of JUCE are you currently targeting?

When I wrote this originally I was building against JUCE 7.0.0 with Xcode 13.4.1. But I just tested JUCE 7.0.1 and there’s no improvement.

When you say M1 performance is a known issue, do you mean in general? Or as it relates to JUCE code?

Just to be clear, the issue goes away if you use the Audio Workgroup API as described above, so it seems JUCE specific to me. Also, it is not a general audio performance issue as it runs fine when you use the same device for input and output.

The pthread API used on macOS doesn’t work very well on the M1 platforms: a pthread priority below four will lock the thread to the E cores, which, ironically, causes a significant performance disparity on the higher-performance variants because they have 2 E cores rather than 4.

I asked about the JUCE version because a fix was applied recently that prevented low-priority threads. I tested our new code with AudioIODeviceCombiner and it was stable on my M1, but I have the M1 with 4 E cores.

We’re currently transitioning our native threading code to use modern threading priority classifications and APIs (QoS on Apple platforms).
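As an illustration of what a QoS-based request looks like on Apple platforms (this is not JUCE's implementation), here is a minimal sketch using `pthread_set_qos_class_self_np`. The `ThreadRole` enum, the function name, and the mapping from roles to QoS classes are assumptions made up for this example. QoS is only a hint to the scheduler, and a real-time audio callback thread is normally scheduled by other means.

```cpp
#include <cassert>

#if defined(__APPLE__)
 #include <pthread.h>
 #include <pthread/qos.h>
#endif

// Hypothetical classification, loosely in the spirit of "performance classes".
enum class ThreadRole { audioHelper, userVisibleWork, backgroundWork };

// Requests a QoS class for the CALLING thread.
// relativePriority must be in the range [-15, 0] (0 = highest within the class).
// Returns true if the request was accepted, or if the platform has no QoS
// classes and there is therefore nothing to do.
inline bool requestQoSForCurrentThread (ThreadRole role, int relativePriority = 0)
{
    assert (relativePriority <= 0 && relativePriority >= -15);

#if defined(__APPLE__)
    qos_class_t qos = QOS_CLASS_DEFAULT;

    switch (role)
    {
        case ThreadRole::audioHelper:     qos = QOS_CLASS_USER_INTERACTIVE; break;
        case ThreadRole::userVisibleWork: qos = QOS_CLASS_USER_INITIATED;   break;
        case ThreadRole::backgroundWork:  qos = QOS_CLASS_UTILITY;          break;
    }

    // Returns 0 on success, an errno-style code otherwise.
    return pthread_set_qos_class_self_np (qos, relativePriority) == 0;
#else
    (void) role;
    (void) relativePriority;
    return true;
#endif
}
```

The call has to be made from the thread itself, so in practice it would sit at the top of the thread's entry function.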

The Audio Workgroup API is something we’re still looking into, including where (and if) it fits into our Thread API.

Sorry for the dumb question, but does JUCE Thread use the pthread API? That is my assumption, but I never looked under the hood to see what is going on (my bad).

I also just assumed the JUCE Thread class does what it says, but it seems that is no longer the case on all platforms?

Any news on this new Thread API?

Thanks!

It’s going through the final stages of review and will hopefully be available sometime next week!

I see there are updates to the juce::Thread class in the develop branch for the thread priority API, but I still see nothing there related to joining audio workgroups. I believe that was the original point of this discussion thread’s creator: to improve audio performance by making threads join audio workgroups (on Mac systems).

In this related forum thread, rmuller provided some example code for integrating the audio render context observer callback into juce_AU_Wrapper.mm:

I still don’t fully understand how to integrate that callback with my juce::Thread threads, but in theory it should allow those threads to join the same workgroup as the main audio callback for the AU plugin, and a similar process should be possible for standalone builds.
I am following up with rmuller for some tips, but would rather not have to hack those changes into the JUCE API code myself.

Ideally there will be an option in the juce::Thread class to request that newly created threads are automatically joined to the same audio workgroup as the main audio callback in AudioProcessor - maybe a generic cross-platform option, with the platform-specific code handled in the underlying thread/plugin wrapper code.
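To make the idea of a generic option with platform-specific internals a bit more concrete, here is a minimal sketch of an RAII guard that a worker thread could create at the top of its run function. Nothing here is JUCE API: `ScopedAudioWorkgroupJoin` and `NativeAudioWorkgroup` are hypothetical names, and how the workgroup handle reaches the worker thread (from the device, or from the host via the plugin wrapper) is left open. It assumes clang and plain C++ or non-ARC Objective-C++, and it is a no-op on platforms without `os_workgroup`.

```cpp
#include <cassert>
#include <thread>

#if defined(__APPLE__)
 #include <os/object.h>
 #include <os/workgroup.h>
 using NativeAudioWorkgroup = os_workgroup_t;
#else
 using NativeAudioWorkgroup = void*;   // placeholder: nothing to join elsewhere
#endif

// Hypothetical guard: joins the calling thread to a workgroup on construction
// and leaves it on destruction. Both must happen on the same thread.
class ScopedAudioWorkgroupJoin
{
public:
    explicit ScopedAudioWorkgroupJoin (NativeAudioWorkgroup workgroupToJoin)
    {
#if defined(__APPLE__)
        if (workgroupToJoin == nullptr)
            return;

        if (__builtin_available (macOS 11.0, iOS 14.0, *))
        {
            if (os_workgroup_join (workgroupToJoin, &token) == 0)
            {
                os_retain (workgroupToJoin);   // keep it alive until we leave
                workgroup = workgroupToJoin;
            }
        }
#else
        (void) workgroupToJoin;
#endif
    }

    ~ScopedAudioWorkgroupJoin()
    {
        // Leaving from a different thread than the one that joined is a bug.
        assert (std::this_thread::get_id() == owner);

#if defined(__APPLE__)
        if (workgroup != nullptr)
        {
            if (__builtin_available (macOS 11.0, iOS 14.0, *))
            {
                os_workgroup_leave (workgroup, &token);
                os_release (workgroup);
            }
        }
#endif
    }

    ScopedAudioWorkgroupJoin (const ScopedAudioWorkgroupJoin&) = delete;
    ScopedAudioWorkgroupJoin& operator= (const ScopedAudioWorkgroupJoin&) = delete;

    bool isJoined() const noexcept   { return workgroup != nullptr; }

private:
    std::thread::id owner = std::this_thread::get_id();
    NativeAudioWorkgroup workgroup = nullptr;
#if defined(__APPLE__)
    os_workgroup_join_token_s token {};
#endif
};
```

Keeping the join in a scope guard means the thread leaves the workgroup on every exit path, and calling code stays identical on platforms where the guard does nothing.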

Are there any updates on if/when that may happen? (oli1 - you mentioned you’d revisit it after ‘unicode support’ work).

(I would expect something very similar to be required for audio threads on Windows at some point, as new Intel processors have a similar E/P core split.)

Thanks!

If anyone wants to try to patch this into JUCE themselves, I posted my hack here: MacOS Audio Thread Workgroups - #4 by tlacael

I’m a long-time user of Unify, a JUCE-based combination host/plug-in from PlugInGuru. It runs poorly on my Macbook M1 Max, with a lot of CPU spiking. I have already complained to the developers, who have told me this only happens on the Pro/Max versions of the M1 chip, and requested that I report this directly to the JUCE development team. I see that this phenomenon, which affects other multi-threaded JUCE apps/plug-ins, has been discussed extensively on the JUCE Forum for a year, but there is still no official solution. I urge you to address this as soon as possible.

Hi @myuusic

I have one customer who has reported similar CPU spiking issues on a Mac Studio M1 Max system with a multi-threaded plugin of mine, also in Rosetta mode. This was in Logic with just one virtual instrument and this plugin. Less powerful Macs like an older M1 Mini can run 10-20 instances of this plugin, so he should be able to run many more simultaneously.

It was built with the JUCE 7 dev branch from around March or April, if I remember correctly.

Does the CPU spiking also occur when running the Unify plugin in Rosetta mode on your system? On his system it did.

Hi Peter,

Thank you for your fast response! I use this plugin inside Cubase 12 natively. I just switched to Rosetta to test it and yes, the CPU spiking happens in Rosetta mode as well.

I want to ENCOURAGE the dev community and JUCE team to resolve this ASAP

Hi folks. First of all, I’m a fan of much of what is being accomplished with JUCE; I own and use many of these plugins.

Central to what I do is PluginGuru’s Unify since about 18 months ago. As Apple has dead-ended all of my Intel machines, excepting one maxed out i9, which is still under AppleCare which is probably why they can’t obsolete it yet, I’m now embarking on the transition to Apple Silicon for all of my music-making gear (live and studio rigs).

Unify is severely under-performing on AS, to be specific on a Mac Studio. Audio is cracking and popping under as good as zero load. This would appear to be a problem of Unify also utilizing efficiency cores rather than only performance cores, with no means by which we, as users, can instruct Unify to stay away from the e-cores.

Realtime audio, especially performance-oriented software needs to be up and in the p-cores only. Even Apple understands this as MainStage won’t even offer e-core selection, and Logic allows users to select e-cores if needed, but there are p-core only options. Again, Unify cannot do this as what I’m hearing from the developer (who is always responsive!) is that he’s dependent on a long-promised solution via the platform, but none available yet…

I have taken a few days to read through the relevant threads here, and… I’m not just a musician, I have worked for “those guys in Redmond” and have decades of work in product launches, product development, and implementations. I’m retired from that world, but “I get it”. Folks are searching for a solution, and kludging solutions that maybe work for AU, but not VST… so forth and so on…

Hence the encouragement to get this issue solved. Cleanest is obviously to have p-core-only affiliation handled by JUCE. It appears that folks don’t clearly understand how Apple wants this accomplished: is it a priority, a flag, or some high-road QoS-based audio workgroup solution??? Dunno… That last one is a lot of work for something Apple shows to users as p-core preference and affiliation in their own products. I don’t know - I just know they’re doing something, and it appears that high-end SoCs that have a bare minimum of e-cores are causing issues in one or more products that leverage the JUCE platform.

As an end-user - I just need Unify’s developer supported, either directly, through peer-developer support, or… through a move “up” the priority list for getting this somehow into JUCE.

To be clear… not all of my JUCE-based products are manifesting this unusable show-stopper behavior. I don’t know why. It raises a question for me as well, okay, how are developers then “rolling their own” solutions where this is not handled in JUCE??? How resilient are those solutions? Again, dunno. Don’t need to know… am an end-user.

Just need, and hence this request, in the form of encouragement, to see how to implement anything at all that resembles what Apple seems to be doing in their own products, which is presented to users as p-core affinity and not what might behind the scenes actually be a workgroup of processes that are in reality, either set by priority or some other mechanism to be p-core processes.

Dunno, and as an end-user, don’t need to know either - just need what BillG once described to me as the “promise of technology” to manifest itself here so that we can continue to do what folks not all that long ago, could only dream of.

Again, couldn’t be happier in general, but this issue is a real show-stopper.

In closing, am not complaining. I just can’t use this product if it cracks and pops within seconds. Just can’t. Could never recommend to use that layer for live touring acts. That’s just a reality.

What I can do, and that is the point of this post, is to raise awareness of a problem and encourage all to help enable each other to get over this new asymmetric model of e-core and p-core computing.

When “up there” in the p-cores, it seems to behave as expected.

How can I help more than just encouraging y’all to have a look into this (soon, if not yesterday)?

Hi Peter, just curious to know if you used the Audio workgroups thread registration method for your real-time audio threads (for Mac ARM builds)? I did for my plugin and it definitely improved things (but still doesn’t appear to be a way to force threads to p-cores only).

(Awaiting the JUCE team to formally bring this to the JUCE API - hopefully soon, but there is at least a working solution for AU/Standalone if you check my and other related posts: [MacOS Audio Thread Workgroups - #11 by onar3d]. Maybe the latest VST3 updates now allow retrieving the audio workgroup - I haven’t had time to check that.)

No, I am still using the standard JUCE thread.

My calculations have non-critical deadlines, that is, with high priority they are always ready in time, even with 15-25 instances of my plugins.

I am though still following the (non?) progress here on adopting Apple’s new APIs/guidelines/whatever, because I have no time to invest in this myself, as I am focusing on new products, customer functionality and user experience.

Cheers
Peter