FR: Thread-Priority vs Efficiency/Performance Cores

I found out that for a juce::Thread to run on an Apple M1 performance core, it needs a priority of at least 8.

Also, Intel’s current “Alder Lake” chips introduce the efficiency/performance core design, but I’m not sure how it is implemented.

It would be great if the juce::Thread API somehow reflected this, with something like thread.setPriority(thread.getMinPerformanceCorePriority())

Okay, this commit changes the behavior completely. So I guess before this change all threads (with no defined priority) actually had very low priority on macOS?

Yes, I think that was the case.

Can we get some clear documentation here? We’ve noticed that some users complain about NEXUS suddenly loading a lot slower, needing a lot longer to scan folders, etc. This is all work we put into background threads running with priority 3 (lower than normal, but still higher than only-on-idle).

We’ve used priority 3 because without setting it explicitly, the “background”-tasks interrupted the message thread so frequently that the mouse stuttered. Once we switched to priority 3, everything became butter-smooth and loading times, folder scans, etc. didn’t really take any longer than before.

Once the M1 Max came out, we suddenly got a few complaints about slow preset-scan times (the browser window stays empty until the scan is complete). Even preset loading is affected, because we decode (uncompress) all the individual samples with a thread pool, again at priority 3, and something that normally takes 50 - 100 ms now sometimes takes 4,000 - 5,000 ms!

This doesn’t happen on a standard M1 (we bought one on release day in 2020), but apparently only on the newer M1 Max.

This commit could help, but we don’t understand fully what it does. Can you maybe give us a clear description of what each priority means exactly now on macOS and Windows? Or is it compatible now? Can we set the thread priority for macOS and Windows and it will actually mean the same thing?

Thanks

Guys - we have exactly the same question - we upgraded JUCE and suddenly our app ran like shite, taking 60x longer to do some sound file analysis!

Really unclear what thread priorities we should be using any more.

So let me get this right:

Would be nice if someone could confirm Thread Priority < 8 means it will never run on a performance core.

This suggests every worker thread created with a default priority (5) would have a distinct performance reduction on Apple Silicon.

Whilst there’s no explicit guide of what the default priority meant, I had not counted on these threads having such a low ceiling for performance when the machine is idle.

Thanks

Hey guys - would be great to get some input on here. We have an App with zillions of Thread instances and need to know if we need to patch it so it runs on performance cores again, or whether you think this is a JUCE issue…

@reuk

@jimc and I have looked at this and we’re confident this is a JUCE bug that is capable of causing performance issues on Apple silicon machines.

The code means that all threads with a JUCE priority < 8 have their Apple priority set to 0:

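policy = priority < lowestRealtimePriority ? SCHED_OTHER : SCHED_RR;

param.sched_priority = [&]
{
	// Every non-realtime thread (JUCE priority < 8) gets scheduler priority 0.
	if (policy == SCHED_OTHER)
		return 0;

	// Realtime priorities are remapped onto the scheduler's own range.
	return jmap (priority, lowestRealtimePriority, maxInputPriority, minPriority, maxPriority);
}();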

Due to some macOS QoS stuff, this then means those threads only ever run on efficiency cores (of which there are only 2 on M1 Pro and Max).
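For anyone who wants to poke at the mapping without building JUCE, here is a rough standalone sketch of the same decision. The names `chooseSched`, `SchedChoice` and `linearMap` are made up for illustration (`linearMap` is only a stand-in for `jmap`), and the min/max values are passed in by the caller rather than read from the OS.

```cpp
#include <cassert>

// From the thread: JUCE priorities run 0-10 and 8 is the lowest realtime one.
constexpr int lowestRealtimePriority = 8;
constexpr int maxInputPriority = 10;

enum class Policy { other, roundRobin };

struct SchedChoice
{
    Policy policy;
    int schedPriority;
};

// Illustrative stand-in for jmap: linear remap from [srcMin, srcMax] to [dstMin, dstMax].
constexpr int linearMap (int value, int srcMin, int srcMax, int dstMin, int dstMax)
{
    return dstMin + ((dstMax - dstMin) * (value - srcMin)) / (srcMax - srcMin);
}

// minPriority / maxPriority are supplied by the caller here; they are assumptions
// of this sketch, not values queried from the scheduler.
inline SchedChoice chooseSched (int priority, int minPriority, int maxPriority)
{
    assert (priority >= 0 && priority <= maxInputPriority);

    // Everything below the realtime threshold collapses to priority 0.
    if (priority < lowestRealtimePriority)
        return { Policy::other, 0 };

    return { Policy::roundRobin,
             linearMap (priority, lowestRealtimePriority, maxInputPriority, minPriority, maxPriority) };
}
```

With placeholder bounds of 15 and 47, `chooseSched (5, 15, 47)` and `chooseSched (3, 15, 47)` both come back with priority 0, which is why a default-priority thread and a deliberately low one end up in the same bucket.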

Here’s a simple fix that seems to keep the original intent of the thread code:

Can we please have it patched back into JUCE 6.1? Incidentally, the code is still in JUCE 7 and presumably still causing problems.

FWIW, I suspect that this OpenGL fix here is also no longer needed after the change above. Thread: Update macOS thread priority calculation · juce-framework/JUCE@48c6087 · GitHub

Thanks!
Dave

I know the JUCE team have probably been focusing on the J7 launch but this also has me worried. We have a lot of threads running and this could cause problems with file reading amongst others in Tracktion Engine/Waveform. Thanks in advance.

I spent most of today looking at this. Unfortunately, Apple has changed pthread priority characteristics across their devices, breaking our ability to map our 0-10 range to something useful.

I spent the day mapping the performance characteristics with raw pthread on my M1. 0-4 will restrict the thread to the E cores, and anything from 5 upwards is balanced across all cores.

It appears to be slightly different on the Pro/Max: a priority in the 0-9 range will restrict the thread to the E cores and potentially run it at a lower clock, resulting in extreme performance drops. I see 4x lower performance on my M1 with 4 E cores; we’re likely to see 8x lower on the Pro/Max with only 2 E cores.

sched_get_priority_min/max appears not to return anything useful on M1 platforms, and setting a priority of zero with the SCHED_OTHER policy (as pthread docs recommend/require) will force that thread into its lowest performance characteristic.
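If you want to see what your own machine reports, a small sketch along these lines should do it. It only uses the POSIX `sched_get_priority_min` / `sched_get_priority_max` calls; the `PriorityRange` struct and the function names are hypothetical helpers for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <sched.h>

struct PriorityRange
{
    int min;
    int max;
};

// Returns whatever range the OS reports for the given scheduling policy.
// Both calls return -1 on error (for example, an unknown policy).
inline PriorityRange queryPriorityRange (int policy)
{
    assert (policy == SCHED_OTHER || policy == SCHED_RR || policy == SCHED_FIFO);
    return { sched_get_priority_min (policy), sched_get_priority_max (policy) };
}

// Prints the reported ranges so they can be compared across machines.
inline void printPriorityRanges()
{
    const PriorityRange other = queryPriorityRange (SCHED_OTHER);
    const PriorityRange rr    = queryPriorityRange (SCHED_RR);

    std::printf ("SCHED_OTHER: %d..%d\n", other.min, other.max);
    std::printf ("SCHED_RR:    %d..%d\n", rr.min, rr.max);
}
```

Calling `printPriorityRanges()` on a few different machines is a quick way to check whether the reported range tells you anything about which cores a thread will land on.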

TL;DR: We can no longer rely on POSIX threads for macOS. Will fix.

The docs from Apple state that one should use pthread_set_qos_class_self_np to set priority. Maybe this is something that can be adopted in JUCE.
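As a rough idea of what calling that looks like from the thread itself, here is a minimal sketch. `WorkKind`, `qosAvailable` and `requestQosForCurrentThread` are invented names for this example and are not JUCE API; which QoS class suits which kind of work is also just an assumption here. On non-Apple platforms the function does nothing and returns false.

```cpp
#include <cassert>
#include <pthread.h>

#if defined (__APPLE__)
 #include <pthread/qos.h>
constexpr bool qosAvailable = true;
#else
constexpr bool qosAvailable = false;
#endif

// Hypothetical categories for this sketch only.
enum class WorkKind { userInitiated, utility, background };

// Mirrors Apple's QOS_MIN_RELATIVE_PRIORITY; relative priorities run from 0 down to this.
constexpr int lowestRelativePriority = -15;

// Asks the OS to run the *calling* thread at the QoS class chosen for this kind
// of work. Returns true only if the request was applied.
inline bool requestQosForCurrentThread (WorkKind kind, int relativePriority = 0)
{
    assert (relativePriority <= 0 && relativePriority >= lowestRelativePriority);

   #if defined (__APPLE__)
    qos_class_t qos = QOS_CLASS_UTILITY;

    switch (kind)
    {
        case WorkKind::userInitiated: qos = QOS_CLASS_USER_INITIATED; break;
        case WorkKind::utility:       qos = QOS_CLASS_UTILITY;        break;
        case WorkKind::background:    qos = QOS_CLASS_BACKGROUND;     break;
    }

    return pthread_set_qos_class_self_np (qos, relativePriority) == 0;
   #else
    (void) kind;
    (void) relativePriority;
    return false;
   #endif
}
```

A worker would call `requestQosForCurrentThread (WorkKind::utility)` as the first thing in its run function, since the call only affects the thread that makes it.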

QoS is the direction Apple and Windows are going for scheduler prioritising, especially as we see more machines with asymmetrical processor architectures.

We’re going to have to update our Thread models in JUCE.

Thank you very much for looking into this!

In case it helps others following this, I found the following doc enlightening: Scheduling of Threads on M1 Series Chips: second draft – The Eclectic Light Company

@Rincewind I did notice that qos.h (where I believe pthread_set_qos_class_self_np lives) does not have a method for setting the priority of another thread aside from temporarily. So there’s a bit more work for things like ThreadPools.
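One possible way around that for pools, sketched below, is to fix the QoS class when the worker is created rather than changing it from outside afterwards. This uses `pthread_attr_set_qos_class_np` from the same header on Apple platforms and falls back to default attributes elsewhere. `launchUtilityWorker` and `markDone` are hypothetical names, and the choice of `QOS_CLASS_UTILITY` is only an assumption for the example.

```cpp
#include <cassert>
#include <pthread.h>

#if defined (__APPLE__)
 #include <pthread/qos.h>
#endif

// Creates a joinable thread whose QoS class is set at creation time.
// Returns true if the thread was started.
inline bool launchUtilityWorker (pthread_t& thread, void* (*entry) (void*), void* context)
{
    assert (entry != nullptr);

    pthread_attr_t attr;

    if (pthread_attr_init (&attr) != 0)
        return false;

   #if defined (__APPLE__)
    // Apple-only: request a QoS class for the new thread before it starts.
    pthread_attr_set_qos_class_np (&attr, QOS_CLASS_UTILITY, 0);
   #endif

    const bool started = pthread_create (&thread, &attr, entry, context) == 0;
    pthread_attr_destroy (&attr);
    return started;
}

// Example job for the sketch: flips a flag owned by the caller.
inline void* markDone (void* context)
{
    *static_cast<bool*> (context) = true;
    return nullptr;
}
```

This does not help with changing the priority of a thread that is already running; for that, the worker would still have to set its own class when it picks up a job.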

Thanks again,
Dave

I thought I would clarify the current state of threading priorities on M1 platforms while we decide what direction we want to go in.

Since @reuk’s recent change, setting the priority on M1 platforms will have no effect. However, it does fix the issue of threads being placed into the lower tier. So you can expect all your threads to run as fast as possible (outside of real-time) regardless of priority, with no more crippling performance on M1+ chips.

The pthread_setschedparam method we currently use effectively exposes two priority levels:

  • 0-4: Threads will be restricted to E Cores only.
  • 5+: Threads will fill available P Cores and spill over to E cores if necessary.

As sched_get_priority_min/max always returns a range well above 4, threads on M1+ are set to the higher performance threshold regardless of the level requested.
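To make that concrete, here is a small sketch of the reasoning, not the actual JUCE code. The 0-4 / 5+ thresholds are the ones listed above; `tierForPthreadPriority` and `mapToReportedRange` are hypothetical names, the linear remap is just one plausible way to spread 0-10 over the reported range, and the bounds of 15 and 47 used in the note below are placeholder values standing in for whatever `sched_get_priority_min/max` return.

```cpp
#include <cassert>

enum class CoreTier { efficiencyOnly, performanceFirst };

// Thresholds as described in this thread for M1:
// 0-4 stays on E cores, 5 and up fills P cores first.
constexpr CoreTier tierForPthreadPriority (int pthreadPriority)
{
    return pthreadPriority <= 4 ? CoreTier::efficiencyOnly
                                : CoreTier::performanceFirst;
}

// Assumed linear remap of a JUCE priority (0-10) onto the range the OS reports.
inline int mapToReportedRange (int jucePriority, int minPriority, int maxPriority)
{
    assert (jucePriority >= 0 && jucePriority <= 10);
    return minPriority + ((maxPriority - minPriority) * jucePriority) / 10;
}
```

With placeholder bounds of 15 and 47, even `mapToReportedRange (0, 15, 47)` gives 15, so every JUCE priority lands in the 5+ tier and the requested level makes no difference to core placement.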

These priority levels have nothing to do with the QoS classes to which a particular blog post refers. These are not mapped to pthread priority levels and can only be accessed via different API calls.

We’re currently thinking about what to do with this new API and how to best make it accessible for everyone.
