Workgroup.join() crashes sometimes - I am not using it with Juce threads, am I doing it wrong?

Our plugin uses JUCE’s master tip, v 7.0.8.

On each audioWorkgroupContextChanged (const juce::AudioWorkgroup& workgroup) callback, I wrap the workgroup reference in a mutable lambda.

Our threads - started in an existing library - are POSIX threads, set to “realtime” with the correct Apple APIs, and they work correctly with our own Apple workgroup API implementation in a standalone application.

In a JUCE plugin, however, we have to use JUCE’s API for joining workgroups, since JUCE doesn’t expose the native handle.

Hence the lambda: I copy it so that it is available to each of these already running threads, which invoke it once after every audioWorkgroupContextChanged callback, to join the new workgroup.
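To make the handoff described above concrete, here is a minimal sketch of one way to pass a join action to threads that are already running. It is not our actual code: JoinHandoff, publish, joinIfChanged and demoHandoff are hypothetical names, and a std::function with a mutex stands in for whatever mechanism the real library uses.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper, not part of JUCE: hands a "join" action from the
// context-changed callback to worker threads that are already running.
class JoinHandoff
{
public:
    // Called from the callback: store the new action and bump the generation.
    void publish (std::function<void()> joinAction)
    {
        const std::lock_guard<std::mutex> lock (mutex);
        action = std::move (joinAction);
        generation.fetch_add (1, std::memory_order_release);
    }

    // Called by each worker at a safe point. Runs the action at most once
    // per publish; lastSeen is that worker's own record of what it handled.
    bool joinIfChanged (unsigned& lastSeen)
    {
        if (generation.load (std::memory_order_acquire) == lastSeen)
            return false; // nothing new, no lock taken

        std::function<void()> localCopy;
        {
            const std::lock_guard<std::mutex> lock (mutex);
            localCopy = action;
            lastSeen = generation.load (std::memory_order_relaxed);
        }

        if (localCopy)
            localCopy();

        return true;
    }

private:
    std::mutex mutex;
    std::function<void()> action;
    std::atomic<unsigned> generation { 0 };
};

// Single-threaded walk-through of the protocol.
inline int demoHandoff()
{
    JoinHandoff handoff;
    int joins = 0;
    unsigned lastSeen = 0;

    handoff.joinIfChanged (lastSeen);         // nothing published yet
    handoff.publish ([&joins] { ++joins; });  // "callback" publishes an action
    handoff.joinIfChanged (lastSeen);         // worker runs it once
    handoff.joinIfChanged (lastSeen);         // unchanged, so it is skipped

    assert (joins == 1);
    return joins;
}
```

The lock is only taken when the generation has changed, but a truly realtime-safe version would probably want a lock-free handoff instead; that is left out here to keep the sketch short.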

On my Mac (M1, macOS 13.6.1) this could be run hundreds of times without fail - even joining 20 threads to a workgroup (not a real-world scenario…).

Except yesterday, when it almost always crashed, even with 2 threads - only to never crash again when I reran the tests today:

plugin +0x0612a8 juce::FixedSizeFunction<T>::operator() (juce_FixedSizeFunction.h:196)
plugin +0x0612a8 juce::WorkgroupToken::getTokenProvider (juce_AudioWorkgroup.h:80)
plugin +0x0612a8 juce::AudioWorkgroup::WorkgroupProvider::join (juce_AudioWorkgroup.cpp:93)

The same crash has been frequently reported by some testers.

I do test the workgroup object for validity before using it - with ‘explicit operator bool()’, since I’ve noticed some hosts (Reaper) cause the callback to be called also with an invalid workgroup.

From the above, I’m quite convinced that there’s a race condition somewhere.

To my questions:

Does the ‘const juce::AudioWorkgroup& workgroup’ reference go stale within some time, after which the “token provider” inside is invalid?

Am I ‘allowed’ to invoke it from already running threads or am I expected to restart the threads for each invocation?

Thanks!

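AudioWorkgroup manages the lifetime of the underlying OS workgroup by reference counting, through the os_retain and os_release functions. Holding only a C++ reference to the object does not adjust that count, so the workgroup can be released while your threads are still using it.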

I suggest creating a copy of the object to ensure accurate reference counting and see if that helps.
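As a rough illustration of why the copy matters, here is a self-contained sketch. FakeWorkgroup is a hypothetical stand-in built on std::shared_ptr, not the real juce::AudioWorkgroup, and makeJoinAction and demoCopyOutlivesCallback are made-up names; in real code the captured copy would be the JUCE object and the lambda body would join it.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a reference-counted handle. The shared_ptr
// plays the role that os_retain / os_release play for the real workgroup.
class FakeWorkgroup
{
public:
    FakeWorkgroup() = default;
    explicit FakeWorkgroup (int idToUse) : state (std::make_shared<int> (idToUse)) {}

    explicit operator bool() const { return state != nullptr; }
    long useCount() const          { return state.use_count(); }
    int id() const                 { return state ? *state : -1; }

private:
    std::shared_ptr<int> state;
};

// Capture by VALUE: the lambda owns its own copy, so the count goes up and
// the handle stays alive for as long as the callable exists.
// Capturing with [&workgroup] instead would leave a dangling reference once
// the callback that supplied the argument has returned.
inline std::function<int()> makeJoinAction (const FakeWorkgroup& workgroup)
{
    return [copy = workgroup] { return copy ? copy.id() : -1; };
}

// The "host" object goes out of scope before the action runs,
// yet the action still sees a valid handle through its copy.
inline int demoCopyOutlivesCallback()
{
    std::function<int()> action;

    {
        FakeWorkgroup hostOwned (42);
        action = makeJoinAction (hostOwned);
        assert (hostOwned.useCount() == 2); // host + captured copy
    }

    return action();
}
```

The validity check inside the lambda mirrors testing the object with its explicit operator bool() before use.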

If the issue still persists can you create an MVP for me to test? An example of how you capture and use the workgroup would be ideal.

Thank you, I now copy, and I’ve not managed to reproduce the crash again!

Also two of our testers gave it a try with Logic, which always crashed, and it is now A OK. Pending more extensive testing I’d consider it fixed. I hope.

With that said, I think the const & callback argument really does communicate “don’t copy me”. It’d be a nice usability improvement if that were addressed, or at least documented.
