Multiple render threads in processBlock() - Best approach?

I need to process some stuff on separate threads, inside and directly from the processBlock callback.

Communication happens through atomic flags and FIFOs, so this is not a problem.

I can think of 3 questions:

1.

The algorithm is designed so that the main processBlock() should never idle, but in exceptional situations the audio callback may need to wait until another thread has finished its job. What is the best way to do that?

Would something like this be a good solution?
while (waitForOtherThreads) { Thread::yield(); }
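
For reference, here is a minimal sketch of what that wait loop could look like with the flag spelled out. It uses only the standard library: std::this_thread::yield() stands in for juce::Thread::yield(), and workerBusy, waitForWorker, workerFinish and runWaitDemo are hypothetical names, not anything from JUCE.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical flag: true while a worker still owns data the callback needs.
std::atomic<bool> workerBusy { false };

// Audio-callback side, for the exceptional case only.
// Assumption: std::this_thread::yield() stands in for juce::Thread::yield().
void waitForWorker()
{
    while (workerBusy.load (std::memory_order_acquire))
        std::this_thread::yield();
}

// Worker side: publish the result first, then clear the flag with release
// ordering so the waiting thread is guaranteed to see the result.
void workerFinish (int& sharedResult, int value)
{
    sharedResult = value;
    workerBusy.store (false, std::memory_order_release);
}

// Small demo: the "callback" waits until the worker has published 42.
int runWaitDemo()
{
    int result = 0;
    workerBusy.store (true, std::memory_order_release);

    std::thread worker ([&result] { workerFinish (result, 42); });
    waitForWorker();
    assert (result == 42); // visible thanks to the acquire/release pair
    worker.join();
    return result;
}
```

The acquire/release pair is what makes the plain int safe to read after the loop exits. How long each yield takes is up to the OS scheduler.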

2.

How do I signal other threads to wake up from within the callback? I successfully implemented that once using WaitableEvent::signal(), but I wonder if there are better ways to do this.

3.

What priority should these extra threads have?

macOS introduced the concept of audio workgroups. Are these also useful for plugins? Does anyone have experience with them?

Wouldn’t it make sense for JUCE to provide a multi-platform API here?
Especially considering that newer plugin formats like CLAP directly support multi-threading.

With CLAP, that only works if the host provides the service to its plugins. Plugins must always be prepared to do their processing serially, or with their own multithreading implementation.

This causes the OS to stop processing your thread for at least 10ms on Windows.
You might not want your plugin to impose that much latency.
I believe the alternative is a spin-lock. The downside is that spin-locks burn CPU (and battery) while waiting, so they are not very suitable on mobile devices.

How to signal other-threads to wake up in the callback.

I’m no expert, but I believe the standard method is to use a std::condition_variable. Most examples also require a mutex though, which is not realtime safe, so I’m not 100% sure that’s the way to do it.
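
To make the trade-off concrete, here is a minimal sketch of the condition-variable approach. WorkSignal and runSignalDemo are hypothetical names for illustration. Note that signal() takes the mutex briefly, which is exactly the part that is not strictly realtime safe.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical wake-up primitive for a single worker thread.
class WorkSignal
{
public:
    // Called by the thread handing out work (e.g. the audio callback).
    // The short lock avoids a lost wake-up, but it is a lock nonetheless.
    void signal()
    {
        {
            std::lock_guard<std::mutex> lock (mutex);
            pending = true;
        }
        condition.notify_one();
    }

    // Called by the worker: sleeps until signal() was called, then resets.
    void wait()
    {
        std::unique_lock<std::mutex> lock (mutex);
        condition.wait (lock, [this] { return pending; });
        pending = false;
    }

private:
    std::mutex mutex;
    std::condition_variable condition;
    bool pending = false;
};

// Small demo: the worker blocks until it is signalled, then does its "job".
int runSignalDemo()
{
    WorkSignal workSignal;
    int processed = 0;

    std::thread worker ([&] { workSignal.wait(); processed = 1; });
    workSignal.signal();
    worker.join();

    assert (processed == 1);
    return processed;
}
```

The predicate passed to wait() handles both spurious wake-ups and the case where signal() runs before the worker has started waiting.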

This causes the OS to stop processing your thread for at least 10ms on Windows.

Interesting, do you have more information or benchmarks?

Thread::yield() uses Sleep(0) on Windows.
According to the Microsoft documentation:

After the sleep interval has passed, the thread is ready to run. If you specify 0 milliseconds, the thread will relinquish the remainder of its time slice but remain ready.

On macOS, sched_yield() is used, which seems to be the POSIX equivalent.
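
Since the question of benchmarks came up, here is a rough sketch for timing a yield call on your own machine. It is an assumption-laden stand-in: it measures std::this_thread::yield() rather than Sleep(0) or sched_yield() directly, and YieldTimings and measureYield are hypothetical names. Swap in whichever call you actually care about.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical result type: median and worst-case duration of one yield.
struct YieldTimings
{
    double medianMicros;
    double maxMicros;
};

// Times numSamples individual yield calls with a monotonic clock.
YieldTimings measureYield (int numSamples)
{
    assert (numSamples > 0);
    using Clock = std::chrono::steady_clock;

    std::vector<double> micros;
    micros.reserve (static_cast<std::size_t> (numSamples));

    for (int i = 0; i < numSamples; ++i)
    {
        const auto start = Clock::now();
        std::this_thread::yield(); // stand-in for the call under test
        const auto end = Clock::now();

        micros.push_back (std::chrono::duration<double, std::micro> (end - start).count());
    }

    std::sort (micros.begin(), micros.end());
    return { micros[micros.size() / 2], micros.back() };
}
```

The numbers will depend heavily on system load and on how many other threads are ready to run, so it is worth measuring both on an idle machine and inside a busy session.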

I found out that Tracktion Engine also uses these CPU instructions in combination:

 _mm_pause();  (Win)
 __asm__ __volatile__ ("yield");  (Mac)

These are basically special “spin” hints for the CPU, used when waiting for very short times. After a number of spins, the yield methods mentioned above are used instead.
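
A minimal sketch of that spin-then-yield pattern is below. This is not Tracktion Engine's actual code: cpuRelax, spinThenYieldUntilClear and the spinsBeforeYield parameter are hypothetical, and the default of 64 is an arbitrary placeholder rather than a recommended value. Strictly speaking, _mm_pause() is an x86 intrinsic and yield is an ARM instruction, so the sketch selects by CPU architecture rather than by OS.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
 #include <immintrin.h>
 inline void cpuRelax() { _mm_pause(); }                       // x86 spin hint
#elif defined(__aarch64__)
 inline void cpuRelax() { __asm__ __volatile__ ("yield"); }    // ARM spin hint
#else
 inline void cpuRelax() {}                                     // no hint available
#endif

// Waits until 'busy' becomes false. The first spinsBeforeYield iterations
// only issue a CPU pause hint; after that, each iteration yields to the OS.
// Returns the number of iterations, which is handy when experimenting.
inline long spinThenYieldUntilClear (const std::atomic<bool>& busy,
                                     int spinsBeforeYield = 64) // arbitrary placeholder
{
    assert (spinsBeforeYield >= 0);

    long iterations = 0;

    while (busy.load (std::memory_order_acquire))
    {
        if (iterations < spinsBeforeYield)
            cpuRelax();
        else
            std::this_thread::yield();

        ++iterations;
    }

    return iterations;
}
```

Returning the iteration count makes it easy to log how often the yield fallback is actually reached for a given spinsBeforeYield, which is one way to explore the "optimal number of spins" question empirically.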

I wonder whether anyone has ever checked to what extent this makes sense, and what number of spins, for example, would be optimal here. Are there any studies or best practices on this?

Most examples also require a mutex though, which is not realtime safe, so I’m not 100% sure that’s the way to do it.

I think if we work with different threads we have already given up “classical” realtime safety principles, but it would be interesting to hear what others have experienced.

c++ - How to make thread sleep less than a millisecond on Windows - Stack Overflow

This covers it a little. Basically it is quite random how long it sleeps for.
But what if the DAW has lined up, say, 10 plugins to process, and your plugin relinquishes the DAW's time slice? Are you stealing CPU from other plugins in the DAW?