How many audio threads can I create without problems?

It is not purely academic. I have a good reason to use more threads, but it does not matter for this discussion.


Not commenting specifically on this thread, but in general, context (almost) always matters.
It could be that there’s a much better way of doing things if you provide enough context.


Co-opting threads to do the work but not depending on them to finish is an interesting take. I guess then it comes down to how you decide that a thread has been delayed so that you can complete the work on the main audio thread.

BTW, in my responses here I’ve specified that such an intra-plugin multi-threaded approach should be a user-toggleable option, and my context has been synthesisers, where isolated voices should be good candidates for threading. But, yes, in a DAW situation you don’t have full control over threading, and then there’s the OS too. For standalone embedded projects it may be a different matter.

Can’t speak for the OP, but if you’re talking about PolyBLEP or MinBLEP oscillators then yes, you should be able to run a lot of them without stress. But if you’re handling oscillators that use IFFT for band-limiting, or more adventurous approaches to additive synthesis, then the processing burden per voice starts to stack up depending on your polyphony. Add phasor unisons, harmonic unisons, etc., and it’s quite possible to get modern commercial plugin synthesisers to choke depending on the patch, polyphony, processing block size and so on.

If modern computers are shipping with 16/32+ cores, then it’d be nice to be able to use that extra power.

reFX: believe it or not, it doesn’t matter how fast your CPU is, there will always be people who find useful ways to harness all the resources. Take my project for example. I’m working on massive microphone arrays. Say you have 84 channels in and 36 channels out, and every input and every output is connected with a filter. And let’s say that every filter should be updated dynamically in every processing block based on an analysis of the input signal. Now, when every processing block involves several thousand FFTs, hundreds of matrix inversions and thousands of large matrix multiplications, and you want to support hardware from a few years ago, then after profiling, hand-vectorizing with intrinsics for each platform, poring over the assembly code and optimizing everything for a few months, you will eventually start to think about the possibility of multithreading.

To answer the original question: you can run one thread per core and the OS will distribute them across the cores. However, if your CPU uses hyperthreading, Windows will happily schedule two threads on the same physical core, which runs much slower than when they are on different cores, so you should avoid that. There is no Windows API call for this specifically, so you have to do the following in the audio callback function: 1. Pin the main audio thread to the core it is running on. 2. Pin the worker threads to each of the remaining cores. 3. Run your processing. 4. Unpin the main audio thread. It’s a dirty hack, but I haven’t found a better way.

How do you pin a thread to a core?

I won’t go too far into the details, as this is not specific to JUCE, but rather a general Windows programming topic. To give you a starting point for finding the necessary information, here are the Windows API functions that are involved: SetThreadAffinityMask, GetCurrentThread, GetProcessAffinityMask, GetCurrentProcess, GetCurrentProcessorNumberEx, GetLogicalProcessorInformation, GetThreadGroupAffinity. You will also want to use AvSetMmThreadCharacteristics, AvSetMmThreadPriority and AvRevertMmThreadCharacteristics.

It’s so tempting to just throw a few extra threads at a problem, but the debugging nightmares that come with unsynchronized data just aren’t worth it. I’ve found that sticking to the ‘one main audio thread’ rule and using std::atomic for simple state changes - or a proper lock-free FIFO for anything complex - is really the only way to sleep at night.

I don’t think anyone reading this thread would find it very tempting to go multithreaded, but whether it’s worth it or not depends entirely on your application. FIFOs and atomic variables are two of the tools you can use if you choose to go that way, but there are many others, and barring those that have been superseded by newer ones, each of them has its legitimate use cases and should be studied so you know which tool to use in which case.