Timur Doumler Talks on C++ Audio (Sharing data across threads)

That is the case on Apple operating systems afaik, where the OS fabricates that thread.

Afaik, that is not the case on WASAPI, where – if you use that API explicitly instead of JUCE – you actually create the processing thread yourself, and you yourself are responsible for giving it some priority.

In general – please correct me if I’m wrong! – afaik there is nothing magic or special about this thread. OK, there is a bit of magic on macOS and iOS (CoreAudio), but in general there is no magic here. It’s just a thread.

I don’t know, but there are real-time Linux kernels which you might be able to ask for a “real-time” thread with some special characteristics. For example, I think it’s possible on real-time kernels with POSIX mutexes to elevate the priority of a thread holding a contended lock, which minimises the time the lock is held and avoids priority inversion.
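To illustrate that last point, here is a minimal sketch of requesting a priority-inheritance mutex through the POSIX API (the setup function and the global are mine, purely illustrative; support depends on the platform and kernel):

```cpp
#include <pthread.h>

pthread_mutex_t paramLock;

// Illustrative setup: create a mutex using the priority-inheritance
// protocol, so a thread holding the lock is temporarily boosted to the
// priority of the highest-priority thread waiting on it.
void initParamLock()
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init (&attr);
    pthread_mutexattr_setprotocol (&attr, PTHREAD_PRIO_INHERIT);

    pthread_mutex_init (&paramLock, &attr);
    pthread_mutexattr_destroy (&attr);
}
```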

I think Android has something similar.

But ultimately, yes, these are just threads, and making blocking calls on them will eventually cause problems unless you have exact knowledge of how the scheduler works.

So maybe the conclusion is to follow all the current best practices and to minimise any data sharing.

Yes. And this is why high frequency traders use their own hacked Linux kernels :wink:

But I don’t think that the common Linux audio APIs like ALSA, JACK, etc. are using any such things?

Well, I thought that was one of the reasons for the Ubuntu Studio OS?
That it has improved real-time guarantees. Maybe it uses modified versions of JACK/ALSA etc.?

No idea. I’m curious, too.

Is anyone here more knowledgeable about these things?

I think as long as it’s an indication and the implementation is free to ignore it, that might be a reasonable thing to introduce. It would be similar to other attributes that serve as optimisation hints, such as [[likely]] and [[unlikely]] which we are introducing in C++20.
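For context, the attributes look like this in use – a toy example of mine, not from the talk; the hints affect code layout and branch prediction at most, and a conforming implementation may ignore them entirely:

```cpp
#include <cstdio>

void process (int numSamples)
{
    if (numSamples > 0) [[likely]]       // hint: true on almost every call
        std::printf ("processing %d samples\n", numSamples);
    else [[unlikely]]                    // hint: cold path
        std::printf ("empty buffer\n");
}
```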

Can you explain this? What is a “real-time language” if C++ isn’t?

How do these get propagated to the scheduler though?
I can understand that [[likely]] and [[unlikely]] might be used to re-order code for cache locality etc. (oh, and the branch predictor…), but how would the OS know that a bunch of instructions should be grouped and not interrupted? Are there CPU instructions for this?

Maybe what I’m after is the Transactional Memory TS…

I have no idea how any of this stuff actually works.

What I mean is that C++ doesn’t offer you a real-time guarantee. You cannot declare a function in C++ that is defined to finish in less than X milliseconds. There exist real-time systems that can do this, that’s what they use for things like life-critical medical equipment, rockets, and airplanes.

Within the constraints of a non-real-time operating system though, C++ is an exceptionally good language for “real-time-ish” applications (for lack of a better term), which is what we talk about when we talk about audio and similar things.

And if I had to define “real-time-ish”, I would say: achieving “real time safe” performance in practice, i.e. audio without glitches, but achieving that through the way we write our code, and not through any kind of explicit mechanism that would guarantee such characteristics (which doesn’t exist in C++).

BTW, the best summary I’ve seen for this topic is here: http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing

On Windows, MMCSS can improve latency and reliability for both WASAPI and ASIO. The JUCE WASAPI wrapper already takes advantage of MMCSS, so I don’t think there would be any benefit to rolling your own WASAPI code.

For ASIO, Steinberg specifies that ASIO drivers should use MMCSS internally:

The driver should set an appropriate high priority for the thread. Starting with Windows Vista, Microsoft introduced the Multimedia Class Scheduler Service (MMCSS). On Windows Vista or any newer Windows version, ASIO driver threads must be in the “Pro Audio” class of MMCSS and their priority set to “CRITICAL”. This guarantees the highest execution priority for the bufferSwitch(). It lies within the sole responsibility of the ASIO driver to set the priorities of the threads it owns. An ASIO host shall by no means alter these priorities.

So any recent ASIO driver should be doing this already. MMCSS is a good thing.

https://docs.microsoft.com/en-us/windows/desktop/procthread/multimedia-class-scheduler-service
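For illustration, opting a thread into MMCSS looks roughly like this – a sketch based on the documented avrt API, with error handling omitted:

```cpp
#include <windows.h>
#include <avrt.h>   // link against Avrt.lib

void audioThreadEntry()
{
    DWORD taskIndex = 0;

    // Ask MMCSS to schedule this thread in the "Pro Audio" class.
    HANDLE mmcss = AvSetMmThreadCharacteristicsW (L"Pro Audio", &taskIndex);

    if (mmcss != nullptr)
        AvSetMmThreadPriority (mmcss, AVRT_PRIORITY_CRITICAL);

    // ... run the audio processing loop ...

    if (mmcss != nullptr)
        AvRevertMmThreadCharacteristics (mmcss);
}
```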

But - the real issue with Windows is the underlying kernel.

I spent a fair amount of time writing Windows audio drivers. Of course everyone wants to make an ASIO driver with consistent and isochronous callbacks to the client application, but the whole Windows DPC architecture makes that difficult. Essentially Windows NT is a preemptive multitasking system layered on top of a cooperative multitasking system.

Decent writeup about Windows DPCs and how they affect audio latency here:
https://support.focusrite.com/hc/en-gb/articles/208360865-Troubleshooting-DPC-latency

Microsoft docs say that an individual DPC shouldn’t execute for more than 100 microseconds, and that a driver should not hold a spinlock for more than 25 microseconds. But there’s no way for the OS to enforce those rules and they are routinely violated.

So, yes, the WASAPI MMCSS threads are better and more likely to be real-time. But any arbitrary kernel mode driver could run a DPC for a few milliseconds and wreak havoc. As long as the underlying OS is not real-time, there’s not much you can do.

Hope that helps-

Matt

I pass a copy of the object via a non-blocking FIFO, then relax, safe in the knowledge that I can modify the original object with absolutely no race conditions to worry about.
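A minimal sketch of that approach, assuming a single producer (message thread) and a single consumer (audio thread) – the SpscFifo and Params types are illustrative, not any particular library:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

struct Params { float gain = 1.0f; float pan = 0.0f; };

template <typename T, std::size_t Capacity>
class SpscFifo
{
public:
    bool push (const T& value)               // producer (message) thread only
    {
        auto w = writeIndex.load (std::memory_order_relaxed);
        auto next = (w + 1) % Capacity;

        if (next == readIndex.load (std::memory_order_acquire))
            return false;                    // full: caller can retry later

        buffer[w] = value;                   // copy the object in
        writeIndex.store (next, std::memory_order_release);
        return true;
    }

    std::optional<T> pop()                   // consumer (audio) thread only
    {
        auto r = readIndex.load (std::memory_order_relaxed);

        if (r == writeIndex.load (std::memory_order_acquire))
            return std::nullopt;             // empty

        T value = buffer[r];                 // copy the object out
        readIndex.store ((r + 1) % Capacity, std::memory_order_release);
        return value;
    }

private:
    std::array<T, Capacity> buffer {};
    std::atomic<std::size_t> writeIndex { 0 }, readIndex { 0 };
};
```

The audio thread only ever sees copies, so the original object can be modified freely on the message thread.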

Perhaps your question should not be “How can I share ownership of pointers to objects between threads?”, but rather “Should I?”.

We are using try_lock in the audio thread. 99.999% of the time you will be able to acquire the lock in the audio thread, which is good enough for our purposes. You just have to handle the remaining 0.001%. The worst case is a parameter update delay of one block.
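For reference, that pattern looks roughly like this (names are illustrative; on a contended block the audio thread just keeps the previous parameters):

```cpp
#include <mutex>

struct Params { float gain = 1.0f; };

std::mutex parameterMutex;
Params sharedParams;    // written by the message thread under the mutex
Params activeParams;    // only ever touched by the audio thread

void processBlock (float* buffer, int numSamples)
{
    if (parameterMutex.try_lock())       // never blocks the audio thread
    {
        activeParams = sharedParams;     // cheap copy while holding the lock
        parameterMutex.unlock();
    }
    // else: lock was contended this block – reuse last block's parameters

    for (int i = 0; i < numSamples; ++i)
        buffer[i] *= activeParams.gain;
}
```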

That sounds reasonable, but can it not potentially fail the try on the next block also?

Actually, whilst Just::Thread is commercial, the atomic_shared_pointer is open source under a BSD-style license.

There is also an atomic shared pointer in folly. Folly also has various other useful concurrent data structures.

I tested this one a while ago. Calling is_lock_free() returned true on Mac and iOS, but false on Windows. So depending on what you’re targeting and how important lock-free-ness is to your use case, test it.

Hmm… it also seems rather old and uses std::auto_ptr, which no longer compiles on macOS. I’m guessing this is an abandoned fork of the Just::Thread code.

Folly is definitely the better option if you can live with the dependencies.
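For reference, the publish/consume pattern with an atomic shared pointer looks roughly like this – the include path and names are as in folly’s headers at the time of writing, so verify against the version you’re using:

```cpp
#include <folly/concurrency/AtomicSharedPtr.h>
#include <memory>

struct Coefficients { /* filter state, etc. */ };

folly::atomic_shared_ptr<Coefficients> published;

void messageThreadUpdate()
{
    // Build the new object off the audio thread, then publish atomically.
    published.store (std::make_shared<Coefficients>());
}

void audioThreadProcess()
{
    std::shared_ptr<Coefficients> local = published.load();
    // ... use *local for this block ...
}   // caveat: if 'local' holds the last reference when it goes out of
    // scope, the destructor runs here, on the audio thread
```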

What about two lock-free queues of pointers to objects (see the sketch after the list)?

  • one from main Thread to audio thread to bring objects in;
  • one from audio thread to main thread to deallocate memory, once the object is no longer needed by the audio thread.
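A sketch of that scheme, reusing the illustrative Params struct and SpscFifo queue from the earlier sketches (any lock-free SPSC queue of pointers would do):

```cpp
SpscFifo<Params*, 32> toAudio;   // main -> audio: freshly allocated objects
SpscFifo<Params*, 32> toTrash;   // audio -> main: retired objects

void messageThreadUpdate()
{
    toAudio.push (new Params {});        // allocate off the audio thread

    while (auto dead = toTrash.pop())    // reclaim what the audio thread
        delete *dead;                    // has finished with
}

Params* current = nullptr;               // owned by the audio thread

void audioThreadProcess()
{
    while (auto incoming = toAudio.pop())    // drain to the newest object
    {
        if (current != nullptr)
            toTrash.push (current);          // defer deletion to main thread
        current = *incoming;
    }
    // ... use *current (may be null before the first update) ...
}
```

All allocation and deallocation happens on the message thread; the audio thread only ever swaps pointers.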