Lock-free queues and visualization of data


#1

Hello guys and @timur !

Something has been bothering me for a while, and the few people I’ve asked haven’t been able to answer my questions. I think the lock-free queue isn’t the best structure for basic communication between an audio thread and the UI thread.

In the past, I coded a meter for JUCE, and also a spectrum analyzer. In both cases they work through a buffer shared by the audio and UI threads. The audio thread puts the data in from processBlock() as usual, and the UI thread reads that data from time to time via a Timer / HighResolutionTimer / TimeSliceClient or whatever.

The thing is, a meter is supposed to work by reading, say, the last 100-200 ms of data, for example for an RMS meter.

Tell me if I’m wrong, but the very principle of the AbstractFifo in JUCE (and of std::queue, I guess) is that every sample put into the structure with a write/push is supposed to be read/popped exactly once. Not twice, not zero times, otherwise it’s no longer a queue.

However, every time I need to refresh the data displayed on the UI in my RMS meter, I need to get the last N samples that are inside the buffer. Depending on the timer callback rate, some given audio samples might not be read even once (if the timer fires every second, for example), or might be read twice or more (if the timer fires every 20 ms, for example). Which means the lock-free queue can’t be used directly, or can’t be used at all.
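To make the distinction concrete, here is a minimal sketch (not JUCE code, all names hypothetical) of a single-writer ring buffer with “last N samples” semantics instead of read-once queue semantics. The UI may read the same samples twice or skip some entirely; for clarity it ignores the race where the writer laps the reader mid-copy, which a real implementation would guard with a sequence check.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical sketch: the audio thread only writes, the UI thread only reads.
template <size_t Capacity>
class LastNBuffer
{
public:
    void push (float sample)                            // audio thread
    {
        const auto w = writeIndex.load (std::memory_order_relaxed);
        data[w % Capacity] = sample;
        writeIndex.store (w + 1, std::memory_order_release);
    }

    // UI thread: copy the most recent n samples (n <= Capacity) into dest.
    // Samples may be read twice, or never -- that is the point.
    void readLast (float* dest, size_t n) const
    {
        const auto w = writeIndex.load (std::memory_order_acquire);
        for (size_t i = 0; i < n; ++i)
            dest[i] = data[(w + Capacity - n + i) % Capacity];
    }

private:
    std::array<float, Capacity> data {};
    std::atomic<size_t> writeIndex { 0 };
};
```

Unlike an AbstractFifo, nothing is ever “consumed”: the read side is a pure snapshot of the tail of the stream.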

That’s why I came up with an idea: something equivalent to the AbstractFifo class, but which always returns the last N samples: Concurrency, meters, JUCE classes. In fact, that thinking follows on from the one in this thread…

So, I would like to ask everybody on the forum who already has some experience with lock-free queues: are they really that useful? Isn’t it a mistake to use queues instead of something like my data structure? Or is it more efficient, or even possible, to solve that issue using only lock-free queues, that is, getting the last N samples of a buffer without compromising the behaviour of the queue?

Thanks in advance !


#2

I’m not sure if I understand your problem right, but if you always need the last N samples, why not make a second ring buffer, fed by the lock-free FIFO (which shouldn’t contain any audio logic, it’s just for decoupling the threads), and use the second ring buffer to get the last N samples?

Or just calculate the RMS in a ring buffer (no lock-free needed in this case) directly in processBlock(), and only push the result to the GUI.
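A rough sketch of that second suggestion (hypothetical names, not actual JUCE code): keep a ring of squared samples plus a running sum on the audio thread, and publish only the resulting RMS value for the GUI to poll.

```cpp
#include <atomic>
#include <cmath>
#include <vector>

class RmsMeter
{
public:
    explicit RmsMeter (size_t windowLength)
        : squares (windowLength, 0.0f) {}

    void process (const float* samples, size_t numSamples)  // audio thread
    {
        for (size_t i = 0; i < numSamples; ++i)
        {
            const float sq = samples[i] * samples[i];
            runningSum += sq - squares[pos];       // slide the window in O(1)
            squares[pos] = sq;
            pos = (pos + 1) % squares.size();
        }
        rms.store (std::sqrt (runningSum / (float) squares.size()),
                   std::memory_order_relaxed);
    }

    float getRms() const                           // UI thread
    {
        return rms.load (std::memory_order_relaxed);
    }

private:
    std::vector<float> squares;                    // squared-sample history
    size_t pos = 0;
    float runningSum = 0.0f;
    std::atomic<float> rms { 0.0f };
};
```

Only one atomic float crosses the thread boundary, so there is nothing left to queue.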


#3

Hello !

You have understood my problem: I need the last N samples put into a buffer, without any concurrency issues.

You are right about the possibility of using a second ring buffer; I don’t see any other way to do it if we keep using a lock-free queue to decouple the reading/writing between the UI and the audio thread. But I think that’s a bad approach if we need to store the same thing several times. And I think the structure I suggested in the other thread is a solution to that problem, since it lets you get the last N samples without the need for another audio buffer, while keeping the lock-free constraint. So I wanted some feedback about it too, and I wonder what Timur would think of it :wink:

For the RMS value, indeed it is possible to share only the value between the UI and the audio thread, but that means we need to calculate it all the time, or try to synchronize the audio and UI refresh rates. I think it’s a better approach to do the calculation only when the UI asks for it, so it’s done less often…


#4

I had a similar case for real-time pitch detection. I ended up using a ring buffer N samples long, with no lock-free queue.

I guess the queue is useful when you cannot drop any samples…


#5

I battled this problem for several months, writing 4 totally different implementations. My requirements were that it be audio buffer/audio sample rate/UI framerate independent, and RMS (or whatever) calculations for the UI were done on the UI thread, with audio data transported between the threads using a lock free queue.

It is A PAIN. I never got it working to the point that I was happy with it, and it was so elaborate the only way I could justify using it was to generalize it to make it usable for any component type, which bogged it down even further. And even then I would run into issues of the UI thread getting clogged and the resulting UI drawing having serious latency behind what was going on in the audio thread (now that I think back to it, this was the biggest issue).

[quote]
I think it’s a better approach to do the calculation only when the UI asks for it, so it’s done less often…
[/quote]

A way around this is to check getActiveEditor() to see if the UI is actually present before doing the meter calculations in the audio thread (though I don’t know if some hosts may report an editor is active when it’s not).

Save yourself the headache and do your UI element calculations (i.e. RMS value, FFTs, whatever) in the audio thread and access them in your editor via a single atomic object/value, interpolating if necessary if your buffer rate is lower than your UI frame rate. You may not even have to do this, as the lowest audio buffer rate I usually actively support is 44100 Hz / 1024 samples, which gives roughly 43 blocks per second (I support 10-60 FPS for the UI, but 43 looks smooth enough to pass for close to 60).

Hope this helps. I’d be curious to hear @timur’s thoughts as well since his talks are what started me on my implementations as well.


#6

[quote=“chkn, post:2, topic:20659, full:true”]
I’m not sure if I understand your problem right, but if you always need the last N samples, why not make a second ring buffer, fed by the lock-free FIFO (which shouldn’t contain any audio logic, it’s just for decoupling the threads), and use the second ring buffer to get the last N samples?

Or just calculate the RMS in a ring buffer (no lock-free needed in this case) directly in processBlock(), and only push the result to the GUI.
[/quote]I second this, double buffering works great. Lock-free queues are low-level and only for transporting data. I usually end up with three threads: audio, UI, and an auxiliary audio thread that pulls samples from the queue into a history buffer and runs them synchronously and chronologically through anything that needs audio data but doesn’t run on the real-time thread. Four if you’re running OpenGL as well.

It is a bit hard to get right, but the rewards are great: no strain on the audio thread, no stomping on the UI thread with timers, plus you get an async audio “callback” you can block in and interact with the UI/OpenGL from, and it all runs without any polling or latency.
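The three-thread layout described above can be sketched as follows (a simplified illustration with hypothetical names, not the poster’s implementation): the audio thread pushes into a single-producer/single-consumer FIFO, and the auxiliary thread drains it into a history buffer that non-real-time consumers read from.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Minimal SPSC ring-buffer FIFO; one slot is kept empty to tell full from empty.
struct SpscFifo
{
    explicit SpscFifo (size_t capacity) : buffer (capacity) {}

    bool push (float v)                         // audio thread
    {
        const auto w = write.load (std::memory_order_relaxed);
        const auto next = (w + 1) % buffer.size();
        if (next == read.load (std::memory_order_acquire))
            return false;                       // full: drop rather than block
        buffer[w] = v;
        write.store (next, std::memory_order_release);
        return true;
    }

    bool pop (float& v)                         // auxiliary thread
    {
        const auto r = read.load (std::memory_order_relaxed);
        if (r == write.load (std::memory_order_acquire))
            return false;                       // empty
        v = buffer[r];
        read.store ((r + 1) % buffer.size(), std::memory_order_release);
        return true;
    }

    std::vector<float> buffer;
    std::atomic<size_t> read { 0 }, write { 0 };
};

// Auxiliary-thread body: append everything available to the history buffer,
// which meters, FFTs, etc. can then consume off the real-time thread.
void drainIntoHistory (SpscFifo& fifo, std::vector<float>& history)
{
    float v;
    while (fifo.pop (v))
        history.push_back (v);
}
```

The queue stays short and dumb; all the “last N samples” logic lives in the history buffer on the auxiliary side.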


#7

Are there any open source working examples of this? I feel like there would be issues with the GUI noticeably lagging behind the audio since the thread is lower priority and the audio for the GUI may get processed at some later/arbitrary time. I’d be really curious to see this in action.


#8

I also use something like Jonathon’s suggested approach, i.e. do all the processing on the audio thread (with a check to see if there is an active editor if you want to only process when needed). The GUI then simply picks up the most recent value (either by polling or by notification). That way the FIFO can safely be quite short as you simply discard old unread frames. It’s also easy to do the double buffering here.

The only scenario where I’d prefer to stream everything to the GUI is where you need a sample-accurate history.

PS - this is in the context of plugins (where it is often a bit silly to try and multi-thread the processing)


#9

You can check out Signalizer, which does exactly this:

Relevant code is in this file mainly:
https://bitbucket.org/Mayae/cpl/src/master/CAudioStream.h

The trick for eliminating latency is using semaphores, which wake up the async audio thread for the GUI as soon as any audio data is pushed.


#10

Thanks for all the information !

So everybody is using the double-buffer approach when using lock-free queues. That’s what I guessed :wink: I’m still going to use my approach, however, since I don’t need another buffer and I keep the lock-free behaviour.


#11

Oh… I’m a bit late here… but just in case, I did this some time ago just for kicks… Works quite nicely.
The code isn’t at its best, but I guess it might be useful to someone trying something similar.

It is separated into two classes: FFTDisplayProcessor and FFTDisplayComponent.
And there’s the AudioSampleFifo class that does the audio<->UI thread magic :slight_smile:
It was working last time I checked; let me know if you have problems with it.

FFTDisplayProcessor.h (1.4 KB)
FFTDisplayProcessor.cpp (3.6 KB)
FFTDisplayComponent.cpp (1.9 KB)
FFTDisplayComponent.h (580 Bytes)
AudioSampleFifo.h (3.6 KB)


#12

Nice one, thanks! Can’t wait to check it out.

One thing I noticed: you shouldn’t return from paint() without painting, since the paint call could have come from the OS or elsewhere, so don’t assume the area already shows anything useful.

I would rather just inherit from Component and Timer, so you can decide in timerCallback() whether to call repaint() or not…

Glad to hear from you! :slight_smile:


#13

Thanks!!! @daniel with a great feedback as always!!
This was for a project that never got to see the light of day, so it is pretty much in experimental form.
I’ll give it a revamp one of these days, please feel free to give all the feedback you want, I’d really appreciate it! :wink:


#14

Haha - I’m updating my own version of exactly the same thing right now!