Use a lambda function in processBlock?

For a plugin that has some latency and would like to offload intense processing outside of the audio thread, could something like this work?

void MainProcessor::processBlock (AudioBuffer<float>& buffer, MidiBuffer&)
{
    MessageManager::callAsync ([buffer]()
    {
        // heavy CPU computation using buffer input samples
    });

    // fill buffer output with previously computed samples
}

*note - the actual call I would use is ioService.post([buffer]…), but I am trying to avoid talking about ioService…so I’m substituting MessageManager::callAsync which works similarly. I wouldn’t normally use the MessageManager thread for computation.

The general need to move heavy computation off the audio thread seems like a common problem. This approach seems most straightforward (to me), but I wonder about the capture of buffer:

Does it properly make a deep copy for the lambda to use later?

Is there a heap allocation in that capture? It seems like there would have to be. The internet seems fuzzy on the subject. I read that it is up to the AudioBuffer copy constructor. Thanks!

You should not use callAsync from the audio thread.
You are capturing the buffer by copy, so it will make an allocation.
You should use proper thread-safe techniques to pass data from the audio thread to a background thread, such as lock-free/wait-free FIFOs.
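
Something like this minimal sketch shows the idea, assuming a JUCE context (the SampleFifo class and its member names are mine, not JUCE API): the audio thread only copies samples into pre-allocated storage coordinated by a juce::AbstractFifo, and a background thread pulls them out later.

#include <JuceHeader.h>
#include <algorithm>
#include <vector>

class SampleFifo
{
public:
    explicit SampleFifo (int capacity) : fifo (capacity), storage ((size_t) capacity) {}

    // Audio thread: copies only, no locks, no allocation. Writes fewer
    // samples than requested if the FIFO is (nearly) full.
    void push (const float* data, int numSamples)
    {
        int start1, size1, start2, size2;
        fifo.prepareToWrite (numSamples, start1, size1, start2, size2);

        if (size1 > 0) std::copy_n (data,         size1, storage.data() + start1);
        if (size2 > 0) std::copy_n (data + size1, size2, storage.data() + start2);

        fifo.finishedWrite (size1 + size2);
    }

    // Background thread: returns how many samples were actually read.
    int pop (float* dest, int maxSamples)
    {
        int start1, size1, start2, size2;
        fifo.prepareToRead (maxSamples, start1, size1, start2, size2);

        if (size1 > 0) std::copy_n (storage.data() + start1, size1, dest);
        if (size2 > 0) std::copy_n (storage.data() + start2, size2, dest + size1);

        fifo.finishedRead (size1 + size2);
        return size1 + size2;
    }

private:
    juce::AbstractFifo fifo;
    std::vector<float> storage;
};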

Using lambdas does involve allocation/deallocation that cannot be fully controlled, and thus they should be avoided on the audio thread.
You can get away with lambdas containing audio code in case they get inlined by the compiler. This happens if you use them as template parameters with their own compiler-deduced type, but never happens if any std::function, or storing the lambda in a variable, is involved.
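
To make the distinction concrete, here is a small sketch (the function names are mine): a callable passed as a deduced template parameter keeps its exact type and can be inlined, while assigning the same lambda to a std::function type-erases it and may heap-allocate.

#include <functional>

// Deduced template parameter: Fn is the lambda's exact (unnamed) type,
// so the call can be inlined and no type erasure happens.
template <typename Fn>
float applyGainWith (float sample, Fn&& fn)
{
    return fn (sample);
}

void example()
{
    float gain = 0.5f;

    // Fine on the audio thread: the capture lives inside the
    // lambda object itself, on the stack.
    float a = applyGainWith (1.0f, [gain] (float s) { return s * gain; });

    // Type-erased: std::function may heap-allocate to store the capture
    // (implementation-dependent), so avoid this on the audio thread.
    std::function<float (float)> fn = [gain] (float s) { return s * gain; };
    float b = fn (1.0f);

    (void) a; (void) b;
}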


OK, good points. The problem is the capture of the buffer, not the actual lambda, right? Shouldn’t this be safe:

void MainProcessor::processBlock (AudioBuffer<float>& buffer, MidiBuffer&)
{
    // put data in lockfree queue1

    MessageManager::callAsync ([this]()
    {
        // pull data from lockfree queue1
        // put data in lockfree queue2
    });

    // pull data from lockfree queue2
}

My understanding is that the lambda is an object created on the stack with data members for each captured variable. If the variables are simple types, there is no heap allocation (?). The object is then copied into MessageManager in this case, but still no heap allocation.

Nope, unfortunately that’s not true. As we run an allocation-detection mechanism on the audio threads in our products under development, my experience has taught me the following:

  • A lambda without capture translates to something equivalent to a free function, only visible in that local scope. So it is as light as a function pointer and does not allocate.
  • Any kind of lambda with capture, however, will lead to the lambda being translated into some object which does allocate to store each captured value somewhere, regardless of whether they are POD or more complex types.
  • std::function, as a multi-purpose wrapper for callable objects, definitely allocates, as it does not know which kind of callable object (a function pointer, a lambda…) will be assigned to it, so it cannot determine its memory requirements beforehand.

Furthermore, if you look at the underlying implementation of MessageManager::callAsync, you’ll find this:

bool MessageManager::callAsync (std::function<void()> fn)
{
    struct AsyncCallInvoker  : public MessageBase
    {
        AsyncCallInvoker (std::function<void()> f) : callback (std::move (f)) {}
        void messageCallback() override  { callback(); }
        std::function<void()> callback;
    };

    return (new AsyncCallInvoker (std::move (fn)))->post();
}

Each time it is called a new AsyncCallInvoker is allocated on the heap, regardless of the type of lambda wrapped in that std::function object.

That being said, if you really plan to offload some processing to background threads, you should create your own thread and put whatever you want to send to it into a lock-free queue that is read by the other thread. If you want to get the processed samples back to the processing thread it gets really complicated, as you need a good strategy for how to guarantee synchronization without using locks, to avoid priority-inversion problems. Doing something like that right is a really challenging thing.

Thanks - very helpful. The allocation for the capture sounds like a definite show-stopper for this method. Secondly, I’m not actually using MessageManager; instead I have my own thread with a boost io_service object, so my actual call is ioService.post(…). But the post function probably employs a similar allocation mechanism to MessageManager’s, because the queue through which io_service receives new work (in the form of std::functions) cannot be infinite. That is unfortunate and does make this more difficult.

Depending on the complexity of your setup, a lock-free queue approach will probably not be too difficult either. We have been using this one https://github.com/cameron314/readerwriterqueue without problems for some time now. Better don’t look at the implementation :see_no_evil: but it works quite well.
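
For reference, a minimal usage sketch of that queue (the AnalysisJob payload is a made-up example): with enough capacity reserved in the constructor, try_enqueue and try_dequeue never block and never allocate.

#include "readerwriterqueue.h"

struct AnalysisJob { float rms = 0.0f; };   // illustrative payload type

moodycamel::ReaderWriterQueue<AnalysisJob> queue (1024); // capacity reserved up front

// Producer (audio thread): try_enqueue fails instead of allocating
// when the pre-allocated capacity is exhausted.
void producer()
{
    AnalysisJob job { 0.25f };
    bool ok = queue.try_enqueue (job);
    (void) ok; // in real code, decide what to do when the queue is full
}

// Consumer (background thread): non-blocking read.
void consumer()
{
    AnalysisJob job;
    while (queue.try_dequeue (job))
    {
        // react to the new data
    }
}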

Now, a thread running in polling mode, checking the queue on the reader side for new data and reacting to it, is not that complicated in the first place. Furthermore, from that thread you can then call anything you want, as it’s not as time-critical as the processing thread.
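
A sketch of such a polling thread using juce::Thread (how the queue is filled is up to you, e.g. one of the SPSC queues mentioned above):

#include <JuceHeader.h>

class WorkerThread : public juce::Thread
{
public:
    WorkerThread() : juce::Thread ("Background DSP") {}

    void run() override
    {
        while (! threadShouldExit())
        {
            // poll the lock-free queue here and process any new data...

            // Sleep until woken via notify() or until a few ms pass,
            // so the thread doesn't spin at 100% CPU.
            wait (5);
        }
    }
};

The audio thread can additionally call notify() on the worker after pushing data; that wakes it immediately and lets you use a much longer wait, so the thread stays quiet when the host stops calling processBlock (though opinions differ on whether signalling an event from the audio thread is strictly real-time safe).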

Yeah, that is where I was headed. I use boost::lockfree::spsc_queue (https://www.boost.org/doc/libs/1_73_0/doc/html/boost/lockfree/spsc_queue.html) quite a bit. The background thread must poll, and that makes me cringe thinking about some DAWs like Logic that pause processBlock frequently (so that background thread is just wasting CPU).

This is untrue.

A lambda is just shorthand for a class declaration. This class will hold each item in the lambda capture list as a plain data member. If you capture something by reference, the data member for that item will be a reference. If you capture by value, the data member will be a normal value type. Constructing a lambda will only allocate if the invoked constructors (copy, move, or otherwise) of those data members themselves allocate. If the lambda captures something like a std::array<int, 1000> by value, it won’t allocate on construction. Check out this example on cppinsights to see what I mean.
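
In other words, roughly what the compiler generates for a capturing lambda looks like this hand-written equivalent (a sketch of the desugaring, similar to what cppinsights shows):

#include <array>

void example()
{
    std::array<int, 1000> data {};

    // The lambda below...
    auto lambda = [data] (int i) { return data[(size_t) i]; };

    // ...is roughly equivalent to this compiler-generated class.
    // All 4000 bytes of `data` live inside the object itself: no heap.
    class Lambda
    {
    public:
        explicit Lambda (std::array<int, 1000> d) : data (d) {}
        int operator() (int i) const { return data[(size_t) i]; }
    private:
        std::array<int, 1000> data;
    };

    Lambda equivalent { data };
    (void) lambda; (void) equivalent;
}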


Well, that is hopeful and was my understanding. But now I am concerned about posting that lambda to some background task. I am using an io_service (aka io_context) object, and the post function seems to imply that it uses allocation (io_context::post - 1.73.0).

First of all: What a great tool, can’t believe that I didn’t come across that until now, thanks @reuk :slight_smile:

Then: Thank you for the clarification and for that great example. Indeed, I had cases in mind where the lambda would be assigned to a std::function object. In a C++11 context, where a function returning auto wouldn’t be available, something like this would have been the solution. Please correct me if I’m wrong here, but from my observation, assigning anything that is not a plain function pointer to a std::function will allocate, won’t it?

The standard only specifies that std::function should not allocate when wrapping a function pointer, I think. Some standard library implementations use small-buffer optimisation to store small function objects (i.e. not function pointers, but still small) directly in the std::function instance, which shouldn’t allocate. Unfortunately, this behaviour cannot be relied upon.
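
If you want to check what your standard library does, a quick test sketch (not production code) is to replace the global operator new with a counting version and construct std::functions from a small and a large capture:

#include <array>
#include <cstdio>
#include <cstdlib>
#include <functional>
#include <new>

static int allocationCount = 0;

// Counting replacement for the global allocator (test sketch only:
// no bad_alloc handling).
void* operator new (std::size_t size)
{
    ++allocationCount;
    return std::malloc (size);
}

void operator delete (void* p) noexcept { std::free (p); }

int main()
{
    int small = 42;
    std::array<char, 256> big {};

    allocationCount = 0;
    std::function<int()> f1 = [small] { return small; };
    std::printf ("small capture: %d allocation(s)\n", allocationCount); // often 0 (SBO)

    allocationCount = 0;
    std::function<char()> f2 = [big] { return big[0]; };
    std::printf ("large capture: %d allocation(s)\n", allocationCount); // always >= 1
}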

When rewriting the dsp::Convolution for JUCE 6, I found myself needing something a bit like std::function which was guaranteed not to allocate, so I added dsp::FixedSizeFunction. At the moment this is an ‘internal’ type so it’s not accessible from user code - this is just so that we can test it out a bit more in JUCE and make sure that the API is sufficiently resilient. I expect that we will make this type public at some point in the future.
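
For anyone curious about the general idea in the meantime, here is a purely illustrative sketch of such a fixed-capacity wrapper (my own code, not JUCE’s implementation): the callable is placement-new’d into a buffer that lives inside the object, and a too-large callable fails at compile time instead of falling back to the heap.

#include <cstddef>
#include <new>
#include <utility>

// Illustrative only: a stripped-down fixed-capacity callable wrapper,
// NOT JUCE's actual dsp::FixedSizeFunction.
template <std::size_t Capacity, typename Result, typename... Args>
class FixedCapacityFunction
{
public:
    template <typename Fn>
    FixedCapacityFunction (Fn fn)
    {
        // Fail at compile time rather than allocate.
        static_assert (sizeof (Fn) <= Capacity, "callable too large for the fixed buffer");

        new (storage) Fn (std::move (fn));  // placement new into the member buffer
        invoke  = [] (void* s, Args... args) -> Result { return (*static_cast<Fn*> (s)) (args...); };
        destroy = [] (void* s) { static_cast<Fn*> (s)->~Fn(); };
    }

    ~FixedCapacityFunction() { destroy (storage); }

    // Copy/move omitted for brevity; a real version must handle them.
    FixedCapacityFunction (const FixedCapacityFunction&) = delete;
    FixedCapacityFunction& operator= (const FixedCapacityFunction&) = delete;

    Result operator() (Args... args) { return invoke (storage, std::move (args)...); }

private:
    alignas (std::max_align_t) unsigned char storage[Capacity];
    Result (*invoke) (void*, Args...) = nullptr;
    void (*destroy) (void*) = nullptr;
};

// usage sketch:
// FixedCapacityFunction<64, float, float> f ([gain] (float s) { return s * gain; });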
