How do I implement a non-uniform partitioned convolution without extreme CPU usage?

Hey, I am trying to create a convolution reverb for a synth plugin I am currently working on.

I have a couple of questions, which are probably not that smart, because I am relatively new to C++ and DSP, but I'm having a blast learning it. I just feel a little stupid sometimes, but this forum has helped me a lot in the past, so I thought I'll just ask.

I've read that it's a good idea to use a non-uniform partitioned convolution algorithm for a reverb. So I tried an external library (FFTConvolver from HiFi-LoFi) as well as the JUCE one. But it seems like I am not using them correctly.

In both cases I had extremely high CPU usage (up to 90%), depending on my buffer size, but never lower than 16% (at a buffer size of 1024 samples). During initialisation I tried multiple head sizes (and tail sizes with FFTConvolver), but it did not really change much. I have the feeling that I am probably using it wrong. As mentioned in multiple posts, the JUCE convolution is not the best for long IRs, CPU-wise, but that high a CPU usage seems a bit heavy, especially since I ran into the same problem with JUCE as well as with FFTConvolver. So here are a couple of questions:

  1. Does the JUCE convolver (in non-uniform “mode”) have multiple convolution engines, and does it split the impulse response according to the defined head size, or do I have to do that manually? (Probably a dumb question, given the “non-uniform” part, but since nothing changed for me from uniform to non-uniform, I'm asking it anyway.)
  2. Is it generally a good idea to set the maximumBlockSize in the specs high, or to the host buffer size?
  3. Is there a good example or tutorial anywhere on how to implement it properly, so that it works with a longer impulse response?
  4. Does the MessageQueue help to optimize performance if I don't load new IRs or specs, and if so, how do I use it properly? Probably also a dumb question, but I honestly don't understand what it's needed for, because I don't know what engine updates the convolution engine sends. Pretty new to this field, as I said ^^
  5. Is creating an AudioBlock or a ProcessContext realtime-safe? My current understanding is that it's some kind of pointer, but that could be completely wrong ^^
  6. Would it be a good idea to split the impulse response into a head part and a tail, and use one convolver without latency and one with latency? Or is that (as I suspect) what the non-uniform convolver does?

If you have any recommendations for how to use this or another library efficiently, or where I can learn about this topic, I would be very thankful. I hope to somehow set it up in a way that has no heavy latency.

As a side note, I also wanted to try the WDL library by Cockos, but it was super confusing to me, and I have not found any documentation. If there is some out there, I would appreciate a link to it 🙂

Thanks a lot in advance, and sorry again for the probably noobish questions.
If you have anything to say on this topic, even if not asked or mentioned in my post, I am very happy to hear it.

Hey,

I think what you're missing here is that to fully benefit from the convolution being partitioned, the tail part can be processed on a different thread than the audio thread (since you don't need the result of that operation now, but some time later). The idea is that during an audio callback:

  • 1/ the head of the IR is convolved with the audio input directly on the audio thread (often directly as a FIR filter) and added to the output
  • 2/ the result of the tail convolved with previous audio input (this operation was performed on a different thread) is queried and added to the output
  • 3/ the audio input is somehow passed to another thread, which computes the result of convolving it with the tail. The result of this operation will be used in a subsequent audio callback (see step 2/)

The HiFi-LoFi implementation does not directly provide the dispatch of the computation to a background thread and so on, but it exposes an interface so you can implement it yourself. So you need to inherit from TwoStageFFTConvolver and reimplement startBackgroundProcessing and waitForBackgroundProcessing: FFTConvolver/TwoStageFFTConvolver.h at non-uniform · HiFi-LoFi/FFTConvolver · GitHub .

You can see an example of threaded convolution using HiFi-LoFi as a base in the HISE library: HISE/hi_dsp_library/dsp_basics/ConvolutionBase.h at develop · christophhart/HISE · GitHub

Hope this helps, good luck with figuring out all the details.


This is also a good example of using the HiFi-LoFi convolver in JUCE: GitHub - HiFi-LoFi/KlangFalter: Convolution audio plugin (e.g. for usage as convolution reverb)


Thanks a lot, yes, that is super useful information. I will definitely try that.
Just out of curiosity, how would I implement the background-threaded tail with the JUCE convolution engine? Do I set up two convolvers, use one in the background, and that is where the MessageQueue comes in handy? Or is it pre-defined when I implement it as non-uniform? And if not, how do I define a background tail convolver, because I can't find anything about that in the documentation. And I assume that if I can set a head size, then there is also tail processing? ^^

Again, thanks a lot for all this :)

I will definitely check that out, thanks!