Convolution Reverb with Juce Convolution class (High CPU usage,... why)



I would like to develop a convolution reverb and tried it with the juce::dsp::Convolution class. It works, but the convolution process function uses a lot of CPU. The CPU usage seems to increase when I increase the length of the source buffer (the impulse response). My DSP process function processes 64 samples per call. Let me show you my code…

In the Constructor I generate a testbuffer with Noise like this:

const int TEST_SIZE = 2048 * 8;
testBuffer.setSize(2, TEST_SIZE);

float *tL = testBuffer.getWritePointer(0);
float *tR = testBuffer.getWritePointer(1);

// Fill both channels with noise that fades out linearly.
for (int i = 0; i < TEST_SIZE; i++)
{
	const float multiply = 1.0f - (float)i / (float)TEST_SIZE;
	tL[i] = (rand() % 10000) * 0.0001f * multiply;
	tR[i] = (rand() % 10000) * 0.0001f * multiply;
}

Also in the constructor, I initialize the convolution class (config::BUFFER_LENGTH = 64, the size of my DSP loop):

juce::dsp::ProcessSpec specs;
specs.maximumBlockSize = config::BUFFER_LENGTH;
specs.numChannels = 2;
specs.sampleRate = 44100;
juceConvolution.copyAndLoadImpulseResponseFromBuffer(testBuffer, 44100, true, TEST_SIZE);

The convolution is now ready to work. In the DSP function I call the process function of the Convolution:

void Convolution::process(AudioSampleBuffer& buffer)
{
   // ....
   juce::dsp::AudioBlock<float> block(buffer);
   juce::dsp::ProcessContextReplacing<float> context(block);
   // ....
}

It works and sounds like a noise reverb, but it takes more than 15% CPU in Release mode and over 80% CPU in Debug. If I increase the size of the test buffer, the CPU usage also increases. That's strange, because when I load a convolution VST from another company, I can choose wave files as long as I want and the CPU usage doesn't increase. It stays constant. What am I doing wrong here?


I haven’t studied the Juce convolution algorithm in detail but it may be the kind where the CPU use increases as the impulse response length increases. (There are various ways to do the convolution including ones where the CPU use does not increase much as the IR length increases.) There isn’t really anything you could do in that case except to find another convolution algorithm that has different CPU use characteristics. What happens if you increase the processing buffer size from 64 samples to some higher value?


Hello !

Well, I’m not sure you can really find any convolution reverb plug-in where the CPU load doesn’t increase with the IR size !

The convolution algorithm in JUCE is a state-of-the-art uniformly partitioned convolution algorithm. That means it is not really suited for IRs longer than 1 second, for which non-uniformly partitioned algorithms are mandatory and are used by every single convolution reverb plug-in.
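To illustrate why the load grows with the IR length under uniform partitioning, here is a rough cost model. It is only a sketch and not JUCE's actual implementation: the names `UniformCost` and `estimateUniformCost` are made up for this example, and it assumes the usual scheme where the IR is cut into segments of one block each and every segment costs one frequency-domain multiply-add pass per audio block.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cost model for uniformly partitioned FFT convolution.
// It counts only the frequency-domain multiply-adds, not the FFTs themselves.
struct UniformCost
{
    std::size_t numPartitions;  // IR segments of blockSize samples each
    double macsPerSample;       // complex multiply-adds per output sample
};

inline UniformCost estimateUniformCost (std::size_t irLength, std::size_t blockSize)
{
    assert (blockSize > 0);

    // Every partition holds blockSize samples of the IR (rounded up).
    const std::size_t partitions = (irLength + blockSize - 1) / blockSize;

    // A real FFT of size 2 * blockSize yields blockSize + 1 complex bins.
    const std::size_t binsPerPartition = blockSize + 1;

    // All partitions are multiplied and accumulated once per audio block.
    const double macsPerBlock = static_cast<double> (partitions * binsPerPartition);

    return { partitions, macsPerBlock / static_cast<double> (blockSize) };
}
```

With the 16384-sample test IR and 64-sample blocks from the first post, this model gives 256 partitions and about 260 complex multiply-adds per sample. Doubling the IR length doubles both numbers, while a larger block size lowers the per-sample figure.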

The reason we decided not to include such an algorithm in JUCE is that a lot of more or less troll patents cover more or less all the ways to do a proper non-uniformly partitioned convolution algorithm. Until we can see what is possible in an open source library without getting calls from an army of lawyers, we won't be able to include this in JUCE, I'm afraid.

If you still want to use JUCE for your convolution reverb, you might want to increase the buffer size of the algorithm processing block, and handle the additional latency by yourself. Or you might want to try another convolution library. Moreover, if you are doing your tests on Windows, the CPU load might decrease if you install Intel MKL and use it with the JUCE FFT class wrapper (on macOS vDSP is automatically being used).
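As a sketch of the "bigger internal block plus extra latency" idea, the wrapper below collects small host blocks (for example 64 samples) into a larger internal block and runs the expensive processing once per large block. `BlockAdapter` and its callback are hypothetical names for this example and are not part of JUCE; in a plug-in the callback would wrap the convolution's process call, and the reported latency would be passed on to the host.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: runs a fixed-size inner process on a stream that
// arrives in smaller blocks, at the cost of innerBlockSize samples of latency.
class BlockAdapter
{
public:
    using Callback = std::function<void (float* data, std::size_t numSamples)>;

    BlockAdapter (std::size_t innerBlockSize, Callback processInner)
        : inner (innerBlockSize),
          in (innerBlockSize, 0.0f),
          out (innerBlockSize, 0.0f),
          process (std::move (processInner))
    {
        assert (innerBlockSize > 0);
    }

    // Delay introduced by the adapter, to be reported to the host.
    std::size_t latencySamples() const { return inner; }

    // Mono, in-place processing of one small host block.
    void processBlock (float* data, std::size_t numSamples)
    {
        for (std::size_t i = 0; i < numSamples; ++i)
        {
            in[pos] = data[i];   // store the incoming sample
            data[i] = out[pos];  // emit the sample processed one inner block ago

            if (++pos == inner)
            {
                out = in;                       // same size, so no reallocation
                process (out.data(), inner);    // one expensive call per inner block
                pos = 0;
            }
        }
    }

private:
    std::size_t inner;
    std::size_t pos = 0;
    std::vector<float> in, out;
    Callback process;
};
```

Note that this simple version does all of the work in a single burst on the audio thread every `innerBlockSize` samples, so the average load drops but the load per callback becomes uneven.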

Otherwise, the difference in CPU load between the Debug and Release configurations is normal. It is a side effect of using SIMD-accelerated instructions in the Convolution processing code.


Thanks for the answers. That's strange. Try the convolver from FL Studio. That's a convolution reverb, and it doesn't matter how long the IR you load is. Just for fun I created a wave file that is 30 minutes long and used it in the convolver as the IR. The CPU usage doesn't increase. Only 1% usage and (almost) no latency. Hmm.
I will try to find another algorithm.


Sounds suspicious :) How do you know it's 1% usage? Are you looking at the CPU load in FL Studio or in a task manager? Since non-uniformly partitioned convolution involves a background thread, it is (more than) possible that this 1% is just the audio thread's part (the first second) and not the other threads' part (the 30 minutes minus 1 second :) )
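For a feel of how non-uniform partitioning keeps the partition count small, here is a sketch of one textbook-style layout in which the partition size doubles after every two partitions. This is an assumption made for illustration only: `nonUniformPartitionSizes` is a made-up name, and real products may use quite different layouts and scheduling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: two partitions of each size, with the size doubling
// after every pair, until the whole impulse response is covered.
inline std::vector<std::size_t> nonUniformPartitionSizes (std::size_t irLength,
                                                          std::size_t firstSize)
{
    assert (firstSize > 0);

    std::vector<std::size_t> sizes;
    std::size_t covered = 0;
    std::size_t size = firstSize;

    while (covered < irLength)
    {
        // Use each size at most twice, stopping early once the IR is covered.
        for (int repeat = 0; repeat < 2 && covered < irLength; ++repeat)
        {
            sizes.push_back (size);
            covered += size;
        }

        size *= 2;
    }

    return sizes;
}
```

For a 16384-sample IR with a first partition of 64 samples, this layout needs 15 partitions (64, 64, 128, 128, …, 4096, 4096, 8192), where a uniform 64-sample layout needs 256. The small partitions at the start keep the latency low, and the large later ones are the part that can be computed away from the audio thread.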


I get similar results with Reaper’s ReaVerb, I can load up a 60 second (sadly, the longest ReaVerb allows to load) IR and the CPU usage is under 1% during audio playback in Process Explorer (a Task Manager replacement that shows more accurate information).

Edit: Hah, better yet, if I have 2 instances of the ReaVerb plugin with the 60 second IR, the CPU use still does not even rise over 1%, so I guess most of that 1% CPU use is just Reaper's general audio processing stuff going on… The plugins show ~0.1% CPU use in Reaper's own performance meters.


In ReaVerb, I think you need to enable the ZL and LL options for a fair comparison. Otherwise the algorithm works with a high latency, which is a very good way to reduce the CPU consumption of a convolution algorithm.

Anyway, I'd really like to do some testing with non-uniform partitioning at some point, to get this kind of performance, but first I have to clear up all the doubts about patents and licensing…