ConvolutionEngine implementation question/improvement?

I’m currently studying the JUCE Convolution implementation, for now only the internal ConvolutionEngine struct.

The algorithm has a special adaptation to use bigger chunks of impulse data (relative to the input data) when used with smaller buffer sizes (blockSize <= 128). It then uses FFT blocks which are 4x bigger than the block size instead of 2x bigger.

In this case more overlap data is created, which is preserved for later operations.

I can see a theoretical benefit from it, but only if the impulse response is at least four times bigger than the block size. (In my simplified model I assume that the FFT computation time is proportional to the FFT size, and that the engine is fed with optimally aligned input blocks.)

fftSize = blockSize * 2
processingTime = roundUp (irSize / blockSize) * fftSize

vs. the adapted version:

fftSize = blockSize * 4
processingTime = roundUp (irSize / (fftSize - blockSize)) * fftSize
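
To make the comparison concrete, here’s a quick sketch of this simplified model (my own helper names, not JUCE code) that tabulates both variants for a given block size:

#include <cstddef>
#include <cstdio>

// Simplified model: FFT time is proportional to fftSize, each segment
// covers (fftSize - blockSize) samples of the IR, and the cost per input
// block is numSegments * fftSize.
static size_t estimatedCost (size_t irSize, size_t blockSize, size_t fftSize)
{
    const size_t segmentLen  = fftSize - blockSize;
    const size_t numSegments = (irSize + segmentLen - 1) / segmentLen; // round up
    return numSegments * fftSize;
}

int main()
{
    const size_t blockSize = 128;

    for (size_t irSize : { 256, 512, 1024, 4096, 16384 })
        std::printf ("irSize %6zu   2x: %7zu   4x: %7zu\n", irSize,
                     estimatedCost (irSize, blockSize, 2 * blockSize),
                     estimatedCost (irSize, blockSize, 4 * blockSize));
}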

Imho the condition for this adaptation should not depend on the block size ("blockSize > 128"); instead it should be "irSamples > (blockSize * 4)".
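
In the constructor’s initializer list, that would look something like this (untested sketch, keeping the existing member names):

fftSize (numSamples > blockSize * 4 ? 4 * blockSize : 2 * blockSize),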

These are theoretical thoughts; probably only @reuk and @fr810 can say something about it, since they wrote the engine :smiley:

struct ConvolutionEngine
{
    ConvolutionEngine (const float* samples,
                       size_t numSamples,
                       size_t maxBlockSize)
        : blockSize ((size_t) nextPowerOfTwo ((int) maxBlockSize)),
          fftSize (blockSize > 128 ? 2 * blockSize : 4 * blockSize),
          fftObject (std::make_unique<FFT> (roundToInt (std::log2 (fftSize)))),
          numSegments (numSamples / (fftSize - blockSize) + 1u),
          numInputSegments ((blockSize > 128 ? numSegments : 3 * numSegments)),

PS:
Is it planned to transform the whole engine to use templates to support double-precision in the near future?

I wasn’t involved with the construction of the engine - I just rewrote some of the setup code in an effort to make it a bit more thread-safe.

@IvanC might be able to provide more insight.

Hello!

If I remember correctly, I came up with this condition because the FFT computation is not the slowest operation in this context; the complex multiplications are. Using a higher FFT size (fftSize = blockSize * 4) when blockSize is low and the IR size is high enough significantly reduces the number of convolutions you have to do per block.
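
As a back-of-the-envelope illustration (my own estimate, not the actual JUCE code): each input block needs one pointwise complex multiply of (fftSize / 2 + 1) spectral bins per segment, so the per-block multiplication work looks roughly like this:

#include <cstddef>

// Rough multiplication cost per block: one complex multiply per spectral
// bin (a real FFT of size fftSize has fftSize / 2 + 1 bins) per segment.
static size_t multiplyCost (size_t irSize, size_t blockSize, size_t fftSize)
{
    const size_t segmentLen  = fftSize - blockSize;                    // IR samples per segment
    const size_t numSegments = (irSize + segmentLen - 1) / segmentLen; // round up
    return numSegments * (fftSize / 2 + 1);                            // bins multiplied per block
}

For a large IR this comes out to roughly irSize bin-multiplies per block with fftSize = 2 * blockSize, versus roughly (2 / 3) * irSize with fftSize = 4 * blockSize, i.e. about a third less work.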

However, even better conditions for this could probably be found with some benchmarking. I think one of my own engines uses something like:
fftSize (blockSize <= 128 && numSamples >= 8192 ? 4 * blockSize : 2 * blockSize)

instead of:
fftSize (blockSize > 128 ? 2 * blockSize : 4 * blockSize)

(and same condition for numInputSegments)

For double-precision support, I guess it might not be that complicated to do, but the JUCE wrapper would additionally have to check whether the wrapped FFT libraries properly support double precision.
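
Something like the following shape, perhaps (hypothetical sketch only, not a planned API):

// Hypothetical sketch only: juce::dsp::FFT currently works on floats, so a
// double-precision engine would also need an FFT backend supporting double.
template <typename SampleType>
struct ConvolutionEngine
{
    ConvolutionEngine (const SampleType* samples,
                       size_t numSamples,
                       size_t maxBlockSize);

    // ... rest of the engine, with all buffers using SampleType
};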

Thanks for the insight! :slight_smile:
