The problem is: it doesn’t sound very diffused. The result still feels echo-y or sparse, rather than smooth or dense. The math (Hadamard, shuffle, inversion) and delay wiring all seem correct.
Any thoughts? Is 4 steps too few? Am I missing something obvious?
I think for a purely serial diffusion algorithm you want at least 6 steps, although the echo density a reverb needs depends as much on your delay times as on the number of steps. That said, the use of Hadamard and shuffling is new to me, so I could be missing something.
Is totalNumInputChannels just the value returned by getTotalNumInputChannels()? If so, and if this is a stereo effect, then it will be equal to 2, and 4 diffusion steps will generate 2^4 == 16 echoes, which is not a lot.
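To put numbers on that, here is a rough sketch of the echo-count arithmetic. It assumes every diffusion step multiplies the number of echoes by the number of diffusion channels, which is the same assumption behind the 2^4 figure; echo_count is a made-up helper name, not something from the original code.

```cpp
#include <cstdint>

// Rough echo count for a serial diffuser, assuming each step multiplies
// the number of echoes by the number of diffusion channels.
// (Hypothetical helper, for illustration only.)
constexpr std::uint64_t echo_count(unsigned channels, unsigned steps) {
    std::uint64_t count = 1;
    for (unsigned s = 0; s < steps; ++s) {
        count *= channels;
    }
    return count;
}

static_assert(echo_count(2, 4) == 16, "stereo, 4 steps");
static_assert(echo_count(8, 4) == 4096, "8 channels, 4 steps");
```

Under that assumption, going from 2 to 8 internal channels at the same 4 steps takes you from 16 echoes to 4096.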
The solution is NOT to crank the number of diffusion steps.
Instead what you want to do is decouple the number of internal diffusion channels from the number of input channels.
You do this by allocating N delay lines, where N is greater than M, the number of input channels, and then writing the M input channels into the first M delay lines, one each, like so:
for (unsigned i = 0; i < number_of_input_channels; ++i) { // pseudoish code
    diffusion_channel[i].write(input_channel[i]);
}
Some diffusion channels will start off empty, but this is fine. Depending on how you do things, you might have to normalize the outputs by dividing each output sample by the number of diffusion channels.
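Here is a minimal sketch of the spread-in / fold-out idea for a single sample frame, in case it helps. The names spread_inputs and fold_outputs are made up for illustration, the delay lines are left out, and the way channels are folded back (diffusion channel c goes to output c % M) is just one possible choice. The 1/N scaling is the normalization mentioned above.

```cpp
#include <array>
#include <cstddef>

// Copy M input samples into the first M of N diffusion channels.
// The remaining channels start at zero. (Hypothetical helper.)
template <std::size_t N>
std::array<float, N> spread_inputs(const float* input, std::size_t num_inputs) {
    std::array<float, N> channels{}; // value-initialized: all zeros
    for (std::size_t i = 0; i < num_inputs && i < N; ++i) {
        channels[i] = input[i];
    }
    return channels;
}

// Fold N diffusion channels back down to M outputs.
// Assumption: channel c is summed into output c % M, scaled by 1/N.
template <std::size_t N>
void fold_outputs(const std::array<float, N>& channels,
                  float* output, std::size_t num_outputs) {
    if (num_outputs == 0) {
        return;
    }
    for (std::size_t i = 0; i < num_outputs; ++i) {
        output[i] = 0.0f;
    }
    const float scale = 1.0f / static_cast<float>(N);
    for (std::size_t c = 0; c < N; ++c) {
        output[c % num_outputs] += channels[c] * scale;
    }
}
```

With N fixed at compile time, both helpers work on stack storage, so nothing here allocates on the audio thread.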
As far as the numbers go, 4 or 8 channel diffusers work well. If you stick to a power-of-two number of diffusion channels and specify this value at compile time, you can use a fast templated Hadamard implementation that gets rid of unnecessary runtime operations.
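For reference, a templated Hadamard can look roughly like the sketch below. This is an in-place fast Walsh-Hadamard transform with the size as a template parameter, so the loop bounds are compile-time constants the compiler can unroll. The name hadamard_in_place and the 1/sqrt(N) scaling (which keeps the mix energy-preserving) are my own choices, not taken from the code in the original post.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// In-place fast Walsh-Hadamard transform for a compile-time size N.
// Scaled by 1/sqrt(N) so the transform is orthonormal.
// (Illustrative sketch; name and scaling are assumptions.)
template <std::size_t N>
void hadamard_in_place(std::array<float, N>& data) {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

    // Butterfly passes: pair elements that are `half` apart.
    for (std::size_t half = 1; half < N; half *= 2) {
        for (std::size_t start = 0; start < N; start += 2 * half) {
            for (std::size_t i = start; i < start + half; ++i) {
                const float a = data[i];
                const float b = data[i + half];
                data[i] = a + b;
                data[i + half] = a - b;
            }
        }
    }

    const float scale = 1.0f / std::sqrt(static_cast<float>(N));
    for (float& x : data) {
        x *= scale;
    }
}
```

This costs N log2(N) additions instead of the N^2 multiply-adds of a full matrix product, and with this scaling, applying it twice gives you back the original values.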
Also, a performance side note: in DiffusionStep::process, the line
std::vector<float> dsInputs = inputs;
allocates memory. Because this function is called 4 times in the inner process loop, you’re allocating and freeing memory 4 times per sample!
Dynamic memory allocation, in addition to being slow, has nondeterministic timing (you never know exactly how long it will take), so it’s basically forbidden on the audio thread.
You should preallocate all your audio buffers and pass them around by reference.
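Something like the sketch below is what I mean. I don't know what the real DiffusionStep looks like, so DiffusionStepSketch is a hypothetical stand-in: the scratch buffer is sized once in the constructor, process takes the channel data by reference, and the channel reversal is only a placeholder for the actual delay / shuffle / Hadamard work.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for DiffusionStep, showing preallocation only.
class DiffusionStepSketch {
public:
    // Allocate the scratch buffer once, off the audio thread.
    explicit DiffusionStepSketch(std::size_t num_channels)
        : scratch_(num_channels, 0.0f) {}

    // Process one multichannel sample frame in place.
    // Taking the vector by reference avoids a copy, and copying into the
    // already-sized scratch buffer does not allocate.
    void process(std::vector<float>& channels) {
        const std::size_t n = std::min(channels.size(), scratch_.size());
        std::copy(channels.begin(), channels.begin() + n, scratch_.begin());

        // Placeholder for the real processing: reverse the channel order.
        for (std::size_t i = 0; i < n; ++i) {
            channels[i] = scratch_[n - 1 - i];
        }
    }

private:
    std::vector<float> scratch_; // reused on every call
};
```

Construct it in prepareToPlay (or wherever you learn the channel count), so that the per-sample loop only ever touches memory that already exists.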
Also I just realized this thread is 2 weeks old lol