Hi all, I tried to follow the SIMD tutorial (JUCE: Tutorial: Optimisation using the SIMDRegister class), but
I can’t figure out how to use it for more channels than the register size.
I’m on an M1 Mac, which supports 128-bit registers, so I can fit 4 floats in one.
For simplicity I’d process only multiples of 4, so let’s say 12 channels.
To test the multichannel stuff, I created an AudioBuffer with 12 channels and copied sample data to it in getNextAudioBlock (const juce::AudioSourceChannelInfo& bufferToFill).
Should I then create multiple IIR filters in the SIMDTutorialFilter (one per group of 4 channels)?
std::vector<std::unique_ptr<dsp::IIR::Filter<dsp::SIMDRegister<float>>>> iir;
Then, in process(), interleave each group of 4 channels and call iir[i]->process on each group?
for (int i = 0; i < numChannels / registerSize; ++i)
{
    auto subInputBlock = inputBlock.getSubsetChannelBlock (i * registerSize, registerSize);
    auto inChannels = prepareChannelPointers (subInputBlock);

    using Format = AudioData::Format<AudioData::Float32, AudioData::NativeEndian>;

    AudioData::interleaveSamples (AudioData::NonInterleavedSource<Format> { inChannels.data(), registerSize },
                                  AudioData::InterleavedDest<Format> { toBasePointer (interleaved.getChannelPointer (0)), registerSize },
                                  numSamples);

    iir[i]->process (dsp::ProcessContextReplacing<dsp::SIMDRegister<float>> (interleaved));

    auto outputBlock = context.getOutputBlock();
    auto subOutputBlock = outputBlock.getSubsetChannelBlock (i * registerSize, registerSize);
    auto outChannels = prepareChannelPointers (subOutputBlock);

    AudioData::deinterleaveSamples (AudioData::InterleavedSource<Format> { toBasePointer (interleaved.getChannelPointer (0)), registerSize },
                                    AudioData::NonInterleavedDest<Format> { outChannels.data(), registerSize },
                                    numSamples);
}
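To make clear what I expect each iteration to do, here is a rough JUCE-free sketch of the per-group interleave / process / deinterleave round trip. Everything in it is hypothetical and only for illustration: kRegisterSize stands in for the SIMD register width (4 floats), plain std::vector replaces AudioBlock, all channels are assumed to have the same length, and a simple gain stands in for the filter call. It is not the tutorial's code and not a fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the number of floats in one 128-bit register.
constexpr std::size_t kRegisterSize = 4;

// Interleave kRegisterSize planar channels, starting at firstChannel, into
// dest laid out as [c0s0, c1s0, c2s0, c3s0, c0s1, c1s1, ...].
inline void interleaveGroup (const std::vector<std::vector<float>>& channels,
                             std::size_t firstChannel,
                             std::vector<float>& dest)
{
    assert (firstChannel + kRegisterSize <= channels.size());
    const std::size_t numSamples = channels[firstChannel].size();
    dest.resize (numSamples * kRegisterSize);

    for (std::size_t s = 0; s < numSamples; ++s)
        for (std::size_t lane = 0; lane < kRegisterSize; ++lane)
            dest[s * kRegisterSize + lane] = channels[firstChannel + lane][s];
}

// Inverse of interleaveGroup: write the lanes back to the same planar channels.
inline void deinterleaveGroup (const std::vector<float>& src,
                               std::size_t firstChannel,
                               std::vector<std::vector<float>>& channels)
{
    assert (firstChannel + kRegisterSize <= channels.size());
    const std::size_t numSamples = src.size() / kRegisterSize;

    for (std::size_t s = 0; s < numSamples; ++s)
        for (std::size_t lane = 0; lane < kRegisterSize; ++lane)
            channels[firstChannel + lane][s] = src[s * kRegisterSize + lane];
}

// Process every group of kRegisterSize channels in turn, reusing one scratch
// buffer. The gain multiply is a placeholder for the per-group filter call.
inline void processAllGroups (std::vector<std::vector<float>>& channels, float gain)
{
    assert (channels.size() % kRegisterSize == 0);
    std::vector<float> interleaved;

    for (std::size_t g = 0; g < channels.size() / kRegisterSize; ++g)
    {
        interleaveGroup (channels, g * kRegisterSize, interleaved);

        for (auto& v : interleaved)
            v *= gain;

        deinterleaveGroup (interleaved, g * kRegisterSize, channels);
    }
}
```

With 12 channels this runs 3 groups (channels 0-3, 4-7, 8-11), and each group reads from and writes back to its own channel offset.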
My JUCE loop feels like the wrong thing to do, and it also introduces crackling on all but the first 4 channels.
Any better solutions?
Thanks!