Yeah, at that point stack allocation is fine. I didn’t know that alignas would only work with stack allocation, though. I found the snag, and again it was a left-hand/right-hand operand problem with JUCE SIMD. I do wish these wrappers were a little more complete!
Rant with example
using simd = dsp::SIMDRegister<float>;
simd value = simd::fromNative({0.2f,0.3f,0.4f,0.5f});
simd valid = value * 2.0f;
simd invalid = 2.0f * value;
In my case it was even worse, as it was 1 / value.
Q) What’s the best way of doing 1 / simd with JUCE?
I think you should be able to get around the left hand/right hand issue with explicit casting:
dsp::SIMDRegister<float> values = dsp::SIMDRegister<float>::fromNative({0.2f,0.3f,0.4f,0.5f});
dsp::SIMDRegister<float> ones = static_cast<dsp::SIMDRegister<float>>(1.0);
// So this is fine:
auto test = ones * values;
// But this still fails...
auto inv = ones / values;
I guess the thing that’s really missing is the division operator. I don’t know the underlying instruction sets well enough to know whether or not that implementation is feasible…
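To make the left-hand/right-hand point concrete, here is a minimal sketch of how scalar-on-the-left operators are usually supplied as free functions. `Float4` and its `expand` helper are hypothetical stand-ins written in plain C++ for illustration; this is not the JUCE type and says nothing about how JUCE implements its operators.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane float "register" used only for illustration.
struct Float4
{
    std::array<float, 4> lanes;

    // Broadcast one scalar into every lane.
    static Float4 expand (float s)
    {
        Float4 r;
        r.lanes.fill (s);
        return r;
    }
};

// Lane-wise register * register.
inline Float4 operator* (const Float4& a, const Float4& b)
{
    Float4 r;
    for (std::size_t i = 0; i < 4; ++i)
        r.lanes[i] = a.lanes[i] * b.lanes[i];
    return r;
}

// Lane-wise register / register.
inline Float4 operator/ (const Float4& a, const Float4& b)
{
    Float4 r;
    for (std::size_t i = 0; i < 4; ++i)
        r.lanes[i] = a.lanes[i] / b.lanes[i];
    return r;
}

// Free functions with the scalar on the left. Without these,
// `2.0f * value` and `1.0f / value` have no matching overload.
inline Float4 operator* (float s, const Float4& v) { return Float4::expand (s) * v; }
inline Float4 operator/ (float s, const Float4& v) { return Float4::expand (s) / v; }
```

With these in place, both `2.0f * value` and `1.0f / value` compile for the stand-in type. A member `operator*` alone only covers the case where the register is on the left.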
This doesn’t allocate asamples as being aligned on 16 bytes; it only declares that asamples should be. But if the memory returned by the new statement is not itself aligned, then no assumption about the alignment of asamples is valid.
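For illustration, a minimal sketch of one way to get a heap buffer whose first element really is on a 16-byte boundary, using only C++11. The names `AlignedFloats`, `makeAlignedFloats` and `isAligned` are made up for this example, and it assumes the goal is simply an aligned `float*` to hand to SIMD loads.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Owns an over-sized raw block plus a pointer to the first aligned float in it.
struct AlignedFloats
{
    std::unique_ptr<unsigned char[]> storage; // raw block, may start misaligned
    float* data = nullptr;                    // aligned view into storage
};

// Allocates room for `count` floats starting on an `alignment`-byte boundary.
// `alignment` must be a power of two.
inline AlignedFloats makeAlignedFloats (std::size_t count, std::size_t alignment = 16)
{
    const std::size_t bytes = count * sizeof (float);
    std::size_t space = bytes + alignment; // slack so an aligned address always fits

    AlignedFloats result;
    result.storage.reset (new unsigned char[space]);

    void* p = result.storage.get();
    // std::align moves p forward to the next suitably aligned address in the block.
    result.data = static_cast<float*> (std::align (alignment, bytes, p, space));
    return result;
}

// True if p is a multiple of `alignment` bytes.
inline bool isAligned (const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t> (p) % alignment == 0;
}
```

Here the alignment comes from the allocation itself, so `isAligned (buf.data, 16)` holds for any buffer returned by `makeAlignedFloats`. With C++17 available, the aligned overload `::operator new (size, std::align_val_t)` is a more direct option.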
When I was playing around creating a little template framework to learn the Intel intrinsics, I seem to remember using one like _mm_div_pd, so I’m quite sure the division operation should be there. Or maybe that’s an AVX-only thing and that’s why it’s not part of JUCE.
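For reference, `_mm_div_ps` (four floats) is part of SSE and `_mm_div_pd` (two doubles) is part of SSE2, so neither needs AVX. Below is a minimal sketch of a reciprocal built on them. The function names and the `SKETCH_HAS_SSE2` macro are invented for this example, and a scalar fallback is included for non-x86 targets.

```cpp
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64)
 #include <emmintrin.h> // SSE2 header; also brings in the SSE float intrinsics
 #define SKETCH_HAS_SSE2 1
#else
 #define SKETCH_HAS_SSE2 0
#endif

// Writes 1 / in[i] to out[i] for four floats.
inline void reciprocal4 (const float* in, float* out)
{
#if SKETCH_HAS_SSE2
    const __m128 ones = _mm_set1_ps (1.0f);       // broadcast 1.0f to all four lanes
    const __m128 v    = _mm_loadu_ps (in);        // unaligned load, no alignment assumed
    _mm_storeu_ps (out, _mm_div_ps (ones, v));    // lane-wise divide
#else
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = 1.0f / in[i];
#endif
}

// Writes 1 / in[i] to out[i] for two doubles.
inline void reciprocal2 (const double* in, double* out)
{
#if SKETCH_HAS_SSE2
    const __m128d ones = _mm_set1_pd (1.0);
    _mm_storeu_pd (out, _mm_div_pd (ones, _mm_loadu_pd (in)));
#else
    for (std::size_t i = 0; i < 2; ++i)
        out[i] = 1.0 / in[i];
#endif
}
```

The unaligned load/store variants are used so the sketch works on any pointer; with buffers known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` could be used instead.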
Intrinsics are not the way to go; have a look at boost::simd or TR2 SIMD. That’s the proper approach (TR2 should probably replace the JUCE-specific wrappers once compilers support it officially).
There are too many libraries IMHO (I’m using libsimdpp at the moment, but will probably migrate to TR2 as soon as possible; the blocker will probably be Xcode, as usual).