Double outperforming Floats and other happy accidents

nammick · May 27, 2018, 6:29am

When I started playing around with audio programming, I set up a whole collection of DSP classes and made all their points of entry and the way they process the same. Whilst benchmarking, I found that using doubles throughout actually improved speed by 20 - 30%, which didn’t seem right to me for obvious reasons. I was aware of SIMD but thought that was purely a compiler optimisation. After some deeper reading into how the compiler uses particular opportunities to run two-for-the-price-of-one operations on certain floating point operations, it seems it has as much to do with the way you code as with the intelligence of the compiler.

So are there any hard and fast rules to increase the chance for SIMD opportunities?

I now know that the idea is to fill a 128-bit SIMD register with 4 x floats or 2 x doubles (which I now know is where my performance boost came from) and also to access memory in a unit-stride way (which I know is good practice anyway on any platform).
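To make the register arithmetic and the unit-stride idea concrete, here is a minimal sketch. The function names (`lanesPer128BitRegister`, `applyGain`, `applyGainStrided`) are made up for illustration, and the `static_assert`s assume the usual 4-byte float and 8-byte double.

```cpp
#include <cstddef>

// Number of elements of type T that fit in one 128-bit (16-byte) register.
template <typename T>
constexpr std::size_t lanesPer128BitRegister()
{
    return 16 / sizeof (T);
}

// Assumes 4-byte float and 8-byte double, as on mainstream desktop targets.
static_assert (lanesPer128BitRegister<float>()  == 4, "4 floats per register");
static_assert (lanesPer128BitRegister<double>() == 2, "2 doubles per register");

// Unit-stride access: consecutive iterations touch consecutive addresses,
// so a vectorizing compiler can load several samples with one instruction.
void applyGain (float* samples, std::size_t numSamples, float gain)
{
    for (std::size_t i = 0; i < numSamples; ++i)
        samples[i] *= gain;
}

// Non-unit stride (hypothetical layout): only every `stride`-th element is
// touched, which typically makes vector loads less effective.
void applyGainStrided (float* samples, std::size_t numSamples,
                       std::size_t stride, float gain)
{
    for (std::size_t i = 0; i < numSamples; i += stride)
        samples[i] *= gain;
}
```

Whether either loop is actually vectorized depends on the compiler, target and optimisation flags, so it is worth checking the generated code rather than assuming.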

jonathonracz · May 27, 2018, 4:36pm

The only good way to increase SIMD opportunities is to manually write SIMD code. You should never rely on the compiler to generate SIMD code, because obviously it can vary wildly across architectures, compilers, and your code’s branching/accumulators, etc.

There are, however, some guidelines you can follow to increase the likelihood of your code being vectorized at compile time. Check out http://www.agner.org/optimize/optimizing_cpp.pdf section 12.3 “Automatic vectorization”.
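As a rough illustration of the kind of loop that tends to be friendly to auto-vectorization (not a summary of the linked section), the sketch below uses a simple counted loop, unit-stride access, no dependency between iterations, and min/max instead of an if/else branch. The name `hardClip` is hypothetical, and the code assumes `input` and `output` do not overlap.

```cpp
#include <algorithm>
#include <cstddef>

// Hard-clips `input` into `output`, limiting each sample to [-limit, +limit].
// Assumption: `input` and `output` point to non-overlapping buffers.
void hardClip (const float* input, float* output,
               std::size_t numSamples, float limit)
{
    // Each iteration is independent of the others and reads/writes
    // consecutive addresses, so the compiler is free to process
    // several samples per instruction if it chooses to.
    for (std::size_t i = 0; i < numSamples; ++i)
        output[i] = std::min (limit, std::max (-limit, input[i]));
}
```

To see what the compiler actually did, vectorization reports such as Clang's `-Rpass=loop-vectorize` or GCC's `-fopt-info-vec` can help, as can reading the generated assembly.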

alatar · May 27, 2018, 5:06pm

I agree with Jonathon here.
In my limited experience: yes, the compiler does SIMD optimizations. But my hand-coded SIMD code is still faster than what the compiler produces.

About doubles: Hmm… not sure why they were faster for you. If you do proper manual SIMD optimizations, floats should be faster.
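For reference, a minimal sketch of what hand-written SIMD can look like with x86 SSE intrinsics, processing four floats per instruction. The function name `applyGainSimd` and the `SKETCH_HAS_SSE` macro are made up for this example; on targets without SSE the code simply falls back to the scalar loop.

```cpp
#include <cstddef>

#if defined (__SSE__) || defined (_M_X64) || (defined (_M_IX86_FP) && _M_IX86_FP >= 1)
 #include <xmmintrin.h>
 #define SKETCH_HAS_SSE 1
#else
 #define SKETCH_HAS_SSE 0
#endif

// Multiplies every sample by `gain`, four floats at a time where SSE exists.
void applyGainSimd (float* samples, std::size_t numSamples, float gain)
{
    std::size_t i = 0;

#if SKETCH_HAS_SSE
    const __m128 gainVec = _mm_set1_ps (gain);      // broadcast gain to 4 lanes

    for (; i + 4 <= numSamples; i += 4)
    {
        __m128 x = _mm_loadu_ps (samples + i);      // unaligned load of 4 floats
        x = _mm_mul_ps (x, gainVec);                // 4 multiplies in one go
        _mm_storeu_ps (samples + i, x);             // store 4 results
    }
#endif

    // Scalar tail for leftover samples (and the full loop on non-SSE targets).
    for (; i < numSamples; ++i)
        samples[i] *= gain;
}
```

The double-precision equivalent would use `__m128d` and `_mm_mul_pd` from `<emmintrin.h>` (SSE2) and handle two values per instruction, which is why floats should come out ahead when both versions are properly vectorized.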

nammick · May 28, 2018, 5:04pm

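Looking at my FIR implementation, it works on two values at a time. With doubles that is 2 x doubles, which fills a 128-bit register, whereas with floats it would be 2 x floats, which only consumes 64 bits and so would not qualify for SIMD vectorisation. So a 20-30% performance increase would make sense.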

Thank you both for confirming that I really need to focus on the operations instead of just relying on the compiler. I do prefer to be explicit where possible.
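As an illustration of the FIR case (this is not the implementation discussed above, just a sketch with a hypothetical `firDotProduct` function), the inner dot product can be written with four independent float accumulators so that four floats, rather than two, are in flight per step. A single running sum is hard for compilers to vectorize without relaxed floating-point flags such as `-ffast-math`, because reordering float additions can change the rounding.

```cpp
#include <cstddef>

// Computes one FIR output sample as the dot product of `coeffs` and `history`.
// Assumption: both arrays hold at least `numTaps` floats.
float firDotProduct (const float* coeffs, const float* history,
                     std::size_t numTaps)
{
    // Four independent partial sums: no iteration waits on the previous
    // one's result within the same lane, which maps onto 4 float lanes.
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    std::size_t i = 0;

    for (; i + 4 <= numTaps; i += 4)
    {
        acc0 += coeffs[i]     * history[i];
        acc1 += coeffs[i + 1] * history[i + 1];
        acc2 += coeffs[i + 2] * history[i + 2];
        acc3 += coeffs[i + 3] * history[i + 3];
    }

    float sum = (acc0 + acc1) + (acc2 + acc3);

    // Scalar tail when numTaps is not a multiple of 4.
    for (; i < numTaps; ++i)
        sum += coeffs[i] * history[i];

    return sum;
}
```

Because the additions happen in a different order than in a single-accumulator loop, the result can differ in the last few bits, which is usually acceptable for audio but worth knowing when comparing outputs.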
