Fast SINE approximation


#1

Does anyone know of an efficient sin function approximation? In my audio loop I have an operation that hits sin(radians) * amplitude 64 times per sample, and this consumes about 14% of a core. That is acceptable but not necessary, especially at higher frequencies where an approximation would be just as accurate.

I have experimented with using polynomials and calculating which hill I’m in, but I’m not getting much of a saving.
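
For reference, the "polynomial plus working out which hill I'm in" idea could look roughly like the sketch below. This is only an illustration, not the code from my project: `fastSinParabolic` is a hypothetical name, the wrap step stands in for the "which hill" logic, and the 0.225 refinement weight is the commonly quoted value for this well-known parabolic approximation. The error should be somewhere around 1e-3, so whether it is good enough depends on the band.

```cpp
#include <cmath>

// Hypothetical helper (not part of JUCE or the standard library).
// Approximates sin(x) with two odd parabolic passes.
inline float fastSinParabolic (float x)
{
    constexpr float pi    = 3.14159265358979f;
    constexpr float twoPi = 2.0f * pi;

    // Range reduction: wrap x into [-pi, pi).
    x -= twoPi * std::floor ((x + pi) / twoPi);

    // First pass: odd parabolic arc with zeros at -pi, 0, pi
    // and peaks of +/-1 at +/-pi/2.
    constexpr float b = 4.0f / pi;
    constexpr float c = -4.0f / (pi * pi);
    const float y = b * x + c * x * std::fabs (x);

    // Second pass: blend y with y*|y| to reduce the error of the first pass.
    constexpr float p = 0.225f;
    return p * (y * std::fabs (y) - y) + y;
}
```

Note that the `std::floor` wrap is itself not free; if the phase is already kept in range by the oscillator, that step can be dropped.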


#2

There is a LookupTable in the new DSP module.
You give it any function you like and it will create a table with n values and do linear interpolation between the values…

HTH
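
To show the idea without depending on JUCE, here is a minimal stand-alone sketch of a sine table with linear interpolation. `SineTable` and its members are hypothetical names for illustration and are not the DSP module's API; the table size of 1024 is just an assumed default.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone class illustrating table lookup with
// linear interpolation. This is not the JUCE LookupTable API.
class SineTable
{
public:
    explicit SineTable (std::size_t numPoints = 1024)
        : table (numPoints + 1), size (numPoints)
    {
        constexpr float twoPi = 6.28318530717959f;

        // One extra guard point so table[index + 1] is always valid.
        for (std::size_t i = 0; i <= numPoints; ++i)
            table[i] = std::sin (twoPi * static_cast<float> (i)
                                       / static_cast<float> (numPoints));
    }

    // phase is measured in cycles and is expected to lie in [0, 1).
    float lookup (float phase) const
    {
        const float pos = phase * static_cast<float> (size);

        // Clamp so rounding at the very top of the range cannot overrun.
        const std::size_t index = std::min (static_cast<std::size_t> (pos), size - 1);
        const float frac = pos - static_cast<float> (index);

        // Linear interpolation between the two neighbouring entries.
        return table[index] + frac * (table[index + 1] - table[index]);
    }

private:
    std::vector<float> table;
    std::size_t size;
};
```

Taking the phase in cycles rather than radians means the oscillator only has to keep a value in [0, 1) and no division by 2*pi is needed per sample.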


#3

Ah cool, cheers. Will do some perf tests with my implementation.


#4

https://docs.juce.com/master/structdsp_1_1FastMathApproximations.html


#5

That sounds like quite a lot. Are you testing with optimized release builds?


#6

CPU dropped from 15% to 11%, which is great, and there is no noticeable error in the sound. I may still use std::sin on the lower bands, but that extra 4% CPU goes a long way towards stability. Thanks…


#7

Haven’t even looked at a release build yet. Still a long way off a product. My thought process, though, is that if it works well in debug it’s only going to be better in release…


#8

Debug builds are for debugging, not for measuring performance.


#9

You think :wink:

But in all seriousness, performance in debug is indicative of how it will behave once all the assertions etc. and debugger interop are removed. If the debug build works with acceptable performance, it’s a safe bet release will perform better.


#10

Yes, it’s fairly safe to assume that the release build will be faster than the debug one. But that’s about all you can assume.

There’s absolutely no point in trying out optimisations like the ones in this thread in a debug build to see how well they work. The effectiveness of an optimisation in debug mode tells you nothing about how effective the same thing will be in release mode.


#11

…and just to qualify that a bit, for example: adding a polynomial approximation may improve a debug build, because it’s faster than a call to std::sin. However, the release build may have been able to vectorise your loop and convert that std::sin call into an SIMD operation that goes over 4x faster. Adding the polynomial will prevent it doing that vectorisation, so actually could be slower. You just don’t know until you try it.
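
One way to "try it" is a small timing harness compiled with release optimisations and run on both versions. The sketch below is only an illustration: `nanosPerCall` and its parameters are hypothetical names, and a single run like this is noisy, so treat the numbers as rough.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical harness: returns the average time in nanoseconds per call
// of fn over the input buffer, repeated 'repeats' times.
template <typename Fn>
double nanosPerCall (Fn fn, const std::vector<float>& input, int repeats, float& sink)
{
    const auto start = std::chrono::steady_clock::now();

    float sum = 0.0f;
    for (int r = 0; r < repeats; ++r)
        for (float x : input)
            sum += fn (x);

    const auto stop = std::chrono::steady_clock::now();

    // Hand the result back to the caller so the optimiser
    // cannot discard the loop as dead code.
    sink = sum;

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count()
         / (static_cast<double> (repeats) * static_cast<double> (input.size()));
}
```

Calling it once with `[] (float x) { return std::sin (x); }` and once with the approximation, on the same buffer, gives a like-for-like comparison in the build configuration you actually ship.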


#12

Points taken…

Reality of my situation is that the release build of the current application has made little difference to its performance. Would love to get JUCE interfaced with some of the FPGAs in the office; what I am working on right now is kind of a precursor to that.


#13

If you use Intel IPP, maybe this could be worth a look

https://software.intel.com/en-us/ipp-dev-reference-sin


#14

Ooh, I need to look into that.


#15

What about using a rotation matrix with precomputed coefficients, which lets you compute the next sine sample from the previous one?

alpha = angle between samples

M = [cos(alpha), -sin(alpha); sin(alpha), cos(alpha)] (computed once and for all)

V(0) = [1; 0]

V(i+1) = M * V(i) (4 mult, 2 sums)

sample(i) = V(i)_1 (the first component of the V(i) vector; with V(0) = [1; 0] this gives cos(i * alpha), and the second component gives sin(i * alpha))

The only issue with such an approach is the cumulative effect of the rounding errors. Whether that matters depends on your use case, and you may need to devise a method to correct for it if required.
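
A minimal sketch of the recurrence above, assuming double precision and a fixed frequency. `RotationOscillator` is a hypothetical name, and the periodic renormalisation (every 1024 samples, an arbitrary choice) is just one possible way to keep the amplitude from drifting; it does not correct phase drift.

```cpp
#include <cmath>

// Hypothetical helper illustrating the rotation recurrence.
class RotationOscillator
{
public:
    // alpha: angle between samples in radians (2 * pi * frequency / sampleRate).
    explicit RotationOscillator (double alpha)
        : cosA (std::cos (alpha)), sinA (std::sin (alpha)) {}

    // Returns the first component of V(i), i.e. cos(i * alpha),
    // then advances the state to V(i+1) = M * V(i).
    double next()
    {
        const double out = x;

        const double newX = cosA * x - sinA * y;  // 2 mults, 1 sum
        const double newY = sinA * x + cosA * y;  // 2 mults, 1 sum
        x = newX;
        y = newY;

        // Assumed drift control: rescale the vector to unit length now and then.
        if (++count == renormInterval)
        {
            const double scale = 1.0 / std::sqrt (x * x + y * y);
            x *= scale;
            y *= scale;
            count = 0;
        }

        return out;
    }

private:
    static constexpr int renormInterval = 1024;

    double cosA, sinA;       // matrix coefficients, computed once
    double x = 1.0, y = 0.0; // V(0) = [1; 0]
    int count = 0;
};
```

If the frequency changes, the two coefficients have to be recomputed, so this suits bands whose frequency is constant or changes rarely.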

[edit] there are also approximations available here: https://github.com/francoisbecker/fb-utils/blob/master/include/fbu/math_utils.hpp#L316


#16

Just keep in mind that profiling can also be tricky, especially on macOS, as Apple doesn’t provide much control over CPU throttling.

  • For Windows machines you can usually turn off a lot of things (such as turbo boost / C-states) from the BIOS.
  • For macOS with Instruments you can try to set some CPU affinity and turn off HT (Hyper-Threading) for more consistent results.

Also, if you’re profiling a specific portion, it helps if your software is modular enough to profile that module on its own rather than having the entire app / plug-in running multiple threads.