Juce FFT vs FFTW benchmarking?

kkjathal · August 21, 2016, 1:24am

Just curious whether anyone has experience with both and a rough idea of how much faster FFTW is than the JUCE FFT? (I’m assuming it is faster, since the JUCE implementation mentions it’s not optimized for speed.)

Thanks!

Jan_Schwers · August 21, 2016, 10:14am

FFTW is way faster, but as it’s under the GPL you cannot use it in closed-source commercial products unless you buy the non-free license, which is… too expensive. You might wanna look into the Intel IPP (https://software.intel.com/en-us/intel-ipp). They have a free Community License.

jonathonracz · August 21, 2016, 10:51am

I have high hopes for the upcoming JUCE DSP module having a truly fast FFT so we don’t have to pay for FFTW or chase other solutions… in the meantime we have to use other things. Like @Jan_Schwers said, IPP is a good choice if you’re targeting only x86, but IMO even the community license is still too restrictive, since every dev has to have an account, the library can’t be put in a shared repo, etc.

I’ve been on a similar hunt; my favorite so far is FFTS, which has a permissive license and SSE/AVX/NEON acceleration.

dinaiz · August 21, 2016, 6:42pm

What about KissFFT? It’s under the BSD licence.

PixelPacker · August 21, 2016, 8:58pm

You might have noticed that Intel IPP, which has support for a pretty fast FFT and complex vector math, is now free (as in beer) if you don’t require their support. So I’d hope that sooner or later the JUCE FFT stuff will be a wrapper around IPP on Windows and Linux and around vDSP on Mac and iOS. Then the only problem that remains is ARM-based Android and embedded Linux. Not sure what one would use there.

Best,
Ben

jucemarler · August 27, 2016, 2:52pm

Yeah I’m currently using kiss inside a real-time audio classification module I’m putting together.

It would be great to remove the need for that explicit dependency once the JUCE FFT is given some magic, whenever that DSP module appears.

dinaiz · August 28, 2016, 3:41am

Just for the sake of removing the dependency, or because you found that KissFFT is slower than Intel’s FFT?

jucemarler · August 28, 2016, 8:35am

I was thinking more that the audio classification module/lib would be friendlier to other JUCE users (particularly newbies) if it used JUCE’s own FFT implementation.

I’m using kiss at the moment purely due to the BSD license, decent enough speed and my own familiarity with it.

I haven’t taken the time to do any benchmarking between the various packages out there.

jules · August 28, 2016, 9:03am

Our implementation is basically the same algorithm as kissFFT. I’d be very surprised if kissFFT was any quicker, so seems kind of pointless to go to the trouble of adding a 3rd party library unless you’ve actually benchmarked and found that there’s a good reason to do so.

And certainly our experience was that modern vectorising compiler optimisations get close to making a pure C++ algorithm as good as an assembly-language one. The intel FFT is probably a bit better because I’m sure they’ll use some sneaky CPU-specific tricks, but in many real-world cases none of this matters, since the bottleneck will be memory/cache access rather than pure CPU number-crunching. TL;DR: Don’t waste your time prematurely optimising unless you can measure a problem in your FFT and then measure an improvement by swapping the library!
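One way to act on the "measure first" advice is to time the transform in isolation before deciding whether a library swap is worth it. The following is only a minimal sketch using `std::chrono`. Both `naiveDft` and `bestTimeMicros` are hypothetical names made up for illustration, and the naive O(N^2) DFT is just a stand-in for whichever FFT call you actually want to measure; it is not the JUCE or KissFFT algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical stand-in for the transform under test: a naive O(N^2) DFT.
// Replace the body with a call into the library you want to measure.
std::vector<Complex> naiveDft (const std::vector<Complex>& in)
{
    const std::size_t n = in.size();
    const double twoPi = 6.283185307179586;
    std::vector<Complex> out (n);

    for (std::size_t k = 0; k < n; ++k)
    {
        std::complex<double> sum (0.0, 0.0);

        for (std::size_t t = 0; t < n; ++t)
        {
            // Reduce k * t modulo n so the angle stays small and accurate.
            const double angle = -twoPi * double (k * t % n) / double (n);
            sum += std::complex<double> (in[t]) * std::polar (1.0, angle);
        }

        out[k] = Complex (float (sum.real()), float (sum.imag()));
    }

    return out;
}

// Runs `transform` on `input` `repeats` times and returns the best
// wall-clock time in microseconds. Taking the minimum over several runs
// reduces the influence of scheduling noise and cold caches.
template <typename Transform>
double bestTimeMicros (Transform&& transform, const std::vector<Complex>& input, int repeats)
{
    assert (! input.empty() && repeats > 0);

    double best = 1.0e300;
    volatile float sink = 0.0f;

    for (int i = 0; i < repeats; ++i)
    {
        const auto start  = std::chrono::steady_clock::now();
        const auto result = transform (input);
        const auto stop   = std::chrono::steady_clock::now();

        // Touch the result so the compiler cannot discard the work.
        sink = sink + result[0].real();

        best = std::min (best, std::chrono::duration<double, std::micro> (stop - start).count());
    }

    return best;
}
```

Calling `bestTimeMicros (naiveDft, input, 100)` and then the same with a wrapper around another library gives two numbers to compare for the sizes you actually use. Timing the whole processing block as well shows whether the FFT is the bottleneck at all.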

jucemarler · August 28, 2016, 9:07am

Hi Jules,

Fair enough. Thanks for the info. To be honest I’ve got it in there because at the moment one of the libraries I’m using uses KissFFT internally so I kept it in there as I got comfortable using it. Looking like I’m going to be replacing the library with my own routines anyways so I’ll go with the JUCE FFT from there on!

yairadix · December 12, 2016, 1:23pm

When looking at the implementation of performRealOnlyForwardTransform, it looks like it just prepares a buffer for an equivalent perform call.

However, in kiss_fft’s case the real-only transform is twice as fast, according to its “TIPS” file.

Also, kiss’s fftr returns half the spectrum (the second half is usually not needed and can be trivially derived from the first half) so it uses less memory and is probably more cache-efficient…
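The "trivially derived" part follows from the conjugate symmetry of a real signal's DFT: for N real input samples, X[N - k] = conj(X[k]). Here is a small sketch of rebuilding the full spectrum from the first half. `expandHalfSpectrum` is a hypothetical helper written for illustration (not a KissFFT or JUCE function), and it assumes an even N with the half spectrum stored as N/2 + 1 bins.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical helper: rebuilds all N bins of a real-input transform from
// the first N/2 + 1 bins, using X[N - k] = conj (X[k]).
// Assumes N is even, so half.size() == N/2 + 1.
std::vector<Complex> expandHalfSpectrum (const std::vector<Complex>& half)
{
    assert (half.size() >= 2);

    const std::size_t n = 2 * (half.size() - 1);
    std::vector<Complex> full (n);

    // Bins 0 .. N/2 are copied unchanged.
    for (std::size_t k = 0; k < half.size(); ++k)
        full[k] = half[k];

    // Bins N/2 + 1 .. N - 1 are mirrored conjugates of bins N/2 - 1 .. 1.
    for (std::size_t k = half.size(); k < n; ++k)
        full[k] = std::conj (half[n - k]);

    return full;
}
```

Since the upper half carries no extra information, code that only needs magnitudes or per-bin processing can usually work on the N/2 + 1 bins directly and skip this expansion.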

adamwilson · February 21, 2017, 9:53am

Has anyone done any benchmarking to compare, e.g. JUCE FFT / KissFFT / FFTW / FFTS?

IvanC · February 21, 2017, 10:52am

Yes! In short: Intel FFT > vDSP / Accelerate > FFTS > FFTW > PFFFT > FFTReal > Ooura FFT >>> KissFFT + JUCE FFT

chkn · February 21, 2017, 11:10am

Did you check the FFT of Intel IPP or Intel MKL?

ckhf · February 21, 2017, 11:23am

In some tests we did, Ooura FFT was faster than FFTReal. In our tests this also depended on Windows/OS X and whether using floats or doubles.

chkn · February 21, 2017, 11:41am

It’s also important which sizes have been checked. I think the differences are not too big with small FFTs, because of the CPU memory cache.

ckhf · February 21, 2017, 12:09pm

We tested various sizes (32 - 1048576) as 32- and 64-bit applications on various machines. Generally, the difference increased with the size. I’d say for most audio applications the size will be between 256 and 32768.

The tests also included a convolution (full processing: Y1 = FFT(X1), Y2 = FFT(X2), Y1 = Y1 * Y2, X1 = IFFT(Y1), X1 = rescale(X1)), which was identical for FFTReal and Ooura. The data was initialized randomly before the benchmark began.
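For anyone wanting to reproduce that kind of test, the processing chain can be sketched roughly as below. This is not the benchmark code from the tests described above: `fftInPlace` is a plain textbook radix-2 FFT standing in for FFTReal/Ooura, `fftConvolve` is a hypothetical name, and the zero-padding to a power of two (to get a linear rather than circular convolution) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// In-place iterative radix-2 FFT. The size must be a power of two.
// With inverse == true this computes the unscaled inverse transform.
void fftInPlace (std::vector<Complex>& data, bool inverse)
{
    const std::size_t n = data.size();
    assert (n != 0 && (n & (n - 1)) == 0);

    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i)
    {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1)
            j ^= bit;
        j ^= bit;

        if (i < j)
            std::swap (data[i], data[j]);
    }

    // Butterfly passes of increasing length.
    const double pi = 3.141592653589793;
    for (std::size_t len = 2; len <= n; len <<= 1)
    {
        const double angle = (inverse ? 2.0 : -2.0) * pi / double (len);
        const Complex step = std::polar (1.0, angle);

        for (std::size_t start = 0; start < n; start += len)
        {
            Complex w (1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k)
            {
                const Complex even = data[start + k];
                const Complex odd  = data[start + k + len / 2] * w;
                data[start + k]           = even + odd;
                data[start + k + len / 2] = even - odd;
                w *= step;
            }
        }
    }
}

// Y1 = FFT(X1), Y2 = FFT(X2), Y1 = Y1 * Y2, X1 = IFFT(Y1), X1 = rescale(X1).
// Inputs are zero-padded (an assumption of this sketch) so the result is
// the linear convolution of x1 and x2.
std::vector<double> fftConvolve (const std::vector<double>& x1, const std::vector<double>& x2)
{
    if (x1.empty() || x2.empty())
        return {};

    const std::size_t resultSize = x1.size() + x2.size() - 1;
    std::size_t n = 1;
    while (n < resultSize)
        n <<= 1;

    std::vector<Complex> y1 (n), y2 (n);  // value-initialised to zero
    for (std::size_t i = 0; i < x1.size(); ++i) y1[i] = x1[i];
    for (std::size_t i = 0; i < x2.size(); ++i) y2[i] = x2[i];

    fftInPlace (y1, false);
    fftInPlace (y2, false);

    for (std::size_t i = 0; i < n; ++i)
        y1[i] *= y2[i];                   // pointwise spectral product

    fftInPlace (y1, true);

    std::vector<double> out (resultSize);
    for (std::size_t i = 0; i < resultSize; ++i)
        out[i] = y1[i].real() / double (n);  // rescale by 1/N

    return out;
}
```

Swapping `fftInPlace` for calls into the library under test, while keeping the multiply and rescale steps identical, keeps the comparison between libraries fair.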

yairadix · February 21, 2017, 12:25pm

Did you benchmark real-transforms? It would be very surprising if JUCE’s FFT matched KissFFT there.

IvanC · February 21, 2017, 1:03pm

I should probably redo my benchmarking tests at some point and post the results, since I don’t remember some of the details.

adamwilson · February 21, 2017, 2:44pm

So, from how things look atm, on iOS/macOS it’s best to use vDSP, on Intel platforms IPP, and on other platforms (e.g. Android ARM) either FFTS or PFFFT (avoiding the bloat and license issues of FFTW).

In reality, it might not make enough of a difference to warrant #ifdefs for each platform, and it should be easier with one library that supports all major platforms. PFFFT looks good but does not seem to compile for Android without modification (https://bitbucket.org/jpommier/pffft/pull-requests/1/introduce-set_ps1-macro/diff), and FFTS looks tricky to include and has not been recently supported by its original author… KissFFT seems the simplest to use, with decent enough performance.
It may be that for my needs, the JUCE implementation will be good enough.
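To give an idea of what the per-platform #ifdef route would involve, here is a rough sketch of compile-time backend selection. The `FftBackend` enum and both function names are hypothetical and only pick a label; a real wrapper would also need a separate implementation file per backend, which is the maintenance cost being weighed here. The preprocessor macros are the usual compiler-predefined ones.

```cpp
#include <cassert>
#include <string>

// Hypothetical set of backends, following the choices discussed above.
enum class FftBackend { vdsp, ipp, portable };

// Chooses a backend at compile time from compiler-predefined macros.
constexpr FftBackend preferredFftBackend()
{
#if defined (__APPLE__)
    return FftBackend::vdsp;       // macOS / iOS: Accelerate's vDSP
#elif defined (__i386__) || defined (__x86_64__) || defined (_M_IX86) || defined (_M_X64)
    return FftBackend::ipp;        // other x86 targets: Intel IPP
#else
    return FftBackend::portable;   // e.g. ARM Android: a portable C/C++ FFT
#endif
}

// Human-readable label, handy for logging which path a build uses.
std::string backendName (FftBackend backend)
{
    switch (backend)
    {
        case FftBackend::vdsp:     return "vDSP";
        case FftBackend::ipp:      return "IPP";
        case FftBackend::portable: return "portable";
    }

    assert (false);  // not reached for valid enum values
    return "unknown";
}
```

With a single portable library, none of this branching is needed, at the cost of some speed on the platforms that have a tuned native FFT.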
