Demonstrational code for dsp::fft:perform()

Hi. I’d like to know if anyone has a little demo project that contains the normal perform-method. i can only find examples of the realOnly-methods, but if i understood correctly then the perform-method is basically the one that is most flexible in terms of usage, so it’s advisable to learn it, right?

i tried a bit of stuff myself and atm i just have 2 vectors of Complex < float > of size * 2, just like in the fft tutorial with the other method. in the method where i push the fifo to the fft object i say:

 fft->perform(&fifo.data()[0], &data.data()[0], false);

because if i understand correctly then this method asks for a reference to the first entry so it can read the whole array from there. then i made a method for testing the output:

void print(float relSize = 1.f) {
    String p = "";
    for (auto i = 0; i < data.size() * relSize; ++i)
        p += String(data[i].real()) + ", " + String(data[i].imag()) + ": ";
    DBG(p);
}

i gave it a sine wave at 1000hz and it produced this output to the console:

8.7641e+08, 0: -4.77772e+08, -3.98638e+08: 7.91343e+07, 0: -4.77772e+08, 3.98638e+08: 0, 0: 0, 0: 0, 0: 0, 0: 
8.76411e+08, 0: -4.77773e+08, -3.98638e+08: 7.91343e+07, 0: -4.77773e+08, 3.98638e+08: 0, 0: 0, 0: 0, 0: 0, 0: 
8.76412e+08, 0: -4.77773e+08, -3.98639e+08: 7.91342e+07, 0: -4.77773e+08, 3.98639e+08: 0, 0: 0, 0: 0, 0: 0, 0: 

ok so… i know that it makes sense that the 2nd half of the array is empty. someone once explained to me that it’s only used for some calculations within the perform-method. but… what do i do with the actual values now? which are the calculations that get magnitude and phase from these bins? the 2nd bin doesn’t show 0 as imaginary number so i first thought that’s where the 1000hz wave is located but it seems to be the same when i shift the sine wave up to 20k or something which should clearly be the last bin.

EDIT:
i know i asked a similiar question before and in my last fft related post i was told to use std::abs() on the complex number to get its magnitude and std::arg() on it to get its phase. well i tried that but even then all bins show something with e+05 etc at the end, and i expect the normalized sine wave to stand out in the measurement with a high value.

in order to visualize to you my problem a bit better i show you this screenshot from my daw:


as you can see i wrote a little plug’n script script that can produce the first 8 of harmonics of any frequency as normalized value (-1 to 1). the visualizer is set to normalized as well which proves that it works so far. and as you can see the sine wave is set to a frequency of 2khz. i also tested that in a spectrum analyzer. so that’s not the problem. the problem is that when i do this on the data:

for (auto i = 0; i < size * relSize; ++i) {
                    auto magnitude = std::abs(data[i]);
                    p += String(magnitude) + ", ";
                }

i get

2.05411e+09, 1.44914e+09, 2.04448e+08, 1.57981e+08, 1.46176e+08, 1.57981e+08, 2.04448e+08, 1.44914e+09,

(currently on order == 3 as you can see) why is my sine wave not represented in the numbers?
(FFTPlugin is the name of the juce plugin that performs fft)

In general: those high values are quite strange, sure you sent enough samples into your FFT?

From the values, it seems like your FFT size is very small:

I see DC, a frequency, then Nyquist, after that the complex conjugate of the frequency. So it seems that your fftSize was 4? The 0s indicate that your array is actually bigger than needed, you only need 2 * getSize() many float values.

I recommend using something like Python, Matlab, or Octave (free: https://www.gnu.org/software/octave/) to get familiar with the DFT. Play around with different sizes, get to know the leakage problem, learn how to solve it with windowing, see how windowing affects the main-lobes and side-lobes of the analysis. Check out how you can apply convolution by multiplication in the frequency domain (don’t forget to learn about zero-padding and circular convolution!! :wink: ). And after that implement it in C++ and compare the values, so you can verify that you are implementing it the right way. Learning it directly with C++ is very hard!

Btw: &fifo.data()[0] is the same as fifo.data().

on init of my fft object i declare an order and then size is 1 << order. my complex vectors have size * 2. and my windowing function works like this:

windowAmp = std::sin(float(idx) * sizePiMaxInv);
fifo[idx] = sample * windowAmp;
++idx;

for each sample. (sizePiMaxInv == MathConstants < float > ::pi / float(size - 1))

edit: oh and yes! i currently experiment with rather small fft sizes so i can better see what’s going on in the console output.

according to this desmos my windowing function is working pretty good. maybe there are a lot of better ones but this one should be good enough to give clear results on a simple sine wave shouldn’t it? https://www.desmos.com/calculator/qungiuxl8e

Here a small code snippet for performing an FFT

constexpr int order = 8;
constexpr int nFFT = 1 << order;
constexpr float fs = 48000;
constexpr float f = fs / nFFT * 2; // check out the second bin after the FFT!!

std::vector<std::complex<float>> data (nFFT);
for (int i = 0; i < nFFT; ++i)
    data[i] = cos (2 * MathConstants<float>::pi * f * i / fs);

dsp::FFT fft (order);
fft.perform (data.data(), data.data(), false);

// normalize
for (auto& elem : data)
    elem /= nFFT;

for (int i = 0; i < nFFT; ++i)
    DBG (data[i].imag() << " " << data[i].real());

Edit: smaller fix to calculate frequency
Edit 2: complex input data

3 Likes

edit: now my reply doesn’t make sense anymore^^

Edit: mine neither :smiley:

ok i see now that you use complex the vector is not size * 2 but just size. maybe that’s my mistake… let me try something really quickly…

Yep, FFT will produce fftSize many complex values -> 2 * fftSize many float values, and as an input it will use the same number of values. As in audio we are using mostly real signals, we don’t need the imaginary part and reduce the signal size to fftSize many real values. However, we sill need to provide enough space so the FFT can be performed. When feeding only real values, our output will be complex conjugated:
DC f1 f2 ... fm Nyq fm* ... f2* f1*`` <- * is complex conjugated here So we only use the first half **plus** the Nyquist one DC f1 f2 … fm Nyq`

m would be fftSize / 2 - 1 here

it’s interesting btw that you use the same vector as in- and output on the perform method. in the fft tutorial they are seperate objects. i guess that’s just because the fft output in the tutorial is being used on the graphics thread as well?

edit:

i got rid of my 2nd vector and rewrote my push-method like this:

void push(const float& sample) {
            if (idx == size) {
                fft->perform(data.data(), data.data(), false);
                idx = 0;
            }

            // windowing. prevents discontinuity
            auto windowAmp = 0.f;
            switch (windowType) {
            case WindowType::Linear:
                windowAmp = idx < sizeHalf ?
                    float(2 * idx) * sizeMaxInv :
                    2.f - float(2 * idx) * sizeMaxInv;
                break;
            case WindowType::Sine:
                windowAmp = std::sin(float(idx) * sizePiMaxInv);
                break;
            default:
                // No Windowing
                windowAmp = 1.f;
                break;
            }
            
            data[idx] = sample * windowAmp;
            ++idx;
        }

it looks much nicer now without std::fill and std::copy as well. the output also changed for some reason that i don’t understand yet:

0, 2.6444e+08, 4.10047e+08, 4.197e+08, 3.20899e+08, 1.73705e+08, 4.83355e+07, 0.111874, 
0, 2.6444e+08, 4.10046e+08, 4.19699e+08, 3.20898e+08, 1.73705e+08, 4.83351e+07, 0.111948, 

with this print method:

String p = "";
for (auto i = 0; i < size * relSize; ++i) {
                    auto magnitude = std::abs(data[i]);
                    p += String(magnitude) + ", ";
                }
DBG(p);

now the dc offset is gone which is good. but i get a high looking number compared to the other ones at the end and still no clue where my 1khz sine wave shows up

FFT is only performed on the thread from where it’s called, so there’s no internal threading going on. Regardless of that: I am actually not 100% sure if it’s allowed to use the same buffer here, unfortunately there’s nothing in the docs. You would have to inspect the source code of all the wrappers in order to figure out if there’s any problem. I simply did it and it worked, lucky me :wink:

1 Like

ok now my push- and print-methods look like this:

void push(const float& sample) {
            if (idx == size) {
                fft->perform(data.data(), data.data(), false);
                print();
                idx = 0;
            }

            // windowing. deleted to emphasize on important stuff 
            
            data[idx] = sample * windowAmp;
            ++idx;
        }

        void print() {
            String p = "";
            for (auto i = 0; i < size; ++i) {
                    auto magnitude = std::abs(data[i]);
                    p += String(magnitude) + ", ";
                }
            DBG(p);
        }

with this kinda output again:

2.50325e+08, 1.806e+09, 9.86781e+08, 1.18388e+09, 1.79913e+09, 1.18388e+09, 9.86781e+08, 1.806e+09

the dc offset was never gone actually. i just had my print method at the wrong place. but now it is very apparent that directly after the perform there is no sign of anything happening in the data. at this point i wonder if my sine wave even reaches this place.

edit: ok so turned out it didn’t, lol. i forgot to send channelCount to the object that handles everything in my code. i’ll edit again when i made new measurements

ok, now that my sine wave is actually correctly channelized to my fft object i get the following output

0.10484, 0.339503, 0.182495, 0.218485, 0.321871, 0.218485, 0.182495, 0.339503, 
0.0124111, 0.218819, 0.126668, 0.146543, 0.223256, 0.146543, 0.126668, 0.218819, 
0.10484, 0.339503, 0.182495, 0.218485, 0.321871, 0.218485, 0.182495, 0.339503, 
0.0124111, 0.218819, 0.126668, 0.146543, 0.223255, 0.146543, 0.126668, 0.218819, 
0.10484, 0.339503, 0.182495, 0.218485, 0.321871, 0.218485, 0.182495, 0.339503,

from my print method:

String p = "";
for (auto i = 0; i < size; ++i){
      auto magnitude = std::abs(data[i]) / size;
      p += String(magnitude) + ", ";
}
DBG(p);

i like about this that it indeed produces normalized values, just as expected. my only concern left is that i still don’t see in which bin my 1khz sine wave landed. atm it looks as if their was audio in the whole spectrum. my first impulse was to think that there might still be something bad going on in the way i give it the audio input. but i think this method, that i call from processBlock, looks just ok:

void process(AudioBuffer<float>& buffer) {
            for (auto s = 0; s < buffer.getNumSamples(); ++s) {
                auto sample = 0.f;
                for (auto ch = 0; ch < buffer.getNumChannels(); ++ch)
                    sample += *buffer.getReadPointer(ch, s);
                sample *= channelCountInv;
                push(sample);
            }
        }

(channelCountInv == 1.f / channelCount)

Second one or better: all of them. Welcome to spectral leakage :wink: frequency resolution of an DFT is fs/size, so real bad with order of 3 :wink:

1 Like

so what i did now was:

change the window type. up until now i had this sine-type window but now i wrapped pow(x,7) around it because a friend told me that the frequency response of an fft is better if there is pretty much just silence around the window.

turn up the order == 8 so it’s still a bit readable but with less spectral leakage hopefully.

reduce the output to size / 2 because the other half is just a mirror apparently.

changed my code structure in a way that it doesn’t continously outputs values but only once after some secounds, to make it more reasonable with the bigger amount of numbers popping up.

the output:

2.26143e-05, 0.000144084, 0.00603683, 0.0347246, 0.0927037, 0.14102, 0.129823,
0.0715398, 0.0214991, 0.00256371, 4.72361e-05, 5.82473e-06, 1.2083e-06, 3.32574e-07, 
.11751e-07, 3.97494e-08, 2.04652e-08, 8.7575e-09, 6.15625e-09, 1.99344e-09, 2.44061e-09, 
.72149e-09, 3.8707e-09, 2.16605e-09, 3.68502e-09, 3.74331e-09, 4.76764e-09, 2.99685e-09,
1.429e-09, 8.84518e-10, 3.33291e-09, 3.71793e-09, 3.60381e-09, 3.42999e-09, 3.58897e-09,
2.52687e-10, 4.31954e-09, 2.65527e-09, 2.76164e-09, 2.98942e-09, 1.03809e-09,
2.60202e-09, 3.88445e-09, 2.81017e-09, 1.514e-09, 2.43923e-09, 1.13812e-09, 1.98681e-09,
1.37313e-09, 2.99838e-09, 2.58017e-09, 1.81295e-09, 3.03806e-09, 1.40512e-09,
1.35735e-09, 1.60283e-09, 1.16415e-09, 3.83994e-09, 4.165e-09, 2.63418e-09,
1.86265e-09, 2.98169e-09, 4.80839e-09, 2.74246e-09, 2.15997e-09, 3.86652e-09,
3.71436e-09, 2.83251e-09, 5.58794e-09, 5.8902e-09, 6.71586e-09, 4.65661e-09, 4.79993e-09,
1.05459e-09, 3.3025e-09, 1.88689e-09, 2.21181e-09, 3.81688e-09, 6.17311e-09, 6.53592e-09,
3.4798e-09, 2.32468e-09, 3.98459e-09, 2.88209e-09, 2.97581e-09, 3.17632e-09, 2.82221e-09,
4.00822e-09, 2.73671e-09, 6.52354e-10, 2.87833e-09, 1.76848e-09, 3.82646e-10, 1.1431e-09,
1.82018e-09, 4.10992e-09, 5.18303e-09, 5.25531e-09, 4.92599e-09, 1.74679e-09,
6.31415e-09, 9.93993e-09, 4.50986e-09, 3.096e-09, 2.65179e-09, 1.36151e-09, 7.63158e-09,
7.60751e-09, 8.77522e-09, 7.09547e-09, 1.39994e-09, 2.87988e-09, 4.33792e-09,
1.96479e-09, 8.83511e-10, 3.43864e-09, 2.37247e-09, 5.32992e-09, 5.40355e-09,
3.92416e-09, 1.31709e-09, 1.86265e-09, 0, 0, 0, 1.86265e-09, 1.42103e-09, 3.12323e-09

if i interpret this correctly the fft expects there to be “something” between bin1 to bin9. after that there is only really small numbers left. the numbers in 1-9 are also not that high however. considering that the wave i gave it was a normalized wave getting 0.14 as maximum value in bin5 seems like i made a mistake there. so i checked out what my calculation consists of:

void push(float sample) {
            window.process(sample, idx);
            input[idx] = dsp::Complex<float>(sample, 0.f);
            ++idx;
            if (idx == size) {
                idx = 0;
                fft->perform(input.data(), output.data(), false);
            }
        }

oh yeah and i tidied up the fft object a bit and outsourced the window, as you can see…
however, the values that are actually being printed are made this way:

for (auto i = 0; i < size * relSize; ++i) {
     auto magnitude = std::abs(output[i]) / size;
     p += String(magnitude) + ", ";
}

now there are a lot of things that could be going wrong here atm. i could still have the wrong idea about good window types, i could have misunderstood how to get the magnitude from the complex values, or something that i don’t even think of of course… may you give me another hint please? the values look so good already. i think i’m pretty close to actually getting it.

edit: so now i tried to output just std::abs(output[i]) without / size and also on a different frequency. now i chose 10khz. the out showed that it seems to work. now the “big numbers” appear somewhere in the middle of the debug message with a maximum at 36.1053. so i went to desmos and compared that number to 2^8 since that’s the size of the fft atm, and found out that the number / 7 is pretty close to 36. generally having to devide something with 7 is a rather unusual thing to do if you know what i mean, so i think this can’t be generalized and is just coincidence for this exact case, isn’t it? b/4 however equals 64. that would make more sense in my opinion. it would mean there is a headroom of 27.8947 on that harmonic. if this is true then i guess it would have something to do with how all the bins leak into each other.

some further experiments where i introduced even more harmonics of a given frequency revealed that i indeed now have a functioning fft object capable of telling me where its harmonics are located. the values are just still not really normalized

Yes, divided by 7 is coincidence. Your window is taking away quite a lot of energy, so I am not surprised the magnitude is not that big. I almost always use a Hann window for it, especially when overlapping the input. You could apply a factor of 2 to compensate for it, but that of course depends on the audio content in general and the position within the frame.

1 Like

well, actually i just wanted to know if i’m on the right track, which i am apparently. so i’ll have to check out the hann window then. but were i right about having to devide the size by 4 or is it 2 or 1 actually? also will there always be a drop in energy when i use any type of windowing? and is windowing even used if the plan is to invert the fft in a later stage again or is windowing just something that makes better visuals?

edit:
ok, the hann window seems to be really good indeed. i get much higher values on the frequencies that make sense and still low values everywhere else. so that indeed helped a lot. now it goes up to around 27 on a normalized harmonic of my test tone. on my current order of 7 that would mean that i have headroom of 5 on this harmonic:

https://www.desmos.com/calculator/z2jcxrvk3x

which seems much more reasonable considering that there has to be some loss of energy due to the windowing

A sinusoid signal of amplitude 1, with a frequency exactly matching the one of a DFT bin, will have a magnitude of 0.5 for both the positive and the negative frequency (which is in the second half of the spectrum). You’ll get the 0.5 if you divide the FFT result by the number of bins.
As you signal currently doesn’t match a specific bin, it’s energy is divided up to all bins (leakage) with the most energy around the bins with similar frequency. Additionally, you’ve applied a window, which will also broaden the peak and bring it down a little bit in magnitude.

Try a frequency f = k * fs / fftSize with k being the bin number, then you won’t have spectral leakage unless you apply a window.

Regarding inverting: a IFFT can only invert what the FFT made, so you’ll get a windowed signal again. If you can undo the window than you can get the signal back, however with sum smaller numerical errors. Best practice is to overlap the incoming signal, so you can retrieve the signal out of several frames.

1 Like

just as you told me i just deactivated windowing and chose a frequency that exactly on a bin by calculating 8 * sampleRate / size = 3000. i fed that into a new method that i just wrote which can output a freq of some bin with linear interpolation and connected that to this knob and yes… it finally works and gives me just the value that you were talking about: .5f

so this is it i guess :slight_smile: now i know how to get normalized values of fft data. thank you so much!