From FFT to sound bars spectrum

Hi,

I’m a bit stuck after calling performRealOnlyForwardTransform in the FFT class.
I apologize in advance, but I’m bad at math…
I’m looking for a way to convert the float (re, im) data into bytes, “quantized” into,
let’s say, 8 frequency bars… I just want a left / right VU meter.

Thanks for any tips (urls) for dummies

The first thing is that the complex number gives you amplitude and phase for a given frequency in your spectrum.
To display the amplitude, you need to take the absolute value of it. Then, you need to average the values of all frequencies that fall into the same “bin”/octave of your 8 frequency bars.
You may also want to apply a window function to your chunk of data to reduce spectral leakage (energy spilling into neighbouring frequencies).

Hi,

I’m new to JUCE but have some experience in digital signal processing - so I hope I can help you a bit.

I’m not exactly sure what you’ve understood so far, so I’ll try to explain all steps you need to know so that you can hopefully help yourself a bit more after reading.

Maybe you should take a look at complex numbers first? Complex numbers are a mathematical concept that makes things like periodic sine waveforms a lot easier to calculate. However, this has absolutely nothing to do with the number format & quantization your computer uses to compute them. Those are just two different topics.

So, let’s get started with complex numbers:
A complex number is built out of two numbers, the real part and the imaginary part. You might compare this idea to fractions, which are also always made of two numbers, the numerator and the denominator. Back to complex numbers: the real part behaves like the real numbers you deal with every day. The imaginary part is another real number, multiplied by the imaginary unit i. Added together they form a complex number: z = x + y*i with z = complex number, x = real part, y = imaginary part.

Now if you take a coordinate system, mark the point [x, y] and draw an arrow from [0, 0] to that point, you can measure the length of that arrow. This length is called the absolute value of the complex number. It can be calculated with the Pythagorean theorem: |z| = sqrt(x^2 + y^2)

To fully describe a complex number by its absolute value, you also have to give the angle of the arrow relative to the real axis of your coordinate system.

Take a look at the Wikipedia article and scroll down to “Definition”; there is a picture that makes the relation between real and imaginary value on one side and absolute value and angle on the other a bit clearer. You don’t have to understand the whole text of the article :wink:
https://en.wikipedia.org/wiki/Complex_number#/media/File:Complex_conjugate_picture.svg

So, as a conclusion: there are always two ways of describing a complex number: real and imaginary value, or absolute value and angle.
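If it helps to see this in code, here is a minimal sketch of both descriptions using the standard C++ std::complex type. The helper names (magnitudeOf, angleOf, fromPolar) are made up for this post and are not part of JUCE.

```cpp
#include <cmath>
#include <complex>

// Absolute value (length of the arrow): sqrt(re^2 + im^2).
float magnitudeOf (float re, float im)
{
    return std::abs (std::complex<float> (re, im));
}

// Angle of the arrow relative to the real axis, in radians.
float angleOf (float re, float im)
{
    return std::arg (std::complex<float> (re, im));
}

// Back from (absolute value, angle) to a complex number with real and imaginary part.
std::complex<float> fromPolar (float magnitude, float angle)
{
    return std::polar (magnitude, angle);
}
```

For example, magnitudeOf (3.0f, 4.0f) gives 5, and feeding that magnitude together with angleOf (3.0f, 4.0f) into fromPolar gets you back to 3 + 4*i (up to float rounding).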

Next topic: The FFT!
Now that you know about complex numbers, we can go on to the FFT. A general FFT transforms a vector of complex values into another vector of complex values. Input & output vector are of the same size, e.g. you put 128 values into the FFT and 128 values will come out of the FFT. This statement doesn’t say anything about the actual meaning of the values, but you’ll have to keep it in mind.
Now what is the meaning of the input & output data? Usually, we take a series of input samples, taken from a continuous input signal, and consider them to be the input vector of the FFT. This sample vector represents the so-called “time domain” signal, as it represents the shape of the signal sampled over time. The output of the FFT will be the “frequency domain” representation - a vector consisting of complex numbers. Each complex number represents a sine wave with a given frequency, with the absolute value of this complex number corresponding to the amplitude of that sine wave and the angle representing the phase shift of that sine wave. I have created some plots to make things more clear:

What can be seen here?

  1. Although the first and second time domain signals differ (same frequency but with a phase shift), the absolute value of their frequency domain representation is nearly the same (in theory completely the same)

  2. The frequency vector is mirrored around the center frequency of 0 Hz. In fact it isn’t centered completely in my plots, but that is just me being too lazy to polish the plots. Believe me, the center line is the DC value = 0 Hz. The mirroring happens because we only put in a real-valued vector (there are no complex numbers in our audio samples). So a common thing in audio processing is to just compute the right half of the frequency vector, so that 128 real-valued audio samples translate to 65 complex-valued frequency values.

  3. There is more than one peak in the frequency spectrum, although the input signal “contains” just one frequency. This is because there is no exact 600 Hz frequency value in the output vector, so all the values around that frequency come up a bit. To get the corresponding frequency of each element coming out of the FFT, you have to build a vector with equally spaced frequency values from 0 Hz to 0.5 * Samplerate. In our case (considering just the right half with 65 frequency values) the first four elements of this vector would be 0 Hz, 375 Hz, 750 Hz and 1125 Hz, and so on.
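The three observations above can be reproduced with a small sketch like the following. It uses a naive, slow DFT instead of a real FFT just to keep it short (the numbers are the same), and all function names are made up for this post. The sample rate of 48000 Hz is my assumption, inferred from the 375 Hz spacing with 128 samples.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Frequency in Hz of output element `bin` for a transform of `fftSize` samples.
double binFrequencyHz (std::size_t bin, std::size_t fftSize, double sampleRate)
{
    return static_cast<double> (bin) * sampleRate / static_cast<double> (fftSize);
}

// Naive O(N^2) DFT of a real signal, for illustration only.
std::vector<std::complex<double>> naiveDft (const std::vector<double>& samples)
{
    const std::size_t n = samples.size();
    const double twoPi = 2.0 * std::acos (-1.0);
    std::vector<std::complex<double>> spectrum (n);

    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t t = 0; t < n; ++t)
            spectrum[k] += samples[t]
                             * std::polar (1.0, -twoPi * double (k) * double (t) / double (n));

    return spectrum;
}

// Sine test signal with the given frequency and phase shift (in radians).
std::vector<double> makeSine (std::size_t numSamples, double frequencyHz,
                              double phase, double sampleRate)
{
    const double twoPi = 2.0 * std::acos (-1.0);
    std::vector<double> samples (numSamples);

    for (std::size_t t = 0; t < numSamples; ++t)
        samples[t] = std::sin (twoPi * frequencyHz * double (t) / sampleRate + phase);

    return samples;
}
```

With 128 samples at 48000 Hz, binFrequencyHz gives 0, 375, 750, 1125 Hz for bins 0 to 3. A 750 Hz sine sits exactly on bin 2, so its absolute value there is the same for any phase shift. A 600 Hz sine falls between bins 1 and 2, so several neighbouring bins come up, and in both cases element k and element 128 - k have the same absolute value (the mirroring).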

So how does this solve your problem?
Now you hopefully have a better idea of what you would want to compute.

  1. You’ll just need the absolute values of the FFT results. performRealOnlyForwardTransform() will give you the complex-valued result, so you would have to compute the absolute values yourself afterwards. (I’ve never used the JUCE FFT class; I’ve used fftw until now. But I found performFrequencyOnlyForwardTransform() in the JUCE FFT documentation. If I get it right, it probably returns the absolute values directly. Better double-check that, though, as I’ve never used that class.)

  2. If you need just 8 frequency bars, you’ll have to sum up all output frequency values that fall into a bar and divide the sum by the number of values you summed. You’ll probably want to have log-spaced bars, so the upper bars will contain more frequency values than the lower ones.
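The two steps above could look roughly like this in plain C++. This is only a sketch under a few assumptions: the complex result is assumed to be stored as interleaved (re, im) float pairs, the bar edges grow geometrically (one possible way to get log-spaced bars), the DC value in bin 0 is skipped, and the byte mapping is linear against a full-scale value you pick yourself. None of the function names are JUCE API, they are made up for this post.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Absolute values of the first numBins complex values, given as interleaved (re, im) floats.
std::vector<float> magnitudesFromInterleaved (const std::vector<float>& reIm, std::size_t numBins)
{
    assert (reIm.size() >= 2 * numBins);
    std::vector<float> mags (numBins);

    for (std::size_t i = 0; i < numBins; ++i)
        mags[i] = std::hypot (reIm[2 * i], reIm[2 * i + 1]);

    return mags;
}

// Average the magnitudes into numBars log-spaced bars. Bin 0 (DC) is skipped.
std::vector<float> averageIntoLogBars (const std::vector<float>& mags, std::size_t numBars)
{
    assert (numBars > 0 && mags.size() > numBars);
    const double lastBin = static_cast<double> (mags.size() - 1);
    std::vector<float> bars (numBars, 0.0f);
    std::size_t start = 1;

    for (std::size_t bar = 0; bar < numBars; ++bar)
    {
        const std::size_t barsLeft = numBars - 1 - bar;

        // Upper edge grows geometrically from bin 1 up to the last bin.
        const double edge = std::pow (lastBin, double (bar + 1) / double (numBars));
        std::size_t end = static_cast<std::size_t> (std::lround (edge)) + 1; // one past the last bin

        end = std::max (end, start + 1);              // at least one bin per bar
        end = std::min (end, mags.size() - barsLeft); // keep one bin for every later bar
        if (barsLeft == 0)
            end = mags.size();                        // the last bar takes whatever is left

        float sum = 0.0f;
        for (std::size_t i = start; i < end; ++i)
            sum += mags[i];

        bars[bar] = sum / static_cast<float> (end - start);
        start = end;
    }

    return bars;
}

// Linear mapping of a bar value to 0..255; fullScale is the value that should light the whole bar.
std::uint8_t toByte (float value, float fullScale)
{
    assert (fullScale > 0.0f);
    const float normalised = std::min (std::max (value / fullScale, 0.0f), 1.0f);
    return static_cast<std::uint8_t> (std::lround (normalised * 255.0f));
}
```

With 65 frequency values and 8 bars this puts bins 1-2 into the first bar, bin 3 into the second, and bins 39-64 into the last one. What to use as fullScale depends on your FFT size and input level, so treat it as a tuning value. A meter often looks nicer on a dB scale, which is left out here to keep things short.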

Now go on and try that yourself first. Once that works, you may also want to take a look at window functions to smooth out your result even further.
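For when you get to the window functions: a minimal sketch of a Hann window, one common choice among several. You multiply each block of samples by it before running the FFT. The function names are made up for this post, and using the symmetric form with N - 1 in the denominator is my choice here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hann window coefficients: w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1))).
// They are 0 at both ends and 1 in the middle.
std::vector<float> makeHannWindow (std::size_t numSamples)
{
    std::vector<float> window (numSamples, 1.0f);
    if (numSamples < 2)
        return window; // nothing sensible to taper

    const double twoPi = 2.0 * std::acos (-1.0);

    for (std::size_t n = 0; n < numSamples; ++n)
        window[n] = static_cast<float> (0.5 * (1.0 - std::cos (twoPi * double (n)
                                                                 / double (numSamples - 1))));
    return window;
}

// Multiply a block of samples by the window, in place.
void applyWindow (std::vector<float>& samples, const std::vector<float>& window)
{
    const std::size_t n = std::min (samples.size(), window.size());

    for (std::size_t i = 0; i < n; ++i)
        samples[i] *= window[i];
}
```

Compute the window once for your FFT size and reuse it for every block. Fading the block to zero at both ends is what reduces the smearing around a single frequency.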

Thanks for all this! It works like a charm…

And obviously the API call I was looking for is indeed performFrequencyOnlyForwardTransform.

Regards