FFT to visual band analyzer - Help needed

Hello folks!

Before we start, please be aware that I have very little knowledge of audio processing.

I’d love to get a stylish 2D band analyzer that I’d connect between a streamer (audio source) and a Mac mini. I’ve looked for weeks, and the only stylish one I’ve found is this one: audioMotion.js

Here’s an example of what it looks like :point_down:

Unfortunately, as it’s developed in JS and runs in the browser, we’re experiencing too much latency. We’ve tried using a proper audio interface, upgrading to the latest Mac mini, and so on, but there’s always a latency noticeable enough to kill it all.

Yesterday I started reproducing the screenshot above with JUCE, but I’m running into a couple of issues and would love some of your help.

1. Tempered scaling

Since audioMotion is open-source, I’ve been able to dig through its code to find out how it “spreads” the FFT data across these bars (since, as I found out, naively spreading the frequencies linearly doesn’t work well :sweat_smile:)

According to the code (audioMotion-analyzer/audioMotion-analyzer.js at master · hvianna/audioMotion-analyzer · GitHub), it looks like it uses an equal temperament scale, which, from what I’ve read, seems like a great choice for spreading frequencies into intervals.

Unfortunately, my result doesn’t look close to theirs. Here’s how I’m doing it:

    // generate a table of frequencies based on the equal tempered scale
    // This is done only once - out of the hot loop
    void buildTemperedScale() {
        float root24 = pow(2.0f, 1.0f / 24.0f);
        float c0 = 440.0f * pow(root24, -114.0f); // ~16.35 Hz
        float freq = 0.0f;
        // Number of notes to group
        int groupNotes = 6;
        
        int i = 0;
        
        // TODO: Replace 22 000 with Max Frequency config var
        // TODO: Replace 20 with Min Frequency config var
        while ((freq = c0 * pow(root24, i)) <= 22000.0f) {
            if (freq >= 20.0f && i % groupNotes == 0) {
                temperedScale.push_back(freq);
            }
            i++;
        }
    }

Then, when reading the fftData, here’s how I’m aggregating it:

    int freqToBarIndex(double freq) {
        for (int i = 0; i < temperedScale.size(); ++i) {
            if ((float) freq <= temperedScale[i]) {
                return i;
            }
        }
        
        return (int) temperedScale.size();
    }
    
    void drawNextFrameOfSpectrum()
    {
        // first apply a windowing function to our data
        window.multiplyWithWindowingTable (fftData, fftSize);       // [1]
        
        // then render our FFT data..
        forwardFFT.performFrequencyOnlyForwardTransform (fftData);  // [2]
        
        auto mindB = -100.0f;
        auto maxdB =    0.0f;
        
        for (int i = 0; i < temperedScale.size(); ++i) {
            barAmount[i] = 0.0f;
            barTotal[i] = 0.0f;
        }
        
        for (int i = 1; i < fftSize; i++) {
            auto freq = (_sampleRate * i) / fftSize;
            auto barIndex = freqToBarIndex(freq);
            auto level = juce::jmap (juce::jlimit (mindB, maxdB, juce::Decibels::gainToDecibels (fftData[i])
                                                   - juce::Decibels::gainToDecibels ((float) fftSize)),
                                     mindB, maxdB, 0.0f, 1.0f);
            
            barAmount[barIndex]++;
            barTotal[barIndex] += level;
        }
        
        for (int i = 0; i < temperedScale.size(); i++) {
            if (barAmount[i] == 0) {
                scopeData[i] = 0;
            } else {
                scopeData[i] = barTotal[i] / barAmount[i];
            }
        }
        
    }

Then, finally for the rendering:

    void drawFrame (juce::Graphics& g)
    {
        auto width  = getLocalBounds().getWidth();
        auto height = getLocalBounds().getHeight();
        
        for (int i = 1; i < temperedScale.size(); ++i)
        {            
            auto rectHeight = juce::jmap (scopeData[i], 0.0f, 1.0f, (float) height, 0.0f);
            float rectWidth = (float) width / (float) temperedScale.size(); // cast to avoid integer division
            
            g.drawRect(i * rectWidth, rectHeight, rectWidth, height - rectHeight);
        }
    }

Here’s the result with groupNotes = 2 in buildTemperedScale

Link to a video (Could not embed more than one thing :/)

As you can see, I’m getting “holes” in the frequencies, especially the low ones.

I assume that’s because none of the FFT bin frequencies I get fall into some of the intervals of the scale, but I don’t understand why audioMotion doesn’t have that issue.

Do you have any idea why that’s the case and how I could solve that problem?
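For what it’s worth, one idea I had (totally unverified, and the helper name is mine, not audioMotion’s): at 44.1 kHz with fftSize = 2048, the bin spacing is ~21.5 Hz, which is wider than my low bands, so some bands can never receive a bin. Inverting the mapping — pulling one value per bar by sampling the magnitude spectrum at the bar’s frequency, interpolating between the two nearest bins — would guarantee every bar gets a value. Something like:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (my sketch, not audioMotion's actual code):
// sample the magnitude spectrum at an arbitrary frequency by linearly
// interpolating between the two FFT bins surrounding it.
// `magnitudes` holds the first fftSize/2 + 1 bins.
float magnitudeAtFreq (const std::vector<float>& magnitudes,
                       double freq, double sampleRate, int fftSize)
{
    double bin = freq * fftSize / sampleRate; // fractional bin position
    int lo = (int) bin;
    int hi = lo + 1;

    if (hi >= (int) magnitudes.size())
        return magnitudes.back();

    float t = (float) (bin - lo);
    return magnitudes[lo] * (1.0f - t) + magnitudes[hi] * t;
}
```

drawNextFrameOfSpectrum would then loop over temperedScale instead of over the bins — but I’d love confirmation that this is the right direction.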

2. “Smoothing”

The Web Audio API has a property called smoothingTimeConstant. According to the documentation, it turns out they’re applying a Blackman window to the FFT data. That’s all great, since JUCE already supports that window too. However, I can’t figure out where to apply that smoothing constant, which can be anywhere between 0 and 1.

class MainComponent   : public juce::AudioAppComponent,
private juce::Timer
{
public:
    MainComponent()
    : forwardFFT (fftOrder),
    window (fftSize, juce::dsp::WindowingFunction<float>::blackman) // here
    {
        setOpaque (true);
        setAudioChannels (2, 0);
        startTimerHz (60);
        setSize (700, 500);
        buildTemperedScale();
    }

Is there even a way to do that? Or do I need to apply the smoothing manually? If so, would you have any (not too math-heavy) references for that?

3. Performance…?

Finally, I’m currently applying an FFT of size 2^11 = 2048. However, as soon as I increase it, say to 4096 or 8192, I start massively dropping frames and everything looks rubbish.

As I’m only drawing bars and not doing path/line stuff, the rendering part doesn’t seem to get any more expensive. So, would you have any idea what’s going wrong with my code?

Just so you have it all at least once, I’ll share the whole thing below (sorry, I hate dumping big chunks of code like this, which is why I split out the relevant parts above):

/*
 ==============================================================================
 
 This file is part of the JUCE tutorials.
 Copyright (c) 2020 - Raw Material Software Limited
 
 The code included in this file is provided under the terms of the ISC license
 http://www.isc.org/downloads/software-support-policy/isc-license. Permission
 to use, copy, modify, and/or distribute this software for any purpose with or
 without fee is hereby granted, provided that the above copyright notice and
 this permission notice appear in all copies.
 
 THE SOFTWARE IS PROVIDED "AS IS" WITHOUT ANY WARRANTY, AND ALL WARRANTIES,
 WHETHER EXPRESSED OR IMPLIED, INCLUDING MERCHANTABILITY AND FITNESS FOR
 PURPOSE, ARE DISCLAIMED.
 
 ==============================================================================
 */

/*******************************************************************************
 The block below describes the properties of this PIP. A PIP is a short snippet
 of code that can be read by the Projucer and used to generate a JUCE project.
 
 BEGIN_JUCE_PIP_METADATA
 
 name:             SpectrumAnalyserTutorial
 version:          2.0.0
 vendor:           JUCE
 website:          http://juce.com
 description:      Displays an FFT spectrum analyser.
 
 dependencies:     juce_audio_basics, juce_audio_devices, juce_audio_formats,
 juce_audio_processors, juce_audio_utils, juce_core,
 juce_data_structures, juce_dsp, juce_events, juce_graphics,
 juce_gui_basics, juce_gui_extra
 exporters:        xcode_mac, vs2019, linux_make
 
 type:             Component
 mainClass:        AnalyserComponent
 
 useLocalCopy:     1
 
 END_JUCE_PIP_METADATA
 
 *******************************************************************************/


#pragma once

#include <JuceHeader.h>

//==============================================================================
class MainComponent   : public juce::AudioAppComponent,
private juce::Timer
{
public:
    MainComponent()
    : forwardFFT (fftOrder),
    window (fftSize, juce::dsp::WindowingFunction<float>::blackman)
    {
        setOpaque (true);
        setAudioChannels (2, 0);  // we want a couple of input channels but no outputs
        startTimerHz (60);
        setSize (700, 500);
        buildTemperedScale();
    }
    
    ~MainComponent() override
    {
        shutdownAudio();
    }
    
    //==============================================================================
    void prepareToPlay (int, double sampleRate) override {
        _sampleRate = sampleRate;
    }
    void releaseResources() override          {}
    
    void getNextAudioBlock (const juce::AudioSourceChannelInfo& bufferToFill) override
    {
        if (bufferToFill.buffer->getNumChannels() > 0)
        {
            auto* channelData = bufferToFill.buffer->getReadPointer (0, bufferToFill.startSample);
            
            for (auto i = 0; i < bufferToFill.numSamples; ++i)
            pushNextSampleIntoFifo (channelData[i]);
        }
    }
    
    //==============================================================================
    void paint (juce::Graphics& g) override
    {
        g.fillAll (juce::Colours::black);
        
        g.setOpacity (1.0f);
        g.setColour (juce::Colours::white);
        drawFrame (g);
    }
    
    void timerCallback() override
    {
        if (nextFFTBlockReady)
        {
            drawNextFrameOfSpectrum();
            nextFFTBlockReady = false;
            repaint();
        }
    }
    
    void pushNextSampleIntoFifo (float sample) noexcept
    {
        // if the fifo contains enough data, set a flag to say
        // that the next frame should now be rendered..
        if (fifoIndex == fftSize)               // [11]
        {
            if (! nextFFTBlockReady)            // [12]
            {
                juce::zeromem (fftData, sizeof (fftData));
                memcpy (fftData, fifo, sizeof (fifo));
                nextFFTBlockReady = true;
            }
            
            fifoIndex = 0;
        }
        
        fifo[fifoIndex++] = sample;             // [12]
    }
    
    void die(const std::string& msg)
    {
        std::cerr << msg << std::endl;
        exit(1);
    }
    
    // generate a table of frequencies based on the equal tempered scale
    void buildTemperedScale() {
        float root24 = pow(2.0f, 1.0f / 24.0f);
        float c0 = 440.0f * pow(root24, -114.0f); // ~16.35 Hz
        float freq = 0.0f;
        int groupNotes = 2;
        
        int i = 0;
        
        // TODO: Replace 22 000 with Max Frequency config var
        // TODO: Replace 20 with Min Frequency config var
        while ((freq = c0 * pow(root24, i)) <= 22000.0f) {
            if (freq >= 20 && i % groupNotes == 0) {
                temperedScale.push_back(freq);
            }
            i++;
        }
    }
    
    
    int freqToBarIndex(double freq) {
        for (int i = 0; i < temperedScale.size(); ++i) {
            if ((float) freq <= temperedScale[i]) {
                return i;
            }
        }
        
        return (int) temperedScale.size();
    }
    
    void drawNextFrameOfSpectrum()
    {
        // first apply a windowing function to our data
        window.multiplyWithWindowingTable (fftData, fftSize);       // [1]
        
        // then render our FFT data..
        forwardFFT.performFrequencyOnlyForwardTransform (fftData);  // [2]
        
        auto mindB = -100.0f;
        auto maxdB =    0.0f;
        
        for (int i = 0; i < temperedScale.size(); ++i) {
            barAmount[i] = 0.0f;
            barTotal[i] = 0.0f;
        }
        
        for (int i = 1; i < fftSize; i++) {
            //                        auto skewedProportionX = 1.0f - std::exp (std::log (1.0f - (float) i / (float) scopeSize) * 0.2f);
            //                        auto fftDataIndex = juce::jlimit (0, fftSize / 2, (int) (skewedProportionX * (float) fftSize * 0.5f));
            auto freq = (_sampleRate * i) / fftSize;
            auto barIndex = freqToBarIndex(freq);
            auto level = juce::jmap (juce::jlimit (mindB, maxdB, juce::Decibels::gainToDecibels (fftData[i])
                                                   - juce::Decibels::gainToDecibels ((float) fftSize)),
                                     mindB, maxdB, 0.0f, 1.0f);
            
            barAmount[barIndex]++;
            barTotal[barIndex] += level;
        }
        
        for (int i = 0; i < temperedScale.size(); i++) {
            if (barAmount[i] == 0) {
                scopeData[i] = 0;
            } else {
                scopeData[i] = barTotal[i] / barAmount[i];
            }
        }
        
    }
    
    float smooth(float previousVal, float newVal) {
        return 0.8f * previousVal + 0.2f * newVal;
    }
    
    void drawFrame (juce::Graphics& g)
    {
        auto width  = getLocalBounds().getWidth();
        auto height = getLocalBounds().getHeight();
        
        for (int i = 1; i < temperedScale.size(); ++i)
        {
            //auto x = juce::jmap (scopeData[i - 1], 0.0f, 1.0f, (float) height, 0.0f);
            auto avg = (scopeData[i - 1] + scopeData[i]) / 2;
            
            auto rectHeight = juce::jmap (avg, 0.0f, 1.0f, (float) height, 0.0f);
            float rectWidth = (float) width / (float) temperedScale.size(); // cast to avoid integer division
            
            g.drawRect(i * rectWidth, rectHeight, rectWidth, height - rectHeight);
        }
    }
    
    enum
    {
        fftOrder  = 11,
        fftSize   = 1 << fftOrder,  // 2^fftOrder (order 11 = 2048, 12 = 4096, 13 = 8192, 14 = 16384, ...)
        scopeSize = 256,             // buffer size
        bars = 200
    };
    
    juce::dsp::FFT forwardFFT;                      // [4]
    juce::dsp::WindowingFunction<float> window;     // [5]
    
    float fifo [fftSize];                           // [6]
    float fftData [2 * fftSize];                    // [7]
    int fifoIndex = 0;                              // [8]
    bool nextFFTBlockReady = false;                 // [9]
    float scopeData [bars + 1];                    // [10]
    float barAmount [bars + 1];
    float barTotal [bars + 1];
    std::vector<float> temperedScale;
    float minLog;
    double _sampleRate;
    int minFreq = 20;
    int maxFreq = 22000;
    
    JUCE_DECLARE_NON_COPYABLE_WITH_LEAK_DETECTOR (MainComponent)
};

Finally, if anyone’s interested in helping me, please try not to go too deep into the math, as I’m not great at it and unfortunately won’t understand much of what you say :sweat_smile:

Thank you so much for your help, I’m so excited to get this up and running! :star_struck:

EDIT: It actually looks like the smoothing and the Blackman window have nothing to do with each other.

Here’s how they apply their smoothing, which I’m having a hard time deciphering:

https://webaudio.github.io/web-audio-api/#smoothing-over-time

EDIT2: Found the actual chromium implementation chromium/realtime_analyser.cc at 99314be8152e688bafbbf9a615536bdbb289ea87 · chromium/chromium · GitHub

Need to figure out how to get that working now.