Reading a complete sound file, understanding FFT output


#1

I have two questions; sorry for asking both in one thread. I want to read a sound file and run some analysis on it. I started from the FFT example in the tutorial, but I think I’m not reading the whole file. This is what I do:

void MainComponent::getNextAudioBlock (const AudioSourceChannelInfo& bufferToFill) {    
    if(!analizeFile) {
        bufferToFill.clearActiveBufferRegion();
        return;
    }
    else {
        if(fileBuffer.getNumChannels() > 0) {
            
            auto* channelData = fileBuffer.getReadPointer(0);
            int numSamples = fileBuffer.getNumSamples();
            for(auto i = 0; i<numSamples; i++)
                pushNextSampleIntoFifo (channelData[i]);
            analizeFile = false;
        }
    }
}

After that I fill the fifo:

void MainComponent::pushNextSampleIntoFifo(float sample) noexcept {
    if(fifoIndex == BUFSIZE) {
        analysis.emplace_back(std::make_shared<AnalysisData>());
        myClass->analyze(fifo, chunk, analysis);
        chunk++;
        fifoIndex = 0;
    }
    fifo[fifoIndex++] = sample;
}

Here “analizeFile” is a boolean that is set from the menu. I count the number of chunks of data, and I do the same with another library, libsndfile. With libsndfile I get 377 chunks of data, with JUCE 375, which is why I think I’m not reading the whole file. The buffer size is currently 1024, but I need to be able to change it myself (e.g. to 512 or 2048).
The second question is about the output of dsp::FFT. Inside myClass I perform some FFTs, for example the following:

myClass::myClass(float* input, int bufferSize) {
    // room for 2 * bufferSize floats
    dataFFT = (float*) malloc(sizeof(float) * (bufferSize * 2)); // float*
    fftOrder = log2(bufferSize);
    fftSize = 1 << fftOrder;
    myFFT = std::make_unique<juce::dsp::FFT>(fftOrder);
}

void doFFT(float* input) {
    std::copy(input, input + bufferSize, dataFFT);
    myFFT->performRealOnlyForwardTransform(dataFFT);
    // ... more code
}

Where exactly is the output: in the second half of the array (dataFFT)? I need to process the FFT’s output, and what I need is the same kind of function as the one provided by FFTW’s fftw_plan_r2r_1d.
Regards.


#2

It’s hard to answer your first question without seeing all of your code. It’s a bit strange to be pushing the whole file into a fifo in your audio callback - if fileBuffer already contains the whole file, why not just use that as a buffer?
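A minimal sketch of that idea, assuming the whole file is already in memory as a flat float array. forEachChunk and its callback signature are hypothetical names, not part of JUCE. One thing that is visible in the posted pushNextSampleIntoFifo: analyze only runs when a further sample arrives after the fifo has filled, so the last chunk (full or partial) is never analysed, which may account for part of the 377 vs 375 difference. The sketch zero-pads the final partial chunk instead of dropping it, and the chunk size is a plain parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of JUCE): walks numSamples samples in blocks of
// chunkSize, zero-pads the final block and returns the number of blocks visited.
std::size_t forEachChunk (const float* samples, std::size_t numSamples, std::size_t chunkSize,
                          const std::function<void (const std::vector<float>&, std::size_t)>& analyse)
{
    assert (chunkSize > 0);
    std::vector<float> chunk (chunkSize);
    std::size_t chunkIndex = 0;

    for (std::size_t start = 0; start < numSamples; start += chunkSize, ++chunkIndex)
    {
        // The last block may hold fewer than chunkSize samples.
        const std::size_t count = std::min (chunkSize, numSamples - start);
        std::copy (samples + start, samples + start + count, chunk.begin());
        std::fill (chunk.begin() + static_cast<std::ptrdiff_t> (count), chunk.end(), 0.0f);
        analyse (chunk, chunkIndex);
    }

    return chunkIndex;
}
```

With the buffer from the first post this could be called as forEachChunk (fileBuffer.getReadPointer (0), (size_t) fileBuffer.getNumSamples(), 1024, ...), once, right after loading the file rather than from getNextAudioBlock.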

To your second question: dataFFT is used for both input and output. The output will be in complex interleaved form (i.e. dataFFT[2*i+0] will be the real component of the i-th complex number and dataFFT[2*i+1] will be the imaginary component), so you cannot compare it to fftw_plan_r2r_1d. It’s more like FFTW’s fftw_plan_dft_r2c_1d (in fact, if dontCalculateNegativeFrequencies=true then fftw_plan_dft_r2c_1d and performRealOnlyForwardTransform will have identical output, except maybe for a constant scaling factor).
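To make the interleaved layout concrete, here is a small sketch that reads re/im pairs out of such an array and turns them into magnitudes. binMagnitudes is a hypothetical helper name; it only assumes the array holds numBins complex values stored as re, im, re, im, ...

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: interleaved holds numBins complex values as re, im, re, im, ...
std::vector<float> binMagnitudes (const float* interleaved, std::size_t numBins)
{
    assert (interleaved != nullptr || numBins == 0);
    std::vector<float> magnitudes (numBins);

    for (std::size_t i = 0; i < numBins; ++i)
    {
        const float re = interleaved[2 * i];       // real part of bin i
        const float im = interleaved[2 * i + 1];   // imaginary part of bin i
        magnitudes[i] = std::hypot (re, im);
    }

    return magnitudes;
}
```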

If you want something similar to fftw_plan_r2r_1d then you should use performFrequencyOnlyForwardTransform.


#3

@fabian Thanks for your answer. I’m reading the file in the following way:

void MainComponent::openFile() {
    shutdownAudio();
    String file;
    FileChooser chooser ("Select a soundFile.", fFileLocation, "*.wav, *.aiff");
    if (chooser.browseForFileToOpen()) {
        fFileLocation = chooser.getResult();
        file = fFileLocation.getFullPathName();
        std::unique_ptr<AudioFormatReader> reader (formatManager.createReaderFor (file));
        
        if (reader.get() != nullptr) {
            duration = reader->lengthInSamples / reader->sampleRate;
            fileBuffer.setSize (reader->numChannels, (int) reader->lengthInSamples);
            reader->read (&fileBuffer, 0, (int) reader->lengthInSamples, 0,  true,  true);
            setAudioChannels (reader->numChannels, 0);
        }
    }
}

Yes, it makes no sense to do this in the audio callback. Here I’m reading an entire sound file into a buffer. Could I run into problems if someone reads a very long sound file (e.g. 10 or 20 minutes at a sample rate of 96 kHz)? Could the application run out of memory?
About the FFT: after some experiments, I get the same result as fftw_plan_r2r_1d(bufferSize, dataTime, dataFFT, FFTW_R2HC, FFTW_ESTIMATE); if I do the following:

void doFFT(float* input) {
    int middle = totalFFTSize / 2;
    int quarter = middle / 2;
    std::copy(input, input + bufferSize, dataTemp);
    dataTimeFFT->performRealOnlyForwardTransform(dataTemp);
    for (int i = 0; i < middle; i++) {
        if (i < quarter)
            dataFFT[i] = dataTemp[i * 2];              // real part of bin i
        else
            dataFFT[i] = dataTemp[(i * 2) + 1] * (-1); // negated imaginary part of bin i
    }
}

dataFFT now has the same output as with FFTW r2r. I printed both outputs and compared them; with this operation I always get the same result. I think you don’t support arbitrary FFT sizes, such as 3072 (1024 + 2048), do you? Or are there any tricks to do that? I need this size too (bufferSize + bufferSize/2), and the same kind of transformation (forward and inverse).
Regards
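For comparison, a sketch of the same repacking written directly against the halfcomplex order that FFTW documents for FFTW_R2HC: r0, r1, ..., r(n/2), i(n/2 - 1), ..., i1 for even n. toHalfComplex is a hypothetical helper name, and it assumes the interleaved array holds at least bins 0 to n/2 in the re/im layout described in the reply above. It only needs the non-negative frequencies. If middle in the posted loop equals the FFT size, index n/2 there receives the negated imaginary part of the Nyquist bin (zero for real input), whereas FFTW's layout stores that bin's real part at index n/2, so that one element is worth comparing separately.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: packs interleaved bins 0..n/2 (re, im, re, im, ...) into
// halfcomplex order r0, r1, ..., r(n/2), i(n/2 - 1), ..., i1. Assumes n is even.
std::vector<float> toHalfComplex (const float* interleaved, std::size_t n)
{
    assert (n >= 2 && n % 2 == 0);
    std::vector<float> halfComplex (n);

    for (std::size_t k = 0; k <= n / 2; ++k)
        halfComplex[k] = interleaved[2 * k];           // real parts, including the Nyquist bin

    for (std::size_t k = 1; k < n / 2; ++k)
        halfComplex[n - k] = interleaved[2 * k + 1];   // imaginary parts, in reverse order

    return halfComplex;
}
```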


#4

I can’t spot anything wrong with your code at first glance.

As for running out of memory: if that happened, your app would very likely crash immediately, so I don’t think that’s the issue.
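For a rough sense of scale on the out-of-memory question, the size of a fully loaded file can be estimated up front. This is only a sketch: bytesForFloatBuffer is a hypothetical helper, and it assumes samples are held as 4-byte floats, one array per channel.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed to hold a whole file as float samples.
// Assumes sizeof (float) == 4, which holds on common desktop platforms.
std::uint64_t bytesForFloatBuffer (double seconds, double sampleRate, unsigned numChannels)
{
    assert (seconds >= 0.0 && sampleRate > 0.0);
    const auto samplesPerChannel = static_cast<std::uint64_t> (seconds * sampleRate);
    return samplesPerChannel * numChannels * sizeof (float);
}
```

A 20-minute stereo file at 96 kHz comes to 20 * 60 * 96000 * 2 * 4 = 921,600,000 bytes, a bit under 1 GB.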

As for FFT sizes that are not a power of two: no, we don’t support them. But you have two options:

1. You can pad your input with zeros so that the buffer’s size is 2^n. The downside is that the frequency separation of two frequency bins will be that of a 2^n FFT.

2. You can use the chirp-z transform to break down a normal FT (with arbitrary size) into three FFTs (where one of them can usually be calculated beforehand). With this approach you will still need to pad the data with zeros to the next 2^n size, but the result will be as if you hadn’t padded them.
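A sketch of the second option, using Bluestein's formulation of the chirp-z idea. It is illustrative only and not JUCE code: fftPow2 is a plain textbook radix-2 FFT standing in for whatever power-of-two FFT you have, and dftAnySize is a hypothetical name. The three FFTs are the two forward transforms and the one inverse transform. The transform of b depends only on the size, so it could be computed once and reused.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// Plain in-place radix-2 FFT; stands in for any power-of-two FFT routine.
void fftPow2 (std::vector<Complex>& a, bool inverse)
{
    const std::size_t n = a.size();
    assert (n != 0 && (n & (n - 1)) == 0);        // size must be a power of two

    for (std::size_t i = 1, j = 0; i < n; ++i)    // bit-reversal permutation
    {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap (a[i], a[j]);
    }

    const double pi = std::acos (-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1)
    {
        const double angle = (inverse ? 2.0 : -2.0) * pi / static_cast<double> (len);
        const Complex step (std::cos (angle), std::sin (angle));
        for (std::size_t i = 0; i < n; i += len)
        {
            Complex w (1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k, w *= step)
            {
                const Complex u = a[i + k], v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
            }
        }
    }

    if (inverse)
        for (auto& value : a) value /= static_cast<double> (n);
}

// Forward DFT of arbitrary length n, computed as a convolution with a chirp.
std::vector<Complex> dftAnySize (const std::vector<Complex>& x)
{
    const std::size_t n = x.size();
    if (n == 0) return {};

    std::size_t m = 1;
    while (m < 2 * n - 1) m <<= 1;                // padded power-of-two size

    const double pi = std::acos (-1.0);
    std::vector<Complex> chirp (n);               // chirp[k] = exp(-i * pi * k^2 / n)
    for (std::size_t k = 0; k < n; ++k)
    {
        // k^2 is reduced mod 2n, which leaves the value unchanged but keeps the angle small
        const double phase = pi * static_cast<double> ((k * k) % (2 * n)) / static_cast<double> (n);
        chirp[k] = Complex (std::cos (phase), -std::sin (phase));
    }

    std::vector<Complex> a (m), b (m);            // both zero-padded to length m
    for (std::size_t k = 0; k < n; ++k) a[k] = x[k] * chirp[k];
    b[0] = std::conj (chirp[0]);
    for (std::size_t k = 1; k < n; ++k) b[k] = b[m - k] = std::conj (chirp[k]);

    fftPow2 (a, false);
    fftPow2 (b, false);                           // could be precomputed per size
    for (std::size_t i = 0; i < m; ++i) a[i] *= b[i];
    fftPow2 (a, true);

    std::vector<Complex> result (n);
    for (std::size_t k = 0; k < n; ++k) result[k] = a[k] * chirp[k];
    return result;
}
```

For the 3072-point case from this thread, the padded size m in this sketch would be 8192, since it has to be at least 2 * 3072 - 1.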