DSP beginner asking for assistance with PSOLA

seanwayland · October 12, 2018, 4:37pm

Hi all,
I am trying to build a pitch shifting audio plugin.

I took the audio input tutorial and edited like this

    void getNextAudioBlock (const AudioSourceChannelInfo& bufferToFill) override
    {
        auto* device = deviceManager.getCurrentAudioDevice();
        auto activeInputChannels  = device->getActiveInputChannels();
        auto activeOutputChannels = device->getActiveOutputChannels();
        auto maxInputChannels  = activeInputChannels .getHighestBit() + 1;
        auto maxOutputChannels = activeOutputChannels.getHighestBit() + 1;

        auto level = (float) levelSlider.getValue();
        // values for the pitch-shift routine
        float shift = 2; // up an octave 
        long sr = device->getCurrentSampleRate(); // samplerate 
        long os = 32; // oversampling 
        long numSamps = bufferToFill.numSamples; // size of buffer 
        long fftSize = 2048; // fft window size 

        for (auto channel = 0; channel < maxOutputChannels; ++channel)
        {
            if ((! activeOutputChannels[channel]) || maxInputChannels == 0)
            {
                bufferToFill.buffer->clear (channel, bufferToFill.startSample, bufferToFill.numSamples);
            }
            else
            {
                auto actualInputChannel = channel % maxInputChannels; // [1]

                if (! activeInputChannels[channel]) // [2]
                {
                    bufferToFill.buffer->clear (channel, bufferToFill.startSample, bufferToFill.numSamples);
                }
                else // [3]
                {
                    auto* inBuffer = bufferToFill.buffer->getReadPointer (actualInputChannel,
                                                                          bufferToFill.startSample);
                    auto* outBuffer = bufferToFill.buffer->getWritePointer (channel, bufferToFill.startSample);
                    //for (auto sample = 0; sample < bufferToFill.numSamples; ++sample)
                    //    outBuffer[sample] = inBuffer[sample] * random.nextFloat() * level;
                    float *inbuf = (float *)inBuffer;
                    float *outbuf = (float *)outBuffer;
                    
//void smbPitchShift(float pitchShift, long numSampsToProcess, long fftFrameSize, long osamp, float sampleRate, float *indata, float *outdata)
                    // call routine with values I added 
                    smbPitchShift(shift, numSamps, fftSize, os, sr, inbuf, outbuf);
                }
            }
        }
    }

I am calling this routine :
http://blogs.zynaptiq.com/bernsee/repo/smbPitchShift.cpp
The result seems to be transposed up an octave, as it should be, but it’s pretty noisy.

Casting like this seems a bit iffy, but it compiles and makes noise:

float *inbuf = (float *)inBuffer;
float *outbuf = (float *)outBuffer;

Do I need to interpolate between the bins or something …
Not sure where to go from here.
Perhaps I need to read a DSP book !
Thanks for your replies !
Sean

CrushedPixel · October 14, 2018, 5:25pm

Hey Sean,

I suspect that the smbPitchShift routine is indeed meant for processing an entire buffered audio file, and not only small, continuous buffers. Try changing the sample buffer size (you can do it in the standalone program’s options), and see if the noise changes frequency.

I’ve been working on pitch-shifting software (auto-tune) for the past 6 months myself, and believe me when I tell you I tried cheating my way around actually having to know what I’m doing, but that didn’t work out. If you are serious about this (or any other DSP project, for that matter), you need to know how the pitch shifting method you are implementing works, and code it yourself.

The smbPitchShift routine implements a frequency-domain pitch shifting method using fast Fourier transforms (FFTs). If you choose to implement a similar algorithm, I can only recommend the ffts library (specifically, linkotec’s fork), which implements high-performance FFTs under a permissive license, allowing you to use it in commercial applications.

Good luck on the journey that lies ahead of you. Pitch shifting is indeed a complex (and equally exciting) endeavour!

seanwayland · October 14, 2018, 5:39pm

Thanks CrushedPixel.
I was thinking the same as you on my morning work, but I was also wondering why that routine would have a buffer in it if it was meant to process an entire file.

Thankfully I started another thread about threading which gave me some hints …

I made this small change

        //for (auto channel = 0; channel < maxOutputChannels; ++channel)
        for (auto channel = 0; channel < 1; ++channel)

The noise has gone and it seems to work with a bit of latency which I would expect.
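A likely explanation for why the single-channel loop removes the noise: smbPitchShift keeps its FIFOs and phase accumulators in static variables, so calling it once per channel feeds two different signals through one shared set of state. The sketch below shows the effect with a much simpler routine. `delayBlockShared` and `DelayPerChannel` are hypothetical names made up for illustration and are not part of smbPitchShift or JUCE; the point is only that state owned by an object (one object per channel) keeps the streams separate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a routine with static state: a one-sample delay
// whose memory is shared by every caller, whichever channel they process.
void delayBlockShared (const float* in, float* out, std::size_t numSamples)
{
    static float previous = 0.0f; // one copy for ALL channels

    for (std::size_t i = 0; i < numSamples; ++i)
    {
        out[i] = previous;
        previous = in[i];
    }
}

// The same delay with its state stored in the object, so creating one
// instance per channel stops the channels from overwriting each other.
class DelayPerChannel
{
public:
    void processBlock (const float* in, float* out, std::size_t numSamples)
    {
        assert (in != nullptr && out != nullptr);

        for (std::size_t i = 0; i < numSamples; ++i)
        {
            out[i] = previous;
            previous = in[i];
        }
    }

private:
    float previous = 0.0f; // per-instance, i.e. per-channel, state
};
```

With the shared version, the first output sample of the right channel is the last input sample of the left channel. In a phase vocoder the same kind of leak happens to whole FIFOs and to the stored phases, which is audible as noise. Wrapping the routine's state in a class and keeping one instance per channel is one way to get stereo back.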

It doesn’t sound amazing with polyphonic input but at least I made some progress!

I have a general understanding of what the algorithm is doing. It’s passing a kernel through the buffer and using it to detect periodicity in the waveform. Not enough to build the ultimate product but I will keep battling.
The version I built using PYO sounds good enough that I want to continue!
Perhaps when I finish my computer science degree in a year I can consider some further post-grad DSP study of some sort …

Onwards!
Sean

DaveH · October 14, 2018, 10:42pm

I came to the conclusion long ago that smbPitchShift was a misleading listing, almost designed to throw people off the scent. Because the absolute best way to pitch shift is to time shift THEN change the playback rate. I spent years researching my own algorithm. Yes, I said YEARS! Good luck!
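To illustrate the second half of that approach (changing the playback rate), here is a minimal sketch of resampling by linear interpolation. It assumes "time shift" above means time-stretching, and it does not show the time-stretching step at all. `resampleLinear` is a hypothetical helper written for this example; real resamplers normally use better interpolation and anti-alias filtering.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reads `input` at `ratio` input samples per output sample.
// ratio = 2.0 plays back twice as fast (one octave up, half the length);
// ratio = 0.5 plays back at half speed (one octave down, double the length).
std::vector<float> resampleLinear (const std::vector<float>& input, double ratio)
{
    assert (ratio > 0.0);

    std::vector<float> output;
    if (input.size() < 2)
        return output;

    const double lastIndex = static_cast<double> (input.size() - 1);

    for (std::size_t n = 0; ; ++n)
    {
        // Compute the read position from n to avoid accumulating rounding error.
        const double pos = static_cast<double> (n) * ratio;
        if (pos > lastIndex)
            break;

        const auto index = static_cast<std::size_t> (pos);
        const double frac = pos - static_cast<double> (index);

        const float a = input[index];
        const float b = (index + 1 < input.size()) ? input[index + 1] : a;

        // Blend the two neighbouring samples according to the fractional part.
        output.push_back (static_cast<float> (a + frac * (b - a)));
    }

    return output;
}
```

For an octave up with the original duration, the signal would first be time-stretched to twice its length and then read back with `ratio = 2.0`.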

seanwayland · October 15, 2018, 2:22am

Well, yours does pretty much what I intended to build (without the ADSR and sustain pedal). Sounds great and the latency is good!
I might use my time more wisely and solve another problem!
Sean

DaveH · October 15, 2018, 8:51am

PSOLA was designed for working in the time domain. It’s mainly for voices and certainly only for single notes. The main aspect of it is the pitch detection part, which I haven’t looked at in depth. I believe people have run several pitch detection routines at once and discarded any result that changes too rapidly; for example, one technique may report sudden octave changes, which are unnatural for a human voice. I’m sure there are a few caveats like that, but it’s all part of the fun.
Getting in deep with pitch shifting opens up a world of psycho-acoustics, where mathematics can’t model how the brain works in a linear way. Now THAT’S a rabbit hole.
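As a rough idea of what the pitch detection step can look like, here is a minimal autocorrelation sketch for a single monophonic frame. This is one common textbook approach, not necessarily what any particular PSOLA implementation uses. `estimatePitchHz` and its default search range of 60 to 1000 Hz are assumptions made for this example, and it does nothing to reject octave jumps.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Estimates the fundamental frequency of one frame by finding the lag at
// which the frame correlates best with a delayed copy of itself.
// Returns 0.0 if the frame is too short or no positive correlation is found.
double estimatePitchHz (const std::vector<float>& frame, double sampleRate,
                        double minHz = 60.0, double maxHz = 1000.0)
{
    assert (sampleRate > 0.0 && minHz > 0.0 && maxHz > minHz);

    // A higher frequency means a shorter period, so maxHz gives the smallest lag.
    const auto minLag = static_cast<std::size_t> (sampleRate / maxHz);
    const auto maxLag = static_cast<std::size_t> (sampleRate / minHz);

    if (minLag == 0 || frame.size() <= maxLag)
        return 0.0;

    std::size_t bestLag = 0;
    double bestScore = 0.0;

    for (std::size_t lag = minLag; lag <= maxLag; ++lag)
    {
        double score = 0.0;

        for (std::size_t n = 0; n + lag < frame.size(); ++n)
            score += static_cast<double> (frame[n]) * static_cast<double> (frame[n + lag]);

        if (score > bestScore)
        {
            bestScore = score;
            bestLag = lag;
        }
    }

    return bestLag > 0 ? sampleRate / static_cast<double> (bestLag) : 0.0;
}
```

The result is quantised to whole-sample lags, so it is only approximate. Running a second detector alongside it and discarding estimates that suddenly jump by an octave, as described above, would be layered on top of something like this.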

xenakios · October 15, 2018, 12:26pm

I am a bit confused because the thread title says PSOLA but then you mention smbPitchShift, which is an FFT/phase vocoder based process…

DaveH · April 18, 2021, 10:02pm

PSOLA is based in the time domain. It was the best way to do pitch shifting in the 1980s.

It’s great if you can follow the pitch accurately, which is easier these days! I hear people use more than one analysis technique to reduce octave jumps and missed pitch contours.

If you use an FFT, you also have to use overlap-add because of windowing.
Sorry for the very brief reply; I feel the topic is just too large to post on a JUCE forum.
And… after ten years, I’m a bit rusty on the details.
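To make the overlap-add remark concrete, the sketch below sums Hann windows placed at a fixed hop and shows that, once enough frames overlap, the sum is constant. That is why windowed frames can be added back together without amplitude ripple. `overlappedWindowSum` is a hypothetical helper for this check only. It uses the periodic Hann form `0.5 - 0.5 * cos(2*pi*k/N)`, and it applies the window once, whereas a phase vocoder typically windows both before the FFT and after the inverse FFT.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Adds `numFrames` Hann windows of length `frameSize`, each starting `hop`
// samples after the previous one, and returns the summed envelope.
std::vector<double> overlappedWindowSum (std::size_t frameSize, std::size_t hop,
                                         std::size_t numFrames)
{
    assert (frameSize > 0 && hop > 0 && numFrames > 0);

    const double pi = 3.14159265358979323846;
    std::vector<double> sum (hop * (numFrames - 1) + frameSize, 0.0);

    for (std::size_t f = 0; f < numFrames; ++f)
    {
        for (std::size_t k = 0; k < frameSize; ++k)
        {
            const double phase = 2.0 * pi * static_cast<double> (k)
                                 / static_cast<double> (frameSize);
            sum[f * hop + k] += 0.5 - 0.5 * std::cos (phase);
        }
    }

    return sum;
}
```

With a hop of half the frame size the steady-state sum is 1, and with a hop of a quarter of the frame size it is 2, so the output only needs a fixed gain correction. The first and last partial frames are not fully overlapped, which is one source of the start-up latency.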
