Realtime Rubberband Pitchshifter - Working Example

Hi All,

I’ve been meaning to try Rubber Band for pitch shifting in a plugin for a while. Various posts on the forum say this can’t be done in real time because of the delay of several hundred milliseconds, so I thought I would give it a go anyway and post some functional code here, as I haven’t seen any JUCE examples. In return, there are a couple of issues that I hope someone with a little more knowledge of Rubber Band could help me with, as I have only spent a day with it.

My Aim
I wanted to create a pitch shifter with the possibility of using fast modulation.

My Implementation
Once I had Rubber Band up and running with the process() and retrieve() functions, I noticed that the latency was incredibly inconsistent. Feeding the required number of samples into the input doesn’t mean you will get that same number of samples out the other end. As a result, Rubber Band’s internal buffers filled up with lower pitches and emptied faster with higher pitches. Ring buffers on their own don’t solve this latency problem either: over time I found the latency just kept gradually increasing.

I looked at their example Linux plugin to see how they deal with this. The solution is to modulate the stretch ratio depending on the number of samples available in the output ring buffer. After a little trial and error I came up with two “modes”: a low latency one, which tolerates the buffers running nearly empty, and one suited to smooth modulation (the output empties rapidly when you change pitch, so the extra latency gives a bit more leeway). Here comes the first problem… Although the smooth mode prevents any tearing due to lack of output samples and is fine on many sources, with some sources (synths with loads of harmonics) there is a vinyl-like crackle when changing the pitch (more prominent when the change is gradual), which I think is coming from inside Rubber Band. Does anyone have suggestions for different Rubber Band settings that might eliminate this (possibly) internal crunchiness?
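
As a rough standalone sketch of that fill-level rule (chooseTimeRatio is a hypothetical helper name of mine, not something from Rubber Band or JUCE; the 1.1 / 0.9 / 1.0 targets are the ones used in the class further down):

```cpp
#include <cassert>

// Hypothetical helper: pick a target time ratio from the number of samples
// currently waiting in the output ring buffer.
inline double chooseTimeRatio (int readSpace, int smallestAcceptable, int largestAcceptable)
{
    assert (smallestAcceptable <= largestAcceptable);

    if (readSpace < smallestAcceptable)
        return 1.1;   // Output is running dry: stretch so more samples come out.
    if (readSpace > largestAcceptable)
        return 0.9;   // Output is piling up: compress so fewer samples come out.
    return 1.0;       // Inside the acceptable window: leave the timing alone.
}
```

With maxSamples = 256, the low latency mode would call this with a window of 256 to 768 samples and the smooth mode with 512 to 1024. The returned value is only a target; in the class it is passed through a smoother before it reaches setTimeRatio().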

The other issue is the potentially random amount of latency. I just wanted to check I hadn’t missed something that would make it a little more constant. I’d improve it by putting a variable delay line on the dry signal and monitoring the latency of the wet signal, so that the two arrive at roughly the same time when dryCompensationDelay is enabled. It’s not very good at the moment for a percussive dry/wet mix, but it works for other sources!
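
One possible way to monitor the wet latency, as an untested sketch: WetLatencyEstimator is a hypothetical class of mine, and treating the latency as "samples queued in the input ring buffer + samples queued in the output ring buffer + the stretcher's reported latency" is an assumption, not something I have verified against Rubber Band's internals.

```cpp
#include <cassert>

// Hypothetical helper: keeps a smoothed estimate of the wet path latency in samples.
class WetLatencyEstimator
{
public:
    // smoothingCoeff is a one-pole coefficient in (0, 1]; smaller means slower tracking.
    explicit WetLatencyEstimator (double smoothingCoeff = 0.05) : coeff (smoothingCoeff)
    {
        assert (coeff > 0.0 && coeff <= 1.0);
    }

    // Call once per block. ASSUMPTION: latency ~ queued input + queued output + stretcher latency.
    double update (int queuedInput, int queuedOutput, int stretcherLatency)
    {
        const double instant = static_cast<double> (queuedInput + queuedOutput + stretcherLatency);

        if (! primed)
        {
            estimate = instant;                       // First measurement: no history to smooth against.
            primed = true;
        }
        else
        {
            estimate += coeff * (instant - estimate); // One-pole smoothing towards the new measurement.
        }
        return estimate;
    }

private:
    double coeff;
    double estimate = 0.0;
    bool primed = false;
};
```

The smoothed value could then be handed to the dry delay (for example via dryWet->setWetLatency()) instead of the fixed figure, though moving a delay time while audio is running may introduce its own artefacts, so the smoothing amount would need tuning by ear.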

To get it to work I also created a simple RingBuffer class which would give me the buffer pointers I needed etc… This could be improved to make it a little more “safe” but it seems to work.

My implementation seems to have pretty good latency but I haven’t done extensive testing at different buffer sizes.

Anyway, here’s my code!

Pitch shifter class:

#include <JuceHeader.h>
#include "rubberband/RubberBandStretcher.h"
#include "RingBuffer.h"

class PitchShifter
{
 public:
    /** Set up the pitch shifter. By default the dry signal is not delayed. Enabling dryCompensationDelay delays the dry signal to give it a somewhat similar latency to the wet signal - this is not accurate! Enabling minLatency reduces latency at the expense of potential tearing when the pitch parameter is modulated.
     */
    PitchShifter(int numChannels, double sampleRate, int samplesPerBlock, bool dryCompensationDelay=false, bool minLatency=false)
    {
        rubberband = std::make_unique<RubberBand::RubberBandStretcher>(sampleRate, numChannels, RubberBand::RubberBandStretcher::OptionProcessRealTime | RubberBand::RubberBandStretcher::OptionPitchHighConsistency, 1.0, 1.0);   // Options are bit flags, so combine them with |.
        //rubberband->setMaxProcessSize(samplesPerBlock);
        initLatency = (int) rubberband->getLatency();
        maxSamples = 256;

        input.initialise(numChannels, static_cast<int>(sampleRate));    // Ring buffers hold one second of audio.
        output.initialise(numChannels, static_cast<int>(sampleRate));
        
        juce::dsp::ProcessSpec spec;
        spec.maximumBlockSize = samplesPerBlock;
        spec.numChannels = numChannels;
        spec.sampleRate = sampleRate;
        if (dryCompensationDelay)
        {
            dryWet = std::make_unique<juce::dsp::DryWetMixer<float>>(samplesPerBlock * 3.0 + initLatency);
            dryWet->prepare(spec);
            dryWet->setWetLatency(samplesPerBlock * ((minLatency) ? 2.0 : 3.0) + initLatency);
        } else
        {
            dryWet = std::make_unique<juce::dsp::DryWetMixer<float>>();
            dryWet->prepare(spec);
        }
        
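        timeSmoothing.reset(sampleRate, 0.05);
        mixSmoothing.reset(sampleRate, 0.3);
        pitchSmoothing.reset(sampleRate, 0.1);
        timeSmoothing.setCurrentAndTargetValue(1.0f);   // A default linear SmoothedValue starts at 0, so start both ratios at unity instead.
        pitchSmoothing.setCurrentAndTargetValue(1.0f);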
        
        if (minLatency)
        {
            smallestAcceptableSize = maxSamples * 1.0;
            largestAcceptableSize = maxSamples * 3.0;
        } else
        {
            smallestAcceptableSize = maxSamples * 2.0;
            largestAcceptableSize = maxSamples * 4.0;
        }
    }
    
    ~PitchShifter()
    {
        
    }
    
    /** Pitch shift a juce::AudioBuffer<float>
     */
    void processBuffer (juce::AudioBuffer<float>& buffer)
    {
        dryWet->pushDrySamples(buffer);
        
        pitchSmoothing.setTargetValue(std::pow(2.0f, pitchParam / 12.0f));  // Convert semitone value into pitch scale value: 2^(semitones / 12).
        auto newPitch = pitchSmoothing.skip(buffer.getNumSamples());
        if (oldPitch != newPitch)
        {
            rubberband->setPitchScale(newPitch);
            oldPitch = newPitch;
        }

        for (int sample = 0; sample < buffer.getNumSamples(); sample++) {   // Loop to push samples to input buffer.
            for (int channel = 0; channel < buffer.getNumChannels(); channel++) {
                input.pushSample(buffer.getSample(channel, sample), channel);
                buffer.setSample(channel, sample, 0.0);
                
                if (channel == buffer.getNumChannels() - 1) {
                    auto reqSamples = rubberband->getSamplesRequired();
                    
                    if (reqSamples <= input.getAvailableSamples(0)) {       // Check to trigger rubberband to process when full enough.
                        auto readSpace = output.getAvailableSamples(0);
                        
                        if (readSpace < smallestAcceptableSize) {           // Compress or stretch time when output ring buffer is too full or empty.
                            timeSmoothing.setTargetValue(1.1);
                        } else if (readSpace > largestAcceptableSize) {
                            timeSmoothing.setTargetValue(0.9);
                        } else {
                            timeSmoothing.setTargetValue(1.0);
                        }
                        rubberband->setTimeRatio(timeSmoothing.skip((int) reqSamples));
                        rubberband->process(input.readPointerArray((int) reqSamples), reqSamples, false);   // Process stored input samples.
                    }
                }
            }
        }
        
        auto availableSamples = rubberband->available();
        
        if (availableSamples > 0) {                                         // If rubberband samples are available then copy to the output ring buffer.
            rubberband->retrieve(output.writePointerArray(), availableSamples);
            output.copyToBuffer(availableSamples);
        }
        
        auto availableOutputSamples = output.getAvailableSamples(0);        // Copy samples from output ring buffer to output buffer where available.
        for (int channel = 0; channel < buffer.getNumChannels(); channel++) {
            for (int sample = 0; sample < buffer.getNumSamples(); sample++) {
                if (output.getAvailableSamples(channel) > 0) {
                    buffer.setSample(channel, ((availableOutputSamples >= buffer.getNumSamples()) ? sample : sample + buffer.getNumSamples() - availableOutputSamples), output.popSample(channel));
                }
            }
        }
        
        if (pitchParam == 0 && mixParam != 100.0) {                         // Ensure no phasing with mix occurs when pitch is set to +/-0 semitones.
            mixSmoothing.setTargetValue(0.0);
        } else
        {
            mixSmoothing.setTargetValue(mixParam/100.0);
        }
        dryWet->setWetMixProportion(mixSmoothing.skip(buffer.getNumSamples()));
        dryWet->mixWetSamples(buffer);                                      // Mix in the dry signal.
    }
    
    /** Set the wet/dry mix as a % value.
     */
    void setMixPercentage(float newPercentage)
    {
        mixParam = newPercentage;
    }
    
    /** Set the pitch shift in semitones.
     */
    void setSemitoneShift(float newShift)
    {
        pitchParam = newShift;
    }
    
    /** Get the % value of the wet/dry mix.
     */
    float getMixPercentage()
    {
        return mixParam;
    }
    
    /** Get the pitch shift in semitones.
     */
    float getSemitoneShift()
    {
        return pitchParam;
    }
    
    /** Get the estimated latency. This is an average guess of latency with no pitch shifting but can vary by a few buffers. Changing the pitch shift can cause less or more latency.
     */
    int getLatencyEstimationInSamples()
    {
        return maxSamples * 3.0 + initLatency;
    }
 
 private:
    std::unique_ptr<RubberBand::RubberBandStretcher> rubberband;
    RingBuffer input, output;
    juce::AudioBuffer<float> inputBuffer, outputBuffer;
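    int maxSamples = 256, initLatency = 0, bufferFail = 0, smallestAcceptableSize = 0, largestAcceptableSize = 0;
    float oldPitch = 1.0f, pitchParam = 0.0f, mixParam = 100.0f;   // Defaults so processBuffer() never reads uninitialised values (fully wet mix is an arbitrary choice).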
    std::unique_ptr<juce::dsp::DryWetMixer<float>> dryWet;
    juce::SmoothedValue<float> timeSmoothing, mixSmoothing, pitchSmoothing;
};
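
For reference, the semitone to pitch scale conversion used in processBuffer() is ratio = 2^(semitones / 12). A minimal standalone version (semitonesToPitchScale is a hypothetical name, not part of the class above):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: convert a shift in semitones to a frequency ratio.
// 12 semitones is one octave, i.e. a doubling of frequency.
inline double semitonesToPitchScale (double semitones)
{
    assert (std::isfinite (semitones));
    return std::pow (2.0, semitones / 12.0);
}
```

So 0 semitones gives 1.0 (no shift), +12 gives 2.0, -12 gives 0.5, and +7 (a fifth) gives roughly 1.498.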

Simple ring buffer class:

#include <JuceHeader.h>

class RingBuffer
{
public:
    RingBuffer(){}
    ~RingBuffer(){}
    
    void initialise(int numChannels, int numSamples)
    {
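        readPos.assign(static_cast<size_t>(numChannels), 0);    // Reset both positions to the start of the buffer.
        writePos.assign(static_cast<size_t>(numChannels), 0);
        
        buffer.setSize(numChannels, numSamples);
        pointerArrayBuffer.setSize(numChannels, numSamples);
        buffer.clear();                                         // setSize() leaves newly allocated memory uninitialised by default.
        pointerArrayBuffer.clear();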
    }
    
    void pushSample(float sample, int channel)
    {
        buffer.setSample(channel, writePos[channel], sample);
        
        if (++writePos[channel] >= buffer.getNumSamples()) {
            writePos[channel] = 0;
        }
    }
    
    float popSample(int channel)
    {
        auto sample = buffer.getSample(channel, readPos[channel]);
        
        if (++readPos[channel] >= buffer.getNumSamples()) {
            readPos[channel] = 0;
        }
        return sample;
    }
    
    int getAvailableSamples(int channel)
    {
        if (readPos[channel] <= writePos[channel]) {
            return writePos[channel] - readPos[channel];
        } else
        {
            return writePos[channel] + buffer.getNumSamples() - readPos[channel];
        }
    }
    
    const float* const* readPointerArray(int reqSamples)
    {
        for (int sample = 0; sample < reqSamples; sample++) {
            for (int channel = 0; channel < buffer.getNumChannels(); channel++) {
                pointerArrayBuffer.setSample(channel, sample, popSample(channel));
            }
        }
        return pointerArrayBuffer.getArrayOfReadPointers();
    }
    
    float* const* writePointerArray()
    {
        return pointerArrayBuffer.getArrayOfWritePointers();
    }
    
    void copyToBuffer(int numSamples)
    {
        for (int channel = 0; channel < buffer.getNumChannels(); channel++) {
            for (int sample = 0; sample < numSamples; sample++) {
                pushSample(pointerArrayBuffer.getSample(channel, sample), channel);
            }
        }
    }

private:
    juce::AudioBuffer<float> buffer, pointerArrayBuffer;
    std::vector<int> readPos, writePos;
    JUCE_DECLARE_NON_COPYABLE_WITH_LEAK_DETECTOR (RingBuffer)
};
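
On making the ring buffer a little more “safe”: in the class above, a completely full buffer looks the same as an empty one (readPos == writePos), and pushing past capacity silently overwrites unread samples. One possible direction, as a single-channel sketch with no JUCE dependency (CountedRingBuffer is a hypothetical name, and this is not a drop-in replacement for the class above):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel ring buffer that tracks its fill count explicitly,
// so "full" and "empty" can be told apart. Not thread safe.
class CountedRingBuffer
{
public:
    explicit CountedRingBuffer (std::size_t capacity) : data (capacity, 0.0f)
    {
        assert (capacity > 0);
    }

    // Returns false and drops the sample instead of overwriting unread data.
    bool push (float sample)
    {
        if (count == data.size())
            return false;

        data[writePos] = sample;
        writePos = (writePos + 1) % data.size();
        ++count;
        return true;
    }

    // Returns false and leaves 'out' untouched when nothing is available.
    bool pop (float& out)
    {
        if (count == 0)
            return false;

        out = data[readPos];
        readPos = (readPos + 1) % data.size();
        --count;
        return true;
    }

    std::size_t available() const { return count; }

private:
    std::vector<float> data;
    std::size_t readPos = 0, writePos = 0, count = 0;
};
```

The caller can then decide what to do when push() or pop() fails (for example, output silence on underrun) rather than reading stale samples.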

Hope this can help someone! It should be good if you’re not wanting constant modulation and/or you have non harmonically rich sources. If anyone tries it out and has a solution to the slight crunchiness then please let me know!

Thanks,
David
