a fractional delay line with a few options for the interpolation method

Already have my own, but agree that this would be nice to have as part of the library.

Agreed…

Rail

In the meantime, can we see some example code of a typical delay line? Asking for a friend…

Here’s a quick and dirty example: just a ring buffer with a fractional read index.

```
#include <algorithm>
#include <cmath>

template<class T, int MaxLen>
class DelayLine {
public:
    DelayLine() = default;

    T process (T input)
    {
        // read from the FIFO and interpolate between the two nearest samples
        auto x0 = (int) std::floor (read);
        auto frac = read - (float) x0;
        auto x1 = (x0 + 1) % MaxLen;
        auto y0 = fifo[x0];
        auto y1 = fifo[x1];
        auto output = (1.0f - frac) * y0 + frac * y1;

        // write to the FIFO
        fifo[write] = input;

        // advance the read/write positions
        write = (write + 1) % MaxLen;
        read = std::fmod (read + 1.0f, float (MaxLen));
        return output;
    }

    void setLen (float delaySamples)
    {
        // place the read head `delaySamples` behind the write head
        delaySamples = std::min (std::max (delaySamples, 1.0f), float (MaxLen) - 1.0f);
        read = std::fmod (float (write) - delaySamples + float (MaxLen), float (MaxLen));
    }

private:
    int write = 0;
    float read = 0.0f;
    T fifo[MaxLen] = { static_cast<T> (0) };
};
```

Obviously not optimized, but should give you the idea.

Edit: another method is to use allpass interpolation (an integer delay followed by a first-order allpass, as used in Schroeder-style structures), which looks something like this (someone double-check this; it’s been a minute since I’ve used one…)

```
#include <cmath>

template<class T, int MaxLen>
class AllpassDelayLine {
public:
    AllpassDelayLine() = default;

    T process (T input)
    {
        // integer delay: read the sample written `len` samples ago
        int read = (write + MaxLen - len) % MaxLen;
        auto u = fifo[read];
        fifo[write] = input;
        write = (write + 1) % MaxLen;

        // first-order allpass approximates the fractional part
        auto output = g * u + state;
        state = u - g * output;
        return output;
    }

    void setLen (float delaySamples)
    {
        // someone may want to correct this...
        float whole = std::floor (delaySamples);
        float frac  = delaySamples - whole;
        len = int (whole);
        g = static_cast<T> ((1.0f - frac) / (1.0f + frac));
    }

private:
    T fifo[MaxLen] = { static_cast<T> (0) };
    T state = static_cast<T> (0);
    int write = 0;
    int len = 1;
    T g = 0.5;
};
```

Here’s some literature on fractional delay lines: http://users.spa.aalto.fi/vpv/publications/vesan_vaitos/ch3_pt1_fir.pdf

AFAIK the most common method is to use a (e.g. 4-point) Lagrange interpolator: http://users.spa.aalto.fi/vpv/publications/vesan_vaitos/ch3_pt2_lagrange.pdf

Luckily, a 4-point Lagrange interpolator is also in the JUCE code base: https://docs.juce.com/master/classLagrangeInterpolator.html

But it’s used for resampling, not for fractional delay lines.

Note: a lot of interpolation methods were developed for use on graphics data. They don’t necessarily **sound** good when applied to audio data.

One conclusion I reached for myself: when making “effects box” plug-ins, linear interpolation of a delay line is probably good enough. More advanced interpolation might be called for if you were designing high-end sample rate conversion algorithms, to use on a whole mix (when mastering or whatnot).

Here was one helpful debate I found when researching the subject:

There, and elsewhere online, Olli Niemitalo’s “elephant paper” gets mentioned as one exhaustive study of the audible traits of various interpolation algorithms:

> More advanced interpolation might be called for if you were designing high-end sample rate conversion algorithms, to use on a whole mix (when mastering or whatnot).

Or modulation.

I beg to differ!

IMHO linear interpolation doesn’t sound good. Its frequency response depends on the fraction itself.

So if you want to create a modulated fractional delay (e.g. doppler shifts), you don’t want to have an additional modulation in timbre, which you’ll get with linear interpolation.

If a fractional delay line is added to JUCE, it should sound good.

Hmm, interesting. In what frequency range do you tend to hear that additional modulation in tone? Across the whole spectrum, or mostly just in the high frequencies?

Could you elaborate on that a bit? By a dependent “frequency response” do you mean a deviation from a flat frequency response (i.e. a pattern of resonances and nulls)?

Agreed. Perhaps Olli Niemitalo’s paper might help steer that debate, or which interpolation method to adopt?

Yes, I meant the magnitude response, not the frequency response in general, sorry.

The delay in samples can be split up into whole samples and a fractional part: e.g. `2.3` samples splits into a whole-sample delay of `t_d = 2 samples` and a fractional delay of `t_fd = 0.3 samples`.

We agree, if the fractional part is zero, we have only a sample delay. So there’s of course no change in magnitude response. To be exact, there’s a change only in phase response.

Let’s look at the *extreme case*: a fractional part of `0.5 samples`. With linear interpolation you’ll add to your signal a copy of itself, shifted by one sample, with both attenuated by `6dB` (`0.5` linear gain). The magnitude response of this is a low-pass filter, as it’s basically a moving-average filter: higher frequencies will be attenuated more than lower ones.

Now, if you have a modulated fractional delay, the fractional part will change over time, possibly quite fast, depending on your modulation. This results in a frequency-dependent amplitude modulation, as the low-pass filter described above is engaged from time to time, sometimes more, sometimes less. That’s something you generally don’t want if you simply want to delay your signal.

A 4-point Lagrange interpolator, for example, will try to keep the magnitude response as flat as possible. A 4-point will still fail for strong delay modulations (intense Doppler shifts, or fast-moving sources in space). For those, more points would be even better, but of course at a higher processing cost. However, 4 points are sufficient in general. I use it in a room-simulation plug-in, and I simply limited the maximum speed of sources to 30 m/s to avoid artifacts.

…and it actually has the wrong name. I intended to create a fractional delay, but it was not possible with the LagrangeInterpolator back then, since it had no way of limiting the number of incoming samples, which is crucial for knowing when to wrap around. I was able to provide an improvement, which is now available in JUCE (thanks Jules!)

You can use the LagrangeInterpolator now providing a buffer to wrap over:

```
LagrangeInterpolator interpol;
AudioBuffer<float> buffer;
AudioBuffer<float> delay;
int readPos = 0;
double ratio = 1.0;

auto advance = interpol.process (ratio,
                                 delay.getReadPointer (channel, readPos), // input samples
                                 buffer.getWritePointer (channel),        // output samples
                                 buffer.getNumSamples(),                  // how many samples to produce
                                 delay.getNumSamples() - readPos,         // how many samples until the delay buffer ends
                                 delay.getNumSamples());                  // wrap: how many samples to move back

readPos += advance;
while (readPos >= delay.getNumSamples())
    readPos -= delay.getNumSamples();
```

The interpolator is stateful, so the usual advice applies:

- one interpolator per channel
- keep the interpolator persistent
- if the signal is interrupted (sudden reset of readPos or restart) call
`interpol.reset();`

Thank you for further explaining - I had not thought of the linear interpolator as a lowpass filter before, and I’ve got to think on that a bit now…

I ran some numbers on it, and the results seemed counter-intuitive, so here’s my feedback: it makes sense that the “extreme” case would be when the fractional part is 0.5 samples - that’s the spot farthest from actual data, where you’re making the biggest guess.

However, it’s not making sense that it would be the same as a lowpass/moving average filter, because you’re not feeding the output of the linear interpolation back into itself, the way you would with a one pole lowpass filter. On the next output sample, you are looking at two new data points, and guessing at a value somewhere between them, without being influenced by the previous linear interpolation result.

However, this being DSP, things can be counter-intuitive at times, so I’m willing to be corrected about this.

Low-pass filters don’t have to be IIR filters; they can also be FIR. The filter I was describing is an FIR low-pass filter with the coefficients `b0 = 0.5`, `b1 = 0.5`.

Here’s the frequency response of that moving average filter:

Fractional parts are distributed evenly; there isn’t one that is farthest away from the actual data. They depend on the delay setting, of course, but in general you can regard them as evenly distributed.

(I hope I’ve interpreted your statement correctly; English is not my native tongue.)

Ah, right! I overlooked that. Being from a background in analog electronics, I tend to think of filters intuitively as just IIR filters. Gotta do some reading about FIR one of these days…

So, in that graph, the magnitude response at Nyquist ƒ is only 2-3dB down?

Hmm, I just meant, when calculating that an output sample that is exactly at a 0.5 fraction between 2 known input samples, that is when you are making the biggest guess. I’m not sure what you mean by the fractional parts being distributed evenly… you mean, across the range of delay settings?

Nyquist is at the rightmost data point in that plot. So at Nyquist it would actually be -inf. Looking at the samples of a tone at the Nyquist frequency you’ll get alternating signs: `1 -1 1 -1`. Doing linear interpolation here will result in only zeros, so -inf dB.

Oh, that’s what you meant, sorry! I wouldn’t call it guessing, as you can actually calculate a perfect fractional delay by treating each sample as a sinc function. With oversampling you can shift it and sample it for **every** sample point of your signal. The problem: sinc extends to -inf and +inf in time, so your new filter would be infinitely long. That’s bad. What Lagrange does is basically an approximation, so we don’t need to write as many samples for each input sample (just 4 instead of infinitely many).

You can see this on page 70: http://users.spa.aalto.fi/vpv/publications/vesan_vaitos/ch3_pt1_fir.pdf

I thought that “normalized frequency” was equal to frequency divided by sample rate?

So for example, when sampling at 48k the Nyquist frequency would be 24k. And 24k/48k = 0.5.

One of us is off by a factor of 2. Is this a case of confusing if the samples are real numbers or complex numbers? (And, if the latter, I am probably in over my head.)

It is! The frequency scale shows radians / sample, and the plotted scale goes from 0 to pi, as the x-axis label states: (x pi rad/sample).

pi is the Nyquist frequency. So at 48kHz the rightmost data point would be for pi rad/sample or 24kHz.

Plotting beyond pi wouldn’t add information: the magnitude response would just mirror itself, as the filter coefficients are real-valued.

OK, got it - I just got tripped up by the different units. Didn’t catch that you were doing **radians/sample** instead of **cycles/sample** as the unit of measure. Thanks for explaining.