Resampling stream issues - microstutters

We’ve implemented some audio streaming code that sends floats from point A to point B in chunks of X samples, then resamples each chunk to the destination sample rate if conversion is needed. It mostly works, but the stream has audible “micro-stutters” whenever it was resampled.

Our suspicion is that we’re losing a sample in the final buffer, possibly due to a rounding error, but we can’t seem to nail this down and it’s driving me crazy. Here is the code that resamples the incoming chunks (in our case we only work with mono/stereo, so don’t mind the hard-coded channel logic):

I’ve seen various examples where there is no cast to double and instead a ceil is applied to the result, but it doesn’t seem to matter which we choose; the micro-stutters are still there after recombining the buffers. If the entire source buffer is sent at once there are no audio issues, so it definitely has something to do with resampling and recombining the chunks.

Does anyone have any ideas?

void SampleManager::ResampleBuffer(double p_dblRatio, AudioSampleBuffer& p_roBuffer, int p_iChannels, bool p_CreateNew)
{
	// Adjust the buffer's size based on the sample rate ratio.
	jassert(p_dblRatio > 0);
	int AdjustedNumSamples = (int)((double)p_roBuffer.getNumSamples() / p_dblRatio);

	AudioSampleBuffer temp;
	temp.setSize(p_iChannels, AdjustedNumSamples);
	temp.clear(); // clear after resizing, so processAdding starts from silence

	const float** inputs = p_roBuffer.getArrayOfReadPointers();
	float** outputs = temp.getArrayOfWritePointers();

	if (p_CreateNew)
	{
		LagrangeInterpolator resamptemp1;
		resamptemp1.reset();
		resamptemp1.process(p_dblRatio, inputs[0], outputs[0], temp.getNumSamples());
		if (p_iChannels > 1)
		{
			LagrangeInterpolator resamptemp2;
			resamptemp2.reset();
			resamptemp2.process(p_dblRatio, inputs[1], outputs[1], temp.getNumSamples());
		}
	}
	else
	{
		resampler1->processAdding(p_dblRatio, inputs[0], outputs[0], temp.getNumSamples(), 1.0f);
		if (p_iChannels > 1)
		{
			resampler2->processAdding(p_dblRatio, inputs[1], outputs[1], temp.getNumSamples(), 1.0f);
		}
	}

	p_roBuffer = temp;
}
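
For what it’s worth, the per-chunk size calculation above truncates toward zero, and those fractions add up across chunks. Here is a standalone sketch of just that arithmetic (plain C++, no JUCE; the function names are made up for illustration):

```cpp
#include <cstdint>

// Sketch of the chunk-size arithmetic only (names are illustrative, not JUCE).

// Exact output length for resampling `totalIn` samples from 48 kHz to 44.1 kHz,
// computed in integer arithmetic so there is no rounding.
inline long long exactOutputLength(long long totalIn)
{
    return totalIn * 44100LL / 48000LL;
}

// Output length when each chunk's size is truncated independently,
// as AdjustedNumSamples does above.
inline long long chunkedOutputLength(long long totalIn, int chunkSize)
{
    const double ratio = 48000.0 / 44100.0; // source rate / destination rate
    long long out = 0;
    for (long long i = 0; i < totalIn / chunkSize; ++i)
        out += (long long)((double)chunkSize / ratio); // truncates the fraction
    return out;
}
```

With 100000 input samples, exactOutputLength gives 91875, but chunkedOutputLength with 1000-sample chunks gives only 91800: each chunk yields 918.75 output samples, truncated to 918, so 100 chunks silently drop 75 samples (roughly 1.7 ms of audio spread through the stream).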

EDIT: Here’s another method I found online: OpenShot Library | libopenshot: AudioResampler.cpp Source File

A couple of oddities in this one:

  • new_num_of_samples = round(num_of_samples * dest_ratio) - 1; ← What is the difference between this and the method I posted above? Should there not be just one way of calculating this?
  • // Prepare to play the audio sources (and set the # of samples per chunk to a little more than expected)
    resample_source->prepareToPlay(num_of_samples + 10, 0); ← why +10? Where does this magic number come from?

Another observation: the process method of the resampler is supposed to return the number of input samples it used for the resample, but it returns different values for some reason:

  • InputBufferSize (which would be correct),
  • or InputBufferSize - 1, which I can’t explain at all, since all of our incoming chunks are exactly the same size, so why would it decide not to process one of the samples…

That is easily explained. The resampler is stateful because it interpolates over the last few (five, IIRC) input samples. Due to rounding, if the ratio is not exactly 1.0 the resampler sometimes consumes one input sample fewer than expected, because the fractional read position had not yet advanced far enough to need it.

Hope that is understandable…
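
To make it concrete, here is a small sketch of that effect, tracking only the fractional read position (plain C++ with illustrative names; this is not the actual JUCE internals):

```cpp
#include <cmath>

// How many input samples a stateful resampler consumes to produce `numOut`
// output samples, given a fractional read position carried between calls.
inline int inputsConsumed(double& pos, int numOut, double ratio)
{
    double next = pos + numOut * ratio; // where the read head ends up
    int consumed = (int)std::floor(next) - (int)std::floor(pos);
    pos = next;
    return consumed;
}
```

With ratio = 48000.0 / 44100.0 and numOut = 918, repeated calls consume 999 input samples on some calls and 1000 on others: 918 × ratio ≈ 999.18, and the 0.18 fraction accumulates until one call needs an extra input sample. The return value of process varying by one is that fractional position showing through.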

Hmmm ok, that’s interesting actually… we’ve built a test case where we take a sample recorded at 48k and resample it down to 44.1k using different chunk sizes, and depending on the size it sometimes works perfectly and other times has the micro-stutters.

This would line up with what you’ve explained; perhaps we need to feed the resampler in “magic” sizes so that, for any arbitrary sample-rate shift, it can do its thing without any fractional rounding errors… maybe?

The solution is not to find a magic size, because there is none. Instead, if a call to the resampler does not return enough samples, call it with the next buffer immediately, use the samples you need from the new result, and cache the rest to be used later.

Sorry could you elaborate please? I’m not sure I fully understand… Do you mean something like this? Some example numbers

  1. I have a sound composed of, say, 100k samples at 48k that I want to convert to 44.1k in chunks of 1k, because it’s coming from a stream over a network
  2. I feed the first chunk of 1000 samples into the resampler, and I check the return value of the process method to ensure that it used all of the input samples
  3. In this example the resampler returns 999, so the next time I use it I feed it 1001 samples: the unused last sample from the first chunk plus the 1000 samples from the new chunk?

I feel like I’m close but still off…

Actually I was thinking the reverse way (i.e. all samples are consumed but not enough are produced).
But yes, I think you understood the point.

You need to cache the samples that were not used and use them in the next chunk. So your point 3 is correct.
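
As a sanity check of that caching approach, here is a minimal chunked-resampler sketch in plain C++. It uses simple linear interpolation instead of JUCE’s LagrangeInterpolator (so the audio quality is not comparable), and all the names are made up; the point is only the bookkeeping: unconsumed input stays in a pending buffer and the fractional read position is carried across chunks.

```cpp
#include <vector>
#include <cstddef>

// Sketch only: linear interpolation stands in for the real interpolator.
class ChunkedResampler
{
public:
    explicit ChunkedResampler(double ratio) : ratio(ratio) {}

    // Feed one incoming chunk; returns all output samples that can be
    // produced so far. Unconsumed input is cached for the next call.
    std::vector<float> process(const std::vector<float>& chunk)
    {
        pending.insert(pending.end(), chunk.begin(), chunk.end());

        std::vector<float> out;
        // Interpolating at position `pos` needs pending[i] and pending[i + 1].
        while (pos + 1.0 < (double)pending.size())
        {
            size_t i = (size_t)pos;
            double frac = pos - (double)i;
            out.push_back((float)((1.0 - frac) * pending[i] + frac * pending[i + 1]));
            pos += ratio;
        }

        // Discard fully consumed input; keep the tail for the next chunk.
        size_t consumed = (size_t)pos;
        pending.erase(pending.begin(), pending.begin() + consumed);
        pos -= (double)consumed;
        return out;
    }

private:
    double ratio;               // input samples advanced per output sample
    double pos = 0.0;           // fractional read position into `pending`
    std::vector<float> pending; // cached, not-yet-consumed input samples
};
```

Feeding 100 chunks of 1000 samples at ratio 48000/44100 produces 91875 output samples in total, the same count as resampling the whole 100000-sample buffer in one call, with nothing lost at chunk boundaries.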

IT WORKS!!! Thanks guys very much for your help. This was driving me crazy, and getting this resolved is a helluva way to start off your monday!

Cheers!!
