Processing Audio: Sample by Sample or Buffer by Buffer?

Good Sunday Jucers!

I am refactoring my audio engine, and I am in the process of deciding whether to favour sample-by-sample processing or whole-buffer processing, and/or whether to have a hybrid approach between the two.
i.e.
float MyEngineBlock::audio (float input);
vs.
void MyEngineBlock::processChannel (float* buffer, long bufferSize);
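For concreteness, here is a minimal compilable sketch of the two shapes (the EngineBlock struct and its gain member are invented for illustration). Note that the buffer version can simply wrap the per-sample one, which is more or less what a hybrid approach would formalise:

```cpp
// Hypothetical DSP block exposing both calling conventions.
struct EngineBlock
{
    float gain = 0.5f;

    // Sample-by-sample: one call per sample.
    float audio (float input) { return input * gain; }

    // Buffer-by-buffer: one call per block; here it just wraps audio(),
    // but it could instead use vectorised operations internally.
    void processChannel (float* buffer, long bufferSize)
    {
        for (long i = 0; i < bufferSize; ++i)
            buffer[i] = audio (buffer[i]);
    }
};
```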

Below is a list of the pros of the two approaches, off the top of my head. What I would like to do here is start a discussion to clarify whether some of my points are totally wrong or of little importance, or whether I am missing other important aspects of the conversation entirely. Please please send me any of your 2 cents (or your million dollars :slight_smile: ).

Sample By Sample approach

  • Parameter modulation comes out easier and better, since internal parameters are re-evaluated every sample
  • This approach may give the compiler and the CPU cache more room for optimisation (this is totally empirical, but I suspect there’s a reason why, for example, the guys at Cycling74 made gen~ work on a sample-by-sample basis)

Buffer by Buffer approach

  • Enables optimised vector operations (FloatVectorOperations, IPP, etc.) for simple interconnections (e.g. mixing the outputs of two internal processing blocks)
  • Makes “block switching” easier, since if conditions or virtual function calls are acceptable once per buffer (not so much once per sample)
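Regarding the first point above, this is roughly the kind of contiguous, branch-free loop that a call like FloatVectorOperations::add boils down to (a plain C++ sketch, not the actual JUCE implementation) - exactly the shape that compilers auto-vectorise best:

```cpp
// Mix one channel into another: sequential access, no branches,
// trivially auto-vectorisable by the compiler.
void mixInto (float* dest, const float* src, int numSamples)
{
    for (int i = 0; i < numSamples; ++i)
        dest[i] += src[i];
}
```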

Thanks for reading!
Michelangelo

I just refactored my plugin a few weeks ago from sample-based to buffer-based and wow, what a performance difference! For me, I saw a 285% performance gain. That was all I needed to see.

With Max, the reason has more to do with simplicity for the user. It’s much simpler to write sample-by-sample code if you are newer to writing DSP code, which many gen~ users are.

The rest of the audio engine for Max is definitely block-based and that’s unlikely to change any time soon, for the same efficiency reasons as noted above. There was an interesting talk at ADC by Ian Hobbs that is worth a look. He showed a very different way of doing single-sample processing through variadic templating.

ALWAYS use buffer by buffer for JUCE’s AudioBuffer<> classes (unless you don’t care about performance). If your data is interleaved (which is common in image processing) then ALWAYS use sample by sample.

Why? Spatial locality.

tl;dr: the CPU can process elements way faster when they’re laid out and accessed in a straight line (in simple terms) vs. hopping all over memory.
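A toy illustration of the difference (function names are invented): with planar, non-interleaved buffers a channel is one contiguous run of floats, whereas reading one channel out of interleaved data strides through memory and touches more cache lines per useful sample:

```cpp
// Contiguous (planar) read: the hardware prefetcher loves this.
float sumPlanarChannel (const float* channel, int numSamples)
{
    float total = 0.0f;
    for (int i = 0; i < numSamples; ++i)
        total += channel[i];
    return total;
}

// Strided (interleaved) read of a single channel: each access skips
// over the other channels' samples.
float sumInterleavedChannel (const float* interleaved, int numSamples,
                             int numChannels, int channelIndex)
{
    float total = 0.0f;
    for (int i = 0; i < numSamples; ++i)
        total += interleaved[i * numChannels + channelIndex];
    return total;
}
```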

Thanks for all the answers! I see just a tiny tiny preference for buffer by buffer approach :slight_smile:
In this case though, how do you manage smooth parameter modulation, since all internal block parameters get dereferenced only once per buffer?
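To make the question concrete: the obvious workaround I can think of is to read the target once per buffer and then ramp linearly across the block, so the parameter still moves every sample - something like this bare sketch (the RampedGain struct is invented; I believe JUCE ships a similar helper, SmoothedValue). Is this what people actually do?

```cpp
// Hypothetical ramped parameter: the target is dereferenced once per
// buffer, then interpolated linearly across the block.
struct RampedGain
{
    float current = 0.0f;

    void applyGainRamp (float* buffer, int numSamples, float target)
    {
        const float step = (target - current) / (float) numSamples;

        for (int i = 0; i < numSamples; ++i)
        {
            current += step;      // parameter still moves every sample
            buffer[i] *= current;
        }
    }
};
```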

P.S.
One doubt I have (take a look at the code below): if all the tick() methods are force-inlined, then the approach below is nominally “sample by sample”, but there is no difference from “buffer by buffer”, since all the code ends up inlined into one big buffer-by-buffer loop working on the root AudioSampleBuffer.
Do you think this assumption is correct? If so, the approach below is far cleaner and more flexible in my opinion, as long as one does not allocate/deallocate, lock, etc. inside the blocks.

void processBlock (AudioSampleBuffer& buffer, MidiBuffer& midiMessages)
{
    int frames = buffer.getNumSamples();
    float* pL = buffer.getWritePointer (0);
    float valueIn = 0.0f, valueOut = 0.0f;

    while (frames--)
    {
        valueIn = *pL;
        valueOut = myBlock1->tick (valueIn); // all tick() methods are forcedinline
        valueOut = myBlock2->tick (valueOut);
        // some more blocks
        valueOut = myBlockN->tick (valueOut);
        *pL++ = valueOut;
    }
}

Thanks again!!

But if you are jumping around between the blocks in the innermost loop, you may lose the “spatial locality” that Jonathan described.

Also, if tick() is a virtual method (which is most likely the case unless you do the variadic template stuff), the per-call vtable overhead is not trivial.

Ok - I’m starting to get the spatial locality thing. I wonder if there’s a profiler feature or data aggregation in Xcode or Visual Studio that lets you check how bad your code is in that respect.

As for the virtual table topic, my assumption is that a DSP block that needs to process a single sample has a float tick(float) method (or float tick(void) for synths) that is not virtual and is inlined, following the STK coding style (https://ccrma.stanford.edu/software/stk/).

Last time I checked, STK does use a virtual tick() - the tick() in e.g. stk::Filter is virtual and the subclasses override it (they just don’t specify the override keyword).

I think this is why there are also block-based methods in STK, and the single-sample tick() is left in there for when performance is not an issue.

If it’s not virtual, you can’t use polymorphism, which is key to writing a generic engine - unless you do the magical stuff Ian was presenting at ADC.
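The compile-time trick looks roughly like this (a bare sketch in the spirit of that talk, not the actual code from it - block names are invented): the chain of blocks is a template parameter pack, so every tick() call is resolved statically and can be inlined, with no vtable in the inner loop:

```cpp
#include <tuple>

// Two toy blocks with non-virtual, inlinable tick() methods.
struct Gain   { float g; float tick (float x) { return x * g; } };
struct Offset { float o; float tick (float x) { return x + o; } };

// A statically-typed processing chain: no virtual dispatch anywhere.
// The fold expression runs each block's tick() in order (C++17).
template <typename... Blocks>
struct Chain
{
    std::tuple<Blocks...> blocks;

    float tick (float x)
    {
        std::apply ([&x] (auto&... b) { ((x = b.tick (x)), ...); }, blocks);
        return x;
    }
};
```

The obvious trade-off versus a virtual tick() is that the chain’s topology is frozen at compile time, which is exactly where the “block switching” point from earlier in the thread comes back.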