Bad idea to perform multiple DSP processes within the same for loop?

It seems that the usual way to do audio DSP is to run a separate loop over the block for each stage in the DSP chain. So, for example, for white noise going through a filter and then an envelope, the common way is to generate a whole block of noise, then filter that whole block, and finally multiply the whole block by the envelope data.


But is there any practical reason why it's a bad idea to run the noise, filter and envelope through a single for loop? Is it going to be significantly slower for some reason? Or are there other issues that make it a mistake?

To use the example above, I have a noise source class, a filter class, and an envelope class, and each one has a prepare method and a perform method. prepare is run once at the start of the block, and perform is then run once per sample inside the loop.


So my 'renderNextBlock' code looks something like this:

filter.prepare (getFilterCutoff());
envelope.prepare (getDecay(), getRelease());

for (int i = 0; i < blockSize; ++i)
{
    // one sample through the whole chain: noise -> filter -> envelope
    float noiseOut = noise.perform();
    float filterOut = filter.perform (noiseOut);
    float output = filterOut * envelope.perform();

    // then, write output to the audio buffer...
}

To me, this way seems more logical, because none of my audio DSP classes need to have any knowledge of what sort of buffer they will write to.
But why does everyone else seem to run a separate loop over the whole buffer for every stage of the DSP chain? Is there something I'm missing?


There's no guarantee the compiler will be able to get rid of the per-sample function call when you do it like that, even if your DSP objects' perform methods are not virtual. If they are virtual, it's virtually guaranteed (no pun intended) that each sample will cost an indirection and a function call. So the purpose of processing in bigger blocks is usually to reduce the overhead of function calls, whether direct or indirect.
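To make that concrete, here is a rough sketch of the block-based layout, where each stage is called once per block and loops over the samples internally. The class names, the processBlock method and the hard-coded coefficients are all made up for illustration (they are not JUCE classes or the poster's actual code), and the one-pole filter and exponential decay are just simple stand-ins for a real filter and envelope.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the noise, filter and envelope classes in the question.
struct NoiseSource
{
    std::uint32_t state = 12345u;

    // Fills the block with white noise in [-1, 1) using a simple LCG.
    void processBlock (float* data, int numSamples)
    {
        for (int i = 0; i < numSamples; ++i)
        {
            state = state * 1664525u + 1013904223u;
            data[i] = static_cast<float> (state >> 8) * (2.0f / 16777216.0f) - 1.0f;
        }
    }
};

struct OnePoleLowPass
{
    float coeff = 0.0f; // feedback coefficient, assumed to be in [0, 1)
    float z1 = 0.0f;    // previous output sample

    void prepare (float newCoeff) { coeff = newCoeff; }

    // Filters the block in place.
    void processBlock (float* data, int numSamples)
    {
        for (int i = 0; i < numSamples; ++i)
        {
            z1 = (1.0f - coeff) * data[i] + coeff * z1;
            data[i] = z1;
        }
    }
};

struct DecayEnvelope
{
    float level = 1.0f;      // current gain
    float multiplier = 1.0f; // per-sample decay factor

    void prepare (float newMultiplier) { multiplier = newMultiplier; }

    // Scales the block in place by an exponentially decaying gain.
    void processBlock (float* data, int numSamples)
    {
        for (int i = 0; i < numSamples; ++i)
        {
            data[i] *= level;
            level *= multiplier;
        }
    }
};

struct Voice
{
    NoiseSource noise;
    OnePoleLowPass filter;
    DecayEnvelope envelope;

    // Three calls per block instead of three calls per sample.
    void renderNextBlock (float* out, int numSamples)
    {
        assert (out != nullptr && numSamples >= 0);

        filter.prepare (0.9f);      // arbitrary example values
        envelope.prepare (0.999f);

        noise.processBlock (out, numSamples);
        filter.processBlock (out, numSamples);
        envelope.processBlock (out, numSamples);
    }
};
```

Note that the stages still only need a float pointer and a length, so they don't have to know anything about the host's buffer type. Whether this is measurably faster than the single-loop version depends on whether the compiler can inline the per-sample calls, so it's worth profiling both.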

There are also some processes that can be optimised easily using SIMD processing (see the JUCE FloatVectorOperations class). Generating noise and running an IIR filter (as in your example) are two processes that are more difficult to optimise with SIMD calls, since each new sample typically depends on state left behind by the previous one.
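As a rough illustration of the difference, here are two plain C++ loops. The function names are made up for this example. In the first, every output sample is independent, which is the kind of loop a compiler can auto-vectorise, or that could be handed to something like FloatVectorOperations::multiply in JUCE. In the second, a simple one-pole low-pass used here as a stand-in for an IIR filter, each output needs the previous output, so the samples can't simply be processed four or eight at a time.

```cpp
#include <cassert>
#include <vector>

// Independent per-sample work: data[i] depends only on data[i] and envelope[i],
// so several samples can be computed at once.
void applyEnvelope (float* data, const float* envelope, int numSamples)
{
    for (int i = 0; i < numSamples; ++i)
        data[i] *= envelope[i];
}

// Recursive work: each output depends on the previous output (z1),
// which forces the samples to be computed one after another.
void onePoleLowPass (float* data, int numSamples, float coeff, float& z1)
{
    for (int i = 0; i < numSamples; ++i)
    {
        z1 = (1.0f - coeff) * data[i] + coeff * z1;
        data[i] = z1;
    }
}
```

If the envelope is written into its own block first, the final multiply becomes the first kind of loop, which is one reason the block-per-stage layout is popular.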