Performance when using virtual inheritance for processors

Hi,
I’ve been designing my code to be generic using inheritance. It’s easier to code and works well for GUI stuff, but I was wondering how heavy the performance hit would be when doing things like the example below, especially for classes inherited further down the line. I’m considering other design approaches such as using templates. Any guidance or thoughts are much appreciated.
(Stripped down) example:

class Osc {
    /* constructor, member variables and stuff (m_phase etc.) */
public:
    virtual float processedSample(float phase) = 0;

    void processBlock(float* output, int numSamples)
    {
        for (int i = 0; i < numSamples; ++i)
        {
            output[i] = processedSample(m_phase);
        }
    }
};

class SinOsc : public Osc {
public:
    float processedSample(float phase) override
    {
        return std::sin(2.0f * pi * phase * m_freq);
    }
};

That’s probably a poor use of inheritance, and it’s expensive to have a virtual call per sample. You should dispatch at block level instead. Since your design decision is made at compile time, you can just use a template. Something like what’s below.


template <typename OscType>
struct OSC {
    void processBlock(float* buffer, size_t numSamples)
    {
        for (size_t i = 0; i < numSamples; ++i)
        {
            buffer[i] = m_osc(m_phase); // call the oscillator's operator(), not its constructor
        }
    }
    OscType m_osc;
};

struct SinOSC {
    float operator() (float phase)
    {
         return (sin(2 * pi * phase * m_freq));
    }
};

OSC<SinOSC> myOsc;
myOsc.processBlock(myBuffer, 32);

Thanks, yeah this is what I was thinking.

I read somewhere (maybe Stack Overflow) that current compilers devirtualize calls at compile time quite well. So the following could work without any performance loss:

    SinOsc sineOsc;
    sineOsc.processBlock(data, numSamples);

As the compiler knows which concrete object is used (SinOsc), it should be able to choose the correct implementation directly.

Only when the concrete type can’t be determined at compile time will there be a performance loss:

struct OscillatorBase
{
    virtual float processedSample(float phase) = 0;
    virtual ~OscillatorBase() = default;
};

struct Osc
{
    void processBlock(float* output, int numSamples)
    {
        for (int i = 0; i < numSamples; ++i)
            output[i] = osc->processedSample(output[i]); // should be checked for nullptr!
    }
    OscillatorBase* osc = nullptr;
};

At compile time it is not clear which class is used in Osc::processBlock().

On a quick search I couldn’t find the source any more, so it is entirely possible that I misinterpreted something. Maybe someone with a more thorough C++ background could comment.

In many cases, devirtualization will get rid of the overhead, but there are still some obscure cases where the compiler can’t easily get rid of the v-table lookup. This talk is quite interesting:
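One well-known way to help the compiler here is the `final` specifier: if the concrete class (or the overriding function) is marked `final`, no further override can exist, so a call through a pointer or reference to that class is eligible for devirtualization. A minimal sketch (the types are made up for illustration, not from any library):

```cpp
#include <cassert>

struct OscBase {
    virtual float processedSample(float phase) = 0;
    virtual ~OscBase() = default;
};

// 'final' guarantees no subclass can override processedSample,
// so calls through a SawOsc* or SawOsc& can skip the v-table lookup.
struct SawOsc final : OscBase {
    float processedSample(float phase) override { return 2.0f * phase - 1.0f; }
};
```

Whether the compiler actually devirtualizes still depends on optimization settings and context, so it’s worth checking the generated assembly rather than assuming.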


Devirtualisation is nice (if it happens), but it’s like auto-vectorisation: you might make a small change to the code and suddenly that section can’t be optimised the way you expect, and you probably won’t find out it’s not being devirtualised until much later. I think it’s better to make the dispatch explicit; then at least the behaviour is consistent and not relying on magic. If you need runtime polymorphism, maybe try a variant? I’ve seen conflicting benchmarks on which is faster, though, so I guess you just have to measure.
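For the record, a variant-based version might look like this (a minimal sketch; the oscillator types and names are just made up for illustration). The trick is to `std::visit` once per block so the dispatch cost stays out of the per-sample loop:

```cpp
#include <cmath>
#include <cstddef>
#include <variant>

struct SinOsc {
    float freq = 1.0f;
    float operator()(float phase) const { return std::sin(2.0f * 3.14159265f * phase * freq); }
};

struct SawOsc {
    float operator()(float phase) const { return 2.0f * phase - 1.0f; }
};

using AnyOsc = std::variant<SinOsc, SawOsc>;

// One visit per block: inside the lambda the oscillator's concrete type
// is known statically, so the inner loop has no virtual dispatch at all.
void processBlock(AnyOsc& osc, float* buffer, std::size_t numSamples, float phase)
{
    std::visit([&](auto& o) {
        for (std::size_t i = 0; i < numSamples; ++i)
            buffer[i] = o(phase);
    }, osc);
}
```

Unlike the base-pointer version, the variant can still be reassigned to a different oscillator type at runtime, and there’s no nullptr to check.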

At any rate, because these methods are called so frequently, performance should be pretty good thanks to the cache. It’s probably best to just go with whatever is easy to maintain and gives you the flexibility you need.


Thanks for all the replies!
@fabian That looks interesting.
@Anima I agree, I’d rather not rely on potential optimisations.
Also, having done some benchmarking, the template version does seem to be quicker, and it’s just as easy to code.

Also, I would be interested if anyone had any cool examples using template metaprogramming that they would be willing to share. I haven’t looked at it a whole lot, but it seems to have a lot of potential.
This talk was really cool https://www.youtube.com/watch?v=XK88ji7vpyQ&t=2585s