Optimizing performance for Synthesizers

Hello,

I am trying to carefully go through the code and find where I can improve on performance.

I’m wondering if anyone else has tried similar methods:
1- Replacing function generators with buffers (wavetables, etc.)
2- Replacing the MIDI-note-to-Hz conversion with a pre-calculated table of 128 notes (since it currently uses pow)
3- Skipping all processing in renderNextBlock of the Synthesizer if no notes or ADSR envelopes are active, provided there are no effects such as delays
4- Optimizing the graphics rendering
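For item 2, here is a minimal sketch of the lookup-table idea in plain C++ (the table name is my own; it assumes standard A440 equal temperament):

```cpp
#include <array>
#include <cmath>

// Pre-computed MIDI note (0..127) -> frequency in Hz, built once at startup
// instead of calling std::pow() on every note-on.
static const std::array<float, 128> midiNoteToHz = [] {
    std::array<float, 128> table {};
    for (int note = 0; note < 128; ++note)
        table[(size_t) note] = 440.0f * std::pow (2.0f, (note - 69) / 12.0f);
    return table;
}();
```

After that, a note-on handler just indexes the table (e.g. `midiNoteToHz[69]` gives 440 Hz for A4).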

Today I am specifically interested in the graphics. I fully commented out all signal processing everywhere and just observed the synth running, without interacting with any parameters.

The results are a bit strange (This is Intel’s VTune profiler for a release build on Windows):

Notice how rendering is taking up a big chunk of the CPU time, and the synth’s renderNextBlock isn’t even in that list, since I made sure it doesn’t run anything.

I also tested the VST3 in a DAW by opening 10 instances and closing all the open synth windows, and the CPU stays at 30% constantly.

I have standard knobs, dropdowns, and JUCE’s keyboard component in there:
53 knobs
13 combo boxes
6 toggles
1 keyboard component

Are there any specific places I should look at to improve the graphics?

Thanks.

What exactly are you rendering? I mean something like menus, sliders, etc., or an active oscilloscope/spectrum that must be updated regularly? Software rendering or OpenGL? With OpenGL you should see decreased CPU usage, whereas with normal rendering everything is done on the CPU, so it’s normal that it takes a big chunk. In the latter case there are also many tricks: instead of stroking paths, use horizontal/vertical lines; only paint when something changes (not every X ms); etc.

I’m using the juce framework’s components:
Slider
Combobox
Label
Toggle
Group

I could go the OpenGL route, but that’s a more complex step, a full refactoring rather than just optimizing what I have.

What I don’t quite understand is whether anything is being rendered in a VST when the editor window isn’t even open…

Well, I haven’t used that profiler, so I’m not sure what the microarchitecture usage figure is measured in reference to. But I’d say it’s not fair to profile without having the audio processing active.

What I mean is: usually when you profile your synth, it tells you “out of all the time you measured, most of it was spent here”. That doesn’t mean it’s taxing your CPU; you could have 1% total usage, but out of that 1% most of the work is done on the GUI, because you are not running the audio thread.

That said, if I’m misinterpreting the results and your profiler actually reports usage relative to the total CPU time, then you must have something wrong in your code, because the basic components you are using are far from demanding.
Maybe with some code people can help you figure it out.

Did you notice the part about 30% CPU in DAWs?

Yes, but that doesn’t say much unless you show us some code so we can try to recreate the problem or see what you are doing wrong. As I said, those components aren’t demanding at all, and unless you repaint them constantly they shouldn’t even be consuming CPU.

I think that profiler reading is a bit misleading: it’s actually the JUCE watermark animation, hence the high CPU usage.

However, I found out the main culprit for this is reading parameters from value tree state.

I’m reading them in AudioProcessor::processBlock; commenting out that part alone drops the 30% to 2% for 10 instances of the VST. Here’s what mine looks like:

void NewProjectAudioProcessor::processBlock (juce::AudioBuffer<float>& buffer, juce::MidiBuffer& midiMessages)
{
    ....

    for (int i = 0; i < m_synth.getNumVoices(); i++)
    {
        const float attack  = m_treestate.getParameterAsValue(Ids::Envelope::Attack::id).getValue();
        const float decay   = m_treestate.getParameterAsValue(Ids::Envelope::Decay::id).getValue();
        const float sustain = m_treestate.getParameterAsValue(Ids::Envelope::Sustain::id).getValue();
        const float release = m_treestate.getParameterAsValue(Ids::Envelope::Release::id).getValue();

        const float pitchAttack = m_treestate.getParameterAsValue(Ids::Envelope::Attack::id   + Ids::Frequency::Pitch::id).getValue();
        const float pitchDecay = m_treestate.getParameterAsValue(Ids::Envelope::Decay::id     + Ids::Frequency::Pitch::id).getValue();
        const float pitchSustain = m_treestate.getParameterAsValue(Ids::Envelope::Sustain::id + Ids::Frequency::Pitch::id).getValue();
        const float pitchRelease = m_treestate.getParameterAsValue(Ids::Envelope::Release::id + Ids::Frequency::Pitch::id).getValue();

        const float filterAttack = m_treestate.getParameterAsValue(Ids::Envelope::Attack::id + Ids::Filter::id).getValue();
        const float filterDecay = m_treestate.getParameterAsValue(Ids::Envelope::Decay::id + Ids::Filter::id).getValue();
        const float filterSustain = m_treestate.getParameterAsValue(Ids::Envelope::Sustain::id + Ids::Filter::id).getValue();
        const float filterRelease = m_treestate.getParameterAsValue(Ids::Envelope::Release::id + Ids::Filter::id).getValue();

        const float lfoSpeed = m_treestate.getParameterAsValue(Ids::LFO::Speed::id).getValue();
        const float lfoAmount = m_treestate.getParameterAsValue(Ids::LFO::Amount::id).getValue();

        if (CSynthVoice* synthVoice = dynamic_cast<CSynthVoice*>(m_synth.getVoice(i)))
        {
            for (int j = 0; j < NO_OF_OSCILLATORS; j++)
            {
                synthVoice->SetOscillatorActive(j, m_treestate.getParameterAsValue(Ids::Oscs::Active::id + juce::String(j)).getValue());
                synthVoice->SetOscillator(j,
                    (EOscMode)(int)m_treestate.getParameterAsValue(Ids::Oscs::Mode::id + juce::String(j)).getValue(),
                    m_treestate.getParameterAsValue(Ids::Frequency::Strength::id + juce::String(j)).getValue(),
                    m_treestate.getParameterAsValue(Ids::Frequency::FineTune::id + juce::String(j)).getValue(),
                    m_treestate.getParameterAsValue(Ids::Frequency::Semitone::id + juce::String(j)).getValue(),
                    m_treestate.getParameterAsValue(Ids::Frequency::Octave::id + juce::String(j)).getValue());

                ..... //similar code follows
                
            }
        }
    }

    .... // Actual processing

}

UPDATE: I moved the reads that didn’t need to be inside the voice loop outside it, and usage dropped from 30% to 25%. But I’d still like to get this down to 5% if I can.

Any suggestions on optimizing this are appreciated.

There is a lot of stuff there that you shouldn’t do in processBlock. dynamic_cast and the juce::String(j) conversions should be avoided here, I think. Do you need to update these parameters every block? Looks like overkill.
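One common way to avoid the per-block string concatenations is to build all the parameter IDs once up front, e.g. in the constructor. A standalone sketch (using std::string so it compiles without JUCE; the function and ID names are my own):

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build "oscActive0", "oscActive1", ... once, instead of
// concatenating a fresh string for every oscillator in every processBlock.
std::vector<std::string> makeOscillatorIds (const std::string& base, int count)
{
    std::vector<std::string> ids;
    ids.reserve ((size_t) count);
    for (int i = 0; i < count; ++i)
        ids.push_back (base + std::to_string (i));
    return ids;
}
```

The audio thread then only indexes into the pre-built vector by oscillator number.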

Is there a way to get parameters using an int or uint rather than passing a string?

Also, how often is processBlock called?
I understand that we process the buffer inside a block, but I would certainly reduce how often I read those values as well. I just need to read them frequently enough not to miss any value changes.

You could make each parameter a member of your synthesizer class, implement AudioProcessorValueTreeState::Listener in that class, and use the parameterChanged (const String& paramID, float newValue) callback to handle value changes. That way you only update the variables when they’ve actually changed, instead of re-assigning the values every single processBlock.
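A minimal sketch of that idea (using std::string and std::atomic stand-ins so it compiles standalone; in JUCE you would inherit juce::AudioProcessorValueTreeState::Listener and the ID type would be juce::String — the class and member names here are my own):

```cpp
#include <atomic>
#include <string>

// Hypothetical listener: stores each changed parameter into an atomic member,
// so the audio thread can read it lock-free instead of querying the tree.
struct EnvelopeParams
{
    std::atomic<float> attack  { 0.01f };
    std::atomic<float> release { 0.2f };

    // In JUCE this would override AudioProcessorValueTreeState::Listener::parameterChanged().
    void parameterChanged (const std::string& paramID, float newValue)
    {
        if (paramID == "attack")       attack.store (newValue);
        else if (paramID == "release") release.store (newValue);
    }
};
```

Note the warning in the next reply, though: this callback can fire on any thread, so anything beyond storing into an atomic needs extra care.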

You should be extremely careful when doing parameter updates in parameterChanged(), as it can be called from any thread!

Instead, you can cache the pointer to the parameter’s atomic value in the constructor of your processor using AudioProcessorValueTreeState::getRawParameterValue() and read the value directly.

//Class member:
std::atomic<float>* gain = nullptr;

//Processor constructor (here apvts is your AudioProcessorValueTreeState member):
MyProcessor()
{
    gain = apvts.getRawParameterValue ("Gain");
    //No more string lookups after this point
}
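On the audio thread, the cached pointer can then be read lock-free once per block. A standalone sketch, with a plain std::atomic<float> standing in for the value owned by the value tree state (all names are my own):

```cpp
#include <atomic>
#include <vector>

// Stand-in for the atomic owned by the AudioProcessorValueTreeState.
std::atomic<float> gainValue { 0.5f };

// Cached once, e.g. in the processor constructor via getRawParameterValue().
std::atomic<float>* gain = &gainValue;

// Apply the current gain to a block of samples: one atomic load per block,
// no string lookups on the audio thread.
void processBlock (std::vector<float>& buffer)
{
    const float g = gain->load();
    for (auto& sample : buffer)
        sample *= g;
}
```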

As for the synth voices, you can do the dynamic casts only once and avoid repeating this rather expensive operation in processBlock.

//Synth class member:
std::vector<MySynthVoice*> myVoices;

//Synth constructor:
MySynthesiser()
{
   //after adding all voices the regular way 
   for (int voiceID = 0; voiceID < getNumVoices(); ++voiceID)
   {
        myVoices.push_back (dynamic_cast<MySynthVoice*> (getVoice (voiceID)));
   }
}

//To access the voices:
for (auto* voice: myVoices)
    voice->specialVoiceFunc();

Interesting! I’ve been using atomic values for class members but didn’t think of the limitations of parameterChanged(). Thanks for the tip :)


Thank you @eyalamir! Caching atomic pointers to the parameters did the job fantastically!

The CPU usage went down from 30% to only 7% for 10 instances of the synth running (without the DSP part)!

I used static_cast for now, but I’m planning to make the voices members too.

Thanks again everyone for your advice and tips!

Cheers
Aram