CPU load smoothing for STFT block processing

Hi all!

I’m working on a plugin that uses STFT block processing, so the bulk of the work happens in chunks that I can process roughly every 256 samples (maybe more, maybe less). Everything else is pretty much just buffer reads/writes.

The straightforward option would be to just read/write the input/output buffers on every sample and do the processing and overlap-add every 256th sample. Obviously that produces a very uneven CPU load and makes it hard for the host to figure out an efficient scheduling regime.
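
To make the naive version concrete, here's a minimal sketch of what I mean, in C++. The names `NaiveStftScheduler`, `processFrame` and `framesPerBlock` are made up for illustration, and the actual FFT work is stubbed out as a counter, so this only shows when the heavy work would fire, not the real processing:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: all names here are made up for illustration.
struct NaiveStftScheduler {
    std::size_t hopSize;
    std::size_t samplesSinceFrame = 0;
    std::size_t framesRun = 0;

    explicit NaiveStftScheduler(std::size_t hop) : hopSize(hop) {
        assert(hop > 0);
    }

    // Stand-in for the expensive part (window, FFT, spectral processing,
    // inverse FFT, overlap-add). Here it only counts invocations.
    void processFrame() { ++framesRun; }

    // Called once per host block. Returns how many heavy frames ran
    // inside this single callback.
    std::size_t processBlock(std::size_t numSamples) {
        const std::size_t before = framesRun;
        for (std::size_t i = 0; i < numSamples; ++i) {
            // The cheap per-sample buffer read/write would happen here.
            if (++samplesSinceFrame == hopSize) {
                samplesSinceFrame = 0;
                processFrame();  // the whole frame's cost lands in this callback
            }
        }
        return framesRun - before;
    }
};

// Heavy-frame count for each of numBlocks consecutive host callbacks.
std::vector<std::size_t> framesPerBlock(std::size_t hop,
                                        std::size_t blockSize,
                                        std::size_t numBlocks) {
    NaiveStftScheduler scheduler(hop);
    std::vector<std::size_t> counts;
    for (std::size_t b = 0; b < numBlocks; ++b) {
        counts.push_back(scheduler.processBlock(blockSize));
    }
    return counts;
}
```

With a hop of 256 and a host block size of 64, `framesPerBlock(256, 64, 8)` gives 0, 0, 0, 1, 0, 0, 0, 1: three callbacks do only buffer work and every fourth one pays for the entire frame.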

What are good ways to even out the CPU load? Doing the hard work in a separate worker thread? Trying to break it down into a few steps that I can spread out in the time between hops? Or has this become a non-issue with modern hosts and I shouldn’t give myself the headache?