Handling sample rate conversion in realtime (in ARA)

Right now I analyse the audio sources and play them back with a potentially buffered reader in the playback renderer, without storing any intermediate files (just like the JUCE ARA demo plugin does).

Here are my ideas right now:

  1. Realtime SRC in the playback renderer's processBlock

    1. convert the sample range of each audio source into a range at the playback sample rate
    2. change the playback code to play back regions based on the new audio source render range
    3. use a library like r8brain etc. for SRC
    4. report latency to the host, since r8brain needs a lot of samples before it outputs converted audio - the amount of latency will probably be dynamic (?), since the amount of pre-ringing of linear-phase filters differs between sample rates

    Pros:

    • no file management etc necessary

    Cons:

    • more realtime processing required
      • the buffer even has to be converted to doubles and back again in case of r8brain…
    • introduces latency
    • probably the hardest to implement?
  2. storing my own temporary file and reading that one

    1. create new temporary file with an edited audio source
    2. play that one back

    Pros:

    • JUCE already provides MemoryMappedAudioFormatReader
    • I could add my other modifications in there too (my plugin does not depend on the different musical data etc for the different playback regions)

    Cons:

    • file reading and writing will make the analysis step slower either way
    • file management… checking for enough free disk space etc.
    • if these files aren't temporary, they will take up a lot more space
    • if these files are temporary, loading times will increase when opening a project
  3. storing a converted version of the audio source in memory in my ARAAudioSource child class and playing this one back

    Pros:

    • Probably faster to implement

    Cons:

    • unacceptable memory footprint - if my calculations are correct, the ARA documentation example of a 30-minute live recording with 20+ tracks would result in more than 4GB
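
Step 1 of option 1 above (mapping an audio source's sample range onto the playback sample rate) could look roughly like the following. This is only a sketch: `SampleRange` and `convertRange` are hypothetical names, not part of JUCE or the ARA SDK, and the choice to round the start down and the end up is an assumption made so that the converted range never ends up shorter than the original time span.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: half-open range [start, end) counted in samples.
struct SampleRange
{
    std::int64_t start;
    std::int64_t end;
};

// Maps a range counted at sourceRate onto the sample grid of targetRate.
// The start is rounded down and the end is rounded up, so the result always
// covers the original time span. Multiplying before dividing keeps common
// cases (e.g. exactly one second of audio) free of rounding error.
inline SampleRange convertRange (SampleRange range, double sourceRate, double targetRate)
{
    assert (sourceRate > 0.0 && targetRate > 0.0);

    const double start = static_cast<double> (range.start) * targetRate / sourceRate;
    const double end   = static_cast<double> (range.end)   * targetRate / sourceRate;

    return { static_cast<std::int64_t> (std::floor (start)),
             static_cast<std::int64_t> (std::ceil (end)) };
}
```

With this, one second of a 44.1 kHz source (`{0, 44100}`) maps to `{0, 48000}` in a 48 kHz session.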

Maybe I'm missing some other option.
Anyway, I was surprised that there isn't already a post about this.

What is the best way to add SRC? How do you manage SRC?

For your information, our plug-ins offer real-time conversion (option 1). We have our own implementation, but it is of course possible to use juce::ResamplingAudioSource, or else there are plenty of efficient solutions on the Internet. There’s no latency because the plug-in is ARA, so you have access to the entire audio content (and, by the way, ARA doesn’t allow reporting latency).
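
To illustrate why random access to the source removes the need for latency, here is a minimal sketch in which every output sample is computed directly from the source samples around its fractional position. It is not the implementation described in the reply above: `renderResampled` is a hypothetical function, the source is assumed to be fully available as a mono `std::vector<float>`, and plain linear interpolation stands in for a proper band-limited resampler purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Renders numOut output samples, starting at output index outStart (counted at
// targetRate), by reading the source at the matching fractional position.
// Because the source can be read at any position, nothing has to be buffered
// ahead of time and playback can start at any outStart without a warm-up.
inline std::vector<float> renderResampled (const std::vector<float>& source,
                                           double sourceRate, double targetRate,
                                           std::size_t outStart, std::size_t numOut)
{
    assert (sourceRate > 0.0 && targetRate > 0.0);
    std::vector<float> out (numOut, 0.0f);

    for (std::size_t i = 0; i < numOut; ++i)
    {
        // Position of this output sample on the source's sample grid.
        const double pos = static_cast<double> (outStart + i) * sourceRate / targetRate;
        const auto index = static_cast<std::size_t> (pos); // pos >= 0, so this is floor

        if (index >= source.size())
            break; // past the end of the source: the rest stays silent

        const float frac = static_cast<float> (pos - static_cast<double> (index));
        const float a = source[index];
        const float b = index + 1 < source.size() ? source[index + 1] : a;
        out[i] = a + frac * (b - a); // linear interpolation (assumption, see above)
    }

    return out;
}
```

A higher-quality filter needs more source samples on either side of each position, but with random access those are simply read from the source rather than waited for, which is the point the reply makes.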


TLDR: Do you load the whole audio source content into RAM for playback?


Thank you! I chose the first path, but there is still one edge case that I was not able to fix without loading the full audio source content into memory.

I preload the source data into memory, similar to juce::BufferingAudioReader, but I added methods to preload the upcoming samples even while playback is stopped, using the playhead state from one of the editor renderers.
The only thing that seems impossible with this approach is 100% realtime safety when the playback position jumps during playback.
It's impossible to predict where the playhead might jump to, so the whole file would have to be preloaded into memory.
The example in the ARA documentation (many very long audio files) would result in 3GB+ of RAM usage. Is this really necessary?
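
For reference, the memory estimate can be reproduced with a small helper. Only the 30 minutes and 20 tracks come from the thread; the sample rate, channel count and sample format below are assumptions, and `footprintBytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to keep uncompressed audio fully in memory.
inline std::uint64_t footprintBytes (std::uint64_t seconds, std::uint64_t sampleRate,
                                     std::uint64_t numTracks, std::uint64_t channelsPerTrack,
                                     std::uint64_t bytesPerSample)
{
    assert (sampleRate > 0 && bytesPerSample > 0);
    return seconds * sampleRate * numTracks * channelsPerTrack * bytesPerSample;
}
```

Assuming mono tracks at 44.1 kHz, `footprintBytes (1800, 44100, 20, 1, 2)` gives 3,175,200,000 bytes (about 3.2 GB) for 16-bit samples, and `footprintBytes (1800, 44100, 20, 1, 4)` gives 6,350,400,000 bytes (about 6.4 GB) for 32-bit float. Stereo tracks or higher sample rates scale these numbers up accordingly.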

Another thing to note (which does not really matter, since other hosts actually need to be realtime safe): even though Reaper tells ARA playback renderers that they are running in realtime, they aren't really.
From my tests, Reaper seems to be prebuffering, and when I set the reader's read timeout above 0, I'm not able to create any audio glitches, no matter how hard I try.

We don’t (yet) handle SRC at all in ARA, but we still have to handle jumps during playback, and that means reading into buffers, and that takes time. So if we ask for a sample that is not present in our pool of buffers, we return silence until we do get the buffer we need (which will be for a later sample position by that time, obviously). So jumping around to areas that are not already loaded results in short periods of silence while the data is loaded. Also, whenever we ask for one buffer we always issue a read request for two, on the assumption that we are more likely to keep playing into the next buffer than to jump or loop.
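
The policy described above (silence on a miss, always request the following buffer as well) can be sketched as below. This is not the poster's code: `BlockPool` and its methods are hypothetical, the background loader that would service the requests is left out, and the standard containers used here allocate and are not thread-safe, so a real renderer would need preallocated, lock-free structures instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <utility>
#include <vector>

class BlockPool
{
public:
    explicit BlockPool (std::size_t blockSize) : blockSize_ (blockSize) { assert (blockSize > 0); }

    // Called by the (hypothetical) loader once a block has been read from the source.
    void store (std::int64_t blockIndex, std::vector<float> samples)
    {
        assert (samples.size() == blockSize_);
        blocks_[blockIndex] = std::move (samples);
        pending_.erase (std::remove (pending_.begin(), pending_.end(), blockIndex), pending_.end());
    }

    // Fills out with numSamples samples starting at position. Samples that lie in
    // blocks that are not loaded yet are returned as silence. For every block
    // touched, that block and the one after it are requested.
    void read (std::int64_t position, float* out, std::size_t numSamples)
    {
        assert (position >= 0);
        const auto size = static_cast<std::int64_t> (blockSize_);
        std::int64_t lastBlock = -1;

        for (std::size_t i = 0; i < numSamples; ++i)
        {
            const std::int64_t pos = position + static_cast<std::int64_t> (i);
            const std::int64_t block = pos / size;

            if (block != lastBlock)
            {
                request (block);
                request (block + 1); // continuing is more likely than jumping or looping
                lastBlock = block;
            }

            const auto it = blocks_.find (block);
            out[i] = it != blocks_.end() ? it->second[static_cast<std::size_t> (pos % size)] : 0.0f;
        }
    }

    // Blocks that still have to be loaded, oldest request first.
    const std::deque<std::int64_t>& pendingRequests() const { return pending_; }

private:
    void request (std::int64_t blockIndex)
    {
        const bool loaded = blocks_.count (blockIndex) > 0;
        const bool queued = std::find (pending_.begin(), pending_.end(), blockIndex) != pending_.end();

        if (! loaded && ! queued)
            pending_.push_back (blockIndex);
    }

    std::size_t blockSize_;
    std::map<std::int64_t, std::vector<float>> blocks_;
    std::deque<std::int64_t> pending_;
};
```

After a jump, `read` keeps returning silence for the missing region until the loader calls `store` for the requested block, which matches the short gap described in the post.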


Thanks! I'm at the same point now, including SRC, and Melodyne and Vocalign seem to do the same thing, so I guess there is no way around this short piece of silence without high RAM usage. I might contact the ARA devs about this, since sample-perfect playback should be a thing imo.

But I'm afraid they only did it that way to be as compatible as possible.