Multitrack recording delay

I’m working on building a multitrack player/recorder and I currently have it playing 8 wave files together in sync. I have two issues that I need help with at this point (more will surely come!):

  1. I’d like to confirm that I’m taking the proper approach to playing the multiple tracks because I have a sneaking suspicion that I’m not, even though they do sound OK.

  2. I’m having difficulty recording a new track in sync with the other playing tracks.

For each playing track, I have the wav file processed by:

AudioFormatReader -> AudioFormatReaderSource -> AudioTransportSource

I also have a single MixerAudioSource that I add each track’s AudioTransportSource to as a source.

After building all of the tracks, I assign the MixerAudioSource as a source to an AudioSourcePlayer. That is then added as an audio callback to an AudioDeviceManager. (At this point, I also add an AudioRecorder class, like the one in the JUCE demo, as a callback to the AudioDeviceManager.)

To play, I go through all of the valid tracks and call setPosition() and start() on each one’s AudioTransportSource.

Is this the proper approach to playing multiple tracks? It seems like I should be starting a single mixer rather than all of the individual AudioTransportSources separately…

For the recording, I used the same approach as the audio recording demo: it creates a ThreadedWriter, and the audioDeviceIOCallback hands the incoming samples to the writer. When the Record button is clicked, I start recording to the file and then start all of the AudioTransportSources for the other tracks to play.

Although I play in time with the playing tracks while recording, the result is that the recorded track is always about 2 seconds behind the other tracks when played back. I’m guessing that this delay is the time between when it starts writing the file and when the other tracks start playing.

What approach should I use to save the incoming audio to a file in sync with the playing tracks?



You can go AudioFormatReader -> (optionally AudioSubsectionReader) -> AudioFormatReaderSource -> MixerAudioSource -> AudioTransportSource and then start the mixer with AudioTransportSource::start().

As for the delay, I suppose you’re not using an ASIO driver. I suggest you use one to get the delay below 100 ms at least. But you can’t get rid of the delay altogether, so you’ll have to start the playing tracks ahead of the recording, that is, start them during the countdown period before recording begins.
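Besides pre-rolling the playback, a fixed start-up delay can also be compensated after the fact. The following is only a minimal sketch of that idea in plain C++ without JUCE: convert the measured delay to samples and drop that many samples from the front of the recording, so that it lines up with the other tracks. The function names are hypothetical, and the delay itself has to be measured on the actual system; nothing here is taken from the JUCE API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts a measured delay in seconds to a whole number of samples.
std::size_t delayInSamples (double delaySeconds, double sampleRate)
{
    assert (delaySeconds >= 0.0 && sampleRate > 0.0);
    return static_cast<std::size_t> (std::llround (delaySeconds * sampleRate));
}

// Returns a copy of the recording with the leading `delay` samples removed,
// so that sample 0 of the result lines up with sample 0 of the playback tracks.
// If the delay is longer than the recording, the result is empty.
std::vector<float> trimLeadingDelay (const std::vector<float>& recorded, std::size_t delay)
{
    if (delay >= recorded.size())
        return {};

    return std::vector<float> (recorded.begin() + static_cast<std::ptrdiff_t> (delay),
                               recorded.end());
}
```

With the roughly 2 second delay described above and an assumed 44.1 kHz sample rate, delayInSamples (2.0, 44100.0) gives 88200 samples to trim.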

Thanks for your reply. I’ll work on inserting the mixer earlier in the chain and hopefully that will save me some startup time and I’ll also try switching the playing/recording order.

I’m just wondering if I’m taking the wrong approach and should be working at a lower level. Unfortunately there are woefully few examples of using JUCE to create even a basic multitrack DAW. If you or anyone else can point me in the direction of some code and/or logic examples, it would truly be appreciated!

A few years ago, I wrote a simple DAW for macOS and Windows. It was written almost from scratch in C++, using the JUCE library and no other third-party libraries. For audio programming in C++, JUCE is the best solution I’ve ever come across.

I encountered the same problem you describe above.
I can hardly remember the details of how I struggled with it and finally improved it… The core of it was that I wrote 3 classes for multitrack playback.

These are the classes’ header files (the comments were originally written in Chinese):

  • PositionableResamplingAudioSource class

/** This class is effectively a two-in-one combination of ResamplingAudioSource and
    PositionableAudioSource: it can change the sample rate of the PositionableAudioSource it holds.
    @see SequenceAudioSource, PositionableMixerAudioSource
*/
class PositionableResamplingAudioSource : public PositionableAudioSource
{
public:
    /** Constructor. Param 1: the AudioSource whose sample rate this class can change.
        Param 2: whether this class takes ownership of param 1. */
    PositionableResamplingAudioSource (PositionableAudioSource* const inputSource,
                                       const bool deleteInputWhenDeleted);

    /** Sets the resampling ratio of the held AudioSource. This can be called at any time, even during playback.
        @param samplesInPerOutputSample     Must be greater than 0. A value of 1.0 is a pass-through that changes nothing.
    */
    void setResamplingRatio (const double samplesInPerOutputSample)
    { resamplingAudioSource->setResamplingRatio (samplesInPerOutputSample); }

    /** Returns the current resampling ratio, as set by setResamplingRatio(). */
    double getResamplingRatio() const throw()       { return resamplingAudioSource->getResamplingRatio(); }

    /** Sets whether playback should loop. */
    void setLooping (bool shouldLoop)               { source->setLooping (shouldLoop); }

    /** Returns true if looping is currently enabled. */
    bool isLooping() const                          { return looping; }

    /** Implementations of the pure virtual functions of the base class AudioSource. */
    void prepareToPlay (int samplesPerBlockExpected, double sampleRate);
    void releaseResources();
    void getNextAudioBlock (const AudioSourceChannelInfo& bufferToFill);

    /** Implementations of the pure virtual functions of the base class PositionableAudioSource. */
    void setNextReadPosition (int64 newPosition);
    int64 getNextReadPosition() const;
    int64 getTotalLength() const;

private:
    PositionableAudioSource* source;
    ResamplingAudioSource* resamplingAudioSource;
    int64 volatile nextPlayPos;
    const bool looping;
};
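
To make the role of the resampling ratio concrete, here is a rough sketch in plain C++ (no JUCE) that reads a clip at fractional positions using linear interpolation. It is only an illustration under stated assumptions, not the implementation behind the header above: JUCE's ResamplingAudioSource does its own interpolation and filtering, and the function names resampledLength and resampleLinear are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// ratio = input samples consumed per output sample,
// e.g. 48000.0 / 44100.0 for a 48 kHz file in a 44.1 kHz project.
std::size_t resampledLength (std::size_t inputLength, double ratio)
{
    assert (ratio > 0.0);
    return static_cast<std::size_t> (static_cast<double> (inputLength) / ratio);
}

// Hypothetical helper: resamples a mono buffer with linear interpolation.
std::vector<float> resampleLinear (const std::vector<float>& input, double ratio)
{
    assert (ratio > 0.0);
    std::vector<float> output (resampledLength (input.size(), ratio));

    for (std::size_t i = 0; i < output.size(); ++i)
    {
        const double pos = static_cast<double> (i) * ratio;   // fractional read position
        const auto index = std::min (static_cast<std::size_t> (pos), input.size() - 1);
        const auto frac = static_cast<float> (pos - static_cast<double> (index));

        const float a = input[index];
        const float b = (index + 1 < input.size()) ? input[index + 1] : a;  // hold the last sample
        output[i] = a + frac * (b - a);
    }

    return output;
}
```

A ratio of 1.0 returns the input unchanged, which matches the pass-through behaviour described for setResamplingRatio().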


  • SequenceAudioSource class

class PositionableResamplingAudioSource;

/** A PositionableAudioSource that can hold the audio data of several AudioFormatReaders and
    automatically sorts the held audio data by start time.
    While this class is producing an audio stream, AudioFormatReaders can be added or removed at any time,
    provided that resampleSource->prepareToPlay() has been called for that audio data beforehand.
    @see PositionableResamplingAudioSource, PositionableMixerAudioSource
*/
class SequenceAudioSource : public PositionableAudioSource
{
public:
    /** This nested class represents the audio data held by one audio clip, i.e. one piece of
        audio data in the SequenceAudioSource.
        @see SequenceAudioSource::getEventPointer()
    */
    class AudioEventHolder
    {
    public:
        /** Constructor. Param 1: preferably an AudioFormatPartReader. Param 2: the offset of this data on the timeline (in samples).
            Param 3: the ratio of param 1's sample rate to the project sample rate. Param 4: whether this class takes ownership of param 1. */
        AudioEventHolder (AudioFormatReader* const sourceReader,
                          const int64 startOffset,
                          const double resampleRatio = 1.0,
                          const bool deleteWhenRemoved = false);

        ~AudioEventHolder ();

        AudioFormatReader* sourceReader;                    /**< The AudioFormat reader */
        PositionableResamplingAudioSource* resampleSource;  /**< The positionable, resampling AudioSource held by this class */
        int64 startOffset;                                  /**< Start offset of the held audio data (in samples) */
        int64 sampleNums;                                   /**< Total number of samples in the held audio data */
        double resampleRatio;                               /**< Resampling ratio of the held audio data */

        friend class SequenceAudioSource;
    };

    /** Adds a piece of audio data. Param 2 is the offset in samples, i.e. the playback start position of param 1.
        Param 4 is whether this class takes ownership of param 1.
        Returns the added AudioEventHolder on success, otherwise nullptr. Note: the caller is responsible for destroying the added AudioEventHolder. */
    AudioEventHolder* addAudioEvent (AudioFormatReader* const sourceReader,
                                     const int64 startOffset,
                                     const double resampleRatio = 1.0,
                                     const bool deleteWhenRemoved = true);

    /** Adds a piece of audio data and sorts automatically. The caller is responsible for destroying the added AudioEventHolder. */
    void addAudioEvent (AudioEventHolder* event);

    /** Removes the piece of audio data at the given index. */
    void removeAudioEvent (const int index);

    /** Removes the piece of audio data held in the given AudioEventHolder. */
    void removeAudioEvent (AudioEventHolder* event);

    /** Sorts all of the audio data held by this class. */
    void sort()                                                     { list.sort (*this); }

    /** Returns the number of pieces of audio data held by this class. */
    const int getNumEvents() const                                  { return list.size(); }

    /** Returns the heap-allocated AudioEventHolder that wraps the audio data at the given index. */
    AudioEventHolder* getAudioEventHolder (const int index) const   { return list [index]; }

    /** Returns the index of a piece of audio data. */
    const int getIndexOf (AudioEventHolder* const event) const      { return list.indexOf (event); }

    /** Returns the index of the first piece of audio data at or after the given sample position.
        If the given time is past the end of the whole sequence, returns the number of pieces of audio data held. */
    const int getNextIndexAtTime (const double samples) const;

    /** Removes all audio data. */
    void clear();

    /** Returns the playback start point of the first piece of audio data held (in samples). */
    int64 getStartTime() const          { return (list.size() > 0) ? list.getUnchecked(0)->startOffset : 0; }

    /** Returns the playback start point of the last piece of audio data held (in samples). */
    int64 getEndTime() const            { return (list.size() > 0) ? list.getLast()->startOffset : 0; }

    /** Gets the start point and length (in samples) of the audio data at the given index.
        If the index is out of range, both values are returned as 0. */
    void getEventTimeAndLength (const int index,
                                int64& sampleOffset,
                                int64& sampleDuration) const;

    /** The class that owns this object must call this before this class starts producing an audio stream. */
    void prepareToPlay (int samplesPerBlockExpected, double sampleRate);

    /** The class that owns this object must call this after this class stops producing an audio stream. */
    void releaseResources();

    /** Produces the audio stream. The class that owns this object calls this whenever it needs audio data. */
    void getNextAudioBlock (const AudioSourceChannelInfo& bufferToFill);
    void setNextReadPosition (int64 newPosition);   /**< @internal */
    int64 getNextReadPosition() const   { return currentPlayingPosition; }
    int64 getTotalLength() const;                   /**< @internal */
    bool isLooping() const              { return isPlayingLoop; }

    /** @internal */
    static int compareElements (const SequenceAudioSource::AudioEventHolder* const first,
                                const SequenceAudioSource::AudioEventHolder* const second) throw()
    {
        return (first->startOffset - second->startOffset == 0) ? 0
            : ((first->startOffset - second->startOffset > 0) ? 1 : -1);
    }

private:
    /** Updates the index of the next piece of audio data to play, and the playback position. */
    void updatePlayingEvent (const int64 newPosition);

    Array <AudioEventHolder*> list; /**< Array that stores and manages the AudioEventHolders */
    AudioSampleBuffer tempBuffer;   /**< Sample buffer needed to mix two AudioEventHolders */
    CriticalSection lock;           /**< Lock used for scoped locking */

    double currentSampleRate;       /**< Current sample rate, needed when adding an AudioEventHolder */
    int64 currentPlayingPosition;   /**< Current playback position */
    int bufferSizeExpected;         /**< Current buffer size, needed when adding an AudioEventHolder */
    int currentPlayingPart;         /**< Index of the AudioEventHolder that is currently playing */

    bool isPlayingLoop;             /**< Whether loop mode is currently active */
};
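
The idea of a sequence of clips placed on a timeline can be sketched in plain C++ without JUCE. The sketch below only mirrors part of the interface declared above (insertion sorted by start offset, lookup of the next clip at a given time, and rendering a block); it is not the actual implementation, and the names Clip and ClipSequence are hypothetical. It also ignores resampling, locking and multiple channels.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Clip
{
    std::int64_t startOffset;      // position on the timeline, in samples
    std::vector<float> samples;    // mono audio data of the clip
};

class ClipSequence
{
public:
    // Inserts the clip so that the list stays sorted by start offset.
    void addClip (Clip clip)
    {
        assert (clip.startOffset >= 0);
        auto pos = std::upper_bound (clips.begin(), clips.end(), clip,
                                     [] (const Clip& a, const Clip& b) { return a.startOffset < b.startOffset; });
        clips.insert (pos, std::move (clip));
    }

    // Index of the first clip starting at or after `time`, or the clip count if there is none.
    std::size_t nextIndexAtTime (std::int64_t time) const
    {
        auto pos = std::lower_bound (clips.begin(), clips.end(), time,
                                     [] (const Clip& c, std::int64_t t) { return c.startOffset < t; });
        return static_cast<std::size_t> (pos - clips.begin());
    }

    // End of the last-finishing clip, in samples.
    std::int64_t totalLength() const
    {
        std::int64_t end = 0;
        for (const auto& c : clips)
            end = std::max (end, c.startOffset + static_cast<std::int64_t> (c.samples.size()));
        return end;
    }

    // Sums every clip that overlaps [position, position + numSamples) into one block.
    std::vector<float> render (std::int64_t position, std::size_t numSamples) const
    {
        std::vector<float> block (numSamples, 0.0f);
        for (const auto& c : clips)
        {
            for (std::size_t i = 0; i < numSamples; ++i)
            {
                const std::int64_t clipIndex = position + static_cast<std::int64_t> (i) - c.startOffset;
                if (clipIndex >= 0 && clipIndex < static_cast<std::int64_t> (c.samples.size()))
                    block[i] += c.samples[static_cast<std::size_t> (clipIndex)];
            }
        }
        return block;
    }

private:
    std::vector<Clip> clips;
};
```

Because every clip is addressed by its position on the shared timeline, a newly recorded clip only needs the correct startOffset to play back in sync with the others.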


  • PositionableMixerAudioSource class

/** This class is effectively a two-in-one combination of MixerAudioSource and PositionableAudioSource:
    it can mix one or more PositionableAudioSource objects.
    @see PositionableResamplingAudioSource, SequenceAudioSource
*/
class PositionableMixerAudioSource  : public PositionableAudioSource
{
public:
    /** Adds a PositionableAudioSource to be mixed. Param 2 is whether this class takes ownership of it. */
    void addInputSource (PositionableAudioSource* newInput, const bool deleteWhenRemoved);

    /** Removes a PositionableAudioSource that was added earlier. Param 2 is whether to destroy it after removal. */
    void removeInputSource (PositionableAudioSource* input, const bool deleteSource);

    /** Removes all added PositionableAudioSources. Whether each one is also destroyed is decided by
        param 2 of the call that added it. */
    void removeAllInputs();

    void prepareToPlay (int samplesPerBlockExpected, double sampleRate);
    void releaseResources();
    void getNextAudioBlock (const AudioSourceChannelInfo& bufferToFill);

    void setNextReadPosition (int64 newPosition);
    int64 getNextReadPosition() const;
    int64 getTotalLength() const;
    bool isLooping() const;

private:
    Array<PositionableAudioSource*> inputs;
    BigInteger inputsToDelete;
    CriticalSection lock;
    AudioSampleBuffer tempBuffer;

    int64 currentPlayingPosition;
    double currentSampleRate;
    int bufferSizeExpected;
    bool isPlayingLoop;
};
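
As a rough illustration of why a positionable mixer helps with sync, the sketch below (plain C++, no JUCE) reads every track from one shared playhead, so there is a single position to set and a single thing to start. The class name SimpleMixer and its simplified methods are hypothetical and only loosely follow the interface declared above; ownership, locking and multiple channels are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

class SimpleMixer
{
public:
    void addTrack (std::vector<float> samples)      { tracks.push_back (std::move (samples)); }

    // One position for all tracks, so they cannot drift apart.
    void setNextReadPosition (std::int64_t newPosition)
    {
        assert (newPosition >= 0);
        position = newPosition;
    }

    std::int64_t getNextReadPosition() const        { return position; }

    // The mix is as long as its longest track.
    std::int64_t getTotalLength() const
    {
        std::size_t longest = 0;
        for (const auto& t : tracks)
            longest = std::max (longest, t.size());
        return static_cast<std::int64_t> (longest);
    }

    // Sums all tracks at the shared playhead, then advances the playhead once.
    std::vector<float> getNextAudioBlock (std::size_t numSamples)
    {
        std::vector<float> block (numSamples, 0.0f);
        for (const auto& t : tracks)
        {
            for (std::size_t i = 0; i < numSamples; ++i)
            {
                const auto index = static_cast<std::size_t> (position) + i;
                if (index < t.size())
                    block[i] += t[index];
            }
        }
        position += static_cast<std::int64_t> (numSamples);
        return block;
    }

private:
    std::vector<std::vector<float>> tracks;
    std::int64_t position = 0;
};
```

Compared with calling setPosition() and start() on each track separately, the playhead here is advanced in exactly one place per block.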



loopfine, thanks for sharing your header files. I’ll dig into them and see what I can implement. Were you able to get your DAW to record and play back in sync successfully?

BTW, any other code you could dig up to help me get the basic engine up-and-running would be awesome.