Evaluating JUCE behavior to develop a generic architecture file for Faust


#1

Hi,

We are currently evaluating how the JUCE framework behaves in order to develop a generic architecture file for Faust (http://faust.grame.fr). Starting from Faust source code (a textual .dsp file), the Faust compiler produces a self-contained C++ file with the actual DSP computation, to be “connected” to the audio architecture and widget system.

Over the years we have developed architecture files (on iOS, Android, OS X, Windows, Linux…) using either multi-platform frameworks (like Qt…) or specific ones (Java + JNI audio code on Android, CoreAudio + Objective-C on iOS… etc.).

Using JUCE could be a better solution and we are currently doing some tests. Developing a basic architecture file with sliders and buttons, we were able to deploy on OS X, iOS and Android. Behavior is more or less OK on OS X and iOS, but quite unsatisfactory on Android (on a Motorola XT1068 running OS 5.0.2): the latency between an action on the screen (hitting a button, moving sliders…) and the resulting sound is awfully long (almost 1 second, probably…), whereas our “native” version was much better in this area.

Is this a known behavior? What could be the reasons for this? What tests should be done to better understand what happens?

Thanks in advance for any advice.


#2

On Android you really need to enable OpenGL rendering to get decent UI performance. With it, you get great rendering speed, but without it things are ridiculously slow… Making GL the default is on our to-do list, but for now you just need to create an OpenGLContext and attach it to your top-level component (see the big JUCE demo for an example).
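
For reference, a minimal sketch of that attach/detach pattern (class and member names are illustrative, not taken from the JUCE demo):

```cpp
#include <JuceHeader.h>   // assuming a Projucer-generated project header

// Sketch: attach an OpenGLContext to the top-level component so the whole
// component hierarchy is rendered via GL on Android.
class MainComponent : public juce::Component
{
public:
    MainComponent()
    {
        glContext.attachTo (*this);   // enable GL rendering for this hierarchy
        setSize (600, 400);
    }

    ~MainComponent() override
    {
        glContext.detach();           // detach before the component is destroyed
    }

private:
    juce::OpenGLContext glContext;
};
```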


#3

Doing some debugging, I found out that the main latency is actually at the audio level: by default the application is called with huge buffers of 11520 frames. Deploying the JUCE demo, I see that the audio buffer can be lowered to 1920 frames on the Motorola device, but not less.

We could run with a 512-frame buffer on this same machine with our own OpenSL-based code + Java interface.
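
For reference, a hedged sketch of requesting a smaller buffer through JUCE’s public AudioDeviceManager API (the device is free to clamp the request to its minimum):

```cpp
#include <JuceHeader.h>

// Sketch: ask JUCE for a smaller audio buffer. The resulting setup reflects
// what the device actually granted, which may be larger than requested.
static void requestBufferSize (juce::AudioDeviceManager& dm, int frames)
{
    juce::AudioDeviceManager::AudioDeviceSetup setup;
    dm.getAudioDeviceSetup (setup);   // read the current configuration
    setup.bufferSize = frames;        // e.g. 512 - a request, not a guarantee
    dm.setAudioDeviceSetup (setup, true);
}
```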

How mature is the Android audio layer code? Does the audio chain run entirely on the C++ side? Is it purely callback based? Or using threads?

Thanks.


#4

We have a very good relationship with the Android audio team at Google and have worked closely with them to make sure that JUCE will give you the very best latency that’s possible on that platform.

But it depends hugely on the device and version of Android that you’re running - older devices have very poor latency. Newer Android M devices that are badged as “pro audio” will give performance that’s pretty much on par with what iOS can do.


#5

Thanks… but this does not answer my precise questions: why this poor latency on this device, where our own code could do better?

I looked at the “juce_android_Audio.cpp” code: it seems an additional thread is used beside the native OpenSL callbacks to drive the user audio callback. Why is that? Why not do everything in the OpenSL callbacks themselves (as the SuperPowered people (https://www.google.com/search?client=safari&rls=en&q=SuperPowered&ie=UTF-8&oe=UTF-8) are doing in their SDK)?
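
To make the question concrete, here is a minimal sketch of the pure-callback approach I mean, where the user DSP runs directly on the OpenSL thread (EngineState and computeNextBlock are illustrative names, not JUCE or SuperPowered code):

```cpp
#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>

// Illustrative per-stream state (hypothetical, for the sketch only).
struct EngineState
{
    short* buffer;
    int numFrames;
    int numChannels;
};

void computeNextBlock (short* buffer, int numFrames); // user DSP, defined elsewhere

// All DSP happens directly in the OpenSL buffer-queue callback:
// no intermediate thread or FIFO between the driver and the user code.
static void bufferQueueCallback (SLAndroidSimpleBufferQueueItf queue, void* context)
{
    auto* state = static_cast<EngineState*> (context);

    computeNextBlock (state->buffer, state->numFrames);

    (*queue)->Enqueue (queue, state->buffer,
                       (SLuint32) (state->numFrames * state->numChannels * sizeof (short)));
}
```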

Thanks.


#6

Re: your particular device, I have no idea - Fabian’s our Android expert, and he might be able to give you more info when he’s back next week.

We’ve got an ongoing project with the Google guys investigating callback-vs-threads. It’s a non-trivial question - obviously we’ve tested both techniques, but although it may seem at first glance that callbacks lower the latency, we found other problems with them performing poorly under higher CPU loads and leading to spikes and glitches. We ended up building a test app for the Google team to use to measure these performance characteristics, and although Fabian may be able to give a more up to date answer to this next week, IIRC there were some kernel adjustments needed before it’d be possible for either way to provide the optimal performance. In the meantime, we implemented what we found to be the best method overall.


#7

Thanks for the detailed answer.

It is quite clear that whatever you do, you’ll end up having “spikes and glitches” at high CPU load… And the experience in this area on all the systems I know of (iOS, OS X, Windows, Linux…) is that implementing a more complex system (like another thread, with its priority-setting issues, synchronization issues…) usually makes things worse… Or you end up adding buffering and latency somewhere in the chain to “hide” the real thing, in some sense.

I"ll wait for Fabian’s insight then.


#8

I wish it was that simple, but on mobile multi-core platforms, especially where asymmetric CPUs are involved, there are also OS load-balancing behaviours that need to be considered, and either dealt with at the kernel level (which is what we’re talking to the Android team about) or by workarounds in our code.

…I think basically what I’m trying to say here is: this is way more complex than a reasonable person would expect it to be, but we’re on the case!


#9

Do you mean things are basically different on a multi-core mobile platform compared to the multi-core regular OSes that we have been dealing with for several years?

What is an asymmetric CPU then?


#10

Yep. Asymmetric CPUs have different cores that run at different clock speeds. So if the OS suddenly shifts your audio thread to a less powerful core to save power because it’s too busy, then things get nasty!
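
As an illustration of the kind of workaround in question, here is a hedged sketch of pinning the audio thread to a fixed set of cores on Android/Linux so the scheduler cannot migrate it (the core numbers are a device-specific assumption, and this is not what JUCE actually does):

```cpp
#define _GNU_SOURCE           // needed for the CPU_* macros on glibc
#include <sched.h>
#include <unistd.h>
#include <sys/syscall.h>

// Sketch: pin the calling thread to a given set of CPU cores so the kernel's
// load balancer cannot move it to a slower core mid-stream.
static bool pinCurrentThreadToCores (const int* cores, int numCores)
{
    cpu_set_t set;
    CPU_ZERO (&set);

    for (int i = 0; i < numCores; ++i)
        CPU_SET (cores[i], &set);

    // gettid() via syscall: sched_setaffinity() needs the kernel thread id.
    return sched_setaffinity ((pid_t) syscall (SYS_gettid), sizeof (set), &set) == 0;
}

// Hypothetical usage, assuming cores 4-7 are the fast cluster on the device:
// const int bigCores[] = { 4, 5, 6, 7 };
// pinCurrentThreadToCores (bigCores, 4);
```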


#11

I read the following on this page: https://googlesamples.github.io/android-audio-high-performance/guides/opensl_es.html

“One consequence of potentially independent audio clocks is the need for asynchronous sample rate conversion. A simple (though not ideal for audio quality) technique for asynchronous sample rate conversion is to duplicate or drop samples as needed near a zero-crossing point. More sophisticated conversions are possible.”

Is there any “clock drift adaptation” code in place in the JUCE audio layer to deal with that?
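
For illustration, a minimal sketch of the naive drop-a-sample technique the quote describes (names and structure are mine, not from the Android docs):

```cpp
#include <cmath>
#include <cstddef>

// Find the index where the signal is closest to zero: dropping or duplicating
// a sample there makes the discontinuity least audible.
static size_t findNearestZero (const float* buf, size_t len)
{
    size_t best = 0;
    for (size_t i = 1; i < len; ++i)
        if (std::fabs (buf[i]) < std::fabs (buf[best]))
            best = i;
    return best;
}

// Drop one sample near a zero crossing when the source clock runs fast;
// returns the new buffer length (len - 1).
static size_t dropOneSample (float* buf, size_t len)
{
    const size_t z = findNearestZero (buf, len);
    for (size_t i = z; i + 1 < len; ++i)
        buf[i] = buf[i + 1];
    return len - 1;
}
```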


#12

Are you confusing the audio clock with the CPU clock? I was talking about CPU clocks, in the context of them creating subtle performance problems, but that has nothing at all to do with audio syncing.


#13

We discovered recently that our native OpenSL code suffers from an audio synchronization issue when in duplex mode: after some minutes, we get audio clicks. So I found this blog post, which explains that this probably comes from this audio clock drift issue.

So I looked at the SuperPowered SDK, and I don’t see any “audio clock drift” compensation code.

My point is that another issue that has to be handled in the Android audio layer is this audio clock drift issue, and I was wondering if JUCE contains code to do that.


#14

No, there’s not currently any multi-stream syncing code, but it’s something that I’ve wanted to add for a while, probably in the context of a more general-purpose audio i/o syncing utility that could deal with different sample rates too.


#15

I would even say: different sample rates and possibly different buffer sizes. This can be needed when you transfer audio over a network between 2 machines, each using its own buffer size/sample rate setting, and with the “clock drift” issue on top.

A good candidate to have a look at is Zita-njbridge, here: http://kokkinizita.linuxaudio.org/linuxaudio/, extracting the essence of Fons Adriaensen’s algorithm to be used in a wider context.
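
The core idea, very loosely sketched (constants and names are mine, not taken from Fons Adriaensen’s code): continuously nudge an adaptive resampling ratio so that the jitter-absorbing ring buffer stays near half full:

```cpp
// Loose sketch of adaptive-resampling clock-drift compensation: the reader
// side resamples by 'ratio', and the ratio is servoed so the ring buffer's
// fill level converges on 50%. A real implementation (e.g. zita-njbridge)
// filters the error estimate much more carefully than this.
static double updateResamplingRatio (double ratio, int fillLevel, int capacity)
{
    const double error = (double) fillLevel / (double) capacity - 0.5; // target: half full
    const double gain  = 1.0e-4;                                       // small loop gain (illustrative)
    return ratio * (1.0 - gain * error);
}
```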


#16

Yes, buffer sizes too.

There’s actually already some simple code to do this buried inside our CoreAudio implementation, but I’d like to pull it out and make it a bit more flexible.


#17

I see in the juce_mac_CoreAudio.cpp file that there seems to be a mode where an additional thread is used (with this “run” method): why is that?
On OS X, even in the presence of separate audio devices, you can perfectly well combine them into a single one (using the “aggregate device” concept), then use the AUHAL Audio Unit to have a single duplex callback handling inputs and outputs at the same time.
This is much simpler, and there is no audio clock drift issue to handle (since this is done by the underlying CoreAudio layer).
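
For concreteness, a hedged sketch of opening the AUHAL unit with both directions enabled, so a single render callback can serve the duplex stream (error checking omitted; this is only an outline, not the JUCE code):

```cpp
#include <AudioUnit/AudioUnit.h>

// Sketch: create an AUHAL instance with input (element 1) and output
// (element 0) both enabled, ready for a single duplex render callback.
static AudioUnit createDuplexAUHAL()
{
    AudioComponentDescription desc = { kAudioUnitType_Output,
                                       kAudioUnitSubType_HALOutput,
                                       kAudioUnitManufacturer_Apple, 0, 0 };

    AudioUnit unit = nullptr;
    AudioComponentInstanceNew (AudioComponentFindNext (nullptr, &desc), &unit);

    UInt32 enable = 1;
    AudioUnitSetProperty (unit, kAudioOutputUnitProperty_EnableIO,
                          kAudioUnitScope_Input, 1, &enable, sizeof (enable));   // enable input
    AudioUnitSetProperty (unit, kAudioOutputUnitProperty_EnableIO,
                          kAudioUnitScope_Output, 0, &enable, sizeof (enable));  // enable output
    return unit;
}
```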


#18

In the juce_ios_Audio.cpp process method: you’re still using the kLinearPCMFormatFlagIsSignedInteger format? This was needed in the early days of iOS, but floats are now supported natively, and you can be sure Apple’s highly optimized CoreAudio short/float and float/short conversion routines are much better than what can be simply coded in a C++ loop.

Apple’s code can be looked at here: http://opensource.apple.com/release/os-x-10112/ (IOAudioFamily-204.3, where there is some x86 code…), and we can expect highly optimized vectorized code to be used nowadays (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/BCFGHIBH.html).
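
For illustration, a hedged sketch of what requesting native float I/O looks like (the field values are the standard ones for 32-bit packed float; this is not the JUCE code):

```cpp
#include <AudioToolbox/AudioToolbox.h>

// Sketch: describe a native 32-bit float, interleaved PCM stream, avoiding
// the 16-bit integer format and the manual short<->float conversion loops.
static AudioStreamBasicDescription makeFloatFormat (double sampleRate, UInt32 numChannels)
{
    AudioStreamBasicDescription fmt = {};
    fmt.mSampleRate       = sampleRate;
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;
    fmt.mBitsPerChannel   = 32;
    fmt.mChannelsPerFrame = numChannels;
    fmt.mFramesPerPacket  = 1;
    fmt.mBytesPerFrame    = (fmt.mBitsPerChannel / 8) * fmt.mChannelsPerFrame;
    fmt.mBytesPerPacket   = fmt.mBytesPerFrame;
    return fmt;
}
```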


#19

Have you seen this recent Superpowered announcement:

Anything Fabian could add on the current performance of the JUCE Android audio layer?

Thanks.


#20

Very difficult to comment on that Superpowered announcement without losing our diplomatic cool…

But trying to be very polite about it, I would say that we all consider it at best disingenuous of them to suggest that something that only works on rooted devices is a “problem solved”.

For the record: there’s nothing their library does on non-rooted devices which is in any way faster than JUCE’s implementation. We work closely with Google to make sure that that’s always the case. And I’d say the chances of Superpowered getting a friendly reception from the Android audio team after this publicity stunt are pretty low.