Larger FFT sizes really only improve low-end precision; the added precision in the high end isn't needed, especially if you're displaying the frequency content on a log scale.
With an FFT size of 8192 and a sample rate of 48kHz, you'll have a frequency bin every 48000 / 8192 ≈ 5.86Hz (I think my math is right? please correct me if that's wrong). In the high end a 6Hz difference is completely undetectable by human hearing (10kHz vs. 10.006kHz, for example), so you really don't need that level of precision there, while at the low end 6Hz is quite noticeable.
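That arithmetic can be sanity-checked in a couple of lines (a standalone sketch of the math above, not code from any analyser):

```cpp
// Frequency resolution of an FFT: one bin every sampleRate / fftSize Hz.
double fftBinWidthHz(double sampleRate, int fftSize)
{
    return sampleRate / fftSize;
}

// fftBinWidthHz(48000.0, 8192) == 5.859375, i.e. ~5.86Hz per bin:
// roughly a whole semitone near 100Hz, but a vanishingly small
// interval at 10kHz.
```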
In GT Analyser I allow users to select a block size of anywhere from 1024 to 65536, but I have a fixed number of points of 256 (which is actually a maximum, because repeated points are omitted), which still gives good precision across the spectrum.
I think when I've done it in the past, I determined the minimum distance on the x-axis between two adjacent points, then for every spectrum point falling between those two pixels I took the maximum value to determine the y-coordinate. This works nicely regardless of zoom level etc.
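A sketch of that reduction (the names and the bin-to-pixel mapping are mine, not from GT Analyser): for each pixel column, keep the maximum of all spectrum points that land on it.

```cpp
#include <algorithm>
#include <vector>

// Reduce `bins` (one magnitude per FFT bin) to `numPixels` y-values by
// taking the maximum of all bins that fall on the same pixel column.
// `binToPixel` maps a bin index to an x position (e.g. a log-frequency map).
std::vector<float> maxPerPixel(const std::vector<float>& bins,
                               int numPixels,
                               float (*binToPixel)(int))
{
    std::vector<float> out((size_t) numPixels, 0.0f);
    for (int i = 0; i < (int) bins.size(); ++i)
    {
        int px = std::clamp((int) binToPixel(i), 0, numPixels - 1);
        out[(size_t) px] = std::max(out[(size_t) px], bins[i]);
    }
    return out;
}
```

With a log-frequency mapping, many high-end bins collapse into each pixel while low-end pixels get one bin or fewer, which is exactly why taking the max per pixel keeps peaks visible at any zoom level.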
As @ImJimmi says, I've also seen an OpenGL context increase CPU, however only on Windows. On macOS it generally seems to improve things. In the past we had an option in the plugin to enable the OpenGL context; we defaulted it to on for macOS and off for Windows.
I just tried Kotlin Compose, and the results look disappointing for JUCE.
Apple M1 test:
Kotlin Compose - 32768 lines [512px, 256px] with gradient ~60 fps - 33.2% CPU
Kotlin Compose - 98304 lines [512px, 256px] with gradient ~60 fps - 34.7-37.8% CPU
Juce strokePath - 32768 lines [140px, 96px] with setGradientFill ~60 fps - 64.4-72.3% CPU
As you can see, the number of lines is 3 times larger and the window size is 3 times larger, and the load is still several times lower for Compose. I suspect Skia is under the hood. Something is clearly wrong with strokePath. Any solutions to reduce CPU load / raise FPS? P.S. I don't want to use openGLContext.attachTo - it causes freezes and memory increases significantly.
[Edit] Oh, and I forgot to say that ~12% CPU in Compose goes to wave[i] = Random.nextFloat(), so in fact the drawing load will be much lower. In JUCE I just load static values from wave.
From my own experiences with profiling path rendering on macOS (using the animated waveforms on the home page of the Demo app, upping the timer rate to 1000Hz to force continuous updates), the biggest bottleneck is copying data from the juce::Path to the CoreGraphics Context.
If instead we could have a way to construct a path that adds to the context directly, we could remove this costly copying.
That being said, there are still things you can do on your end to make things more performant - e.g. you should preallocate memory using juce::Path::preallocateSpace() after calling path_.clear() to ensure you're not having to re-allocate as you build up the path.
Am I correct that you want to create and stroke 1024 * 32 paths at 60 frames a second? If so, I think you are going to find this a performance challenge no matter the tech/platform (re: seeing high memory with OpenGL).
For context, I’ve done a lot of performance optimization, now in JUCE and before that professionally for over a decade — IMO regardless of platform you will end up needing to get creative and “cheat” to get something performing well. Basically, find ways to draw less or less often or reuse what you’ve already drawn. Even more important if others will run your app (you have a decent M1 machine and others may still be on older machines).
The “get creative” part depends 100% on your end goals and what compromises are acceptable for your use case. I wrote these based just on your vid, so might not all make sense for you:
Futz with details: See if you get better performance with 1 path per lineIndex vs. one big path with subpaths (make sure to preallocate enough space). See what the impact of removing the gradient fill is. What is it like if you paint to a single-channel image first, then composite up to ARGB? It's worth knowing all these pieces of information for your specific use case, as they will inform your options.
Performance goes hand in hand with UX. It seems like you are trying to display 32 realtime channels of waveform data — what should the user be able to pick out and attend to? The waveforms in the video are tiny in height and going fast, so is it mainly “vibes” or is there something technical you want to highlight? If so, focus on improving that and usually performance will follow.
Try draw n' clear. Instead of every sample changing, you could draw each path starting at x=0 and, over 1024 frames, draw out to x=endOfScreen so it "builds up" over time. When you get to the end, you wipe the screen and start from x=0 again. This is a different vibe, but it lets you draw a very small slice at a time and might help users pick out trends.
Sliding Cached Image. Instead of recreating the entire image from paths each frame, you might have better luck having a cached image 1.5 or 2x wider than its component viewport, adding 32 new points to the image each frame, then translating the image left in a container. This might require also storing the last y positions so you can create a new path with just the new data. Every X frames you’ll need to copy the right side of the image to the left.
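The index bookkeeping for that sliding image can be sketched with a plain array of columns standing in for the juce::Image (the 2x width and the wrap policy are assumptions taken from the description above, and the names are mine):

```cpp
#include <algorithm>
#include <vector>

// Model: an image `imageWidth` columns wide backing a viewport that is
// `viewWidth` columns wide (imageWidth ~ 2 * viewWidth). Each frame one
// new column is written at `writeX` and the image is translated left in
// its container. When writeX hits the right edge, the most recent
// viewWidth columns are copied back to the left and writing continues.
struct SlidingColumns
{
    int viewWidth, imageWidth, writeX = 0;
    std::vector<std::vector<float>> columns; // columns[x] = one pixel column

    SlidingColumns(int view, int image, int height)
        : viewWidth(view), imageWidth(image),
          columns((size_t) image, std::vector<float>((size_t) height, 0.0f)) {}

    void pushColumn(const std::vector<float>& col)
    {
        if (writeX == imageWidth) // wrap: keep only the visible tail
        {
            for (int x = 0; x < viewWidth; ++x)
                columns[(size_t) x] = columns[(size_t) (imageWidth - viewWidth + x)];
            writeX = viewWidth;
        }
        columns[(size_t) writeX++] = col;
    }

    // Leftmost visible column: draw the cached image offset by -firstVisible().
    int firstVisible() const { return std::max(0, writeX - viewWidth); }
};
```

The copy-back only happens once every (imageWidth - viewWidth) frames, so per-frame work is just one new column plus a translated blit of the cached image.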
Data preprocessing. [Edit: relevant to vid more than screenshot]. Your LineResolution is a step in the right direction, but ideally the data would be manipulated and reduced. Think of what a DAW does and why it does it (it reduces waveform data depending on scale for performance and UX reasons).
Something is clearly wrong with strokePath
g.strokePath generates a new path per call and then fills that new path. If a stroked path is reused across paint calls, it's more efficient to manually call juce::PathStrokeType(2.0f).createStrokedPath and cache the resulting path in the component across calls.
Good luck! It’s a legitimate tough problem but I think you have lots of opportunity to tame it if you focus on reusing what’s already been drawn.
@sudara
The problem is not so much finding a clever solution; the problem is that Kotlin Compose draws this with ease, and at even higher counts. Here I use only 32 lines, but with Kotlin I even tried 120 lines and it doesn't matter - it still has low CPU usage and FPS > 60 (Mac M1), and on iPad >= 112-120. Also GPU usage is 0.1%. That seems strange to me.
On the same platform with the same code structure?
Regardless, my tips will be useful if you move forward with JUCE. You’ll have to manually optimize for your use case. Sliding image would be my first bet based on your last posted image.
Yes, everything is on the same platform. The code is also similar and simple: it just reads values from a float array. I would even say the Kotlin version is less optimized, since every frame I make a new array with Random.nextFloat(), because I'm only testing it… But in the JUCE example I read static data that I don't even change yet; I just call repaint().
I don’t know what they did there, I just remember trying Flutter and it was terribly slow with a cpu of 90-100%.
The reason I need to update all 32 lines rather than use setBufferedToImage(true) and draw a secondary image on top is that when the FM modulation changes, I need to reflect it - same as in Ableton's Wavetable, by the way.
[Edit] I also just tried preparing the Path in the constructor and calling only g.strokePath(path_…) in paint - nothing changed, same CPU load. So the cost of building the path has no effect here.
Do you know how those values are converted to filled lines on screen? My guess is custom GPU stuff? JUCE paths very much 1:1 wrap an underlying macOS CoreGraphics implementation.