Graphics Performance: Windows vs. macOS


#1

Hello,

We are having GUI performance issues on macOS 10.13 vs. Windows 7 & 10. I have read through several other forum threads about others having similar issues, but it seems none of the things that were discussed there would apply to what I am seeing.

I created a very small sample application to show the issue. After using Projucer to create the framework of the app, I customized maybe 25 lines of code to show the problem. On Windows, if I set my timer to 16ms (60Hz), the app uses something like 3% of the CPU. On macOS, it uses 75% of the CPU and cannot run at 60FPS. (There are notes in the source code about the testing I have done.) The app draws 256 ‘meters’ (so 512 rectangles total) and it is completely hosed on macOS and Windows isn’t even breathing hard.

I have done performance analysis on both versions, and the macOS version spends all of its time in the paint function dealing with the 512 rectangles. At this point I cannot think of a way around this to speed up the redraw code other than to scrap JUCE and use something like Metal on macOS. I talked with Jules about this at an ADC a while back, and he suggested using RectangleList, which did not help the problem.

I have put the source to the TestUIApp on dropbox: https://www.dropbox.com/s/3fgyyob87xjlkbq/TestUIApp.zip?dl=0

There is a secondary issue that I also discovered. If I try and be smart about only invalidating the part of the rectangle that is actually changing, this actually slows the app down to an unusable point when the mouse cursor hovers over any of the rectangles. There is a checkbox in the app to turn on/off the ‘feature’ so you can see the issue for yourself.

Thank you in advance for any help or suggestions you might have.

David A. Hoatson


#2

The CoreGraphics API should be able to draw a few rectangles, it sounds to me like you’ve got something else going on there.

You’re using subcomponents for all these objects - have you tried playing with Component::setPaintingIsUnclipped() ? Could be that it’s the clipping, not the drawing, which is slow?


#3

Jules,

Thanks for the quick reply. Unfortunately, adding setPaintingIsUnclipped(true); didn’t change anything. 78% CPU on the MacPro 2013 (6-Core Intel Xeon E5) with AMD FirePro D500 graphics.

I have updated the source in dropbox with the setPaintingIsUnclipped changes.

Any other thoughts as to what might be going on here?

Thank you,

David A. Hoatson


#4

I’ve found repainting so many subcomponents that fast isn’t going to perform very well, even if they’re not repainting their whole bounds each time. I had about the same results (75%+ CPU, didn’t consistently keep 60fps) when profiling this example in a Release build on my machine.

I was able to get a little performance increase by calling repaint() on the MainComponent at the end of the timer callback, and commenting out the repaint() calls that were inside MeterComponent's update method. This way you only have one trigger to repaint that just goes over the whole window. I was able to keep it at 60 FPS on my machine but the CPU performance increase wasn’t that great while profiling a Release build.

The best case I got though was removing the meters as visible components, and instead just drawing all of their rectangles in MainComponent::paint() and using MainComponent::repaint(160, 10, 1024, 848) to just repaint the area where the metering is drawn. I also removed the call to Graphics::fillAll(), since the main window is already filling the background colour. When profiling this got it down to about 50-60% CPU and I kept a consistent 60 FPS.


#5

Tony,

Thanks for your help on this. I can’t remember if it was you or Ben Loftis that I spoke to at a previous NAMM show.

I also changed the app to remove the individual MeterComponent, removed the fillAll() and used RectangleList to paint the 512 rectangles in the MainComponent::paint function and was able to get the CPU down to 27% on macOS (full 60FPS), which is a huge improvement.

Unfortunately this is still not good enough.

I’m starting a new project for a new product and I cannot use JUCE on macOS if we are going to burn 27% of the CPU just to paint meters on the screen. Maybe I’m being unrealistic in assuming that we should be getting the same (or similar) GUI performance on macOS that we see on Windows.

Thank you,

David A. Hoatson


#6

Just a general comment: in doing graphics you should, in all cases, take the approach of drawing as little as possible. Your design should reflect this philosophy from the beginning.

So, in other words, look through your code base and find all possible places where you can avoid painting. Implement your code in such a way that paints (repaints) happen only when absolutely necessary.

For example, something I do routinely is to actually use a flag to block repaints while processing is happening elsewhere. So the processing sets the flag when it starts, and clears the flag when it finishes. The technique is very simple, and typically invisible. And, usually you can easily achieve at least 30 fps, and often much higher, even if you block quite a few repaints. And those repaints are usually not necessary for good user experience. So, this saves CPU for more important tasks.
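
The flag technique described above might look roughly like the sketch below. It is plain C++ so that it runs without a GUI framework; `MeterView`, `ScopedRepaintBlock` and the two counters are hypothetical names invented for illustration, and the line marked "real repaint" is where a JUCE app would call `Component::repaint()`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a GUI component (not a JUCE class).
class MeterView
{
public:
    // Called by anything that wants the view redrawn.
    void requestRepaint()
    {
        if (repaintsBlocked.load())
        {
            ++skippedRepaints;   // dropped while processing is in progress
            return;
        }
        ++issuedRepaints;        // real repaint would go here
    }

    // Plain counters, for demonstration on a single thread only.
    int issuedRepaints = 0;
    int skippedRepaints = 0;
    std::atomic<bool> repaintsBlocked { false };
};

// RAII guard: sets the flag on construction and restores the previous
// value on destruction, so nested guards do not unblock too early.
class ScopedRepaintBlock
{
public:
    explicit ScopedRepaintBlock (MeterView& v)
        : view (v), previous (v.repaintsBlocked.exchange (true)) {}
    ~ScopedRepaintBlock() { view.repaintsBlocked.store (previous); }

    ScopedRepaintBlock (const ScopedRepaintBlock&) = delete;
    ScopedRepaintBlock& operator= (const ScopedRepaintBlock&) = delete;

private:
    MeterView& view;
    const bool previous;
};

// Usage: repaints requested during "processing" are skipped.
inline void repaintBlockExample()
{
    MeterView view;
    view.requestRepaint();                 // issued
    {
        ScopedRepaintBlock block (view);   // processing starts, flag set
        view.requestRepaint();             // skipped
        view.requestRepaint();             // skipped
    }                                      // processing ends, flag restored
    view.requestRepaint();                 // issued
    assert (view.issuedRepaints == 2 && view.skippedRepaints == 2);
}
```

Skipped repaints are simply dropped here, on the assumption that a later timer tick will redraw anyway; a variant could remember that a repaint was requested and issue one when the guard goes out of scope.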


#7

have you tried switching to opengl for rendering the rectangles? they can even be batched in a single draw call. if you want 60fps and your full window is going to repaint that often, then i don’t think you’ll find a better approach than switching to it.


#8

It’s not terrible performance, given what you’re asking it to do. And bear in mind that these are percentages of only one core, so for the entire machine it’s really a fraction of that overall - and if you’re comparing with Windows performance, make sure you’re actually comparing the same per-core metric, as different CPU meters report things differently.
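
As a rough worked example of the per-core point: assuming the 78% figure from post #3 is a percentage of one core, and taking the core count as something the reader supplies, the conversion to a whole-machine percentage is a single division. The function name is hypothetical.

```cpp
#include <cassert>

// Converts a "percent of one core" reading into "percent of the whole
// machine". logicalCores is an assumption supplied by the caller; which
// convention a given CPU meter uses has to be checked separately.
double perCoreToWholeMachine (double percentOfOneCore, int logicalCores)
{
    assert (logicalCores > 0);
    return percentOfOneCore / static_cast<double> (logicalCores);
}
```

With 6 cores, `perCoreToWholeMachine (78.0, 6)` gives 13.0; if hyper-threading makes that 12 logical cores (an assumption about that machine), the same reading is 6.5% of the whole machine.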

JUCE_ENABLE_REPAINT_DEBUGGING is quite useful for this kind of thing. Turning it on here, I can see that you’re only repainting the meters, and not the strips of background in between them. In general that’s good practice, as it means you’re drawing fewer pixels, however on a GUI like this with so many separate rectangles, that means that each repaint involves a hugely fragmented clip region, and that’s clearly eating into the CPU. Just a quick hack to remove the gaps between the meters speeds things up by a noticeable amount.

I’m surprised that CoreGraphics is being a bit slow here, but good luck in finding anything other than JUCE which makes it faster - we all just call the same API functions under the hood!
However, doing a quick test of turning on openGL rendering cut it down to 20% of one core on my machine, so maybe that’s the way to go. Ultimately, the only really super-efficient way to render something as busy as this is with a custom GL shader.


#9

oh, and one more quick observation - adding a repaint() to your timerCallback() takes it down to 10% so probably there’s some overhead going on by having all those separate components repainting sub-areas at different times. You’d probably see better performance by just having one big RectangleList of all of them, and not using separate subcomponents.

I should mention that this is a pretty unusual case where it could be faster to keep redrawing larger areas. Normally you’d want to minimise the number of pixels, but in this case where the content is simple large rectangles, it’s the repaint region validation that seems to be more significant.
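
A minimal sketch of the "one big list" idea, in plain C++ so it runs without JUCE. It assumes each meter is two rectangles (an unlit part above a lit part) laid out side by side with no gaps; `Rect`, `MeterLayout`, `MeterRects` and `buildMeterRects` are hypothetical names, and the layout numbers are made up. In a JUCE app each vector would presumably be copied into a `juce::RectangleList<float>` and drawn with `Graphics::fillRectList()` inside a single `paint()`, with one `repaint()` per timer tick.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain stand-in for juce::Rectangle<float>.
struct Rect { float x, y, w, h; };

// Hypothetical layout: meters sit side by side with no gaps between them.
struct MeterLayout
{
    float originX = 160.0f;
    float originY = 10.0f;
    float meterWidth = 4.0f;
    float meterHeight = 100.0f;
};

// One list per colour, so each can be filled in a single call.
struct MeterRects
{
    std::vector<Rect> unlit;
    std::vector<Rect> lit;
};

// Builds both rectangles for every meter from its level in [0, 1].
MeterRects buildMeterRects (const std::vector<float>& levels,
                            const MeterLayout& layout)
{
    assert (layout.meterWidth > 0.0f && layout.meterHeight > 0.0f);

    MeterRects out;
    out.unlit.reserve (levels.size());
    out.lit.reserve (levels.size());

    for (std::size_t i = 0; i < levels.size(); ++i)
    {
        // Clamp so that out-of-range levels cannot produce negative heights.
        const float level = std::max (0.0f, std::min (1.0f, levels[i]));
        const float x = layout.originX + static_cast<float> (i) * layout.meterWidth;
        const float litHeight = level * layout.meterHeight;
        const float unlitHeight = layout.meterHeight - litHeight;

        out.unlit.push_back ({ x, layout.originY, layout.meterWidth, unlitHeight });
        out.lit.push_back ({ x, layout.originY + unlitHeight, layout.meterWidth, litHeight });
    }
    return out;
}
```

Rebuilding the two lists on every timer tick keeps all of the geometry in one place, which is what makes a single whole-area repaint possible.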


#10

Thanks for all the pointers. Doing the repaint in the timerCallback and using a RectangleList is how I was able to get it down to 27% CPU on macOS.

Obviously this is just a test application to show the issue I was seeing and to try and figure out how to solve the problem. Having 256 meters on screen painting at 60FPS is the worst-case I could think of so having fewer meters and even dropping to 30FPS would help. My real application has much fancier looking meters instead of plain rectangles, so it is actually much more CPU intensive.

In my previous tests with openGL, it did not improve performance at all with this type of test. I haven’t tried it since switching to the single RectangleList way of doing things, so I’ll give that a go and see how it improves things.

Thanks again, and if you think of anything else that might help, please let me know!

David A. Hoatson


#11

I have added the openGL code to my TestUIApp and that further reduced the CPU load on both Windows 7 and macOS 10.13. Windows 7 went down to 1% and macOS 10.13 went down to 17%. This is with RectangleList and removing all of the repaints for the 256 Components. I believe this is the reason for the increase in performance when using openGL that I wasn’t seeing before. All of the rectangle painting is pipelined for openGL to speed up. Although 17% still isn’t as good as I was hoping (I was hoping for under 10%), this is an amazing amount of performance increase from the 78% CPU load for the same functionality just a couple of days ago.

I think that when I put a realistic number of actual meters in the real application the CPU load will be well under my 10% goal. I am happy to upload the latest source in case anyone else cares to see what changed.

Thanks again for everyone’s help.

David A. Hoatson


#12

if you pack your rectangles in a single vertex buffer and glDrawArrays them you can potentially draw tens of thousands of rectangles (as triangle strips) per second. and in your fragment shader you can do whatever fancy color effect in one go.
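
a rough sketch of the packing step, CPU side only. note the swap: it emits independent triangles (two per rectangle, for `GL_TRIANGLES`) rather than triangle strips, since that avoids the degenerate vertices needed to join separate rectangles in one strip. `Rect`, `packRectsAsTriangles` and `vertexCount` are hypothetical names; the GL calls appear only in comments because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain rectangle type for illustration.
struct Rect { float x, y, w, h; };

// Packs every rectangle into one flat array of x,y pairs:
// two triangles, i.e. six vertices (twelve floats), per rectangle.
std::vector<float> packRectsAsTriangles (const std::vector<Rect>& rects)
{
    std::vector<float> vertices;
    vertices.reserve (rects.size() * 12);

    for (const auto& r : rects)
    {
        const float x0 = r.x, y0 = r.y;
        const float x1 = r.x + r.w, y1 = r.y + r.h;

        const float quad[12] = { x0, y0,  x1, y0,  x0, y1,    // first triangle
                                 x1, y0,  x1, y1,  x0, y1 };  // second triangle
        vertices.insert (vertices.end(), quad, quad + 12);
    }
    return vertices;
}

// Number of vertices to pass as the count argument of glDrawArrays.
std::size_t vertexCount (const std::vector<float>& vertices)
{
    assert (vertices.size() % 2 == 0);   // two floats per vertex
    return vertices.size() / 2;
}

// With a current GL context and a bound vertex buffer, the upload and the
// single draw call would be along these lines:
//   glBufferData (GL_ARRAY_BUFFER, vertices.size() * sizeof (float),
//                 vertices.data(), GL_DYNAMIC_DRAW);
//   glDrawArrays (GL_TRIANGLES, 0, (GLsizei) vertexCount (vertices));
```

for 512 rectangles this comes to 3072 vertices in one buffer, drawn with one call.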


#13

you can prototype your shaders here, right in the browser:

https://www.shadertoy.com/