Optimizing the drawing


#1

I don’t know if this still applies to the latest version, but I remember that whenever WM_PAINT was received, Component::paint() would be called for every component covered by the region passed with WM_PAINT. It would make more sense to retain a backbuffer and only draw the components that really need it (flag them with “needsRepainting”). In the best case nothing needs repainting and you can blit the backbuffer immediately. This could significantly improve drawing speed. (Yes, I still have problems with it: updating only the level meters in my Mixer causes everything between the level meters to be repainted as well. The blitting itself is fast now, since your update to StretchBlt.)
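
As a rough illustration of the retained-backbuffer idea (plain C++, not JUCE code; `Widget`, `needsRepainting` and `paintDirtyWidgets` are made-up names, and this is only a sketch of the proposal, not how JUCE works internally):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a component that draws into a retained backbuffer.
struct Widget
{
    bool needsRepainting = true;   // a new widget has never been drawn
    int  paintCount      = 0;      // counts how often paint() actually ran

    void paint()
    {
        ++paintCount;              // real drawing into the backbuffer would go here
        needsRepainting = false;
    }
};

// Repaints only the flagged widgets and returns how many were redrawn.
// A return value of 0 means the backbuffer could be blitted unchanged.
std::size_t paintDirtyWidgets (std::vector<Widget>& widgets)
{
    std::size_t repainted = 0;

    for (auto& w : widgets)
    {
        if (w.needsRepainting)
        {
            w.paint();
            ++repainted;
        }
    }

    return repainted;
}
```

On a paint request you would call `paintDirtyWidgets()` and then blit the backbuffer, whether or not anything was redrawn.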


#2

That shouldn’t be the case… If you have components that don’t intersect the complex region, then they shouldn’t be repainted at all. Are you 100% sure they’re not accidentally getting invalidated?


#3

Let’s say I have a Mixer on screen with 30 level meters, each 10 pixels wide and 90 pixels high, with a horizontal gap of 30 pixels between them.
The top of the level meters is at y=10, the bottom at y=100.
The first level meter is at x=10, so the next one is at x=50, and the last of the 30 is at 10+29*40=1170 (and its Component::getRight() would return 1180).

If I call repaint() on all my MixerChannels’ level meters in the same timerCallback, this results in successive repaint() = InvalidateRect() calls for the 30 small regions, which Windows will coalesce for the next WM_PAINT, so the WM_PAINT’s rectangle will be (x1=10, y1=10, x2=1180, y2=100). So everything in that big region will be repainted, not only the level meters. Or am I completely mistaken? I don’t think there’ll be 30 WM_PAINT calls after 30 InvalidateRect() calls?!


#4

No… There will be one WM_PAINT, but it’ll provide a complex region containing your 30 rectangles. Components that don’t intersect any of those rectangles won’t get painted.

(Caveat: I can’t guarantee that Windows hasn’t got some kind of internal logic that decides that 30 regions is too complex, and just simplifies the region down to its bounding rectangle…)

You know about JUCE_ENABLE_REPAINT_DEBUGGING, right?
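
To make the difference between a complex region and its bounding rectangle concrete, here is a small stand-alone sketch (plain C++, no Windows or JUCE calls; `Rect`, `makeMeterRects`, `intersectsRegion` and `boundingBox` are made-up names). It uses the meter layout from post #3 and assumes the region really is kept as a list of rectangles:

```cpp
#include <vector>

// Hypothetical integer rectangle; x2/y2 are the exclusive right/bottom edges.
struct Rect
{
    int x1, y1, x2, y2;

    bool intersects (const Rect& o) const
    {
        return x1 < o.x2 && o.x1 < x2 && y1 < o.y2 && o.y1 < y2;
    }
};

// Dirty rectangles for `count` meters laid out as in post #3:
// 10 px wide, 90 px high, first at x = 10, 30 px gap between neighbours.
std::vector<Rect> makeMeterRects (int count)
{
    std::vector<Rect> rects;

    for (int i = 0; i < count; ++i)
    {
        const int x = 10 + i * 40;   // 10 px width + 30 px gap = 40 px pitch
        rects.push_back ({ x, 10, x + 10, 100 });
    }

    return rects;
}

// True if the component overlaps at least one rectangle of the complex region.
bool intersectsRegion (const std::vector<Rect>& region, const Rect& component)
{
    for (const auto& r : region)
        if (r.intersects (component))
            return true;

    return false;
}

// Smallest rectangle enclosing every rectangle of a non-empty region.
Rect boundingBox (const std::vector<Rect>& region)
{
    Rect b = region.front();

    for (const auto& r : region)
    {
        if (r.x1 < b.x1) b.x1 = r.x1;
        if (r.y1 < b.y1) b.y1 = r.y1;
        if (r.x2 > b.x2) b.x2 = r.x2;
        if (r.y2 > b.y2) b.y2 = r.y2;
    }

    return b;
}
```

A component sitting in the gap between two meters, e.g. `Rect { 25, 10, 45, 100 }`, fails `intersectsRegion()` but does intersect `boundingBox()`. So it only gets repainted if the region is collapsed to its bounding rectangle.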


#5

Ok, my bad. I didn’t know about WM_PAINT passing complex regions. And I didn’t enable JUCE_ENABLE_REPAINT_DEBUGGING either. I checked using JUCE_ENABLE_REPAINT_DEBUGGING, and indeed only the LevelMeters get repainted.
The amount of CPU used for drawing 10 level meters is about 15%. That’s far too much. Other apps do this with about 2% on the same machine. I’ll have to investigate what’s wrong.


#6

I’ve modified my LevelMeters code to only use 2 calls: fillRect() and drawImageRect(). Still I get a completely insane CPU usage of 40% for 20 level meters. That’s just not normal.


#7

When doing any kind of continual repainting, I’ve found that by far the best approach is to reduce the drawing code in the paint method as much as possible. That means calculating the pixel height of your current level and only repainting if it has changed.

I also draw and cache my meter images in resized(): one for completely on and one for completely off. Then you only need to draw the off image and a proportion of the on image over the top. Simple, non-rescaling image blits are pretty fast, much faster than any geometric drawing calls. If you are doing any sort of gradient fills or alpha blending on a timer you will seriously drain the CPU.

The other thing I do is run almost all my GUI processing on a background thread. This may or may not suit you, but basically I only fill an AbstractFifo from my audio callback and then process it in a TimeSliceClient callback. For non-essential tasks I just drop a few samples if they don’t get processed in time, but you could use some sort of adaptive sleep time if you need all samples processed. The advantage is that your main GUI remains responsive, as all you are doing on the main thread is basically blitting images.

Of course you may be doing these things already but implementing these changes drastically improved the performance of my code.
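
For anyone who wants to see the shape of the audio-callback-to-background-thread hand-off without pulling in JUCE, here is a minimal sketch in plain C++. It is a stand-in for the AbstractFifo approach, not a copy of it: `LevelFifo` is a made-up class, it assumes exactly one producer thread and one consumer thread, and it simply drops values when full (the "drop a few samples" behaviour mentioned above).

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical single-producer/single-consumer queue of meter levels.
class LevelFifo
{
public:
    // One slot is kept empty to tell "full" from "empty".
    explicit LevelFifo (std::size_t capacity)
        : buffer (capacity + 1), readPos (0), writePos (0) {}

    // Audio thread: returns false (and drops the value) if the queue is full.
    bool push (float level)
    {
        const std::size_t w    = writePos.load (std::memory_order_relaxed);
        const std::size_t next = (w + 1) % buffer.size();

        if (next == readPos.load (std::memory_order_acquire))
            return false;

        buffer[w] = level;
        writePos.store (next, std::memory_order_release);
        return true;
    }

    // Background thread: returns false if nothing is waiting.
    bool pop (float& level)
    {
        const std::size_t r = readPos.load (std::memory_order_relaxed);

        if (r == writePos.load (std::memory_order_acquire))
            return false;

        level = buffer[r];
        readPos.store ((r + 1) % buffer.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<float>       buffer;
    std::atomic<std::size_t> readPos, writePos;
};
```

The audio callback would only ever call `push()`, and the background callback would drain the queue with `pop()` and work out the new meter heights there.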


#8

It’s hard to imagine why you’re having such high CPU usage. Have a look at the level meter in SimpleDJ and see how it communicates with the audio I/O callback and make sure you’re doing something similar:

https://github.com/vinniefalco/AppletJUCE

(Be sure to look at the Simple DJ app).


#9

Are your meters opaque, and did you call setOpaque(true)? If not, the background will also be redrawn.
Did you limit your refresh rate (20/s)?


#10

The refresh rate is pretty high (about 60Hz), and I’m of course using setOpaque().


#11

That’s pretty high. 30 fps should be enough for smooth animations, and you save 50% CPU.
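
If the timer has to keep running at the higher rate for other reasons, one way to cap the repaint rate is a small throttle like this sketch (plain C++; `RepaintThrottle` is a made-up helper, and the timestamps are assumed to come from whatever millisecond clock you already have):

```cpp
#include <cstdint>

// Hypothetical helper that limits repaints to a maximum frame rate.
// framesPerSecond must be greater than zero.
class RepaintThrottle
{
public:
    explicit RepaintThrottle (int framesPerSecond)
        : intervalMs (1000 / framesPerSecond) {}

    // Returns true if enough time has passed since the last accepted repaint.
    bool shouldRepaint (std::int64_t nowMs)
    {
        if (hasPainted && nowMs - lastPaintMs < intervalMs)
            return false;

        hasPainted  = true;
        lastPaintMs = nowMs;
        return true;
    }

    int getIntervalMs() const { return intervalMs; }

private:
    int          intervalMs;
    bool         hasPainted  = false;
    std::int64_t lastPaintMs = 0;
};
```

With a 30 fps limit the interval comes out at 33 ms, so only about every second tick of a 60 Hz timer gets through. If nothing else depends on the fast timer, simply passing a longer interval to startTimer() is the easier route.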


#12

I’ve made a test app. It shows 30 vertical and 30 horizontal level meters. When I only turn on the vertical or the horizontal ones, the CPU usage is low. When I turn on both, the CPU usage suddenly goes up a lot. In a Debug build this is 8% (for 30 level meters) vs. 45% (for 60 level meters). I think something is fishy with the RectangleList code, which seems to allocate/reallocate all the time, but I’m not quite sure that’s it, although it did show up as a hotspot in VerySleepy.

[code]// MainComponent.h

#include "JuceHeader.h"

class LevelMeter;

class MainComponent : public Component, public Timer, public Button::Listener
{
public:
    MainComponent();
    ~MainComponent();

private:
    Array<LevelMeter*> meters;
    void timerCallback();
    void buttonClicked (Button* btn);
    Button *btnVertical, *btnHorizontal;
};[/code]

[code]// MainComponent.cpp

#include "MainComponent.h"

static float mainPhase=0.0f;

// *********************************************************************************
class LevelMeter : public Component
{
public:
    LevelMeter()
    {
        // give every meter a different starting phase
        phase = mainPhase;
        mainPhase += 0.04f;
        setOpaque (true);
    }

    void paint (Graphics& g)
    {
        int y = int (phase * float (getHeight()));

        g.setColour (Colours::white);
        g.fillRect (0, 0, getWidth(), y);

        g.setColour (Colours::red);
        g.fillRect (0, y, getWidth(), getHeight() - y);
    }

    void tick()
    {
        // advance the phase, wrap it back into [0, 1) and request a repaint
        phase += 0.01f;
        phase -= int (phase);
        repaint();
    }

private:
    float phase;
};
// *********************************************************************************
MainComponent::MainComponent()
{
    // add 30 horizontal level meters
    for (int i = 0; i < 30; i++)
    {
        LevelMeter* meter = new LevelMeter();
        addAndMakeVisible (meter);
        meter->setBounds (50 + i * 40, 350, 10, 300);
        meters.add (meter);
    }

    // add 30 vertical level meters
    for (int i = 0; i < 30; i++)
    {
        LevelMeter* meter = new LevelMeter();
        addAndMakeVisible (meter);
        meter->setBounds (10, 10 + i * 14, 30, 10);
        meters.add (meter);
    }

    // add button to turn on/off vertical level meters
    addAndMakeVisible (btnVertical = new ToggleButton ("vertical level meters"));
    btnVertical->addListener (this);
    btnVertical->setBounds (200, 10, 300, 30);
    btnVertical->setToggleState (true, true);

    // add button to turn on/off horizontal level meters
    addAndMakeVisible (btnHorizontal = new ToggleButton ("horizontal level meters"));
    btnHorizontal->addListener (this);
    btnHorizontal->setBounds (200, 40, 300, 30);
    btnHorizontal->setToggleState (true, true);

    // start common timer
    startTimer (50);
}
// ---------------------------------------------------------------------------------
MainComponent::~MainComponent()
{
deleteAllChildren();
}
// ---------------------------------------------------------------------------------
void MainComponent::timerCallback()
{
    for (int i = 0; i < meters.size(); i++)
        meters[i]->tick();
}
// ---------------------------------------------------------------------------------
void MainComponent::buttonClicked (Button* btn)
{
    bool b = btn->getToggleState();

    if (btn == btnVertical)
    {
        // change visibility of the vertical level meters
        for (int i = 30; i < 60; i++)
            meters[i]->setVisible (b);
    }
    else
    {
        // change visibility of the horizontal level meters
        for (int i = 0; i < 30; i++)
            meters[i]->setVisible (b);
    }
}
// *********************************************************************************[/code]


#13

I think that Windows will consolidate many rectangles into one under some circumstances.


#14

There may well be hotspots in the low-level graphics code, but as I said before there are plenty of simple optimisations you can perform first. The fewer graphics calls and repaints you can get away with, the less you will suffer from these slowdowns.

By just adding a meter height state and only repainting if necessary, I managed to cut the CPU usage displayed by Activity Monitor from 60% to 20% on my MacBook 3,1. This was about 5 lines of code:

[code]class LevelMeter : public Component
{
public:
    LevelMeter()
        : lastHeight (0), currentHeight (0)
    {
        phase = mainPhase;
        mainPhase += 0.04f;
        setOpaque (true);
    }

    void paint (Graphics& g)
    {
        // int y = int (phase * float (getHeight()));   (now computed in tick())
        g.setColour (Colours::white);
        g.fillRect (0, 0, getWidth(), currentHeight);

        g.setColour (Colours::red);
        g.fillRect (0, currentHeight, getWidth(), getHeight() - currentHeight);
    }

    void tick()
    {
        phase += 0.01f;
        phase -= int (phase);

        currentHeight = int (phase * float (getHeight()));

        // only repaint when the level has moved by at least one pixel
        if (currentHeight != lastHeight)
        {
            lastHeight = currentHeight;
            repaint();
        }
    }

private:
    float phase;
    int lastHeight, currentHeight;
};
[/code]

There have been many discussions in the past about how unreliable these sorts of CPU measurements are, however: if your system is relatively idle, your app may well take a lot of CPU, but other tasks will still take priority over the message thread.
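
To get a feel for how many repaints the height check can skip, here is a stand-alone sketch that replays the tick() logic above for a single meter and counts the repaint() calls (plain C++; `countRepaints` is a made-up helper, and the starting phase of 0 is an assumption):

```cpp
// Replays the tick() logic for one meter and counts how many ticks
// would actually end up calling repaint().
int countRepaints (int meterHeight, int numTicks)
{
    float phase      = 0.0f;   // assumed starting phase
    int   lastHeight = 0;
    int   repaints   = 0;

    for (int i = 0; i < numTicks; ++i)
    {
        phase += 0.01f;
        phase -= int (phase);   // wrap back into [0, 1)

        const int currentHeight = int (phase * float (meterHeight));

        if (currentHeight != lastHeight)
        {
            lastHeight = currentHeight;
            ++repaints;
        }
    }

    return repaints;
}
```

With the sizes from the test app, a 300-pixel-high meter moves about 3 pixels per tick, so it still repaints on every tick. A 10-pixel-high one only changes roughly every tenth tick. That suggests most of the saving comes from the small meters, though I haven’t measured that separately.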


#15

Here is an example of the other sort of optimisation I was talking about: caching to images. In this simple example it doesn’t make much of an improvement (for me at least), but for complex graphics drawing it can make a huge difference. I seem to remember Jules saying something like Images can be cached to graphics card memory, but paths etc. need to be uploaded after each graphics call and so incur an extra penalty. I also seem to remember someone (was it you?) saying they had a faster drawImageAt implementation, which could reap further rewards by using images.

Anyway, here’s the simple code, it would be interesting to see how it translates to Windows anyway.

[code]class LevelMeter : public Component
{
public:
    LevelMeter()
        : lastHeight (0), currentHeight (0)
    {
        phase = mainPhase;
        mainPhase += 0.04f;
        setOpaque (true);
    }

    void resized()
    {
        // re-create the cached images whenever the size changes
        onImage = Image (Image::RGB, getWidth(), getHeight(), false);
        offImage = Image (Image::RGB, getWidth(), getHeight(), false);

        {
            Graphics g (offImage);
            g.fillAll (Colours::white);
        }

        {
            Graphics g (onImage);
            g.fillAll (Colours::red);
        }
    }

    void paint (Graphics& g)
    {
        // previous version, kept for comparison:
        // g.setColour (Colours::white);
        // g.fillRect (0, 0, getWidth(), currentHeight);
        //
        // g.setColour (Colours::red);
        // g.fillRect (0, currentHeight, getWidth(), getHeight() - currentHeight);

        const int meterHeight = getHeight() - currentHeight;

        // draw the whole off image, then the lower part of the on image over it
        g.drawImageAt (offImage, 0, 0);
        g.drawImage (onImage,
                     0, currentHeight, getWidth(), meterHeight,
                     0, currentHeight, getWidth(), meterHeight);
    }

    void tick()
    {
        phase += 0.01f;
        phase -= int (phase);

        currentHeight = int (phase * float (getHeight()));

        if (currentHeight != lastHeight)
        {
            lastHeight = currentHeight;
            repaint();
        }
    }

private:
    float phase;
    int lastHeight, currentHeight;
    Image onImage, offImage;
};[/code]

Let us know if you find any further improvements or potential hotspots in the graphics code. I’m sure we would all benefit from low-level optimisations if they can be made.

EDIT: I found the post by Jules I was referring to above, it seems he was talking about the OpenGLImageType but the point is with hardware accelerated renderers this is possible (not sure if CoreGraphics on the Mac does this at all).


#16

No it doesn’t, I checked via repaint debugging. Same happens on OSX.
Interestingly, if I call getPeer()->performAnyPendingRepaintsNow() after calling repaint() on the first 30 level meters, then do the same after calling repaint() on the next 30 level meters, then the CPU goes down by nearly 50%!


#17

[quote]Here is an example of the other sort of optimisation I was talking about, caching to images[/quote].
Seems to only make everything slower. I was also using drawImageAt() instead of fillRect() in my other code, but it’s definitely slower.


#18

Here are some results, made with Shark on OSX:

With a call to ComponentPeer::performAnyPendingRepaintsNow() after every 30 level meters, i.e. after calling repaint() on each of these 30 level meters in the timerCallback():
[attachment=1]with performanypendingrepaints.png[/attachment]

And without (normal repaint() to all 60 level meters in the timer callback):
[attachment=0]without performanypendingrepaints.png[/attachment]

We can see that the distribution of CPU time is completely different in both cases.


#19

SolidColourEdgeTableRenderer shouldn’t even be part of the profile, nor should aa_render_shape. The fastest method of drawing level meters is to fill solid rectangles with no blending or anti-aliasing. JUCE correctly detects the optimized case of a solid colour rectangle aligned on integer coordinate boundaries and switches to a straight fill instead. This happens here:

juce_RenderingHelpers.h

    void fillRect (const Rectangle<int>& r, const bool replaceContents)
    {
        if (clip != nullptr)
        {
            if (transform.isOnlyTranslated)
            {
                if (fillType.isColour())
                {
                    Image::BitmapData destData (image, Image::BitmapData::readWrite);
                    clip->fillRectWithColour (destData, transform.translated (r), fillType.colour.getPixelARGB(), replaceContents);
                }
                else
                {
                    const Rectangle<int> totalClip (clip->getClipBounds());
                    const Rectangle<int> clipped (totalClip.getIntersection (transform.translated (r)));

                    if (! clipped.isEmpty())
                        fillShape (new ClipRegions::RectangleListRegion (clipped), false);
                }
            }
            else
            {
                Path p;
                p.addRectangle (r);
                fillPath (p, AffineTransform::identity);
            }
        }
    }

fillRectWithColour instead of fillShape is going to be orders of magnitude faster. Draw the level meter using only calls to Graphics::fillRect(), with fully opaque colours and no transform.
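
To show what a "straight fill" boils down to, here is a stand-alone sketch in plain C++. It is not the JUCE implementation: `PixelBuffer` and `fillSolidRect` are made-up names, and the buffer is assumed to hold one 32-bit ARGB value per pixel.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical framebuffer: one 32-bit ARGB value per pixel, rows stored contiguously.
struct PixelBuffer
{
    int width, height;
    std::vector<std::uint32_t> pixels;

    PixelBuffer (int w, int h)
        : width (w), height (h), pixels (std::size_t (w) * std::size_t (h), 0) {}
};

// Fills the part of (x, y, w, h) that lies inside the buffer with one opaque
// colour and returns the number of pixels written.
// No blending, no anti-aliasing, no transform: just clipping and row fills.
int fillSolidRect (PixelBuffer& dest, int x, int y, int w, int h, std::uint32_t argb)
{
    const int x1 = std::max (x, 0);
    const int y1 = std::max (y, 0);
    const int x2 = std::min (x + w, dest.width);
    const int y2 = std::min (y + h, dest.height);

    if (x1 >= x2 || y1 >= y2)
        return 0;   // nothing left after clipping

    for (int row = y1; row < y2; ++row)
    {
        auto rowStart = dest.pixels.begin() + (std::ptrdiff_t (row) * dest.width + x1);
        std::fill (rowStart, rowStart + (x2 - x1), argb);
    }

    return (x2 - x1) * (y2 - y1);
}
```

Each row is a single contiguous fill, which is why this kind of path should be much cheaper than going through an edge table.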


#20

Thanks Vinn, I did exactly that: I only used fillRect() for the profiling, and I also tested with setPaintingIsUnclipped(true) on all LevelMeters, which made no difference. If you don’t trust the results above, please feel free to use your own profiler and check.