Optimizing animated components

I’m optimizing my plugin, which has many animated components. The single most expensive function in my profile, at about 5% of the total time, is CGContextGetClipBoundingBox. It is always called by either CoreGraphicsContext::isClipEmpty() or CoreGraphicsContext::getClipBounds(). I wouldn’t have guessed that this API call would be so slow.

CoreGraphicsContext.mm is lean and to the point—I hate to think about mucking it up. But in my case, a little bit of extra crud here to prevent CGContextGetClipBoundingBox being called can be a big win.

One idea would be to calculate the clip bounding box locally when clipToRectangle, clipToRectangleList, and excludeClipRectangle are called. When the context is created, the clip bounds are in a known, simple state covering the entire context. When one of these three calls is made, it’s easy to calculate the resulting bounds. So we can return our local version of the clip bounds and avoid the expensive call.
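A minimal sketch of that bookkeeping, assuming a hypothetical Rect type and a hypothetical LocalClipBounds class (neither is a JUCE or CoreGraphics type). Note that it tracks a single bounding rectangle only, so once the real clip region stops being one rectangle the result is a conservative (possibly too large) bound rather than an exact one.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical integer rectangle, for illustration only.
struct Rect
{
    int x, y, w, h;
    bool isEmpty() const { return w <= 0 || h <= 0; }
};

// Overlap of two rectangles, or an empty rectangle if they do not overlap.
inline Rect intersect (Rect a, Rect b)
{
    const int x1 = std::max (a.x, b.x), y1 = std::max (a.y, b.y);
    const int x2 = std::min (a.x + a.w, b.x + b.w);
    const int y2 = std::min (a.y + a.h, b.y + b.h);
    if (x2 <= x1 || y2 <= y1) return Rect { 0, 0, 0, 0 };
    return Rect { x1, y1, x2 - x1, y2 - y1 };
}

// Smallest rectangle containing both inputs (empty inputs are ignored).
inline Rect unite (Rect a, Rect b)
{
    if (a.isEmpty()) return b;
    if (b.isEmpty()) return a;
    const int x1 = std::min (a.x, b.x), y1 = std::min (a.y, b.y);
    const int x2 = std::max (a.x + a.w, b.x + b.w);
    const int y2 = std::max (a.y + a.h, b.y + b.h);
    return Rect { x1, y1, x2 - x1, y2 - y1 };
}

// Hypothetical tracker that mirrors the three "simple" clip calls.
struct LocalClipBounds
{
    Rect bounds;

    LocalClipBounds (int width, int height) : bounds { 0, 0, width, height }
    {
        assert (width >= 0 && height >= 0);
    }

    void clipToRectangle (Rect r) { bounds = intersect (bounds, r); }

    void clipToRectangleList (const std::vector<Rect>& list)
    {
        Rect result { 0, 0, 0, 0 };
        for (const Rect& r : list)
            result = unite (result, intersect (bounds, r));
        bounds = result;
    }

    // Shrinks the bounds only when the excluded rectangle removes a whole
    // edge strip; otherwise the bounds are left unchanged (conservative).
    void excludeClipRectangle (Rect e)
    {
        const Rect hit = intersect (bounds, e);
        if (hit.isEmpty()) return;
        const bool fullWidth  = hit.x == bounds.x && hit.w == bounds.w;
        const bool fullHeight = hit.y == bounds.y && hit.h == bounds.h;
        if (fullWidth && fullHeight) { bounds = Rect { 0, 0, 0, 0 }; return; }
        if (fullWidth)
        {
            if (hit.y == bounds.y) { bounds.y += hit.h; bounds.h -= hit.h; }
            else if (hit.y + hit.h == bounds.y + bounds.h) bounds.h -= hit.h;
        }
        else if (fullHeight)
        {
            if (hit.x == bounds.x) { bounds.x += hit.w; bounds.w -= hit.w; }
            else if (hit.x + hit.w == bounds.x + bounds.w) bounds.w -= hit.w;
        }
    }
};
```

An overestimate here only means some drawing is not skipped that could have been, so it errs on the safe side. Tracking the exact region would need a full rectangle list rather than one rectangle.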

When clipToPath or clipToImageAlpha is called, we would set a flag called something like clipRegionIsComplex, indicating that CGContextGetClipBoundingBox needs to be called. Calculating the boundary of the added clip area here would be possible but starts to look like more added code for less win.

When saveState() is called the clipRegionIsComplex flag would get saved with that state. This makes it possible to get back to a simple state without adding any new methods.
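A rough sketch of how the flag could travel with the saved state. ClipState, ClipStateStack and queryContext() are hypothetical names; queryContext() merely stands in for the expensive CGContextGetClipBoundingBox call and counts how often it is hit.

```cpp
#include <cassert>
#include <vector>

struct Rect { int x, y, w, h; };

// Per-state data that would be pushed and popped with the graphics state.
struct ClipState
{
    Rect simpleBounds;         // only meaningful while clipRegionIsComplex is false
    bool clipRegionIsComplex;  // set once a path or image-alpha clip is applied
};

class ClipStateStack
{
public:
    ClipStateStack (int width, int height)
        : current { Rect { 0, 0, width, height }, false } {}

    void saveState() { saved.push_back (current); }

    void restoreState()
    {
        assert (! saved.empty());
        current = saved.back();   // the flag is restored along with the bounds
        saved.pop_back();
    }

    void clipToPath()       { current.clipRegionIsComplex = true; }
    void clipToImageAlpha() { current.clipRegionIsComplex = true; }

    Rect getClipBounds()
    {
        return current.clipRegionIsComplex ? queryContext()
                                           : current.simpleBounds;
    }

    int expensiveQueries = 0;     // how often the slow path was taken

private:
    // Stand-in for the real, expensive query against the graphics context.
    Rect queryContext() { ++expensiveQueries; return current.simpleBounds; }

    ClipState current;
    std::vector<ClipState> saved;
};
```

Because the flag lives inside the saved state, a matching restoreState() after a clipToPath() drops back to the cheap path without any extra API.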

I’m going to work on other stuff for a bit and let Jules and the community ponder this idea. If the above still seems like a reasonable patch concept after that, I’ll go ahead and try it.

Not a bad idea! In profiling, I’ve never noticed it spending much time in that function, but I guess that if you have a lot of components, it could start to become a hotspot. I might have a quick go at optimising that myself…

OK, I’m glad you’re interested! There is always an endless number of things one could optimize; I just happen to need this particular one. I’ll work on some other stuff this coming week.

The real hurt comes if many components are opaque, because then you have O(n^2) calls to Graphics::excludeClipRegion() from Component::renderComponent(). I would have liked to use opaque components for blitting speed (RGB vs. ARGB), but had to go with not-opaque ones because of this.
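To illustrate where a quadratic count could come from, here is a toy model. It assumes, purely for illustration, that each sibling excludes every opaque sibling drawn in front of it before painting; this is not taken from the actual Component::renderComponent() code, and Comp and countExcludeCalls are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a component; only opacity matters here.
struct Comp { bool opaque; };

// Siblings are given in back-to-front order. Under the assumption above,
// painting sibling i costs one exclude call per opaque sibling in front of it.
inline int countExcludeCalls (const std::vector<Comp>& siblings)
{
    int calls = 0;
    for (std::size_t i = 0; i < siblings.size(); ++i)
        for (std::size_t j = i + 1; j < siblings.size(); ++j)
            if (siblings[j].opaque)
                ++calls;
    return calls;
}
```

Under this model, n opaque siblings cost n(n-1)/2 exclude calls for one full repaint, so 20 components give 190 calls, while non-opaque siblings give none.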

FYI, I have about 20 components that I’m animating at 30Hz. In area they all add up to approximately one 320x240 rectangle, so this should be entirely doable. And it is, really, just not within the CPU overhead I want. If I had a hundred animated objects, I would feel fine about having to write my own renderer for them.

I think I’m doing all the wrapper stuff appropriately. I have a single timer thread for the refresh that gives each animated component the opportunity to update itself; if a component needs an update it calls repaint(int, int, int, int) with the area that needs updating.
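The refresh scheme described above might look roughly like this. AnimatedComponent, timerTick() and refreshAll() are hypothetical, the repaint() here only records the dirty rectangle instead of calling the real Component::repaint (int, int, int, int), and the 16x16 dirty area is made up.

```cpp
#include <cassert>
#include <vector>

struct Rect { int x, y, w, h; };

// Hypothetical animated component that changes once every 'period' ticks.
class AnimatedComponent
{
public:
    explicit AnimatedComponent (int ticksPerChange) : period (ticksPerChange)
    {
        assert (period > 0);
    }

    // Called on every timer tick; asks for a repaint only if something changed.
    void timerTick()
    {
        if (++ticks % period == 0)
            repaint (0, 0, 16, 16);   // made-up dirty area
    }

    // Stand-in for the real repaint call: just records the dirty rectangle.
    void repaint (int x, int y, int w, int h)
    {
        dirty.push_back (Rect { x, y, w, h });
    }

    std::vector<Rect> dirty;

private:
    int period;
    int ticks = 0;
};

// One timer callback gives every component the chance to update itself.
inline void refreshAll (std::vector<AnimatedComponent>& components)
{
    for (auto& c : components)
        c.timerTick();
}
```

Components that have nothing new to show never call repaint(), so they add no clip or paint work for that frame.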

I’ve just checked in a version that caches that clip rect, if you want to try it.

Always interested in whatever optimisation suggestions you’ve got! I do optimisation when I notice something that’ll help with whatever I’m writing, but obviously everyone will be using it in different ways, so may hit different hot-spots to me.

Great! I can’t wait to try it. Unfortunately the version I am using for my plugin is months behind the tip and so I have some slogging through Image* and Point changes to do. Now I have a really good incentive to catch up though. Will keep you posted.

On reading your code I like your caching the result of CGContextGetClipBoundingBox better than what I was proposing.
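For comparison, the caching approach can be sketched as below. This is not the checked-in code, only an illustration of the general idea: CachedClipBounds, clipChanged() and queryBackend() are hypothetical, and queryBackend() stands in for CGContextGetClipBoundingBox.

```cpp
#include <cassert>

struct Rect { int x, y, w, h; };

class CachedClipBounds
{
public:
    explicit CachedClipBounds (Rect initialClip) : backendClip (initialClip) {}

    // Stand-in for any clip operation or state restore: the backend's clip
    // changes, so the cached rectangle can no longer be trusted.
    void clipChanged (Rect newClip)
    {
        backendClip = newClip;
        cacheIsValid = false;
    }

    // Queries the backend at most once between clip changes.
    Rect getClipBounds()
    {
        if (! cacheIsValid)
        {
            cached = queryBackend();
            cacheIsValid = true;
        }
        return cached;
    }

    bool isClipEmpty()
    {
        const Rect r = getClipBounds();
        return r.w <= 0 || r.h <= 0;
    }

    int backendQueries = 0;   // how often the expensive call was made

private:
    // Stand-in for the expensive call into the graphics backend.
    Rect queryBackend() { ++backendQueries; return backendClip; }

    Rect backendClip;
    Rect cached { 0, 0, 0, 0 };
    bool cacheIsValid = false;
};
```

Compared with computing the bounds locally, this needs no geometry code at all: it works for path and image-alpha clips too, at the cost of one real query after each clip change.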

Well done! CGPathGetBoundingBox is now down from 4.6% to 0.6%. And everything still works, which is always nice.

Excellent, thanks!