Fluent animation frame timing

OK, I added some logging within my step function and I can definitely tell the issue is with the message timer interval. It’s just too inaccurate.

If I log the differences between the times, I get a pattern similar to this (in milliseconds):
25
25
0
25
25
0
25
25
0

// That 25 is not really 25 but rather something between 24 and 26 ms, and the zero is not really zero but something between 0 and 1 ms.

Which is of course what makes the stuttering so visible on the large screen with low resolution.

If I replace it with a high precision timer I get nice frame intervals between 15 and 17 ms, which I believe would be good enough.

BUT, of course there’s a catch. I cannot do any rendering there because the high precision timer runs in its own thread. To render there I would have to acquire a MessageManagerLock, but that makes no sense, as I would have to wait for the inaccurate timer again…

Is there some simple trick or do I really need to use another thread and then make some smart synchronization?

I render everything into a separate image anyway and send it over ArtNet, so this part on its own has no problem, I guess. But I would still like to render it to the screen, for which I need some synchronization mechanism…
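
For reference, here is a minimal sketch of the setup being described, assuming juce::HighResolutionTimer (whose callback runs on its own dedicated thread); renderFrame() and sendOverArtNet() are made-up placeholders:

    struct FrameDriver : juce::HighResolutionTimer
    {
        void hiResTimerCallback() override
        {
            renderFrame();     // draw the next animation frame offscreen
            sendOverArtNet();  // push the pixels to the LED wall
            // No Component painting here - that would need the message thread.
        }

        void renderFrame()
        {
            juce::Graphics g (offscreen);
            g.fillAll (juce::Colours::black);
            // ... draw the current frame into 'offscreen' ...
        }

        void sendOverArtNet() {}  // placeholder for the actual Art-Net output

        juce::Image offscreen { juce::Image::RGB, 120, 60, true };
    };

    // usage:  FrameDriver driver;  driver.startTimer (16);  // ~60 Hz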

Well no, it’s incorrect to say that the Timer class is inaccurate.

If nothing at all is happening on the message thread, i.e. nothing is repainting, no other timers or callbacks are doing any work, then a Timer will be pretty accurate, probably down to about 1ms.

But anything that runs on the message thread, like Timer does, is at the mercy of being delayed by other events happening on there. Probably repainting will be the thing that causes the biggest delays. Obviously if you have a paint callback blocking for e.g. 25ms then the best a Timer can do is just to be called in-between when it gets a chance to run.
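
To make that concrete, here is a small hedged demo (JitterDemo is a made-up name; the 25 ms sleep stands in for an expensive paint). The logged deltas cluster around the paint duration rather than the 1 ms timer period:

    struct JitterDemo : public juce::Component, private juce::Timer
    {
        JitterDemo() { startTimer (1); }

        void timerCallback() override
        {
            auto now = juce::Time::getMillisecondCounterHiRes();
            DBG ("delta: " << (now - last));  // logs ~25 ms gaps, not ~1 ms
            last = now;
            repaint();
        }

        void paint (juce::Graphics&) override
        {
            juce::Thread::sleep (25);  // stand-in for an expensive repaint
        }

        double last = 0;
    };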

Likewise there’s no point in having another thread trigger something else to happen on the message thread because that too won’t be able to interrupt a long repaint event, and would also just have to wait until it finishes.

So all you can really do is to optimise or restructure your painting to be quicker. That’s going to be true of any framework on any OS, as message threads are the same everywhere!
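
For example, one such restructuring, as a hedged sketch: skip work outside the dirty region via the clip bounds (particles, p.bounds and drawParticle are made-up names):

    void paint (juce::Graphics& g) override
    {
        auto clip = g.getClipBounds();     // only the region that needs redrawing

        for (auto& p : particles)          // skip everything outside the dirty area
            if (clip.intersects (p.bounds))
                drawParticle (g, p);
    }

Combined with calling repaint (changedArea) on just the rectangle that moved, rather than repaint() on the whole component, this keeps each paint callback short.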

(An idea I’ve had for a long time is to try a scheme where the Graphics class doesn’t do any drawing, but just records all the instructions into a list which is then actually rendered on another thread, freeing up the message thread to do other work, but don’t know when I’ll get the chance to try that one out!)

> (An idea I’ve had for a long time is to try a scheme where the Graphics class doesn’t do any drawing, but just records all the instructions into a list which is then actually rendered on another thread, freeing up the message thread to do other work, but don’t know when I’ll get the chance to try that one out!)

It doesn’t have to be so complicated; basically you can do this by drawing onto an image from a background thread. The whole list logic, I would say, isn’t necessary (of course it would be easier for existing code which relies on the message thread).

But it also has one real disadvantage: the paint routine will still block the message thread!
(Especially if you have a lot of small things to draw.)

An option would be some kind of setBackgroundAllowed(true) flag, to tell JUCE that it’s okay to call this function from a background thread; then JUCE could perform any kind of multi-threaded optimization to allow this component to be repainted more quickly.

No, that’s really not the same thing - the ideal situation is for the paint routine to happen on the message thread (because you always need it to interact safely with components and other data that would be a total pain to make thread-safe), but for it to run optimally quickly, deferring the actual rendering work to another thread, and also skipping an intermediate image if possible, as images can also slow things down.
This would work really well with openGL in particular, where the rendering is already on a separate thread.

If you paint a lot of small particles, the paint routine itself can be very slow. The current OpenGL render implementation is also very inefficient when drawing lots of vertical lines (and not stable enough to run as a plugin on hundreds of customer PCs; driver issues etc…).

The image could also exist as a buffer in graphics memory, and could be painted by newer graphics APIs (Metal…).

I’m not saying that making some kind of graphics protocol wouldn’t improve things, but it also doesn’t solve a lot of the problems that are the real cause when you have serious performance problems.

The real problem is the interaction with the message thread, which still happens if you create the protocol on it.

So when there is a lot going on on the message thread, the graphics will still stutter, because they still rely on being created on the message thread.

I think the future is more some kind of pipe structure:

data-model --one-way--> worker-thread creates graphics --one-way--> display

Thanks Jules for your comments,

I am not sure, however, that what you describe is what happens here. This would imply a very high, or at least significant, CPU load from the application, wouldn’t it? But I have something like 3%…

Or, there are some priorities involved. When does the paint method get called? Is it another message-based timer on the main thread? When those two get really close, I can imagine the rendering delaying my step timer (the case when the paint timer comes juuuuuuust before my timer). Does it work like this? Well, the pattern 25 25 1 25 25 1 would actually support that kind of theory… Two times mine comes first and then one time second… What’s the frequency of the repaint timer (if it really works like this…)?

As for the separate thread… I believe it may actually help. I have already tried that. I added the high precision timer, and in that thread I update the Box2D world, render to an image and send the content over Art-Net. At one place, I create a copy of the rendered image (while holding a critical section) and assign it to the main thread. The main thread only renders its assigned image.

I get a consistent 16 ms between frames. Though at this point it stutters every once in a while (not all the time, as is the case with the message timer), which may happen when the synchronization gets stuck waiting for the critical section - I will still have to dig into that.

But you are right that if I wanted to get a fluent animation on the screen this wouldn’t help as I would still need to wait for the rendered image in the message thread. In my case, it is important to get fluent Art-Net output. That’s why I decided to try out the thread based timer…
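
In sketch form, that locked handoff might look like this (a hypothetical SharedFrame helper, not the actual code, which appears further down the thread):

    class SharedFrame
    {
    public:
        // Timer thread: publish a deep copy so the renderer can keep drawing.
        void publish (const juce::Image& rendered)
        {
            const juce::ScopedLock sl (lock);
            frame = rendered.createCopy();
        }

        // Message thread (inside paint()): Image handles are reference-counted,
        // so copying one out is cheap.
        juce::Image fetch() const
        {
            const juce::ScopedLock sl (lock);
            return frame;
        }

    private:
        juce::CriticalSection lock;
        juce::Image frame;
    };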

Well that’s not what I was talking about, I’m talking about “normal” 2D vector/UI/etc graphics. To do particle stuff you need to do your own GL shaders and data pipeline.

No, the only thing it implies is that some event (or cluster of events) - maybe paint, maybe not - is hogging the message thread for about 25 ms at some point, so that your timer is being held off.

The CPU level is almost useless here, because it’s an average, and you care about the granularity here, not the average load.

TBH you may get very slightly better performance than a Timer if you trigger your own callback event from a thread or high-precision timer, but only if there are many other timers running. That’s only because the Timer class uses a single callback event for all its timers, so if the event queue is congested, adding your own event will give yours a slightly higher probability of getting a chance to run. But if the queue isn’t busy then a Timer with a 1 ms period is pretty much equivalent to just posting a message to the event queue to be run as soon as possible.
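
In other words, roughly this, as a hedged sketch (notifyNewFrame is a made-up name; SafePointer guards against the component being deleted before the message runs):

    void notifyNewFrame (juce::Component::SafePointer<juce::Component> comp)
    {
        // Called from the background thread: posts a callback to the event
        // queue, to run on the message thread as soon as it gets a chance.
        juce::MessageManager::callAsync ([comp]
        {
            if (comp != nullptr)   // now on the message thread
                comp->repaint();
        });
    }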

Yep, that’s what I was talking about in the paragraph right after that…

Could you please answer the part about when the paint method gets called? Is it also timer-based, or does the repaint() method register some callback? Looking at the code, it seems to me that it invalidates the region, which I believe should trigger the WM_PAINT message. Right? And if this message happens to come before the WM_TIMER message and takes long enough to finish, the result would be exactly what I am seeing, right? It’s a lot of guessing on my side because I do not really know WinAPI, nor how JUCE handles these kinds of things internally…

Hi Aros, sorry for jumping into your discussion at this point. I just joined the community and am working on something someway related to your goal…

I think that the timer approach is just not the right one. Correct me if I’m wrong: you have a sort of engine which calculates new frames and needs to be called pretty regularly, and then you need to show these frames to the user on the computer display and at the same time on a sort of LED wall… well, here is what I would do…

Let your “engine” work with its own time base, say 80 frames per second. And each time you get a new frame, you mark your on-screen component to be refreshed.

This will “disconnect” the engine frame beat from that of the on-screen component, which will be painted in the traditional way by accumulating refresh requests (well, technically speaking, overlapping requests get aggregated into one single request covering a potentially bigger area).

When the on-screen component’s paint() function gets called, it would just take a snapshot of the current offscreen image (the one managed by the engine at 80 FPS) and copy it over its own Graphics.

This is a simple description, but of course you’d better add double buffering techniques to reduce the time of the copy (when Component::paint() gets called) to a swap of a pair of pointers.

This will give the engine the “illusion” of being called very precisely, because you will implement its own offline loop in a separate thread which won’t follow the system message priority rules… and it will keep the paint() function on the component side very light. And you won’t have to work out the position of your bouncing ball at this “rendering” time, because that was already taken care of by the engine in its own thread loop.
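
Going back to the double-buffering point above: a hedged sketch of that pointer swap (names invented; a real implementation might want a third buffer to fully rule out tearing):

    class DoubleBuffer
    {
    public:
        DoubleBuffer (int w, int h)
            : front (juce::Image::RGB, w, h, true),
              back  (juce::Image::RGB, w, h, true) {}

        // Engine thread: render into the back buffer, then swap handles.
        template <typename RenderFn>
        void renderNext (RenderFn&& render)
        {
            {
                juce::Graphics g (back);
                render (g);
            }

            const juce::SpinLock::ScopedLockType sl (swapLock);
            std::swap (front, back);   // the "copy" is just a pointer swap
        }

        // Message thread (inside paint()): grab the current front buffer.
        juce::Image getFront() const
        {
            const juce::SpinLock::ScopedLockType sl (swapLock);
            return front;              // cheap: Image handles are ref-counted
        }

    private:
        juce::SpinLock swapLock;
        juce::Image front, back;
    };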

Another improvement you could add is not to have a 1:1 correspondence between the offscreen image and the one of the Graphics object you will copy the image to.

If you manage a 2x-width by 2x-height offscreen image and scale it down to the final image when needed, then you’ll have twice the pixels in which to draw your bouncing ball (like a sort of “in-house” retina display). The final result is that your intermediate virtual display will support half-pixels and your ball will move much more smoothly.

And of course you could experiment with 4x factors as well. It will just raise the CPU load a bit, but if I’m not wrong, scaling should happen with hardware acceleration (not sure about this on Windows… I’m on a Mac).
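
In JUCE terms, a hedged sketch of that idea (assuming a 120x60 component backed by a 240x120 offscreen image; SmoothView is a made-up name):

    struct SmoothView : juce::Component
    {
        // 2x supersampled offscreen image that the engine draws into.
        juce::Image hires { juce::Image::RGB, 240, 120, true };

        void paint (juce::Graphics& g) override
        {
            // Scaling the 2x image down to the component's size gives the
            // animation effective half-pixel positions on screen.
            g.drawImage (hires,
                         0, 0, getWidth(), getHeight(),              // destination
                         0, 0, hires.getWidth(), hires.getHeight()); // source
        }
    };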

Just my contribution…

Yes, as you can see, I have already done that… It still needs a little bit of fine-tuning though, as also mentioned in that post. Sometimes it stutters a bit (most likely it gets locked somewhere in between).

However, you won’t get a fluent animation on screen this way, as already explained by jules and also widdershins. The snapshot of the current offscreen image, as you call it, will be rendered from the message thread, and thus this will happen at irregular intervals, which is still going to make the animation a bit choppy.

Also, I believe that the offscreen rendering is somewhat slower than the direct onscreen rendering. BTW is that so, @jules? I guess it also depends on the renderer…

Another thing is that when the target size is not the same as that of the original image, resampling will take place, which is terribly slow with the SW renderer.

I don’t get this part though… But if you mean rendering at a higher resolution and then downscaling to the target size (i.e. supersampling), that will trigger the resampling process, which again is terribly slow. And I am not quite sure JUCE supports rendering to an offscreen target with the GL renderer… Setting the resampling algorithm to “nearest neighbour” helps a lot, but it’s still slow. I would of course appreciate any insight from someone who knows how it is with JUCE and offscreen rendering. Ehm, jules?
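
For reference, the nearest-neighbour setting is one call on the Graphics before the scaled draw (a fragment for inside paint(); ‘offscreen’ stands in for whatever image is being scaled):

    g.setImageResamplingQuality (juce::Graphics::lowResamplingQuality); // nearest neighbour
    g.drawImage (offscreen, 0, 0, getWidth(), getHeight(),
                 0, 0, offscreen.getWidth(), offscreen.getHeight());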

There isn’t an answer except for “it depends on many things”.

For most OSes and rendering engines, it’ll use the native OS repaint region/callback system, and they’ll all handle the injection of paint events into the message queue in different ways. On others it does use a Timer or something similar internally, but you can’t assume anything other than “at some point fairly soon after calling repaint(), you’ll get a paint callback”.

Looking at how our code works so you can find ways to tweak your code to improve performance is probably a bad idea: if it’s so fragile that you need to do this, then whenever something (the OS or JUCE) makes a tiny change to the way things behave (and that will happen), it’ll break your code. Better to do something like alph suggests, probably using OpenGL.

But which of the suggestions? Offscreen rendering using OpenGL? Or any of that supersampling stuff? Currently I do offscreen rendering, but using the SW renderer. I also already do almost everything he describes, apart from double buffering, which I may consider adding, but I doubt that will help significantly. Copying a 120x60 image shouldn’t take anything like 25 ms…

And by the way, JUCE supports float coordinates, which I do use. That should have a fairly similar effect…

And thanks to all of you… It helps me a lot.

Yes, I used to do intense graphics work on Mac and Windows from time to time, and discovered that on Windows I can only count on the software renderer, which really makes things harder when it comes to high-performance 2D transforms.

On Mac I’m pretty sure that drawing maps directly onto CoreGraphics, which makes things happen in a hardware layer, so it really does the job faster. This is the reason I do your kind of stuff exclusively on Mac, where possible.

Since, as said, I handle pixels intensively in an offscreen image, I really improved performance by calling setBufferedToImage(true) on the target component visible on the screen. I suppose this is because every time you ask for a Graphics for a visible component, it takes longer compared to an offscreen one. So the buffered-image implementation takes care of accessing the visible Graphics only once, after all those micro-drawings have happened off screen.
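
In code that is a single call on the on-screen component (myComponent is just a stand-in name):

    // Cache this component's rendering in an internal image, so repaints where
    // nothing has changed become a cheap blit instead of re-running paint().
    myComponent.setBufferedToImage (true);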

This, at least, works as described on a Mac…

Hm, I’m on Windows here…

However, this is getting strange…

To summarize… What I already have:

HighPrecisionTimerThread (60 Hz):

  1. Box2D update

  2. Render to offscreen target using SW renderer

  3. Send the image over Art-Net.

  4. Create a copy of the rendered image with the createCopy() member function

  5. Inform the component it should repaint and set its image to this copy via a C++11 lambda and MessageManager::callAsync, which looks like this:

    img = img.createCopy();
    MessageManager::callAsync ([=] {
        this->imgComponent.setImageWithBroadcast (img);
        this->imgComponent.repaint();
    });

MainThread:
imgComponent, which is in fact a child of ImageComponent, gets repainted because its image was set to a new one.

The point is that if I log the time in milliseconds (getMillisecondCounter()) at the end of the timer thread’s callback, it’s almost always exactly 16 ms, which is OK.

If I log the time at the end of the image paint function, I get something between 15 and 24 ms, which I believe is just fine. I can also tell no frames are missing, because I also log the index of the image I render, and no index is missing at all.

So far so good, right? That’s what I wanted to achieve…

Well, yes. But that’s not what I am seeing. It still lags like twice a second… I can clearly see that on the screen (I don’t know about the Art-Net output at this point; I will try that later). So, if the end of the paint function happens exactly where it should and no images are left out… then the lag has to happen somewhere after that. Is it possible that the frame-buffer switch (or however JUCE rendering on Windows works) is slightly delayed every once in a while? What could be the reason for that? Now I know that the artifacts I was seeing all that time are not even related to how I render it.

I think I can also see the same effect with the AnimationAppExample from the JUCE examples directory. It’s just not so obvious, since the movement is not that simple and the resolution is not that low. But even there I can see it stutter every once in a while.

I know that JUCE is not a game engine and that this is not a standard use case, but I would still like to know what happens and why… I guess I wouldn’t encounter this on a Mac, right? Too bad I don’t have one to try it on…

Also note that in my window I have another component that renders the game as well, but this one uses Box2DRenderer, which is somewhere in JUCE and is called normally from the main thread. On this component the stuttering happens as well, and I believe it happens at the same time as on the ImageComponent (it’s impossible to tell for sure because, you know… there are 60 frames per second). It really looks like the lag is caused by something else after everything is painted. To me, it looks like the frame-buffer switch is delayed for some reason…

A trick we’ve used in Tracktion to catch occasionally-glitching bits of code is to create a class that uses RAII to measure the time a function or block takes to complete, and which will assert/log if it takes more than a maximum duration. It’s easy to create one of those - just a few lines of code - and if you scatter them around your paint routines, you can quickly track down anomalies.
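
A hedged sketch of such a class (not the actual Tracktion code; jassertfalse fires when a scope overruns its budget):

    struct GlitchCatcher
    {
        GlitchCatcher (const char* nameToUse, double maxMilliseconds) noexcept
            : name (nameToUse), limitMs (maxMilliseconds),
              startMs (juce::Time::getMillisecondCounterHiRes()) {}

        ~GlitchCatcher()
        {
            auto elapsed = juce::Time::getMillisecondCounterHiRes() - startMs;

            if (elapsed > limitMs)
            {
                DBG (name << " took " << elapsed << " ms!");
                jassertfalse;   // break into the debugger to see what is running
            }
        }

        const char* name;
        double limitMs, startMs;
    };

    // usage, scattered around paint routines:
    //   void paint (juce::Graphics& g) override
    //   {
    //       GlitchCatcher gc ("MyComponent::paint", 5.0);
    //       ...
    //   }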

sounds like a great tool to share with the community!!

I believe it is already, as ScopedTimeMeasurement

Well, yes, though ScopedTimeMeasurement just logs each measurement. What you need for this kind of thing is a version that asserts when something takes too long, so you can catch it and see what’s going on.

Hm, thanks. I get the idea. But I doubt this is going to help in my case. As I said, I have already measured and logged the time in all the paint functions I had written. And the differences are alright after implementing that trick with HighResolutionTimer. That means the lag has to happen somewhere outside of my code.

Well, if I was trying to track it down, then adding checks inside the timer class itself could find out whether any other timer was taking too long. And hacking some checks into the internals of e.g. MouseInputSource could tell you whether it’s a mouse event, etc. If the message thread is getting blocked, there must be a place where you can detect what’s doing it.