DisplayLink for OSX OpenGL


#1

Hey Jules, have you seen this?
http://developer.apple.com/library/mac/#qa/qa1385/_index.html

It seems like the “Apple-preferred” way to do OpenGL.

I found that link because using the current Juce code, my frame rate sometimes seems to drop below 60 when a different window (not the one with the OpenGLComponent) is focused. I thought maybe this had to do with thread priorities or something. So I was Googling around a bit. Just wanted to point out that page to you.


#2

FYI - frame rate will also get whacked in various edge situations with DisplayLinks. It’s definitely not a panacea.

I would save it more for a fullscreen use. It would be reasonable for a juce window with an OpenGLRenderer attached to the main component to use a DisplayLink when switched to fullscreen and/or kiosk mode. Apart from that, I doubt there would be much benefit.

Bruce


#3

Thanks for the info Bruce.

Isn’t there also an advantage with DisplayLink that the synchronization is slightly better? I’ve heard that the callback is called with just enough time (using measurements from previous calls) to let you draw your graphics before the blanking interval. This seems better than a traditional thread scenario (what’s in Juce now), where you call your GL functions and then call SwapBuffers, because the thread will block until the next interval (i.e., up to 16ms on a 60Hz display). 16ms might not seem like a lot, but it matters for the kind of real-time app I’m working on.

Obviously I suppose I could write some code to measure the duration of my drawing myself, and use that to uSleep before drawing the next frame, but it’s just one less thing to worry about if I do it the Apple-recommended way.


#4

[quote=“MusicMan3001”]Isn’t there also an advantage with DisplayLink that the synchronization is slightly better? I’ve heard that the callback is called with just enough time (using measurements from previous calls) to let you draw your graphics before the blanking interval. This seems better than a traditional thread scenario (what’s in Juce now), where you call your GL functions and then call SwapBuffers, because the thread will block until the next interval (i.e., up to 16ms on a 60Hz display). 16ms might not seem like a lot, but it matters for the kind of real-time app I’m working on.

Obviously I suppose I could write some code to measure the duration of my drawing myself, and use that to uSleep before drawing the next frame, but it’s just one less thing to worry about if I do it the Apple-recommended way.[/quote]

Oh, for a real OpenGL view, it’s probably the way to go - I use it. It’s just not any better at avoiding occasional glitches when other windows overlap your view or similar (maybe slightly worse?).

It will give you a ‘callback’ once per frame, although there are other, OpenGL specific ways to do that. But either way, the drawing thread will block: it’s just a matter of whether juce or Apple does the blocking. If you were thinking that your juce thread blocks after you call SwapBuffers until the next interval btw, I don’t think it does - it just sleeps a defined amount, with a loose attempt at incorporating used time. You can still use that time before you sleep (finish your function), if you want - although, I’m not sure juce makes that easy, if that’s what you’re saying?

Bruce


#5

No, I’m just saying that, forgetting specifically about Juce for a second, there are two ways to do your OS X OpenGL rendering: one, use a regular posix-style thread with a timer or a loop, which is what Juce does, or two, get a callback from the DisplayLink, which is in a thread managed by the Apple code.

If you go the regular posix thread route, you’re doing something like this:
    while (appIsRunning) {
        // draw OpenGL
        glWhatever(whatever);

        // swap buffers
        glSwapBuffers(); // can’t remember the exact function call, but you get the picture
    }
In which case, glSwapBuffers will block (basically sleep) until the blanking interval. So if your glWhatever code only takes 1ms to process, you’re sleeping for ~15ms before you see the result on the screen.

Conversely, using the DisplayLink, or some other timing system, you’re doing something like this:
    while (appIsRunning) {
        // sleep until the last possible moment
        Sleep(16-lastDrawDurationInMs);

        startTime = Time::getTimeInMs();
        glWhatever();
        glSwapBuffers();
        lastDrawDurationInMs = Time::getTimeInMs() - startTime;
    }

See what I’m saying? If you can get the measurement right, the second way gives you a more “instantaneous” representation of the data your drawing is based on, if that data comes from keyboard input, joystick input, audio input, etc. (since those are processed much more often than 60Hz).

Hope that made sense.
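
A minimal sketch of the sleep calculation from the second loop, in portable C++. The names `sleepBeforeDrawMs`, `framePeriodMs` and `safetyMarginMs` are made up for illustration and are not Juce or Apple APIs. The one thing it adds is a clamp at zero, so a frame that overruns its budget never asks for a negative sleep, which the `Sleep(16-lastDrawDurationInMs)` line does not guard against.

```cpp
#include <algorithm>
#include <cassert>

// How long to sleep before starting to draw, so that drawing finishes
// just before the next vertical blank. All values are in milliseconds.
// Hypothetical helper: not part of Juce or any Apple API.
double sleepBeforeDrawMs (double framePeriodMs,
                          double lastDrawDurationMs,
                          double safetyMarginMs)
{
    assert (framePeriodMs > 0.0);

    // Time left in the frame once the expected draw cost and a safety
    // margin have been reserved.
    const double slack = framePeriodMs - lastDrawDurationMs - safetyMarginMs;

    // A frame that overran its budget leaves no slack: draw immediately.
    return std::max (0.0, slack);
}
```

With a 16ms period, a 1ms draw and a 2ms margin this gives 13ms of sleep; a draw that took 20ms gives 0.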


#6

You’re saying that since in theory, the Apple system hands off to your code closer to the actual swap than a casual loop, you may be getting lower latency? I suppose, in theory. But: things are actually triple buffered, so the actual swap and screen display is still disconnected (although in theory in sync) and you can’t tell how close the Apple DisplayLink call is. Let’s be kind and say they give you only 5 milliseconds - which assumes all your drawing is relatively trivial, and has limited texture refresh etc., then you may be getting data an extra 10 milliseconds ‘fresher’? Assuming there’s zero potential for blocking in your data update system.

Of course, 10 milliseconds is way below what we can perceive, but it does all add up. If an extra 10 millisecs is the make or break for an app, I suppose it’s worth finding it.

Another point is that the juce loop isn’t actually related to the swaps, afaict. So it may be anywhere between 1 and 16 milliseconds from the next swap point, and that will swing.

If Jules has a look and wants to add it, I can help: it’s fairly trivial, the main work would be wrapping the Apple-provided thread enough that the juce threading system knows about it.

Bruce


#7

So, in investigating this further, I’ve noticed something interesting. Using the threaded loop, i.e., my first example, which is what Juce uses:

    while (appIsRunning) {
        // draw OpenGL
        glWhatever(whatever);
        // swap buffers
        glSwapBuffers(); // can’t remember the exact function call, but you get the picture
    }

Previously I thought that the SwapBuffers call was sleeping until the blanking interval, but it turns out that it’s actually busy-waiting. Which means that it’s using 100% of the CPU (or I should say, 100% of one of the cores, if it’s a dual-core or quad-core system)! This manifests itself in interesting ways; for example, if I try to open a File dialog while my OpenGL window is running, the File dialog goes REALLY slow. Even other apps go really slow while my OpenGL app is running.

So this is clearly a major performance issue. And I think it’s one of the things that the DisplayLink method tries to resolve.

Side note: if I add a Sleep() call in my render loop, even if it’s just 1ms:
    while (appIsRunning) {
        Sleep(1);
        // draw OpenGL
        glWhatever(whatever);
        // swap buffers
        glSwapBuffers(); // can’t remember the exact function call, but you get the picture
    }
Everything works much better.
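
For anyone who wants to try the same thing outside of Juce, here is a rough portable sketch of that loop using only the C++ standard library. `runRenderLoop`, `drawFrame` and `swapBuffers` are placeholder names standing in for the real OpenGL calls; nothing here is taken from the Juce source.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Runs `frames` iterations of a render loop, sleeping briefly before each
// frame so the thread gives up its core. `drawFrame` and `swapBuffers` are
// stand-ins for the real OpenGL calls. Returns the number of frames drawn.
int runRenderLoop (int frames,
                   const std::function<void()>& drawFrame,
                   const std::function<void()>& swapBuffers,
                   std::chrono::milliseconds pause = std::chrono::milliseconds (1))
{
    assert (frames >= 0);
    int drawn = 0;

    for (int i = 0; i < frames; ++i)
    {
        std::this_thread::sleep_for (pause);  // sleeps for at least `pause`
        drawFrame();
        swapBuffers();
        ++drawn;
    }

    return drawn;
}
```

The pause is a parameter so it is easy to experiment with values other than 1ms.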

I would be surprised if this isn’t affecting a lot of people who are using OpenGLComponents.


#8

But the waitForNextFrame() function already does a sleep, for at least 1ms… (?)


#9

Where is that function? I’m not seeing it.


#10

it’s in OpenGLContext.cpp


#11

You have something wrong. Jules’s loop is fine. The fact that one thread is sleeping is also fine.

I didn’t get into it before, but your assumption about swapBuffers blocking was wrong, yes. It also doesn’t hog the CPU, nor does sleep. If you have swap buffers hogging your machine, there may be something wrong with your OpenGL setup, or it may be you’re seeing that one core ‘idle’.

Is it possible that you’re just seeing the OpenGL system working? Do you know about the OpenGL command queue - that probably all your work is being done when that swap buffers call is issued? It’s usual to see the heavy CPU load then, it’s when the command queue is actually flushed and the heavy work happens.

Bruce


#12

Ah, ok. You’re talking about the repo head. We’re still on 1.53, since we’re working on a commercial app and we generally wait for a “stable” Juce release.

So, you’re aware that using the logic in that waitForNextFrame function, you’re almost never going to be waiting more than 1ms, right? The renderFrame function isn’t going to exit until just after the blanking interval (assuming sync is enabled). Which means that the elapsed time of the render will always be about 1/60 s, i.e. roughly 16ms. So the jmax expression will always return 1.
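
To make the arithmetic concrete, here is a sketch of the kind of calculation I mean. The shape of the expression is my assumption about the logic (remaining frame budget, with a 1ms floor), not a copy of the Juce source, and `waitTimeMs` is a made-up name; `std::max` stands in for jmax.

```cpp
#include <algorithm>
#include <cassert>

// Assumed shape of the wait calculation: sleep for whatever is left of the
// frame budget, but never less than 1 ms. This is a guess at the logic being
// discussed, not a copy of the Juce source.
int waitTimeMs (int targetFps, int elapsedRenderMs)
{
    assert (targetFps > 0);

    const int frameBudgetMs = 1000 / targetFps;   // 16 for 60 fps (integer division)
    return std::max (1, frameBudgetMs - elapsedRenderMs);
}
```

If the render call only returns after the blanking interval, the elapsed time is already 16 or 17ms at 60fps, so the result is pinned at the 1ms floor. Only a render that returns early (say after 4ms) would produce a longer wait.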

Also, minor note on that point, wouldn’t you want the defaultFPS to be based on the monitor’s refresh rate? I have a 120Hz LCD monitor (for gaming) and another much older LCD one that runs at 75Hz.
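
As a sketch of what that would mean for the frame budget (the helper name `framePeriodForRefreshRateMs` is hypothetical, and how the refresh rate is obtained from the OS is deliberately left out):

```cpp
#include <cassert>

// Frame period in milliseconds for a given display refresh rate.
// `refreshHz` would have to be queried from the OS; that part is not shown.
double framePeriodForRefreshRateMs (double refreshHz)
{
    assert (refreshHz > 0.0);
    return 1000.0 / refreshHz;
}
```

That works out to about 16.7ms at 60Hz, 13.3ms at 75Hz and 8.3ms at 120Hz, so a fixed 60fps default would be off by a fair amount on either of my monitors.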


#13

Oh, I didn’t see your reply there Bruce.

Can you clarify what you’re saying? I’m not following you 100%.

I am aware of how the OpenGL pipeline works, and that most of the work is done at the SwapBuffers call.

But are you saying that a thread loop which renders and swaps (with NO sleeping) shouldn’t hog the CPU? On my system, which is a newer Mac Mini, it certainly does appear to be hogging the CPU. The Activity Monitor doesn’t show any activity, but as I said, File dialogs go very slow, etc.

When you say “my OpenGL setup”, are you talking about my code, or about my graphics hardware config?


#14

Jules will explain the hogging the CPU thing. Essentially, if that core has nothing else to do, it can appear to be using the whole CPU.

As for the other thing, glSwapBuffers should not block. The only blocking version of that could be putting a glFinish after it. I don’t have the older juce codebase to compare, but unless it has glFinish, or a similar blocking call, then if you really, really are seeing blocking, you may have something wrong.

To be specific - turning on v-sync means that the actual visual swap doesn’t take place until the next vertical interval. It has no effect on your CPU or what GL commands you enqueue.

Maybe you issued an OpenGL call that would block the queue until after drawing has completed? That might have the effect you describe, like a swap buffers then a glGet then a glFinish, or something?

Wait - I found a couple of caveats. Apparently, on ATI cards, swap buffers can block. And more to the point for you, probably, if you call swap buffers when another swap is pending, then blocking is the expected behavior. Can you post the background thread looping code? For a while there (long after 1.53 though), I think the loop code didn’t have a minimum wait. It does now. Thanks for making me find that info!

But you may be coming to the bit where you have to try what you’re seeing on A) current juce, and B) a different graphics card.

Bruce


#15

Aah, that’s quite interesting, because guess what, I do have an ATI graphics card. I have a Mac Mini, 2.5GHz Core i5, with a Radeon HD 6630M. So I wonder if that’s it. Because I’m pretty sure I’m not doing the other thing (swapping when another swap is pending).

That sucks then! :) Can’t easily expect all the users of my software to have non-ATI cards. Oh well.

Out of curiosity, where did you find the info about the blocking behavior for ATI?

And another question. If a user has a non-ATI card, does that mean there’s really no need to sleep (or wait) at all in the rendering loop?


#16

No, other way around. SwapBuffers returns immediately, and you need to wait or sleep until you want to start drawing the next frame. I have been told - haven’t tried yet - that it’s normal to do anything else you need, then call glFinish, which will block. So that CPU thread just sleeps until the frame has been swapped.

Bruce