What's the fastest way to blit an Image?

An application needs to render a bitmap to a Graphics from a paint function with no blending, transforms, or scaling.

Are there any tips or tricks to ensure that it doesn't go through any of that processing and gets optimally written to the destination?

 

Have you tried something like this:

void VideoCanvas::paint(Graphics& g)
{
    ...
    // this will draw the whole image, keeping its size the same
    g.drawImageAt(image, x, y);
    ...
}

The docs say that drawImageAt() is the simplest drawing method, which seems to imply that minimal processing is happening. Try it out and step through it in a debugger to find out what's involved. Try the different renderers too (software and OpenGL).

Honestly, though, what is "fastest" or "optimal" is going to depend on your application and platform. What have you tried so far, and what are you trying to do? What platforms are you targeting?

Back when JUCE 2.0 was hot on the streets, I wrote a video player using the ffmpeg library. I was able to play DVD quality video in a custom JUCE Component at 29.97 frames/sec. It was flicker free and very smooth! I was able to play Big Buck Bunny (the ubiquitous test video) at a resolution of 854x480 (24 frames/sec, i.e. one frame every 1/24 sec) with no problems. I opened up a full-length 90-minute movie in QuickTime and compared it to my video player; there was virtually no frame drift between the two players, even after an hour. Not only that, I used a ComponentDragger so I could drag the video around within my JUCE application with the mouse while it was playing. That, too, was smooth as ice (no flicker).

I called repaint() to trigger screen redraws, but the thing about this is that there is no guarantee how long it will take for the screen to actually be redrawn. In practice, however, I found the screen updates to be nearly instantaneous. I actually had to put a delay in my code because I could decode and render frames faster than the video framerate (of course, the delay is needed to sync the movie to the video frame rate, but my point is that I had clock cycles to spare).
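For what it's worth, the delay I mean is just "wait until the next frame is due". Here is a minimal sketch of that idea in plain C++, not the code from my player; the names frameDeadline and timeToWait are made up for illustration, and the frame rate is passed as a fraction (e.g. 30000/1001 for 29.97) on the assumption that you want to avoid rounding drift over a long movie.

```cpp
#include <chrono>
#include <cstdint>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: the time at which frame `index` should be shown,
// measured from `start`, for a frame rate of fpsNum / fpsDen frames/sec.
// Computing from the frame index (rather than adding a fixed delay each
// frame) keeps rounding errors from accumulating.
// Note: the intermediate product fits in 64 bits for a few hours of video
// at common frame rates, but not for arbitrarily large indices.
Clock::time_point frameDeadline(Clock::time_point start, std::int64_t index,
                                std::int64_t fpsNum, std::int64_t fpsDen)
{
    const std::int64_t ns = index * fpsDen * 1000000000LL / fpsNum;
    return start + std::chrono::duration_cast<Clock::duration>(std::chrono::nanoseconds(ns));
}

// Hypothetical helper: how long to sleep before showing the frame.
// Returns zero if we are already late, so a slow frame never blocks.
Clock::duration timeToWait(Clock::time_point now, Clock::time_point deadline)
{
    return deadline > now ? deadline - now : Clock::duration::zero();
}
```

In a decode loop you would sleep for timeToWait(Clock::now(), frameDeadline(start, n, 30000, 1001)) and then call repaint() for frame n.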

Trying to render full screen video would be a better test of JUCE's graphics performance, but I've not yet had such a use case. Although, if I did want it to be full-screen, I would probably send each frame to the video card and use hardware scaling to fill the screen. On Windows, you could use Component::getWindowHandle() to get an HWND...once you've got a window handle you can do anything. You could pass that to CreateHwndRenderTarget() and create a Direct2D render target, or you could go full bore and use DirectX. Back in the day (before JUCE supported OpenGL), people would do a similar thing; they would use JUCE components for buttons etc. and have an area where they did OpenGL stuff. But guess what? Now you can just use an OpenGLContext, so pretty much everything I said in this paragraph would be a freaking waste of time to actually try. Not only that, it looks like JUCE already has a Direct2DLowLevelGraphicsContext that is used under the hood on Windows.

I know from my Direct2D days that blending, transforms, and scaling are essentially free if they're handled by the graphics card (OpenGL provides this same benefit). Generally you have to specify these things anyway, even if it's just D2D1::IdentityMatrix() etc. Is there a reason you're worried about this "extra" processing? You'll have to grok the source to figure out what native methods are being called on your system.

For the record, if you're doing video, just use QuickTimeMovieComponent. (I needed to be able to play the video frame by frame in reverse, which is why I tortured myself with ffmpeg.) If you're thinking about breaking the laws of JUCE physics, I hope you have a good reason ;)

Anyway, if you can cache texture data in the graphics card, that's the "fastest" way to render to the screen. If you actually need to render a JUCE Image (especially if it changes each frame), I'm guessing that the bottleneck will be how much bitmap data you're pumping to the graphics card. In my video player above, I was only pushing ~50MB/sec, but it ran smooth as ice using JUCE's software renderer even in DEBUG mode (this is a solid lower bound--you can probably expect much more). At 1920x1200x32@60fps, you'd need to push about 550MB/sec. I've done this with D2D/DirectX; I don't have good numbers on whether JUCE could do this.
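The arithmetic behind those figures is just width x height x bytes per pixel x frames per second, assuming every frame is uploaded whole and uncompressed. A small sketch (bytesPerSecond is a made-up helper name, not a JUCE function):

```cpp
#include <cstdint>

// Hypothetical helper: raw bytes per second needed to push every frame
// uncompressed. Assumes bitsPerPixel is a multiple of 8.
constexpr std::uint64_t bytesPerSecond(std::uint64_t width, std::uint64_t height,
                                       std::uint64_t bitsPerPixel,
                                       std::uint64_t framesPerSecond)
{
    return width * height * (bitsPerPixel / 8) * framesPerSecond;
}

// 1920 x 1200 x 4 bytes x 60 fps = 552,960,000 bytes/sec (about 550MB/sec).
static_assert(bytesPerSecond(1920, 1200, 32, 60) == 552960000ULL,
              "1920x1200x32@60fps");
```

By the same formula, 854x480 at 32 bits and 24 frames/sec works out to roughly 39MB/sec, which is in the same ballpark as the figure for my video player.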

Do you have a bound on how much pixel data you need to push through the pipes?

Well, I don't need that level of performance. We handle portability at our company by rendering our own UI controls at the pixel level, so all we require of the target device is the ability to get bitmaps onto the display. The feel is better if it's faster, but the requirements are not on a par with rendering video frames.

I looked in the source and the drawImageAt function is just a helper that sets up the arguments for the more robust call. 

I guess I could use openGL just to blit this image. Our actual openGL "engine" for UI is going to be a nightmare to port to multi-threaded.

 

So you have your own software renderer, and you're using JUCE just to handle the magic of blitting pixel data to the target platform?

... is going to be a nightmare to port to multi-threaded.

On the JUCE end, don't forget to use MessageManagerLock to lock the message loop if you have other threads updating the UI directly.

I guess I could use openGL just to blit this image.

You could use JUCE's OpenGL renderer, but you might not need to write any actual OpenGL code if you're just blitting pixels. What I did was create an Image, create an Image::BitmapData for it, call getPixelPointer(), and use that uint8 pointer to access the pixels directly. Then I let JUCE handle the actual rendering by calling drawImageAt(). It was plenty fast.
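To show what "access the pixels directly" amounts to, here is a rough sketch in plain C++ with no JUCE in it. PixelBuffer is a hypothetical stand-in for a locked bitmap (raw bytes plus a line stride and a pixel stride), and copyFrame is a made-up name; I'm assuming 32-bit pixels and a tightly packed source frame. The point is that rows may be padded, so you copy row by row rather than in one big memcpy.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a locked bitmap: raw bytes plus strides.
struct PixelBuffer
{
    int width = 0, height = 0;
    int pixelStride = 4;   // bytes per pixel (assumed 32-bit)
    int lineStride = 0;    // bytes per row, may include padding
    std::vector<std::uint8_t> bytes;

    PixelBuffer(int w, int h, int rowPadding = 0)
        : width(w), height(h), lineStride(w * 4 + rowPadding),
          bytes(static_cast<std::size_t>(lineStride) * static_cast<std::size_t>(h), 0) {}

    // Address of the first byte of pixel (x, y).
    std::uint8_t* pixelPointer(int x, int y)
    {
        return bytes.data()
             + static_cast<std::size_t>(y) * static_cast<std::size_t>(lineStride)
             + static_cast<std::size_t>(x) * static_cast<std::size_t>(pixelStride);
    }
};

// Copy a tightly packed 32-bit frame into the top-left corner of dest,
// one row at a time, clipping to the destination size.
// No blending, scaling, or format conversion is done.
void copyFrame(const std::uint8_t* src, int srcWidth, int srcHeight, PixelBuffer& dest)
{
    const int rows = std::min(srcHeight, dest.height);
    const int cols = std::min(srcWidth, dest.width);
    const std::size_t rowBytes = static_cast<std::size_t>(cols) * 4;
    const std::size_t srcStride = static_cast<std::size_t>(srcWidth) * 4;

    for (int y = 0; y < rows; ++y)
        std::memcpy(dest.pixelPointer(0, y),
                    src + static_cast<std::size_t>(y) * srcStride,
                    rowBytes);
}
```

With a real Image you would do the same kind of row loop against the pointer and strides that Image::BitmapData gives you, then hand the Image to drawImageAt() in paint().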

the drawImageAt function is just a helper that sets up the arguments for the more robust call.

Again, no cause for alarm here.