OpenGL Performance on Windows

Has anyone else noticed that performance using drawImage() is much worse on Windows than on a Mac (I tried both with Intel integrated graphics and with some external NVIDIA cards)? Using an older Mac mini with only Intel HD graphics I can get higher frame rates than on a Windows 7 machine (quad-core Xeon) with an NVIDIA Quadro FX580. I also get much better frame rates running in Windows inside Parallels 10 on my MacBook Pro. Are there special optimizations that need to be turned on inside Visual Studio?

So you're attaching an OpenGLContext to a component to use the GL renderer?

You probably have poor OpenGL driver support (Intel integrated), or the OpenGL drivers for your Quadro FX580 are not (fully) installed.

---

To overcome this on Windows, Google created the ANGLE project, which aims to translate OpenGL ES 3.0 calls to DirectX.

It is used in Chrome and Firefox, and it is the default in Qt for Windows applications (plain OpenGL is still available).

Some parts of the ANGLE shader compiler are also used cross-platform. https://code.google.com/p/angleproject/

---

Hi Jules, do you have any plans with ANGLE?

Yes,

Although looking through the forums it appears the way to do this is with *this and not get....  

I just tried using openGLContext.attachTo (*this); instead of *getTopLevelComponent() and it made no difference.

Here is a code snippet inside my paint method with openGLContext.attachTo (*this) being used to enable OpenGL rendering :

void paint (Graphics& g) override
{
    // Create the image once and reuse it for every frame.
    if (! doneOnce)
    {
        img = Image (Image::ARGB, dat.width, dat.height, false);
        doneOnce = true;
    }

    Image::BitmapData bitMap (img, Image::BitmapData::writeOnly);

    // Copy data into bitMap here

    g.drawImage (img, 0, 0, getWidth(), getHeight(), 0, 0, img.getWidth(), img.getHeight());
}

Is there a better way to do this?

Well.. if you write into the image every time you do a paint() then it'll need to upload that new data to the GPU each time too. GL is only fast for images that don't change because they can be cached on the GPU.

I read in a previous post of yours that you were able to play multiple hi-def videos at the same time using this technique. I am only trying to play one 1920x1080 stream (the bottleneck is definitely the g.drawImage()). There are many players out there that do this easily. I would have expected OpenGL to be fast enough for this.

I just checked playing the same video using VLC with OpenGL as the video output, and it plays great (even on lesser hardware). I have verified that all of the time goes into the drawImage call.

Yes, the GL system can certainly handle that, but it's a lot of data and you'll probably need to be a bit more careful about it. You might want to use an OpenGLTexture instead of an Image, or use OpenGLImageType, which I think is what I used for video.
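For a sense of scale, here is a rough sketch of how much data a re-uploaded frame involves. It assumes 4 bytes per ARGB pixel; the function names are made up for illustration, and the frame rate is a parameter because the thread never states the video's actual rate.

```cpp
#include <cstdint>

// Bytes needed for one uncompressed frame at 4 bytes per pixel (ARGB).
constexpr std::uint64_t bytesPerFrame (std::uint64_t width, std::uint64_t height)
{
    return width * height * 4;
}

// Bytes per second that must travel CPU->GPU if every frame is re-uploaded.
// framesPerSecond is an assumption supplied by the caller.
constexpr std::uint64_t uploadBytesPerSecond (std::uint64_t width,
                                              std::uint64_t height,
                                              std::uint64_t framesPerSecond)
{
    return bytesPerFrame (width, height) * framesPerSecond;
}
```

For the 1920x1080 stream mentioned above, bytesPerFrame (1920, 1080) is 8,294,400 bytes (about 7.9 MiB). At a hypothetical 30 frames per second that is roughly 237 MiB of uploads every second, which is why the upload path matters more than the draw itself.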

Using an OpenGLImageType I get similar performance (maybe a touch faster), except that my image is now upside down (should I still be using g.drawImage?). Is there any example code on how to do this properly?

I tried a bunch more things, thinking it was possibly because I had a waitForMillisecondCounter in my paint() method. No matter what I try, I still get poor performance on Windows machines that work fine with other pieces of software. Also, I'm still not sure why the image is upside down (I just changed the invertedCopy inside the Writer in juce_OpenGLImage.cpp to do a memcpy to fix this issue). Note that the image is in the correct orientation when OpenGL is not selected as the image type.
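An upside-down image usually points to a row-order mismatch: OpenGL treats the first row of pixel data as the bottom of the image, while most decoders hand over the top row first. The sketch below shows a vertical flip during the copy in plain C++. It is not the invertedCopy code from juce_OpenGLImage.cpp; the function name and the tightly packed 32-bit layout are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies a tightly packed 32-bit image while reversing the order of its rows.
// Hypothetical helper, not part of JUCE.
std::vector<std::uint32_t> copyFlippedVertically (const std::vector<std::uint32_t>& src,
                                                  std::size_t width,
                                                  std::size_t height)
{
    assert (src.size() == width * height);

    std::vector<std::uint32_t> dst (src.size());

    for (std::size_t y = 0; y < height; ++y)
    {
        // Source row y lands in destination row (height - 1 - y).
        const std::uint32_t* srcRow = src.data() + y * width;
        std::uint32_t* dstRow = dst.data() + (height - 1 - y) * width;
        std::copy_n (srcRow, width, dstRow);
    }

    return dst;
}
```

Whether a flip is needed depends on which side already did one, so swapping a flipping copy for a straight memcpy (as described above) and adding a flip elsewhere are two ways of fixing the same mismatch.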

"thinking it was possibly because I had a waitForMillisecondCounter in my paint() method."

Ouch! Putting a delay inside a paint() method is a terrible idea under any circumstances, and especially bad if the aim is to make it go faster!!

What are you actually trying to build? It sounds to me like you're a bit out of your depth..

I am trying to build a video player using the ffmpeg libraries. I have the audio playing well and the video decoded (and in sync, with no delays in the paint routine anymore). The only issue I have now is drawing the graphics fast enough. On my MacBook Pro, I have no issues and do not drop any frames. The same code, however, compiled in Visual Studio 2013 drops probably 5-10 frames per second. I have worked a lot with ffmpeg in the past (writing codecs, muxers, etc.), but this is my first attempt to use your framework (and OpenGL). I did play around with VLC using OpenGL and it works great on the Windows machines I have been using. I really like your framework and would like to use it for this project (and probably future ones: I have also written AAX plugins for Mac only and think it would be great to be able to support multiple OSs). This is my last sticking point and your help is greatly appreciated.

Thanks,

Darren

I've actually been working on exactly the same thing for a side project (building a wrapper DLL around ffmpeg that provides a GL playback engine), so I know it's possible. We didn't have any particular problems with uploading the data, so perhaps it's your threading model that's wrong - you're not trying to create the GL texture/image in the paint routine, are you?

Actually, I am (sort of). I have the image data already decoded and rescaled (YUV -> ARGB) before the paint routine is called. In paint(), I am just copying the image data to the JUCE image (I only have one JUCE image, which has already been created; I just replace the bitmap data in it each time) and calling draw. I did profile this part of the code, and it is the drawImage call that is taking all of the time. Copying the bitmap data to the JUCE image is almost instantaneous. Also, repaint() is called from a separate thread at the proper intervals.

Copying data to the Image's bitmap data (i.e. CPU->CPU copy) may seem quick, but

a) it's completely unnecessary!

b) when you later try to actually draw it, the renderer needs to move all that data CPU->GPU so it can actually draw it.

None of that should happen in your paint loop. We found that the CPU->GPU is best done asynchronously on another thread so that the GL texture is already ready to draw, and the paint method takes almost no time.

I just assumed that was not allowed since if I tried to create an OpenGL type Juce image on another thread besides the OpenGL one, it would fail.  

Which method would I call to copy this data directly to the GPU?

...and thank you so much for all of your help. 

You can only use GL calls on the GL thread, but you can do some work in renderOpenGL rather than paint, which effectively runs on a separate thread.
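One way to keep paint() cheap while respecting the GL-thread rule is to have the decoder thread only hand over pixels, and let the GL callback pick up the newest frame when it runs. The following is a minimal sketch of that handoff in plain C++, with no GL or JUCE calls; LatestFrameMailbox and its methods are hypothetical names, not part of any library.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper: holds the most recently decoded frame until the
// render thread is ready to upload it.
class LatestFrameMailbox
{
public:
    // Called from the decoder thread. Replaces any frame not yet consumed,
    // so a slow renderer skips stale frames instead of queueing them.
    void publish (std::vector<std::uint32_t> pixels)
    {
        const std::lock_guard<std::mutex> lock (frameMutex);
        pending = std::move (pixels);
        hasPending = true;
    }

    // Called from the render thread (for example at the top of renderOpenGL).
    // Returns true and fills 'out' only when a new frame has arrived.
    bool tryTake (std::vector<std::uint32_t>& out)
    {
        const std::lock_guard<std::mutex> lock (frameMutex);

        if (! hasPending)
            return false;

        out.swap (pending);
        hasPending = false;
        return true;
    }

private:
    std::mutex frameMutex;
    std::vector<std::uint32_t> pending;
    bool hasPending = false;
};
```

In this arrangement the render callback would call tryTake() and, only when it returns true, write the pixels into the texture or framebuffer. The lock is held just long enough to swap two vectors, so neither thread waits on a pixel copy.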

"Which method would I call to copy this data directly to the GPU?"

Well, I think we may have been using a GL framebuffer. Actually, maybe the OpenGLImageType does store all its data in a framebuffer and not CPU memory.. can't remember exactly, but obviously you'd need to get it into either a framebuffer or texture.

If I use a framebuffer and load it up in the renderOpenGL() function (write pixels), what do I put in the paint method to display this framebuffer (and perhaps scale it as well, as an equivalent to drawImage())?

Should I still just use the OpenGLImageType, then use getFrameBufferFrom to copy my data to the image, then finally just use the drawImage() function in my paint()? Or is there a better way?

Probably best to just use an OpenGLImageType and draw it in your paint method.

It is better and the image is right side up again, but still a bit slow on Windows (again, just comparing this to other OpenGL implementations). Here are my two functions. Is there something else obvious that I am doing incorrectly? I only have one line of code in my paint() now. Note that I create the img in newOpenGLContextCreated() and just keep reusing it.

void renderOpenGL() override
{
    // Get the decoded frame data here (almost instant), then:

    // img was created once in newOpenGLContextCreated() and is reused.
    if (OpenGLFrameBuffer* buffer = OpenGLImageType::getFrameBufferFrom (img))
        buffer->writePixels ((const PixelARGB*) dat.frame->data[0], rect);
}

void paint (Graphics& g) override
{
    g.drawImage (img, 0, 0, getWidth(), getHeight(), 0, 0, img.getWidth(), img.getHeight());
}

Note:  This is now usable on better hardware.