You can use Graphics::getInternalContext().getPhysicalPixelScaleFactor()
I missed that one! Thanks, it's way better now!
Now, about rendering performance, two things:
1. First, on the "should we use accelerated rendering or not" debate: for now, I would say that the CoreGraphics and Direct2D implementations are either not fast enough or not robust enough. That leaves us with the OpenGL option. But as plugin developers, our philosophy really is: the simpler, the lighter, the better. We use OpenGL in one plugin for which we definitely needed high graphics performance, because of the real-time analysis and the advanced graphics we needed to display.
But for all our other, more 'regular' products, which only display images and draw basic stuff on small portions of the screen, a decent software renderer should be enough for us. Besides, one OpenGL context per plugin view means one thread per open GUI. It also means that each of those GL context threads has to sync with the main thread in order to render while the MessageManager is locked. Basically, we see a performance loss starting at around 4 open GUIs, depending on the host, of course. So I would say that Juce's model for handling OpenGL rendering is not well suited to plugin development, because of that strong dependency between the message thread and the paint() callbacks, which could otherwise be handled asynchronously.
So for now, we're fairly happy with the Software Renderer: it's lightweight, it's completely cross-platform, and it offers decent performance (or it did until last summer, which is my next point...).
About OpenCL and promising future stuff, we would be glad to see Juce using those, of course, especially for hi-res display handling, as you mention.
2. So then, I spent a few hours today analyzing and tracking the Software Renderer's performance, and I found what the main bottleneck is. First, I found a couple of small improvements to add to the RenderingHelpers stuff; I'll clean them up and send you a patch. Basically, it's just a few refactors of some core functions (loops over lines/pixels etc.) that allow the compiler to generate better and shorter assembly code.
But then I also found something else in the PixelFormats code. Last summer, after updating Juce at some point, we had to deal with a big perf loss. In Debug, some of our products that were working fine just before the update completely failed to run properly with the new Juce version: the message thread was just stuck handling endless paints. To address that, we reworked and optimized all our drawing code so it is as efficient as possible.
And today, in addition, we are facing some issues with Retina displays. So I ran some tests, and I found the cause of that perf loss, here it is:
Basically, you made a fix that adds clamping inside each Pixel::blend() call. The thing is, in my basic tests with and without that changeset (just calling blend() 100000000 times and seeing how long it takes), the unclamped version of PixelARGB::blend(PixelARGB&) is approximately 2.5x faster than the newest version.
As this single operation is basically the most performed operation during any piece of drawing, we can reasonably say that after that commit, the Software Renderer was roughly 2.5x slower than before...
I completely understand why you needed to add that additional clamping, but I'm sure we can find a faster way of dealing with it.
I will test a few things and see how it goes, and I'll let you know about my findings, but if something needs to be done, this is the place to start...
The unclamped version is about 10-15 instructions, the clamped version is about 40-50 instructions. I'm sure there's something in between...
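To make the difference concrete, here is a minimal sketch of the two variants. It is not the actual Juce code: it assumes premultiplied ARGB packed as 0xAARRGGBB in a uint32_t, processes two channels per 32-bit word, and all the names (evenBytes, scaleLanes, saturateLanes, blendUnclamped, blendClamped) are hypothetical.

```cpp
#include <cstdint>

// Hypothetical sketch, not the Juce implementation.
// Pixels are assumed to be premultiplied ARGB packed as 0xAARRGGBB.
constexpr uint32_t kLaneMask = 0x00ff00ffu;

// R and B, each sitting in the low byte of its own 16-bit lane.
constexpr uint32_t evenBytes(uint32_t p) { return p & kLaneMask; }

// A and G, shifted down into the same two-lane layout.
constexpr uint32_t oddBytes(uint32_t p) { return (p >> 8) & kLaneMask; }

// 256 minus the source alpha, so the result is in [1, 256].
constexpr uint32_t inverseAlpha(uint32_t src) { return 0x100u - (src >> 24); }

// Scale both lanes by alpha / 256 and keep 8 bits per lane.
constexpr uint32_t scaleLanes(uint32_t lanes, uint32_t alpha)
{
    return ((lanes * alpha) >> 8) & kLaneMask;
}

// Saturate each lane to 0xff if its sum overflowed into bit 8.
constexpr uint32_t saturateLanes(uint32_t x)
{
    return (x | (0x01000100u - ((x >> 8) & 0x00010001u))) & kLaneMask;
}

// Unclamped: an overflowing channel simply wraps around.
constexpr uint32_t blendUnclamped(uint32_t dst, uint32_t src)
{
    return ((evenBytes(src) + scaleLanes(evenBytes(dst), inverseAlpha(src))) & kLaneMask)
         | (((oddBytes(src) + scaleLanes(oddBytes(dst), inverseAlpha(src))) & kLaneMask) << 8);
}

// Clamped: same sums, but each channel saturates at 0xff.
constexpr uint32_t blendClamped(uint32_t dst, uint32_t src)
{
    return saturateLanes(evenBytes(src) + scaleLanes(evenBytes(dst), inverseAlpha(src)))
         | (saturateLanes(oddBytes(src) + scaleLanes(oddBytes(dst), inverseAlpha(src))) << 8);
}
```

With valid premultiplied input both functions return the same pixel; they only differ when a channel sum goes above 0xff. In this sketch the clamp costs a handful of extra operations per pair of channels, so the instruction counts above will not match it exactly. Timing either function in a tight loop (with std::chrono, for example) is the kind of dumb test I mean.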
I'll keep you posted.
EDIT: I've attached some basic test code, you can run it here: http://www.compileonline.com/compile_cpp11_online.php, it's a very dumb test, but it shows the point...