I'm starting work on a project in JUCE that will involve lots of 2D graphics (a special-purpose game engine). I need to be able to render many (e.g. 10,000) small (e.g. 64x64) ARGB images (originally PNG, if that matters) to the screen per frame at 60 FPS. The images may be flipped vertically or horizontally, but they will not be scaled or clipped (by anything other than the window). If it turns out to be faster, I can make pre-flipped copies of the appropriate images; memory is not much of a concern at this point.
I'm running Ubuntu 14.04 64-bit with an NVIDIA GTX 660 and the proprietary drivers, in case this matters, but I want the result to be reasonably portable to anyone with a modern GPU on Windows and Mac too.
I made a little stress-test program with some code adapted from the 2D background behind the 3D teapot in the demo: a component that's also an OpenGLRenderer, which gets a LowLevelGraphicsContext and puts it in a Graphics object. The actual rendering, though, is just g.drawImageAt(etc). This runs fine with 1000 images per frame but drops to 6 FPS at 10,000 images per frame. It looks like the work is being done by the CPU, though.
I'm assuming that drawing images directly to the Graphics in Component::paint() won't be any faster than this. But is there any way to use the full power of my GPU to draw these images? Maybe putting each image on two triangles and rendering them as 3D? Custom vertex shaders? I don't have any previous experience programming OpenGL at anything close to this level, so I'm not sure what's available.
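To make the two-triangles idea more concrete, here is roughly what I have in mind, as a sketch only and not something I've tried against a real GL context. The Vertex and Sprite structs and the function names are made up for illustration and are not JUCE API. It assumes all the images have been packed into a single texture atlas, so that a flip is just a swap of texture coordinates and no pre-flipped copies are needed.

```cpp
#include <utility>
#include <vector>

// One corner of a textured quad: screen position plus texture coordinate.
struct Vertex { float x, y, u, v; };

// Hypothetical sprite description: top-left position and size in pixels,
// the sub-rectangle of the (assumed) texture atlas it uses, and flip flags.
struct Sprite
{
    float x, y, w, h;
    float u0, v0, u1, v1;
    bool flipH, flipV;
};

// Appends two triangles (six vertices) for one sprite.
// Flipping is done by swapping texture coordinates, not by moving the quad.
void appendQuad (std::vector<Vertex>& out, const Sprite& s)
{
    float u0 = s.u0, u1 = s.u1, v0 = s.v0, v1 = s.v1;
    if (s.flipH) std::swap (u0, u1);
    if (s.flipV) std::swap (v0, v1);

    const Vertex tl { s.x,       s.y,       u0, v0 };
    const Vertex tr { s.x + s.w, s.y,       u1, v0 };
    const Vertex bl { s.x,       s.y + s.h, u0, v1 };
    const Vertex br { s.x + s.w, s.y + s.h, u1, v1 };

    out.insert (out.end(), { tl, tr, bl, tr, br, bl });
}

// Builds one contiguous vertex array for every sprite in the frame.
std::vector<Vertex> buildBatch (const std::vector<Sprite>& sprites)
{
    std::vector<Vertex> verts;
    verts.reserve (sprites.size() * 6);

    for (const auto& s : sprites)
        appendQuad (verts, s);

    return verts;
}
```

My guess is that the resulting array would be uploaded to a vertex buffer once per frame and drawn with a single glDrawArrays call using GL_TRIANGLES, instead of issuing 10,000 separate image draws, but I don't know whether that is the right approach from inside JUCE.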