OpenGLComponent takes up to 90% CPU


#1

Hi,
I have a Win32 VST plugin which has an OpenGLComponent. For testing purposes, renderOpenGL() does exactly what the juce OpenGL demo does (called by a timer every 20ms). If I open one instance of the plugin, things are fine and the painting doesn’t take more than about 1% of the CPU. If I open a second instance, painting the OpenGL content takes about 90% CPU until the OpenGL area of one of the two editors is completely covered by another window (or by the other instance).
If I open two instances of the juce demo app and let them do the same openGL stuff, it works great without any CPU consuming issues.

Another strange thing is, my editor shows some vu meters, too, which use the juce paint routines and a timer (no openGL). Now, when I have only one instance of the editor and let it draw the meters and have the openGL repaint timer stopped, it takes less than 1% CPU. If I have the openGL repaint timer started and the vu meter timer stopped, it also takes about 1%. If both timers are enabled and call the repaint of the components it takes about 8% with regularly happening CPU peaks at about 20%.

The described problems appear in debug and release mode.

After almost three days of debugging I still can’t find any explanation for this behaviour. Perhaps one of you has had a similar problem and knows a solution. Any hints on where to start checking are welcome!
Thanks


#2

Sounds like you’ve got a locking issue there, though I can’t think exactly which lock might be the cause…


#3

Maybe you’re right, but I have no clue how the two independent DLLs should lock each other.
From the VS2008 Profiler I can see that the nvoglnt.dll takes 88%.

Is it possible that my way of linking and including “windows.h”, “gl.h” and “glu.h” isn’t ok?
I have a set of custom components that are subclassed from juce::Component or in this case juce::OpenGLComponent. All of these components are compiled to a static library “myLibrary.lib” which is linked to the juce lib. The Editor itself includes “juce.h” and the something like “mySubclassesHeaders.h”. In the .cpp file of myOpenGLComponent (which is a part of “myLibrary.lib”) the “windows.h”, “gl.h” and “glu.h” are included.

Perhaps meanwhile I’m really confused, but am I messing up something by including and linking the stuff in this way???


#4

When you have two instances of a plugin, the DLL doesn’t get loaded twice, it just creates two copies of the plugin object. Maybe you’ve got some dodgy static variables in there?
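
To illustrate the point about static variables, here is a minimal sketch in plain C++. It is not juce code: `FakeEditor`, `Result` and `simulateTwoInstances` are hypothetical names chosen for the example. It shows that static state exists once per loaded DLL, while ordinary members exist once per plugin object.

```cpp
// Hypothetical stand-in for a plugin editor; this is not a juce class.
class FakeEditor
{
public:
    FakeEditor()  { ++liveInstances(); }
    ~FakeEditor() { --liveInstances(); }

    // Shared state: one counter per loaded DLL, however many editors exist.
    static int& liveInstances()
    {
        static int count = 0;
        return count;
    }

    // Also shared: every editor increments the same frame counter.
    static int& sharedFrames()
    {
        static int frames = 0;
        return frames;
    }

    // Per-instance state: every editor has its own copy.
    int ownFrames = 0;

    void draw()
    {
        ++ownFrames;
        ++sharedFrames();
    }
};

struct Result { int instances, framesA, framesB, shared; };

// Simulates a host creating two editors from the same DLL, drawing
// one frame with the first editor and two frames with the second.
inline Result simulateTwoInstances()
{
    FakeEditor::sharedFrames() = 0;
    FakeEditor a, b;
    a.draw();
    b.draw();
    b.draw();
    return { FakeEditor::liveInstances(), a.ownFrames, b.ownFrames, FakeEditor::sharedFrames() };
}
```

Calling `simulateTwoInstances()` gives per-editor frame counts of 1 and 2, but a shared count of 3. Any static that caches a GL handle, a timer or a lock would be shared between the two editors in the same way.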


#5

Hey Jules,
after checking my code again and again for days I came to the conclusion that this strange CPU behaviour must be somehow related to the juce framework (which is always my last thought).
So I took the juce demo VST plugin (win32) and added a simple OpenGLComponent. The result when loading several instances of the release version within the same host was:
1 instance: between 0% and 1% CPU
2 instances: jumping between 1% and 3% CPU
3 instances: jumping between 1% and 6% CPU
4 instances: jumping between 45% and 50% CPU
5 instances: jumping between 45% and 50% CPU
6 instances: jumping between 45% and 50% CPU

This was reproducible with several hosts and three different Windows machines.
I don’t know much about the internal OpenGL and OS handling, but I think this behaviour isn’t normal.
Here is the code of my OpenGL test plugin, and I hope this is reproducible on your machine. Otherwise I’m losing hope of ever getting my real OpenGL components to work properly. You can simply copy it to DemoEditorComponent.cpp

#ifdef _WIN32
 #include <windows.h>
#endif

#include "includes.h"

#if JUCE_OPENGL

#ifdef _WIN32 
 #include <gl/gl.h>
 #include <gl/glu.h>
#elif defined (LINUX)
 #include <GL/gl.h>
 #include <GL/glut.h>
 #undef KeyPress
#else
 #include <GLUT/glut.h>
#endif

#ifndef GL_BGRA_EXT
 #define GL_BGRA_EXT 0x80e1
#endif

#include "DemoEditorComponent.h"
//========================================================================================
class OpenGLTest :	public OpenGLComponent,
					public Timer
{
public:
	OpenGLTest (const String& name)
		:numPoints (60)
	{
		for(int i=0; i<numPoints; ++i)
		{
			dataX[i] = float(i);
			dataY[i] = float(i);
		}
		startTimer(30);
	}

	~OpenGLTest()
	{
		stopTimer();
	}

private:
	void newOpenGLContextCreated()
	{
		glClearColor (1.0f, 1.0f, 1.0f, 1.0f);
		glDepthRange(0.0, 1.0);

		glEnable (GL_BLEND);
		glShadeModel (GL_SMOOTH);

		glEnable(GL_LINE_SMOOTH);
		glHint (GL_LINE_SMOOTH_HINT, GL_NICEST);
		glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
	}

	void renderOpenGL()
	{
		glClear (GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
		glViewport(0, 0, getWidth(), getHeight()); 
			
		glMatrixMode(GL_PROJECTION);
		glLoadIdentity();

		int i = 0;
		glOrtho (dataX[0], dataX[numPoints-1], dataY[0], dataY[numPoints-1], 0, 1);

		GLfloat ctrlpoints3d[2][4][3];
		glEnable(GL_MAP2_VERTEX_3);
		glEnable(GL_MAP1_VERTEX_3);
		glEnable(GL_AUTO_NORMAL);
		glMapGrid2f(20, 0.0, 1.0, 1, 0.0, 1.0);
		glMapGrid1f(20, 0.0, 1.0);

		glColor4f (1.f, 1.f, 0.f, 0.5f);

		for(int j = 0; j < numPoints-1; ++j)
		{
			ctrlpoints3d[0][0][0] = dataX[j];
			ctrlpoints3d[0][0][1] = dataY[0];
			ctrlpoints3d[0][0][2] = 0.f;

			ctrlpoints3d[0][1][0] = dataX[j] + (dataX[j+1] - dataX[j]) / 2.f;
			ctrlpoints3d[0][1][1] = dataY[0];
			ctrlpoints3d[0][1][2] = 0.f;

			ctrlpoints3d[0][2][0] = dataX[j] + (dataX[j+1] - dataX[j]) / 2.f;
			ctrlpoints3d[0][2][1] = dataY[0];
			ctrlpoints3d[0][2][2] = 0.f;

			ctrlpoints3d[0][3][0] = dataX[j+1];
			ctrlpoints3d[0][3][1] = dataY[0];
			ctrlpoints3d[0][3][2] = 0.f;
			
			ctrlpoints3d[1][0][0] = dataX[j];
			ctrlpoints3d[1][0][1] = dataY[j];
			ctrlpoints3d[1][0][2] = 0.f;

			ctrlpoints3d[1][1][0] = dataX[j] + (dataX[j+1] - dataX[j]) / 2.f;
			ctrlpoints3d[1][1][1] = dataY[j];
			ctrlpoints3d[1][1][2] = 0.f;

			ctrlpoints3d[1][2][0] = dataX[j] + (dataX[j+1] - dataX[j]) / 2.f;
			ctrlpoints3d[1][2][1] = dataY[j+1];
			ctrlpoints3d[1][2][2] = 0.f;

			ctrlpoints3d[1][3][0] = dataX[j+1];
			ctrlpoints3d[1][3][1] = dataY[j+1];
			ctrlpoints3d[1][3][2] = 0.f;		

			glMap2f(GL_MAP2_VERTEX_3, 0, 1, 3, 4, 0, 1, 12, 2, &ctrlpoints3d[0][0][0]);
			glEvalMesh2(GL_FILL, 0, 20, 0, 1);

			glColor4f (1.f, 0.f, 1.f, 1.f);

			glMap1f(GL_MAP1_VERTEX_3, 0, 1, 3, 4, &ctrlpoints3d[1][0][0]);
			glEvalMesh1(GL_LINE, 0, 20);

			glColor4f (1.f, 1.f, 0.f, 0.5f);
		}																
		glDisable(GL_MAP2_VERTEX_3);
		glDisable(GL_MAP1_VERTEX_3);
		glDisable(GL_AUTO_NORMAL);
	}

	void timerCallback()
	{
		repaint();
	}

	float dataX[60];
	float dataY[60];
	const int numPoints;
};


//==============================================================================
DemoEditorComponent::DemoEditorComponent (DemoJuceFilter* const ownerFilter)
    : AudioProcessorEditor (ownerFilter)
{
	//add a OpenGLComponent 
	addAndMakeVisible (openGLTestObject = new OpenGLTest(T("OpenGl Test Window")));

    // add the triangular resizer component for the bottom-right of the UI
    addAndMakeVisible (resizer = new ResizableCornerComponent (this, &resizeLimits));
    resizeLimits.setSizeLimits (100, 100, 1200, 800);

    // set our component's initial size to be the last one that was stored in the filter's settings
    setSize (ownerFilter->lastUIWidth,
             ownerFilter->lastUIHeight);

    // register ourselves with the filter - it will use its ChangeBroadcaster base
    // class to tell us when something has changed, and this will call our changeListenerCallback()
    // method.
    ownerFilter->addChangeListener (this);
}

DemoEditorComponent::~DemoEditorComponent()
{
    getFilter()->removeChangeListener (this);

    deleteAllChildren();
}

//==============================================================================
void DemoEditorComponent::paint (Graphics& g)
{
    // just clear the window
    g.fillAll (Colour::greyLevel (0.9f));
}

void DemoEditorComponent::resized()
{
	openGLTestObject->setBounds (10, 10, getWidth() - 20, getHeight() - 20);

    resizer->setBounds (getWidth() - 16, getHeight() - 16, 16, 16);

    // if we've been resized, tell the filter so that it can store the new size
    // in its settings
    getFilter()->lastUIWidth = getWidth();
    getFilter()->lastUIHeight = getHeight();
}

//==============================================================================
void DemoEditorComponent::changeListenerCallback (void* source)
{
    updateParametersFromFilter();
}

//==============================================================================
void DemoEditorComponent::updateParametersFromFilter()
{

    setSize (getFilter()->lastUIWidth,
             getFilter()->lastUIHeight);
}

#endif

And use this cleaned DemoEditorComponent.h

#ifndef DEMOJUCEPLUGINEDITOR_H
#define DEMOJUCEPLUGINEDITOR_H

class OpenGLTest;

#include "DemoJuceFilter.h"
//==============================================================================

class DemoEditorComponent   : public AudioProcessorEditor,
                              public ChangeListener
{
public:
    DemoEditorComponent (DemoJuceFilter* const ownerFilter);

    /** Destructor. */
    ~DemoEditorComponent();

    void changeListenerCallback (void* source);

    //==============================================================================
    /** Standard Juce paint callback. */
    void paint (Graphics& g);

    /** Standard Juce resize callback. */
    void resized();


private:
    //==============================================================================
    ResizableCornerComponent* resizer;
    ComponentBoundsConstrainer resizeLimits;
    OpenGLTest* openGLTestObject;

    void updateParametersFromFilter();

    DemoJuceFilter* getFilter() const throw()       { return (DemoJuceFilter*) getAudioProcessor(); }
};
#endif

I really hope there is a simple solution for this problem.
Thanks a lot


#6

I doubt if you’re going to be able to find a simple fix - my gut feeling is that the GL drivers are hitting some kind of internal threading issue. It’s very unusual to have more than a couple of opengl windows running at the same time, and even more rare to see them all animating at the same time, so it’s not impossible that they’re just not designed to handle this kind of thing. And all of that graphics data needs to share the same hardware, which probably involves all kinds of horrible switching between contexts many times a second.

If your profiler shows that the cpu is being spent in the driver, then you’re probably screwed, unless there are some obscure GL flags that you can set that tell it to optimise for multi-windowed use, but I’m afraid I’m not enough of a gl expert to know!


#7

Hi again,
now I did some other tests to make sure that my problem is really opengl related. I took the juce demo app and added a timer controlled repaint to the PathsAndTransformsDemo.cpp.
I just subclassed from Timer, added a startTimer() call to the constructor and a timerCallback() that calls repaint().

That’s it.
When I run the demo I get some strange performance problems:
Depending on the number and size of the app windows, it takes between 0% and 100% CPU. Of course, I usually wouldn’t repaint() the whole thing, only what is needed; it’s just a test. Anyway, even with only one demo app open, a few more pixels to draw on the screen sometimes make a difference of 1% to 30% on my test machines. Could you please add this timer and play around with resizing the window so you can see what I mean? Sometimes I only have to wait a few minutes and the CPU usage rises by, let’s say, 15% without anything having been changed or resized.
This becomes even worse and more random if I open the opengl demo instead of PathsAndTransforms (using the code from above in renderOpenGL()). For example, with three demo apps (opengl) open, if I open a new one, all the open apps need 10% to 20% more. If I now close this new app, their CPU usage should go back to the level it was at before, but it doesn’t (well, sometimes it does, but sometimes not).
This all seems a bit random to me.
To verify that the opengl problems are related to juce and not to opengl or the graphics cards in my test machines, I made a Windows test app using glut32 (without the juce framework) and let it perform exactly the same drawing with the same timer interval as my modified juce OpenGLDemo in the juce demo app. The result was steady performance no matter whether I opened one instance of the win32 app or twenty. All instances had much the same performance regardless of the number of open apps.

So I must come to the conclusion that this serious performance problem is related neither to opengl nor to the use of several .dlls within one app. I have no other explanation than that there must be something strange in the way juce handles its painting, at least on Windows (perhaps in combination with the juce Timer).
I know it’s strange that apparently nobody has had these problems before (or nobody else has posted about them), but I checked and double checked before posting this and I’m pretty sure that I didn’t miss the point.

Since this problem is very serious to me (in fact, if there is no solution for this I’m forced to switch the framework, which would cost me a lot of time and tears too because I really love juce), I beg you, Jules to have a look at this.

waiting for some good news…


#8

Ok, so this actually sounds more like it’s the win32 paint scheduler that you’re measuring. But really, is this genuinely stopping your app from working? Nowhere here did you actually say that your app fails, just that it’s using up to 45% cpu, which sounds fine to me.

And think of this: when the event thread gets busy, there’ll be a point where win32 suddenly stops sleeping between messages, and the performance measurements will suddenly jump. I also bet that the windows scheduler has different policies depending on a process’s activity level, and it might stop task-switching away from it as much when it decides it’s busy. If you can point to some juce code and say “that’s the hot-spot where it’s blocking/spinning/whatever” then I’ll take a look, but many people have gone mad unnecessarily from staring too long at cpu meters…


#9

The desperate one again,
first of all, it doesn’t stop my app from working. But if I do some heavy signal processing with several FFTs, linear phase filtering, feature extraction… and optimize this to the max so that in the end it needs about 2% on an average machine, and then one instance of the plugin with a not too fancy GUI takes 50%, that makes it a bit useless. Not even I would like to use such a plugin, and I think neither would anybody else.

I really would like to point you to the hot spot, if only I knew where it is.
This time I tried to make it as simple as possible to describe what drives me crazy, so I’ve moved away from opengl, threads and any special stuff:

I just added a really basic app to the juce demo:

[code]
class TimerAndDrawDemo : public Component,
                         public Timer
{
public:
    TimerAndDrawDemo()
    {
        startTimer (15);
    }

    ~TimerAndDrawDemo()
    {
        stopTimer();
    }

    void paint (Graphics& g)
    {
        g.fillAll (Colours::white);
        g.fillRect (getWidth() / 4, getHeight() / 4, getWidth() / 2, getHeight() / 2);
    }

private:
    void timerCallback()
    {
        repaint();
    }
};

Component* createTimerAndDrawDemo()
{
    return new TimerAndDrawDemo();
}[/code]

If I open this within the juce demo with a size of 1000 x 1000 pixels it takes not even 1% CPU.
If I open this within the juce demo with a size of 1200 x 1000 pixels it takes almost constantly 50% CPU.
This brings me to the assumption that there must be a lock situation, because drawing only a few more pixels leads to this explosion in CPU usage.

I know that it’s almost impossible for one person to take care of all the requests and keep a complex framework like juce running. All the more, I’m sorry that I can’t point to a specific line in the juce code and say that’s the one causing trouble. But I think including the simple code example from above in your demo won’t take more than two minutes and should make it easy to see what I mean.

Thanks in advance…


#10

It may be time for you to split your OpenGL away from Juce repaints and events. That’s the main reason a lot of us are using OpenGL, at least in a few places - lower overhead for constant re-drawing. The framework is fairly setup to allow you to spin things off, but (search the forums for OpenGL) you’ll see there’s snags.

You could do a simple test - follow the standard overrides to decouple your OpenGL from the repaint loop, then try a simple timer to regularly hit your OpenGL component’s paint method (on the main thread).

If that helps a bit, then maybe doing a full threaded approach will help you. It’s non-trivial, but easier than fighting with another framework!

Bruce


#11

Codswallop! The point I was trying to make earlier is that you can’t talk about cpu figures for a gui in the same way you talk about cpu figures for an audio processor.

When you see the 1% to 50% jump in your example, you can bet that the scheduler is saying “ok, this thread’s getting quite busy, so let’s stick it on its own core and leave it running uninterrupted, so we can keep all the other more intermittent processes on the other core”. That’s a good policy, because if no other process needs that 50%, it’s more efficient just to let your process use it. It does NOT mean that half of your cpu is now unavailable!

Have you used the profiler on this? Show me a hotspot somewhere in the code and I’ll debug it, but I’m not going to waste time staring at the task manager wondering what the scheduler is doing, when this “problem” might just be imaginary!
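
One way to look for such a hotspot without a full profiler is to time the suspect code directly. The sketch below is an illustration under assumptions, not juce code: `SectionStats`, `ScopedSectionTimer` and `fakePaint` are hypothetical names, and it measures wall-clock time rather than CPU time.

```cpp
#include <chrono>

// Hypothetical helper: accumulates the wall-clock time spent inside a section.
struct SectionStats
{
    long long calls = 0;
    std::chrono::nanoseconds total { 0 };

    double averageMicroseconds() const
    {
        if (calls == 0)
            return 0.0;

        return std::chrono::duration<double, std::micro> (total).count()
                 / static_cast<double> (calls);
    }
};

// Starts timing on construction and adds the elapsed time on destruction.
class ScopedSectionTimer
{
public:
    explicit ScopedSectionTimer (SectionStats& s)
        : stats (s), start (std::chrono::steady_clock::now()) {}

    ~ScopedSectionTimer()
    {
        const auto elapsed = std::chrono::steady_clock::now() - start;
        stats.total += std::chrono::duration_cast<std::chrono::nanoseconds> (elapsed);
        ++stats.calls;
    }

    ScopedSectionTimer (const ScopedSectionTimer&) = delete;
    ScopedSectionTimer& operator= (const ScopedSectionTimer&) = delete;

private:
    SectionStats& stats;
    std::chrono::steady_clock::time_point start;
};

// Stand-in for a paint() body: some busy work to measure.
inline long long fakePaint (SectionStats& stats, int pixels)
{
    ScopedSectionTimer timer (stats);   // timed until the function returns

    long long sum = 0;
    for (int i = 0; i < pixels; ++i)
        sum += i % 7;

    return sum;
}
```

If the average time per call stays small while the task manager still shows a busy core, the time is probably going somewhere other than the paint code itself, for example the driver or the message loop.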


#12

[quote=“Bruce Wheaton”]It may be time for you to split your OpenGL away from Juce repaints and events. That’s the main reason a lot of us are using OpenGL, at least in a few places - lower overhead for constant re-drawing. The framework is fairly setup to allow you to spin things off, but (search the forums for OpenGL) you’ll see there’s snags.

You could do a simple test - follow the standard overrides to decouple your OpenGL from the repaint loop, then try a simple timer to regularly hit your OpenGL component’s paint method (on the main thread).

If that helps a bit, then maybe doing a full threaded approach will help you. It’s non-trivial, but easier than fighting with another framework!

Bruce[/quote]

Could you explain how to decouple OGL from the paint loop? Without involving glut, how can I repaint an opengl component without a timer?

I’m having similar issues with two opengl components in a single app - once I disable one, my cpu (one of the cores) goes from 100% to basically 0.


#13

Let me see…

I’m doing this in the .h, part of this as described in the juce docs

[code]void newOpenGLContextCreated();
void showOnDestop (const Rectangle &screenSpace);

void visibilityChanged();

bool renderAndSwapBuffers()   { return true; }   // Override to prevent juce drawing
void renderOpenGL()           { return; }        // Override juce

void render (SyncSourceTimeRecord inTime);       // Override Mesh Screen

#if JUCE_LINUX
    GLXContext          renderContext;
    Display*            threadDisplayConnection;
    Window              embeddedWindow;
#endif[/code]

The Linux stuff is because if rendering on a thread in Linux, you need to make your own display connection.

Then as an example, (my stuff is left in):

[code]void ScreenComponent::newOpenGLContextCreated()
{
    needsReshape = true;
}

void ScreenComponent::visibilityChanged()
{
    if (isVisible())
    {
        needsReshape = true;
        startThread (4);
    }
    else
    {
        signalThreadShouldExit(); //stopThread(2000);
    }
}

void ScreenComponent::mouseMove (const MouseEvent& e)
{
    if (e.x < 8 || e.x > getWidth() - 8)
        setMouseCursor (MouseCursor::NoCursor);
    else
        setMouseCursor (MouseCursor::NormalCursor);
}

void ScreenComponent::resized()
{
    stopThread (2000);

    setVisible (false); // will stop thread, clear OpenGL

    // Unlike a normal screen, we can dynamically resize
    width  = getWidth();
    height = getHeight();

    // now mark this is the total space
    screenSize.setBounds (0, 0, width, height);

    // now note that we can draw into the whole space
    activeArea.setWidth (width);
    activeArea.setHeight (height);

    needsReshape = true;

    setVisible (true);
}

void ScreenComponent::render(SyncSourceTimeRecord inTime)
{
ScopedLock l(getContextLock());

#if JUCE_LINUX
// on linux, we deal with the context ourselves, so threads are OK
if (threadDisplayConnection == 0)
{
    String displayName  = T(":0.0"); // force 0.0 for GUI!

    threadDisplayConnection   = XOpenDisplay (displayName);
}

if (renderContext == 0)
{
    needsReshape    = true;
    // get the top level X Window - we're going to tag onto it
    ComponentPeer* const peer = getTopLevelComponent()->getPeer();

    if (threadDisplayConnection && isShowing() && peer != 0)
    {
        XSync (threadDisplayConnection, False);

       GLint attribs[]		        = {	GLX_RGBA,
                                        GLX_DOUBLEBUFFER,
                                        GLX_RED_SIZE,			8,
                                        GLX_GREEN_SIZE,			8,
                                        GLX_BLUE_SIZE,			8,
                                        GLX_ALPHA_SIZE,			8,
                                        GLX_DEPTH_SIZE,         16,
                                        None };

        int numConfigs				= 0;
        int screen					= DefaultScreen (threadDisplayConnection);

        XVisualInfo* const bestVisual = glXChooseVisual (threadDisplayConnection, screen, attribs);

        if (bestVisual == 0)
            return;

        renderContext       = glXCreateContext (threadDisplayConnection, bestVisual,
                                                (getShareContext() != 0) ? (GLXContext)(getShareContext()->getRawContext()) : 0,
                                                GL_TRUE);

        Window windowH      = (Window) peer->getNativeHandle();

        Colormap colourMap  = XCreateColormap (threadDisplayConnection, windowH, bestVisual->visual, AllocNone);
        XSetWindowAttributes swa;
        swa.colormap        = colourMap;
        swa.border_pixel    = 0;
        swa.event_mask      = ExposureMask | StructureNotifyMask;

        embeddedWindow      = XCreateWindow (threadDisplayConnection, windowH,
                                        0, 0, 1, 1, 0,
                                        bestVisual->depth,
                                        InputOutput,
                                        bestVisual->visual,
                                        CWBorderPixel | CWColormap | CWEventMask,
                                        &swa);

        XSaveContext (threadDisplayConnection, (XID) embeddedWindow, improbableNumber, (XPointer) peer);

        XMapWindow (threadDisplayConnection, embeddedWindow);
        XFreeColormap (threadDisplayConnection, colourMap);

        XFree (bestVisual);
        XSync (threadDisplayConnection, False);

        if (renderContext != 0)
        {
            //updateContextPosition();
        }
    }
}

if (needsReshape && threadDisplayConnection && embeddedWindow)
{
    int x       = getX();
    int y       = getY();
    ComponentPeer* const peer = getTopLevelComponent()->getPeer();
    Component::relativePositionToOtherComponent (getTopLevelComponent(), x, y);
    XMoveResizeWindow (threadDisplayConnection, embeddedWindow, x, y, jmax (1, width), jmax (1, height));
}

if (renderContext != 0 && glXMakeCurrent (threadDisplayConnection, embeddedWindow, renderContext) && XSync (threadDisplayConnection, False))
#else
if (makeCurrentContextActive()) // will create one if needed
#endif
{
    if (needsReshape)
    {
        initOpenGL();
        needsReshape = false;
    }

    MeshScreen::render(inTime);

    #if JUCE_LINUX
        glXSwapBuffers (threadDisplayConnection, embeddedWindow);
    #else
        swapBuffers();
    #endif
}

}[/code]

You have to bear in mind you’re not on the main thread, as with most other juce drawing, but it does update as often as you get your thread to run.

Bruce
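
The general shape of the threaded approach described above can be sketched in plain C++. This is only an illustration under assumptions: `RenderLoop` is a hypothetical class standing in for a juce Thread subclass, the render function is a placeholder for making the context current, drawing and swapping buffers, and the fixed sleep stands in for whatever pacing the real code uses.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical render loop: calls a render function on its own thread
// until asked to stop, independently of any GUI repaint cycle.
class RenderLoop
{
public:
    RenderLoop (std::function<void()> renderFn, std::chrono::milliseconds interval)
        : render (std::move (renderFn)), frameInterval (interval) {}

    ~RenderLoop() { stop(); }

    void start()
    {
        if (worker.joinable())
            return;                         // already running

        shouldExit = false;
        worker = std::thread ([this] { run(); });
    }

    void stop()
    {
        shouldExit = true;

        if (worker.joinable())
            worker.join();                  // wait for the current frame to finish
    }

    // Held while a frame is drawn; other threads should take it before
    // touching state that the render function reads.
    std::mutex& contextLock()       { return lock; }

    int framesRendered() const      { return frames.load(); }

private:
    void run()
    {
        while (! shouldExit)
        {
            {
                std::lock_guard<std::mutex> guard (lock);
                render();                   // placeholder for the real drawing
                ++frames;
            }

            std::this_thread::sleep_for (frameInterval);
        }
    }

    std::function<void()> render;
    std::chrono::milliseconds frameInterval;
    std::mutex lock;
    std::atomic<bool> shouldExit { false };
    std::atomic<int> frames { 0 };
    std::thread worker;
};
```

In the terms of the post above, `start()` and `stop()` correspond to what visibilityChanged() and resized() do with the thread, and the lock plays the role of the context lock taken at the top of render().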


#14

Thanks Bruce, I’ll investigate threading then.


#15

Hey,
if some of you have similar problems with frequently updated drawings (fft analyzer or waveform display or something like that), here is the combination that worked best for me.

After testing many different approaches, the one with the best performance for this purpose was a non threaded combination of juce and the Agg2D library.

I use juce for the image rendering part and Agg2D for the path rendering part, drawing the Agg2D paths onto juce Images. The image rendering of Agg2D is quite slow compared to the juce one, but the path rendering is pretty fast (in comparison to juce) and it looks really nice. My tests with threaded versions yielded no better performance (in fact, every threaded approach performed a little worse than the non-threaded one).

I hope some of you may find this helpful.


#16

Good that you found a solution. Bear in mind no-one advised you to use a threaded solution with basic drawing, just with OpenGL, which is what your topic starts with.

So your advice would actually be that, if you can’t get OpenGL working the way you want, to stick with Juce drawing, maybe with supplementation from another method if there’s a specific case where it’s quicker.

Funnily enough, if you had started with ‘Juce Paths seem slow (compared with other libs)’ then you might have got here quicker.

Bruce


#17

You’re absolutely right, but this whole thing is a learning process for me. If I always knew at the beginning which way to go, it would save me a lot of trouble. But every time I try a lot of different things (even if some of them might drive me crazy), I learn things that may be good to know some day.

At least now I have a better overview when to use OpenGL, when to use juce and when to use something else.

Anyways, thanks to everybody who gave their advice and tried to help.