Runtime disaster!


#1

Hi all!

Recently I installed the new openSUSE 10.3 on my computer. I installed the latest Code::Blocks IDE and tried to recompile my new JUCE application to check that it runs under Linux. You can imagine how surprised I was when my application froze after about 5 seconds. I tried a Debug build: no luck. Then I tried running my old JUCE applications (compiled under openSUSE 10.2 some time ago) and history repeated itself. Any JUCE application (old or new) freezes after about 5 seconds. No CPU overload, no memory consumption; it just stops responding to mouse clicks and key presses and stops repainting itself. The GNOME process explorer shows that any JUCE application uses 0% of CPU time and is waiting for input events. No chance to debug, it just freezes. No non-JUCE application has behaved like this so far. I’m lost.

Can anyone explain what’s happening?


#2

Utterly bizarre… I can’t really think of anything to try except getting it in the debugger and having a look at what it’s up to.

It’s probably something small, but there’s nothing I can think of that’d cause anything as weird as this…


#3

[quote=“jules”]Utterly bizarre… I can’t really think of anything to try except getting it in the debugger and having a look at what it’s up to.

It’s probably something small, but there’s nothing I can think of that’d cause anything as weird as this…[/quote]

No chance to debug. Do you remember my YPuzzle game? The cells just freeze in the middle of the animation while they’re being shuffled. It all looks like a snapshot.


#4

Surely you can launch it in the debugger and see where it is when it stops?


#5

Launch it in gdb and when it halts, use the “bt” command to print a backtrace. If the debugger doesn’t halt by itself (e.g. if you have an infinite loop), press Ctrl-C to force a break (there’s also an “interrupt” command, if I recall correctly).


#6

Here is a screenshot of the call stack from KDbg.

I noticed something strange too. When I first tried to debug, the debugger exited immediately with status ‘0’. I set a breakpoint at the beginning and found that if I request only a single instance of the application, JUCE checks for that by trying to lock a mutex, and it can’t. So JUCE thinks another instance is running and exits, but it’s wrong. I had to allow more than one instance in order to debug.

I’m not sure that’s the whole stack trace, but it’s all I could get.

P.S. I suspect the problem is in the sync objects and pthreads. Maybe something new has been added to the kernel and libraries that we don’t know about and therefore can’t handle.
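
For reference, the single-instance check described above can be sketched roughly like this. It is only a sketch under my own assumptions, not JUCE’s actual code: `SingleInstanceLock`, `InstanceLockResult` and the lock-file path are made-up names, and it uses an advisory `flock()` on a file rather than whatever JUCE does internally. The point is to tell “another instance holds the lock” apart from “the lock could not be taken for some other reason”, so the app doesn’t silently exit with status 0 in the second case.

```cpp
#include <cerrno>       // errno, EWOULDBLOCK
#include <fcntl.h>      // open, O_RDWR, O_CREAT
#include <string>
#include <sys/file.h>   // flock, LOCK_EX, LOCK_NB
#include <unistd.h>     // close

// Hypothetical names for illustration; this is not JUCE's implementation.
enum class InstanceLockResult { acquired, heldByAnotherInstance, error };

class SingleInstanceLock
{
public:
    explicit SingleInstanceLock (const std::string& lockFilePath)
    {
        fd = ::open (lockFilePath.c_str(), O_RDWR | O_CREAT, 0600);

        if (fd < 0)
        {
            state = InstanceLockResult::error;   // could not even open the file
            return;
        }

        // Non-blocking exclusive lock: fails at once if someone else holds it.
        if (::flock (fd, LOCK_EX | LOCK_NB) == 0)
        {
            state = InstanceLockResult::acquired;
            return;
        }

        // Read errno before close() gets a chance to overwrite it.
        state = (errno == EWOULDBLOCK) ? InstanceLockResult::heldByAnotherInstance
                                       : InstanceLockResult::error;
        ::close (fd);
        fd = -1;
    }

    ~SingleInstanceLock()
    {
        if (fd >= 0)
            ::close (fd);   // closing the descriptor releases the flock
    }

    SingleInstanceLock (const SingleInstanceLock&) = delete;
    SingleInstanceLock& operator= (const SingleInstanceLock&) = delete;

    InstanceLockResult result() const { return state; }

private:
    int fd = -1;
    InstanceLockResult state = InstanceLockResult::error;
};
```

An app would then quit only on `heldByAnotherInstance` and log (or ignore) `error`, instead of treating every failure as “already running”.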


#7

Well yes, it looks like X windows is deadlocking with something… Next thing to do is to look at the stack traces of the other threads when it gets into this state, and figure out what’s locking with it.


#8

xcb is well known for breaking stuff.
To avoid the various problems caused by xcb, you should build xcb and libxcb with the -DNDEBUG flag, or at least rebuild libX11 WITHOUT xcb support.

Try this and see if it is the root of all evil.
I’ve previously had problems with it and some of my precompiled binaries.


#9

[quote=“kraken”]xcb is well known for breaking stuff.
To avoid the various problems caused by xcb, you should build xcb and libxcb with the -DNDEBUG flag, or at least rebuild libX11 WITHOUT xcb support.

Try this and see if it is the root of all evil.
I’ve previously had problems with it and some of my precompiled binaries.[/quote]

kraken, that’s not the point. Suppose you’re right; I still cannot ask people to recompile their X11 system shared library just to be able to run my JUCE application.


#10

Mmmmh, I uninstalled xcb a while ago, mainly because of this problem (I think it is unresolved even in the latest xcb version), which happened a lot of the time when running JUCE apps. But the fact is that JUCE is not the problem… xcb is! It is affecting other X11 applications as well…

http://lists.freedesktop.org/archives/xcb/2007-August/002961.html

Package maintainers who handle official repositories or official distro binary builds should check things before distributing… or you should quietly tell your customers to install a stable system instead of a newer, beta, uber-unstable one…


#11

Ah, just a little side note: xcb is now a hard dependency of compiz-fusion >= 0.6. Using libX11 compiled with xcb support will make compiz compile and run, but will break xine, Java and a lot of other applications (one above all: JUCE).

So basically… you can’t ask for a stable system if you want a newer 3D desktop build.

Be patient and wait (or rolling back a little here will help your hair stay black).

:smiley:


#12

[quote=“kraken”]Ah, just a little side note: xcb is now a hard dependency of compiz-fusion >= 0.6. Using libX11 compiled with xcb support will make compiz compile and run, but will break xine, Java and a lot of other applications (one above all: JUCE).

So basically… you can’t ask for a stable system if you want a newer 3D desktop build.

Be patient and wait (or rolling back a little here will help your hair stay black).

:D[/quote]

Well, thanks for the link. It looks like the very same problem.
Since the bug was discovered in August this year, I’m sure there will be an update for it soon. And thank you again, kraken, for your help.


#13

I have Fedora 8 and this problem is affecting me… Any JUCE application I run, even a simple window, runs into this thread deadlock within 5 seconds.

I saw this post over in the Jucetice forums

http://www.anticore.org/jucetice/forums/viewtopic.php?id=30

which seems to describe a similar problem, and explains how to use the environment variable:

export LIBXCB_ALLOW_SLOPPY_LOCK=1

However, when I tried it, it didn’t seem to have any effect, or maybe I did something wrong. Does anyone know if there is a workaround yet? Perhaps a package I need to update? I’m quite unfamiliar with Linux, and I’m just trying to get my application to run on it for completeness’ sake.


#14

Be sure you have libxcb >= 1.1; with an older version the env variable just doesn’t work.

This is very annoying anyway. I’ve had a lot of users reporting deadlocks in jost, and the only thing I can tell them is to upgrade to xcb 1.1 or recompile the xorg server without it.

I have to consider whether it’s worth writing new Linux windowing code based on xcb. Certainly one thing to look into!
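
Since the variable has to be in the environment before the app opens its X connection, one option is to set it from inside the program instead of asking every user to export it. A minimal sketch, assuming that libxcb reads `LIBXCB_ALLOW_SLOPPY_LOCK` when the display connection is set up (I have not verified this) and that the installed libxcb is at least 1.1; `enableSloppyXcbLock` is a made-up helper name:

```cpp
#include <stdlib.h>   // setenv (POSIX, not ISO C++)

// Hypothetical helper: call it at the very top of main(), before any Xlib
// call, so the variable is already set when the display is opened.
// Returns true if setenv() succeeded.
inline bool enableSloppyXcbLock()
{
    // Last argument 0 = do not overwrite a value the user already exported.
    return ::setenv ("LIBXCB_ALLOW_SLOPPY_LOCK", "1", 0) == 0;
}
```

This only hides the symptom, of course; it does nothing about whatever locking mistake triggers the check in the first place.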


#15

Ah, Fedora 8 has not yet provided a libxcb 1.1 package update, hence my predicament. Hopefully they will do so soon. I tried downloading it from freedesktop.org, but I ran into some dependency problems when I tried to run configure. Thanks for the information though.


#16

The root of all evil is the call to XInitThreads, combined with the lack of XLockDisplay / XUnlockDisplay calls in the windowing and messaging implementation. I think the xcb-x11 transport layer is hanging on some mutex, maybe because we don’t handle the display lock correctly.
Qt, for example, doesn’t need that call (well, it doesn’t like it) because it uses an internal mutex to protect the display against simultaneous access from multiple threads.

Jules, which threads are making X calls? I’ve tried keeping XInitThreads out of the initialization code, but then X complains about async messages (obviously).
Is it possible to write a single-threaded version of the windowing code? Or which X calls do you think we should protect with lock/unlock because they may be issued on other threads (and the message thread is “another” thread, not the one that created the display)?
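
To make the lock/unlock idea concrete, here is a rough sketch of a scoped guard that brackets a group of display calls. It is an illustration only, not JUCE code: `ScopedDisplayLock` is a made-up name, and the lock and unlock functions are passed in so that the snippet compiles without the X11 headers. With Xlib I would expect to pass `XLockDisplay` and `XUnlockDisplay`, which only have an effect if `XInitThreads()` was called successfully before any other Xlib call.

```cpp
// Hypothetical RAII guard; with Xlib the intended use would be roughly
//
//     ScopedDisplayLock<Display> guard (display, XLockDisplay, XUnlockDisplay);
//     XFlush (display);   // plus any other X calls made off the message thread
//
// (both Xlib functions have the signature void (Display*)).
template <typename DisplayType>
class ScopedDisplayLock
{
public:
    using LockFunction = void (*) (DisplayType*);

    ScopedDisplayLock (DisplayType* displayToLock,
                       LockFunction lockFn,
                       LockFunction unlockFn)
        : display (displayToLock), unlock (unlockFn)
    {
        lockFn (display);    // taken once, on entry to the scope
    }

    ~ScopedDisplayLock()
    {
        unlock (display);    // always released, even on early return or exception
    }

    // Copying would lead to a double unlock, so forbid it.
    ScopedDisplayLock (const ScopedDisplayLock&) = delete;
    ScopedDisplayLock& operator= (const ScopedDisplayLock&) = delete;

private:
    DisplayType* display;
    LockFunction unlock;
};
```

Because the unlock lives in the destructor, every lock is paired with exactly one unlock, which is the kind of mismatch I suspect the transport layer is tripping over.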


#17

Jules, any word on this? A lot of people are complaining that they cannot run JUCE applications properly on a lot of distros. We should get this fixed.


#18

The only way this could cause a problem would be when an app does some UI work on its own thread and uses MessageManagerLock; otherwise, all the calls will happen on the message thread. I guess adding some X locking to the MessageManagerLock might help, though TBH my knowledge of thread-safety in X is pretty minimal!


#19

Hi guys,

Sorry for reviving this old thread, but I’m running into this issue.
I’ve just ported my JUCE app (Octane Render).

The Linux port went OK and the app runs fine, but after a few seconds it freezes.
I’ve tried to use gdb to see what it is but can’t find anything (I’m not very good with Linux dev tools and gdb).
The app does print this on stderr after you start it:

libxcb: WARNING! Program tries to unlock a connection without having acquired
a lock first, which indicates a programming error.
There will be no further warnings about this issue.

I thought it could be an issue with my distro (openSUSE 11.0 64-bit),
so I posted the binary on our beta tester forums, and all the Linux users (whatever distro they use) report the same issue.

I’m completely stuck and losing a lot of time.

Did anyone find a solution to this issue ?
I need to have my Linux port working by the end of the day, otherwise I’ll be in deep sh*t :frowning:

Radiance


#20

The latest tip has some changes in the messaging thread that should fix this problem. Some bad xcb errors may still appear from time to time, but you will be able to start your application.

If needed, try running

before starting your application and see if anything changes. BTW, which version of libxcb are you using on your system?