Android crashes: I think I understand the problem


#1

I (sometimes, not always) get crashes on Android.  I think I have finally understood the root cause.

In juce_android_Windowing.cpp there is this code:


        view = GlobalRef (android.activity.callObjectMethod (JuceAppActivity.createNewView,
                                                             component.isOpaque(), (jlong) this));

The 'this' is a pointer, and it gets cast to a (64-bit) jlong.  This is a Bad Thing.  For example, when I see a crash, 'this' is actually 1f2d10, but by the time the Java side gets it, the 'host' value in createNewView() is 1f2d104b62bb38.

The real value is clearly there in the leading digits, but eight extra hex digits (4b62bb38) have been tacked onto the end of it.
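
To make that concrete, here is a small sketch that splits the received value into its two 32-bit halves. The names (Halves, splitJlong, receivedHost, realPointer) are mine and not anything in JUCE; the two constants are the values from the crash above.

```cpp
#include <cstdint>

// Upper and lower 32-bit halves of a 64-bit value.
struct Halves
{
    std::uint32_t high;
    std::uint32_t low;
};

// Hypothetical helper (not part of JUCE): split a jlong-sized value in two.
constexpr Halves splitJlong (std::int64_t value)
{
    return { static_cast<std::uint32_t> (static_cast<std::uint64_t> (value) >> 32),
             static_cast<std::uint32_t> (static_cast<std::uint64_t> (value) & 0xffffffffu) };
}

// Values observed in the crash described in this post.
constexpr std::int64_t  receivedHost = 0x1f2d104b62bb38LL;
constexpr std::uint32_t realPointer  = 0x1f2d10u;

// The real pointer is sitting in the upper half...
static_assert (splitJlong (receivedHost).high == realPointer, "pointer is in the upper 32 bits");

// ...and the lower half holds something else entirely.
static_assert (splitJlong (receivedHost).low == 0x4b62bb38u, "lower 32 bits are unrelated data");
```

So the pointer appears to have landed in the upper 32 bits of the long, with unrelated data in the lower 32 bits. That only describes the symptom; it doesn't say where the two halves got out of step.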

What I would suggest is using an integer handle value to uniquely identify the items on each side, and looking them up in the glue code.  That way there's no problem with casting, byte ordering, etc.
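
Roughly what I have in mind is something like the sketch below. HandleRegistry and its methods are names I've made up for illustration; nothing like this exists in JUCE as far as I know, and a real version would need to fit in with how the peers are created and destroyed.

```cpp
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical helper, not part of JUCE: hands out small integer handles
// for native objects, so only a 32-bit int has to cross the JNI boundary.
class HandleRegistry
{
public:
    // Registers a pointer and returns a non-zero handle for it.
    // (Wrap-around of the counter is not handled in this sketch.)
    std::int32_t add (void* object)
    {
        std::lock_guard<std::mutex> lock (mutex);
        const std::int32_t handle = nextHandle++;
        objects[handle] = object;
        return handle;
    }

    // Returns the pointer for a handle, or nullptr if the handle is
    // unknown or has already been removed.
    void* find (std::int32_t handle) const
    {
        std::lock_guard<std::mutex> lock (mutex);
        const auto it = objects.find (handle);
        return it != objects.end() ? it->second : nullptr;
    }

    // Forgets a handle, e.g. when the native object is destroyed.
    void remove (std::int32_t handle)
    {
        std::lock_guard<std::mutex> lock (mutex);
        objects.erase (handle);
    }

private:
    mutable std::mutex mutex;
    std::unordered_map<std::int32_t, void*> objects;
    std::int32_t nextHandle = 1;
};
```

The native side would pass the result of add() to Java in place of the pointer, and call find() on whatever Java hands back. A side effect is that a stale handle gives you a nullptr you can check for, where a stale pointer would just crash.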


#2

I don't think there's anything wrong with the general idea... There was a bug with message-passing of pointer->int64 stuff which I fixed last week - are you up to date with the latest version?


#3

Yes, I have the latest code.  I actually updated the JUCE code yesterday, and then tried to get my Android stuff working.

It was working more often before; now it usually doesn't work, so I think the solution you came up with may need reworking.  Sorry ...


#4

I think the solution I was talking about is probably unrelated to this, though the symptoms look similar.

Just one thing to try.. Does it make a difference if you add an explicit cast to jboolean?


        view = GlobalRef (android.activity.callObjectMethod (JuceAppActivity.createNewView,
                                                             (jboolean) component.isOpaque(),
                                                             (jlong) this));

..it could be that the code that performs the call is getting the alignment wrong because sizeof (jboolean) != sizeof (bool).
 


#5

Nice try, but unfortunately it still doesn't help.   

One thought: I'm using the NDK's compiler gcc-4.6.  What compiler are you using?  It shouldn't make any difference, but I've recently seen that gcc (4.8, on Linux anyway) has an egregious bug...


#6

Seeing your other thread about GCC misaligning the stack, that does sound like a likely explanation for this. Clearly the stack area containing the function arguments is mangled, so that the JNI code receiving the values is using a different memory layout from the one in which they were pushed. I've no idea how or why that could happen, but assuming that the correct parameters are being passed, it does seem like a compiler-level problem.

I don't have time (or a handy Android device) to try this myself today, but something to try would be to replace that boolean parameter (in the call, in the declaration of the JNI method, and in the Java code) with something else, e.g. another jlong, or perhaps replace both parameters with two ints or something, just to see if that makes a difference.


#7

Hi again;

Yes, I just upgraded my NDK and switched to use clang-3.5, which has its own set of issues (like no "__thread" variables).

However, I encountered the same issue, so I'll try the int, int version and see if that makes a difference.  I'll let you know...


#8

It does seem to help with that issue.  However, there is most definitely still a problem passing values back and forth.  For example, I now get: "deliver message: 7534052083830358016".
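
That number is easier to read in hex. A tiny helper for converting it (toHex64 is just a name I made up for this post):

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a jlong-sized value as 16 lower-case hex digits.
std::string toHex64 (std::int64_t value)
{
    char buffer[17]; // 16 digits plus the terminating null
    std::snprintf (buffer, sizeof (buffer), "%016llx",
                   static_cast<unsigned long long> (value));
    return buffer;
}
```

toHex64 (7534052083830358016LL) gives "688e547400000000": the lower 32 bits are all zero and the upper 32 bits are 688e5474. That would fit with a 32-bit value ending up in the upper half of the long, though I haven't confirmed that 688e5474 is the pointer that was actually sent.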

 


#9

Besides that, I had to make postMessage take a jint instead of a jlong as well; now things seem to work again.


#10

Not sure that'd be a usable fix, because on a 64-bit system a jint wouldn't be big enough to hold a pointer..
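
To illustrate what I mean, here's a sketch of what a 32-bit round trip does to an address. roundTripThroughInt32 is a made-up stand-in for "pass it through a jint and back", and the 64-bit address is invented for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for sending an address through a 32-bit parameter:
// only the low 32 bits survive, and they are then widened back to 64 bits.
constexpr std::uint64_t roundTripThroughInt32 (std::uint64_t address)
{
    return static_cast<std::uint32_t> (address);
}

// A 32-bit address (the one from the first post) survives unchanged.
static_assert (roundTripThroughInt32 (0x1f2d10u) == 0x1f2d10u, "32-bit address is preserved");

// An invented 64-bit address loses its upper bits.
static_assert (roundTripThroughInt32 (0x7f12345678ULL) == 0x12345678ULL, "upper bits are lost");
```

So it would work on today's 32-bit devices, but it would silently hand back the wrong pointer as soon as an address didn't fit in 32 bits.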


#11

Perhaps so, but the current code doesn't work properly on 32-bit Android devices, which make up by far the lion's share of devices.

I reiterate my suggestion to use a mapping from object * to int and back again, to get the items across the JNI boundary without having problems.


#12

Understood, but the thing that worries me is why it's failing to pass a basic jlong parameter via JNI without mangling it! I really don't like "fixing" bugs by making a compromise without understanding why it was failing in the first place..