I'm using JUCE 4.1 (we just got the full commercial license), and a test product was sent to a few users.
Right away, the plugin scan that the app performs started crashing it.
The issue is intermittent and only randomly reproducible. Sometimes the scan goes without a hitch; more often, it crashes.
The scanning is done on the main thread, using basically the same code as the “PluginListComponent” class. “allowPluginsWhichRequireAsynchronousInstantiation” is left at its default of false, which should mean that everything runs on the main thread and only that one thread is used.
Any help would be appreciated. Thanks!
Here is an example stack trace:
Process: MyTestApp 
Version: 0.7.4b (0.7.4b)
Code Type: X86-64 (Native)
Parent Process: ??? 
Responsible: MyTestApp 
User ID: 501
Date/Time: 2016-09-08 13:09:47.307 -0400
OS Version: Mac OS X 10.10.5 (14F1909)
Report Version: 11
Anonymous UUID: 1196E8DC-5C84-5445-BDFF-E71037C7A3DD
Time Awake Since Boot: 1200000 seconds
Crashed Thread: 0 Juce Message Thread Dispatch queue: com.apple.main-thread
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Application Specific Information:
*** error for object 0x610001827720: pointer being freed was not allocated
Thread 0 Crashed:: Juce Message Thread Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x00007fff94c6c286 __pthread_kill + 10
1 libsystem_c.dylib 0x00007fff8ddbb9a3 abort + 129
2 libsystem_malloc.dylib 0x00007fff8da711cb free + 428
3 com.Arturia.Solina-V2.vst3 0x000000018288c21b sqlite3_db_readonly + 1760987
4 com.Arturia.Solina-V2.vst3 0x000000018288b3c4 sqlite3_db_readonly + 1757316
5 com.Arturia.Solina-V2.vst3 0x000000018288c25e sqlite3_db_readonly + 1761054
6 com.Arturia.Solina-V2.vst3 0x00000001828476f7 sqlite3_db_readonly + 1479607
7 com.Arturia.Solina-V2.vst3 0x0000000182607f3e VSTPluginMain + 734494
8 com.Arturia.Solina-V2.vst3 0x0000000182617a0e VSTPluginMain + 798702
9 com.arturia.component.Solina-V2 0x000000017c515f68 Steinberg::Vst::AUWrapper::~AUWrapper() + 536
10 com.arturia.component.Solina-V2 0x000000017c504344 ComponentBase::AP_Close(void*) + 36
11 com.apple.audio.toolbox.AudioToolbox 0x00007fff90c82bd5 APComponentInstance::DisposeInstance() + 21
12 com.deskew.MyTestApp 0x000000010cd7e066 0x10ccc3000 + 766054
13 com.deskew.MyTestApp 0x000000010cd7253e 0x10ccc3000 + 718142
14 com.deskew.MyTestApp 0x000000010cd6c067 0x10ccc3000 + 692327
15 com.deskew.MyTestApp 0x000000010cd6ceb8 0x10ccc3000 + 695992
16 com.deskew.MyTestApp 0x000000010cd6e098 0x10ccc3000 + 700568
17 com.deskew.MyTestApp 0x000000010cd3c921 0x10ccc3000 + 497953
18 com.deskew.MyTestApp 0x000000010cd3c6dd 0x10ccc3000 + 497373
19 com.deskew.MyTestApp 0x000000010cd3b366 0x10ccc3000 + 492390
20 com.deskew.MyTestApp 0x000000010cd4d462 0x10ccc3000 + 566370
21 com.deskew.MyTestApp 0x000000010cddc52a 0x10ccc3000 + 1152298
22 com.apple.CoreFoundation 0x00007fff91530a01 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
23 com.apple.CoreFoundation 0x00007fff91522b8d __CFRunLoopDoSources0 + 269
24 com.apple.CoreFoundation 0x00007fff915221bf __CFRunLoopRun + 927
25 com.apple.CoreFoundation 0x00007fff91521bd8 CFRunLoopRunSpecific + 296
26 com.apple.HIToolbox 0x00007fff9636a56f RunCurrentEventLoopInMode + 235
27 com.apple.HIToolbox 0x00007fff9636a2ea ReceiveNextEventCommon + 431
28 com.apple.HIToolbox 0x00007fff9636a12b _BlockUntilNextEventMatchingListInModeWithFilter + 71
29 com.apple.AppKit 0x00007fff8e08e8ab _DPSNextEvent + 978
30 com.apple.AppKit 0x00007fff8e08de58 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] + 346
31 com.apple.AppKit 0x00007fff8e083af3 -[NSApplication run] + 594
32 com.deskew.MyTestApp 0x000000010cdd8a3c 0x10ccc3000 + 1137212
33 com.deskew.MyTestApp 0x000000010cdd8963 0x10ccc3000 + 1136995
34 libdyld.dylib 0x00007fff8cd075c9 start + 1
One more clue … if I clear all the plugins and then scan for them - the thread count for the application goes above 100. Under normal operation when not many things are happening - it is hovering around 20.
So if I scan for plugins - the thread count goes above 100, then drops to around 95 and STAYS there.
If I then quit and don’t scan for plugins - the threads stay at around 20.
Seems like something is not being released properly. I assume these threads are coming from file scanning and plugin instantiation, but I also see that the plugins are being destroyed as they are locally scoped.
I’m at a loss…
This is a great issue to discuss. We always have hangs and crashes like the ones you describe, and they can also be reproduced in Tracktion, so it's good to see where this can get to.
The stack trace above is clearly a problem inside the plugin’s own code - not really much we can do if they’ve got sqlite calls that crash. We would need feedback from the plugin devs to figure that one out.
In Tracktion we use a child process to scan plugins, because so many of them do silly things like crash, or launch dozens of threads which stay running, etc. The plugin scanning classes do have support for hooking in your own child-process scanner, but it requires some work in your own app code to implement it.
Plugin developers … can’t be trusted
In my plugin we also have a separate scanner child process… but if it crashes while loading a ‘bad’ plugin, the user has to rescan, which will then skip the ‘bad’ plugin… How do you handle restarting the child process when it runs into a ‘bad’ plugin and there are still some left to scan?
We just put the ‘bad’ one on a blacklist, then kill and re-launch the child process
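A rough sketch of that blacklist-and-relaunch loop, in plain C++ with no JUCE types. Everything here is hypothetical: `scanAll`, `ScanOutcome`, `ScanReport` and the `scanInChild` callback are illustrative names, and the callback only stands in for "hand this plugin to the child process and wait for a result". A real host would launch and kill an actual scanner executable where this sketch merely counts launches.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Outcome of asking a (hypothetical) scanner child to scan one plugin.
enum class ScanOutcome { ok, childDied };

struct ScanReport
{
    std::vector<std::string> scanned;    // plugins that scanned cleanly
    std::set<std::string>    blacklist;  // plugins that took the child down
    int                      childLaunches = 0;
};

// Scans every plugin that is not already blacklisted. When the child dies,
// the offending plugin is blacklisted and a fresh child is "launched" for
// the remaining plugins, so one bad plugin does not end the whole scan.
inline ScanReport scanAll (const std::vector<std::string>& plugins,
                           std::set<std::string> blacklist,
                           const std::function<ScanOutcome (const std::string&)>& scanInChild)
{
    assert (static_cast<bool> (scanInChild));

    ScanReport report;
    report.blacklist = std::move (blacklist);
    bool childRunning = false;

    for (const auto& plugin : plugins)
    {
        if (report.blacklist.count (plugin) != 0)
            continue;                       // known-bad from an earlier run: skip it

        if (! childRunning)                 // (re)launch only when there is work left
        {
            ++report.childLaunches;
            childRunning = true;
        }

        if (scanInChild (plugin) == ScanOutcome::ok)
        {
            report.scanned.push_back (plugin);
        }
        else
        {
            report.blacklist.insert (plugin);  // never try this one again
            childRunning = false;              // child is gone; next iteration relaunches
        }
    }

    return report;
}
```

Persisting `blacklist` between runs (to a file, say) is what would let a later rescan skip the bad plugin without crashing on it again.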
Thanks - I guess I should ask: how does the parent know that the blacklist has changed and that it needs to kill and re-launch the child? And does it fail silently, so that it’s invisible to the end user?
I’ll have to go back to my implementation and clean it up.
The worst thing that happens when plugin scanning takes off is where some plugin decides to launch a license manager in a separate process and hang until you mess around with it …
Someone should have a word with those Fxpansion boys about that
Oh, I think we have the child process reporting its progress to the parent, and when it goes quiet for too long… zap.
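The "goes quiet for too long" check could look something like the sketch below. `ScanWatchdog` is a made-up name, not a JUCE class. It only keeps the time of the last progress report and leaves the actual killing to the caller; time points are passed in explicitly so the logic can be exercised without waiting.

```cpp
#include <cassert>
#include <chrono>

// Tracks how long the scanner child has been silent.
class ScanWatchdog
{
public:
    using Clock = std::chrono::steady_clock;

    ScanWatchdog (Clock::duration maxSilence, Clock::time_point now)
        : limit (maxSilence), lastProgress (now)
    {
        assert (maxSilence > Clock::duration::zero());
    }

    // Call whenever the child reports progress of any kind.
    void noteProgress (Clock::time_point now)       { lastProgress = now; }

    // Poll this from a timer in the parent; true means "zap the child".
    bool shouldKill (Clock::time_point now) const   { return now - lastProgress > limit; }

private:
    Clock::duration   limit;
    Clock::time_point lastProgress;
};
```

In a host, `noteProgress (ScanWatchdog::Clock::now())` would be called from whatever handler receives the child's messages, and `shouldKill` from a periodic timer. The limit has to be generous, since a plugin that pops up a license dialog can legitimately stay quiet for a long time.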
As far as the user is concerned - it’s the host that crashes. The stack trace I provided is just an example. It could crash in a completely different plugin at another point, or it may not crash at all.
There is something inherently wrong with the scanning process itself. These plugins never crash when loaded normally into our host or when we quit the host. It’s only during the scanning process.
Is there any way to catch these with try…catch ?
The idea about child process is something to think about - is there an example somewhere?
That’s exactly why we do it with a child process.
And no, it’s impossible to magically protect against a crash - a plugin can stamp all over your program’s memory and cause havoc, it’s impossible to detect or recover from that.
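To illustrate why a separate process helps where try…catch cannot, here is a small POSIX-only sketch (macOS/Linux; it does not apply to Windows). `runIsolated` and `ChildResult` are hypothetical names. The risky work runs in a forked child, and the parent merely observes how the child ended, so an abort like the one in the stack trace above takes down only the child. Forking a multi-threaded GUI app without exec'ing is not safe in general; a real host would launch a separate scanner executable instead, so treat this purely as a demonstration of the principle.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildResult { exitedCleanly, crashed, failed };

// Runs riskyWork in a forked child and reports how that child ended.
inline ChildResult runIsolated (const std::function<void()>& riskyWork)
{
    assert (static_cast<bool> (riskyWork));

    const pid_t pid = fork();

    if (pid < 0)
        return ChildResult::failed;        // could not create a child at all

    if (pid == 0)
    {
        riskyWork();                       // stands in for loading and destroying a plugin
        _exit (0);                         // leave without running the parent's exit handlers
    }

    int status = 0;

    if (waitpid (pid, &status, 0) != pid)
        return ChildResult::failed;

    if (WIFSIGNALED (status))
        return ChildResult::crashed;       // e.g. SIGABRT or SIGSEGV inside the child

    if (WIFEXITED (status) && WEXITSTATUS (status) == 0)
        return ChildResult::exitedCleanly;

    return ChildResult::failed;
}
```

Calling `runIsolated ([] { std::abort(); })` returns `ChildResult::crashed` while the calling process carries on, which is the property a scanner child is there to provide.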
Hi Jules, I get that plugins can do anything they want and could bring the application down and the notion of a child process is a great one.
What I don’t understand is why this seems to only happen when scanning (and for the same plugin, it only happens sometimes) and not during regular use. What is the scanning process doing that doesn’t happen during regular usage?
You’d need to get the plugin devs to debug it and ask them why it’s exploding!
Usually it’s stupid mistakes like something crashing when the plugin is destroyed without having opened its UI or played any audio through it.
Thanks Jules - makes sense actually - one can never be sure what the plugin will do, especially when no GUI was opened and no audio was played through it.
We went down the child process route, here are some tips:
We made a really small, separate app for this.
You can check if the process has died using a timer.
If you want to check for a “freeze”, you can set up inter-process communication using pipes, sort of like a “ping”. But you need a long timeout, as license popups can block the app for a pretty long time.
Also make sure you start the child process with flags 0.
Some plugins (Arturia) flood the output with debug messages and first-chance exceptions, making the scan super slow.
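The pipe-based "ping" from the tips above might be sketched like this on POSIX systems (on Windows the same idea would need different APIs). `waitForPing` is an illustrative name, and the single-byte ping is an assumption of this sketch rather than anything prescribed by JUCE. It waits on the read end of a pipe until the child writes a byte or the timeout expires.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Waits up to timeoutMs for the child to write at least one "ping" byte
// to the pipe whose read end is readFd. Returns false on timeout, on a
// poll error, or if the write end was closed without sending anything.
inline bool waitForPing (int readFd, int timeoutMs)
{
    assert (timeoutMs >= 0);

    pollfd pfd {};
    pfd.fd = readFd;
    pfd.events = POLLIN;

    if (poll (&pfd, 1, timeoutMs) <= 0)
        return false;                      // timed out or poll failed

    if ((pfd.revents & POLLIN) == 0)
        return false;                      // hang-up or error, no data to read

    char byte = 0;
    return read (readFd, &byte, 1) == 1;   // consume exactly one ping
}
```

The parent would call this repeatedly from its scan loop or timer with a long timeout, and treat a `false` result as a frozen or dead child.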
I’m trying to use the child process solution you described above for plugin scanning on Windows. When the plugin being scanned crashes, the master process sends the kill message to the slave, but the slave process is not terminated; it is still listed in the Task Manager. If the plugin does not crash, the child process is terminated properly.
The destructor of juce::ChildProcessMaster calls killSlaveProcess(). Should that be enough to terminate the child process, or is anything else needed?
Debugging a bit further, I found the following:
I simulate a crashing plugin by triggering an access violation in the source code of the plugin.
If this instruction which triggers the access violation is in the plugin dll which is being scanned, Windows cannot kill the process automatically.
However, if I put the same access violation instruction in the scanner child process, right before calling JUCE_VST_WRAPPER_INVOKE_MAIN in juce_VSTPluginFormat.cpp, the scanner child process will crash the same way, of course, but Windows can kill the process properly.
I’m trying to understand what the difference is between these two cases.