ChildProcess broken on MacOS?


#1

issue #1

  for (;;)
        {
            ChildProcess cp;
            cp.start("cat nonexistingfile");
            String output=cp.readAllProcessOutput();
            DBG("OUTPUT:"+output);
            DBG("EXITCODE:"+String(cp.getExitCode()));
            Thread::sleep(100);
        };

–> sometimes exitcode 0 (should be 1, because the file does not exist)

EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

EXITCODE:0
OUTPUT:cat: nonexistingfile: No such file or directory

EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

If i add a cp.waitForProcessToFinish(100); after start

–> exitcode is always 0

On a different project the SAME code, produce different outputs, for example i always didn’t get any output and exitCode is always 0

issue #2
if I use the single string constructor, and uses quotes on filenames, commands may think the quotes belong to the filename


ChildProcess ignoring arguments containing quotes on OS X
#2

Its seems the whole ChildProcess is not very stable for processes which are running very short, wether readAllProcessOutput() / getExitCode seems to be reliable.


#3

Odd I just ran your code and I’m not seeing the same exit codes as you, I always see an exit code of 1 but I am seeing that the output is occasionally empty…

EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

EXITCODE:1
OUTPUT:
EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

EXITCODE:1
OUTPUT:cat: nonexistingfile: No such file or directory

I also tried adding cp.waitForProcessToFinish (1000); before reading the output. This made the output and exit code consistent but unfortunately it consistently returned the wrong exit code (0), I suspect this has something to do with waitForProcessToFinish calling isRunning which in turn calls waitpid which is also how getExitCode determines the exit code?


#4

Thanks, the random behaviour confirms my observations.


#5

@jules


#6

I can’t reproduce this - I added an assertion to your code and left it running for a while but it never triggered.

I guess what’s happening is that the process is exiting before the read handle is even attached, so its output has already been thrown away at that point.

All I can think of to avoid that would be to change the place where it creates the read handle, e.g. by adding this code at around line 1180:

        else
        {
            // we're the parent process..
            childPID = result;
            pipeHandle = pipeHandles[0];

            if (childPID != 0)
                readHandle = fdopen (pipeHandle, "r");

            close (pipeHandles[1]); // close the write handle
        }

I have no way of testing that, because like I said, my machine is working fine, but maybe you could give it a shot and see if it makes any difference?


#7

no doesn’t seem to change anything.
Did you see also the issue if you put cp.waitForProcessToFinish (100); before reading the output, the exitcode is always 0?

Because the behaviour may influenced by cpu-speed, here is some better testing code, which you can run for a while. It will stop when there is unexpected output.
I’m running it on a dual-core MacBook Pro from 2013, and get the issue quit often in debug build.

for (;;)
        {
            ChildProcess cp;
            cp.start("cat nonexistingfile");
    
            //optional
            //cp.waitForProcessToFinish(100);
            
            String output=cp.readAllProcessOutput();
            auto exitCode=cp.getExitCode();
            DBG("OUTPUT:"+output);
            DBG("EXITCODE:"+String(exitCode));
            jassert(!output.isEmpty());
            jassert(exitCode==1);
            Thread::sleep(10);
        };

PS: its weird, if you put a sleep behind cp.start(), this situation is better, but then, sometimes the output-string is empty


#8

I guess there are currently two problems!

the first, exitCode 0 even if the process should return 1

—> this happens when waitpid returns 0, which does NOT mean the process finished, only (if i read the documentation correctly) that waitpid can’t evaluate the current state.
I fixed this by calling waitpid until i get a non zero return value (of cause there must some kind of iteration limit in productive code)

in getExitCode():

                // just for evaluation purposes
                int pid;
                do
                {
                    pid = waitpid (childPID, &childState, WNOHANG);
                } while (pid==0);

The second, sometimes the process output is empty.

I guess this will happen if fread() returns 0, but the processing is still running, speculation
-> maybe check if the process is currently running via the new method, before exiting readAllProcessOutput


#9

fix for #1, in rare cases it uses 1-3 iterations to get the current child state, problem #2 (no process output) remains, also there is #3 after checking isRunning() exit codes is always 0

  int getChildState() const
    {
        int iterations=0;
        int childState = 0;
        int pid=0;
        
        pid = waitpid (childPID, &childState, WNOHANG);
        while (pid==0 && iterations<1000)
        {
            Thread::yield();
            iterations++;
            pid = waitpid (childPID, &childState, WNOHANG);
        };
        
        DBG("maxI:"+String(iterations));
        
        // Couldn't get child state
        jassert(pid!=0);
       
        return childState;
    }

    uint32 getExitCode() const noexcept
    {
        if (childPID != 0)
        {
            int childState = getChildState();
          
            if (WIFEXITED (childState))
                return WEXITSTATUS (childState);
        }

        return 0;
    }

#10

it seems waitpid can only collect the exit-status once.

I’m stop investigating right now, fact is, two people have independently reported different malfunctions.

This class needs some modifications, its just not reliable at the moment.


#11

I can also confirm I’ve seen oddities with process that exit very quickly.

When I was using ChildProcess to read the macOS pkgutil I was often getting empty outputs.
I reverted to using std::system and reading the results from a file which isn’t obviously optimal but worked for my uses.


#12

OK I think this fixes all the issues for me. However IMO getExitCode() should also block until the process is finished, but I didn’t want to change the general behaviour of the existing code for now.

diff --git a/modules/juce_core/native/juce_posix_SharedCode.h b/modules/juce_core/native/juce_posix_SharedCode.h
index 7d4b91a..30284a0 100644
--- a/modules/juce_core/native/juce_posix_SharedCode.h
+++ b/modules/juce_core/native/juce_posix_SharedCode.h
@@ -1123,11 +1123,12 @@ static inline String readPosixConfigFileValue (const char* file, const char* con
 
 
 //==============================================================================
-class ChildProcess::ActiveProcess
+    class ChildProcess::ActiveProcess : private Thread
 {
 public:
     ActiveProcess (const StringArray& arguments, int streamFlags)
-        : childPID (0), pipeHandle (0), readHandle (0)
+        : Thread (arguments[0].unquoted())
+        , childPID (0), pipeHandle (0), readHandle (0)
     {
         String exe (arguments[0].unquoted());
 
@@ -1181,26 +1182,37 @@ public:
                 pipeHandle = pipeHandles[0];
                 close (pipeHandles[1]); // close the write handle
             }
-        }
+        }
+        
+        startThread();
     }
 
     ~ActiveProcess()
-    {
+    {
+        stopThread (1000);
+        
         if (readHandle != 0)
             fclose (readHandle);
 
         if (pipeHandle != 0)
             close (pipeHandle);
+    }
+    
+    void run() override
+    {
+        while (pid == 0 && ! threadShouldExit())
+        {
+            pid = waitpid (childPID, &childState, WNOHANG);
+            
+            if (pid == 0)
+                Thread::yield();
+        }
     }
 
     bool isRunning() const noexcept
     {
         if (childPID != 0)
-        {
-            int childState;
-            const int pid = waitpid (childPID, &childState, WNOHANG);
             return pid == 0 || ! (WIFEXITED (childState) || WIFSIGNALED (childState));
-        }
 
         return false;
     }
@@ -1213,8 +1225,8 @@ public:
          #error // the zlib headers define this function as NULL!
         #endif
 
-        if (readHandle == 0 && childPID != 0)
-            readHandle = fdopen (pipeHandle, "r");
+        if (readHandle == 0 && childPID != 0)
+            do { readHandle = fdopen (pipeHandle, "r"); } while (readHandle == 0);
 
         if (readHandle != 0)
             return (int) fread (dest, 1, (size_t) numBytes, readHandle);
@@ -1227,23 +1239,27 @@ public:
         return ::kill (childPID, SIGKILL) == 0;
     }
 
-    uint32 getExitCode() const noexcept
-    {
-        if (childPID != 0)
-        {
-            int childState = 0;
-            const int pid = waitpid (childPID, &childState, WNOHANG);
-
-            if (pid >= 0 && WIFEXITED (childState))
-                return WEXITSTATUS (childState);
-        }
-
-        return 0;
+    uint32 getExitCode() const noexcept
+    {
+        if (childPID != 0)
+        {
+            if (WIFEXITED (childState))
+                return WEXITSTATUS (childState);
+        }
+        
+        return 0;
     }
 
     int childPID;
 
-private:
+private:
+    int pid = 0;
+    int childState = 0;
     int pipeHandle;
     FILE* readHandle;

Basically what’s happening above is ActiveProcess is inheriting from Thread, in it’s run() function it is running a loop to determine the returned pid and childState values and store them in member variables. Then everywhere that previously had a local pid or childState value was replaced to use the member variables.

The other fix applied was in read(), I changed…

if (readHandle == 0 && childPID != 0)
    readHandle = fdopen (pipeHandle, "r");

to this…

if (readHandle == 0 && childPID != 0)
    do { readHandle = fdopen (pipeHandle, "r"); } while (readHandle == 0);

This is what stops the output being empty.


#13

Scrap that I still managed to get some empty output! :disappointed_relieved:


#14

I’m feeling with you, i’ve done a similar approach with the same problem.
Now I’m using just std::system.


#15

Exit codes can be retrieved by registering signal handler for SIGCHLD with sigaction(). Please see man page for sigaction for details. JUCE code doesn’t use signal handlers. Note that signal handlers have C linkage.

Here is minimal example:

#include "../JuceLibraryCode/JuceHeader.h"

extern "C" {
    static void signalHandler(int sig, siginfo_t *si, void *context)
    {
        if (si->si_code == CLD_EXITED || si->si_code == CLD_KILLED) {
            DBG("SIGNALCODE:"+String(si->si_status));
        }
    }
}

//==============================================================================
int main (int argc, char* argv[])
{
    struct sigaction sigact;

    sigact.sa_flags = SA_SIGINFO;
    sigact.sa_sigaction = signalHandler;

    sigaction(SIGCHLD, &sigact, NULL);

    for (int i = 0; i < 100; i++) {
        ChildProcess cp;
        cp.start("cat nonexistingfile");
        String output=cp.readAllProcessOutput();
        DBG("OUTPUT:"+output);
        DBG("EXITCODE:"+String(cp.getExitCode()));
        Thread::sleep(100);
    };
    return 0;
}

Sample output:

$ SigChld &> run.log
$ grep EXITCODE:0 run.log | wc -l
37
$ grep EXITCODE:1 run.log | wc -l
63
$ grep SIGNALCODE:1 run.log | wc -l
100