StreamingSocket createListener() fails silently (linux)

My application involves a “master” juce app that connects to multiple “slave” juce apps via StreamingSocket. Each slave creates a listener at startup:

StreamingSocket* listenerSocket = new StreamingSocket();
bool createdOK = listenerSocket->createListener(port);

while(!createdOK)
{
    // try again
    createdOK = listenerSocket->createListener(port);
}

StreamingSocket* receiverSocket = listenerSocket->waitForNextConnection();

Now, most of the time this is fine, but sometimes the listenerSocket will not actually open a socket (corresponding port is closed when I check with nmap) and in this case the receiverSocket will never get a connection; however, createdOK will still always be true. As I see it, createdOK should be false if the listenerSocket cannot be opened. Am I missing something? Is there a different way I can check for the status of the socket?

EDIT: Using Ubuntu 10.04 and Juce 1.51 (I know it’s not the latest, but an upgrade is not an option at this point).

Sorry, but there’s no point in me spending time investigating bugs in old code, when they could have been fixed years ago.

You could diff your code against the latest version and see if any changes look relevant. Or reproduce it in a test-case with the latest code and I’ll look at that.

[quote=“jules”]
You could diff your code against the latest version and see if any changes look relevant.[/quote]

I did that, there are basically no changes to the createListener method, or the constructor (except for getting rid of “zerostruct” in the latest version).

Ok… Just had a quick look, but can’t see anything in there that looks wrong.

It’ll only return true if bind() and listen() complete successfully - if it’s failing for you, can you point me at the place where the failure gets ignored?

FWIW, off the top of my head I know that the helper connect method doesn’t fail correctly. I think it just looks for timeout, but doesn’t handle any other cases. I can’t remember if I touched anything with listeners.

Edit: I actually do have a fix for this, though it does seem that the file was touched several times in the not too distant past. I don’t have tags in my cloned repo for some reason, so I don’t know exactly which commit to check out to confirm whether the fixes are applicable to 1.51. I’ll try to take a closer look after work tonight.

Well, I finally was able to pull V1.51. First question: are you sure it is on the bind side and not the connect side?

Notice this in the old connectSocket:

    const int result = ::connect (handle, (struct sockaddr*) &servTmpAddr, sizeof (struct sockaddr_in));

    if (result < 0)
    {
#if JUCE_WINDOWS
        if (result == SOCKET_ERROR && WSAGetLastError() == WSAEWOULDBLOCK)
#else
        if (errno == EINPROGRESS)
#endif
        {
            if (waitForReadiness (handle, false, timeOutMillisecs) != 1)
            {
                setSocketBlockingState (handle, true);
                return false;
            }
        }
    }

    setSocketBlockingState (handle, true);
    resetSocketOptions (handle, false, false);

    return true;

If there is a connection error other than a timeout, it falls through to returning true. The same bug is actually in the new SocketHelpers::connectSocket call, which is why I recalled it.
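To illustrate the fall-through described above, here is a minimal sketch of a connect that returns false on any failure, not just a timeout. It uses plain POSIX sockets rather than JUCE; the helper name connectWithTimeout and the loopback-only address are assumptions for illustration, and this is not the actual JUCE fix.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not JUCE code): connects `handle` to 127.0.0.1:port.
// Returns false on a timeout AND on any other connection error.
bool connectWithTimeout (int handle, uint16_t port, int timeoutMs)
{
    sockaddr_in addr {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons (port);
    addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);

    // Switch to non-blocking so connect() cannot hang past the timeout.
    const int flags = fcntl (handle, F_GETFL, 0);
    fcntl (handle, F_SETFL, flags | O_NONBLOCK);

    bool ok = true;

    if (::connect (handle, (sockaddr*) &addr, sizeof (addr)) < 0)
    {
        if (errno != EINPROGRESS)
        {
            ok = false;   // immediate failure, e.g. ECONNREFUSED
        }
        else
        {
            fd_set writeSet;
            FD_ZERO (&writeSet);
            FD_SET (handle, &writeSet);

            timeval tv;
            tv.tv_sec = timeoutMs / 1000;
            tv.tv_usec = (timeoutMs % 1000) * 1000;

            if (select (handle + 1, nullptr, &writeSet, nullptr, &tv) != 1)
            {
                ok = false;   // timed out, or select itself failed
            }
            else
            {
                // Writable does not mean connected: ask for the real outcome.
                int err = 0;
                socklen_t len = sizeof (err);

                if (getsockopt (handle, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
                    ok = false;   // the connect attempt completed with an error
            }
        }
    }

    fcntl (handle, F_SETFL, flags);   // restore the original blocking state
    return ok;
}
```

The two differences from the snippet quoted above are the explicit branch for errno values other than EINPROGRESS and the SO_ERROR check after the wait.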

The createListener code in 1.51 still looks like a pretty straight shot to me. If you get a failure, the internal member variables might be in an odd state (e.g. isListener will be set to true, even though you failed). But if you get a successful return, it sure looks like you should have a socket handle and it should be bound to a port.

It’s probably worth noting that nmap won’t necessarily catch a bind to INADDR_ANY until it is in use. The underlying socket isn’t necessarily bound at this point.

Sorry I couldn’t be of more help.

Hmm, checking again with lsof and netstat it seemed that in the problem cases apparently the socket gets bound, but to a (random?) port above 50000 - would explain why it returns true…

Had a quick look at the (latest) code, and can’t see any obvious blunders in handling the port number… (?)

Just realized that if I ask the socket immediately after createListener() which port it has bound to, and store the result in boundTo, then boundTo is 0 in the case where it didn’t succeed but returned true. Again, netstat will actually display some 5-digit high port number, which I’m pretty sure is that socket, because it appears and disappears at the same time as the working sockets, and nothing like it is present on machines where all listeners succeeded.

Anyway, it seems to be more likely to happen on a machine after a socket wasn’t shut down properly in the previous instance (because the process was killed before the connection was closed).

You could add a debug output to see what you are passing in for a port. Like Jules, I don’t see anything wrong with the handling in the one function, it just stashes the value and calls bind.
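One way to act on that suggestion, sketched with plain POSIX sockets rather than StreamingSocket: ask the OS which port the listener really ended up on and compare it with the port you passed in. openListener is a hypothetical helper, not a JUCE function. For what it's worth, passing 0 to bind() makes the OS pick an ephemeral port, which would look like a random high port in netstat; whether that is what happened in this thread is unconfirmed.

```cpp
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not part of JUCE): binds to INADDR_ANY on
// `requestedPort` and starts listening. Returns the socket handle, or -1 on
// failure. On success, `actualPort` holds the port reported by the OS.
int openListener (uint16_t requestedPort, uint16_t& actualPort)
{
    const int handle = socket (AF_INET, SOCK_STREAM, 0);

    if (handle < 0)
        return -1;

    sockaddr_in addr {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl (INADDR_ANY);
    addr.sin_port = htons (requestedPort);   // 0 means "let the OS choose"

    socklen_t len = sizeof (addr);

    if (bind (handle, (sockaddr*) &addr, sizeof (addr)) < 0
         || listen (handle, 5) < 0
         || getsockname (handle, (sockaddr*) &addr, &len) < 0)
    {
        close (handle);
        return -1;
    }

    // getsockname() filled in the port the socket is really bound to.
    actualPort = ntohs (addr.sin_port);
    return handle;
}
```

Logging both requestedPort and actualPort right after the call would show whether a zero (or otherwise unexpected) port value is reaching bind().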