How to get URL to follow redirects?

I'm trying to understand how I can make the URL follow redirects.  I tried to use code like:


InputStream *is = theUrl.createInputStream(false,nullptr, nullptr,String(),0,&sa,&sc);
if (!is)
{
  // for some reason, sc is zero...
}

The returned code from createInputStream is 0 even though I see from wireshark that the site returned '301'.

So: how can I get the URL to bounce along the redirects?

I ran into this issue before, see here: http://www.juce.com/forum/topic/webinputstream-patch-redirects-and-ssl-certificates .

Sadly, writing all the code for every platform to support redirects is not an easy task...

Thanks, I'll look at your code.

Jules, 

Can you say why no error code was returned (no http status code, that is)?

without debugging it myself and seeing what happened, I really have no idea.

Allright, but I am correct in assuming that the 301 should have been returned?

So on OSX the URL does do the redirection properly.

On Windows, it reports OK but fails to read anything (presumably it doesn't actually get the data)

On Linux it report failure, but the error code returned is 0.

 

The web address I'm using to test is "http://www.musimotion.com/" which redirects to the same address with "fr" at the end.

I also bumped into this problem, then searched for this thread and realized I should patch it until Jules fixes it.

All the changes are in juce_win32_Network.cpp::WebInputStream::WebInputStream(...):

I wrapped the code in a while statement, and as long as the returned status code was one of 301, 302, 303, 307, I would keep looping with the new address (Location parameter), if present. Else I would just end the loop and return.

Here's the new function body:


bool done = false;
        ScopedPointer<StringPairArray> currentResponseHeaders = new StringPairArray(false);
        while (!done)
        {
            done = true;
            createConnection(progressCallback, progressCallbackContext);
            if (!isError())
            {
                DWORD bufferSizeBytes = 4096;
                for (;;)
                {
                    HeapBlock<char> buffer((size_t) bufferSizeBytes);
                    if (HttpQueryInfo(request, HTTP_QUERY_RAW_HEADERS_CRLF, buffer.getData(), &bufferSizeBytes, 0))
                    {
                        StringArray headersArray;
                        headersArray.addLines(String(reinterpret_cast<const WCHAR*> (buffer.getData())));
                        for (int i = 0; i < headersArray.size(); ++i)
                        {
                            const String& header = headersArray[i];
                            const String key(header.upToFirstOccurrenceOf(": ", false, false));
                            const String value(header.fromFirstOccurrenceOf(": ", false, false));
                            const String previousValue((*currentResponseHeaders)[key]);
                            currentResponseHeaders->set(key, previousValue.isEmpty() ? value : (previousValue + "," + value));
                        }
                        break;
                    }
                    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
                        break;
                }
                DWORD status = 0;
                DWORD statusSize = sizeof(status);
                if (HttpQueryInfo(request, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER, &status, &statusSize, 0))
                {
                    statusCode = (int) status;
                    if (statusCode == 301 ||
                        statusCode == 302 ||
                        statusCode == 303 ||
                        statusCode == 307)
                    {
                        if (currentResponseHeaders->containsKey("Location"))
                        {
                            address = (*currentResponseHeaders)["Location"];
                            currentResponseHeaders->clear();
                            done = false;
                        }
                    }
                    else
                    {
                        if (responseHeaders)
                        {
                            responseHeaders->addArray(*currentResponseHeaders);
                        }
                    }
                }
            }
        }

Keep in mind:

  • I didn't test this thoroughly, I just made sure the redirection works.
  • I expect some funny callback behaviour, if supplied.
  • There's another redirection status code, 300, but I didn't find an example for it and I gather it's rather rare and complex, so I decided to neglect it.

Seems like a sensible plan... Here's a cleaned-up version with a few extra safety-checks - does this seem like a good idea to everyone?

        for (int maxRedirects = 10; --maxRedirects >= 0;)
        {
            createConnection (progressCallback, progressCallbackContext);
            if (! isError())
            {
                DWORD bufferSizeBytes = 4096;
                StringPairArray headers (false);
                for (;;)
                {
                    HeapBlock<char> buffer ((size_t) bufferSizeBytes);
                    if (HttpQueryInfo (request, HTTP_QUERY_RAW_HEADERS_CRLF, buffer.getData(), &bufferSizeBytes, 0))
                    {
                        StringArray headersArray;
                        headersArray.addLines (String (reinterpret_cast<const WCHAR*> (buffer.getData())));
                        for (int i = 0; i < headersArray.size(); ++i)
                        {
                            const String& header = headersArray[i];
                            const String key   (header.upToFirstOccurrenceOf (": ", false, false));
                            const String value (header.fromFirstOccurrenceOf (": ", false, false));
                            const String previousValue (headers[key]);
                            headers.set (key, previousValue.isEmpty() ? value : (previousValue + "," + value));
                        }
                        break;
                    }
                    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
                        break;
                    bufferSizeBytes += 4096;
                }
                DWORD status = 0;
                DWORD statusSize = sizeof (status);
                if (HttpQueryInfo (request, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER, &status, &statusSize, 0))
                {
                    statusCode = (int) status;
                    if (status == 301 || status == 302 || status == 303 || status == 307)
                    {
                        const String newLocation (headers["Location"]);
                        if (newLocation.isNotEmpty() && newLocation != address)
                        {
                            address = newLocation;
                            continue;
                        }
                    }
                }
                if (responseHeaders != nullptr)
                    responseHeaders->addArray (headers);
            }
            break;
        }

Looks fine to me. The 10 attempts safety check is good, will probably prevent redirection loops, which I forgot about :)

I thought this was fine too, until I realised it broke something for me!

I've got a bit of code that attempts a login and looks for a Set-Cookie command in the response headers. The problem is that this appears only in the first response and this response is a redirect. So the new code automatically follows the redirect and gets a new set of response headers which no longer include the Set-Cookie command.

Suggested fix (which works for me):

  • add "const bool ignoreRedirects = false" to method signature for URL::createInputStream
  • add same parameter to the WebInputStream constructor and pass said parameter to WebInputStream wi object when it is created
  • change initialisation of maxRedirects in for loop to "int maxRedirects = ignoreRedirects ? 1 : 10"
  • change condition in status code check to "if (!ignoreRedirects && (status == 301 || status == 302 || status == 303 || status == 307))"

I haven't considered what this might mean for non Win32 implementations.

Another approach might be to accumulate header responses from all redirects but I shudder to think what might go wrong if you did that.

Well, assuming that the current implementation is now consitent across OSX and Windows, I could add a flag if you can let me know how that should be implemented on OSX - am too busy on other stuff to dig into it and research it myself this week..

Um well, I'm not sure if there's any consistency to start with.

  • Win32 code has just been changed from no redirects to max 10 redirects
  • Apple code doesn't retry the NSMutableURLRequest; but it looks like your delegate implementation allows following of 1 redirect (then again maybe it recurses infinitely, I'm not used to Objective-C and can't tell)
  • Linux code appears to handle max 3 redirects
  • Android code appears to use HttpURLConnection which handles up to 5 redirects

Jules - I'm not sure if you processed my previous post but I've been thinking there's a few ways to improve the consistency here:

  • Normalise to a max of 5 redirects across all platforms (presuming we can sort out Apple) and allow user to set ignoreRedirects (in which case Android will be an exception)
  • Allow user to specify the number of redirects they want to follow (again Android would be an exception)
  • Don't allow user to set ignoreRedirects, but record all response headers so they can be checked after the fact

If you can let me know which you'd prefer, I can see if I can get it working (might take me a while to sort the Apple one though).

 

PS - I notice you've got CloudFlare going on the website today - now I've got another cookie to intercept :)

(Yes, sorry, your reply fell under my radar..) 

Maybe if we can find a way on all the platforms to control the number of redirects, then instead of a bool to disable them, the user could supply the maximum number that should be allowed, and we could set it to 5 by default?

 

OK - that was my 2nd option. Android will be the exception as it will do 5 redirects automatically, or you can disable it so it does none.