How to retrieve the headers read after a URL read operation


#1

Hello,

I’m performing a readEntireTextStream on a URL I’ve created, and it correctly results in the content of the remote file.

However, the site is replying with a cookie in the HTTP headers of the response, and I’d like to keep track of that cookie for subsequent requests to the same site.

Is it possible to access the received headers for the HTTP response somehow?

Maybe a new parameter could be added to the read* methods that, if not null, points to a StringArray/StringPairArray/whatever in which to store the received headers, or a getResponseHeaders () method could simply be added for that.

What do you think?


#2

It’s a good suggestion, though I’ve not got time at the moment to write and test all the platform-specific stuff that would be involved.

I guess that the most sensible way to do it would be to add an optional String* parameter to createInputStream. I don’t think it’d justify adding a parameter to all the read* methods, as they can all be easily reproduced by creating a stream manually and calling a method on it.


#3

Fair enough… hope this will be added soon! Unfortunately I’m not an expert in Objective-C programming, otherwise I’d start implementing this immediately…


#4

Ok, I’m working on it already, so I’ll hopefully have a patch to submit soon for adding this feature. One quick question: how do I store the content of an NSDictionary into a String or (probably better) a StringArray? Is there a built-in method in JUCE for doing that? (It’s a patch internal to the native Mac source files, so even using methods that are not available from the outside is fine.)


#5

Cool. I’ve never written anything generic to convert from an NSDictionary - I normally get the specific info I need out of them using obj-C, and then do whatever’s necessary with that.


#6

Hmm, I see, but since I’m going for a complete retrieval of the HTTP headers of the response, converting the NSDictionary to a StringPairArray and then accessing the desired data from there seems a more general-purpose approach to me, also because the same interface can then be used on Windows (once I’ve completed the patch for that operating system as well). Unfortunately, I have no plans to make a similar patch for the Linux platform (for now).


#7

Yes, in this case it sounds like it’d make sense to return it as a StringPairArray.


#8

Hello Jules, I’ve added this new feature. I wanted to attach the resulting .patch file to this message but unfortunately it does not seem to allow the attachment of files with extensions patch, diff or even txt, so I’m pasting it here. Hope you will be able to use it anyway. If not, contact me and I’ll send it over again, maybe by email.

[code]diff --git a/src/io/network/juce_URL.cpp b/src/io/network/juce_URL.cpp
index 406b26b..700ffa6 100644
--- a/src/io/network/juce_URL.cpp
+++ b/src/io/network/juce_URL.cpp
@@ -244,6 +244,8 @@ void juce_closeInternetFile (void* handle);
 int juce_readFromInternetFile (void* handle, void* dest, int bytesToRead);
 int juce_seekInInternetFile (void* handle, int newPosition);
 int64 juce_getInternetFileContentLength (void* handle);
+bool juce_getInternetFileHeaders (void* handle, StringPairArray* headers);
+

 //==============================================================================
@@ -256,7 +258,8 @@ public:
                     URL::OpenStreamProgressCallback* const progressCallback_,
                     void* const progressCallbackContext_,
                     const String& extraHeaders,
-                    int timeOutMs_)
+                    int timeOutMs_,
+                    StringPairArray* responseHeaders)
       : position (0),
         finished (false),
         isPost (isPost_),
@@ -277,6 +280,9 @@ public:
         handle = juce_openInternetFile (server, headers, postData, isPost,
                                         progressCallback_, progressCallbackContext_,
                                         timeOutMs);
+
+        if (responseHeaders != 0)
+            juce_getInternetFileHeaders (handle, responseHeaders);
     }

     ~WebInputStream()
@@ -419,12 +425,14 @@ InputStream* URL::createInputStream (const bool usePostCommand,
                                      OpenStreamProgressCallback* const progressCallback,
                                      void* const progressCallbackContext,
                                      const String& extraHeaders,
-                                     const int timeOutMs) const
+                                     const int timeOutMs,
+                                     StringPairArray* responseHeaders) const
 {
     ScopedPointer <WebInputStream> wi (new WebInputStream (*this, usePostCommand,
                                                            progressCallback, progressCallbackContext,
                                                            extraHeaders,
-                                                           timeOutMs));
+                                                           timeOutMs,
+                                                           responseHeaders));

     return wi->isError() ? 0 : wi.release();
 }
diff --git a/src/io/network/juce_URL.h b/src/io/network/juce_URL.h
index f54c22f..3d3ed9b 100644
--- a/src/io/network/juce_URL.h
+++ b/src/io/network/juce_URL.h
@@ -203,12 +203,15 @@ public:
         @param connectionTimeOutMs  if 0, this will use whatever default setting the OS chooses. If
                                     a negative number, it will be infinite. Otherwise it specifies a
                                     time in milliseconds.
-    */
+        @param responseHeaders  if this is non-zero, all the (key, value) pairs received as headers
+                                in the response will be stored in this array
+    */
     InputStream* createInputStream (bool usePostCommand,
                                     OpenStreamProgressCallback* progressCallback = 0,
                                     void* progressCallbackContext = 0,
                                     const String& extraHeaders = String::empty,
-                                    int connectionTimeOutMs = 0) const;
+                                    int connectionTimeOutMs = 0,
+                                    StringPairArray* responseHeaders = 0) const;

     //==============================================================================
diff --git a/src/native/mac/juce_mac_Network.mm b/src/native/mac/juce_mac_Network.mm
index a7d0170..d115188 100644
--- a/src/native/mac/juce_mac_Network.mm
+++ b/src/native/mac/juce_mac_Network.mm
@@ -137,6 +137,7 @@ using namespace JUCE_NAMESPACE;
     bool initialised, hasFailed, hasFinished;
     int position;
     int64 contentLength;
+    NSDictionary* allHeaders;
     NSLock* dataLock;
 }

@@ -200,6 +201,7 @@ public:
     hasFailed = false;
     hasFinished = false;
     contentLength = -1;
+    allHeaders = 0;

     runLoopThread = new JuceURLConnectionMessageThread (self);
     runLoopThread->startThread();
@@ -224,6 +226,8 @@ public:
     [data release];
     [dataLock release];
     [request release];
+    if (allHeaders != 0)
+        [allHeaders release];

     [super dealloc];
 }

@@ -244,6 +248,17 @@ public:
     [dataLock unlock];
     initialised = true;
     contentLength = [response expectedContentLength];
+
+    if ([response class] == [NSHTTPURLResponse class])
+    {
+        allHeaders = [((NSHTTPURLResponse *)response) allHeaderFields];
+        [allHeaders retain];
+    }
+    else
+    {
+        allHeaders = 0;
+    }
 }

 - (void) connection: (NSURLConnection*) conn didFailWithError: (NSError*) error
@@ -416,6 +431,30 @@ int64 juce_getInternetFileContentLength (void* handle)
     return -1;
 }

+bool juce_getInternetFileHeaders (void* handle, StringPairArray* headers)
+{
+    JuceURLConnection* const s = (JuceURLConnection*) handle;
+    NSDictionary* dictionary = s->allHeaders;
+    if (s != 0 && dictionary != 0)
+    {
+        NSEnumerator* enumerator = [dictionary keyEnumerator];
+        id key;
+        while ((key = [enumerator nextObject]))
+        {
+            String keyString (nsStringToJuce ((NSString*)key));
+            String valueString (nsStringToJuce ((NSString*)[dictionary objectForKey:key]));
+            headers->set (keyString, valueString);
+        }
+        return true;
+    }
+    return false;
+}
+
 int juce_seekInInternetFile (void* handle, int /*newPosition*/)
 {
     JuceURLConnection* const s = (JuceURLConnection*) handle;
diff --git a/src/native/windows/juce_win32_Network.cpp b/src/native/windows/juce_win32_Network.cpp
index 6073327..c4e9fc9 100644
--- a/src/native/windows/juce_win32_Network.cpp
+++ b/src/native/windows/juce_win32_Network.cpp
@@ -279,6 +279,68 @@ int64 juce_getInternetFileContentLength (void* handle)
     return -1;
 }

+bool juce_getInternetFileHeaders (void* handle, StringPairArray* headers)
+{
+    const ConnectionAndRequestStruct* const crs = static_cast <ConnectionAndRequestStruct*> (handle);
+    if (crs != 0)
+    {
+        LPVOID outputBuffer = 0;
+        DWORD size = 0;
+        // Implementation adapted from: http://msdn.microsoft.com/en-us/library/aa385373%28v=VS.85%29.aspx
+        for (;;)    // Dummy loop for reiterating the call, because the first one will fail for insufficient buffer size.
+        {
+            if (! HttpQueryInfo (crs->request, HTTP_QUERY_RAW_HEADERS_CRLF, outputBuffer, &size, 0))
+            {
+                if (GetLastError () == ERROR_HTTP_HEADER_NOT_FOUND)
+                {
+                    return true;    // No header was present, this call returns successfully
+                }
+                else
+                {
+                    if (GetLastError () == ERROR_INSUFFICIENT_BUFFER)    // This is the expected result of the first call to HttpQueryInfo
+                    {
+                        jassert (outputBuffer == 0);    // If this assert is triggered, we've looped more than once, and that's bad
+                        outputBuffer = new char [size];    // Allocate the necessary buffer
+                        continue;                          // Repeat the query
+                    }
+                    else    // Unexpected error
+                    {
+                        if (outputBuffer)
+                            delete [] outputBuffer;
+                        return false;
+                    }
+                }
+            }
+            // On success, here we should parse the received headers
+            String headersString ((wchar_t *) outputBuffer);
+            delete [] outputBuffer;
+            StringArray headersArray;
+            headersArray.addLines (headersString);
+            for (int i = 0; i < headersArray.size (); ++i)
+            {
+                const String& headersEntry = headersArray [i];
+                String key (headersEntry.upToFirstOccurrenceOf (": ", false, false));
+                String value (headersEntry.fromFirstOccurrenceOf (": ", false, false));
+                String previousValue ((*headers) [key]);
+                headers->set (key, previousValue.isEmpty () ? value : previousValue + ";" + value);
+            }
+            return true;
+        }
+    }
+    return false;
+}
+
 void juce_closeInternetFile (void* handle)
 {
     if (handle != 0)
[/code]
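
The Windows part of the patch relies on a "query twice" idiom: the first call is made with no buffer just to learn the required size, and the second call fills a buffer of that size. Below is a minimal, platform-independent sketch of that idiom in plain C++. `fakeQuery` is a hypothetical stand-in for a call such as HttpQueryInfo (it is not the real WinINet function and does not model its error codes), and `readAllHeaders` is an illustrative name as well. A std::vector owns the buffer, so there is no manual delete [].

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical data that the fake API hands back.
static const char fakeHeaders[] = "Content-Type: text/html\r\n";

// Hypothetical stand-in for a size-reporting query call: if the buffer is
// missing or too small, it stores the required size in 'size' and fails.
bool fakeQuery (void* buffer, std::size_t& size)
{
    const std::size_t needed = sizeof (fakeHeaders);   // includes the trailing NUL

    if (buffer == nullptr || size < needed)
    {
        size = needed;      // tell the caller how much room is required
        return false;
    }

    std::memcpy (buffer, fakeHeaders, needed);
    size = needed - 1;      // number of characters written, excluding the NUL
    return true;
}

// Calls the query at most twice: once to learn the size, once to fetch the data.
bool readAllHeaders (std::string& result)
{
    std::vector<char> buffer;
    std::size_t size = 0;

    for (int attempt = 0; attempt < 2; ++attempt)
    {
        if (fakeQuery (buffer.empty() ? nullptr : buffer.data(), size))
        {
            result.assign (buffer.data(), size);
            return true;
        }

        buffer.resize (size);   // the failed call reported the size we need
    }

    return false;               // still failing after resizing: give up
}
```

Capping the loop at two attempts plays the same role as the jassert in the patch: if the second call still fails, something other than the buffer size is wrong.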


#9

Sorry, I can’t get that to work - maybe just email me the file?


#10

File emailed!

In addition, I’d point out in the documentation for URL::createInputStream that a null return value means that an error occurred, and in URL::withParameter that the parameters are not used only for the GET HTTP method but are also included in POST requests.


#11

I’ve been working on this and related subjects for a while now, and there are some things I’d like to point out:

  1. juce_URL.cpp, line 410

[code]    {
        data << getMangledParameters (url.getParameters())
             << url.getPostData();

        data.flush ();    // I ADDED THIS LINE

        // just a short text attachment, so use simple url encoding..
        headers = "Content-Type: application/x-www-form-urlencoded\r\nContent-length: "
                    + String ((unsigned int) postData.getSize())
                    + "\r\n";
    }
[/code]

data is a MemoryOutputStream and postData is its underlying MemoryBlock.
While the MemoryBlock is being used as the internal storage of the MemoryOutputStream, its size does not reflect the actual size of the written data: the MemoryOutputStream constructor resizes it to an initial size, and it is then enlarged by fixed amounts when needed. It is only trimmed to the actual size of the written data by a flush () call (which is normally made by the destructor of the MemoryOutputStream). That’s why I added one to the code above: otherwise, the call to postData.getSize () returns a larger number than the actual size of the written data, which produces a wrong Content-length and breaks things badly on Windows (it causes no problems on Mac).
Using data.getDataSize () instead of postData.getSize () would have worked equally well without the need for the flush () call, so that’s another viable solution if you prefer.
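
To make the size mismatch concrete, here is a small stand-alone sketch in plain C++. `ChunkedWriter` is a hypothetical class written only for this illustration; it mimics the behaviour described above (pre-sized block, growth in fixed steps, trimming on flush) and is not the JUCE implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a stream that writes into a caller-owned block.
class ChunkedWriter
{
public:
    explicit ChunkedWriter (std::vector<char>& blockToUse)
        : block (blockToUse), written (0)
    {
        block.assign (chunkSize, '\0');     // pre-size the block up front
    }

    void write (const std::string& text)
    {
        while (written + text.size() > block.size())
            block.resize (block.size() + chunkSize);    // grow by a fixed amount

        std::copy (text.begin(), text.end(),
                   block.begin() + static_cast<std::ptrdiff_t> (written));
        written += text.size();
    }

    // Number of bytes actually written, regardless of the block's capacity.
    std::size_t getDataSize() const     { return written; }

    // Trims the block so that its size matches the written data.
    void flush()                        { block.resize (written); }

private:
    static const std::size_t chunkSize = 256;

    std::vector<char>& block;
    std::size_t written;
};
```

With this sketch, writing "a=1&b=2" leaves the block at 256 bytes while getDataSize () reports 7; only after flush () does the block's size drop to 7. Reading the block's size before the flush is what would yield the oversized Content-length.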

  2. juce_mac_Network.mm, line 234

[code]- (void) createConnection
{
    connection = [[NSURLConnection alloc] initWithRequest: request
                                                 delegate: self];   // it was [self retain] before

    if (connection == nil)
        runLoopThread->signalThreadShouldExit();
}
[/code]

I removed the “retain” call because it has no matching “release” upon destruction, and it was causing memory leaks. I tested it and using just “self” worked well so far, without dangling pointer issues.

  3. juce_win32_Network.cpp, line 172

DWORD flags = INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_COOKIES;
I have added the INTERNET_FLAG_NO_COOKIES flag to be able to pass my own cookies in the extra headers. Otherwise, any additional "Cookie: " directive in the extra headers gets ignored in favour of the cookies managed directly by Windows. I understand that letting the OS manage the cookies is the desired behaviour in some situations, so maybe adding a boolean parameter for this is the best way to go, so that the library user can choose what to do. Now that it is possible to retrieve the cookies by reading the headers, a developer can send the received cookies back just as well.

same file, line 217:

HttpEndRequest can fail as well, especially if you use the whole thing to send a large file. I suggest replacing the simple one line call with

if (! HttpEndRequest (request, 0, 0, 0)) { break; }

because this call can time out too and, currently, that does not result in an error being reported by returning 0 (which it should, in my opinion).
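
As an illustration of the cookie point above (reading a cookie from the response headers and sending it back through the extra headers), here is a minimal sketch in plain C++. `cookieHeaderFromSetCookie` is a hypothetical helper, not a JUCE function; it assumes it is given the value of a single Set-Cookie header and does no validation beyond checking for a NAME=VALUE pair.

```cpp
#include <string>

// Hypothetical helper: takes the value of one Set-Cookie header, e.g.
// "NAME=VALUE; expires=DATE; path=PATH", and returns a request header line
// that sends NAME=VALUE back. Attributes after the first ';' are dropped.
std::string cookieHeaderFromSetCookie (const std::string& setCookieValue)
{
    const std::string::size_type semicolon = setCookieValue.find (';');

    // If there is no ';', semicolon is npos and substr returns the whole string.
    const std::string pair = setCookieValue.substr (0, semicolon);

    if (pair.find ('=') == std::string::npos)
        return std::string();       // not a NAME=VALUE pair: nothing to send

    return "Cookie: " + pair + "\r\n";
}
```

The returned line could then be passed as the extraHeaders argument of a subsequent request. Splitting a value that holds several coalesced cookies back into individual ones is deliberately left out of this sketch.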


#12

In addition, I have read somewhere that the separator for coalesced HTTP headers is the comma, not the semicolon that I used in the patch I have submitted to you… It should be corrected in juce_win32_Network.cpp, line 303 (in juce_getInternetFileHeaders ())


#13

Excellent stuff!

  1. Thanks - nice catch!
  2. I think this might be a change in the OSX SDK behaviour - when I wrote it, the connection object was outliving my delegate and crashing when it made the callback, but now, the SDK docs say that it retains the delegate - that looks like they’ve fixed a bug there… Not really sure how best to change it so that it will still run ok on older machines… How about this:

[code]    NSInteger oldRetainCount = [self retainCount];
    connection = [[NSURLConnection alloc] initWithRequest: request
                                                 delegate: self];

    if (oldRetainCount == [self retainCount])
        [self retain];
[/code]

  3. Yes, that sounds good - I don’t think the other platforms let the OS provide cookies, so I’ve no objection to turning this off and making them all behave the same.

[quote]
In addition, I have read somewhere that the separator for coalesced HTTP headers is the comma, not the semicolon that I used in the patch I have submitted to you… It should be corrected in juce_win32_Network.cpp, line 303 (in juce_getInternetFileHeaders ())[/quote]

That’s interesting - do you think it should break at either a comma or semicolon? Or just commas?


#14

[quote="jules"]Excellent stuff![/quote]

  2. Perfect!

  3. Mac OS X actually replies to cookies, but if you provide a "Cookie: " directive in your extraHeaders, the ones you provide take precedence over the ones provided by the OS… I have no time now to dig further into how to disable this handling of cookies on the Mac, but I think I’ll try to put together a patch that adds a boolean parameter to enable or disable OS-managed cookies. In the meantime, I think we are fine this way.

Regarding the cookie separator, it’s not a question of where you should break: rather, we are talking about how to deal with headers that are repeated in the response. For example, if the received headers contain two "Set-Cookie: " directives, they can’t go into two separate entries of a StringPairArray because keys can’t be duplicated, but you can append the value of the second "Set-Cookie: " to the value of the previous one, with a separator in between. That separator should be a comma in my opinion, for two reasons:

a) the Mac code gets an NSDictionary for the headers from the OS, where this same separator is used with the same meaning, so using it on Windows will give the same content in the resulting StringPairArray

b) most HTTP header fields allow multiple instances with the same key to be collapsed into a single one, whose value is the list of the individual values separated by commas.

In HTTP headers, semicolons tend to be “intra-value” separators: if the value of a single header field is composed of more than one element, those elements are separated by semicolons, and then a comma separates this composite value from the next one… I hope I have been clear.


#15

Ok, I think I understand…! So the code should be like this:

[code]    const String key (header.upToFirstOccurrenceOf (", ", false, false));
    const String value (header.fromFirstOccurrenceOf (", ", false, false));
    const String previousValue (headers [key]);

    headers.set (key, previousValue.isEmpty() ? value : (previousValue + ";" + value));
[/code]

…keeping the semi-colon for separating the previous and new values, right?


#16

no, this is the right way:

[code]    const String key (header.upToFirstOccurrenceOf (": ", false, false));
    const String value (header.fromFirstOccurrenceOf (": ", false, false));
    const String previousValue (headers [key]);

    headers.set (key, previousValue.isEmpty() ? value : (previousValue + "," + value));
[/code]

What I was trying to say is that a header row can have this form:

“key: value1, value2, value3”

where occasionally the values contain semicolons as separators for their internal elements. For example, the whole of the following string counts as value1:

“NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME;”

If we use the semicolon to separate value1 from value2, that semicolon will not be distinguishable from those used inside value1.
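
For reference, the same logic can be sketched without JUCE, in plain C++, so that it can be tried in isolation. `parseRawHeaders` is a hypothetical helper and a std::map stands in for the StringPairArray. Unlike the snippet above, this sketch skips lines that contain no ": " (such as the status line), and its keys are case-sensitive.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parses a raw "Key: value\r\n..." block into a map,
// joining the values of repeated keys with a comma.
std::map<std::string, std::string> parseRawHeaders (const std::string& raw)
{
    std::map<std::string, std::string> headers;
    std::istringstream lines (raw);
    std::string line;

    while (std::getline (lines, line))
    {
        if (! line.empty() && line.back() == '\r')   // strip the CR of a CRLF ending
            line.pop_back();

        const std::string::size_type colon = line.find (": ");

        if (colon == std::string::npos)              // status line or blank line
            continue;

        const std::string key   = line.substr (0, colon);
        const std::string value = line.substr (colon + 2);

        // Key and value are split at the first ": "; repeated keys are
        // coalesced with "," so that ';' stays an intra-value separator.
        std::string& stored = headers[key];
        stored = stored.empty() ? value : stored + "," + value;
    }

    return headers;
}
```

Feeding it a response with two Set-Cookie lines, "a=1; path=/" and "b=2; path=/", gives a single "Set-Cookie" entry whose value is "a=1; path=/,b=2; path=/", so the semicolons inside each cookie remain distinguishable from the separator between cookies.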


#17

Sorry, I’m having a slow-witted day today… Ok, gotcha, will get that changed, thanks again!


#18

I’m sorry to bother you again on this subject, but the last commit available on the tip now (cc45ec88f5b9c56d18081707ee191b476b44ff68) still has wrong code:
key and value are separated by colon (not semicolon), so this one below is the correct code (to be fixed both on Windows and Linux builds).

                    const String key (header.upToFirstOccurrenceOf (": ", false, false));
                    const String value (header.fromFirstOccurrenceOf (": ", false, false));

The current one looks like this, which is wrong:

                    const String key (header.upToFirstOccurrenceOf ("; ", false, false));
                    const String value (header.fromFirstOccurrenceOf ("; ", false, false));

the rest of the code of this area is ok!


#19

aghh… sorry, I managed to misread all your posts! Will get a fixed version up today!


#20

[quote="jules"]

[code]    NSInteger oldRetainCount = [self retainCount];
    connection = [[NSURLConnection alloc] initWithRequest: request
                                                 delegate: self];

    if (oldRetainCount == [self retainCount])
        [self retain];
[/code]
[/quote]

Again on this: I suggest using an NSUInteger instead of an NSInteger for oldRetainCount, otherwise the compiler complains about a signed/unsigned comparison in the if.