URL and POST

I’m using the same URL code to POST data to my web application; it’s a simple JSON data post. On Windows and Mac I’m getting the right answers from my server, but on Linux the same post produces an error from the server. I wanted to see what the request looks like, and I got:

POST /ddb/ HTTP/1.0
Host: ctrlr.org:80
User-Agent: JUCE/2.0
Connection: Close
Content-Length: 69
Content-Type: application/x-www-form-urlencoded
Content-length: 69

requestData=%5b%0d%0a++%22__list%22%2c%0d%0a++%22allItems%22%0d%0a%5d

With that, the response from the server is “HTTP Error 413 Request entity too large”.

I know very little about HTTP, but it kind of looks OK to me… Not sure what could be wrong with it; it certainly doesn’t look “too large”…
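For reference, the percent-encoded body decodes to a tiny JSON array. Here’s a minimal sketch of an application/x-www-form-urlencoded value decoder (plain std::string, nothing JUCE-specific) to make it readable:

```cpp
#include <cassert>
#include <string>

// Minimal application/x-www-form-urlencoded value decoder:
// '+' becomes a space, %XX becomes the byte with hex value XX.
static std::string formDecode (const std::string& in)
{
    std::string out;

    for (std::string::size_type i = 0; i < in.size(); ++i)
    {
        if (in[i] == '+')
            out += ' ';
        else if (in[i] == '%' && i + 2 < in.size())
        {
            out += (char) std::stoi (in.substr (i + 1, 2), nullptr, 16);
            i += 2;
        }
        else
            out += in[i];
    }

    return out;
}
```

Feeding it the encoded value above yields the two-element JSON array `["__list", "allItems"]` with CRLF formatting — and the whole `requestData=…` string is exactly the 69 bytes the Content-Length claims, so nothing is remotely “too large”.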

Well, yeah, that response is very weird. I don’t know if it has to do with the double Content-Length header; I was stumped because it all works on the other platforms.

Looking at the Linux code, I can’t see how it could generate the Content-Length value twice… ??

I added some logging to the URL class inside JUCE, and this is what happens:

!!JUCE URL::createInputStream headers
Content-Type: application/x-www-form-urlencoded
Content-length: 69

!!JUCE WebInputStream::createRequestHeader headers:POST /ddb/ HTTP/1.0
Host: ctrlr.org:80
User-Agent: JUCE/2.0
Connection: Close
Content-Length: 69
Content-Type: application/x-www-form-urlencoded
Content-length: 69

requestData=%5b%0d%0a++%22__list%22%2c%0d%0a++%22allItems%22%0d%0a%5d

Ah, I see… Ok, how about something like this in juce_linux_Network.cpp, to prevent it duplicating header fields:

[code] static void writeValueIfNotPresent (String& dest, const String& headers, const String& key, const String& value)
{
    if (! headers.containsIgnoreCase (key))
        dest << "\r\n" << key << ' ' << value;
}

static MemoryBlock createRequestHeader (const String& hostName, const int hostPort,
                                        const String& proxyName, const int proxyPort,
                                        const String& hostPath, const String& originalURL,
                                        const String& headers, const MemoryBlock& postData,
                                        const bool isPost)
{
    String header (isPost ? "POST " : "GET ");

    if (proxyName.isEmpty())
    {
        header << hostPath << " HTTP/1.0\r\nHost: "
               << hostName << ':' << hostPort;
    }
    else
    {
        header << originalURL << " HTTP/1.0\r\nHost: "
               << proxyName << ':' << proxyPort;
    }

    writeValueIfNotPresent (header, headers, "User-Agent:", "JUCE/" + String (JUCE_MAJOR_VERSION) + "." + String (JUCE_MINOR_VERSION));
    writeValueIfNotPresent (header, headers, "Connection:", "Close");
    writeValueIfNotPresent (header, headers, "Content-Length:", String ((int) postData.getSize()));
    header << "\r\n" << headers << "\r\n";

    MemoryBlock mb;
    mb.append (header.toUTF8(), (int) strlen (header.toUTF8()));
    mb.append (postData.getData(), postData.getSize());

    return mb;
}

[/code]
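The containsIgnoreCase check is the important part: HTTP header field names are case-insensitive, and in the log above the two copies differ only in case (“Content-Length” vs “Content-length”), so a case-sensitive search would not spot the existing one. A standalone sketch of such a check, using std::string instead of JUCE’s String:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// HTTP header field names are case-insensitive, so the
// "is this header already present?" test must ignore case too.
static bool containsIgnoreCase (const std::string& haystack, const std::string& needle)
{
    auto it = std::search (haystack.begin(), haystack.end(),
                           needle.begin(), needle.end(),
                           [] (char a, char b)
                           {
                               return std::tolower ((unsigned char) a)
                                   == std::tolower ((unsigned char) b);
                           });

    return it != haystack.end();
}
```

With this, a caller-supplied “Content-length:” header suppresses the platform code’s own “Content-Length:” line, whatever the capitalisation.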

Well, that removed the duplicate but did not help. I decided to go deep with Wireshark and compared the Windows and Linux request headers from my application (same build).
On Linux I get:

POST /ddb/ HTTP/1.0
Host: ctrlr.org:80
User-Agent: JUCE/2.0
Connection: Close
Content-Type: application/x-www-form-urlencoded
Content-length: 69

requestData=%5b%0d%0a++%22__list%22%2c%0d%0a++%22allItems%22%0d%0a%5d

HTTP/1.1 200 OK
Server: Apache/2.2.3 (CentOS)
Last-Modified: Thu, 15 Jul 2010 18:08:10 GMT
ETag: "13f-48b70fc293680"
Accept-Ranges: bytes
Cache-Control: max-age=172800
Expires: Sat, 10 Sep 2011 19:12:15 GMT
Content-Type: text/html
Content-Length: 319
Date: Thu, 08 Sep 2011 19:12:15 GMT
X-Varnish: 654761406
Age: 0
Via: 1.1 varnish
Connection: close

<html>
<head>
  <title>Unknown Site</title>
</head>
<body>
<p>This space is managed by SourceForge.net. You have attempted to access a URL that either never existed or is no longer active. Please check the source of your link and/or contact the maintainer of the link to have them update their records.
</body>
</html>

So the request looks OK, but the response is bad (since the site exists and is fine).

But to compare, below is the Windows capture:

POST /ddb/ HTTP/1.1
Accept: */*
Content-Type: application/x-www-form-urlencoded
Content-Length: 69
User-Agent: juce
Host: ctrlr.org
Cache-Control: no-cache

requestData=%5b%0d%0a++%22__list%22%2c%0d%0a++%22allItems%22%0d%0a%5d

HTTP/1.1 200 OK
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.3.2
Cache-Control: max-age=172800
Expires: Sat, 10 Sep 2011 19:09:29 GMT
Content-Type: text/html
Content-Length: 51770
Date: Thu, 08 Sep 2011 19:09:29 GMT
X-Varnish: 91239431
Age: 0
Via: 1.1 varnish
Connection: keep-alive

---- loads of data, as I expected

Comparing the two, I removed the port from the Host: header on Linux and it worked. I don’t know the “exact” specs for HTTP, but I think you specify the port only if you are requesting a virtual host on a particular port; otherwise an HTTP server (Apache, or Varnish in this case, which is a reverse proxy) will look up an invalid virtual host name for that host.

Anyway, on line 353 in juce_linux_Network.cpp I changed

<< hostName << ':' << hostPort;

to

<< hostName;

Hmm.

Well, according to the spec, it is correct to add the port number:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html - see section 14.23

It says you can leave off the number if you just want a default, but it’s certainly ok to add one.

I guess what I’ll do is to make it leave it blank unless the URL actually contains an explicit port number…
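Something like this sketch, presumably — here a port value of 0 stands in for “no explicit port in the URL”, which is an assumption made for illustration rather than JUCE’s actual convention:

```cpp
#include <cassert>
#include <string>

// Sketch: append the port to the Host header only when the URL
// spelled one out explicitly. 0 means "no explicit port" here
// (an assumption for this illustration).
static std::string makeHostHeader (const std::string& hostName, int explicitPort)
{
    std::string header = "Host: " + hostName;

    if (explicitPort > 0)
        header += ':' + std::to_string (explicitPort);

    return header;
}
```

That way a plain http://ctrlr.org/ddb/ URL produces “Host: ctrlr.org”, while http://ctrlr.org:8080/ddb/ would still carry the port through.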

I think it’s the fact that you are saying HTTP/1.0, not HTTP/1.1. Have a look at http://www8.org/w8-papers/5c-protocols/key/key.html, the section “Internet address conservation”.

Ah, that could explain it. I suppose there’s no harm in me changing it to HTTP/1.1 by default.

Now there are new problems with the port.

First, the decomposeURL method returns the default port as 0.
Then 0 is used as the service port in a call to getaddrinfo().
The result from getaddrinfo() is passed to connect(), and connect() fails, since the service is the string “0” and not NULL (from what I’ve seen, getaddrinfo() can cope without a port). Also, the returned data is a linked list; maybe it would be a good idea to iterate through it and try to connect to each of the results? http://en.wikipedia.org/wiki/Getaddrinfo

If I try setting the port to 80 as the default in decomposeURL, then the request fails, because createRequestHeader adds the port number to the Host header (and the source URL does not have the port in it, so it’s an invalid request).

I thought it might be best to include the port in the Host header only if the passed URL contains a port specification; otherwise, look up the port and pass it to the socket, but never use it in the Host header.

Isn’t that what I changed it to do?

Well, like I wrote, I’m getting errors on the connect() call, because the port passed to getaddrinfo() is the string “0”, not a NULL.
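The usual shape of that fix is to pass NULL (not the string “0”) as the service when no port is known, and to walk getaddrinfo()’s linked list, attempting connect() on each entry until one succeeds. A POSIX sketch (not actual JUCE code):

```cpp
#include <cassert>
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>

// Resolve host and try to connect to each returned address in turn.
// 'service' may be a port string like "80", or NULL when unknown.
static int connectToHost (const char* host, const char* service)
{
    addrinfo hints = {};
    hints.ai_family   = AF_UNSPEC;     // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* results = nullptr;
    if (getaddrinfo (host, service, &hints, &results) != 0)
        return -1;

    int fd = -1;
    for (addrinfo* r = results; r != nullptr; r = r->ai_next)
    {
        fd = socket (r->ai_family, r->ai_socktype, r->ai_protocol);
        if (fd < 0)
            continue;

        if (connect (fd, r->ai_addr, r->ai_addrlen) == 0)
            break;                      // connected

        close (fd);
        fd = -1;
    }

    freeaddrinfo (results);
    return fd;                          // -1 if every attempt failed
}
```

Iterating the list matters on dual-stack hosts, where the first result (often IPv6) may be unreachable while a later one works.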

Beware that with HTTP/1.1 some servers will return chunked-encoded responses; that was the reason you switched to HTTP/1.0 in the first place
( http://www.rawmaterialsoftware.com/viewtopic.php?f=2&t=2759 )
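For anyone hitting this: “chunked” means the HTTP/1.1 body arrives as hex-length-prefixed chunks terminated by a zero-length chunk, so a client that treats the raw stream as the body sees the chunk sizes mixed into the data. A minimal sketch of decoding a complete chunked body held in memory (ignoring chunk extensions and trailers):

```cpp
#include <cassert>
#include <string>

// Decode a complete Transfer-Encoding: chunked body:
// each chunk is "<hex length>\r\n<data>\r\n"; a zero length ends it.
static std::string dechunk (const std::string& in)
{
    std::string out;
    std::string::size_type pos = 0;

    while (pos < in.size())
    {
        auto eol = in.find ("\r\n", pos);
        if (eol == std::string::npos)
            break;

        auto len = std::stoul (in.substr (pos, eol - pos), nullptr, 16);
        if (len == 0)
            break;                        // final zero-length chunk

        out += in.substr (eol + 2, len);  // chunk data
        pos = eol + 2 + len + 2;          // skip data plus trailing CRLF
    }

    return out;
}
```

A real client would either implement this, or keep requesting HTTP/1.0, where the server sends the body as a plain byte stream.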

Aha! I knew there’d have been a reason! Back to 1.0 then…