I am having a problem with the URL class. Whenever I request a URL using readEntireTextStream(bool), I always get garbage lines at the start and at the end of the stream. The leading garbage line always seems to be three characters long, and the trailing one is usually “0”. This means that if I request an XML file via URL, it ALWAYS fails to parse.
In addition, on some webpages I get the occasional three-character garbage line somewhere between the beginning and the end.
Is this normal? Can anyone reproduce this?
Also, to reiterate a previous problem I was having, URL doesn’t seem to handle non-ASCII characters properly. Things like curly quotes and em-dashes produce the wrong output.
Testing on Windows shows that the garbage lines at the beginning and the end are not present. Also, readEntireXmlStream works there on the same URL that fails with JUCE on OS X. So this seems to be an OS X-only issue.
As for “special characters” (like: — “ ”) that are not escaped in the page source, they get mangled on Windows as well. It would be nice to get the actual character… I thought we should, since pages are usually encoded in UTF-8 or better.
Test page: http://matadata.com/bucket/test.html has both escaped and non-escaped curly quotes. The non-escaped quotes produce “â” (plus apparently some other invisible stuff) on my system. This page, however, does not display the initial “garbage line”… I don’t know why.
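For what it's worth, the “â” plus invisible characters is what you would expect if the UTF-8 bytes of a curly quote (E2 80 9C for “) were being read one byte per character, Latin-1 style: E2 is “â”, and 80 and 9C are non-printing control codes. That is only a guess at what is happening inside URL. Here is a minimal standalone sketch in plain C++ (no JUCE calls) that shows the difference. The names decodeLatin1 and decodeUtf8 are made up for illustration, and the UTF-8 decoder is simplified: it does not reject overlong forms or surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: treats every byte as one code point (ISO-8859-1 style).
// Applied to UTF-8 input, this is what turns one curly quote into three characters.
std::u32string decodeLatin1 (const std::string& bytes)
{
    std::u32string out;
    for (unsigned char b : bytes)
        out.push_back (static_cast<char32_t> (b));
    return out;
}

// Hypothetical helper: simplified UTF-8 decoder for 1 to 4 byte sequences.
// Malformed or truncated input is replaced with U+FFFD.
std::u32string decodeUtf8 (const std::string& bytes)
{
    std::u32string out;
    std::size_t i = 0;

    while (i < bytes.size())
    {
        const unsigned char lead = static_cast<unsigned char> (bytes[i]);
        std::size_t extra = 0;   // number of continuation bytes expected
        char32_t cp = 0;

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { out.push_back (0xFFFD); ++i; continue; }

        // Sequence runs past the end of the input: give up on the rest.
        if (extra > bytes.size() - i - 1) { out.push_back (0xFFFD); break; }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char> (bytes[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }   // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (! ok) { out.push_back (0xFFFD); ++i; continue; }

        out.push_back (cp);
        i += extra + 1;
    }

    return out;
}
```

Feeding the three bytes "\xE2\x80\x9C" to decodeUtf8 gives the single code point U+201C (the left curly quote), while decodeLatin1 gives three code points, U+00E2, U+0080 and U+009C, which matches the “â” followed by invisible characters.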
My site, http://matadata.com, is hopelessly broken by URL. Newlines basically produce garbage, and there are random characters interspersed throughout. I changed the character encoding of the source files to no effect. It might be how PHP is serving the HTML, but I have no control over that.
http://nytimes.com displays the initial garbage line, and the closing “0” line.
I was seeing the same behavior on Mac OS X doing CDDB lookups over HTTP with JUCE 1.45: garbage bytes at the beginning and end of the message (as it turns out, these are chunked-encoding headers). The same code had worked fine on Windows for months.
Changing the header to HTTP/1.0 in juce_mac_HTTPStream.h seems to have fixed it. Big thanks to jpo for posting that suggestion here.
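For anyone who wants to see what those garbage lines are: with HTTP/1.1 chunked transfer encoding, each chunk of the body is preceded by a line giving its size in hex (often three digits for typical chunk sizes), and the body ends with a chunk of size “0”. Below is a rough standalone sketch, in plain C++ with no JUCE calls, of stripping that framing from a raw body. decodeChunkedBody is a made-up name, not something in JUCE, and the sketch ignores chunk extensions and trailers.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips "Transfer-Encoding: chunked" framing from a raw body.
// Returns true and fills 'out' with the payload if a terminating "0" chunk was found.
// Returns false on malformed or truncated input ('out' may then hold partial data).
bool decodeChunkedBody (const std::string& raw, std::string& out)
{
    out.clear();
    std::size_t pos = 0;

    for (;;)
    {
        // Each chunk starts with a hex size line terminated by CRLF.
        const std::size_t lineEnd = raw.find ("\r\n", pos);
        if (lineEnd == std::string::npos)
            return false;

        std::size_t size = 0;
        std::size_t digits = 0;

        for (std::size_t i = pos; i < lineEnd; ++i)
        {
            const char c = raw[i];
            std::size_t v = 0;

            if      (c >= '0' && c <= '9') v = static_cast<std::size_t> (c - '0');
            else if (c >= 'a' && c <= 'f') v = static_cast<std::size_t> (c - 'a' + 10);
            else if (c >= 'A' && c <= 'F') v = static_cast<std::size_t> (c - 'A' + 10);
            else if (c == ';')             break;          // chunk extension: ignored
            else                           return false;   // not a valid size line

            size = size * 16 + v;
            ++digits;
        }

        if (digits == 0)
            return false;

        pos = lineEnd + 2;   // skip the CRLF after the size line

        if (size == 0)       // the final "0" chunk marks the end of the body
            return true;

        // The chunk data must be followed by its own CRLF.
        const std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2)
            return false;
        if (raw.compare (pos + size, 2, "\r\n") != 0)
            return false;

        out.append (raw, pos, size);
        pos += size + 2;
    }
}
```

For example, the raw body "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n" decodes to "hello world". Read as plain text without decoding, the same body shows a stray size line at the start, more size lines in the middle, and a closing “0” line, which is the pattern reported earlier in this thread. Requesting with HTTP/1.0 avoids the problem because servers should not send chunked responses to HTTP/1.0 clients.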