JUCE::URL doesn’t have a way to parse out all of the standard URL fields. It provides only scheme, host (called “domain”), path (called “subPath”), and query (called “parameters”); to get anything else, you have to call toString and run the result through some other parsing library.
The only thing I need is the port. (Strictly, I need either host:port or domain:port, so that I can generate cross-domain scripting rules for a given URL, but given the port and what’s already there, the rest is easy.) So I’ve just added a getPort method locally, implemented as:
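int start = findStartOfDomain (url);
while (url[start] == '/')
    ++start;
// The net_loc ends at the next '/', or at the end of the string if there is no path.
int end1 = url.indexOfChar (start, '/');
if (end1 == -1)
    end1 = url.length();
// Note: a ':' inside "user:pass@" would be mistaken for the port separator here.
const int end2 = url.indexOfChar (start, ':');
if (end2 == -1 || end2 >= end1)
    return String::empty;   // no explicit port
return url.substring (end2 + 1, end1);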
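For comparison, here is a minimal standalone sketch of the same idea using std::string instead of juce::String. The name extractPort is hypothetical (it is not part of JUCE), and the handling of userinfo, bracketed IPv6 literals, and URLs without a path is my own assumption about what a more defensive version might want to cover:

```cpp
#include <string>

// Hypothetical helper, not part of JUCE: returns the port of an absolute URL
// as a string, or "" if the URL has no explicit port.
std::string extractPort (const std::string& url)
{
    // Skip "scheme://" if present.
    const auto schemeEnd = url.find ("://");
    const std::size_t start = (schemeEnd == std::string::npos) ? 0 : schemeEnd + 3;

    // The net_loc ends at the first '/', '?' or '#', or at the end of the string.
    std::size_t end = url.find_first_of ("/?#", start);
    if (end == std::string::npos)
        end = url.size();

    std::string netLoc = url.substr (start, end - start);

    // Drop "user:pass@" so the password's colon is not mistaken for the port's.
    const auto at = netLoc.rfind ('@');
    if (at != std::string::npos)
        netLoc.erase (0, at + 1);

    // The port separator is the last ':', unless that colon sits inside
    // a bracketed IPv6 literal such as "[::1]".
    const auto bracket = netLoc.rfind (']');
    const auto colon   = netLoc.rfind (':');
    if (colon == std::string::npos
         || (bracket != std::string::npos && colon < bracket))
        return {};

    return netLoc.substr (colon + 1);
}
```

With this, extractPort ("http://host.domain.com:8000") gives "8000" even though there is no trailing slash, and a URL with userinfo but no port gives an empty string.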
However, ideally, a URL class should provide the complete set of URL components. It would also be better to name them correctly. Using “parameters” to mean “query” is especially confusing, because a URL already has a different component called params.
It might be simpler to just look at widely used existing URL parsers for a design. They all cut across the components at different depths (e.g., return the whole net_loc as a string, or break it down into userinfo and hostport, or go all the way down to user, password, host, and port), and they provide different sets of extras (e.g., path-style APIs, or a path object that provides them, or params and query as a list of name-value pairs). If you just clone a popular API, you don’t need to put too much thought into it. Some examples worth looking at: Python urlparse, Cocoa NSURL, JavaScript parseUri, Perl URL::Split, C++ cpp-netlib (see basic_uri).
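As an illustration of one of those extras, here is a rough sketch of returning the query as a list of name-value pairs. The names NameValueList and splitQuery are made up for this example and are not taken from any of the libraries above; it assumes '&' as the separator and does no percent-decoding:

```cpp
#include <string>
#include <utility>
#include <vector>

using NameValueList = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: splits "q1=v1&q2=v2" into ordered (name, value) pairs.
// No percent-decoding is done; a name without '=' gets an empty value,
// and empty items (as in "a&&b") are skipped.
NameValueList splitQuery (const std::string& query)
{
    NameValueList pairs;
    std::size_t pos = 0;

    while (pos < query.size())
    {
        std::size_t amp = query.find ('&', pos);
        if (amp == std::string::npos)
            amp = query.size();

        const std::string item = query.substr (pos, amp - pos);
        if (! item.empty())
        {
            const auto eq = item.find ('=');
            if (eq == std::string::npos)
                pairs.emplace_back (item, "");
            else
                pairs.emplace_back (item.substr (0, eq), item.substr (eq + 1));
        }

        pos = amp + 1;
    }

    return pairs;
}
```

A vector of pairs rather than a map keeps the original order and allows repeated names, which is probably what you want if the URL is going to be rebuilt later.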
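If you want to design it from scratch, URLs are defined by RFC 1738 and RFC 1808, and URIs by RFC 2396 (itself superseded by RFC 3986), following the earlier informational RFC 1630; HTTP URLs in particular are covered by RFC 1738 section 3.3 and the HTTP specification, RFC 2616. For a typical absolute URL (in the “common Internet scheme” or “generic URI” format) like this: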
… you can define fields:
- scheme = “http” // aka protocol
- net_loc = “user:pass@host.domain.com:8000” // aka authority
    - userinfo = “user:pass”
        - user = “user”
        - password = “pass”
    - hostport = “host.domain.com:8000”
        - host = “host.domain.com”
            - domain = “domain.com”
        - port = “8000”
- path = “/path/to/resource”
- params = “param=pval”
- query = “q1=v1&q2=v2”
- fragment = “frag” // aka anchor
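To make the breakdown concrete, here is a minimal sketch of a splitter that produces those fields. UrlParts, cutAfter, and splitUrl are hypothetical names, and the sketch assumes a well-formed absolute URL with a "://" after the scheme and no IPv6 literal in the host. I’m also assuming the example URL is http://user:pass@host.domain.com:8000/path/to/resource;param=pval?q1=v1&q2=v2#frag, which is reconstructed from the field values listed above.

```cpp
#include <string>

// Hypothetical struct; the field names follow the list above.
struct UrlParts
{
    std::string scheme, netLoc, userinfo, user, password,
                hostport, host, port, path, params, query, fragment;
};

// Removes and returns everything after the first `sep` in `s` (dropping the
// separator itself). Leaves `s` untouched and returns "" if `sep` is absent.
static std::string cutAfter (std::string& s, char sep)
{
    const auto pos = s.find (sep);
    if (pos == std::string::npos)
        return {};

    std::string tail = s.substr (pos + 1);
    s.erase (pos);
    return tail;
}

UrlParts splitUrl (std::string url)
{
    UrlParts p;

    // Peel the fragment and the query off the right-hand end first.
    p.fragment = cutAfter (url, '#');
    p.query    = cutAfter (url, '?');

    // Assumption: the scheme, if any, is followed by "://".
    const auto schemeEnd = url.find ("://");
    if (schemeEnd != std::string::npos)
    {
        p.scheme = url.substr (0, schemeEnd);
        url.erase (0, schemeEnd + 3);
    }

    // What remains is net_loc followed by an optional path (with params).
    const auto slash = url.find ('/');
    if (slash != std::string::npos)
    {
        p.path = url.substr (slash);
        url.erase (slash);
    }
    p.params = cutAfter (p.path, ';');
    p.netLoc = url;

    // net_loc = [userinfo@]hostport
    const auto at = p.netLoc.rfind ('@');
    if (at != std::string::npos)
    {
        p.userinfo = p.netLoc.substr (0, at);
        p.hostport = p.netLoc.substr (at + 1);
    }
    else
    {
        p.hostport = p.netLoc;
    }

    // userinfo = user[:password]
    p.user     = p.userinfo;
    p.password = cutAfter (p.user, ':');

    // hostport = host[:port]  (bracketed IPv6 literals are not handled here)
    p.host = p.hostport;
    const auto colon = p.host.rfind (':');
    if (colon != std::string::npos)
    {
        p.port = p.host.substr (colon + 1);
        p.host.erase (colon);
    }

    return p;
}
```

The domain field is deliberately left out: working out which part of a host is the registrable domain generally needs more than string splitting (for example, a list of public suffixes), so it probably belongs in a separate helper.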
