Oh no, have all the trendy kids moved on from JSON to YAML already!?
No, JSON is a proper subset of YAML and as such will always be good to go! Plus, there aren’t official YAML implementations for e.g. JavaScript yet. However, there will be, and since it’s completely compatible…
You’re hitting JSON at exactly the right time (well, perhaps a bit late), at the sweet spot for this technology. With YAML, you’d be an early adopter.
For me, the big advantage of YAML over HSOB is when you’re cutting and pasting records into emails and such, or sending a stream of records through a socket - the fact that it is even more minimal and readable than JSON is another plus.
Is this faster than XML?
At the top level, XML and JSON parsers have the same big-O complexity, but in practice JSON parsers always seem to beat comparable XML parsers by a healthy margin. In some cases the gap is over an order of magnitude, particularly if you are validating the XML against a grammar, schema, DTD or whatever the name is (I’ve tried to repress my XML experiences).
There are many reasons why.
The first is simply that JSON documents are typically substantially shorter than the equivalent XML documents, so there are fewer characters to process.
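To make the size point concrete, here is a small sketch that writes the same record both ways and compares lengths. The record and the XML layout are made up for illustration (there is no single canonical XML encoding of a given record), so treat the result as one data point rather than a benchmark.

```python
import json

# Hypothetical record, invented for this example.
record = {"name": "Ada", "age": 36, "tags": ["math", "engines"]}

# Compact JSON encoding (no spaces after separators).
as_json = json.dumps(record, separators=(",", ":"))

# One plausible element-based XML encoding of the same data, written by hand.
as_xml = (
    "<person><name>Ada</name><age>36</age>"
    "<tags><tag>math</tag><tag>engines</tag></tags></person>"
)

# Every XML tag name appears twice (open and close), which is where
# most of the extra characters come from.
print(len(as_json), len(as_xml))
```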
The second is that XML is simply more complex! Clearly, validating a document against a grammar is a hugely time-consuming process, and one can imagine pathological cases with complex grammars… so let’s simply talk about non-validating XML parsers… but even then there are just more “moving parts”, more states in our state machine.
Finally, the data structures that represent XML consume more memory than the data structures that represent JSON. In most newer programming languages (Python, JavaScript, etc.), JSON arrays and objects correspond directly to the native “array” and “associative dictionary” types, so there is simply no need to create a new data structure at all.
In older languages like C++ and Java, there is no specific canonical/“native” associative dictionary(*), so you have to choose one of the many available classes to do this, but it’ll still be something small and fast.
Either way, you end up with arrays and associative dictionaries, things that your language has numerous tools to deal with already.
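As a minimal sketch of what that looks like in Python (using the standard `json` module; the record itself is made up for illustration), the parsed result is just a `dict` containing a `list`, and the ordinary built-in tools work on it right away:

```python
import json

# Hypothetical record, invented for this example.
text = '{"name": "Ada", "age": 36, "tags": ["math", "engines"]}'

record = json.loads(text)

# The parsed value is built entirely from built-in types.
print(type(record))          # a plain dict
print(type(record["tags"]))  # a plain list

# So the usual language tools apply with no wrapper classes.
tags_sorted = sorted(record["tags"])
field_names = list(record.keys())
```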
An XML element is not so simple. It has attributes and it contains other elements - or it might be CDATA! So you need some special-purpose XML data structure to represent this - a more complex data structure that will occupy more space and resist optimization.
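For comparison, here is a rough sketch of the same sort of record parsed with Python’s standard `xml.etree.ElementTree`. The XML layout is an assumption made up for this example. Note that you get back a dedicated `Element` object, with attributes, child elements and text each living in a different place:

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this example.
xml_text = (
    '<person id="7"><name>Ada</name>'
    "<note><![CDATA[likes <math>]]></note></person>"
)

root = ET.fromstring(xml_text)

# The result is a special-purpose Element, not a dict or a list.
print(type(root))

attrs = root.attrib                         # attributes live in one mapping
child_tags = [child.tag for child in root]  # children come from iteration
note_text = root.find("note").text          # text (including CDATA) lives in .text
```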
But what about validation with a grammar, schema or DTD? Well, you just throw that in the bin completely - it was a worthless idea. You were always validating the data in your program anyway, making sure ages were positive and IDs existed in your database.
So now you parse any old crap that’s valid JSON, do the work if you have the data fields you need or complain if you don’t, and simply ignore any fields you don’t recognize (which is great for forward and backward compatibility).
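That approach might be sketched like this. The field names (`id`, `age`) and the `handle_record` helper are hypothetical, chosen only to match the examples above; a real program would check whatever it actually needs.

```python
import json

# Hypothetical set of fields this program cares about.
REQUIRED_FIELDS = ("id", "age")

def handle_record(text):
    """Parse one JSON record and return (record, error); exactly one is None."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc.msg}"

    if not isinstance(data, dict):
        return None, "expected a JSON object"

    # Complain if something we need is absent.
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        return None, "missing fields: " + ", ".join(missing)

    # Application-level validation that no schema would have done for us.
    age = data["age"]
    if isinstance(age, bool) or not isinstance(age, (int, float)) or age <= 0:
        return None, "age must be a positive number"

    # Unknown fields are silently ignored, which keeps old readers
    # working when newer writers add fields.
    return {name: data[name] for name in REQUIRED_FIELDS}, None
```

A record with an extra field, such as `{"id": 1, "age": 30, "nickname": "x"}`, is accepted and the `nickname` is dropped; a record without `age` comes back with an error message instead.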
OK, but what if you need a data language? Well, XML didn’t provide you with one, did it? I use Google’s free, open source Protocol Buffers for that purpose. It’s a pretty big hammer in some ways, but it generates tight code and has a lot of very fancy features.
Protocol buffers aren’t directly JSON, but they have just the same structure, so writing a wrapper to and from JSON (in fact, YAML) turned out to be a pleasant couple of hours’ work with no issues (given a vast amount of work and a lot of hair-pulling previously on all these topics :-D).
–
(* - actually, I’m thinking this is no longer true in recent versions of Java…)