1. 48
  2. 7

    Great article. Similar things pop up whenever youngsters think they can replace old proven tech with $FLAVOR_OF_MONTH. NoSQL is to SQL as..

    • JSON is to XML
    • Matrix is to XMPP
    • DAB is to FM
    • Hyperloop is to rail
    1. 11

      You mean NoSQL is significantly better for its intended use? Or are you just picking really bad examples?

      JSON is to XML

      JSON is an easy-to-parse serialisation format with a well-defined object model. It has a few weaknesses (no way to serialise 64-bit integers is the big one). Most of the ‘parsing minefield’ problems are related to handling invalid JSON which is far more of a problem with XML because there are so many ways to get things wrong in the standard.

      In a language that has unicode string support, you can write a JSON parser in about 200 LoC (I have). The XML standard is very complicated and so your choices are either something that supports a subset of XML or using libxml2 because no one else can manage to write a compliant parser (and libxml2 doesn’t have a great security record). Even just correctly handling XML entities is a really hard problem and so a lot of things punt and use a subset that doesn’t permit them. Only now they’re not using XML, they’re using a new poorly specified language that happens to look like XML.
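For a sense of scale, here is a rough sketch of a recursive-descent JSON parser in Python. It is not the commenter's parser, just an illustration that the grammar fits in well under 200 lines. All names (`parse_json` and the underscore helpers) are made up for this example, and it deliberately skips one detail: `\uXXXX` surrogate pairs are not recombined.

```python
import re

# Token patterns follow the JSON grammar; [0-9] avoids matching non-ASCII digits.
_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
_WS = re.compile(r"[ \t\n\r]*")
_ESCAPES = {'"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
            "n": "\n", "r": "\r", "t": "\t"}
_LITERALS = {"true": True, "false": False, "null": None}


def parse_json(text):
    """Parse a complete JSON document; raise ValueError on invalid input."""
    value, pos = _value(text, _skip_ws(text, 0))
    pos = _skip_ws(text, pos)
    if pos != len(text):
        raise ValueError(f"trailing data at offset {pos}")
    return value

def _skip_ws(text, pos):
    return _WS.match(text, pos).end()

def _value(text, pos):
    ch = text[pos:pos + 1]
    if ch == "{":
        return _object(text, pos + 1)
    if ch == "[":
        return _array(text, pos + 1)
    if ch == '"':
        return _string(text, pos + 1)
    for word, literal in _LITERALS.items():
        if text.startswith(word, pos):
            return literal, pos + len(word)
    match = _NUMBER.match(text, pos)
    if not match:
        raise ValueError(f"unexpected input at offset {pos}")
    token = match.group()
    is_float = any(c in token for c in ".eE")
    return (float(token) if is_float else int(token)), match.end()

def _string(text, pos):
    # Limitation of this sketch: \uXXXX surrogate pairs are not recombined.
    out = []
    while True:
        ch = text[pos:pos + 1]
        if ch == "" or ch < " ":
            raise ValueError(f"unterminated string or control character at offset {pos}")
        if ch == '"':
            return "".join(out), pos + 1
        if ch != "\\":
            out.append(ch)
            pos += 1
            continue
        esc = text[pos + 1:pos + 2]
        if esc == "u":
            digits = text[pos + 2:pos + 6]
            if not re.fullmatch(r"[0-9a-fA-F]{4}", digits):
                raise ValueError(f"bad \\u escape at offset {pos}")
            out.append(chr(int(digits, 16)))
            pos += 6
        elif esc in _ESCAPES:
            out.append(_ESCAPES[esc])
            pos += 2
        else:
            raise ValueError(f"bad escape at offset {pos}")

def _array(text, pos):
    items = []
    pos = _skip_ws(text, pos)
    if text[pos:pos + 1] == "]":
        return items, pos + 1
    while True:
        item, pos = _value(text, _skip_ws(text, pos))
        items.append(item)
        pos = _skip_ws(text, pos)
        if text[pos:pos + 1] == "]":
            return items, pos + 1
        if text[pos:pos + 1] != ",":
            raise ValueError(f"expected ',' or ']' at offset {pos}")
        pos += 1

def _object(text, pos):
    obj = {}
    pos = _skip_ws(text, pos)
    if text[pos:pos + 1] == "}":
        return obj, pos + 1
    while True:
        pos = _skip_ws(text, pos)
        if text[pos:pos + 1] != '"':
            raise ValueError(f"expected string key at offset {pos}")
        key, pos = _string(text, pos + 1)
        pos = _skip_ws(text, pos)
        if text[pos:pos + 1] != ":":
            raise ValueError(f"expected ':' at offset {pos}")
        obj[key], pos = _value(text, _skip_ws(text, pos + 1))
        pos = _skip_ws(text, pos)
        if text[pos:pos + 1] == "}":
            return obj, pos + 1
        if text[pos:pos + 1] != ",":
            raise ValueError(f"expected ',' or '}}' at offset {pos}")
        pos += 1
```

Calling `parse_json('{"a": [1, 2.5, true]}')` returns an ordinary dict. Invalid input such as a trailing comma or a leading zero raises `ValueError` rather than being silently accepted.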

      XML does a lot more than JSON. It can be both a markup language and an object serialisation language and it allows you to build arbitrary shapes on top, but that’s also its failing. It tries to solve so many problems that it ends up being a terrible solution for any of them.

      Matrix is to XMPP

      I was involved in the XMPP process from around 2002 and for a few years after. It was a complete mess. The core protocol was more or less fine (though it had some issues, including some core design choices that made it difficult to use a lot of existing XML interfaces to parse it) but everything else, including account registration, avatars, encryption, audio / video calling, and file transfer, was defined by multiple non-standards-track proposals, each implemented by one or two clients, many depending on features that weren’t implemented on all servers. There wasn’t a reference server implementation (well, there was. It was jabberd. No, the jabberd2 rewrite. No, ejabberd… The JSF changed their mind repeatedly) and no reference client library, so everything was interoperable at the core protocol level and nothing was interoperable at the level users cared about.

      In contrast, Matrix has a much more fully specified set of core functionality and a permissively licensed reference implementation of the client interfaces.

      DAB is to FM

      DAB uses less bandwidth and requires less transmitter power for the same audio quality than FM. DAB+ (which is now over 15 years old) moved to AAC audio. Most of the early deployment problems were caused by either turning the compression up far too high or by turning the power down to a fraction of what the FM transmitter was using. For the same power budget and aiming for the same audio quality, you can have more stations and greater range with DAB than FM.

      Hyperloop is to rail

      Okay, you can have that one.

      1. 2

        JSON is an easy-to-parse serialisation format with a well-defined object model. It has a few weaknesses (no way to serialise 64-bit integers is the big one). Most of the ‘parsing minefield’ problems are related to handling invalid JSON which is far more of a problem with XML because there are so many ways to get things wrong in the standard.

        Funny, because json.org tells me 64-bit integers are perfectly valid. In fact any size integer is valid.
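Both points about 64-bit integers can hold at once. The grammar puts no bound on integer size, but a parser that stores every number as an IEEE 754 double, as JavaScript's `JSON.parse` does, cannot represent every 64-bit value. A rough sketch in Python, using `parse_int=float` to simulate such a parser; the `id` field is made up for illustration:

```python
import json

big = 2**53 + 1  # 9007199254740993, well inside the signed 64-bit range
text = json.dumps({"id": big})

# Python's json module keeps integers exact.
exact = json.loads(text)["id"]

# Simulate a parser that stores every number as an IEEE 754 double.
lossy = json.loads(text, parse_int=float)["id"]

print(exact)       # 9007199254740993
print(int(lossy))  # 9007199254740992, off by one after the round trip
```

So the document is valid JSON either way; whether the value survives depends on the number type of the parser at the other end.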

        In a language that has unicode string support, you can write a JSON parser in about 200 LoC (I have).

        Don’t write your own parsers. You will get it wrong and make things an even worse mess.

        In contrast, Matrix has a much more fully specified set of core functionality

        I see there is an actual Matrix spec now. Not bad. No RFC though. But you are right that the core spec of XMPP is very barebones. You need to add lots of XEPs on top to make it useful. Modern servers and clients do this. What the Matrix people have done is take developer effort away from XMPP, fracturing the federated chat ecosystem. Yes I’m upset about this.

        One problem with Matrix is the godawful URI syntax. Instead of being able to say user@example.com like every other protocol, the Matrix devs in their junior wisdom decided to go with @user:example.com instead. How do I link to my Matrix account from my website? If things were sensible it would just be matrix:user@example.com. Perhaps matrix:@user:example.com? Or should my OS just know that the protocol “@user” means Matrix? Who knows.

        All this is without getting into perhaps Matrix’s biggest problem: resource use.

        DAB uses less bandwidth and requires less transmitter power for the same audio quality than FM

        Yes, this is all true. But you’re also throwing out one of the main points of broadcast radio: to be able to reach the masses, especially in times of crisis. There are ways of retrofitting FM with digital subcarriers such that existing receivers don’t become paperweights. Because it is FM, you can use GMSK which has quite nice Eb/N0 behavior. Not as nice as OFDM used by DAB but eh.. Good enough.

        edit: I realized I’m wrong about the modulation. It’s always going to be X-over-FM where X is any modulation. It must always run above the stereo pilot wave. Said pilot wave may be omitted, giving mono FM and more bandwidth for subcarriers.

        Anyway, there’s been debate around this in Sweden and the only people who want DAB are the people selling DAB receivers. The broadcasting people don’t want it, the people running the transmitters don’t want it and there is zero pressure from the public.

        1. 1

          True, radio itself is slowly dying; investing in more channels doesn’t really make sense for the consumer or the producer.

      2. 9

        XML is certainly proven harmful by its list of exploits for a serialization format, of all things. JSON doesn’t have that issue.

        1. 6

          You mean things like the billion laughs attack? That’s not enabled by default in any modern XML parser. JSON has its own set of parsing nightmares, and lacks a standardized way of writing schemas or handling extensions. On top of that you have things like SOAP, XSLT, XPath and so on, all standardized.
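For anyone who hasn't seen it, the arithmetic behind billion laughs is simple: each entity references the previous one ten times, so nine levels multiply a three-byte string by 10^9. A rough sketch that builds the classic document and computes the expanded size without ever handing it to a parser; `build_payload` and `expanded_size` are names made up for this example:

```python
def build_payload(levels=9, fanout=10):
    """Build the classic nested-entity document as a string (never parsed here)."""
    lines = ['<?xml version="1.0"?>', "<!DOCTYPE lolz [", ' <!ENTITY lol0 "lol">']
    for i in range(1, levels + 1):
        refs = f"&lol{i - 1};" * fanout  # each level references the previous one
        lines.append(f' <!ENTITY lol{i} "{refs}">')
    lines.append("]>")
    lines.append(f"<lolz>&lol{levels};</lolz>")
    return "\n".join(lines)


def expanded_size(levels=9, fanout=10, base=len("lol")):
    """Bytes produced if a parser fully expands the top-level entity."""
    return base * fanout ** levels


payload = build_payload()
print(len(payload))     # size of the document on the wire, under a kilobyte
print(expanded_size())  # 3000000000 bytes after full expansion
```

That ratio is why parsers cap or disable entity expansion.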

          1. 4

            Do people write new SOAP APIs anymore? Not sure who is also using XPath or XSLT.

            IMHO, XML is a good document format, but has a lot of ambiguity for serialization (i.e. attributes or elements?).

            1. 2

              Do people write new SOAP APIs anymore?

              The EU does, as do many parts of the Swedish government.

              IMHO, XML is a good document format, but has a lot of ambiguity for serialization (i.e. attributes or elements?).

              This is a bit of a strange one with XML, I agree. Attributes have two useful properties however: there cannot be more than one of each and they don’t nest. This could be enforced on elements with a schema, but that came later..
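The two attribute properties are easy to see with Python's standard `xml.etree.ElementTree`. A small sketch; the `user`, `email`, `id` and `name` names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Elements may repeat (and nest) freely.
doc = ET.fromstring(
    "<user><email>a@example.com</email><email>b@example.com</email></user>"
)
emails = [e.text for e in doc.findall("email")]

# Attributes form a flat mapping from name to string.
tag = ET.fromstring('<user id="42" name="alice"/>')
attrs = dict(tag.attrib)

# A repeated attribute is a well-formedness error, so the parser rejects it.
try:
    ET.fromstring('<user id="1" id="2"/>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True

print(emails, attrs, duplicate_rejected)
```

Uniqueness comes for free with attributes; with elements it has to be stated in a schema.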

            2. 4

              I’m not aware of any JSON parsing nightmares, could you elaborate?

              1. 9

                This article posted on this very site a day or two ago: Parsing JSON is a Minefield

                1. 12

                  If parsing JSON is a minefield, parsing XML is a smoking crater.

                  Look, XML is fine as a document description language, but it’s crazy to pretend that it is somehow a superior ancestor to JSON. JSON and XML just do different things. JSON is a minimal, all-purpose serialization format. XML is a document description language. You can of course cram anything into anything else, but those are different jobs and are best treated separately.

                  1. 5

                    And now we have things like JWT, where instead of DoS via zip bombing (which is effectively what entity expansion is), we can just jump straight to “you don’t need to check my credentials, I’m allowed to do admin things” attacks.
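One well-known instance of this class of bug, assumed here to be the kind of thing meant, is a verifier that lets the token's own header pick the algorithm, including `"alg": "none"`. A rough sketch using only the standard library; `sign`, `naive_verify` and `strict_verify` are made-up names and do not come from any real JWT library:

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(key: bytes, head: str, body: str) -> bytes:
    return hmac.new(key, f"{head}.{body}".encode(), hashlib.sha256).digest()


def sign(payload: dict, key: bytes) -> str:
    head = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    return f"{head}.{body}.{b64url(_mac(key, head, body))}"


def naive_verify(token: str, key: bytes) -> dict:
    """Flawed on purpose: trusts the attacker-controlled header."""
    head, body, sig = token.split(".")
    alg = json.loads(b64url_decode(head))["alg"]
    if alg == "none":
        return json.loads(b64url_decode(body))  # no signature check at all
    if alg == "HS256" and hmac.compare_digest(_mac(key, head, body), b64url_decode(sig)):
        return json.loads(b64url_decode(body))
    raise ValueError("invalid token")


def strict_verify(token: str, key: bytes) -> dict:
    """Accepts only the one algorithm the server itself chose."""
    head, body, sig = token.split(".")
    if json.loads(b64url_decode(head)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    if not hmac.compare_digest(_mac(key, head, body), b64url_decode(sig)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(body))


key = b"server-secret"
forged = b64url(b'{"alg":"none"}') + "." + b64url(b'{"admin":true}') + "."
print(naive_verify(forged, key))  # {'admin': True}, accepted without the key
```

`strict_verify(forged, key)` raises `ValueError`, while a token produced by `sign` passes both functions. The flaw is in letting the message choose how it is checked, not in the JSON encoding as such.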

                    Like it or not, JSON the format is being transformed into JSON the protocol stack, with all the trouble that implies. Just as XML the format was turned into XML the protocol stack in the last age.

                    1. 5

                      JWT is just poorly designed, over and above its serialization format. But as bad as it is, it is significantly more sane than whatever the SAML people were thinking. To be fair though, both JSON and XML are better than ASN.1. In all cases, the secure protocol implementers chose an off-the-shelf serialization format, which was a significant mistake for something that needs totally different security properties than ordinary serialization. One would hope that the next scheme to come along won’t do this, but I’m guessing it will just be signed protobufs or some such, and the same problems will occur.

                    2. 3

                      billion laughs

                      I already addressed this.

                      XML is mature and does everything JSON does and more. Its maturity is evident in the way JSON people try to reinvent everything XML can already do. From a langsec perspective the only thing JSON has going for it is that it is context-free. There are XML dialects that have this property as well, if I remember correctly.

                      1. 2

                        does everything JSON does and more.

                        My suggestion is that “and more” is bad.

                        1. 1

                          Tooling is good actually. And as I said to the other person, JSON people are busy reinventing most tools that already exist for XML.

                          1. 1

                            JSON people are busy reinventing most tools that already exist for XML

                            Are they? Things I never use: JSON Schema (just adds noise to an internal project; can’t force it on an external one); JPath (your data should not be nested enough to need this); code generators beyond https://mholt.github.io/json-to-go/ (if your code can be autogenerated, it is a pointless middle layer and should be dropped); anywhere you’d use SAX with XML, you can probably use ND-JSON instead; XSLT is a weird functional templating language (don’t need another templating language, thanks)… Is there something I’m missing? I mean, the internet is big, and people reinvent everything, but I can’t say that there are XML tools that I’m jealous of.

                            Maybe we’re in different domains though. I just can’t really imagine having a job where I’m confused about whether to use XML or JSON. The closest is today I saw https://github.com/portabletext/portabletext which is a flavor of JSON for rich text. But I think that project is mistaken and it should just define a sane subset of HTML it supports instead of creating a weird mapping from HTML to JSON.
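To make the ND-JSON point concrete: with one JSON value per line, a reader can process records as they arrive without holding the whole input in memory, which is roughly the job SAX does for XML. A minimal sketch; `iter_records` and the sample records are made up for illustration:

```python
import io
import json


def iter_records(stream):
    """Yield one parsed value per non-empty line of a newline-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Any line-iterable text stream works; StringIO stands in for a file or socket.
data = io.StringIO(
    '{"id": 1, "ok": true}\n'
    '{"id": 2, "ok": false}\n'
    "\n"
    '{"id": 3, "ok": true}\n'
)
ok_ids = [record["id"] for record in iter_records(data) if record["ok"]]
print(ok_ids)  # [1, 3]
```

The trade-off is that each record must fit on one line and parse on its own, so this covers streams of flat records, not one huge nested document.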

                            1. 1

                              Things I never use

                              Yes, you never use them. But there are people who try to write protocols using JSON and they just end up reinventing XML, poorly. This means yet another dependency for everyone to pull in. Someone using JSON in their proprietary web app matters little. Someone baking it into an RFC matters a lot.

          2. 1

            When considering data storage, I like my bitemporal graph database of choice. The main advantage is being built on top of traditional data storage, instantly making it seem robust.