1. 21

  2. 10

    Note as prior art: Hadoop often uses an odd one-JSON-object-per-line format for the same reason, so you can parse it a lump at a time. I don’t see any real benefit in CSJ over that, since you could always skip the whole ‘comma separated’ bit and just make each line an array if that’s what you really want.
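
    A minimal sketch of reading the one-value-per-line format, using only Python's standard `json` module. The helper name `iter_json_lines` and the sample data are made up for illustration; in practice the stream would be an open file.

    ```python
    import io
    import json

    def iter_json_lines(stream):
        """Yield one parsed JSON value per non-empty line of the stream."""
        for line in stream:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)

    # Hypothetical sample input: two objects, then a line that is just an array.
    sample = io.StringIO('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n[3, "c"]\n')
    records = list(iter_json_lines(sample))
    ```

    Only one line is held in memory at a time, and a line can be an array just as easily as an object.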

    However, I think this is missing the point: your parser should be able to deal with huge JSON documents incrementally. XML SAX parsers could. See also this blog rant: http://nick.zoic.org/etc/deserialize-alter-serialize-an-antipattern/

    1. 2

      CSJ has one glaringly obvious benefit: you don’t need to repeat the same set of keys on every single line. (And each line is implicitly an object, never any other kind of JSON value.) Compression can neutralise much of that difference, but not all of it.

      Of course that only constitutes a benefit if your data is actually tabular, rather than a sequence of objects with wildly variable shapes.
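
      A rough way to see the difference for tabular data, as a sketch only. The rows are invented, and the CSJ encoding here is an assumption: a header line of keys followed by lines of bare comma-separated JSON values.

      ```python
      import json

      # Hypothetical tabular data: every row has the same three keys.
      keys = ["id", "name", "active"]
      rows = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(1000)]

      # One JSON object per line: the keys are repeated on every line.
      object_lines = "\n".join(json.dumps(row) for row in rows)

      def csj_line(values):
          # Serialize as a JSON array, then drop the surrounding brackets.
          return json.dumps(values)[1:-1]

      # CSJ-style: keys appear once in the header, rows carry values only.
      csj = "\n".join([csj_line(keys)] + [csj_line([row[k] for k in keys]) for row in rows])

      print(len(object_lines), len(csj))
      ```

      This compares uncompressed sizes only; running both strings through a compressor would narrow the gap.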

      1. 1

        I was coming to say exactly the same as your first point: line-separated JSON has a lot of upsides and no downsides compared to CSJ.

        1. 1

          See also Go, it has built-in support in the stdlib to parse such streams of JSON values: https://golang.org/pkg/encoding/json/#Decoder (e.g. https://play.golang.org/p/Y8MVKZVglf), and presumably other languages already have support for this too, compared to the proposed CSV-JSON. JSON values are already well-defined, so you don’t need a separator to know when one begins/ends (though whitespace is useful for humans, truenull9.12false is not super user-friendly).
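
          For comparison, a small Python sketch of the same idea using the standard library's `json.JSONDecoder.raw_decode`, which returns a parsed value together with the index where it ended. The helper name `iter_json_values` is made up, and unlike Go's `Decoder` this version works on a string already in memory, so it only illustrates that JSON values are self-delimiting, not true streaming.

          ```python
          import json

          def iter_json_values(text):
              """Yield each JSON value from a string of concatenated JSON values."""
              decoder = json.JSONDecoder()
              pos = 0
              while True:
                  # raw_decode does not skip leading whitespace, so do it here.
                  while pos < len(text) and text[pos].isspace():
                      pos += 1
                  if pos >= len(text):
                      return
                  value, pos = decoder.raw_decode(text, pos)
                  yield value

          # Whitespace between values is optional for the parser.
          spaced = list(iter_json_values('{"a": 1} [2, 3]\n"four"'))
          packed = list(iter_json_values("truenull9.12false"))
          ```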

        2. 6

          The problem with JSON is that [..] to parse it you have to build everything in one go back into memory.

          Uhm… No, you don’t: yajl in C, ijson in Python. Event-based parsing as you’d expect.

          (Disclosure: I’m the author of ijson)

          1. 1

            Ijson is fantastic for GeoJSON, thank you.

            1. 1

              SBJson 4 has a (I think) simple block-based interface for doing incremental JSON parsing in Objective-C. There’s a video of a talk I did about it, with admittedly dreadful audio. The slides for that talk are also available.

              (Disclosure: I am the author of SBJson.)

            2. 4

              Having a header line seems to limit the usefulness of this as a streaming format.

              1. 3

                This is… actually a great idea.

                Being able to stick a “[” and “]” around the line and then feed it through a standard JSON parser is great.
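
                A sketch of that trick in Python, assuming the first line is a header of quoted keys and every later line is a comma-separated list of JSON values. The function name `iter_csj` and the sample data are invented for illustration.

                ```python
                import io
                import json

                def iter_csj(stream):
                    """Yield one dict per data line of a CSJ-like stream."""
                    lines = (line.strip() for line in stream)
                    lines = (line for line in lines if line)  # skip blank lines
                    first = next(lines, None)
                    if first is None:  # empty input: nothing to yield
                        return
                    # Wrapping a line in brackets turns it into a JSON array.
                    header = json.loads("[" + first + "]")
                    for line in lines:
                        yield dict(zip(header, json.loads("[" + line + "]")))

                # Hypothetical sample: values can be any JSON value, including arrays.
                sample = io.StringIO('"id","name","tags"\n1,"a",["x","y"]\n2,"b",null\n')
                rows = list(iter_csj(sample))
                ```

                Each data line is parsed on its own, so only the header has to be kept around between lines.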

                1. [Comment removed by author]

                  1. 3

                    I suspect the extra [] characters would push it back over the line from “barely structured” into “structured”, at least visually. People seem to like the fact that most CSV text doesn’t look like data, but like something they could type by hand.