1. 73
    1. 7

      Sometimes I complete an exercise in jq over on Exercism and feel pretty good about myself. Then I remember that things like jqjq exist, and I feel humbled. BTW, jqjq is by one of our very own: wader.

      1. 11

        👋 i’m a bit surprised that it worked out so well :) i chugged along at various small parts for quite a while. hardest was probably parsing of infix operators, and most mind-bending was probably jq’s function “thunk”/lambda args. but most of the other stuff felt quite smooth. probably the most novel thing, and the thing i’m most satisfied with, is that jqjq can eval itself (given enough memory) and that it has an eval/1 function that is a valid path expression, so it can be used in assignments and updates!

        $ ./jqjq -n '{a: 1} | eval(".a, .b") += 1'
        {
          "a": 2,
          "b": 1
        }
        
        1. 2

          jq is an impressive bit of code and it’s awesome how quickly it’s become A Standard Tool. I’m curious — what would you have done differently, knowing what you know now?

          1. 5

            Hey! jq was mostly designed by Stephen Dolan (who works on OCaml nowadays!) and he is not actively working on jq anymore. I’ve sent him some messages asking about similar things but no reply yet, i’ll let you know if i receive something.

            But i know there are some unfortunate design choices that can’t really be changed now as it would break existing queries. Some that come to mind:

            The //-operator can be confusing: https://github.com/jqlang/jq/issues/2042

            Some naming and conventions in the standard library.

            Output order of “index”-suffixes like .a[1,2]["b","c"] can be a bit confusing, and it also re-evaluates the first expression if i remember correctly. That can be very confusing when using input/0 etc. that has side-effects.

            Have to go, but I might come back later to add more :)

            1. 1

              Thank you! That’s exactly the kind of hard experience I was hoping to hear about!

      2. 6

        The jq as a PEG Engine page linked is really interesting … I did not know jq was “like that” !

        1. 5

          Yeap! the generator/backtracking nature of jq makes it very nice for this… if you ignore how to use it to produce sane error messages on parse errors, which i haven’t figured out yet :)

          1. 1

            Hm yeah this is fascinating, I searched the jq manual for “backtrack” and got only 1 hit:

            The empty builtin is the generator that produces zero outputs. The empty builtin backtracks to the preceding generator expression.

            There is a lot more talk about “generators”, but to me that doesn’t imply backtracking. So it is weird that this is sorta hidden! (I’ve only used jq a little bit)


            It looks like this is built mainly on the | operator

            If the one on the left produces multiple results, the one on the right will be run for each of those results. So, the expression .[] | .foo retrieves the “foo” field of each element of the input array. This is a cartesian product, which can be surprising.

            And the // operator:

            The // operator produces all the values of its left-hand side that are neither false nor null. If the left-hand side produces no values other than false or null, then // produces all the values of its right-hand side.
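The two manual passages quoted above can be modeled with JavaScript generators. This is only a rough sketch, assuming a "filter" is a function from one input value to a generator of outputs; the names `iterate`, `field`, `pipe`, `alt`, `literal` and `run` are made up for illustration and are not jq internals. Error handling (which jq's `//` also takes into account) is not modeled.

```javascript
// Hypothetical model: a jq "filter" is a function from one input value
// to a generator of zero or more output values.

// `.[]`: yield every element of an array input.
function* iterate(input) {
  yield* input;
}

// `.name`: yield one field of an object input (null when missing).
const field = (name) =>
  function* (input) {
    yield input[name] ?? null;
  };

// A constant filter: ignores its input and yields one value.
const literal = (v) =>
  function* () {
    yield v;
  };

// `f | g`: run g once for every output of f.
const pipe = (f, g) =>
  function* (input) {
    for (const mid of f(input)) {
      yield* g(mid);
    }
  };

// `f // g`: all outputs of f that are neither false nor null;
// if there are none, all outputs of g.
const alt = (f, g) =>
  function* (input) {
    let produced = false;
    for (const v of f(input)) {
      if (v !== false && v !== null) {
        produced = true;
        yield v;
      }
    }
    if (!produced) yield* g(input);
  };

// Collect every output of a filter for one input.
const run = (filter, input) => [...filter(input)];

// `.[] | .foo`
console.log(run(pipe(iterate, field("foo")), [{ foo: 1 }, { foo: 2 }])); // [ 1, 2 ]
// `.a // "default"`
console.log(run(alt(field("a"), literal("default")), { a: null })); // [ 'default' ]
```

Note that `alt` has to drain the left-hand side before it knows whether to fall back to the right-hand side, which is one way to see why `//` behaves differently from a simple "or".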

            OK I will have to play around with this …


            I kinda feel like the manual does not explain the “why?” of jq very well

            But it is interesting, because I guess this power is why jq is more popular than other JSON tools?

            I have looked at the implementation before, and I noticed it is like a real language with a GC ! (unlike awk and shell!)

            1. 4

              Hm yeah this is fascinating, I searched the jq manual for “backtrack” and got only 1 hit:

              There is a lot more talk about “generators”, but to me that doesn’t imply backtracking. So it is weird that this is sorta hidden! (I’ve only used jq a little bit)

              There is a much more technical description of the language in the wiki that has a whole section about it https://github.com/jqlang/jq/wiki/jq-Language-Description#generators-and-backtracking

              But yes it could probably be described more in the documentation as it is so fundamental and also a big reason jq is so great for some things.

              It looks like this is built mainly on the | operator

              Yeap! and it’s similar to how the pipe operator works in other functional languages plus the multiple outputs behaviour.

              OK I will have to play around with this …

              I can recommend! :) and if you, like me, spend quite a lot of time massaging and querying data in all kinds of formats then jq really shines. For example the string and regex functions are quite nice to use to “parse” non-JSON into jq values, and if you want to query even binary formats (and some text formats) you might want to have a look at my jq variant https://github.com/wader/fq

              I kinda feel like the manual does not explain the “why?” of jq very well

              It’s a bit of a mystery to all the current maintainers what the original ideas and thoughts were. Sadly stedolan, who created and designed most of jq, is not active in the project anymore (he works on OCaml nowadays). I’ve sent him some questions but no replies yet.

              But it is interesting, because I guess this power is why jq is more popular than other JSON tools?

              Mm i’m not sure actually, it seems like most of the really neat things jq can do are quite unknown and hidden. I’ve been working hard to bring them to light. Maybe most of it is just that it was a very early tool that allowed access and formatting of JSON?

              I have looked at the implementation before, and I noticed it is like a real language with a GC ! (unlike awk and shell!)

              The jq implementation in C is, i think, mostly reference counted, or what do you mean by GC here? yes i would say it’s a real language that “happens” to use a JSON-superset as syntax and has the same types.

              I’ll try to summon Nico to this thread, he is a jq old-timer and might have some more answers.

              1. 1

                Hm I guess the main thing about jq is that it’s sort of “point-free / higher order” …

                the expressions create filters, not values ??

                https://jqlang.github.io/jq/manual/#identity

                I was going to ask if . was influenced by Go templates, which use dot (.) as “the cursor”

                which was in turn influenced by my JSON Template project from 2009, which used @ as “the cursor”

                https://www.oilshell.org/blog/2023/06/ysh-design.html#the-first-json-language-i-designed-2009

                However that appears to be mostly superficial. The . is thought of as a filter and not a value. It’s the identity filter

                Go templates / JSON Template are based on walking the tree, not a sequence of filters on a tree (or really a stream of trees! )

                1. 3

                  Hm I guess the main thing about jq is that it’s sort of “point-free / higher order” …

                  the expressions create filters, not values ??

                  Yes I think so too, that is one of my favourite parts of it. Generators (no need for loop syntax) and “one-liner-friendly” syntax make it very good for ad-hoc and exploratory querying and programming.

                  The documentation uses the term filter a bit vaguely. Maybe it’s better to think of all expressions as generators?

                  https://jqlang.github.io/jq/manual/#identity

                  I was going to ask if . was influenced by Go templates, which use dot (.) as “the cursor”

                  which was in turn influenced by my JSON Template project from 2009, which used @ as “the cursor”

                  https://www.oilshell.org/blog/2023/06/ysh-design.html#the-first-json-language-i-designed-2009

                  However that appears to be mostly superficial. The . is thought of as a filter and not a value. It’s the identity filter

                  Go templates / JSON Template are based on walking the tree, not a sequence of filters on a tree (or really a stream of trees! )

                  Sorry no idea about the origin, maybe it came naturally from how the .a.b.c index syntax looks?

                  But talking about cursor! if you want to dig deeper into how jq works i can recommend learning about “path expressions” https://github.com/jqlang/jq/wiki/jq-Language-Description#path-expressions when a jq program runs it kind of keeps track of where it is in the input (if possible), and this is how assignment and update are implemented.
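To make the path idea a bit more concrete, here is a small JavaScript sketch of an update driven by paths. It is a simplified illustration and not jq's actual implementation: a path is assumed to be an array of object keys, and the helper names `getpath`, `setpath`, `update` and `aThenB` are chosen for this example (the first two mirror jq builtins of the same name, but only handle objects here).

```javascript
// Hypothetical model: a "path expression" is a generator of paths,
// where a path is an array of keys, e.g. ["a"] for `.a`.

// Read the value at a path; missing keys read as null, as in jq.
function getpath(value, path) {
  let cur = value;
  for (const key of path) {
    if (cur === null || cur === undefined) return null;
    cur = cur[key] ?? null;
  }
  return cur ?? null;
}

// Return a copy of `value` with `newValue` stored at `path`
// (objects only in this sketch; jq also handles arrays).
function setpath(value, path, newValue) {
  if (path.length === 0) return newValue;
  const [key, ...rest] = path;
  const base = value ?? {};
  return { ...base, [key]: setpath(base[key] ?? null, rest, newValue) };
}

// `paths |= f`: fold over every path, replacing the value found there.
function update(input, paths, f) {
  let result = input;
  for (const path of paths(input)) {
    result = setpath(result, path, f(getpath(result, path)));
  }
  return result;
}

// `.a, .b` as a path expression: two paths, one after the other.
function* aThenB() {
  yield ["a"];
  yield ["b"];
}

// Roughly `{a: 1} | (.a, .b) += 1`
console.log(update({ a: 1 }, aThenB, (v) => v + 1)); // { a: 2, b: 1 }
```

The missing `.b` reads as null, and null + 1 is 1 in both JavaScript and jq, which matches the `eval(".a, .b") += 1` output shown at the top of the thread.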

                  1. 1

                    I’ve felt like . in jq was like the shorthand for self::node() in XSL.

                    1. 2

                      Hm but is that a “noun”? I think that is what I was getting at – in jq they think of it more as a “verb” / function / generator

                      In JSON Template / Go Templates the . or @ is a noun for sure! But they have a much different execution model, despite being both based on JSON

                      I never really “got” jq – I am playing with it a bit now – but I think thinking of it as a verb is better


                      This kind of relates to my blog post from way back - Pipelines Support Vectorized, Point-Free, and Imperative Style

                      hist() {
                        "$@" | sort | uniq -c | sort -n -r
                      }
                      

                      There hist is sort of a verb that is composed from 3 other verbs – sort, uniq, sort

                      There are no nouns! You’re composing “functions” and not data – it’s higher order

                      I think there needs to be “jq - The missing manual” because I was certainly missing this!

                      1. 2

                        Hm actually I played with jq a ton as a result of this post

                        And I don’t see any reason that . can’t be thought of as a “noun” – the “cursor” or “this” or “it” in Perl, which is $_

                        However they define it as a “filter”, a verb:

                        https://jqlang.github.io/jq/manual/#basic-filters

                        The absolute simplest filter is . . This filter takes its input and produces the same value as output. That is, this is the identity operator.

                        Since jq by default pretty-prints all output, a trivial program consisting of nothing but . can be used to format JSON output from, say, curl.

                        The wiki also defines . as a “generator” or “thunk”


                        I thought I would see a difference once defining functions:

                        $ seq 3 | jq 'def inc(n): . + n; inc(3) | inc(-1)'
                        3
                        4
                        5
                        

                        But to me that reads fine if . is the “current thing”


                        So yeah I find the docs kind of confusing, as even the current maintainers appear to!

                        1. 1

                          However they define it as a “filter”, a verb:

                          I only recently joined as jq maintainer so I can’t speak for what Stephen Dolan had in mind. But i haven’t read or seen anything indicating that the idea was to chain together verbs etc. I guess things have grown more ad-hoc.

                          The wiki also defines . as a “generator” or “thunk”

                          I thought I would see a difference once defining functions:

                          How I think about it is that in the filter pipeline each part of the filter can be seen as a generator or thunk with one implicit argument, the current input, and . is a shorthand for a function that just returns/outputs it.

                          So yeah I find the docs kind of confusing, as even the current maintainers appear to!

                          Probably, but as I mention above I’m not sure there ever was any grand idea to start with :) maybe Nico, one of the jq old-timers, knows more.

                          1. 1

                            $ seq 3 | jq 'def inc(n): . + n; inc(3) | inc(-1)'

                            This is probably obvious to you, but to be clear about what I mean by “verbs” – | is used like the function composition operator, and inc(3) and inc(-1) are verbs

                            https://github.com/oilshell/blog-code/blob/master/jq/verbs.js

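                            // inc(n) builds a "verb": a function from the current value (dot) to dot + n
                            function inc(n) {
                              return function (dot) {
                                return dot + n;
                              };
                            }
                            
                            // big difference is "cartesian product" behavior, but I think
                            // this is a reasonable description of common usage?
                            // pipe(f, g) models `f | g`: f runs first and its result feeds g
                            function pipe(f, g) {
                              return function (dot) {
                                return g(f(dot));
                              };
                            }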
                            

                            I guess it would be more interesting to implement the generator behavior, but I guess the point is that f and g are verbs. (Somewhat contradicting what I wrote yesterday)
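For what it's worth, the generator behavior can be sketched in the same style as the inc/pipe functions above. This is a hypothetical extension for illustration, not code from verbs.js: every verb maps dot to a generator of outputs, and an argument like `n` is itself a verb that is run against dot, which is one way to read jq's "thunk" arguments. The names `lit`, `comma` and `run` are invented here.

```javascript
// Hypothetical extension of the inc/pipe idea: every verb maps one
// input ("dot") to a generator of outputs, and arguments such as `n`
// are themselves verbs that get run against dot.

// A constant: ignores dot and yields one value.
const lit = (v) =>
  function* (_dot) {
    yield v;
  };

// `f, g`: all outputs of f, then all outputs of g.
const comma = (f, g) =>
  function* (dot) {
    yield* f(dot);
    yield* g(dot);
  };

// `def inc(n): . + n;` where n is a verb, not a plain number.
const inc = (n) =>
  function* (dot) {
    for (const amount of n(dot)) {
      yield dot + amount;
    }
  };

// `f | g`: feed each output of f into g.
const pipe = (f, g) =>
  function* (dot) {
    for (const mid of f(dot)) {
      yield* g(mid);
    }
  };

// Run a verb over a stream of inputs, like `seq 3 | jq '...'`.
function run(verb, inputs) {
  return inputs.flatMap((dot) => [...verb(dot)]);
}

// seq 3 | jq 'def inc(n): . + n; inc(3) | inc(-1)'
console.log(run(pipe(inc(lit(3)), inc(lit(-1))), [1, 2, 3])); // [ 3, 4, 5 ]
// A generator as argument, like inc(10, 20): one output per value of n.
console.log(run(inc(comma(lit(10), lit(20))), [1])); // [ 11, 21 ]
```

With single-output verbs this collapses back to plain function composition, so the simpler version is still a fair description of common usage.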

                            But maybe what I was thinking yesterday is that it doesn’t really help people understand it to make the noun/verb distinction ??

                            Most people can think of . as “this item” … or I suppose you can think of it both ways, as both a noun and a verb


                            But I think that’s what’s meant here …

                            https://github.com/jqlang/jq/wiki/jq-Language-Description

                            second-class higher-order functions of dynamic extent

                            I will probably have some other questions later, since I spent a lot of yesterday learning jq :)

                            1. 2

                              I guess it would be more interesting to implement the generator behavior, but I guess the point is that f and g are verbs. (Somewhat contradicting what I wrote yesterday)

                              Yeap something like that. You’re working on a blog post?

                              But I think that’s what’s meant here … https://github.com/jqlang/jq/wiki/jq-Language-Description

                              second-class higher-order functions of dynamic extent

                              Not sure i know what second-class higher-order means here, the few references i find are from the jq wiki :)

                              Most people can think of . as “this item” … or I suppose you can think of it both ways, as both a noun and a verb

                              Ah, think i see what you’re getting at, so . would in that case be referred to as “pass” or “passthru” etc?

                              I will probably have some other questions later, since I spent a lot of yesterday learning jq :)

                              👍 Great to hear! happy to answer any questions. There is also an IRC channel and a Discord server with a bunch of jq enthusiasts

                2. 1

                  That’s interesting: I also hadn’t thought of it as backtracking, but it definitely is stream-oriented. But streams and backtracking are closely related, because they are two different ways to implement logic programming (specifically the “nondeterminism” part).

                  I guess when you resume a generator, that is like backtracking to the most recent ‘yield’, and choosing to continue rather than return the yielded value.

                  1. 2

                    Yes, maybe an example makes it more clear. Here (1,2) first outputs 1 and the rest of the pipeline runs (with $a bound to that value); at some point execution backtracks so that 2 is output, and so on…

                    $ jq -cn '(1,2) as $a | (3,4) as $b | [$a,$b]'
                    [1,3]
                    [1,4]
                    [2,3]
                    [2,4]
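The same evaluation order can be imitated with nested JavaScript generators. This is just an analogy under the assumption that each `as` binding behaves like a loop over a generator; the `values` and `program` helpers and the trace strings are invented for this sketch.

```javascript
// Hypothetical model of `(1,2) as $a | (3,4) as $b | [$a,$b]`.
// Each `as` binding becomes a loop over a generator; "backtracking" is
// control returning to the nearest loop that still has values left.

function* values(...items) {
  yield* items;
}

function* program(trace) {
  for (const a of values(1, 2)) {
    trace.push(`bind $a=${a}`);
    for (const b of values(3, 4)) {
      trace.push(`bind $b=${b}`);
      yield [a, b];
    }
    // inner generator exhausted: back to the outer one for its next value
  }
}

const trace = [];
for (const out of program(trace)) {
  console.log(JSON.stringify(out));
}
console.log(trace.join("; "));
```

The trace shows `$b` being rebound for every value of `$a`, which is the backtracking step described above: once (3,4) runs out of values, execution falls back to (1,2) and asks it for its next output.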
                    
                3. 2

                  I had no idea jq was a fully fledged language

                  1. 1

                    Ahah that’s awesome