1. 8

  2. 2

    This is awesome

    One solution is to guard the value (second m) with a dynamic check. Another is to apply type tailoring to regexp-match!

    If I understand this correctly, the type tailored code does exactly that, but is syntactically elided at each use of the macro. Right? Could this be extended to statically enforce constraints without run-time checks?

    1. 3

      I don’t think that’s right. As I understand it, the type errors are statically checked. The dynamic check is necessary in Typed Racket today due to designed-in limitations of the type checker, similar to how you must check for null in Java (although Racket does better than Java, IMHO, in that it conservatively rejects the unsafe code). The type tailored solution statically shows that the code rejected by Typed Racket is safe at compile time.

      1. 3

        The type errors are checked statically, but only because the macro call expands to code that rebuilds the match function’s result. That checking has runtime overhead, including the list constructor and the if/error checks. What I’m asking is whether that cost can be eliminated.

        Surely some of it can: The occurrence typing could learn more about composites, so that you could traverse the list, asserting things along the way, without recreating the list.

        But some of it is less clear: Could you avoid the if/error checks completely with more knowledge of the relationship between the pattern and the return value? Could you do that only with a call-site macro, or do you need co-operation from the called code?

        1. 2

          I am admittedly hitting the bounds of my Racket knowledge here (I’ve written one or two macros in my life, so the bounds aren’t very big!), but as I understand the article and the macro it does statically use the value of the regex in the call, e.g.:

          smart-match rejects some programs that Typed Racket would accept, yet evaluate to runtime errors — for instance, (second (regexp-match "r" "racket")).

          So I think the answer is yes, you do have enough knowledge to do what you ask with a less trivial macro, but I’ll repeat that I’m out of my depth here!

          The return of their function is always a list, so I don’t see how you avoid that constructor as written. It would be neat if you could say, “Well, the only thing done with that list is to call second, so….” but that seems to be outside of the scope of the macro, if I understand correctly? I guess if all this concerned you then you could change the macro definition to say you wanted a specific group as an argument and then it might work?

          1. 3

            I don’t see how you avoid that constructor as written.

            There are two calls to the list constructor. One inside the match function that produces the result. And the other is in the macro expansion that takes that result and recreates it (exactly as is!) but throws a runtime-error if it violates some static assumption.

            I’m proposing replacing this:

              (car m)
              #,@(for/list ([i (in-range num-groups)])
                   #`(or (list-ref m (+ 1 #,i))
                         (error 'smart-match))))

            with this:

            #,@(for ([i (in-range num-groups)])
                 #`(or (list-ref m (+ 1 #,i))
                       (error 'smart-match))))

            Note the outer list constructor is gone, but more importantly the functional ‘for/list’ is replaced by an imperative ‘for’ that returns void.

            For an even simpler example of the sort of thing I’m interested in, consider this:

            (if (pair? x) (car x))

            vs this:

            (if (pair? x) (unsafe-car x))

            The first will check pair? twice. The second won’t.

            Presumably you could macro expand to unsafe calls, or the compiler could use constant propagation or global value numbering to eliminate the duplicate check. So my question (to the authors I guess): How can this “type tailoring” technique eliminate any check at all? Not just N -> 1, but 1 -> 0 checks.

            1. 3

              We want to convince Typed Racket’s type checker that the result of regexp-match has a second element.

              smart-match does this by unrolling the result into a list so that the number of results is syntactically apparent. For the given example, here is the actual program that runs after the smart-match macro expands:

              (let ([m
                     (let ([m (regexp-match "(a*)b" "aaabbb")])
                       (if m
                           (list ; list constructor called with two arguments
                            (car m)
                            (or (list-ref m (+ 1 0))
                                (error 'smart-match)))
                (when m
                  (string-length (second m))))

              Typed Racket knows that it is for sure safe to call second because it sees the list constructor called with two arguments.

              Your suggestion won’t quite work because for returns void, so the macro would try to splice void into the outer syntax object (#,@ is shorthand for unquote splicing), resulting in an error. (You can try it out, the post contains all the code needed to run the example).

              The actual implementation probably does use unsafe operators as you suggested though. (Actually, Typed Racket might automatically apply that optimization itself.)

              1. 1

                Thanks for posting this! I searched both your blog post and your paper for the expanded macro and I failed to find it. It would be a nice thing to put in the appendix of the post. Thanks for writing it; I enjoyed the paper!