
1.

    Hello and interesting post!

    I noticed that you didn’t include Dylan, which was one of the first languages to tackle a more complex macro system in infix-syntax code.

    Perhaps that was due to reasons that you indicated in your post (but you didn’t name languages, so it is difficult to tell):

    “I had to cut a number of languages from early drafts of this post because their docs were so poor or I needed extensive knowledge of the grammar or even compiler itself!”

    Anyway, our macro system is documented in a few places independently:

    • Dylan Reference Manual, but this is somewhat dry.
    • Dylan Programmer’s Guide, an introductory book; one chapter actually contains a definition of swap.
    • D-Expressions: Lisp Power, Dylan Style, a paper that describes both the existing standard macro system and a significantly more powerful version of it that is used extensively within the Open Dylan compiler implementation.
    • The Dylan Macro System is an article written by one of our users to explain the macro system in his own words (and more informally than the Dylan Reference Manual).

    As for your examples …

    In Dylan, a swap macro is pretty easy:

    define macro swap!
      { swap! (?place1:expression, ?place2:expression) }
      =>
      { let value = ?place1;
        ?place1 := ?place2;
        ?place2 := value; }
    end;
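
    For example, one might use it like this (a minimal sketch; format-out comes from the io library):

    begin
      let x = 1;
      let y = 2;
      swap!(x, y);
      format-out("%d %d\n", x, y); // prints 2 1
    end;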
    

    Dylan macros are hygienic, so there are no problems with that. This is also pretty similar to syntax-rules from Scheme.

    As for each-it, that isn’t hard either: unlike syntax-rules, Dylan makes it easy to violate hygiene when needed, without a lot of ceremony:

    define macro each-it
      { each-it (?collection:expression)
          ?:body
        end }
      =>
      { for (?=it in ?collection)
          ?body
        end };
    end;
    

    This wasn’t hard either … the ?=it just means that, instead of being a substitution from the pattern like ?collection and ?body, it is a deliberate hygiene violation: the name it is made visible at the macro call site.

    Using it is simple:

      each-it (arguments)
        format-out("%=\n", it);
      end;
    

    That approaches the simplicity of the Common Lisp version of this definition and is (to me) much nicer than the syntax-case definition from Scheme.

    1.

      Thanks for your kind words!

      I’m reluctant to name the languages I struggled with, as it’s hard to draw a line between their complexity and my inexperience. I completely overlooked Dylan (to my shame!); it would definitely have been an interesting addition.

      Your macro implementations are pretty readable. How does Dylan fare when you need to write code that builds a parse tree manually?

      1.

        Well, our normal macro system isn’t like Common Lisp-style macros; it is strictly pattern-based, like syntax-rules from Scheme. That said, some pretty complicated things have been built with the standard Dylan macro system, like the binary-data library.

        Inside the compiler however, we have an extended version of this macro system that does allow for quotation and such. I like it a lot and it is really powerful. It is how large parts of the compiler are actually written, including our C-FFI interface. If you check out the D-Expressions paper that I reference above, it deals with this pretty extensively, although some of what is discussed in the paper is not implemented (the parts about the module system).

        It is an open project for someone to make it possible for a project to supply a compiler plugin that uses this macro system extension. I don’t think it would be incredibly difficult; it would probably be challenging, and almost certainly enjoyable.

        Inside the compiler, these special compiler macros are defined with define &macro rather than define macro. They even start out looking like a regular macro; however, instead of the pattern match driving template substitution, the right-hand side is ordinary code that executes and returns AST fragments. The compiler builds upon this with &converter and other forms, which simply expand into &macro definitions and are key to how things eventually get from parsed tokens to the actual compiler IR.

        That would all be quite tedious if one still had to construct AST stuff by hand, but thankfully, it works with quoting as well, using #{ ... } and allowing for substitution. Inside the compiler, one can use that to construct AST fragments. Unfortunately, these tend to be used in fairly complex areas, so I don’t have a really nice simple example.

        define &macro c-address-definer
          { define c-address ?var-name:name :: ?pointer-designator:expression
              ?options:*;
            end }
          =>
          begin
            let initargs = parse-options($c-address-options, options, var-name);
            let options = apply(make, <c-address-options-descriptor>, initargs);
            let c-name = c-address-c-name(options);
            if (c-name)
              let import = c-address-import(options) | #{ #f };
              #{ define constant ?var-name
                   = make(check-c-address-designator(?var-name, ?pointer-designator),
                          address: primitive-wrap-machine-word
                            (primitive-cast-pointer-as-raw
                               (%c-variable-pointer(?c-name, ?import)))) };
            else
              note(<missing-c-name>,
                   source-location: fragment-source-location(form),
                   definition-name: var-name);
              #{ };
            end if;
          end;
        end &macro;
        

        That one isn’t too difficult, though, and comes from our C-FFI. You can see that it defines a pattern to be matched (define c-address ....), then executes some code to parse the options, does some basic validation, and checks that a c-name has been provided in the options. If it has, it returns the quoted AST (define constant ?var-name ....). If it hasn’t gotten a c-name option, it issues a compiler warning (note(<missing-c-name>, ...)) and returns an empty AST fragment (#{ }). That wasn’t too terrible! :)
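
        For a sense of the surface syntax this pattern matches, a hypothetical use might look like the following (the designator class and option value here are illustrative, not taken from the real C-FFI documentation):

        // Hypothetical sketch only: <C-int*> and the c-name value are
        // illustrative; consult the C-FFI documentation for real usage.
        define c-address c-errno :: <C-int*>
          c-name: "errno";
        end;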

    2.

      In addition to the syntax macros we all know and love, Julia is about to get a metaprogramming feature that’s so far been referred to as “staged functions” for lack of a better name. The idea is that you sometimes want to be able to specialize the body of a function on the runtime types of its arguments (something like a macro that has access to type information). When you call a staged function f(a,b) with some arguments (e.g. (3,"foo")), the body of f will actually be passed the types of a and b (so here (Int64, ASCIIString)) and is expected to return an expression, which will then be evaluated with the values of a and b bound, as though the returned expression were the body of f. Another way to think about this is that the programmer gets a chance to type-specialize code before the compiler does.

      The open pull request is here, for the curious. As far as I’ve heard, this feature is novel outside of research languages, although I would be really interested to hear that this is not the case.
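
      A minimal sketch of the idea, written with the @generated function syntax the feature eventually shipped as (the syntax in the pull request may differ, and the function and its bodies are purely illustrative):

      # A staged function: the body sees the *types* of the arguments,
      # not their values, and returns an expression that becomes the
      # specialized method body for that type combination.
      @generated function f(a, b)
          if a <: Integer && b <: Integer
              return :(a + b)          # body used when both arguments are integers
          else
              return :(string(a, b))   # fallback body for anything else
          end
      end

      f(3, 4)       # == 7      (integer-specialized body)
      f(3, "foo")   # == "3foo" (fallback body)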

      1.

        From what you say here and from what I’ve seen elsewhere, I sometimes wonder if the compilation model of Julia isn’t what I expect.

        Anyway, in Dylan, we can do something similar in two different ways.

        First, and I’m just going to use stupid examples rather than finding real examples:

        define inline-only function internal-do-some-math (a, b)
          (a + b) * (a + b) - a
        end;
        
        define method do-some-math (a :: <integer>, b :: <integer>)
          internal-do-some-math(a, b)
        end;
        
        define method do-some-math (a :: <single-float>, b :: <single-float>)
          internal-do-some-math(a, b)
        end;
        

        There, since I defined the internal function as inline-only, the compiler will inline it into each do-some-math method, specialize the types, and end up compiling away all of the method dispatch.
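
        For example (a trivial sketch of calling code; s0 is Dylan’s single float literal suffix):

        do-some-math(2, 3);         // dispatches to the <integer> method, body inlined
        do-some-math(2.0s0, 3.0s0); // dispatches to the <single-float> method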

        Now, and perhaps more interestingly, Dylan provides a more automated way of doing such a thing that we call copy-down methods. (Technically, these are an implementation extension to Open Dylan, but Gwydion Dylan 2.5 supported them as well.)

        Using copy-down methods, the above would look like:

        define method do-some-math (a, b) => (c)
          (a + b) * (a + b) - a
        end;
        
        define copy-down-method do-some-math (a :: <integer>, b :: <integer>)
         => (c :: <integer>);
        
        define copy-down-method do-some-math (a :: <single-float>, b :: <single-float>)
         => (c :: <single-float>);
        

        Here, we just tell the compiler to find the next most specific method, copy its body, and inline / type-specialize it using the new method signature. (The same rules apply as for next-method.) The usual adjectives that work with define method also work with define copy-down-method, so it can be inline, sealed, or whatever.
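
        Conceptually, the <integer> copy-down method above behaves as if one had written the specialization out by hand (a sketch, not the compiler’s actual expansion):

        define method do-some-math (a :: <integer>, b :: <integer>) => (c :: <integer>)
          // Body copied down from the less specific method and type-specialized:
          (a + b) * (a + b) - a
        end;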

        I suspect that given the difference between the Dylan and Julia execution models, the overall impact here ends up being the same.

        1.

          Ah, yes! Sorry, I was somewhat brief in my above post, more just trying to pique people’s interest than give a detailed account :)

          Julia method dispatch works on the types of all arguments, as in your examples.

          function do_some_math(a::Int, b::Int)
              a + b
          end
          
          function do_some_math(a::Float64, b::Float64)
              a - b
          end
          

          will do what you would expect (e.g. do_some_math(2,1) == 3 and do_some_math(2.0,1.0) == 1.0). After reading your code above, I’m not surprised that people often compare Julia to Dylan :)

          Where this isn’t sufficient is the case that you have a value-parameterized type, and you want to have an in-principle-infinite variety of methods, each with code specialized on the value. Probably the most common such type is an N-dimensional array of type T (an Array{T, N} in Julia parlance). Specializing on N can be very useful (e.g. unrolling N loops, etc.). In principle, you could manually write out unrolled methods for Array{T, 1}, Array{T, 2}, etc., but this gets tedious and will never be completely general (e.g. if you stop at N=20 and someone wants to use a 21-dimensional array they’re out of luck). This is where “staged functions” will be useful (they’re hopefully going to be the foundation of a very performant system of array views in Julia 0.4, for example).
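
          To make that concrete, here is a rough sketch of a body specialized on N, again written in the eventual @generated form (the function itself is illustrative):

          # Generate a body specialized on N, the number of dimensions:
          # an unrolled product of the N sizes, with no runtime loop over d.
          @generated function nelements{T,N}(a::Array{T,N})
              ex = :(1)
              for d in 1:N
                  ex = :($ex * size(a, $d))   # builds size(a,1)*...*size(a,N)
              end
              return ex
          end

          nelements(rand(2, 3, 4))   # == 24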

          1.

            Sounds like in that case, it is similar to template instantiation in C++.

            As for people comparing Julia and Dylan … no surprise there. But the compilation models and such are very different and Julia has some type system improvements over the old specification of Dylan. We’re working on reversing that situation. :)