Zig’s comptime code generation is an example of this approach. From what I’ve seen it works pretty well in practice. It has one big property that I don’t like though: whether or not your code typechecks is tested when the macro/template is expanded, not when it is defined. This means that you can write a comptime function that generates invalid code in some cases, and you never find out about it as long as those cases never happen.
I think it’s important to note that when you do find a case that generates invalid code, it is a comptime error. So, you can guarantee all usages of your function are valid if the project compiles.
This can also be applied to platform-specific blocks:
const builtin = @import("builtin");
if (builtin.os.tag == .linux) {
    // linux specific stuff
} else if (builtin.os.tag == .windows) {
    // windows specific stuff
} else {
    @compileError("unsupportedPlatform");
}
If you were to call this function on an unsupported platform, it would be a compile error. If you don’t, you can compile the project fine. I think this property of the language is really cool, and I think the Zig standard library uses it for a few different things like skipping tests on certain platforms (other compilers embed this as metadata in the file).
For small macros/templates this is perfectly fine because it’s easy to make sure all the code is tested. But when you start using it for large and complicated things that may have unused code paths lurking in unexpected places, it starts to worry me. Maybe it works fine in practice most of the time, but the idea leaves a bad taste in my mouth.
It did for me at first - why on earth wouldn’t you want all of your code checked?? - but I think I’m being converted. Partial evaluation like this just feels so powerful. I’ve even ended up emulating these ideas in Java for work.
C++ templates have the same property: they are checked at compile time, but at use sites. I think this is widely understood to have turned into a usability hell, which is basically the reason for the work on concepts. For example, it is easy for developers of comptime logic in libraries to miss errors that only show up in their clients, or to end up with code that is effectively not checked/tested inside the library.
I’m not saying that Zig comptime has the exact same downsides as C++ templates (it has the benefit of being a properly designed compilation-level language, rather than something that grew organically in different directions than the base language), but I also suspect that this class of issues is a problem at large and I share the blog post’s reservations about running wild with the idea.
On the other hand, it is true that the benefits in terms of reduced complexity of the language (compared to a much richer type system) are significant. Language design is a balancing act.
Uhu, with C++/Zig style generics, it’s very hard for the user (or the author) to state precisely what the API of a utility is, which should make SemVer significantly harder in practice.
The second problem is tooling: when you have T: type, there isn’t much an IDE can correctly do with completions, goto definition, etc. Some heuristics are possible (eg, looking at the example instantiations), but that’s hard to implement (precise solutions are simpler).
I agree, there was a discussion along these lines in a previous post about Zig.
Regarding tooling: Smalltalk/Squeak/Pharo people would say that you don’t need static analysis to have good tooling, as long as you rephrase the tooling to be about exploring the state of the program as it runs, rather than without running it. I could see how this might possibly work with Zig – collect global traces of successful generic instantiations (for the testsuite in the project, or for your whole set of installed packages, or over the whole history of your past usage of the code…) and show their shared structure to the user.
collect global traces of successful generic instantiations (for the testsuite in the project, or for your whole set of installed packages, or over the whole history of your past usage of the code…) and show their shared structure to the user.
This is something that we already did with the old implementation of autodoc and that we plan to eventually also support in the new version, as something that you should expect to be able to rely on when browsing documentation for a package.
One advantage Zig has over plain C++ templates is that you can do arbitrary comptime checks and emit clean, precise error messages. For example, instead of just blowing up when you instantiate with a type that doesn’t have a given method, you can check for it and fail compilation with “type foo must have bar(x: i32) method”. You can even implement something akin to concepts in userspace.
Yeah but I have been burned enough times by pulling out code that compiled 6 months ago to discover, oops, it no longer builds. Even in Rust. There are few things more frustrating. “I need to fix this bug that I just found in an old prod system, it’s a 15 minute fix, and oops, now I get to spend three hours trying to figure out which dependency three levels down in my dep tree didn’t follow semver and broke its API.” Or which file doesn’t build with a newer compiler’s interpretation of C++17, or which Magical Build Flag didn’t get saved in the build docs, or why the new binary needs to load libfuckyou.so.0.7.3 but the old one doesn’t, or…
So if possible I would prefer to not add yet more ways of scattering around hidden build landmines in large and complex programs. But as I said, I need to play with it more and get a better feel for how it actually works out in practice.
Just passing in the table with methods feels like it should be simpler, but it’s interesting that it can be more general. Eg, you can do things like storing indexes in a hashmap, and using the Hash from the thing the indexes refer to.
In my own sketch of the language, I was thinking about a List.Eq(Point.Eq) style of constructing the table, and doing something akin to C++ ADL syntax for implicit syntax at the call site (ie, if T.Eq is a thing, use that). So you only pass tables explicitly if they are defined in a different module, which I think should give you the “implicit syntax is always coherent” property, by virtue of only considering impls in the type itself. And I am not sure strict coherence is useful anyway, unless you have associated types: even with coherence, a Hash impl can change behavior depending on some thread local, so generic code can’t really rely on user-provided callbacks to behave “the same”.
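To make the “just pass the table” idea concrete, here is a rough Rust sketch (my own illustration with made-up names, not the post’s Garnet syntax): an Eq “module” is an ordinary struct of functions, the table for Point is a plain value, and List.Eq(Point.Eq) becomes a plain function that builds the table for lists out of the table for elements.

// Sketch only: an Eq "module" as a runtime value.
struct Eq<T> {
    eq: Box<dyn Fn(&T, &T) -> bool>,
}

struct Point { x: i32, y: i32 }

// The hand-written table for Point (Point.Eq, roughly).
fn point_eq() -> Eq<Point> {
    Eq { eq: Box::new(|a, b| a.x == b.x && a.y == b.y) }
}

// List.Eq(Point.Eq): build the table for Vec<T> out of the table for T.
fn list_eq<T: 'static>(elem: Eq<T>) -> Eq<Vec<T>> {
    Eq {
        eq: Box::new(move |xs, ys| {
            xs.len() == ys.len() && xs.iter().zip(ys).all(|(x, y)| (elem.eq)(x, y))
        }),
    }
}

fn main() {
    let points_eq = list_eq(point_eq()); // List.Eq(Point.Eq), as plain function application
    let a = vec![Point { x: 1, y: 2 }];
    let b = vec![Point { x: 1, y: 2 }];
    println!("{}", (points_eq.eq)(&a, &b)); // true
}

Because the tables are first-class values, nothing stops them from closing over runtime data, which is where the “can be more general” part comes from.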
We need something like Rust’s derive macros. I love Rust’s derive macros more than life itself.
Just in case our positions here are as close as they are for generics, my thinking here is:
the interface for derives is String -> String (there’s a toy sketch right after this list). We don’t need hygiene for derives (we don’t have it even for Rust), and token trees are not that great of an abstraction.
derives are strictly additive; the compilation model is as if the compiler just appends a bunch of stuff to the end of the file. The compiler might actually want to materialize results of derives to disk, so that the IDE doesn’t have to ponder inconvenient questions when serving goto.
derives don’t go through name resolution rules. When compiling a CU, the set of derive names is known up-front. In the source code, derives are called with dedicated syntax like @Serialize.
this allows expanding derives in an embarrassingly parallel way during parsing, on a per-file basis.
this also makes it much simpler to answer the “when to re-derive?” question, simplifying incremental compilation significantly.
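For what it’s worth, here is a toy Rust sketch of that String -> String interface (the input format, names, and generated syntax are all invented for illustration): the derive is an ordinary function from source text to generated text, and the compilation model is simply “append the output to the end of the file”.

// Toy model of a String -> String derive; not a real implementation.
fn derive_eq(struct_src: &str) -> String {
    // Assume an input shaped like: "struct Point { x: I32, y: I32 }"
    let name = struct_src
        .trim()
        .strip_prefix("struct ")
        .and_then(|rest| rest.split_whitespace().next())
        .expect("expected a struct definition");
    let open = struct_src.find('{').expect("missing '{'");
    let close = struct_src.rfind('}').expect("missing '}'");
    let fields: Vec<&str> = struct_src[open + 1..close]
        .split(',')
        .filter_map(|field| field.split(':').next())
        .map(str::trim)
        .filter(|f| !f.is_empty())
        .collect();
    let comparisons: Vec<String> = fields.iter().map(|f| format!("a.{f} == b.{f}")).collect();
    // "Strictly additive": the compiler just appends this text to the end of the file.
    format!(
        "fn {name}_eq(a: {name}, b: {name}): Bool =\n    {}\n",
        comparisons.join(" and ")
    )
}

fn main() {
    print!("{}", derive_eq("struct Point { x: I32, y: I32 }"));
    // fn Point_eq(a: Point, b: Point): Bool =
    //     a.x == b.x and a.y == b.y
}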
For anyone who likes that comment by F# designer Don Syme (I re-read it again, never fails to clarify), I recommend watching this conversation between him and a Rust core team member for more color:
Syme & Matsakis: F# in the Static v. Dynamic divide (only 1.2K views, but I found it to be great)
I referenced it on my blog (in notes form):
https://www.oilshell.org/blog/2022/03/backlog-arch.html#tradeoffs-between-dynamic-and-static-types-faq
Part of it is a “slippery slope” argument, and it’s indeed debatable how far down the slope you want to go.
Another part is interoperability and binary compatibility (an area where I think Rust has fundamental issues, which can’t be solved with more wrappers).
F# is supposed to interoperate within the .NET VM, e.g. with C#; a shell is supposed to interoperate with even more languages, in bare-bones Unix style (which now includes JSON, CSV, and HTML!)
Lovely lovely lovely! I pretty much think the same (eg, https://matklad.github.io/2021/02/24/another-generic-dilemma.html). Generic programs are often useful, generic programmers are a bane of my existence (https://github.com/fsharp/fslang-suggestions/issues/243#issuecomment-916079347) :)
Nope, never run into problems with people over-designing things and leaving me to decipher the type errors that result. Not one bit.
Just passing in the table with methods feels like it should be simpler, but it’s interesting that it can be more general.
Not sure that it can become more general, just makes different tradeoffs. The problem in the end comes down to “what can the compiler actually figure out”, and it is startlingly easy to just end up making your type system Turing-complete by accident.
interface for derives is String -> String.
Already scares the shit out of me, just from C’s prior art, but I’m willing to accept it could be done well if you try. Why are token trees problematic?
derives are strictly additive; the compilation model is as if the compiler just appends a bunch of stuff to the end of the file.
Seems pretty reasonable, since this is probably what I’d make my compiler do!
derives don’t go through name resolution rules
How does this interact with letting users create their own derives? Or multiple implementations of them, such as two different crates deriving Debug for the same type, which is now possible ‘cause we don’t have canonicity? Or crates defining different derives with similar names, such as serde::Serialize vs miniserde::Serialize?
Not sure that it can become more general, just makes different tradeoffs.
It can: with traits, you have to make the table compile time. If you pass it as a normal argument though, you can make the table use runtime stuff (eg, close over local variables). Though, every time I need this, it boils down to “hashtable storing indexes”.
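A small Rust illustration of that last point (using ordering instead of hashing to keep it short, and with made-up names): because the “table” is an ordinary runtime value, it can close over the data that the indexes point into, which a purely compile-time impl can’t do. A fuller version would bundle several functions into a struct rather than passing a single comparator.

use std::cmp::Ordering;

// The generic code only sees "a comparison for T", passed as a normal runtime value.
fn sort_with<T>(items: &mut [T], cmp: &dyn Fn(&T, &T) -> Ordering) {
    items.sort_by(|a, b| cmp(a, b));
}

fn main() {
    let names = vec!["carol".to_string(), "alice".to_string(), "bob".to_string()];
    // Store indexes, but order them by the strings they refer to:
    // the "Ord table" closes over `names` at runtime.
    let mut idx: Vec<usize> = (0..names.len()).collect();
    let by_name = |a: &usize, b: &usize| names[*a].cmp(&names[*b]);
    sort_with(&mut idx, &by_name);
    println!("{idx:?}"); // [1, 2, 0] -- alice, bob, carol
}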
Why are token trees problematic?
They are an abstraction which doesn’t fit the problem. Like, for example in foo.0.0 we have a foo token and a 0.0 token, and it’s the parser that splits the 0.0 token into three. And, eg, a lot of people want to look at whitespace in tokens for one reason or another (eg, I wanted to write cmd!(git branch --i-made-this-up -v), but I have to wrap it in quotes in Rust because -v and - v are indistinguishable otherwise).
Though, if we restrict ourselves only to derives, then the input is always a Garnet AST (and not a token soup), so we can go for an AST-only API as well.
How does this interact with letting users create their own derives? Or multiple implementations of them, such as two different crates deriving Debug for the same type, which is now possible ‘cause we don’t have canonicity? Or crates defining different derives with similar names, such as serde::Serialize vs miniserde::Serialize?
That’s just details for the language designer to figure out :P The high-order bit is that you need to understand that #[derive(Serialize)] refers to serde without looking beyond the current file. In Rust, that’s not possible because the file may contain something which expands to something which defines the serde name. So you can’t “just” expand everything in one go, nicely and in parallel; it’s an intricate dance of “ok, let’s expand this macro. Ok, it expanded to foo, so we can now resolve foo::bar. Aha, now we can expand bar::baz!”.
To spitball a couple of disambiguating strategies: allow only @ident syntax and disambiguate by renaming macros in an analogue of Cargo.toml; or allow only @crate_name.macro_name syntax, so that users always spell @serde.Serialize and build off crate renaming.
Basically, our end goal is that, presented with a 128-file compilation unit, we can, in parallel, parse and recursively expand each file, and build each file’s skeleton, which describes which names are defined in that file. Name resolution then works as a completely separate phase, which can look at all skeletons at the same time and as such is a “global” operation, but which is pretty fast as it operates on a small amount of data (b/c the actual parsing of the multitude of precious bytes of text happens before that, on all the cores).
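As a toy illustration of that pipeline (all names invented, and “parsing plus derive expansion” reduced to scanning for fn definitions), here is roughly the shape in Rust: per-file work runs in parallel and produces small skeletons, and only the cheap name-resolution pass looks at everything at once.

use std::collections::HashMap;
use std::thread;

// A per-file "skeleton": just the names the file defines.
struct Skeleton {
    file: &'static str,
    defines: Vec<String>,
}

// Stand-in for "parse + expand derives": collect `fn` names from the text.
fn parse_and_expand(file: &'static str, src: &str) -> Skeleton {
    let defines = src
        .lines()
        .filter_map(|line| line.trim().strip_prefix("fn "))
        .filter_map(|rest| rest.split('(').next())
        .map(str::to_owned)
        .collect();
    Skeleton { file, defines }
}

fn main() {
    let files = [
        ("a.gt", "fn foo() = 1\nfn bar() = 2"),
        ("b.gt", "fn baz() = foo()"),
    ];
    // Phase 1: embarrassingly parallel, one unit of work per file.
    let skeletons: Vec<Skeleton> = thread::scope(|scope| {
        let handles: Vec<_> = files
            .iter()
            .map(|&(file, src)| scope.spawn(move || parse_and_expand(file, src)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Phase 2: global name resolution, but only over the small skeletons.
    let mut symbols: HashMap<&str, &str> = HashMap::new();
    for sk in &skeletons {
        for name in &sk.defines {
            symbols.insert(name.as_str(), sk.file);
        }
    }
    println!("{symbols:?}"); // {"foo": "a.gt", "bar": "a.gt", "baz": "b.gt"} (any order)
}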
…if we restrict ourselves only to derives, then the input is always a Garnet AST (and not a token soup), so we can go for an AST-only API as well.
Aaaah, that makes a lot more sense; I kept thinking of macros as AST transformations, while you were distinguishing between that and literally getting a token stream. I’ve only written my own derive macros once or twice but whenever I do I automatically reach for the quote crate which does half of this for you. So, making our derive macros explicitly work on AST rather than token stream makes a lot of sense.
That’s just details for the language designer to figure out :P
You monster. :-P Sounds like your thoughts on the matter mostly parallel mine, though. I dislike the idea of macro renaming ’cause it introduces metadata your build depends on outside of the code itself, but I suppose it would work. Requiring derives to be fully qualified with @serde.Serialize or such is probably my preferred route, keeping up my theme of “do the stupid and simple thing, and then add syntax sugar if necessary”.
There have been some questions from other people about how much modules/structs and namespaces overlap with each other though, and that’s a good question I don’t entirely have an answer to yet. I was originally intending for them to be identical and, like Zig and Lua, having one file touch stuff from another literally just defines a file-local const containing a struct with the other file’s definitions in it. But after going through this thought exercise with modular implicits that might be packing too much use into a single construct, and if we don’t need structs to contain associated types then that approach is incomplete anyway ‘cause it leaves out the question of how to touch another file’s types. The idea of preventing derives/macros from expanding into references to other files is a pretty compelling design goal I hadn’t considered up to now. I wonder how Zig does it…
This is a great writeup! I thought about going down the same path with my language, but it appears you got much further along 😀 I haven’t thought it through too much, but for the Ord case could you add an implicit Eq parameter to signify the dependency?
On another note, I also wanted to have something like derive for these modules but I’d like to avoid a whole other syntax for macros. I was thinking of something like comptime specifically for that but still have first class generics.
…for the Ord case could you add an implicit Eq parameter to signify the dependency?
I thiiiiink so, but depending on how it’s done the compiler might have a tough time proving that two different implementations of Ord(T, Eq(T)) actually lead to the same Eq value? I’ll have to think about it more, probably with more coffee in my system.
On another note, I also wanted to have something like derive for these modules but I’d like to avoid a whole other syntax for macros.
I definitely intend to have some sort of macros or comptime code generation in Garnet, but yeah the syntax is a bit of a bear. I actually find Rust’s template-y macros pretty sensible most of the time, once you climb the learning cliff of figuring out how args and repetition works, but there’s reasons I haven’t tried to tackle macros for Garnet yet. :-)
Language design is fun. I think the essential debate about typeclasses vs. modules is the same as implicit vs. explicit. With modules, you need to explicitly say one extra thing, which is how to instantiate the module.
I’ve written a lot of OCaml, which has the familiar ML-based module system, so I have written a bunch of code with this pattern. It’s mostly fine; I can’t ever point to a scenario where it made my life too bad. I like the ease of typeclasses, which comes from their implicitness: you get certain code “for free.” I can’t point to a time where they complicated my life too much, but I also don’t use fancy features like HKT. Modules do feel simpler overall.
One last point about this remark / joke:
Math people, don’t bother correcting me.
You just wrote a detailed analysis about programming language theory. You are a math person, in my opinion. This frames math as some kind of in-group that does something other than what we’re doing here, which is analyze and reason about programs. Programmers are some of the most analytical and detail-oriented people in the universe, so I don’t understand the need to distance us from CS. I understand criticizing academia maybe, but not math itself.
…I can’t ever point to a scenario where it made my life too bad.
The generic_sum() function I conjured up is, for me, now the canonical example of the problems modules have with being explicit everywhere. Contrast that with Rust where you can just say some_random_thing.iter().sum()? Blew my mind. Do you have a better way of writing generic_sum() with modules? It’s not a programming paradigm I actually have much practical experience with, so I 100% believe there’s patterns I’m missing.
You just wrote a detailed analysis about programming language theory. You are a math person, in my opinion.
My approach to math is that of a carpenter’s approach to nailguns. Necessary, and kinda satisfying to use correctly when it works out well, sure. But I don’t get much intrinsic joy thinking about the nitty-gritty details of a sharp piece of steel interacting with wood grain. :-P
This frames math as some kind of in-group that does something other than what we’re doing here, which is analyze and reason about programs.
This is a communication problem with any experts talking about different-but-overlapping domains. The in-group behavior happens because people in different domains optimize their thinking differently for different sorts of problems. In my misspent grad school time studying geology and geochemistry, I had the same thing happen trying to talk to material scientists. In my current incarnation writing software for drones, there’s a whole zoo of jargon that pilots use, much of which is silly, obsolete or irrelevant as far as the software cares. But you sure as hell gotta understand what they’re saying when they say something, and be able to communicate back to them, ‘cause most of that jargon expresses stuff that matters a lot to a human in the air, even when a computer flying the same aircraft in the same place with the same physics doesn’t really worry about it. You have to be able to think outside your specific box, and part of that is knowing how you think as well as how other people think.
Application matters.
Digression: Lua does something spiritually similar to this with its metatables. It’s pretty great! Basically each value has a “metatable” attached that has some magical methods on it that are called by operators, comparisons, etc. It also includes methods that are called on function application and on method lookup failure, so with these small primitives you can build just about anything.
It is almost a rite-of-passage for a beginning Lua programmer to read up on tables and metatables, and then create their own object-oriented system. You can find many implementations of varying scope and complexity.
Yep! I’ve done it. It’s fun and magical and mind-bending!
How does this compare, for example, to Python’s magic methods (https://micropyramid.com/blog/python-special-class-methods-or-magic-methods/)?
The section on Lua metatables describes something very similar to Python’s magic methods. The difference between those and the modules/typeclasses described here is basically “how do you find which methods to call”. In Lua and Python, every value has a pointer to its vtable/metatable/type/whatever at runtime, which is a structure that contains these methods. So when you call foo.__add__(x) it ends up being, essentially, type(foo)["__add__"](x) and that is executed at runtime. So if foo ends up being a different type at runtime, say going from int to float, the dynamic method call automatically goes from calling int.__add__() to float.__add__(). Or maybe to some_random_type.__add__() which doesn’t exist and causes a runtime error. The question that typeclasses/modules grapple with is basically “how do you sort all this out at compile time and prove it will work without doing a runtime lookup”.
It’s hard to compare, but in a way you could say that Python generifies everything and specializes nothing, by not having a static type system (so no way exists to differentiate in the sources at compile time).
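To put the same contrast in Rust terms (my analogy, not from the post): dyn dispatch is the “a vtable pointer travels with the value and is looked up at runtime” model, while a generic bound is the “compiler proves the method exists and picks the impl at compile time” model.

use std::fmt::Display;

// Runtime lookup, like Lua metatables / Python magic methods:
// the value carries a vtable pointer and the call goes through it.
fn show_dyn(x: &dyn Display) -> String {
    format!("{x}")
}

// Compile-time resolution, like typeclasses/modules: the impl is
// chosen per instantiation, and a missing impl is a compile error.
fn show_static<T: Display>(x: &T) -> String {
    format!("{x}")
}

fn main() {
    let mixed: Vec<Box<dyn Display>> = vec![Box::new(1_i32), Box::new(2.5_f64)];
    for v in &mixed {
        println!("{}", show_dyn(v.as_ref())); // dispatched through the vtable
    }
    println!("{}", show_static(&3_i32)); // monomorphized, no runtime lookup
}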
Two, it turns out that coherence isn’t just for show. If you call make_list_eq_thingy(IntEq) twice, you get two different Eq(List(I32)) values which, by ML’s type rules at least, are not the same.
Could this be solved with memoization? So long as a functor is pure, you memoize the result and return a cached implementation on a second invocation.
(Apologies if this is covered later. I am half way through the article, but I need to get to work. 😬)
Even though it’s originally inspired by Scala, the concept behind modular implicits is very simple.
To some extent maybe, the section on inheritance deals with one way of reasoning about this. To me it seems like it’s a problem with a fair amount of odd little nooks and crannies to explore in terms of practical design space, most of which are yet unplumbed. ML people, at least the ones who write about type systems, seem to often be purists about “if this type is abstract, you never know anything about it.”
The absolute shade.
All in good fun. I’ve read a fair bit about Scala but never used it for anything nontrivial. :-)
Great read. @icefox you might be interested to know that this is almost exactly how Coq implements typeclasses. For example, the definition of Eq from your example would look something like:
Class Eq (A : Type) := {
equal : A -> A -> bool;
}.
Instance Nat_Eq : Eq nat := {
  equal := Nat.eqb (* equality for natural numbers *)
}.
The Class and Instance keywords in Coq are essentially the same as defining and instantiating a struct (you can pass around a Nat_Eq just like any other ol’ struct). The only magic is that Instance updates a database of terms to be implicitly resolved, in the same way that you use the implicit keyword in your generic_sum example. Using the above definitions, one could then define:
Definition explicit_func (A : Type) (H : Eq A) : unit := tt.
Definition implicit_func (A : Type) {H : Eq A} : unit := tt.
(* ^ curly braces *)
At which point calling (explicit_func nat Nat_Eq) and (implicit_func nat) are equivalent.
I personally find Coq typeclasses a little clunky (typeclasses were added later, and sort of compete with the module system inherited from OCaml). Coq also primarily deals with mathematics, which has some steep hierarchies of inheritance (e.g. a total order :> partial order :> reflexive etc) which can be pretty cumbersome.
One takeaway from Coq that you might find useful is the notion of sections, which can be used to automatically append typeclasses to function arguments when used in a function body. For example, in some pseudo-garnet, this might look like:
section WithLen
-- Any function in this section that uses `len_impl`
-- will have `len_impl` appended to its arguments
context len_impl: Len(@C)
fn foo(container: @C): int =
let len = len_impl.len(container)
...
end
fn bar(container: @C): int =
return foo(container)
end
end
-- foo is now foo(container: @C, len_impl: Len(@C))
-- bar is now bar(container: @C, len_impl: Len(@C))
-- ^ because it called foo, which requires len_impl
This is a rough example (you’d probably need some extra logic / syntax to unify all those @C’s), but hopefully this is some more food for thought. It might prove useful in say, a list library, where Cmp, Eq, and friends are pervasive.
I wonder whether algebraic laws would solve two seemingly-unrelated complaints that you’ve put forward. First, the question of whether a template is valid can be answered with a proof that some algebraic law holds on the template, before the template is ever applied. Second, the question of which derived interfaces are valid (e.g. when Eq and Ord are correctly defined relative to each other) can be answered with an algebraic law which relates interfaces to each other. In your example, we might prove that all instances of Eq which are derived from Ord are correct.