While it’s true that “all you need is data and functions”, you could also say that “all you need is mov”, since “mov” is Turing complete. Type classes / traits, like many other PLT advancements, allow speaking more precisely and expressively about the desired semantics. Bounded quantification is strictly more expressive than universal quantification.
Traits don’t actually add any expressive power to a language, they can at best make APIs more concise. Comparing not having traits to only using mov seems a bit much.
That’s the definition of expressivity, no?
I’m extra confused, because traits + constraints allow encoding logic that’s not at all possible otherwise at the type level (only permit a call if …), so how is that not expressive power?
See Shutt’s notes on expressivity and abstractive power. I’ll quote the important bit:
Language A can express language B if there is a simple way to rewrite B-programs as A-programs that preserves the overall pattern of semantics of programs — not only the semantic values of valid programs, but which programs are valid, i.e., halt and which are not valid/don’t halt. … That is, A can express B when there exists a decidable function φ mapping each B-program p to an A-program φ(p) such that φ(p) halts iff p halts. A can weakly express B when φ(p) halts if p halts (but not necessarily only if p halts).
Usually, we can just replace a trait/typeclass dictionary with a struct or wrapper. In e.g. Haskell:
f :: Trait t => x -> y
f :: Trait t -> x -> y
It’s clear that no expressive power is gained. It’s a way to delegate a bit of work to the compiler, to document abstractions, and to expose interfaces for implementation by other modules. See Oleg’s notes on dictionary-passing for a worked example which reduces Haskell typeclasses to OCaml structs.
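For concreteness, here is a minimal sketch of the rewrite those two signatures gesture at (the Pretty class, the PrettyDict record, and all names are invented for illustration):

-- Type class version: the compiler finds the dictionary for us.
class Pretty t where
  pretty :: t -> String

greet :: Pretty t => t -> String
greet x = "hello, " ++ pretty x

-- Struct version: the caller passes the dictionary explicitly.
data PrettyDict t = PrettyDict { runPretty :: t -> String }

greet' :: PrettyDict t -> t -> String
greet' dict x = "hello, " ++ runPretty dict x

main :: IO ()
main = putStrLn (greet' (PrettyDict (const "world")) ())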
Shutt is talking about Felleisen’s paper “On the Expressive Power of Programming Languages”, and the key idea of that paper is more interesting than the part you quoted. That quote just describes any compiler, i.e., the Turing tar pit. Shutt’s next sentence is the interesting bit:
How readily A can express B depends on how disruptively φ is allowed to rearrange the internal structure of p. Felleisen’s paper particularly focuses on the class of “macro”, a.k.a. “polynomial”, transformations φ, which correspond to Landin’s notion of syntactic sugar.
Krishnamurthi did a nice presentation of it at “papers we love”: https://youtu.be/43XaZEn2aLc
Except the global coherence of type classes. Of course you can talk about incoherent instances and the like and claim that there’s no hard semantic difference, but you’d be ignoring the fact that writing incoherent instances is the exception, not the norm. If you write an API that relies on instance coherence and it misbehaves due to an incoherent instance, it’s the fault of the caller, not the implementer. So type classes make the semantics of coherence explicit, and the language makes sure that you have to fight it to get incoherent instances. In return, you’re rewarded with the compiler doing the plumbing for you. So I don’t think you can ignore all this and pretend that they’re equivalent. And all of this without even mentioning the styles of code that would be too ugly to consider without the ergonomics of type classes: code you could express, but that nobody would actually write. Even though the following seem like they’re equivalent:
f :: Trait t => x -> y
f :: Trait t -> x -> y
The following certainly isn’t in practice:
f :: (A billion instances, some of them automatically constructed based on a chain of instances defined in terms of other instances) => x -> [y] -> SomeHKT z x (Maybe y)
f :: (A billion instances, some of them manually constructed based on a chain of instances defined in terms of other instances) -> x -> [y] -> SomeHKT z x (Maybe y)
I agree with you that they’re not equivalent. They happen to have the same expressive power: there aren’t algorithms which I can express with one but not the other. However, I’m only claiming that the rewrite goes in one direction; I’m not sure whether it’s possible to automatically turn structs into traits.
I also agree with the rest of your point. When a language has traits or typeclasses, the compiler does a bit of extra work to resolve them.
Traits are part of the type system, as are constraints. They are at the type level.
And types are a part of a language.
Sorry, to be clear I was using the computer science definition of expressive power, not the common use of the term expressive to mean “nice APIs”.
Could you expand upon what would otherwise not be possible? Type classes de-sugar to additional function arguments, so anything you can express with them can be expressed without them.
Yes, how can you express trait bounds that are statically checkable with regular functions?
I guess you can technically express it with either a union type or, if that’s not available, with multiple separately named functions (similar to ML modules).
But even the article you linked distinguishes between theoretical and practical expressivity. I try to avoid the Turing tarpit, so I’m talking about practical expressivity, i.e. in this case expressing type bounds on a function without changing its name.
Type class constraints are just syntactic sugar for an extra function argument per bound, so you can write it that way.
I too am talking about practical expressivity. Type classes are practically very weak and can always be rewritten, they at most make APIs more concise. If we are to add such a large amount of complexity to a language I would like much more in exchange, and to not suffer a large compile time and/or runtime performance hit.
Using the distinction between “theoretical expressivity” and “practical expressivity”, traits increase the theoretical expressivity of the type level language, and the practical expressivity of the value level language. Without them, there’s no way to ensure only a single implementation of a trait/vtable exists (or is used) for a given type.
I was talking practically too: is that a useful property? Many type class/trait implementations don’t have that property (including the original paper), and in Haskell it’s common to introduce newtypes to work around that limitation.
Traits don’t actually add any expressive power to a language, they can at best make APIs more concise.
The same is true of functions. They don’t allow you to express anything that you can’t express with labels and goto. Like most language features, they are not there to make the language more expressive, they are there to bias the set of programs that can be concisely expressed. There is an infinite set of possible programs and any Turing-complete language can express them all. There is a finite set of correct, maintainable, and useful programs. The goal of a programming language is to provide an approximation of Huffman encoding over the set of all programs so that members of the set of desirable ones are shorter and easier to write than others.
That’s true of the value language (that is, whether it is computable), but not of the type language (whether it can be statically verified or otherwise reasoned about).
Functions do add to the expressive power of a language, type classes do not.
Sure. The main difference is that functions are privileged primitive types in type theory, and so the native type theory of any programming language is going to have functions, regardless of the language under study.
Like, take any fragment of code in any language, and use its native type theory to establish a precondition and postcondition. Then, we can convert that fragment into a function which maps from the precondition to the postcondition; even if the language doesn’t have syntactic functions, we can still apply the fragment in-place as if it were a function.
They certainly add considerable expressivity to the type level language, and they also enforce consistency in the value level language. To emulate traits without traits would require explicit vtable (dictionary) passing. With vtable passing, there’s nothing preventing an intermediate function from swapping out trait implementations. Said another way, without traits, it is not possible to express that the vtable must be the same throughout the program.
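To make the swapping point concrete, here is a hedged Haskell sketch (OrdDict and every name in it are invented for illustration) of what explicit dictionary passing permits and coherent type classes rule out:

-- An explicit dictionary standing in for an Ord constraint.
data OrdDict a = OrdDict { cmp :: a -> a -> Ordering }

intOrd :: OrdDict Int
intOrd = OrdDict compare

-- A legal but surprising intermediary: it silently reverses the
-- ordering before passing the dictionary downstream. With a coherent
-- type class, the instance for Int is fixed globally, so this has
-- no equivalent.
flipped :: OrdDict a -> OrdDict a
flipped d = OrdDict (\x y -> cmp d y x)

maxBy :: OrdDict a -> a -> a -> a
maxBy d x y = if cmp d x y == GT then x else y

main :: IO ()
main = do
  print (maxBy intOrd 1 2)           -- 2
  print (maxBy (flipped intOrd) 1 2) -- 1: same type, different behaviour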
Traits also allow you to express blanket implementations, and ensure they are consistent and do not clash with other implementations.
That’s a property of the Haskell implementation, but not of type classes generally. Many implementations (and the original paper) scope implementations so there can be multiple, and they can be swapped. Even in Haskell it’s common to work-around that limitation by introducing otherwise meaningless newtypes.
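For reference, the newtype workaround mentioned here looks like this (Data.Ord.Down in base plays the same role; Reversed below is a made-up stand-in):

import Data.List (sort)

-- Int already has an Ord instance, and coherence forbids a second
-- one, so a newtype creates a fresh type to carry the alternative
-- instance.
newtype Reversed = Reversed Int deriving (Eq, Show)

instance Ord Reversed where
  compare (Reversed x) (Reversed y) = compare y x

main :: IO ()
main = do
  print (sort [3, 1, 2 :: Int])          -- [1,2,3]
  print (sort (map Reversed [3, 1, 2]))  -- [Reversed 3,Reversed 2,Reversed 1]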
I’ve seen this principle in a few different guises. In Haskell there is an approach called Scrap Your Type Classes (https://www.haskellforall.com/2012/05/scrap-your-type-classes.html), where you replace a type class (which is equivalent to a trait in Rust) with a type that represents the behaviour, e.g.:
class Eq a where
  eq :: a -> a -> Bool
-- becomes:
data Eq a = Eq { eq :: a -> a -> Bool }
Instances of the class then become normal values.
instance Eq Bool where
  eq True True = True
  eq False False = True
  eq _ _ = False
-- becomes
eqBool = Eq { eq = f }
  where
    f True True = True
    f False False = True
    f _ _ = False
A type class constraint on a function becomes a normal function parameter, and type class instances are passed explicitly:
notEq :: Eq a => a -> a -> Bool
notEq x y = not (eq x y)
-- becomes
notEq :: Eq a -> a -> a -> Bool
notEq eqa x y = not (eq eqa x y)
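Continuing the definitions above, a call site then supplies the instance value by hand:

main :: IO ()
main = print (notEq eqBool True False)  -- True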
The obvious downside is that you have to explicitly pass around these instance values, but the upside is you drastically simplify the language, getting rid of an entire separate namespace, syntax and many typechecking rules.
The problem of having to explicitly apply instances can be alleviated by introducing implicit arguments to the language - this is what Agda does, and is something I’m exploring with my own language, which is aiming to be a simplified Haskell. I’ve written a little bit about the design here: https://github.com/hmac/kite/blob/main/docs/implicits.md
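GHC has a rough analogue in its ImplicitParams extension, which gives a flavour of the idea (implicit values rather than Agda-style instance resolution, so only an approximation; Eq' is the dictionary type from the example above, renamed to avoid clashing with Prelude's Eq):

{-# LANGUAGE ImplicitParams #-}

data Eq' a = Eq' { eq :: a -> a -> Bool }

-- The dictionary is an implicit parameter: it must be in scope at
-- the call site, but it is not passed explicitly at each call.
notEq :: (?eqa :: Eq' a) => a -> a -> Bool
notEq x y = not (eq ?eqa x y)

main :: IO ()
main = do
  let ?eqa = Eq' ((==) :: Int -> Int -> Bool)
  print (notEq 1 2)  -- True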
Elm has received some flak for asking for this approach [1][2] (this one is a bit inflammatory) [3]
I’m going to refrain from sharing my thoughts on whether it’s right or wrong to scrap your typeclasses since I’m both biased by the languages I know and not very knowledgeable on what is Actually Good for users of a language in this case. But I found the discourse rather interesting.
Also, notably, they kinda compile down to that under the hood.
Hey, Kite looks really interesting! You should post it as a separate submission so we can discuss and upvote :)
Thank you! It’s still an extremely rough prototype so I don’t feel ready to share it yet, but I have been planning to write some blog posts on the design of the language. I may share those here, if they seem interesting.
I imagine the tricky bit is going to be when you need to satisfy two traits simultaneously. What comes to mind is a directed workflow, which is simultaneously an executable and a graph.
An earlier version of the post addressed this more directly, and I probably need to work it back in… the “obvious” answer to this is to just pass your original value, along with 2+ conversion functions to turn that data-type into the trait-types that you need. Definitely less elegant than just saying A + B + C, but it’s serviceable.
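A hedged sketch of that shape, with invented types standing in for the executable-plus-graph example (Exec, Graph, and process are all made up):

data Exec = Exec { run :: IO () }
data Graph = Graph { nodes :: [String] }

-- No traits: pass the original value plus one conversion function
-- per capability the callee needs.
process :: (w -> Exec) -> (w -> Graph) -> w -> IO ()
process toExec toGraph w = do
  print (nodes (toGraph w))
  run (toExec w)

main :: IO ()
main = process (const (Exec (putStrLn "running"))) (const (Graph ["a", "b"])) ()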
This clearly shows that yes, it’s equivalent in the sense that they can compute similar things, but interface types provide real tangible benefits to language readability and meaning.
Really the only difference is that we manually convert from Friend to Display in Gleam, where Rust already knows how to use Friend as a Display.
to_display has a typo.
You can improve the representation of the “trait record” to get rid of this manual conversion. The result is known as the “dictionary passing” translation of type classes.
Here’s how you’d translate the Display example (I don’t know Gleam, I’m just making it up based on what you wrote in your post):
A function that takes a Display-able type gets this signature: fn my_function<A>(display_impl: Display<A>, value: A, ...) -> ....
This also solves hwayne’s comment about using types at multiple traits. A function that takes a Display + Ord-able type gets this signature: fn my_function<A>(display_impl: Display<A>, ord_impl: Ord<A>, value: A, ...) -> .... You’ve gestured in a similar direction in this comment, and I think improving the representation of the “trait record” helps.
It seems like that has slightly different properties, because there’s no guarantee that display_impl, ord_impl and value are all referencing the same value.
there’s no guarantee that display_impl, ord_impl and value are all referencing the same value.
display_impl: Display<A> and ord_impl: Ord<A> don’t “reference” a value: they’re records of functions that operate on As.
In fn my_function<A>(display_impl: Display<A>, ord_impl: Ord<A>, value: A, ...) -> ..., the body of the function would call display_impl.format(value) to format the value : A, and ord_impl.compare(value, ...) to compare the value : A with something else.
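In Haskell the same multi-dictionary signature looks like this (a hedged sketch; Display, OrdDict and the field names merely mirror the pseudocode above):

data Display a = Display { format :: a -> String }
data OrdDict a = OrdDict { cmp :: a -> a -> Ordering }

-- "Display + Ord" becomes two explicit dictionary parameters. Both
-- mention the same type variable a, so they necessarily agree on
-- which type they describe, even though nothing ties them to one
-- blessed implementation.
describeMax :: Display a -> OrdDict a -> a -> a -> String
describeMax disp ord x y = format disp (if cmp ord x y == GT then x else y)

main :: IO ()
main = putStrLn (describeMax (Display show) (OrdDict compare) (3 :: Int) 5)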
Thanks for catching that incorrect type! Gleam is currently pretty limited in what is valid as a constant, so it wouldn’t quite work the way you describe, which is why I suggested passing conversion functions instead. Still interesting to think about tho! (and maybe Gleam will get more robust constants some day)
Gleam supports both records and functions in constants!
They don’t work with inlined functions like this (unless that got fixed). They have to be defined elsewhere and then referenced.
That’s annoying. In the meantime you could change const display_friend : Display<Friend> to fn display_friend() : Display<Friend> for the same effect.
This is basically the path that led me to low-key reinvent ML modules for Garnet, yes. However, “just” implementing a generic type that contains generic functions is not always trivial.
What do Erlang/Gleam do for resource management? How do I not “leak the socket”? Is the following picture correct:
- a resource is more-or-less just a number (e.g., nothing prevents you from using a socket after it is closed)
- resources are registered with processes, so, if the whole process dies, resources are automatically cleaned up
Is there a way to send resources between processes? Can I have user-defined resources, or do I need a NIF or something?
I assume there’s some with-style higher-order function which makes not leaking the socket past the close call syntactically obvious? And, in Erlang, it’s wrapped in try/catch/rethrow, while Gleam just doesn’t have exceptions?
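I don’t know the actual Erlang/Gleam API, but for flavour, here is the general shape of that with pattern in Haskell, where Control.Exception.bracket does the bookkeeping (the Socket type and its operations are stand-ins):

import Control.Exception (bracket)

-- Stand-in socket API; the real one would come from a networking
-- library.
data Socket = Socket
openSocket :: IO Socket
openSocket = putStrLn "open" >> pure Socket
closeSocket :: Socket -> IO ()
closeSocket _ = putStrLn "close"

-- The socket only exists inside the callback, and close runs even
-- if the callback throws, so leaking it past the close is hard to
-- do by accident.
withSocket :: (Socket -> IO a) -> IO a
withSocket = bracket openSocket closeSocket

main :: IO ()
main = withSocket (\_sock -> putStrLn "use socket")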
What if a function needs its arguments to have more than one trait? :^)
I mentioned this in another reply, but you could just pass the original object + multiple conversion functions. :)
Hm… But does that scale? I would argue that the complexity foisted on the user in that case is greater than the complexity of the problem they’re trying to solve.
And if you’re dynamically typed, you don’t even need the types 😌
Most dynamically typed languages still have types.
lol yes, I’m trying for a joke about static types. Maybe saying “you don’t even need the type declarations” would have been more effective.
I do not think Rust would have been that successful without a packaging system, an LLVM backend, documentation, concurrent programming primitives, etc. R without matrices as first-class citizens would also probably not be that successful.
I understand that the author is looking for the most ‘compact’, easily worded explanation of ‘what is a good language’. I think defining a perimeter of use cases where the most compact definition of ‘expressivity’ applies would help.
It’s not really about good or bad! I love Rust. The post was supposed to be more about “how can we reuse things we’ve learned from traits in languages that don’t have them”