I’m not a mega type-theory head, so I don’t get something about their type system (also is it all type-checked at runtime)? In this elixirconf video, Guillaume Duboc introduces the difference between weak and strong Elixir function types, where
$ integer() -> integer()
def id_weak(x), do: x
is a weakly typed Elixir function, so it’s of type dynamic(). But
$ integer() -> integer()
def id_weak(x) when is_integer(x), do: x
is strongly typed, of type integer(). I thought the point of typing was you can restrict the types of values that are passed to a function, so if this restriction is enforced, how are the two types any different; and if this restriction isn’t enforced, how is an Elixir type different to a comment informing the user of the type and then type-checking with guards at runtime?
I don’t get something about their type system (also is it all type-checked at runtime)?
Type systems and their associated terminology are really abused by the non-PLT crowd, so let’s establish some definitions here by simplifying the definitions used in modern type theory.
A judgement is something that is externally known about something by construction.
A proposition is an internal hypothesis about something. Its truth depends on the internal rules of the considered model.
Types in a program are roughly the judgements about program fragments and the associated propositions that can be proven about other program fragments and the program as a whole.
A type system is a syntax-directed set of rules which can be used by the type-checker to restrict the set of allowable programs to the set of programs where certain propositions can be proven statically.
The term “dynamically-typed language” is a misnomer, because they are actually unityped, but it’s so ubiquitous that I will roll with it further. In dynamically-typed languages, what they call “types” are not types as defined above, but sets defined by propositions. As they lack a type system, they check the propositions (whether a value belongs to a set) at runtime. They call this dynamic process “type-checking”, but again, this is a misnomer. This time I will go against the industry terminology, and roll with the PLT terminology and call this process “dynamic checking” because it’s important to make this distinction later on.
A very important property of (statically) typed languages is type erasure. What this roughly means is that types do not play any role in the dynamic semantics of the language. Types are erased after compilation, the dynamic semantics can’t refer to types to drive computation.
Elixir has dynamic checking and they are adding a type system to the language. As per the paragraph above, the type system pays no role in the dynamic semantics of the language. After passing through the type checker, (correct) programs will return completely unchanged from what they are today. The runtime will continue to execute them in the same way, and they will still do the same dynamic checks they did before (almost, I am simplifying here somewhat, obviously with a type system it’s safe to remove many of those checks, but that’s an optimization that doesn’t change how you should think of all this).
Because the Elixir dynamic semantics won’t change, and contain certain known dynamic checks, the type system they are adding can rely on the presence of those checks. This is important from pragmatic reasons, hence gradual typing and weak and strong function types.
In case what I have said above is not clear enough, the type system they are adding to Elixir is about proving (enforcing) propositions statically, and type checking happens at compile time.
$ integer() -> integer()
def id_weak(x), do: x
is a weakly typed Elixir function, so it’s of type dynamic()
No, its type is (weak) integer() -> integer().
$ integer() -> integer()
def id_weak(x) when is_integer(x), do: x
is strongly typed, of type integer()
No, its type is (strong) integer() -> integer().
I thought the point of typing was you can restrict the types of values that are passed to a function
The point is restricting programs to the ones where useful propositions can be proven statically, and therefore relied on.
It’s about program behavior, and how you (and the type system) can reason about program behavior, and in this case function behavior.
how are the two types any different
I will get to the main question you are asking in a bit, but first I want to clarify something.
They are different in the static semantics because one is weak and the other one is strong. They are not making a syntactical distinction, however, which might be a point of confusion for some. The distinction exists only in the static semantics. Although the paper makes some notes on possibly making a syntatic distinction also.
so if this restriction is enforced, how are the two types any different; and if this restriction isn’t enforced, how is an Elixir type different to a comment informing the user of the type and then type-checking with guards at runtime?
The functions are different because what you can deduce from their usage is different. With a strong function type you know that if a function returns, it will return a value from its codomain. This is a very useful property to know because it affects what other propositions can be proven about the program as a whole. It makes more things provable statically!
You might think that this hypothesis of whether a function returns makes the system useless, but that is far from the case. In almost every language, a function might not return at all (divergence), or it might throw an exception. Or more subtly, it could perform an effect (which is not reflected in the type, for most languages). Types for functions are still useful even under these circumstances. For Elixir specifically, strong functions may return a type error exception, but as I said, they could always return an exception anyway.
Of course that with a stronger type system you could prove (and enforce) more stuff. With a dependently-typed language you can formalise the entirety of math. They are adding a gradual type system due to pragmatic and ergonomic constraints. It is still amazingly useful. Even with gradual types, Elixir’s type system is a stronger type system than what most popular languages out there have. Union and intersection types allow very precise type specifications, and because they also have singletons this allows for a limited form of encoding dependency. This is huge!
As an aside, you can think of dynamic() in Elixir as an existential type (and very importantly it’s a type that is not a supertype of any other type, nor a decidable subtype, thereby creating a modality). But what can you do with this existential type, how do you unpack it? Well fortunately there already exists an unpacking mechanism realized by the existing Elixir runtime, so just use that!
Note that this is really not too dissimilar in spirit to using reflection in Haskell via Typeable. In both cases the compiler generates a comparable value that stands in for the identity of a type, and you have to write code that is dynamically checked at run time in order to make use of it. (Of course it’s not quite the same because of the automatic materialization done in Elixir.)
I hope that you think that Haskell is useful even if it has Typeable! It doesn’t weaken Haskell in any way, it makes Haskell more flexible. Same with dynamic() in Elixir.
The term “dynamically-typed language” is a misnomer, because they are actually unityped, but it’s so ubiquitous that I will roll with it further. In dynamically-typed languages, what they call “types” are not types as defined above, but sets defined by propositions. As they lack a type system, they check the propositions (whether a value belongs to a set) at runtime. They call this dynamic process “type-checking”, but again, this is a misnomer. This time I will go against the industry terminology, and roll with the PLT terminology and call this process “dynamic checking” because it’s important to make this distinction later on.
The ubiquitous terminology strikes me as more useful than this terminology.
Specifically, I think it’s useful to distinguish between:
“static type checking” - type mismatches are reported at build time
“dynamic type checking” - type mismatches are reported at runtime
“no type checking” - type mismatches are not reported, and silently result in undefined program behavior—e.g. assembly language
This “unityped” definition doesn’t distinguish between assembly language’s complete lack of type checking and runtime type-checkers, and I think that distinction is a useful one to make.
Dynamic “typing” is not about types, it’s about sets. Terms in a “dynamically”-typed (DT) language inhabit a single type, hence unityped, which is, btw, standard terminology in PLT.
There is a lot of difference between DT languages when it comes to safety, and indeed it is something worth discussing and worth mentioning when comparing them. But safe DT languages gain their safety properties through something very different than type safety.
Types classify terms. “Dynamic” types classify values and are actually mathematical sets, not mathematical types.
It’s funny how programmers not familiar with mathematics confuse types with sets, and mathematicians who don’t work in the foundations of mathematics confuse sets with types.
Nit: I might suggest that you change “static type checking” to “type mismatches are reported outside of runtime” here, as Elixir/Erlang’s main “type checking” software has long been Dialyzer, a static type discrepancy analyzer that runs outside of runtime, but also separate from the usual build steps.
I think it often boils down to PLT folks trying to hold on to “type” as being a static proposition because that has very different behavior from runtime “tags” and the behaviors resulting from checking them. If you want to ensure those are distinct you’re driven to that sort of PLT-y terminology.
“dynamic type checking” - type mismatches are reported at runtime
“no type checking” - type mismatches are not reported, and silently result in undefined program behavior—e.g. assembly language
This “unityped” definition doesn’t distinguish between assembly language’s complete lack of type checking and runtime type-checkers, and I think that distinction is a useful one to make.
Isn’t this distinction between strong and (very) weak typing, an axis orthogonal to static typing vs dynamic typing/unityping?
However, there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the “strength” of the type systems of mainstream programming languages.
I’m not a mega type-theory head, so I don’t get something about their type system (also is it all type-checked at runtime)? In this elixirconf video, Guillaume Duboc introduces the difference between weak and strong Elixir function types, where
is a weakly typed Elixir function, so it’s of type
dynamic()
. Butis strongly typed, of type
integer()
. I thought the point of typing was you can restrict the types of values that are passed to a function, so if this restriction is enforced, how are the two types any different; and if this restriction isn’t enforced, how is an Elixir type different to a comment informing the user of the type and then type-checking with guards at runtime?Type systems and their associated terminology are really abused by the non-PLT crowd, so let’s establish some definitions here by simplifying the definitions used in modern type theory.
A judgement is something that is externally known about something by construction.
A proposition is an internal hypothesis about something. Its truth depends on the internal rules of the considered model.
Types in a program are roughly the judgements about program fragments and the associated propositions that can be proven about other program fragments and the program as a whole.
A type system is a syntax-directed set of rules which can be used by the type-checker to restrict the set of allowable programs to the set of programs where certain propositions can be proven statically.
The term “dynamically-typed language” is a misnomer, because they are actually unityped, but it’s so ubiquitous that I will roll with it further. In dynamically-typed languages, what they call “types” are not types as defined above, but sets defined by propositions. As they lack a type system, they check the propositions (whether a value belongs to a set) at runtime. They call this dynamic process “type-checking”, but again, this is a misnomer. This time I will go against the industry terminology, and roll with the PLT terminology and call this process “dynamic checking” because it’s important to make this distinction later on.
A very important property of (statically) typed languages is type erasure. What this roughly means is that types do not play any role in the dynamic semantics of the language. Types are erased after compilation, the dynamic semantics can’t refer to types to drive computation.
Elixir has dynamic checking and they are adding a type system to the language. As per the paragraph above, the type system pays no role in the dynamic semantics of the language. After passing through the type checker, (correct) programs will return completely unchanged from what they are today. The runtime will continue to execute them in the same way, and they will still do the same dynamic checks they did before (almost, I am simplifying here somewhat, obviously with a type system it’s safe to remove many of those checks, but that’s an optimization that doesn’t change how you should think of all this).
Because the Elixir dynamic semantics won’t change, and contain certain known dynamic checks, the type system they are adding can rely on the presence of those checks. This is important from pragmatic reasons, hence gradual typing and weak and strong function types.
In case what I have said above is not clear enough, the type system they are adding to Elixir is about proving (enforcing) propositions statically, and type checking happens at compile time.
No, its type is (weak)
integer() -> integer()
.No, its type is (strong)
integer() -> integer()
.The point is restricting programs to the ones where useful propositions can be proven statically, and therefore relied on.
It’s about program behavior, and how you (and the type system) can reason about program behavior, and in this case function behavior.
I will get to the main question you are asking in a bit, but first I want to clarify something.
They are different in the static semantics because one is weak and the other one is strong. They are not making a syntactical distinction, however, which might be a point of confusion for some. The distinction exists only in the static semantics. Although the paper makes some notes on possibly making a syntatic distinction also.
The functions are different because what you can deduce from their usage is different. With a strong function type you know that if a function returns, it will return a value from its codomain. This is a very useful property to know because it affects what other propositions can be proven about the program as a whole. It makes more things provable statically!
You might think that this hypothesis of whether a function returns makes the system useless, but that is far from the case. In almost every language, a function might not return at all (divergence), or it might throw an exception. Or more subtly, it could perform an effect (which is not reflected in the type, for most languages). Types for functions are still useful even under these circumstances. For Elixir specifically, strong functions may return a type error exception, but as I said, they could always return an exception anyway.
Of course that with a stronger type system you could prove (and enforce) more stuff. With a dependently-typed language you can formalise the entirety of math. They are adding a gradual type system due to pragmatic and ergonomic constraints. It is still amazingly useful. Even with gradual types, Elixir’s type system is a stronger type system than what most popular languages out there have. Union and intersection types allow very precise type specifications, and because they also have singletons this allows for a limited form of encoding dependency. This is huge!
As an aside, you can think of
dynamic()
in Elixir as an existential type (and very importantly it’s a type that is not a supertype of any other type, nor a decidable subtype, thereby creating a modality). But what can you do with this existential type, how do you unpack it? Well fortunately there already exists an unpacking mechanism realized by the existing Elixir runtime, so just use that!Note that this is really not too dissimilar in spirit to using reflection in Haskell via
Typeable
. In both cases the compiler generates a comparable value that stands in for the identity of a type, and you have to write code that is dynamically checked at run time in order to make use of it. (Of course it’s not quite the same because of the automatic materialization done in Elixir.)I hope that you think that Haskell is useful even if it has
Typeable
! It doesn’t weaken Haskell in any way, it makes Haskell more flexible. Same withdynamic()
in Elixir.The ubiquitous terminology strikes me as more useful than this terminology.
Specifically, I think it’s useful to distinguish between:
This “unityped” definition doesn’t distinguish between assembly language’s complete lack of type checking and runtime type-checkers, and I think that distinction is a useful one to make.
Dynamic “typing” is not about types, it’s about sets. Terms in a “dynamically”-typed (DT) language inhabit a single type, hence unityped, which is, btw, standard terminology in PLT.
There is a lot of difference between DT languages when it comes to safety, and indeed it is something worth discussing and worth mentioning when comparing them. But safe DT languages gain their safety properties through something very different than type safety.
Types classify terms. “Dynamic” types classify values and are actually mathematical sets, not mathematical types.
It’s funny how programmers not familiar with mathematics confuse types with sets, and mathematicians who don’t work in the foundations of mathematics confuse sets with types.
Nit: I might suggest that you change “static type checking” to “type mismatches are reported outside of runtime” here, as Elixir/Erlang’s main “type checking” software has long been Dialyzer, a static type discrepancy analyzer that runs outside of runtime, but also separate from the usual build steps.
https://www.erlang.org/doc/man/dialyzer.html
I think it often boils down to PLT folks trying to hold on to “type” as being a static proposition because that has very different behavior from runtime “tags” and the behaviors resulting from checking them. If you want to ensure those are distinct you’re driven to that sort of PLT-y terminology.
Isn’t this distinction between strong and (very) weak typing, an axis orthogonal to static typing vs dynamic typing/unityping?
From the article in that link:
I agree with Wikipedia on this.
It can type check at compile time, similarly to what TypeScript is doing.