Good writeup on the thought process that went behind this decision. I’d have reservations about making floats and ints the same because I can’t think of a situation where it’s ok for a value that is an integer to be a float, especially in a configuration language, but personally I’d prefer it if all configuration languages allowed you to specify a type as a union of discrete integer values like CUE does, and the author is clearly not going for that.
My new language is also functional with a JSON subset. I’ve been considering the same issues.
The author’s “referential transparency” requirement actually concerns the equality operator. It states that there is only one equality operator, and that this operator tests for operational equivalence. For all pure functions f, f x equals f y implies that x and y are operationally equivalent. (In the traditional definition, a language can be referentially transparent without this restriction on equality.)
If your language is sufficiently simple and inexpressive, then you can get away with having only one equality operator, which tests for operational equivalence. However, that stops working as your language adds more features and becomes more expressive. As they have noticed, one extension that breaks this is distinguishing ints from floats, if you also want 1 == 1.0. Another extension that breaks this, as they have noted, is making the order of elements in a dict semantically significant while still wanting the equality operator to ignore dict order.
My language is too expressive to be constrained this way, so I decided to have two equality operators. There is the “normal” equality operator, which sometimes considers two values that are not operationally equivalent to still be equal. Then there is the “strict” equality operator, which I call “equivalence”, and this one tests for operational equivalence. [My language doesn’t limit itself to pure ASCII, so I use ≡ as the equivalence operator. Other designers might choose to use === instead.]
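To make the distinction concrete, here is a rough Python sketch of the two operators. It is only an illustration, not how my language implements them: the names `equal` and `equivalent` are made up for this example, and it ignores negative zero and NaN entirely.

```python
def equal(a, b):
    """'Normal' equality: 1 equals 1.0, and dict key order is ignored."""
    return a == b


def equivalent(a, b):
    """'Strict' equality: a stand-in for operational equivalence."""
    if type(a) is not type(b):
        # 1 and 1.0 can be told apart, so they are not equivalent.
        return False
    if isinstance(a, dict):
        # Treat key order as significant, unlike ==.
        return len(a) == len(b) and all(
            equivalent(ka, kb) and equivalent(va, vb)
            for (ka, va), (kb, vb) in zip(a.items(), b.items())
        )
    if isinstance(a, list):
        return len(a) == len(b) and all(
            equivalent(x, y) for x, y in zip(a, b)
        )
    return a == b
```

With these definitions, `equal(1, 1.0)` holds but `equivalent(1, 1.0)` does not, and two dicts with the same entries in a different order are equal but not equivalent.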
Problems with numeric equality get worse if you decide to implement all of the features of IEEE floating point numbers, and you include the two special “numbers” negative zero (-0.0) and not-a-number (NaN).
Negative zero is not equivalent to positive zero even under simple arithmetic operators. For example, 1.0/0.0 is not equal to 1.0/(-0.0). But the IEEE standard specifies that the numeric equality operator should treat both zeroes as equal.
NaN is not equal to itself, which breaks the numeric equality operator so badly that it is no longer an equivalence relation, and this makes code using equality harder to reason about.
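Both quirks are easy to see in any language with IEEE doubles. A small Python sketch follows; note that Python raises ZeroDivisionError on 1.0/0.0, so `math.copysign` and `math.atan2` stand in for the division example here.

```python
import math

pos_zero, neg_zero = 0.0, -0.0
nan = float("nan")

# IEEE numeric equality says the two zeroes are equal...
zeros_equal = pos_zero == neg_zero

# ...but pure functions can still tell them apart.
zero_signs = (math.copysign(1.0, pos_zero), math.copysign(1.0, neg_zero))
zero_angles = (math.atan2(0.0, pos_zero), math.atan2(0.0, neg_zero))

# NaN is not equal to itself, so == is not reflexive.
nan_equals_itself = nan == nan

print(zeros_equal, zero_signs, zero_angles, nan_equals_itself)
```

So the zeroes are "equal" but not operationally equivalent, while NaN is operationally equivalent to itself but not "equal".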
My language is a very high level language, not a systems language, so I decided to leave out negative zero and NaN, in order to have more intuitive semantics that are easier to reason about.
And of course, this problem with equality is very general; it has nothing to do with gradual types.
What I love about this is the framework the author builds for making intelligent decisions about this language design. Clearly identifying the list of features that they desire (and observing that they conflict) is a really good way to go about making the decision.
Personally, I might have chosen to give up the ability to compare integers and floats instead. Admittedly, JSON doesn’t have the distinction between float and int, but having different types be incomparable is a longstanding feature of most languages with types and I don’t think it is very painful to give up.
I definitely agree with the author that “Referential transparency” is a superficially easy thing to give up, but that the costs are hidden and high.
My take in this space is if you want the benefits of static typing, you really want homogeneously typed arrays (and sets/dictionaries), and that would be the tradeoff I would make. Thus a set will never need to compare integers and floats since you can only have a “set of ints” or a “set of floats”. You can then choose option #4 and make the user explicitly convert between types before comparison (which sometimes can uncover bugs).
Being a strict JSON superset doesn’t help out here, since it is inherently dynamic, but you can claw back a JSON-compatible data type in a static language using union or sum types. (If you use union types not sum types the equality question does rear its head again, though, so if you follow this thread sum types would be preferred - it is interesting how these decisions relate and it was really good to read the author following through and comparing the various consequences).
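As a rough illustration of what option #4 buys you, here is a Python sketch. The helper `strict_eq` is hypothetical, and a runtime check stands in for what a static type checker would reject at compile time.

```python
def strict_eq(a, b):
    """Equality that refuses to compare values of different types."""
    if type(a) is not type(b):
        raise TypeError(
            f"cannot compare {type(a).__name__} with {type(b).__name__}; "
            "convert explicitly"
        )
    return a == b


big = 2**53 + 1

# The explicit float(...) calls make the lossy conversion visible:
# both integers round to the same double.
same_after_conversion = strict_eq(float(big), float(2**53))
```

Here `strict_eq(big, float(2**53))` raises a TypeError, and once the user writes the conversion out, `same_after_conversion` is True even though the two integers differ. That is the kind of bug an explicit conversion can surface.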
Would a numeric tower make dropping referential transparency in this specific case more palatable?