When I was briefly in love with Standard ML, I tried Poly/ML, SML/NJ, and Moscow ML. I remember both Poly/ML and Moscow ML being easier to use than SML/NJ, but I got distracted by SML/NJ’s build tool. I came away somehow with the impression that SML/NJ was larger and more complete, though I don’t actually know whether that’s true.
I know Haskell better than Standard ML, but I sometimes wish ML had become a bigger force. I especially wish its approach to standardizing the language had taken off. If Standard ML had higher-order functors (added by some implementations, I believe), a where syntax like Haskell’s, and a better story for overloading/polymorphism (my impression is that the overloading of math operators for int and real is kind of a hack), I think it would be harder to ignore.
Disclaimer: The formalization of SML is incredibly cool, whatever one may think about its effect on the language’s further evolution.
Everyone who worked on MLs in their early days and helped them become what they are now has my deepest respect and admiration.
However, that article states many things about OCaml that are either wrong or not as simple as it makes them sound, and @yawaramin only scratched the surface in his comment. I’ll omit the points he covered and focus on those he didn’t, mainly in the “not as simple” category.
As I recall, Caml launched with basic modules that worked with UNIX Makefiles to generate .o files, which was regarded as a big win.
OCaml modules today apparently resemble Standard ML’s, if this guy is correct.
Objective Caml (which is now OCaml) has supported functors since at least 1998. I wanted to say “from its inception in 1996”, but I don’t have a source handy to prove it.
Regarding Caml Light, its goal was to bring ML to affordable and widely available machines, which in 1990 meant absurdly low-powered ones. True separate compilation was indeed a big win at the time. Hell, it still is a big win for developer experience and one of the reasons I like OCaml so much — it’s not just fast by itself, but also never rebuilds anything that doesn’t need rebuilding.
The ZINC report, the document on Caml Light’s design and implementation, also mentions issues with ML module syntax at the time:
Second, Standard ML modules are not particularly convenient to use. The problem is that to use some values or types defined in another module… the programmer has to insert their signature … at the beginning of the current module.
I’m not familiar with that part of ML history so I’m not sure what it looked like at the time, but I assume that SML modules and OCaml modules got easy to use around the same time in the mid ’90s.
Weirdest of all, you could actually omit the brackets in (x,y,z), which may be the actual reason why list elements must be separated by semicolons.
One advantage of semicolons as list item separators is that a list of tuples (which is a common way to write an immutable associative array) can be written without any parens: let kvs = ["foo", 1; "bar", 2]. Not requiring parens around tuples also makes multiple bindings look cleaner: let x, y = 1, 2.
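For concreteness, a quick sketch of the previous two points (the bindings below are illustrative, not from the thread):

(* Because parens around tuples are optional, this is a
   single-element list containing one 3-tuple: *)
let t = [1, 2, 3]              (* (int * int * int) list, length 1 *)

(* Semicolons are what actually separate list elements: *)
let l = [1; 2; 3]              (* int list, length 3 *)

(* Which is why an association list needs no parens at all: *)
let kvs = ["foo", 1; "bar", 2] (* (string * int) list *)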
Astonishingly, implementors actually used it, achieving compatibility to such an extent that Isabelle could be compiled with either Standard ML of New Jersey or Poly/ML
A fun thing is that Poly/ML requires a function named main in standalone programs, while MLton and SML/NJ evaluate all top-level declarations in order, so anyone who wants a program to build with both needs to be aware of that. The definition of SML says nothing about the program entry point.
support for multi-threading (also here); they say that OCaml is finally catching up, 15 years later, thanks to having 100 times as much funding
OCaml has supported threads for at least 20 years by now. The big deal about OCaml 5.0 is a parallel GC that doesn’t require pauses and doesn’t worsen single-threaded program performance — the latter part, to my knowledge, is a big breakthrough in memory management for any language.
Poly/ML uses a standard stop-the-world GC, as that very paper says:
The garbage collector is single-threaded. When a garbage collection is required all ML threads are stopped and the collector runs to completion before releasing the ML threads.
I think the new multicore GC for OCaml has stop-the-world minor collections; major collections are parallel.
In general, I think Paulson looks at OCaml through the interactive theorem prover lens, in particular through the Isabelle/HOL lens. That means ML is the meta-language for your proofs and that it’s loaded dynamically. The criticism of HOL Light makes a lot of sense then, because OCaml’s toplevel is not particularly designed for such a use case. However, the main proof assistant written in OCaml, Coq, uses OCaml as its implementation language, not as its meta-language. There, OCaml code natively compiled into a binary works very well (and I suspect it’s faster on a single thread than Poly/ML).
It’s sad that well into the 21st Century, Computer Science has so regressed that people no longer see the point of distinguishing between a programming language and its implementation.
It hasn’t regressed. It has industrialized.
As somebody told me at the time, “the French did not get anything they wanted”. We know how that ended.
They ended up creating Caml which had none of the things they allegedly wanted…? 🤔
Much of the contention was over the ambiguity inherent in the following declaration: let f x = … With ISWIM, this defines f to be a function taking argument x. The same declaration in OCaml similarly declares the function f, unless f is a datatype constructor, when instead it declares x by pattern matching.
Well, no. In OCaml, f cannot be a datatype constructor: constructors have to start with an uppercase letter. There’s actually no ambiguity.
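A quick illustration of the casing rule (the wrapper type below is hypothetical, purely for illustration):

(* f is lowercase, so it cannot be a constructor; this always defines a function: *)
let f x = x + 1

(* A constructor pattern in a let needs a capitalized name: *)
type wrapper = Box of int
let (Box n) = Box 42
let () = print_int (f n)   (* prints 43 *)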
it’s odd that OCaml doesn’t have a streamlined syntax for (drum roll) declaring a function using pattern-matching
Except it does…? Two of them, in fact: one for irrefutable patterns, fun X -> ..., and another for refutable patterns, function X -> ... | Y -> ...
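For concreteness, a sketch of the two forms (the shape type is a made-up example; Float.pi assumes OCaml ≥ 4.07):

(* A hypothetical variant type, purely for illustration: *)
type shape = Circle of float | Square of float

(* fun takes plain, ideally irrefutable, patterns: *)
let scale = fun (x, y) -> (2.0 *. x, 2.0 *. y)

(* function opens a full match on its argument, so refutable
   constructor patterns work: *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Square s -> s *. s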
It’s odd that the floating-point operators are +., -., etc., rather than being overloaded, as they are in almost every other programming language.
It also makes typechecking much easier: for each arithmetic operator you know its exact concrete type, and it’s obvious from the code which arithmetic type you’re working with.
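A minimal sketch of what that looks like in practice (illustrative bindings, not from the thread):

let a = 1 + 2          (* int arithmetic *)
let b = 1.0 +. 2.0     (* float arithmetic *)

(* Mixing them, e.g. writing a +. b without a conversion, is a
   compile-time type error; conversions are explicit: *)
let c = float_of_int a +. b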
Attempts to move forward with ML were stymied by disagreements among its main proponents.
This is ironic after saying that ‘It’s sad that well into the 21st Century, Computer Science has so regressed that people no longer see the point of distinguishing between a programming language and its implementation’ and not seeing the connection.
All Caml strings were mutable, so you could change any constant, even T into F. It’s now fixed in OCaml, but who could overlook such a point when your key objective is soundness?
Mutability of strings has nothing to do with soundness. All mainstream languages have mutable bytestrings; I doubt anyone would go around claiming that this makes them unsound.
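For reference, this is roughly what the “fixed” state looks like in modern OCaml, where string is immutable and in-place mutation goes through bytes (a small illustrative sketch):

let buf = Bytes.of_string "T"
let () = Bytes.set buf 0 'F'
let () = print_string (Bytes.to_string buf)   (* prints "F" *)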
With no way to save an image, you had to rebuild HOL Light from sources at every launch
This doesn’t make sense after saying that Caml emitted .o files and worked with Unix Makefiles just a few paragraphs earlier. It sounds like a HOL thing specifically and not a Caml thing.
But the full analysis library would not finish loading until after lunch. This is a world-beating system?
The bitterness starts to creep in here 😂
It also makes typechecking much easier: for each arithmetic operator you know its exact concrete type, and it’s obvious from the code which arithmetic type you’re working with.
There are quite a few things in OCaml I wish worked differently or that seem like compromises to me, like the uppercase requirement for data constructors and the inability to apply data constructors as functions. But separate operators for floats and integers seem like a feature other languages should borrow. After all, machine floats and integers are fundamentally different, and conversion can lead to results that are, at the very least, surprising.
If a language treats infix operators like functions and allows re-binding them, then any code that mostly works with floats can easily do let (+) = (+.) and avoid the verbosity without compromising type safety.
Yup, in fact I even deliberately left out local opens in OCaml so as not to confuse people too much!
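For concreteness, a minimal sketch of the rebinding trick (the norm function is an illustrative example, not from the thread):

(* Shadow the int operators with their float counterparts in a
   local scope; the arithmetic below is float-typed throughout: *)
let norm x y =
  let (+) = (+.) and ( * ) = ( *. ) in
  sqrt ((x * x) + (y * y))

(* norm 3.0 4.0 evaluates to 5.0 *)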
This doesn’t make sense after saying that Caml emitted .o files and worked with Unix Makefiles just a few paragraphs earlier. It sounds like a HOL thing specifically and not a Caml thing.
SML is an image-based system in many implementations. OCaml is not.
From the article:
you had to rebuild HOL Light from sources at every launch
OCaml and Caml before it are ‘image-based’ in the sense that they compile down to some ‘machine image’ code e.g. bytecode or native executable code. And they are designed to support incremental compilation in almost exactly the same way as C. So it doesn’t make sense to me that HOL Light required recompilation at every launch. Maybe if some linked modules changed, those would need to be recompiled and relinked, sure. Maybe some core parts changed, which would trigger ripple effect recompilations. But on every launch?
are ‘image-based’ in the sense that they compile down to some ‘machine image’ code e.g. bytecode or native executable code.
‘Image-based’ typically refers to something like Smalltalk or many Common Lisp environments, where code is loaded to build up the runtime state of the system, and that state is then persisted and reloaded later as the way of running the program, rather than emitting a binary in a format meant to be invoked by anything other than the original runtime.
That is not what an image-based system means. In HOL Light you are essentially in OCaml’s toplevel REPL. You can’t save an image of the interactive session, which is a problem for HOL Light not only because loading it takes minutes, but also because you can’t save the state of your progress. This is a non-issue in SML (and Lisp, and Smalltalk…).
HOL Light users use some Linux debugging hacks to be able to save and restore the OCaml toplevel.
This is implying that industrialization and formalization are somehow at odds, which they aren’t.
Well, no. In OCaml, f cannot be a datatype constructor: constructors have to start with an uppercase letter. There’s actually no ambiguity.
They’re clearly using f as a placeholder here. The point was that OCaml allowing pattern matching in a let binding was a difference from Standard ML.
It also makes typechecking much easier: for each arithmetic operator you know its exact concrete type, and it’s obvious from the code which arithmetic type you’re working with.
This doesn’t seem to be an issue in any other language, e.g. type classes in Haskell.
Haskell doesn’t have the equivalent of OCaml functors. As far as I know, the combination of functors and typeclasses (and opaque types, which OCaml has) is a research problem and no language currently supports both.
I believe you are right. The advantage of functors is that you can instantiate them multiple times for the same types. Haskell’s polymorphism is based on type classes, which means you can only have a single instance for a given type. That’s very convenient in common cases, but it leaks through in some edge cases in funny ways, like the fact that Haskell needs the Sum and Product newtype wrappers around numbers just to pick which Monoid instance to use when folding them.
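To illustrate the “multiple instantiations for the same type” point, here is a small OCaml sketch (module names are made up): two set modules over int with different orderings, which a single global type-class instance cannot express directly.

module IntAsc = struct
  type t = int
  let compare = compare            (* ascending order *)
end

module IntDesc = struct
  type t = int
  let compare a b = compare b a    (* descending order *)
end

(* Two instantiations of the same functor at the same element type;
   AscSet.t and DescSet.t are distinct, incompatible types. *)
module AscSet = Set.Make (IntAsc)
module DescSet = Set.Make (IntDesc)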
It’s far from clear, because if we look at the actual syntax, it’s pretty unambiguous thanks to the casing. The OP’s use of a ‘meta’ f, when the case is actually important, is misleading at best.
This doesn’t seem to be an issue in any other language, e.g. type classes in Haskell.
It’s actually a rather big issue in these languages. Haskell will give you indecipherable errors like ‘No instance for Num a found…’, and Elm and others have weird pseudo-number types that they use to get around the issue. OCaml keeps it simple and gives you a clear type error: you want an int or a float? Pick one.
Can you expand on what you mean by “industrialized”?
In industry (as opposed to academia), many tools/languages are driven by industrial needs and not by research agendas. Usually the industrial backers are not going to sit back and wait several years for their language to be formalized.
But every new OCaml feature is formalized before being added to the language. All parts of OCaml are formalized; it just lacks a formal specification.
I think my meaning was clear, I’m talking about the specification of the language as opposed to its implementation. That’s pretty clearly the meaning from the OP too.
But the argument put forward was that due to commercial pressures there is no time to do formalization, while the opposite happens with OCaml: not only is there time, the formalization happens before implementation. So this cannot be the reason OCaml lacks a formal specification.
Yes, my language was slightly imprecise, but to probably 95% of programmers there’s not that much difference.