Well, I understand it’s frustrating to have to use a language you don’t like at work. I mainly program in OCaml, so this might come out as defensive in places, but I do think some of the criticism is valid!
For a start, yeah, the syntax has known flaws (especially wrt nesting; it’s also not great for formatting imho). The lack of line comments is annoying. However, lexing inside comments is an actual feature that allows for nested comments (so you can comment out something with comments in it, no problem).
The interface issue is kind of a thing, but you can use objects, first-class modules, or go C-style with a record of functions. This is a lot easier than in C because OCaml has closures, so it’s just a bit manual, not difficult. I have my gripes with some standard types not being interfaces (in particular, IO channels…). But overall this is a bit exaggerated, as it’s not hard to implement your own. The thing about Int.abs puzzles me: what else do you expect Int.abs min_int to return, anyway?
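To make the record-of-functions approach concrete, here’s a minimal sketch; the `logger` type and its field names are made up for illustration:

```ocaml
(* A hypothetical "interface" as a record of functions. Closures
   capture whatever state the implementation needs. *)
type logger = {
  log : string -> unit;
  close : unit -> unit;
}

(* An implementation that appends to a Buffer. *)
let buffer_logger buf = {
  log = (fun msg -> Buffer.add_string buf (msg ^ "\n"));
  close = (fun () -> ());
}

(* "Decorating" an implementation is just wrapping its closures. *)
let prefixed prefix inner =
  { inner with log = (fun msg -> inner.log (prefix ^ msg)) }
```

Compared to C, there’s no manual `void *` context pointer to thread around: the closures close over `buf` and `inner` directly.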
And of course, the ecosystem. Yeah, there are multiple standard libraries (although I’d consider batteries to have lost most momentum). It’s painful. It’s made a lot of progress in the past years (e.g. ocaml-lsp-server is a pretty solid LSP server now!) but it’s true that OCaml remains a small language. This part is pretty accurate.
However, the last paragraph annoys me more. Yes, OCaml is a lot more functional than Java or Rust. It’s not Haskell, but: in OCaml, immutable structures are both efficient and idiomatic. In Java, they’re not idiomatic; in Rust, I know of one library (im) that explicitly relies on them, everything in the stdlib is mutable, and most of the ecosystem embraces mutability. In addition, Java does not actually have functions, and Rust has a strong distinction between functions and closures. In OCaml, all functions are closures, and it’s seamless to create closures thanks to the GC. Try writing CPS code in Rust and see if it’s as trivial as in OCaml! Tail calls are not easy to emulate…
So there are reasons to use OCaml in 2023. If you work on some algorithm-heavy, symbolic domain like compilers or logic tools (historically a strength of OCaml) it’s still one of the best languages; it’s not tied to a big corporation, compiles to native code with reasonable resource requirements.
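To illustrate the CPS point above, a tiny sketch of continuation-passing style in OCaml; every recursive step allocates a closure, and the GC makes that a non-event:

```ocaml
(* CPS factorial: each step captures n and the continuation k in a
   fresh closure instead of using the call stack's return path. *)
let rec fact_cps n k =
  if n = 0 then k 1
  else fact_cps (n - 1) (fun r -> k (n * r))
```

`fact_cps 5 (fun r -> r)` evaluates to 120, and the recursive calls to `fact_cps` are all tail calls.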
However, lexing inside comments is an actual feature that allows for nested comments (so you can comment out something with comments in it, no problem).
You don’t need to actually lex the contents of comments for that, just skim through the character stream looking for opening and closing delimiters and counting how many you find.
Agree with your second-to-last paragraph though; a lot of me learning Rust was learning how not to try to write it as if it were OCaml. Inconvenient closures and lack of partial evaluation were particularly painful.
You don’t need to actually lex the contents of comments for that, just skim through the character stream looking for opening and closing delimiters and counting how many you find.
Yes, that’s what OCaml’s lexer does, but it has to account for string literals as well, to handle these:
(* "this is what a comment opening in OCaml looks like: (*" *)
If you don’t look at the ", you would think that a “*)” is missing. That’s why (* " *) is not valid.
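A small self-contained illustration of why this matters: the commented-out line below contains “*)” inside a string literal, and the file still compiles because the lexer lexes the string inside the comment.

```ocaml
(* The next line comments out code whose string contains "*)".
   Because OCaml lexes string literals inside comments, the comment
   ends at the final delimiter, not at the one inside the string. *)
(* let s = "a fake closer: *)" *)
let still_compiles = true
```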
The Rust code you linked is a great example of why OCaml does this. In OCaml, you can surround any valid code with (* *) to comment out that code (and only that code). That is not the case in Rust – as you demonstrated, surrounding valid code with /* */ commented out the rest of the file, because Rust treats the /* inside the string literal as the start of a nested comment.
As opposed to not allowing a comment to contain an unmatched " in it? You can have one or the other but not both, it seems: comments can invisibly contain string delimiters, or strings can invisibly contain comment delimiters. I guess I personally find the Rust method less surprising.
in Rust, I know of one library (im) that explicitly relies on that,
This is a bit more subtle. In Rust, there’s basically zero semantic difference between immutable and mutable collections, so there’s simply no need to have them. This is qualitatively different from languages with Java-like semantics.
But the bit about closures is spot on. Rust punishes you hard if you try to write higher-order code, mostly because there’s no “closure type”.
I know what you mean, but the stdlib collections, for example, are mutable, because the only way to add or remove elements is literally to mutate them in place. The basic feature of functional collections is that you can cheaply “modify” them by creating a new version without touching the old one. In Rust, im does that, but Vec or HashMap certainly don’t, and to keep both the new and old versions you need a full copy.
OCaml has pretty decent support for immutable collections, as do Clojure, Scala, Erlang/Elixir, Haskell, and probably a few others. It’s not that common otherwise and it’s imho a big part of what makes a language “functional”.
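For instance, with the stdlib’s persistent Map, “updating” returns a new version and leaves the old one intact, sharing the unchanged subtrees:

```ocaml
module M = Map.Make (String)

(* m2 is a new version; m1 is untouched, and the two share the
   parts of the tree that didn't change. *)
let m1 = M.empty |> M.add "a" 1 |> M.add "b" 2
let m2 = M.add "a" 10 m1
```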
The basic feature of functional collections is that you can cheaply “modify” them by creating a new version without touching the old one
I am not sure. My gut feeling is that in the vast majority of cases, functional programs do not retain historical, old versions of collections. Cheap modification is a spandrel, the actually important bit is value semantics: collections cannot be modified by someone else from under your feet.
You’d be surprised! A use case that I think the compiler has, for example, is to keep the scoping environment stored in each node of the post-typechecking AST. As a pure functional structure, each environment is only slightly different from the previous one, and you can store thousands of variants in the AST for one file. If you do incremental computations, the sharing is also pretty natural between incrementally modified versions.
This updates me a bit, as compilers are indeed a fairly common use case for FP, and in compilers you do want scopes to be immutable. But I think scopes are usually a specialized data structure, rather than an off-the-shelf map? Looking at OCaml’s own compiler:
It seems that they have “scope which points at the parent scope” plus an immutable map, and the map is hand-coded. So this looks closer to “we have a specialized use case and implement a specialized data structure for it”, rather than “we re-use facilities provided by the language because they are a great match”.
Sure, this is also for bootstrapping reasons I think. A lot of things in the compiler don’t directly depend on the stdlib because they come too early, or that’s my understanding anyway. Parts are also very old :-).
You can also find some instances of functional structures being used in backtracking algorithms (pretty sure alt-ergo uses them for that, maybe even CVC5 with custom immutable structures in C++). You just keep the old version and return to it when you backtrack.
I personally have some transaction-like code that modifies state (well… an environment) and rolls back to the previous env if some validation failure occurs. It’s also much simpler when everything is immutable.
Granted these are relatively niche but it shows how convenient the ability to keep old versions can be. And the language does facilitate that because the GC is optimized for allocating a lot of small structures (like tree nodes), and there’s easy functional updates of records, etc.
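A sketch of that transaction pattern (the names are made up); rollback is free because the old environment is never touched:

```ocaml
module Env = Map.Make (String)

(* Apply an update, validate the result, and keep the old env on
   failure. No undo log needed: the old version still exists. *)
let try_update env key value validate =
  let env' = Env.add key value env in
  if validate env' then env' else env
```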
collections cannot be modified by someone else from under your feet.
The other edge case is if you have errors and need to back out of the current operation. Immutable data means you will always have the original copy to fall back to if you want to retry the operation. If you are using single-owner mutability and have partially modified the collection, then backing out is trickier, because you need to undo all of the operations to get back to the original state.
I think that they are nearly equivalent, but still different enough to have some tradeoffs. For example, Erlang-style “let it crash” languages might be better off with full immutability so that they don’t corrupt state on crashes (or things will get complicated).
It’s interesting because I don’t see this distinction being drawn anywhere.
FWIW, in AccessKit, I previously used im, so that when updating the accessibility tree (which is implemented as a hash map of nodes keyed by node ID), I could have a complete snapshot of the tree before applying the update, without paying the cost of copying all nodes (as opposed to just the ones that were changed or removed). I stopped using im because its license (MPL) is incompatible with my goals for AccessKit, and to avoid compromising performance by copying the whole map on every update, I had to compromise my design; I no longer have a complete snapshot of the tree as it was before the update.
Yup, sometimes you do want to have historical snapshots, and immutable collections are essential for those cases as well. Another good example here is Cargo: it uses im to implement dependency solving with a trial-and-backtracking method.
My point is that’s not the primary reason why immutable collections are prevalent in functional programs. It’s rather that FP really wants to work with values, and immutability is one way to achieve that.
More generally, I’ve observed that “Rust doesn’t have/default to immutable collections, so FP style in rust is hard” is a very common line of reasoning, and I believe it to be wrong and confused (but for a good reason, as that’s a rather subtle part about Rust). In Rust, ordinary vectors are values (http://smallcultfollowing.com/babysteps/blog/2018/02/01/in-rust-ordinary-vectors-are-values/), you really don’t need immutable data structures unless you specifically want to maintain the whole history.
As someone who has used (and loved) OCaml professionally for almost seven years now, I found myself nodding along with many of the points here. The few times I’ve had to interact with the default standard library (comes up when writing macros) have been very unpleasant. Using Core fixes many of the listed issues, but it has been my experience that using Core will make you the target of some ire in the open source community. Writing OCaml also pretty much requires using macros for exactly the points listed. Which, again, some stigma there.
The polymorphism stuff takes a lot of getting used to. It takes a lot of judgment to know when to use functors and when to use first-class modules (or even, in rare cases, when to use objects). Both have downsides, and neither are as ergonomic as typeclasses or traits. Modular implicits have been just around the corner for ten years now; I stopped holding my breath. Coding with existential types (e.g. making a heterogeneous list where all elements conform to the same “interface”) is a bit more cumbersome than in Rust or Haskell.
The syntax is byzantine and the language is very difficult to write correctly without a code formatter pointing out mistakes like the ones highlighted in the article (but pretty trivial once you have one). I think it’s easier to learn than Haskell’s syntax – those whitespace rules, jeez – but that’s not a very high bar.
The open-source tooling is bad enough that, as much as I love OCaml, I don’t use it very often for personal projects. I just don’t want to start every coding session figuring out what’s wrong with my opam switch this time. Dune is a huge leap forward, though.
I don’t know why it’s a problem that OCaml has both persistent and mutable types in the language – that’s one of my favorite parts! The ability to move seamlessly between “low-level” imperative code and high-level data-oriented code within the same module is a massive strength of the language.
The last point feels like a non sequitur. What does “functional” mean? Something different to everyone.
In my opinion there is no reason to use OCaml in a new project in 2023. If you have a reason to think that OCaml is the best choice for a new project please let me know your use case, I’m genuinely curious.
I dunno; this feels like an attempt to incite an adversarial conversation, which is clever link-aggregator optimization. But I don’t really disagree. The niche is writing code that executes quickly and will still work correctly a decade from now, after extensive changes and extensions and refactors by multiple people over multiple years. But correctness doesn’t really matter for most problems, and neither does performance. Popular web servers and services are one exception, but most web servers are not popular at all, and their lifetime is only measured in years.
I love OCaml, and if you want a garbage-collected language with a powerful type system that compiles to fast, native code, your choices are pretty much OCaml or Haskell. It has been my experience (although note that this might be a lack of expertise on my part) that it is much, much easier to write fast code in OCaml than in Haskell (for some definition of “fast”). The ability to look at code and accurately predict the instructions that it will compile to is very important for some problems.
I’m glad that I got to use it full-time and had a chance to see how incredibly productive OCaml is after the steep learning curve. Because I think that if I had only tried the language for fun, I would come out with exactly this takeaway.
I think like 90% of the problems with the syntax can just be summed up as “operator precedence is bullshit”.
Once you accept this, you learn to just spam everything with parens all the time, and it looks a bit awkward, but you stop running into issues. It’s frustrating in the sense that it’s a very silly language design mistake to make, but not that frustrating in terms of working around the problem once it’s identified.
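The classic example is a nested match: without the parentheses, the inner match silently swallows the outer match’s remaining cases.

```ocaml
(* The parentheses around the inner match are load-bearing; drop
   them and `| None -> fallback` attaches to the *inner* match
   (here that's a type error; in other cases it compiles and just
   does the wrong thing). *)
let describe opt fallback =
  match opt with
  | Some x -> (match x with 0 -> "zero" | _ -> "nonzero")
  | None -> fallback
```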
I agree that the build situation is pretty bad; most of the documentation assumes you already know the thing you’re referencing. But all the same, I found it really enjoyable to use for the handful of smaller programs I’ve written in it.
Yeah, I mean kinda, but I wouldn’t let it off that easily. That’s something that you get used to in a few days – writing OCaml expressions correctly is the easy part.
The problem with OCaml syntax is that there’s so much of it. There’s the structure language, the signature language, the expression language. There’s a syntax for writing patterns, and a syntax for declaring types – and two syntaxes for type annotations. There are macro extension points, which come in three (or is it four?) varieties. The object system has a whole other notation that looks nothing like the rest of the language, in both expressions and type signatures…
It’s easy to get over the initial hump and learn most of the expression language in a few days. But even then… there are so many little-used corners of the language that you won’t stop encountering brand new syntax – syntax! – for years. Contravariance annotations. Refutation cases. Extensible variants. Locally abstract types.
And when you meet a new syntax for the first time, it’s really hard to google for it! You just have to turn to the person next to you, or else hope you get lucky looking through the language spec. Maybe ChatGPT makes this trivial now.
At least, this has been my experience with the language. I love OCaml, but I do not love its syntax.
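For a taste of what I mean by the object system’s notation, here’s an immediate object; `val`, `method`, and `#` appear nowhere else in the language:

```ocaml
(* An immediate object with mutable internal state; calls use #,
   not ordinary function application. *)
let counter = object
  val mutable n = 0
  method incr = n <- n + 1
  method get = n
end
```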
Using Core fixes many of the listed issues, but it has been my experience that using Core will make you the target of some ire in the open source community.
I’ve hacked on a codebase using Core, and it’s a constant minor irritation. Everything is slightly different and I have to keep both the standard and Core documentation open in two tabs just to understand what’s going on. It’s not the end of the world, but it’s definitely uncomfortable.
…if you want a garbage-collected language with a powerful type system that compiles to fast, native code, your choices are pretty much OCaml or Haskell.
Yeah. I’ve been able to get quite far with RPython instead, but the type system’s just not strong enough to express some stuff. I’m going to try rewriting that Core codebase in a week or two, and I don’t think I can make any better choice than to keep it in OCaml and incrementally clean it up with the standard refactoring patterns.
There are two fundamental problems with type classes. The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference.[1]
This is not the first article to complain about these points in OCaml and still I am surprised how different my judgement of OCaml is. It’s pretty much my happy place.
That’s not to say the issues OP mentions don’t exist. The syntax problems were surprising at first. And when starting out, I felt like I had to make a choice about a standard library before I could continue learning. As it turns out, the actual standard library is not half as bad as its reputation, and the choice of standard library extension/replacement doesn’t matter as much as I initially feared. From that point of view, it’s unfortunate that Real World OCaml uses Core instead of just the standard library.
As for the tooling, I’m genuinely satisfied with what we have. I think dune is an excellent build system and opam is at least decent. Granted, it took me a bit to find out that I should use one local switch per project but I haven’t had any issue since I made that change. It might be my C++ background, but I don’t get all the complaints about OCaml tooling.
Many years ago, an MIT spin out created a hardware description language as a Haskell DSL. They later added SystemVerilog syntax to it, to make it more appealing to hardware developers. They open sourced it a few years ago.
I bring this up because, in spite of being taught Haskell as a student, using it with SystemVerilog syntax was the first time I enjoyed the language. I love the semantics but I can’t stand the syntax and it gets in the way of using the language. And, to be clear, SystemVerilog syntax is awful, that just shows what a low bar for acceptable syntax Haskell fails to cross for me.
I had a similar reaction to ML. It feels like the syntax optimises for typing speed at the expense of readability, and I struggle to read anything written in SML or OCaml. I have had a bit more luck with F#, which adds just enough Simula-style syntax that I can follow it.
Could you name your 3 biggest problems with the syntax? Or is it just a general dislike?
If it’s that bad my prime suspect would be the function call syntax:
f (g x) y
vs:
f(g(x), y)
Because it’s simply so different from everything else. I personally have no problem with it because it is so concise, and such a good fit for currying (that thing where “achktually, functions take only 1 argument”)… and I’ve been exposed to that first year in college. But if it’s not among your top 3 problems I’m really curious what is.
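To spell that out: application is left-associative and every function takes one argument at a time, so partial application falls out for free.

```ocaml
(* f (g x) y parses as (f (g x)) y: apply f to (g x), then apply
   the resulting function to y. *)
let add x y = x + y    (* int -> int -> int *)
let add3 = add 3       (* partial application: int -> int *)
```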
Could you name your 3 biggest problems with the syntax? Or is it just a general dislike?
It’s been a while, so probably not. The difference may well be the biggest thing. For Haskell, it was mainly the currying with some implicit precedence that I couldn’t figure out (I write an A -> B -> C function but I meant to write an (A -> B) -> C function, or some similar confusion).
For ML, things like the match expressions are very concise, which means they’re hard to read until you’ve had a lot of practice, and that initial difficulty meant I was never motivated to practice.
It’s not purely the difference though. I learned Smalltalk and Objective-C and really liked their message-send syntax, even though it’s completely different from Simula-style method-call syntax.
is not valid.Nope. Why would your comments care whether or not there’s an unclosed string delimiter in them? Rust works this way for example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=973d237c97542b8e34623fdad0423f9c Various langs do various things with various flavor of “raw” strings, but those are a nice-to-have, not an essential feature.
You can do it by having your lexer state machine have entirely separate sub-lexers for strings and block comments, like this pseudocode:
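The original pseudocode didn’t survive here, but a rough OCaml sketch of that idea might look like this; the comment sub-lexer hands off to a string sub-lexer whenever it sees a quote, which is roughly what OCaml’s own lexer does:

```ocaml
(* Skip a block comment, tracking nesting depth. Assumes the
   opening "(*" has already been consumed. String literals inside
   the comment get their own sub-lexer, so a quoted "*)" doesn't
   close the comment. *)
let rec skip_comment chars depth =
  match chars with
  | '*' :: ')' :: rest -> if depth = 1 then rest else skip_comment rest (depth - 1)
  | '(' :: '*' :: rest -> skip_comment rest (depth + 1)
  | '"' :: rest -> skip_comment (skip_string rest) depth
  | _ :: rest -> skip_comment rest depth
  | [] -> failwith "unterminated comment"

(* Skip a string literal; the opening quote is already consumed. *)
and skip_string chars =
  match chars with
  | '\\' :: _ :: rest -> skip_string rest  (* escape sequence *)
  | '"' :: rest -> rest
  | _ :: rest -> skip_string rest
  | [] -> failwith "unterminated string"
```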
Would https://lib.rs/crates/hamt-rs work?
As someone who has used (and loved) OCaml professionally for almost seven years now, I found myself nodding along with many of the points here. The few times I’ve had to interact with the default standard library (comes up when writing macros) have been very unpleasant. Using
Core
fixes many of the listed issues, but it has been my experience that usingCore
will make you the target of some ire in the open source community. Writing OCaml also pretty much requires using macros for exactly the points listed. Which, again, some stigma there.The polymorphism stuff takes a lot of getting used to. It takes a lot of judgment to know when to use functors and when to use first-class modules (or even, in rare cases, when to use objects). Both have downsides, and neither are as ergonomic as typeclasses or traits. Modular implicits have been just around the corner for ten years now; I stopped holding my breath. Coding with existential types (e.g. making a heterogeneous list where all elements conform to the same “interface”) is a bit more cumbersome than in Rust or Haskell.
The syntax is byzantine and the language is very difficult to write correctly without a code formatter pointing out mistakes like the ones highlighted in the article (but pretty trivial once you have one). I think it’s easier to learn than Haskell’s syntax – those whitespace rules, jeez – but that’s not a very high bar.
The open-source tooling is bad enough that, as much as I love OCaml, I don’t use it very often for personal projects. I just don’t want to start every coding session figuring out what’s wrong with my
opam
switch this time. Dune is a huge leap forward, though.I don’t know why it’s a problem that OCaml has both persistent and mutable types in the language – that’s one of my favorite parts! The ability to move seamlessly between “low-level” imperative code and high-level data-oriented code within the same module is a massive strength of the language.
The last point feels like a non-sequitor. What does “functional” mean? Something different to everyone.
I dunno; this feels like an attempt to incite an adversarial conversation, which is clever link-aggregator optimization. But I don’t really disagree. The niche is writing code that executes quickly and will still work correctly a decade from now, after extensive changes and extensions and refactors by multiple people over multiple years. But correctness doesn’t really matter for most problems, and neither does performance. Popular web servers and services are one exception, but most web servers are not popular at all, and their lifetime is only measured in years.
I love OCaml, and if you want a garbage-collected language with a powerful type system that compiles to fast, native code, your choices are pretty much OCaml or Haskell. It has been my experience (although note that this might be a lack of expertise on my part) that it is much, much easier to write fast code in OCaml than in Haskell (for some definition of “fast”). The ability to look at code and accurately predict the instructions that it will compile to is very important for some problems.
I’m glad that I got to use it full-time and had a chance to see how incredibly productive OCaml is after the steep learning curve. Because I think that if I had only tried the language for fun, I would come out with exactly this takeaway.
I think like 90% of the problems with the syntax can just be summed up as “operator precedence is bullshit”.
Once you accept this, you learn to just spam everything with parens all the time, and it looks a bit awkward, but you stop running into issues. It’s frustrating in the sense that it’s a very silly language design mistake to make, but not that frustrating in terms of working around the problem once it’s identified.
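Two of the classic precedence traps, as a sketch (examples mine):

```ocaml
let f x = x * 2

(* Application binds tighter than any infix operator: *)
let a = f 3 + 1     (* parses as (f 3) + 1 = 7, not f (3 + 1) *)

(* Unary minus in argument position needs parens: *)
let b = f (-1)      (* writing [f -1] would parse as the subtraction f - 1 *)
```

Once you’ve been bitten by each of these once, the spam-parens-everywhere instinct kicks in and they stop costing you time.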
I agree that the build situation is pretty bad; most of the documentation assumes you already know the thing you’re referencing. But all the same, I found it really enjoyable to use for the handful of smaller programs I’ve written in it.
Yeah, I mean kinda, but I wouldn’t let it off that easily. That’s something that you get used to in a few days – writing OCaml expressions correctly is the easy part.
The problem with OCaml syntax is that there’s so much of it. There’s the structure language, the signature language, the expression language. There’s a syntax for writing patterns, and a syntax for declaring types – and two syntaxes for type annotations. There are macro extension points, which come in three (or is it four?) varieties. The object system has a whole other notation that looks nothing like the rest of the language, in both expressions and type signatures…
It’s easy to get over the initial hump and learn most of the expression language in a few days. But even then… there are so many little-used corners of the language that you won’t stop encountering brand new syntax – syntax! – for years. Contravariance annotations. Refutation cases. Extensible variants. Locally abstract types.
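For readers who haven’t met those corners, here is roughly what two of them look like (toy example, names mine):

```ocaml
(* Locally abstract types: [type a.] lets a GADT match refine its result. *)
type _ ty = Int : int ty | Bool : bool ty

let default : type a. a ty -> a = function
  | Int -> 0
  | Bool -> false

(* Refutation cases: [-> .] tells the checker a branch is impossible. *)
type empty = |
let absurd : empty -> 'a = function _ -> .
```

Neither of these looks like anything in the core expression language, which is exactly the complaint: each corner is its own mini-notation.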
And when you meet a new syntax for the first time, it’s really hard to google for it! You just have to turn to the person next to you, or else hope you get lucky looking through the language spec. Maybe ChatGPT makes this trivial now.
At least, this has been my experience with the language. I love OCaml, but I do not love its syntax.
Fair enough; I should have added a caveat that I haven’t built any large or medium-sized systems in OCaml yet, as much as I’d like to.
Yes please! (Also, smaller binary size, faster compiler, and as you point out, more predictable than Haskell.)
I’ve hacked on a codebase using Core, and it’s a constant minor irritation. Everything is slightly different and I have to keep both the standard and Core documentation open in two tabs just to understand what’s going on. It’s not the end of the world, but it’s definitely uncomfortable.
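A small example of the kind of drift I mean (the Core line is left as a comment so the snippet runs without Core installed): Core shadows the stdlib modules and, among other changes, makes higher-order arguments labeled.

```ocaml
(* Stdlib: positional function argument. *)
let doubled = List.map (fun x -> x * 2) [1; 2; 3]

(* With Core opened, List is shadowed and the same call needs a labeled ~f:
   let doubled = List.map ~f:(fun x -> x * 2) [1; 2; 3] *)
```

Each individual difference is tiny; the irritation is that there are hundreds of them and you have to remember which world you’re in.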
Yeah. I’ve been able to get quite far with RPython instead, but the type system’s just not strong enough to express some stuff. I’m going to try rewriting that Core codebase in a week or two, and I don’t think I can make any better choice than to keep it in OCaml and incrementally clean it up with the standard refactoring patterns.
This is not the first article to complain about these points in OCaml and still I am surprised how different my judgement of OCaml is. It’s pretty much my happy place.
That’s not to say the issues OP mentions don’t exist. The syntax problems were surprising at first. And when starting out, I felt like I had to make a choice about a standard library before I could continue learning. As it turns out, the actual standard library is not half as bad as its reputation and the choice of standard library extension/replacement doesn’t matter as much as I initially feared. From this point it’s unfortunate that Real World OCaml uses Core instead of just the standard library.
As for the tooling, I’m genuinely satisfied with what we have. I think dune is an excellent build system and opam is at least decent. Granted, it took me a bit to find out that I should use one local switch per project but I haven’t had any issue since I made that change. It might be my C++ background, but I don’t get all the complaints about OCaml tooling.
Many years ago, an MIT spin out created a hardware description language as a Haskell DSL. They later added SystemVerilog syntax to it, to make it more appealing to hardware developers. They open sourced it a few years ago.
I bring this up because, in spite of being taught Haskell as a student, using it with SystemVerilog syntax was the first time I enjoyed the language. I love the semantics but I can’t stand the syntax and it gets in the way of using the language. And, to be clear, SystemVerilog syntax is awful, that just shows what a low bar for acceptable syntax Haskell fails to cross for me.
I had a similar reaction to ML. It feels like the syntax optimises for typing speed at the expense of readability, and I struggle to read anything written in SML or OCaml. I have had a bit more luck with F#, which adds just enough Simula-style syntax that I can follow it.
Could you name your 3 biggest problems with the syntax? Or is it just a general dislike?
If it’s that bad my prime suspect would be the function call syntax:

    f x y

vs:

    f(x, y)
Because it’s simply so different from everything else. I personally have no problem with it because it is so concise, and such a good fit for currying (that thing where “achktually, functions take only 1 argument”)… and I’ve been exposed to that first year in college. But if it’s not among your top 3 problems I’m really curious what is.
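Concretely (my example), the “one argument at a time” reading is what makes partial application free:

```ocaml
(* [add] looks like it takes two arguments, but its type is
   int -> (int -> int): one argument in, a function out. *)
let add x y = x + y

let add3 = add 3    (* applying one argument yields a new function *)
let seven = add3 4
```

No wrapper lambdas, no `bind`, no `partial` helper — the call syntax and the semantics are the same thing.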
It’s been a while, so probably not. The difference may well be the biggest thing. For Haskell, it was mainly the currying with some implicit precedence that I couldn’t figure out (I write an A -> B -> C function but I meant to write an (A -> B) -> C function, or some similar confusion).
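For reference (my illustration, not the parent’s code): the arrow associates to the right, so `A -> B -> C` already means `A -> (B -> C)`; the genuinely different type is `(A -> B) -> C`, which takes a function as its argument.

```ocaml
(* These two annotations name the same type: *)
let f : int -> (int -> int) = fun x y -> x + y
let g : int -> int -> int = f

(* Whereas this one consumes a function: *)
let h : (int -> int) -> int = fun k -> k 10
```

Mixing those two shapes up is easy when you’re new, and the compiler errors it produces are not especially friendly.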
For ML, things like match expressions are so concise that they’re hard to read until you’ve had a lot of practice, and that initial difficulty meant I was never motivated to practice.
It’s not purely the difference though. I learned Smalltalk and Objective-C and really liked their message-send syntax, even though it’s completely different from Simula-style method-call syntax.