1. 40

    1. 22

      My perception of the 5.1 release.

      The 5.0 release corresponds to the integration of the Multicore OCaml runtime. It was a very large change, an almost-complete rewrite of the core of the OCaml runtime (in particular the garbage collector). The 5.0 runtime gave up on supporting some features of OCaml 4.x: most native-compilation backends were not updated to support effect handlers and were thus unavailable in 5.0 (only amd64 and arm64 were released, iirc.), some runtime services were completely gone (in particular major-heap compaction and statmemprof, the statistical memory profiler), and some parts of the runtime code had cut corners to be simpler and easier to make multicore-safe, at the cost of performance.

      After the 5.0 release there was a fair amount of work to restore those features that people care about (still not done today; compaction and statmemprof are work in progress). We also heard from various users that were negatively affected by performance regressions, and stayed on 4.14. Based on this feedback we improved the runtime performance around weak pointers and ephemerons, and for dynamic linking. (In 5.0 the dynamic linker had a data-structure pessimized from linear to quadratic complexity, which most programs did not notice but unsurprisingly turned out to be a blocker for plugin-heavy programs, with vastly increased startup times.)

      These last few releases, independently of the Multicore work, have also seen a lot of refactoring action on the type-checker codebase. The type-checker is a complex codebase (more than other parts of the compiler) with some amount of technical debt, and there is an ongoing effort to refactor it and to write and improve its documentation.

      The 5.1 development period also saw a focused effort to reduce the installed size of the OCaml compiler distribution, prompted by a January 2023 blog post by Fabrice Le Fessant documenting its gradual growth over time. The installed size was cut in half, from 521 MiB to 272 MiB. This was done by removing some variants from the default install rules (we would install both bytecode and native versions of some rarely-used tools, now bytecode-only), removing debug information for some of those installed binaries, and enabling compression of the compiler build artifacts. We use zstd; the artifacts are now 35-40% of their uncompressed size, at no noticeable performance cost on compression or decompression. The compressed serialization routines are also available as part of the standard library. This is however proving to be a headache for some downstream distributions (see issue #12562): we don’t want to link libzstd statically by default because it is fairly large (832KiB, when the whole OCaml runtime system today is 376KiB), but dynamic linking causes sysadmin troubles on some platforms (homebrew, Windows) where it is easy for users to remove libzstd, or for their package manager to move it to a different location, without noticing that they are silently breaking their OCaml binaries.

      Finally, there are also a lot of small quality-of-life improvements for users. A surprisingly large amount of work goes into making error messages clearer, on each release. I don’t think most people realize how much work it takes to provide sensible diagnostics in error scenarios. The OCaml compiler, and I am sure most other language implementations, has been gradually improving bits and pieces of its error message surface continually for years now, and this is never-ending work. I wonder if there are design principles that would let compiler authors get it right, or at least better, from the start.
      (A slightly frustrating aspect of this work is that users rarely notice improvements, and they often complain about error messages. We get a lot of feedback of the form “language X has nicer error messages!”, but there are various reasons why this tends to be very difficult to turn into actionable improvements.)

      1. 4

        Thank you so much for this detailed report!

        On the topic of compiler errors, as someone who recently wrote a little report on their first steps with OCaml and had some minor complaints about error messages: how can we help make these complaints more actionable?

        1. 5

          Let me first clarify that I was not trying to suggest that user reports about error messages are at fault. User reports are of varying quality, but the problem at hand is really what makes things difficult. Let me go into more detail:

          • Some implementation technologies mean that providing error messages requires more work – sometimes a lot more.

            • OCaml uses an LR-family parser generator, and there is no existing approach to produce good syntax error messages. Some languages have given up on parser generators for this reason and reverted to hand-written parsers. This is easy if you have tons of engineers that are willing to write and maintain very boring code, for example in a big company, but OCaml did not choose this route (it is not clear that it would have had the resources to do it); we stuck with LR parser generators, which have other benefits (they are efficient, let you know about ambiguities in the grammar, can rather simply be made incremental for IDE usage, etc.). Generating decent error messages for LR parser generators is possible, probably better than doing it by hand, but because no one has done it before it is still an open research problem. And we (the OCaml community) have someone working on this research problem (as others have before), namely Frédéric Bour, who has been working on this on-and-off for a few years – the current state of the work can be found in the lrgrep repository. So the plan is to wait until this work finishes, and adopt it in OCaml – it is “a couple years away” and has been for a few years now. Nothing shocking for a research project (OCaml has been around since the late nineties, we can wait a couple of years for syntax errors), but it means that our answer to “Syntax error messages are bad” is more like “we know, we are waiting for the solution to come”. Note that your report is an example of writing let x = foo ; bar instead of let x = foo in bar. This is well-known among OCaml practitioners as the most common difficult syntax error. Most syntax errors are easy to fix, in the sense that the location of the error alone lets you (once you know the OCaml grammar well enough) troubleshoot the issue and fix it easily; people familiar with OCaml don’t need a good error message. A few syntax errors are hard, because the location reported is far away from the source of the error – the grammar of the language makes it so that it could have been correct all along, it is just that no one writes this. let x = foo ; is the most common hard error.
            • Another example of a design choice that makes error messages difficult is Hindley-Milner type inference. This form of type inference enables a very lightweight programming style with very few type annotations, and many people like this, but it also makes error messages harder. The less type inference, the easier it is to pinpoint typing errors. There has been a lot of work on improving the error messages of HM type inference languages (I personally like the Bayesian approach of SherrLoc), but they usually come at a fair cost in implementation complexity, and those type-checkers are already fairly complex beasts, so to my knowledge these approaches have never been integrated in a production compiler – but sometimes in teaching-oriented, simplified implementations.
          • Some language features and their implementation choices make error messages difficult. For example GADTs typically provide bad error messages; OCaml is doing a rather poor job at it I think, but other languages with GADTs are not faring much better. In general, the more advanced the feature, the worse the error messages.

          • Some aspects of error messages could be improved without any hard research work, but just a fair amount of engineering work for which we don’t usually have resources available. For example, Rust has set a standard where you get delicately hand-carved ASCII diagrams to show the various source elements that an error message is referring to, with a customizable palette inspired by your favorite painter. This looks great and we could probably do it, but it requires a fair amount of work that nobody has volunteered to do. (In fact the situation there has improved recently thanks to the grace library; it still requires a fair amount of integration work if we want to consider having it in the compiler, and the language maintainers are busy with other things right now.)
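
          To make the let x = foo ; bar mistake mentioned above concrete, here is a minimal sketch. The names with_in, with_semicolon and y are made up for illustration, and the description of how the parser groups the text is my reading of the grammar, not compiler output.

          ```ocaml
          (* Intended form: bind x locally with in, then use it. *)
          let with_in () =
            let x = 20 in
            x + 1

          (* Mistaken form, kept in a comment because it does not parse:

               let with_semicolon () =
                 let x = 20 ;
                 x + 1

             The parser groups this as  let x = (20 ; x + 1)  and keeps waiting
             for an in, so the error is only reported further down the file. *)

          (* At top level the ; form is accepted, with a different meaning:
             y is bound to the value of the whole sequence. *)
          let y = ignore 20 ; 21
          ```

          The last definition is why the parser cannot complain at the semicolon itself: up to that point the text is a valid prefix of a correct program.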

          Regarding user reports, which again are generally not the issue, here are some things that can make them less actionable:

          • Users tend to report something that they don’t like without thinking too much about what error message could be shown instead. It tends to be useful when they come up with a proposal for replacement. (Sometimes the discussion then goes as follows: your idea for an error message only works in your specific situation, but the error we are talking about shows up in other situations where your suggestion would be misleading, so it is not actually a great idea.)
          • Some users are fixated on the cosmetic aspects of messages (“language X does it better because it uses colors”). I think that cosmetics matter for some people, so they are a matter to take seriously, but it also tends to be hard to convince time-constrained language maintainers to add ANSI escape codes when they have difficult bugs or features to work on that require their full expertise. I also have the intuition (pure guess!) that sometimes people complain about visual aspects as a scapegoat for another source of confusion or discomfort that they have not put their finger on, distracting them (and us) from the underlying issue.
          • Some users complain about an error message involving feature X by showing, in comparison, an error message involving feature Y (in the same language or another) that they find much clearer. Sure! Usually it is because it is much easier to provide a clear error message for Y than for X.

          All this being said, over time we got a lot of nice drive-by contributions to improve error messages. Useful small things such as “using red for errors and orange for warnings”, or “highlight the program types and program source fragments in the middle of the error message”, or “quote the line that we are referring to instead of just showing the location”, all this was contributed by external contributors and improved the OCaml compiler messages quite a bit. (None of them were as easy to implement as it sounds. How do you detect if the error output is going to a terminal that supports colors? What is your fallback for highlighting for dumb terminals and braille readers? Can you actually reliably find the source code line from the source location, what if the user used a preprocessing step or lexer directives?)
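
          To give an idea of what the color question involves, here is a rough sketch of one possible heuristic. It is not the compiler's actual logic: should_use_color and from_environment are hypothetical names, and checking TERM and NO_COLOR is an assumption about what a reasonable check might look at.

          ```ocaml
          (* Hypothetical heuristic, not the compiler's actual logic.
             is_tty:   whether the output channel is a terminal
             term:     value of the TERM environment variable, if set
             no_color: value of the NO_COLOR environment variable, if set *)
          let should_use_color ~is_tty ~term ~no_color =
            let term_ok =
              match term with
              | None | Some "" | Some "dumb" -> false
              | Some _ -> true
            in
            let opted_out =
              match no_color with
              | None | Some "" -> false
              | Some _ -> true
            in
            is_tty && term_ok && not opted_out

          (* Example wiring: the caller supplies the terminal check,
             the environment variables are read here. *)
          let from_environment ~is_tty =
            should_use_color ~is_tty
              ~term:(Sys.getenv_opt "TERM")
              ~no_color:(Sys.getenv_opt "NO_COLOR")
          ```

          Even this small version has three separate ways to say no, and it still says nothing about which colors the terminal can actually render.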

          1. 1

            Another example of design choice that makes error messages difficult is Hindley-Milner type inference.

            My rule of thumb is to use type annotations in function signatures and mostly leave them out otherwise. Works pretty well.
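
            A small sketch of what I mean (total_length and add are made-up names): the signature is annotated, the local helper is left to inference.

            ```ocaml
            (* Annotated signature: argument and result types are stated up front. *)
            let total_length (words : string list) : int =
              (* Local helper left unannotated; its type is inferred. *)
              let add acc w = acc + String.length w in
              List.fold_left add 0 words
            ```

            In my experience a mistake in the body then tends to be reported inside this function rather than at some distant call site.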

            1. 2

              If you use this style, you will typically have decent typing error messages today. But then maybe if everyone was forced to use this style (as is done in some other languages), we could use a simpler type-inference algorithm that would provide even better type errors – or at least make it less work to provide good type errors. I am not saying that this would necessarily be a good change – obviously there is a lot of value in being lightweight with annotations, at least in some problem domains – or suggesting that the language should be changed. Just pointing out that if we decide as a language to offer powerful type inference, we have to deal with the general case (which is also the worst case) when implementing type errors.

              (Note: in the past it has been proposed to add an “easy” mode to the language, meant for beginners, with some restrictions on the features that one can use and tailored, better error messages. This is an interesting idea (see also Racket language levels). The problem is that we don’t know of a good way to make this maintainable in the type-checker implementation, without a lot of code duplication or test noise between the two modes. The type-inference codebase is already a complex mess, so we cannot really afford to add more complexity to it.)

      2. 1

        We get a lot of feedback of the form “language X has nicer error messages!”, but there are various reasons why this tends to be very difficult to turn into actionable improvements.

        Obviously, “X is better at Y” is not super helpful feedback. Is it just the case that more specific critiques of error messages would be helpful? Or are there other important reasons it is difficult to improve error messages in OCaml?

        1. 2

          Good question! I wrote a novella about this there.

    2. 2

      #11904: Remove arm, i386 native-code backends that were already disabled at configuration time

      I have a feeling that this should have been mentioned in the changelog introduction and not just deep inside it. I do agree that nobody who cares about OCaml 5.0 is likely to care about 32-bit x86 and ARM32, but it’s still a significant change!

      It’s great to see so many stdlib functions finally become tail-recursive, though.
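
      For readers wondering how a function like map can be tail-recursive at all, one way is the tail_mod_cons attribute (available since OCaml 4.14). The sketch below is a hypothetical re-implementation for illustration; I am assuming, not asserting, that the stdlib versions are written along these lines.

      ```ocaml
      (* Hypothetical re-implementation for illustration, not the stdlib source.
         The attribute asks the compiler to compile the recursive call in
         constructor position without growing the stack. *)
      let[@tail_mod_cons] rec map f = function
        | [] -> []
        | x :: rest ->
          let y = f x in
          y :: map f rest
      ```

      The function is written in the natural recursive style; the attribute changes how it is compiled, not how it reads.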

      1. 22

        The user-visible change happened during the 5.0 release, where these native backends were disabled. 5.1 did a code cleanup by removing the dead code.

        Some context for non-experts: OCaml 5.0 disabled native compilation for i686 and arm32, but bytecode compilation is still supported – it is possible to run OCaml code on those systems, just not as fast. The bytecode runtime requires much less target-specific maintenance and is available on more architectures (all those that Debian supports, for example). Maintainers have been eager to remove native compilation support for 32-bit systems, to remove a lot of dark corner cases in the compiler and make their life easier going forward. (Note: js_of_ocaml, the compiler from OCaml to JavaScript, starts from the OCaml bytecode; it is not a native compilation architecture. Among 32-bit systems running OCaml code, JavaScript engines are probably the most common nowadays.)

        1. 1

          Oh, I see, I misinterpreted the 5.0 changelog and thought that those backends were simply disabled in the configuration and were supposed to be re-enabled in the future. Thanks for the clarification!

          I agree that native 32-bit is now more trouble than it’s worth.