I enjoyed the course a great deal, but I recommend it with reservations. I’d written some elementary Haskell implementing the H-99 exercises, and read John Whitington’s OCaml From The Very Beginning, before taking this course, so I was already familiar with writing tiny, beginner-level typed functional programs. For me it was a fun collection of exercises that familiarized me with writing small programs in OCaml. I particularly enjoyed writing the arithmetic interpreter and the toy database.
But sometimes I struggled to understand the instructions for the exercises, and without the benefit of the community forum I would have stayed stuck. From reading the frustrated posts of some students, it seemed like those without prior exposure to functional programming were not having a good time. The pacing also felt uneven: some weeks I finished the exercises in hardly any time at all, while others took a great deal of time.
So I think it was a good course for motivated students with some prior experience with functional programming basics like map and fold. But someone with no previous functional programming experience might want to look at other resources first. My favorite is The Little Schemer, which isn’t about typed functional programming but does a tremendous job of getting you comfortable with recursion and higher-order functions. I also enjoyed OCaml From The Very Beginning and think it would also be a good introduction.
I saw that the organizers of the course are planning a second edition. Hopefully they can even out some of the rough patches and make it a more enjoyable experience for programmers brand new to functional programming.
Is there a difference between a Scala program where everything is explicitly typed and the same program with all of the types removed? For example, I cannot think of a case in OCaml where, if I wrote a program explicitly typing everything and it compiled, then removed the explicit types, it would not continue to compile and function the same way as before.
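To make the question concrete, here is a toy OCaml sketch (my own example, not from the article): the fully annotated and fully unannotated versions compile to the same inferred types and behave identically.

```ocaml
(* Fully annotated: every parameter and the return type are written out. *)
let add (x : int) (y : int) : int = x + y

(* The same function with all annotations removed: the compiler
   infers the identical type, int -> int -> int. *)
let add' x y = x + y

let () =
  assert (add 2 3 = add' 2 3)
```

The annotations constrain inference but add no new information here, which is why deleting them changes nothing.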
As for the article, I think this is one huge benefit of signatures in OCaml. The first thing I do in my OCaml programs is define the .mli for a module, hide whatever types need hiding, and give it a clear API; then I implement it. Inside the module I don’t provide explicit types for anything, unless I’m debugging a compiler error, but the interface gives me a firewall of protection. I know, in a program, that I just need to make the implementation of the interface internally consistent. The interface file is a meeting point where modules shake hands and agree on the types of values they will use.
You typedef int to your new type
This is much weaker than the guarantee one gets from OCaml, if I understand correctly. In OCaml, you would do something like:
module Duration : sig
type t
val to_int : t -> int
val of_int : int -> t
end
This makes Duration.t a type distinct from int: you cannot cast between them, but must ask nicely for the implementation to give you an int. Whether or not this distinction matters to someone else I cannot say; to me it is very powerful and useful. But it’s good that Go at least did not go with the C typedef semantics.
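For illustration, a minimal implementation of that signature might look like this (the module body is my own sketch; internally t is just an int, but the signature keeps that fact private):

```ocaml
module Duration : sig
  type t
  val to_int : t -> int
  val of_int : int -> t
end = struct
  type t = int       (* the representation is an int...          *)
  let to_int d = d   (* ...but outside the module, t is opaque   *)
  let of_int n = n
end

let five = Duration.of_int 5

(* let bad : int = five        <- rejected: Duration.t is not int *)
(* let worse = five + 1        <- also rejected, no casts allowed *)

let () = assert (Duration.to_int five = 5)
```

The commented-out lines show the point of the abstraction: the only way to move between int and Duration.t is through the two functions the signature exposes.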
I think you’re correct. Go’s approach to OO is unusual, it goes for composition rather than inheritance so there’s no real subtyping. Polymorphism is possible with duck-typed Java-style interfaces.
Actually, Go’s solution is fairly well understood, and it is subtyping; it’s just structural subtyping. OCaml actually has the same thing, although it’s mostly unused.
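A small toy example of what OCaml’s structural typing for objects looks like (my own sketch): any object that has the required method satisfies the open object type, much like satisfying a Go interface.

```ocaml
(* The annotation < speak : string; .. > is an open object type:
   any object with a [speak : string] method is accepted, whatever
   other methods it may have. *)
let greet (x : < speak : string; .. >) = x#speak

let dog = object
  method speak = "woof"
  method fetch = true   (* extra methods don't matter *)
end

let cat = object
  method speak = "meow"
end

let () =
  assert (greet dog = "woof");
  assert (greet cat = "meow")
```

Neither dog nor cat declares any relationship to the other or to greet; the compatibility is purely structural, which is the sense in which OCaml “has the same thing.”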
Thanks for your comment!
I am the author; in a sense it is “a rewrite for rewrite’s sake,” though it’s also part of a much larger project, a collection comprising an init, a service manager, and a device manager. The decision to use sinit as a source of inspiration came from the notion that, for PID 1, simplicity is obviously key.
Why OCaml? Down the line it’ll really assist with the development of the service manager in particular as the benefits of a functional language for dependency and requirement handling are numerous, and I think it makes sense to keep them all in the same language. I also feel it helps raise some awareness to OCaml’s suitability as a systems programming language.
I’m not ruling out adding more features to Lyrica in the future, though I’m currently very happy with where it is at the moment. Should I want to do that, or should anyone else for that matter, OCaml’s memory safety is also an attractive aspect compared to a PID1 written in C.
Not to be “that guy”, but I’ve brought up inlining as part of the reason I use GHC Haskell for my work and I’ve gotten responses that were along the lines of:
OCaml doesn’t need inlining! It’s fast already…cuz strict!
Separate compilation uber alles!
Inlining, even if it makes some code faster, makes it harder to know how fast your code will be! HT @apy
I, for one, am glad OCaml’s compiler is producing faster code. I just wish people would stop making excuses when it’s as simple as labor inputs. There’s nothing wrong with OCaml, it just stands to benefit from some more love.
Another issue for me is concurrency. Async and Lwt are not convincing when I have a huge repertoire that includes STM.
Are there any papers or documentation on Flambda? I’d like to compare how it works with the GHC inliner, as I am very curious about the differences. An example of something interesting with inlining is how it interacts with typeclasses and modules.
SML is arguably a nicer language than Haskell or OCaml simply because it’s a small core with relatively few corners that make no sense. I like to think of it as the intersection of the sane parts of most functional languages, even though historically it predates most of them. It’s got a nice module language, and the expression language is minimal but “enough” for everything that I want. I’ve also written a few compilers for ML, so I know the language well enough that I don’t get surprised by what’s in there, and I can hold most of the language in my head and regularly exercise all of it. I also like the fact that I have multiple implementations and an exceedingly well-defined language; stability is arguably a practical reason, but it also just appeals to me aesthetically.
This means that it’s great whenever I’m writing something that has essentially zero dependencies. Things like simple compilers and example programs (particularly for teaching, since there are fewer sharp edges to the language) are quite pleasant to write in SML because I don’t have to fight with the language. On the other hand, the second I need that one library because I want to snarf some JSON and generate sexps on the other end, or even do simple things like produce nice colorized or Unicode output, I’m basically dead in the water. There’s a tragic irony in SML being wonderful to build abstractions in when there’s almost no one to share them with! The lack of ecosystem, build tools, package management, etc., means that it’s hard to build things which depend on other things. As soon as I’m writing something that isn’t self-contained, it’s Haskell or OCaml time.
TLDR: SML when able, OCaml/Haskell when not
I cannot speak for Scala, but that is true of the object part of OCaml (which is very infrequently used). OCaml also has parametric polymorphism, which does not involve subtyping; it just lets you express the relationships between the types in an expression. That is how one implements functions like map: in OCaml its type is ('a -> 'b) -> 'a list -> 'b list, which is not possible to express in Go in a type-safe way.
So this isn’t about code specialization or monomorphization; it’s about what the type system lets one express (or in this case, not express).
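To make the contrast concrete, here is map written out in OCaml; the single definition is checked once and works at any pair of element types, with no casts:

```ocaml
(* map : ('a -> 'b) -> 'a list -> 'b list
   'a and 'b are type parameters: one definition, every element type. *)
let rec map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  match xs with
  | [] -> []
  | x :: rest -> f x :: map f rest

let () =
  (* Instantiated at int -> string ... *)
  assert (map string_of_int [1; 2; 3] = ["1"; "2"; "3"]);
  (* ... and at string -> int, from the same definition. *)
  assert (map String.length ["a"; "bc"] = [1; 2])
```

This is the relationship-between-types point: the signature promises that whatever 'a the list holds, the function argument consumes that same 'a, and the result list holds exactly the f-produced 'b.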
It’s basically OCaml.Net. Defaults to immutability, but mutability is available. Has algebraic data types and pattern matching, but there is also an object system. I enjoy this because I can write code in a functional style, but there’s an imperative safety hatch.
OCaml and F# have slightly different object systems, because the latter was designed for C#/.NET interop. Also, F# doesn’t have OCaml’s “functors,” which are generic/parameterized modules. It does have “type providers,” but I haven’t looked into them very thoroughly.
Overall I’m a big fan of the language. It feels very pythonic to me. The code is readable and inferred static types make for nice error messages.
There’s an OCaml library for TLS now. As for init or core utils I see no reason at all such things would need to be written in C (and indeed a quick search turned up e.g. https://github.com/rxse/lyrica ).
It’s the other way around: There’s no reason for these things to be written in any high level language. Look at LibreSSL again, it proves a point. It’s not the language that is the issue, it’s the people that introduce cruft. Or check out sinit. You don’t need OCaml shoved on top of it.
The truly facile thing is to say “it’s just a tool and if you’re a good programmer there’s no problem”. There really are good and bad tools, and the ability to judge one from the other is what makes a good programmer, far more than outright talent does.
What you intend to say is “use the right tool for the job”, and that’s exactly what I said above. I use Go for fucks sake, because C sucks for many things and Go can really make your life easier.
Is that supposed to be an example of readable/maintainable code? Because it doesn’t look so to me.
If you flex your brains a bit you’ll easily understand what it is about. Each line has a certain weight, and the main loop might take a bit to understand, but it’s not rocket science. It’s way different from the coreutils one, because there, before you even get to study the main loop, you have to think about the side effects of initialize_main(), set_program_name(), bindtextdomain(), textdomain() and figure out how parse_long_options() works. I bet this takes much longer than figuring out how the one main loop in the sbase yes.c works. And if you are so inclined, you can feel free to split the loop up into two; nobody stops you from that. The thing with maintainable code is that some snippets of code don’t even need maintenance. They just work.
This is not a law of nature. I have a friend who’s writing rust tools that make system calls directly, without needing libc; unikernel approaches in OCaml are gaining some traction. Not all the pieces are there yet, but a future where we run unikernels or non-libc binaries on a provably secure hypervisor or microkernel (e.g. seL4) seems distinctly plausible.
Of course you can do that, but 99.9% of high level languages use C standard libraries and I don’t care about people who put together some rusty (get it?) stuff alone in their barn.
“It’s the other way around: There’s no reason for these things to be written in any high level language. Look at LibreSSL again, it proves a point. It’s not the language that is the issue, it’s the people that introduce cruft.”
That’s incorrect: one should use the best tool for the job. The best tool, if there were an objective definition, is probably the one that achieves the programmer’s goals as quickly and correctly as possible with the least maintenance. The OCaml implementation has acceptable performance for most users. The code is more concise, much of it is pure (easing analysis and proofs), and the rest is immune by design to specific errors. A different implementation by INRIA and Microsoft, in F*, was mathematically verified against the spec on top of a lot of testing. That means that if the abstract spec and tiny proof checker are correct, then the system will behave according to its success or error specs in all input situations.
LibreSSL, much as I’ve praised their effort, comes nowhere close to the level of assurance the OCaml people get against code injection, and is 20 ballparks away from the verified one in arguing its correctness. Their own comments in the commit log indicated as much, despite them being excellent C coders. That the OCaml project succeeded in doing so much of TLS clean-slate, with those attributes and immunity to key errors, with a small team in a few years says plenty about the quality of the language and tooling. Smart C developers work much harder to write less software that comes with more qualifiers about safety/security. It does run the app’s code, malicious code, or error code some percentage faster, though. ;)
When I read,
OCaml was initially introduced for its execution speed and ease of refactoring.
I wonder if those characterizations of OCaml were based on their own experience and measurements against a defined target or simply restatement of commonly held beliefs. I’ve certainly read both statements elsewhere, but I rarely read of people or companies running in-house experiments to make such decisions. Google and disk performance and lifespan, yes. Jane Street and OCaml, probably. Here?
I understand your argument, and will agree that some of your points are in fact problematic but I’ll disagree on one point:
best of OCaml
I find that taking an ML like OCaml and ripping out the M part is a pretty big change, and not for the better. Also, personally, ripping out the O part of OCaml is sad, because objects are rather interesting and fit surprisingly well into the language, some syntactic oddities notwithstanding.
It is a common complaint against Rust that it has a steep learning curve. I agree; it took me a while to figure out how to use Rust well enough to know when I’m on the wrong track and the borrow checker is going to give me errors. What I disagree with is that it’s necessarily a bad thing that Rust has a steeper learning curve; I fail to see how we can reach that next step if we try to limit ourselves to constructs that are already known and paradigms that are familiar.
I see many similarities between the difficulties I faced when I started learning Rust and the difficulties I faced 15 years ago when I was learning OCaml for the first time. I had seen Doug Bagley’s original language shootout, and I wanted to learn the weird French language that was faster than C and C++. I did not have a lot of programming experience or background then, only basic scripting with Python, and the type checker would reject most of the code I wrote. This was extremely frustrating, especially since I had written similar code in Python, and it worked without any problems! I gave up on OCaml a couple of times, tried to use it again a few months later, and each time I was able to go a little further. Fast-forward to now, and OCaml is easily one of my favorite languages; I feel confident writing programs in it, and I enjoy taking advantage of the type system to improve the quality of my code.
It appears to me that a lot of people are experiencing the same kind of difficulties with Rust’s borrow checker. They struggle a lot at first, curse even more when simple statements fail to compile, but eventually they internalize the model and learn to work with it rather than against it. Unfortunately, many programmers seem to refuse to feel like beginners again, and they hold Rust responsible for this slight against their ego. However, once the initial difficulties are mastered, Rust is not that different from any other language. In the HN thread, burntsushi mentioned that he regularly writes both Go and Rust code, and he can see no difference in productivity between the two languages; I view his experience as a data point in favor of the hypothesis that once you grok the Rust model, the borrow checker is no longer as important an issue as some people make it out to be. I hope that programmers will have the patience and the humility to get over the initial hurdle.
OCaml is only one of the languages I mentioned; there are other options. In case you are not aware, though, there are other parsing libraries, such as menhir and stream-based parser combinators. I’d also wager it’s easier to make a specialized parser DSL in OCaml than to build a Python implementation. The types alone mean you can get a well-typed AST out of it.
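As a rough illustration of the well-typed-AST point, here is a tiny hand-rolled recursive-descent parser (my own sketch; a real project would reach for menhir or a combinator library) that parses digit runs separated by + into a typed expression tree:

```ocaml
(* The AST type: the parser can only ever hand back values of this
   shape, so consumers pattern-match with exhaustiveness checking. *)
type expr = Num of int | Add of expr * expr

let is_digit = function '0' .. '9' -> true | _ -> false

(* Consume a run of digits from a char list, yielding a Num
   and the unconsumed remainder. *)
let parse_num cs =
  let rec go acc = function
    | c :: rest when is_digit c ->
        go ((acc * 10) + (Char.code c - Char.code '0')) rest
    | rest -> (Num acc, rest)
  in
  go 0 cs

(* Grammar: expr ::= num | num '+' expr  (right-associated, for brevity). *)
let rec parse_expr cs =
  let lhs, rest = parse_num cs in
  match rest with
  | '+' :: rest' ->
      let rhs, rest'' = parse_expr rest' in
      (Add (lhs, rhs), rest'')
  | _ -> (lhs, rest)

let explode s = List.init (String.length s) (String.get s)

let () =
  match parse_expr (explode "1+2") with
  | Add (Num 1, Num 2), [] -> ()
  | _ -> assert false
```

The payoff is that malformed trees are unrepresentable: there is no way for this parser to produce anything but an expr, and the compiler checks every consumer against the type.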
I’m also very surprised that you claim Python is better than something like OCaml for parsing; that has not been my experience at all. I still can’t really imagine that a home-brewed mini-Python is better in all the other aspects that come with a language implementation, such as optimization and debugging.
But it’s your project, so have fun!
This is why I’m not a fan of the idea of adding monads to Rust
What do you mean when you say this? Monads aren’t a language feature, per se; they are a type together with a couple of operations obeying a few laws. You can express monads in a lot of ways even if the language has no concept of them. The futures library I saw for Rust had a bunch of and_then combinators which were basically monadic. OCaml doesn’t “have monads,” but you can define an operator called bind or >>= and implement the rules. It happens that OCaml has a fairly pleasant way to express that, which makes working with them not-terrible, but that has nothing to do with monads. So I can do monads in OCaml, but they aren’t a language feature.
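Spelled out, the whole “monads without language support” point fits in a few lines of OCaml. Here is the option monad as plain definitions (my own sketch): a return, a >>= operator, and nothing else.

```ocaml
(* The option monad: return wraps a value, and >>= (bind) sequences
   computations, short-circuiting on None. No language support needed. *)
let return x = Some x

let ( >>= ) m f =
  match m with
  | None -> None
  | Some x -> f x

(* A computation that can fail. *)
let safe_div x y = if y = 0 then None else Some (x / y)

(* Chained monadically: any None anywhere makes the whole thing None. *)
let result =
  return 100 >>= fun a ->
  safe_div a 5 >>= fun b ->
  safe_div b 2

let () = assert (result = Some 10)
```

Swapping Some/None for Ok/Error gives the result monad the same way; the pattern is identical, which is exactly why it doesn’t need to be a built-in feature.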
I certainly appreciate dreams!
For what it’s worth, a colleague and I looked at OCaml over a weekend recently. It was pretty hard to get started on something that’s relatively simple in at least some other environments; e.g., writing a small client for a HTTPS API that uses JSON formatted messages.
We also had some serious concerns about the way exceptions are handled today in the OCaml compiler. In particular, we would want to abort the process immediately on an unhandled exception (without unwinding the stack), but it doesn’t seem like OCaml keeps enough information in the stack frame to know, a priori, whether a thrown exception will be caught. V8 does track this, enabling us to have the --abort-on-uncaught-exception flag for node.
I think most of Standard ML’s good ideas are in more widespread use in the form of OCaml. Standard ML does have a whole-program optimizing compiler in the form of MLton, but I don’t think there’s an equivalent for OCaml. That being said, I’m not sure OCaml is as performant as C, and it’s also started to gather some cruft as it gets older. It would be interesting to see a “fresh start” ML-style language designed for this day and age.
So, I threw some more effort in, since a dynamically typed ML is useful. I found that there have been a number of works on this sort of thing in CompSci, but for minimalist languages as proofs of concept. It’s mostly been theory. There are two I found in practice related to this question.
Dynaml - Dynamic types for Ocaml (alpha):
https://web.archive.org/web/20090105164828/http://farrand.net/dynaml.shtml
Note: Links to manuals are dead in most versions of it I tried in the Wayback Machine. Proof of concept, though. Used that standard OCaml parsing library (camp-something).
Clean language has dynamic types:
http://clean.cs.ru.nl/Language_features
Note: A competitor to Haskell and OCaml that has an interesting set of features. One of those features is optional dynamic typing. It’s been around a while, too.
An older one I remember is sklogic’s tool that he uses for building compilers and tools for static analysis. He starts with a LISP-based foundation that facilitates his task. Then he implemented Standard ML, Prolog, and a bunch of other things as DSLs. So, depending on which is easier, he’ll switch back and forth between LISP and his DSLs to express the solution in the easiest way for the intended goals. I could see a different combination of LISP and SML for general applications that added dynamic typing.
For your exception issue, I’m not entirely sure I grok what you mean. If an exception makes it all the way up to the program entry point, it will end the program. And whether stack traces are recorded depends on a runtime option; by default they are off. I actually avoid exceptions for the most part, choosing instead to use a result monad almost everywhere, combined with polymorphic variants. I find it significantly less surprising than exceptions.
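For what that result-plus-polymorphic-variants style looks like in practice, here is a small sketch of my own (the parse_port example and error names are illustrative, not from any particular library): each function declares exactly the errors it can produce, and the compiler infers the union for the caller.

```ocaml
(* Errors as values: failures are ordinary data, tagged with
   polymorphic variants, so no exception can escape unnoticed. *)
let parse_port s =
  match int_of_string_opt s with
  | None -> Error (`Not_a_number s)
  | Some n when n < 1 || n > 65535 -> Error (`Out_of_range n)
  | Some n -> Ok n

(* The caller must handle every case; forgetting one is a type error
   (a non-exhaustive match warning), not a runtime surprise. *)
let describe = function
  | Ok p -> Printf.sprintf "port %d" p
  | Error (`Not_a_number s) -> Printf.sprintf "%S is not a number" s
  | Error (`Out_of_range n) -> Printf.sprintf "%d is out of range" n

let () =
  assert (describe (parse_port "8080") = "port 8080");
  assert (describe (parse_port "99999") = "99999 is out of range")
```

Because the variant tags are polymorphic, two functions with different error sets compose without a shared error type being declared up front, which is much of what makes this style pleasant compared to exceptions.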
Yes, in general I’m in favour of “errors as values” as well. I think exceptions make sense for what we would call a “programmer error” (as opposed to an “operational” error), specifically because I don’t want to handle them – because, frankly, you can’t. For a dynamic language like Javascript, these sorts of errors include misspelt property names or type errors that aren’t caught by linting; for a statically compiled and more strongly typed language I’d expect less (or none!) of these at runtime.
I’ll try to explain what I meant by way of example. In Javascript, you catch exceptions by wrapping the potentially throwing code inside a try block. You can’t escape the block, except by executing all the way to the end. Once you enter a try block, you’re in “handled exception” territory, no matter how many nested functions or other try blocks you subsequently enter; they just nest. V8 tracks whether you’re in a try block or not, and if you’re not, we’re able to abort() the process immediately (without unwinding the stack) and get an operating system core file. We can then look at this with the debugger later, as the state up to the panic-style failure is completely preserved. As such, we religiously avoid try, except tightly around code which doesn’t stick to the “errors as values” principle (e.g., JSON.parse()), so that we can get core files for all of the programmer errors that arise at runtime.
My understanding of the current structure of the OCaml compiler is that it does not track (in the generated program text) whether you’re in the OCaml equivalent of a try block. Instead, when you raise an exception, it unwinds the stack one frame at a time, checking to see if the exception will be handled in that frame. To get full stack traces emitted to stderr, as you mention, you have to turn on the generation of the entire stack trace string into a global buffer for every exception just in case it turns out to be unhandled. We would rather get a core file at the point of the failure (without unwinding) iff it’s to be unhandled. It’s possible that there’s some property of the OCaml language that means you can’t know a priori if an exception will be caught or not – if so, please forgive my ignorance.
I think it might be visually clearer to weight these arrows by how likely they are, or some such. Even if I agree with the reasoning of the first two choices, presenting them that way makes it look like 75% of projects need to be in C, whereas IME embedded or realtime projects are a tiny minority.
The top part of the chart is outside my expertise, but there are other options on phone platforms, and for large codebases on the web, js_of_ocaml, ScalaJS, Elm etc. are options. For embedded projects I might consider Rust, and for realtime projects Rust or Erlang.
“Care about GC pauses” phrases the question wrongly. If you have a hard latency requirement then C++ may make sense, but the cost in terms of development time and defect rate is very high (and again, Rust or Erlang might be better options in that case, though it’s not my area of expertise).
“Need to integrate with…” could be a question for any platform, and is probably the first question of all. (Though on the JVM I’d use Scala rather than Java, and similarly on the CLR I’d likely use F#).
Rather than “what is your constraint?”, I’d make that “do you have a performance constraint that’s worth paying a substantial cost in development time for?” And I’d probably subdivide that, because there’s a big difference between being 2x slower than (hand-tuned) C (Java, OCaml, …) and being 50x slower than C (Python, Ruby, …) - so maybe the options are “no performance constraint”, “within a factor of 2”, and “absolute maximum performance”. On the “no performance constraint” path I’d want a “large codebase?”[1] decision, with a typed language on the yes branch (Scala for me, though OCaml, Haskell or maybe F# are defensible choices - maybe I’d add another question of “is this an interactive tool that needs to start quickly”, in which case OCaml, since while Scala’s throughput is great its startup latency can be poor) and Python or Ruby on the no. “Within a factor of 2” would go to the same place as the “large codebase” options (i.e. Scala for me). For “absolute maximum performance” I’d probably go with Fortran.
[1] Edited to add: actually, probably a “large codebase or low defect rate” decision - there’s an equivalence in that the larger the codebase the lower the defect rates for individual components have to be. The correct measure is probably something like required defect rate / complexity of requirements^2, because a large codebase not only means more defects, but also means the impact of each defect is larger. And potentially this needs to have a branch for even lower defect rates / larger codebases, where I’d use Idris.