The whole ‘I have unit tests to cover me for typing and refactoring’ was always a bad excuse. It was for Ruby, where I think it started to be a meme, and it is for Python.
It was strongly believed and advocated by many at the time. It’s actually shocking to see such a large majority now on the side of strong typing. (I was always on the strong typing side, so I’m not shocked that strong typing is so powerful … I’m shocked at how large a percentage of developers have come to that conclusion).
Apparently, this won’t be like the spaces/tabs and emacs/vi(m) religious wars … we might reach mainstream consensus on this one.
Great article. How would you say this compares with TLA/TLA+ ? Are you basically walking through the steps from first principles that lead to something like TLA?
I don’t have the confidence to create a logic like TLA+, but I’ve been hugely impacted by learning it. My actual goal is to have a generative testing workflow that I can use at work and recommend to other people.
One big issue with formal methods is that it is very hard to scale to the implementation level. Leslie Lamport even says this all the time - paraphrasing, but something like: it would be very rare to prove a refinement to the implementation level in practice. seL4 did do this, but on a ~8K LOC codebase, and it took them like 3 years and multiple new PhDs. My codebase at work is ~500K LOC.
I think a happy medium is “lightweight formal methods,” where we can borrow ideas from amazing tools like TLA+, but apply them to executable code via generative testing. Amazon has done this on a component in S3 for example. I think there’s an opportunity to “connect” code like this to a TLA+ spec too, similar to how Cogent code can be embedded in Isabelle/HOL for proof.
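A minimal sketch of what that kind of generative, model-based test can look like. Everything here is hypothetical: `LogStore` is a toy stand-in for the real implementation, a plain dict plays the role of the spec-like model, and `check_against_model` is an illustrative name. It only uses the standard library, so it is not a substitute for a real property-testing tool.

```python
import random


class LogStore:
    """Toy implementation under test (hypothetical): an append-only log plus an index."""

    def __init__(self):
        self._log = []    # list of (key, value) records; value None marks a tombstone
        self._index = {}  # key -> position of its latest live record in the log

    def put(self, key, value):
        self._log.append((key, value))
        self._index[key] = len(self._log) - 1

    def delete(self, key):
        if key in self._index:
            self._log.append((key, None))
            del self._index[key]

    def get(self, key):
        pos = self._index.get(key)
        return None if pos is None else self._log[pos][1]

    def compact(self):
        # Rewrite the log so it holds only live records, then rebuild the index.
        live = [(key, self._log[pos][1]) for key, pos in self._index.items()]
        self._log = live
        self._index = {key: i for i, (key, _) in enumerate(live)}


def check_against_model(seed, steps=200):
    """Run one random operation sequence against both the store and a dict model."""
    rng = random.Random(seed)  # seeded, so any failure can be replayed
    impl, model = LogStore(), {}
    keys = "abc"               # a tiny key space makes collisions likely
    for step in range(steps):
        op = rng.choice(["put", "delete", "compact"])
        key = rng.choice(keys)
        if op == "put":
            value = rng.randrange(100)
            impl.put(key, value)
            model[key] = value
        elif op == "delete":
            impl.delete(key)
            model.pop(key, None)
        else:
            impl.compact()     # internal housekeeping: the model does not change
        # Property: after every step, each key reads the same from both.
        for k in keys:
            assert impl.get(k) == model.get(k), (seed, step, op, k)
    return True
```

The model stays trivially simple on purpose: operations such as `compact` exist only in the implementation, and the property says they must never change what a reader observes.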
I agree with parts of the article, but here’s one thing it overstates: it says the first way of using C# await is almost never correct, and implies by contrast that one’s first stab at using goroutines usually is correct. That’s not true. There are a lot of surprises around goroutines and channels. They’re nice tools for the job, but they don’t just make it super simple.
I think it would be interesting to explore which is easier to iterate on. It seems true that the first stab at any modality is usually wrong (with some wiggle room) but I bet there’s more leverage on iteration.
As a meta-question, how do you select what to read? I tried following the fire hose that is the Arxiv CS web feed for a while, but the titles were all either hyper-specialized to the point where I knew none of the nouns, or so general that they sounded more like a manifesto than research.
I used to read “the morning paper” which had a CS paper a day, but that unfortunately stopped. The archives are still up though: https://blog.acolyer.org/archives/
Semantic Scholar has a recommendation engine that will give you suggestions for new papers related to your interests. Aside from that, I skim conference proceedings for titles that seem interesting.
Honestly, I just relentlessly google phrases that come to mind, until I find projects / papers / books that I elevate to my personal canon. Once you find major cornerstones like that, you can get pretty far by reading everything related to them and everything that they reference.
For example, the two biggest cornerstones I’ve found in the last couple of years are TLA+ and the seL4 project. Luckily, both of these have dozens of papers related to them, and each paper references their previous work as well as related work in the field.
Seriously - try putting into words what you’re looking for, even if it’s very high level. I was googling really vague things like “software correctness,” and even that will get you going. The trick is figuring out exactly what it is you’re interested in, and putting that into words.
No, I don’t think it would. By finding things “the hard way,” I’ve learned many things and connected many brain pathways that would never be reproduced by just getting a list of answers to your questions right away.
A couple of weeks ago I experimented with asking ChatGPT for fiction book recommendations. When I asked for more like one book, it told me that was the first in a trilogy. The trilogy proved to be non-existent. The titles were real, but by other authors.
I’ve seen mention before about it confidently making up plausible sounding references to fictitious papers. So do be careful.
This is how I do it: start with a paper, then find:
Note, sometimes the PDFs are behind paywalls (unless you’re at a uni that pays for licenses). There are ways to get the paper, like from the author’s website (or libgen!)
To add to the other suggestions by the sibling replies, try seeing if there are any State-of-the-Art Report (STAR) papers on your topic, or other good literature reviews. Those can often give you a nice starting place with an overview of the topic and a ton of links to other worthwhile papers to read.
This is an outstanding post. One of the best posts I’ve read this year. I’m looking forward to diving into other content from the same author/site.
Since my last comment I’ve already found several of your posts interesting. I’m interested in any techniques that give us more leverage in terms of specification / testing / correctness. I’m interested in approachable introductions / tutorials / examples of model based testing and formal models.
This looks fantastic. I just added the backlinks Plug, nice! I wish everything (even Back) was a command. I also wish for a cheat sheet on hotkeys.
I agree with the general sentiments, and I prefer to program in a dynamic, pure functional programming language.
Although you can do some functional programming in Python, not all functional programming idioms are supported, efficient or convenient, and the library ecosystem is mostly imperative rather than functional. I would rather use a language that is designed from the ground up to support pure functional programming, with features like proper tail calls, immutable data structures with efficient update, and the ability to do everything without using shared mutable state.
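As a small illustration of the "proper tail calls" point, here is a sketch (function names are hypothetical) showing that CPython does not eliminate tail calls: a tail-recursive function still consumes one stack frame per call and hits the interpreter's recursion limit, so the loop form is what you end up writing in practice.

```python
import sys


def sum_to_recursive(n, acc=0):
    # The recursive call is in tail position: nothing remains to be done after
    # it returns. CPython still pushes a new frame for every call.
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n):
    # The same computation as an explicit loop, which uses constant stack space.
    acc = 0
    while n != 0:
        acc += n
        n -= 1
    return acc


def tail_call_overflows():
    # Recurse slightly deeper than the interpreter allows and report whether
    # that raised RecursionError.
    depth = sys.getrecursionlimit() + 100
    try:
        sum_to_recursive(depth)
    except RecursionError:
        return True
    return False
```

In a language with guaranteed tail-call elimination the recursive version would run in constant stack space, which is the property the comment above is asking for.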
That’s a good question. Any suggestions? There seems to be a shortage of the kind of language I want to use. Highly expressive and dynamic, in the Lisp/Smalltalk sense, yet pure functional, with no shared mutable state, and simple, in the sense of minimizing cognitive load and enabling you to use local reasoning to understand code.
I created my own language called Curv a while back, which is used for creating 2D and 3D art. I’m working on a new one which will be more general purpose.
Clojure seems to be a great fit for your criteria. The only thing that keeps it from being close enough to perfect for me is that I want strong typing.
I’m working on something like this. Would be interesting to hear what you’re doing too. Probably will release what I have early next year. I’ll check out Curv.
I would be interested in talking to you about your project. My email address is in the Curv readme. If you have anything online that describes your language, post a link or send it to me.
I’m less than half way through but have already had several OHHHH YEAH moments. Then he wired up that depth slider. That instantly reminded me of Bret Victor.
This tool looks like something I’ve wished for for a while now.
Interesting. It seems to me a good way to make this less common might be to write a tool that adds comments like
// captures: a, b, c before any closures it analyzes. Of course, it has to be kept in sync, but it would make it clear when you’re accidentally including something you don’t mean to capture.
On a similar note, I’ve often wondered how Rust would look if it made all moves explicit. I think they might have experimented with that back in the early days.
It seems we need a “Rust: The Good Parts” book now :P Community consensus on how to write properly without using too much. Basically PEP 8 for Rust.
Lol, “Effective C++” by Scott Meyers used to be the poster child – the “morality guide” for a language. Telling you which parts to use and which not to.
Heck, even “The Good Parts” is a pretty old historical throwback now.
(Edit: removed about 90% of my original post b/c it was just whiny. Trying to leave the more focused comments.)
I don’t personally feel like the problem with Haskell is the lack of documentation, even as high-quality book-length material. I tend to learn programming languages – and I’ve become at least somewhat proficient with at least 8-10 over my career, including other FP and FP-friendly ones – by reading code.
Reading Haskell code makes me feel dumb. The use of custom operators, language extensions, and ever-changing algebras and abstractions – remember zippers? monad transformers? oh sorry, it’s free monads and optics all the way now – means I can’t jump in and have any clue what’s actually happening in most real-world Haskell code.
It’s probably just that I’m a little too slow and a lot too impatient, but: after ~20 years of coming back to Haskell every year or two to see if I can finally get up to speed I’m more or less resigned to just never reaching that particular summit.
Perhaps one of the books on the OP’s list would fix that. I dunno.
Professional Haskeller here:
With respect to the ever-changing abstractions comments, there’s always going to be new shiny things in blog posts. Most real applications don’t use these things and stick to the basics. “Simple Haskell” has mostly taken root in the haskell-for-business world. The basics have absolutely stuck around and stood the test of time with a few of those shiny new things making it through slowly.
It would be awesome to see that subset of Haskell documented. It would be subjective so there would be different takes, and that’s ok. “Here’s a subset of haskell we’ve chosen for business app development and why we like it.”
It is documented in books like ‘Haskell in Depth’ and ‘Production Haskell’. These books have chapters diving deep on solving actual problems.
Interesting, I also think basic Haskell is what we should stick to, and it’s also not such a complicated language. Basically, it’s just lambda calculus with lazy evaluation and abstract data types.
It’s a bit like the case of Lisp, where the core language is simple, but there are tons of abstractions because the core language constructs make it easy to build them, and that has given Lisp a reputation of being hard to learn and maintain.
May I ask what Haskell subset and what language extensions you use at work?
It’s a lot to summarize. We have a whole style guide.
I’ll maybe summarize with some libraries to give you a feel:
I’d be happy to answer any other specific questions.
We have unary bitwise negation in Go, it’s ^x. Also, the linked commit is about using ~ in type sets (for generics); it has nothing to do with bitwise operations. Also, the blog post is about contributing to Go, but no such contribution has been made; it’s just clickbait.
This seems brutally negative. Dude made a mistake in missing ^ but the article is clear and thorough.
I don’t think that’s even a mistake. I wrote about adding a
I love these kinds of articles!
Edit: oh the linked commit is wrong, that was probably the mistake you were referring to.
It’s not clickbait at all. It’s a worked example of how to make a language change from proposal to completion. The author makes no pretense that they have submitted or even intend to submit the change.
It’s definitely not to completion, since no effort at submission has even been made, and yet the post title certainly implies that. But more importantly, the author solves a problem that doesn’t exist because he doesn’t fully grasp the Go language which is the worst possible thing to do if you are trying to contribute to the Go language. And btw, if you want to contribute to the Go language, the first thing to do is not just to write, but more importantly to discuss a proposal, which is exactly what was not done in this case. The author jumped to implementation, and wrote his “proposal” like filing some kind of bureaucratic form. That’s not the point of the proposal process.
“Clickbait” is a judgment laden term. It is definitely a poorly thought through article. Whether the poor thinking is because the author just wanted “clicks” or because the author was careless, I don’t know.
If you just want to show “here’s how to add an operator to a language” you can write that article, and that is almost what this article is. But the article purports to be “here’s how to contribute to Go”, which this definitely is not, because he opened an issue proposing a redundant operator for Go, which wasted the Go team’s time triaging a non-issue. Opening a bogus issue to illustrate your blog post is uncool behavior, “clickbait” or not.
If you’re going to use (much more expensive) ref-counted heap objects instead of direct references, you might as well be using Swift. (Or Nim, or one of the other new-ish native-compiling languages.) Rust’s competitive advantage is the borrow checker and the way it lets you use direct pointers safely.
The author should at least have pointed out that there’s a significant runtime overhead to using their recommended technique.
No, this is a fatalistic take, almost like “if you use dyn you may as well use Python”.
Swift’s refcounting is always atomic, but Rust can also use the faster non-atomic Rc. Swift has a few local cases where it can omit redundant refcounts, but Rust can borrow Rc’s content and avoid all refcounts within a scope, even if the object’s usage is complex, and that’s a guarantee not dependent on a Sufficiently Smart Compiler.
Swift doesn’t mind doing implicit heap allocations, and all class instances are heap-allocated. Rust doesn’t allocate implicitly and can keep more things on the stack. Swift uses dynamic dispatch quite often, even in basic data structures like strings. In Rust direct inlineable code is the norm, and monomorphisation is a guarantee, even across libraries.
So there’s still a lot more to Rust, even if you need to use Arc in a few places.
Uhu. It seems to me that there are two schools of thought here.
RefCell to make life easier.
The other says: the zen of Rust is ownership: if you express a problem as a tree with clear ownership semantics, then the architecture of your entire application becomes radically simpler. Not every problem has clean ownership mapping, but most problems do, even if it might not be obvious from the start.
I don’t know what approach is better for learning Rust. For writing large-scale production apps, I rather strongly feel that the second one is superior. Arcs and Mutexes make the code significantly harder to understand. The last example, a struct where every field is an Arc, is a code smell to me: I always try to push arcs outwards in such cases, and have an Arc of struct rather than a struct of arcs.
It’s not that every Arc and mutex is a code smell: on the contrary, there’s usually a couple of Arcs and Mutexes at the top level which are the linch-pin of the whole architecture. Like, the whole rust-analyzer is basically an Arc<RwLock<GlobalState>> plus cancellation. But just throwing arcs and interior mutability everywhere makes it harder to note these central pieces of state management.
I’ve always felt that the order of preference for new code is:
(some people may choose to wedge in “make it correct” somewhere there, but I think that’s either mostly a pipe dream or already part of 1.)
That would mean that you always use the easiest possible techniques in phases 1 and 2 and in phase 3 do something more clever but only if the easy techniques turned out to be a bottleneck.
I’m guessing the easiest technique in Rust terms would be copying a lot.
I tend to agree about that ordering, but I’ve also found that heap allocation and copying is frequently a bottleneck, so much so that I keep it in mind even in steps 1-2. (Of course this applies to pretty low-level performance sensitive code, but that’s the kind of domain Rust gets used for.)
I completely agree. If you don’t need precise control over memory, including the ability to pass around refs to memory safely, then the sane choice is to use a well-designed garbage collected language.
Maybe you’re building something where half needs to control memory and the other half doesn’t. I guess something like this could make sense then.
Swift isn’t exactly “available” on many Linux distributions due to its overengineered build system. The same goes for dotnet. Both of these languages are extremely fickle and run many versions behind the latest stable release offered on the natively supported OS (macOS for Swift and Windows for dotnet).
To build Rust is comparatively sane and a breath of fresh air.
Well, there is OCaml of course. On Linux with a reasonable machine, compiling it from scratch with a C toolchain should take just a handful of minutes. Of course, setting up the OCaml platform tools like opam, dune, etc., will take a few minutes more.
Outstanding article. I went down the OOP road and got good at it. It declares one claim you must take on faith: that if you insist that “everything is an object”, it’s possible to find a nice set of objects to implement your system, and that’s the best way to do things.
Instead, what it does is tickles a developer’s brain in a super enjoyable way that feels fulfilling … in other words mental masturbation.
What I found is you often can find an elegant solution that consists of objects, and it will feel like a profoundly beautiful thing when you do. It will feel so right that it convinces many that, “this is the way.”
But in reality, it’s over-complicated. Your brain has a blast, but it was more about pleasuring your brain by making you work under a challenging limitation. Turns out you could drop that single dogmatic principle and wind up with a simpler solution faster, still elegant and maintainable (which is of course more fulfilling).
The one claim you had to take on faith, turns out, is exactly where the whole thing breaks down. And the incorrect illusion of “rightness” it gives to the puzzle solver’s brain is the engine that keeps it rolling.
Instead, what it does is tickles a developer’s brain in a super enjoyable way that feels fulfilling … in other words mental masturbation.
I very much agree. I do see “every little thing is an encapsulated object that receives messages” as an artificial constraint which acts like an additional challenge akin to ones used in many games. “Complete game without killing anyone” in an RPG game and get a “pacifist” achievement, etc.
Instead of actually solving the problem and working on solving constraints that are actually important, we tend to get stuck on puzzle solving that makes us feel smart. It’s not limited to OOP. The same motive shows up when people agonize over expressing some universal taxonomy, overuse generics to provide a “super clean universal abstraction”, or (in Rust) solve a borrow checker puzzle involving 3 lifetimes, just to shave off one data copy in some cold path.
Very true, and a good pattern to be aware of (in yourself and others).
That said, there is an upside to adding additional constraints: constraints can make things easier to reason about. The most obvious, famous example that comes to mind is maintaining referential transparency in functional programming, and isolating your mutations. Or, say, in React, the way you have actions that update a data model, and then a unidirectional flow of that updated model down into the re-rendered view. In these cases, too, (especially at first) it can be a “puzzle” to figure out how to restructure your problem so that it fits into the paradigm. But in both of these cases, at least imo, there is a huge net benefit.
Which is to say that “artificial constraints” per se aren’t the problem. The problem is artificial constraints that are time-consuming but don’t pay you back.
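To make the "actions update a model, the model flows one way into the view" constraint concrete, here is a tiny sketch. The names (`Model`, `update`, `view`, `run`) and the string actions are hypothetical, and this is plain Python rather than React's actual API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Model:
    """Immutable application state; the only thing the view may read."""
    count: int = 0


def update(model, action):
    # Pure function: same model and action in, same new model out.
    if action == "increment":
        return replace(model, count=model.count + 1)
    if action == "reset":
        return replace(model, count=0)
    return model  # unknown actions leave the state untouched


def view(model):
    # Pure rendering: the output depends only on the model.
    return f"Count: {model.count}"


def run(actions):
    # The one place where state changes over time: a loop that feeds each
    # action through update() and re-renders from the resulting model.
    model = Model()
    frames = [view(model)]
    for action in actions:
        model = update(model, action)
        frames.append(view(model))
    return frames
```

The payoff of the constraint is that `update` and `view` can each be understood and tested in isolation, since neither can reach state the other owns.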
Which is to say that “artificial constraints” per se aren’t the problem. The problem is artificial constraints that are time-consuming but don’t pay you back.
I agree, except for semantics… if the constraint has an actual payoff (like “use only pure functions” does, for example), then it is no longer “arbitrary” or “artificial”.
“Everything is an object” never had a real payoff… it was just dogmatic koolaid.
I am by no means an OOP-first or an OOP-only and spent years being anti-OOP (but without knowing what OOP was or knowing what I preferred instead) – if I had to be pressed to “pick a paradigm” I would say I’m a functional programmer.
However, I have seen (and written myself and tried to come back later) enough “big procedures of doom” to know that is not the way. Or at least not a way I want to have to touch unless for a lot of money. Of course “good procedural” exists just like “good OOP” does but really we should focus on what we (often) have in common, which is that a 50 line procedure with everything inline is (often) an unreadable mess two days later and (more often) gets worse with every edit and so it needs to be modeled somehow (and no, procedures named do_thing, do_second_thing don’t count). You can use polymorphism and call it OOP or you can use closures and call it functional or you can use modules with glue and call it procedural or what have you – the paradigm and the name doesn’t matter, only the result.
Rust sounds like a good match (but I doubt that’s your choice because I think you tried it already). But see the recent post here about Rust being as productive as Kotlin.
My money is on F#. He gets everything he likes from OCaml as well as proper multicore support. Plus he has the entire .NET ecosystem to fall back on when it comes to libraries, documentation etc.
Looks like you were right:
It sounds like Rust would be an awful choice, because most of his complaints are true for Rust as well, especially where the ecosystem is concerned.
Rust has a much bigger ecosystem especially in terms of “cloud” and web stuff. You won’t have the “there’s no google cloud client library and the postgres client is incomplete” problem in Rust.
Rust is approaching 50,000 crates (that’s 1/5th of nuget, and more than CPAN). It’s not as big as mainstream package repositories, but it’s definitely past being a niche language.
Rust doesn’t really have a good official google cloud sdk. I think the safest choice is Go if your aim is interfacing with google products.
That being said, I wish someone made the equivalent of bucklescript/reasonml that compiles to go so you can just use ocaml with go’s extensive set of libraries.
That’s one of the reasons I don’t like Go—its ecosystem is a walled garden.
OCaml libraries are usable from other languages, you can write a shared library in it. Whether to link everything statically is a choice. With Go you don’t have that choice, if you write a library, no one but other Go users can ever benefit from it.
Regardless of whether you currently think your existing tools need replacing, I urge you to try ripgrep if you haven’t already. Its speed is just amazing.
I’ll second this sentiment. Your favored editor or IDE probably has a plugin to use ripgrep and you should consider trying that too.
As an experiment I wrote a tiny Go webservice that uses ripgrep to provide a regex aware global code search for the company I work at. The experiment worked so well over a code base of ~30GB that it will probably replace hound which we use for this purpose at the moment. I did not even use any form of caching for this web service, so there is still performance to squeeze out.
https://github.com/phiresky/ripgrep-all comes with caching, it’s a wrapper around rg to search in PDFs, E-Books, Office documents, zip, tar.gz, etc
ripgrep and fd have changed the way I use computers. I’m no longer so careful about putting every file in its right place and having deep, but mostly empty, directory structures. Instead, I just use these tools to find the content I need, and because they’re so fast, I usually have the result in front of me in less than a second.
You should look into broot as well (aside, it’s also a Rust application). I do the same as you and tend to rotate between using
broot. Since they provide different experiences for the same goal sometimes one comes more naturally than the other.
3 or 4 years ago it was announced that the VS Code “find in files” feature would be powered by ripgrep. Anyone know if that’s still the case?
I think what this comes down to is that there isn’t a great language for building languages :-/
OCaml is supposed to be that language, and I even wrote in ~2015 on my website that I would write all future language projects in OCaml. Yet I didn’t go down that path for Oil, and I don’t regret it.
Instead I came up with this somewhat weird Python + ASDL + multiple code generator combo, but it has worked out well, and compiles to nice C++.
It’s both low level and high level at the same time. Languages need both because they’re extremely repetitive tree matching on the one hand, and very performance sensitive on the other (which I learned the hard way).
A few more threads about metalanguages, with extensive comments on using Rust to implement languages:
This is the programmer’s version of Guy Steele’s computer science complaint: https://www.youtube.com/watch?v=7HKbjYqqPPQ
That is, that the language used for describing languages is horribly imprecise and inconsistent.
I guess it is supposed to be Racket … and yet often it doesn’t seem to play out that way.
As an outside observer, I wonder why that is.
Yes exactly, Racket is supposed to be a language for languages. A bunch of reasons that I see:
I wrote Oil in Python and it was way too slow. Racket would have been also too slow, and even with the new runtime, I doubt it would be fast for parsing shell. JITs tend to speed up numeric workloads more than string workloads. String workloads are dominated by allocations and the JIT may not see through those.
Algebraic data types. I’m not up to date on what Racket offers here, but all the Lispy mechanisms I’ve seen fall somewhat short of the real thing. Algebraic data types really help when dealing with languages. They affect 50-80% of the lines of code. Compilers and interpreters are full of conditionals, and it’s a huge help to encode those as data rather than code you have to step through.
Static types help too. Oil started out dynamically typed and is now statically typed with MyPy.
Syntax. Racket does support syntax unlike other lisps, but I think it’s not culturally there, and the support for parsing is relatively weak. For example, the C ecosystem has re2c, which I used in Oil, etc. The Python ecosystem has a number of options for parsing as well.
Runtime dependencies (for interpreters). As far as I recall, the Shill shell (a research project) was written with Racket. But then they switched to something else because the runtime got in the way.
Build time dependencies (for compilers). Compilers are often bootstrapped in a unique way – e.g. Go compilers in Go, Rust in Rust, Zig in Zig, Clang in C++, etc. Compiler writers want to use their own language, and not someone else’s.
So in all of those areas, OCaml and Rust beat Racket IMO. And on top of that, I agree with the downsides about both OCaml and Rust. (Not that my own solution doesn’t have a lot of downsides. The important thing is that they’re all fixable because the code is small and under my control!)
I think Racket is probably very nice for prototyping languages. I’m not sure it’s good for production quality implementations, at least when you’re talking about competitors to CPython, LLVM, rustc, Go, Julia, etc.
Julia was bootstrapped in femtolisp though, which is interesting. I hacked on it at the beginning of Oil, but decided I didn’t like that style.
From my perspective, as someone who’s spent the past 5 years doing more pure functional programming (Elm and Haskell) professionally than anything else, and who’s been working on a compiler written in Rust outside work…I would gladly describe Rust as “great for building languages.”
The learning curve was substantial, but now that I’m comfortable with Rust, I don’t think anyone could sell me on OCaml or Haskell (or any other language I’ve heard of) as a better choice for building a compiler.
Granted, that’s in large part because execution speed is extremely important to me. I have very high standards for how fast I think a compiler should run!
I have very high standards for how fast I think a compiler should run!
Kind of ironic, since you’re using Rust. (I kid, I kid!)
Purely opinion, based on looking at some code like
(linked in the above threads)
It looks more concise than C++, but a lot more wordy than OCaml (or Elm). I’d be interested in seeing other idiomatic Rust code that implements languages (compilers, interpreters, runtimes).
I’m the author of this code. :)
I did experiment with using OCaml to start with, and yes, the Rust version is a lot more verbose, especially when using shared mutable references (which are very concise in OCaml). Despite this I prefer the Rust experience, and I think it’s a more generally useful language to know.
Yes I think we talked on the original Reddit thread which I quoted here
https://lobste.rs/s/vmkv3r/first_thoughts_on_rust_vs_ocaml#c_v5ch1q (as well as your article)
I think you said you prefer Rust there. That is a useful data point for sure. I’m not experienced with Rust but I can definitely see why it’s appealing for languages.
I actually think that “procedural” / stateful code with algebraic data types is a pretty good combo. That’s basically how my Oil project is written, with statically typed Python + ASDL.
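A rough sketch of what that combination can look like in type-annotated Python. Frozen dataclasses stand in for the ASDL-generated node types here, which is an assumption made for illustration; the node names and `evaluate` are hypothetical and not taken from Oil.

```python
from dataclasses import dataclass
from typing import Union


# A small "sum type": an expression is exactly one of these variants.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Neg:
    operand: "Expr"


Expr = Union[Num, Add, Neg]


def evaluate(expr: Expr) -> int:
    # Plain procedural dispatch on the variant. A static checker can narrow
    # the type of expr inside each isinstance branch.
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Neg):
        return -evaluate(expr.operand)
    raise TypeError(f"not an expression: {expr!r}")
```

The conditionals of the interpreter follow the shape of the data, so adding a new variant shows up as one new class and one new branch.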
I heard someone say that Rust shows you can “just mutate”. In other words, some languages like Clojure and Erlang rely on immutability for concurrency. But Rust relies on its type system for that, so you are safe with mutation.
And I just read someone say something similar here:
Others sometimes don’t like to hear this, but IMO, Rust is not at all functional. … Rust is very much procedural
Although not everyone agrees.
I have come around to this viewpoint. I used to program in more of an immutable style, but now I favor local mutation and “principled” global mutation. It seems to work well for languages, e.g. lexing and parsing are inherently stateful. And that style does seem to be in line with Rust’s design.
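As an illustration of "local mutation" in a lexer, here is a minimal sketch for a toy arithmetic syntax. The `Token` type, the token kinds, and `lex` are all hypothetical. The cursor and the output list are mutated freely inside the function, but nothing mutable escapes except the returned list of immutable tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str  # one of "num", "op", "lparen", "rparen"
    text: str


def lex(source):
    tokens = []  # local, mutable accumulator
    pos = 0      # local, mutable cursor into the source string
    while pos < len(source):
        ch = source[pos]
        if ch.isspace():
            pos += 1
        elif ch.isdigit():
            start = pos
            while pos < len(source) and source[pos].isdigit():
                pos += 1
            tokens.append(Token("num", source[start:pos]))
        elif ch in "+-*/":
            tokens.append(Token("op", ch))
            pos += 1
        elif ch == "(":
            tokens.append(Token("lparen", ch))
            pos += 1
        elif ch == ")":
            tokens.append(Token("rparen", ch))
            pos += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    return tokens
```

From the caller's point of view `lex` behaves like a pure function from a string to tokens, even though its body is written in a stateful style.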
Fair enough, and I agree.
But if you like you can work around the problem by telling your browser not to accept cookies from medium.
I opened this page both in Firefox for macOS and Firefox for Android, and both browsers showed me the full article without me having to sign in. I was able to view the article despite not even having a Medium account.
Maybe your problem is related to this message I see at the top of the page:
You have 2 free stories left this month. Sign up and get an extra one for free.
Try opening the page in a Private Browsing window?
Seems to me that modeling git internals is a really clever pedagogical choice here! Appreciate the post. Look forward to part 2.