1. 35
  1.  

  2. 8

    Treat threads like stack variables. That’s it. That’s the tweet.

    Don’t let them outlive the function they’re spawned from, and everything gets a whole lot easier to reason about. The article links oklog/run which is just an expression of the idea. It really doesn’t make things as difficult as you think, in practice it mostly just means moving the mechanics of concurrency out of your components and giving that (now optional) capability to their consumers. Instead of asynchronous Start and Stop methods, you model your thing as a synchronous Run method. Tada.

    1. 1

      That is a structured concurrency convention. It happens to be one which is simple to explain & simple to audit for. :)

    2. 5

      I’m glad to see this on the front page of lobste.rs! It’s one of my favorite programming blog posts of all time, along with its pseudo-companion, 1.

      Structured concurrency will be, I think, amazing, but I have had an idea recently that I think would make it even better.

      Structured concurrency, as currently defined, turns execution into a tree of threads. But what if we could turn it into a DAG?

      It would have these restrictions:

      1. To prevent potential memory cycles (and thus, needing a garbage collector or reference counting), forbid all types from being able to, directly or indirectly, have pointers to themselves.
      2. Some form of RAII with automatic calling of destructors, like C++ or Rust’s Drop trait.
      3. Have structured concurrency as it currently stands.
      4. But add the ability for threads to ask another thread to run code for them.

      Number 4 is the magic. What happens is that, when a thread creates data (the “create thread”) that other threads might want to use, it then creates a child thread to run the code it would have run on the data and waits for requests from other threads. Then, when other threads put in requests for the data, instead of getting a pointer to that data and continuing on their merry way, requiring reference counting for the data, each thread sends a function to the create thread, from which the create thread spawns another child thread that has access to the data it created, which would give every thread indirect access to the data that it wanted.

      Then, every thread that requested the data is paused, waiting, while the execution of the thread it requested runs.

      The reason the create thread spawns a child thread instead of just running code on the data itself is two-fold:

      1. It allows it to respond to requests, and
      2. It prevents the situation where an already existing child of the create thread requests the data and then the child is dependent on its parent to exit first. If the child of the create thread is dependent on a sibling to exit first, all is well.

      In other words, creating the child thread first is what keeps the execution a DAG.

      Number 1 (at top) is important to get rid of all need of any kind of GC, including borrow checking!

      Some may claim that it’s impossible to eliminate cycles, but you can always create a “parent” type that points to all of the child types that could be in a cycle.

      For example, to create a linked list, create a parent type that stores an array with data, and alongside the data, it stores handles that it knows how to interpret as the prev and next pointers. The children don’t know how to interpret those handles, but the parent does, and it can do everything you need it to do in order to emulate a linked list.

      A programming language that provides these facilities should also have reference counting, like Rust does despite its borrow checker, because sometimes, more complicated things are needed. However, if a programmer decides to program with these restrictions, they would have these benefits:

      1. No memory leaks.
      2. No use-after-free.
      3. No memory safety issues.
      4. No borrow checker to explode compilation times.
      5. Memory allocation that could be entirely done with bump allocators, except for data that may need to grow, such as resizable arrays or maps. Even dynamically-allocated stuff that knows what size it needs, but doesn’t need to change its size after allocation, could use the bump allocator.

      While there are some details (a compiler would need to figure out when an item escapes a function), I think these ideas could be pretty useful.

      As for escape analysis, we could make it easy and say that the compiler should just error if a value escapes. I personally would prefer something more sophisticated, where we ensure that a value does not escape to global variables. If a value escapes by being returned, the compiler should just treat that as though the caller has responsibility for it, and it already knows what to do in those cases (nothing in the callee, and call the destructor in the caller, unless it escapes, etc.).

      However, I could be naive; this could be a fantasy. So I guess I am asking all of you: could it work?

      1. 4

        In E and Monte, vats form a tree, while promises form a DAG under resolution. This achieves the balance which you’re imagining. The tradeoff is that many low-level details are lost, which is not what everybody wants.

        A vat is an actor which manages a queue of incoming actions, called “messages”, which are delivered to objects; each delivery is a “turn” of a vat. An object is only referred to by exactly one vat at a time. This is similar to the discipline of nurseries, except that sometimes a vat might be “immortal” and continue to exist even when there are no pending messages to deliver; for example, the Monte REPL’s vat is immortal:

        ⛰  currentVat
        Result: <vat(pa, immortal, 0 turns pending)>
        

        A vat can create or “sprout” many children vats, forming a tree. This is analogous to a tree of nurseries and has similar behavior.

        When we enqueue pending actions upon a vat, we receive a promise for the return value. Promises are references to objects, but they are also first-class objects which can be stored in closures. This means that promises can refer to object graphs. And there is a builtin trick which uses promises to construct object cycles. However, the resolution of promises (looking up the referent object of a fulfilled promise) is acyclic, because a promise can only be resolved once per graph traversal, forcing the traversal to only explore a DAG.

        I should point out that datalock is still possible with this model, and type systems generally cannot prevent it either.

        1. 1

          This is great! Now I can go dig deeper.

        2. 3

          If you can tolerate a higher level language, Everything you ask for basically baked into erlang, plus things you haven’t thought of yet, like combining failure domains (if task x dies, also take down task y… And that should trigger closing y’s open database connection (if it’s open) with 0 extra lines of code kthxbai) The nursery concept in OP is similar to elixir’s Task.async_stream.

          1. 1

            Yeah, I knew that. What I want is some language that can effectively replace C and Rust. C because bugs, Rust because complexity. That means that the language needs to have a minimal runtime and be able to have threads and processes. As far as I know, Erlang only provides the latter.

            1. 1

              Try zig? It doesn’t quite catch as many things as rust but it’s still quite nice, and way, way safer than C (you get overflow, underflow, buffer overflow, double free, memory leak, and many UAF checked in test; you’re writing tests, right?). I think the long play is that zig is very easy to parse (easier than C, tbh) so I think we will see formal verification tools for the cases where you need thread safety. These will also catch race conditions, which one can’t do generally at compile time if you care for a sane dev experience

              1. 2

                I admit I am tired of the Zig advocacy.

                I know about Zig. I’ve talked with Zig’s designer. Zig does not have anything like this here, and in fact, with its async design, has one of the worst (in my opinion) concurrency stories among programming languages. Re-ifying stack frames is just not the greatest idea.

                1. 1

                  I dunno, it worked great for me! (I seamlessly unrolled erlang’s awkward c abi “tail-call” to build in a yield statement that you can drop into an imperative zig loop). You can thus interchange your zig code among the different cooperation styles (cooperative, threaded, preemptive) to tune overhead, latency, and isolation, without having to write your code over and over. That level of flexibility speaks to the correct choice of reifying stack frames.

                  Anyways, to each their own

                  1. 1

                    Flexibility is the problem. Please do not push Zig on me.

                    1. 2

                      Please do not push Zig on me.

                      Anyways, to each their own

        3. 4

          This is a unique type of post. It sparked a whole new stream of language and library feature development. One of the most notable is the design shift of Kotlin coroutines which was well described by one of the lead language designers.

          1. 3

            This seems like a barrier synchronization primitive which is fine in so far as it goes.

            At minimum other programming languages have those at the language level:

            Occam - https://www.eg.bucknell.edu/~cs366/occam.pdf X10 - (mentioned in another comment on the same article) http://x10-lang.org/

            Golang has a barrier construct.

            In general it’s not clear to me what the claim is here. Forking threads is difficult to reason about without any constraints on data sharing. Barriers don’t magically solve that.

            Worth mentioning Verona (discussed in another thread) which has a sort of mirror image construct - a data acquisition barrier to start execution. https://github.com/microsoft/verona/blob/master/docs/explore.md#concurrent-ownership

            1. 2

              I think this misses a point a bit: structured concurrency is not about adding functionality, it’s about adding restrictions. It’s not “go has barriers”, but rather “usage of barriers is not enforced in go”.

              https://250bpm.com/blog:71/ is a different, shorter take on the topic.

              1. 1

                Right but this restriction doesn’t do much because it doesn’t touch data sharing patterns.

                1. 5

                  It’s true that structured concurrency doesn’t prevent data races. But data races are not the only problem with concurrency.

                  Another issue is loss of causality. The most typical symptom is “I need to put sleep in the test, because the tests spawns some concurrent work and I have no way of knowing when it’s done”. A different symptom is “I want to gracefully shutdown the system, but there’s a lot of running concurrent activities, how do I wait for all of them?”.

                  These are separate problems, they require separate solutions. For example, Rust shares data sharing neatly, but doesn’t help with with structuring at all. A typical bug is that, at the moment the main finishes, there are other threads running, which get killed abruptly without calling destructors, which may lead to files not being flushed, etc.

            2. 2

              I built a similar library for Haskell a while back, though I called them supervisors rather than nurseries:

              https://hackage.haskell.org/package/supervisors-0.2.1.0/docs/Supervisors.html

              I needed this due to limitations of Haskell’s async package, which is similar in terms of expressiveness to what the author calls run_concurently.

              I find myself occasionally wanting some notion of move semantics though; at some point I’ll probably bridge it with https://hackage.haskell.org/package/lifetimes (which I also wrote, later), though I might want to play with the linear types stuff in GHC 9 before I commit to hard to an API.