Threads for one-more-minute

    1. 5

      Since the type of Lisp I most often write is Clojure, I’m disappointed (though not very surprised) that this syntax isn’t any more applicable to Clojure than sweet-expressions. Clojure’s [] and {} brackets can be represented as vec and hash-map, but it just doesn’t look as clear. For example, starting with this random snippet of ClojureScript (source):

      (defn modifier-keys
        [e]
        (let [shift (.. e -shiftKey)
              meta (.. e -metaKey)
              ctrl (.. e -ctrlKey)
              alt (.. e -altKey)]
          {:shift shift :meta meta :ctrl ctrl :alt alt}))
      

      This would be the equivalent Wisp-syntax ClojureScript:

      defn modifier-keys
        vec e
        let
          vec
            shift : .. e -shiftKey
            meta : .. e -metaKey
            ctrl : .. e -ctrlKey
            alt : .. e -altKey
          hash-map :shift shift :meta meta :ctrl ctrl :alt alt
      

      It’s much harder to spot the function parameter e in vec e than in [e]. And changing that line to [ e with no closing bracket doesn’t look right either.

      1. 3

        A while back there was a version of this idea as a Clojure library. I always liked the Haskell-style :. Likely to have bit-rotted though.

    2. 8

      Congrats! I believe Hare uses QBE as a backend, and there may be others. Perhaps some users could be listed on the home page?

      QBE’s goals make a good bit of sense to me; it seems right for the backend to focus on emitting really good machine code, and it’s ok to have a “garbage in, garbage out” philosophy. LLVM was designed to handle almost everything, but recent languages (Swift, Rust, Julia) have their own middle IRs for language-specific optimisations. So it increasingly makes sense to expect that the backend receives relatively good IR, and doesn’t need some of the more magical simplifications. All those arch-specific peephole optimisations provide the real value.

      I’m only slightly disappointed to see Phi nodes, which are a bit less elegant than block arguments IMO (which MLIR, Swift and Cranelift’s newer IRs use – rationale). But of course it’s no deal-breaker.

      1. 7

        I was quite disappointed to see that there’s no pointer type in the IR. That means that it will never be able to target a CHERI platform (or any other architecture where pointers are not integers), so the Morello system that I’m writing code on right now can never be a target.

        1. 6

          It looks like it has changed, but I remember that at the beginning the goal of QBE was to be super simple as opposed to the “bloated LLVM”; they were planning to target only amd64 and arm64. It looks like they now also support riscv64, so they might have given up on that “few architectures” goal.

        2. 3

          so the Morello system that I’m writing code on right now

          Exciting! If I had a desktop-capable CHERI machine on my desk, I would also think first of coming online to tell the world :-)

          1. 2

            Unfortunately, the GPU driver doesn’t work yet, but apparently Wayland does once the GPU driver is working. I’m hoping to start using it as my work desktop soon, for now I’m sshing into it from a Windows machine. My normal working environment is the Windows Terminal sshing into a FreeBSD/x86-64 VM, so switching it to sshing into a FreeBSD/Morello computer isn’t that big a change…

            1. 2

              On a quick skim over the first few search results: CHERI is an ISA, something similar to RISC, just extended with some capabilities around virtualization and memory protection? And Morello is… a CPU? SoC? As in, not exactly ARM, but something like that. Am I in the neighbourhood?

              Can you try to explain to a layman what it does differently than ARM or RISC-V?

              1. 14

                CHERI is a set of extensions that add a capability model to memory. The hardware supports a capability type that is protected by a non-addressable tag bit when stored in memory. A capability grants rights to a range of an address space (e.g. read, write, and / or execute permissions). Every memory access (load, store, instruction fetch) must be authorised by a capability, which must be presented for the operation to succeed. For new load and store instructions, the base operand is a capability in a general-purpose capability register. For legacy instructions, the capability is an implicit default data capability.

                In CHERI C/C++, every pointer is lowered by the compiler to be represented by a capability. This means that you cannot access any memory except via a pointer that was created by something holding a more powerful capability. For example, the compiler will derive bounded capabilities from the stack capability register for stack allocations. The OS will return a capability in response to mmap system calls, which the memory allocator will then hold and subdivide to hand out object-bounded capabilities in response to malloc. This means that you cannot forge a pointer and you cannot ever access out of bounds of an object (guaranteed by the hardware). With our temporal safety work, you also cannot access an object that has been freed and reallocated (guaranteed by software, accelerated by some hardware features). To be able to compile for this kind of target, the compiler must maintain the integer/pointer distinction all of the way through to the final machine-code lowering (arithmetic on pointers uses different instructions to arithmetic on integers, for example).

                The CHERI extensions are designed to be architecture neutral. We originally prototyped on MIPS and are now doing RISC-V prototyping and are in the early stages of an official CHERI RISC-V extension. Morello is a localisation of CHERI to AArch64, and Arm has produced a few thousand test chips / dev systems based on the Neoverse N1 for software development and experimentation. This is what I have under my desk: a modified quad-core Neoverse N1 running at 2.5GHz with 16 GiB of RAM in a desktop chassis with a 250 GiB SSD. We also have a load of them in a rack for CI (snmalloc CI runs on Morello) and benchmarking.

                If all goes well, I expect to see CHERI extensions on mainstream architectures in the next few years, and so developing a compiler toolchain based on an abstraction that can’t possibly support them without significant rearchitecting seems like an unfortunate decision, especially when maintaining a separate pointer type is fairly simple if you design it in from scratch. The fact that LLVM had separate integer and pointer types in IR made the original CHERI work feasible; the fact that it loses that distinction by the time it reaches SelectionDAG and the back end (one of the first questions the target-agnostic code generator asks the back end is ‘so, which integer type do you want to use for pointers?’) made it harder.

                1. 1

                  Thanks for the summary, very cool stuff. Kudos for pushing this for so long.

              2. 2

                Discussions from around the Morello announcement:

                https://lobste.rs/s/w32bav/morello_arm_cheri_prototype_hits_major <- I recommend the Microsoft article
                https://lobste.rs/s/wqts1n/capability_hardware_enhanced_risc

                CHERI is (I think) both a security model and a set of hypothetical and/or experimental extensions for multiple ISAs, including MIPS, ARM, RISC-V, and x86, the latter of which is currently just a “sketch” [1]

                Morello is the realization of actual silicon implementing the extensions [2]

                As for an actual description, I’d rather point you towards the Microsoft article (I honestly really liked it). That and the discussions were what painted most of my picture of the project(s). There’s also the technical report An Introduction to CHERI, which helped fill in other details, but there were things or referenced concepts I wasn’t clear on.

                [1] https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/cheri-faq.html:

                What ISA(s) does CHERI extend?

                To date, our published research has been based on the 64-bit MIPS ISA; MIPS was the dominant RISC ISA in use in 2010 when the project began. … However, since that time we have performed significant investigation into CHERI as portable architectural security model suitable for use in multiple ISAs. We have also developed an “architectural sketch” of a CHERI-x86-64 that extends the 64-bit x86 ISA with CHERI support.

                [2] https://msrc-blog.microsoft.com/2022/01/20/an_armful_of_cheris/

                The Morello CPU is a quad-core, 2.5GHz modified Arm Neoverse N1, a contemporary superscalar server core. Prior to this, the most advanced CHERI implementation was the CHERI version of Toooba, which can run in an FPGA at 50MHz in a dual-core configuration and is roughly equivalent in microarchitecture to a mid-‘90s CPU.

                1. 2

                  Thank you. I’ll put the MS article on a reading list, sounds very interesting.

            2. 1

              By GPU do you mean the Panfrost port br@ is working on, or an amdgpu in a PCIe slot? How’s the PCIe situation on Morello?

              1. 1

                The panfrost bit. There are some PCIe slots, but I’ve not tried plugging anything into them yet. We’re hoping to set some of them up with some RDMA-capable smartNICs and see if we can do something interesting eventually.

                1. 1

                  Would be very interesting to try amdgpu :)

      2. 5

        Congrats! I believe Hare uses QBE as a backend, and there may be others. Perhaps some users could be listed on the home page?

        There’s a “Users” tab at the top that lists cproc, hare and others: https://c9x.me/compile/users.html

        1. 3

          Right there in the menu as well! Thank you for pointing this out to me. I think I had assumed this would be something like a community page (eg “user mailing list”).

    3. 31

      This is as opposed to languages where conformance must be explicitly declared somehow, either with a formal subclass relationship (the class-based portion of C++) or by explicitly declaring conformance to an interface (Java). Not needing to formally declare conformance is powerful because you can take some pre-existing object, declare a new interface that it conforms to, and then use it with your interface.

      There is a third option: things like Rust’s traits, Swift’s protocols or Haskell’s typeclasses, all of which are like a post-hoc version of Java interfaces. You’re effectively advocating for dynamic/structural typing because it addresses the expression problem. That’s not wrong, but there are ways to do it in more statically/nominally typed systems too.

      1. 4

        Even Go, which is not noted for its expressive type system, does this.

      2. 1

        I’m not familiar with Rust’s traits or Swift’s protocols. For Haskell’s type classes, if you want to extend a predefined type to conform to a new type class, you would need to newtype it with that type class, which is still inconvenient as you need to call existing functions with that predefined type under a Monad that wraps the newtype.

        1. 14

          if you want to extend a predefined type to conform to a new type class, you would need to newtype it with that type class

          You do not need to do this at all.
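
          For the record, a minimal sketch of what that looks like (the Pretty class and its instances are made up here purely for illustration): you declare the new class and then write instances for pre-existing types directly, with no newtype wrapper and no wrapping monad:

          -- A brand-new type class...
          class Pretty a where
            pretty :: a -> String

          -- ...with instances for types that already exist; no newtype needed.
          instance Pretty Int where
            pretty n = "an Int: " ++ show n

          instance Pretty Bool where
            pretty True  = "yes"
            pretty False = "no"

          main :: IO ()
          main = putStrLn (pretty (42 :: Int)) >> putStrLn (pretty False)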

          1. 3

            Seconding this, although if you didn’t define either the type or the typeclass you get into orphan instance territory.
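
            For instance (a deliberately contrived sketch): Bool and Semigroup both come from base, so an instance like the one below, written in your own module, is an orphan. GHC accepts it, but warns under -Wall, because nothing stops another library from defining a clashing instance:

            module Orphans () where

            -- Orphan instance: neither Semigroup (the class) nor Bool (the type)
            -- is defined in this module. Lawful, though, since (&&) is associative.
            instance Semigroup Bool where
              (<>) = (&&)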

          2. 2

            I stand corrected. Thanks. I didn’t have enough coffee. You need newtype only if you want to further customize the type.

        2. 4

          In Haskell, there are no such limitations, as others mentioned. You can define as many instances as you want, as long as they don’t clash when imported.

          In fact, the limitation you’re describing is that of OOP interfaces! They are the ones that require writing adapters all the time when the class itself does not implement an interface.

          Rust does have a limitation: instances must be written either alongside the type definition, or alongside the trait definition. Less flexible than Haskell, but still much better than OOP interfaces.

    4. 1

      It’s always struck me as odd that so few VMs have lightweight threads. I guess the model wasn’t really proven until Go took off a while back, even if it seems obvious in hindsight.

      Maintaining a native runtime for all platforms is a lot to ask, and this has forced a lot of languages down the async/await path (though Clojure notably did CSP as a macro – Loom will be great for them). I expect that support from the JVM (and soon WASM) will have a big impact on language designs, and make the CSP model a lot more ubiquitous. Your move, dotnet!
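
      To make that model concrete, here is a rough sketch of CSP over lightweight threads, written in Haskell purely as an illustration (GHC’s runtime also provides green threads): a thousand cheap worker threads handing results to a consumer over a channel rather than sharing state.

      import Control.Concurrent (forkIO)
      import Control.Concurrent.Chan (newChan, readChan, writeChan)
      import Control.Monad (forM_, replicateM_)

      main :: IO ()
      main = do
        ch <- newChan
        -- Each worker is a lightweight thread; spawning a thousand of them is cheap.
        forM_ [1 .. 1000 :: Int] $ \i ->
          forkIO (writeChan ch ("worker " ++ show i ++ " done"))
        -- The main thread is the consuming end of the pipeline.
        replicateM_ 1000 (readChan ch >>= putStrLn)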

      1. 5

        The model was perfectly well proven by Erlang, in the ’80s. Go, by design, was nothing groundbreakingly novel.

        1. 1

          Yes, BEAM is the major exception of course. Though it seems fair to say that both Erlang and its VM are relatively niche. I think Go brought some of those ideas to the mainstream and convinced a lot of people that this approach was right in practice, especially outside of functional programming. It was influential for Julia’s approach to concurrency, for example.

          But I am speculating; if you know why more VMs are only just starting to have this feature, I’d love to hear it. As you say, the ideas have been around for a while, so something else must have changed.

          1. 2

            I suspect it has more to do with the leveling out of clock speeds, and the dominance of imperative programming. Prior to about a decade ago, computers got faster by improving single-threaded performance. Single-threaded programs got faster because the hardware got faster, and as such there wasn’t a strong pressure to move towards more concurrent systems.

            In this single-threaded world, imperative programming thrived. If it ain’t broke, don’t fix it, and all that. Imperative languages, however, make the already hard problem of thread scheduling even harder. If you read the justifications for Erlang’s design decisions, functional programming wasn’t chosen on its own merits, but for the ease of expressing a coherent concurrency model.

            Fast forward to the early 2010s: clock speeds level out. Moore’s law is now completely reliant on cramming more cores onto a CPU. There ought to be a hard push towards more concurrency, but it’s sluggish, because the question isn’t “how do we program with more concurrency?” but “how do we make our imperative languages more concurrent?”, which is a much harder problem than the other, already hard problem.

    5. 2

      The bigger benefit is this: When I write code in most languages the mental representation that I use for thinking about data in memory is a murky sort-of-boxes-and-arrows thing. When I write code in clojure my mental representation is literally just the data literal syntax.

      Part of the reason that programming is so difficult to learn is that most of what is happening is invisible by default and can only be inspected with active effort.

      Yes! I think Rich Hickey likes to talk about “reifying” things; this post makes a great case for that idea. When I was learning to code, so many languages just hand you a black box and expect you to know what to do with it. If you haven’t already developed skill in looking up API docs and such (as well as understanding common data structures implicitly), that’s really hard to deal with. Clojure almost always gives you something you can understand at a glance, and it makes a big difference. I may be a better programmer now, but I’m still thrown when I use a language that won’t let me just print whatever random value, which includes most static ones.

      By the way: I can’t see a date on this post. Looks like it’s new, but it’d be nice to include the time of posting somewhere on the page, if the author is reading.