1. 8

    This article defines “thread-safe” and “thread-compatible” only in terms of whether it’s correct for multiple threads to concurrently perform operations on the same object, which is a C++-centric view. I find Rust’s Sync/Send conceptually simpler than this, and also more expressive.

    An example not captured by the article’s approach is a type which holds an index into a thread-local intern table:

    pub struct InternedStr { i: u32 }
    // negative impls need nightly: #![feature(negative_impls)]
    impl !Sync for InternedStr {}
    impl !Send for InternedStr {}
    impl InternedStr {
        pub fn new(s: &str) -> Self {
            // pseudocode: find s at index i, else insert. return InternedStr{i}
        }
        pub fn get(&self) -> &str {
            // pseudocode: unsafe { TABLE.get_unchecked(self.i) }
        }
    }

    InternedStr falls into the quadrant labelled “(not useful)” at the bottom of the article, because there is no interior mutability in it, yet !Sync. But this is reasonably common in Rust — for example most of the procedural macro API (the compiler’s libproc_macro crate) consists of types like this.

    Thinking about InternedStr in terms of whether a data race occurs if the methods are invoked concurrently, or if the “const” methods are invoked concurrently, doesn’t lead to a useful conclusion because there isn’t mutation being performed.

    In Rust, Send for a type T means sending T from one thread to another is safe. It should be clear that InternedStr is not Send because accessing the same index in a different thread’s intern table would read out of bounds or refer to the wrong data. Sync for a type T means sending &T (a “shared reference”) from one thread to another is safe. InternedStr is not Sync for the same reason. (In general a type can be either one or both or neither.)
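
    For anyone who wants to try this on stable Rust, here is a minimal sketch of the idea. It is not how any real interner is implemented. The PhantomData<*const ()> marker field, the linear search, leaking each string to get a 'static lifetime, and checked indexing (instead of get_unchecked) are all simplifying assumptions of mine; only the name TABLE comes from the pseudocode above.

    ```rust
    use std::cell::RefCell;
    use std::marker::PhantomData;

    thread_local! {
        // Each thread gets its own table, so an index means nothing in another thread.
        static TABLE: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    }

    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    pub struct InternedStr {
        i: u32,
        // Raw pointers are neither Send nor Sync, so this zero-sized marker
        // opts the whole struct out of both without nightly features.
        _not_send_sync: PhantomData<*const ()>,
    }

    impl InternedStr {
        pub fn new(s: &str) -> Self {
            TABLE.with(|t| {
                let mut t = t.borrow_mut();
                // Find s at index i, else insert it at the end.
                let i = match t.iter().position(|&e| e == s) {
                    Some(i) => i,
                    None => {
                        // Leak the copy so it lives for the rest of the program
                        // (a simplification for this sketch).
                        t.push(Box::leak(s.to_owned().into_boxed_str()));
                        t.len() - 1
                    }
                };
                InternedStr { i: i as u32, _not_send_sync: PhantomData }
            })
        }

        pub fn get(&self) -> &'static str {
            // Checked indexing; this is only correct on the thread that created self.
            TABLE.with(|t| t.borrow()[self.i as usize])
        }
    }
    ```

    With this definition, trying to move an InternedStr (or a reference to one) into std::thread::spawn is rejected at compile time, which is the property the !Send and !Sync impls express.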

    1. 7

      Another related observation is that Rust doesn’t try to promise that all methods are thread safe. For example, Mutex::get_mut gives access to the underlying object without locking. This is safe because it requires exclusive access to the Mutex object itself (i.e., to call this method, you need to prove to the compiler that no one else can access the mutex).
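
      A small sketch of the difference, using only std::sync::Mutex. The function names are made up for illustration:

      ```rust
      use std::sync::Mutex;

      // Takes &mut: the borrow checker has already proved nobody else can
      // touch the mutex, so get_mut hands out the value without locking.
      // It only returns Err if the mutex was poisoned.
      fn bump_without_locking(counter: &mut Mutex<u64>) {
          *counter.get_mut().unwrap() += 1;
      }

      // Takes &: other threads may hold the same reference, so we must lock.
      fn bump_with_locking(counter: &Mutex<u64>) {
          *counter.lock().unwrap() += 1;
      }

      fn get_mut_demo() -> u64 {
          let mut counter = Mutex::new(0);
          bump_without_locking(&mut counter);
          bump_with_locking(&counter);
          // into_inner consumes the mutex, so it needs no lock either.
          counter.into_inner().unwrap()
      }
      ```

      Here get_mut_demo returns 2; the first increment never touches the lock.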

      1. 1

        This is a really interesting point. It is another way in which Rust has a richer model for thread-safety than C++. Not only is thread-safety modeled (and checked) by the type system, but thread-safety overhead can be avoided in cases where exclusive access can be proved.

        In C++ one could document that some methods are thread-safe and others are not. But this is a very sharp edge that the language can’t check, so in practice you don’t see it very often.

        I added a paragraph about this point to the article, thank you for the info!

        Do you know of other examples of this, beyond Mutex::get_mut()?

        1. 1

          The same pattern works for other synchronization primitives, like AtomicUsize or OnceCell: if you have &mut AtomicU32, you can extract an &mut u32 out of it.

          The most interesting case is &mut Arc<T>, whose get_mut returns an Option<&mut T>. That is, if you have exclusive access to the Arc at compile time and, at runtime, the reference count is one, you get exclusive access to the interior. This allows implementing functional persistent data structures using opportunistic mutation for the common case where there are no copies of the data structure.
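
          Roughly, both cases look like this. The function names are hypothetical, and the Vec stands in for a real persistent data structure:

          ```rust
          use std::sync::atomic::{AtomicU32, Ordering};
          use std::sync::Arc;

          fn atomic_get_mut_demo() -> u32 {
              let mut a = AtomicU32::new(1);
              // &mut AtomicU32 -> &mut u32: a plain, non-atomic read-modify-write.
              *a.get_mut() += 1;
              a.load(Ordering::Relaxed)
          }

          // Appends x, mutating in place when this Arc is the only handle and
          // copying otherwise. Returns true if the in-place path was taken.
          fn push_opportunistic(list: &mut Arc<Vec<i32>>, x: i32) -> bool {
              if let Some(v) = Arc::get_mut(list) {
                  v.push(x);
                  return true;
              }
              // Shared: leave the other handles untouched and replace ours.
              let mut copy = (**list).clone();
              copy.push(x);
              *list = Arc::new(copy);
              false
          }
          ```

          Note that Arc::get_mut also returns None if any Weak pointers exist. The standard library's Arc::make_mut packages up the same clone-if-shared logic.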

          1. 1

            That is nifty, thanks for the pointers! :)

      2. 5

        Hi David, thanks for the interesting example. It seems that InternedStr represents a useful pattern for which C++ simply has no analogue, as there is no way to express !Send in C++ in a way that the type system knows about.

        Inasmuch as the article attempts to map C++ concepts to their closest Rust equivalents, I believe it has done so accurately. But you rightly point out that I was too quick to assume that the “!Sync + no interior mutability” quadrant is not useful. Rust’s modeling of the problem opens up new opportunities like InternedStr. I’ll add a correction to the article.

        1. 5

          Thanks. :) As another real-world contrasting example if you want it, consider the standard library’s MutexGuard type, which roughly works like this:

          pub struct Mutex<T>{...}
          pub struct MutexGuard<T>{...}
          impl<T> Mutex<T> {
              pub fn new(value: T) -> Self;
              pub fn lock(&self) -> MutexGuard<T>;
          }
          impl<T> MutexGuard<T> {
              pub fn deref(&self) -> &T;
              pub fn deref_mut(&mut self) -> &mut T;
          }
          impl<T> Drop for MutexGuard<T> {  // this is ~MutexGuard
              fn drop(&mut self) {/* unlock the mutex */}
          }

          Thinking about whether a MutexGuard’s methods cause a data race if you call them concurrently is a mindbender because the point of a mutex is to only give out a single MutexGuard to one thread at a time. Whether that makes MutexGuard thread-safe or thread-compatible or thread-unsafe is a big shrug.

          For Sync and Send though, it turns out to be simpler: MutexGuard<T> is Sync as long as T is Sync because sending such a &MutexGuard<T> to a different thread is totally fine; however MutexGuard is not Send because sending ownership of it over to a different thread would result in its destructor running on that other thread, i.e. unlocking the underlying Mutex on a different thread than the one which locked it, which is UB on many platforms/mutex implementations.
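
          A sketch of what that permits in practice, using std::thread::scope so that a borrowed guard can cross threads. The function name is made up for illustration:

          ```rust
          use std::sync::Mutex;
          use std::thread;

          fn read_through_shared_guard() -> i32 {
              let m = Mutex::new(41);
              let mut guard = m.lock().unwrap();
              *guard += 1;

              // Sharing &MutexGuard<i32> with another thread compiles because
              // MutexGuard<i32> is Sync (i32 is Sync).
              let guard_ref = &guard;
              thread::scope(|s| s.spawn(move || **guard_ref).join().unwrap())

              // Moving `guard` itself into the spawned closure would be rejected
              // by the compiler, because MutexGuard is not Send.
              // The guard is dropped, and the mutex unlocked, on this thread.
          }
          ```

          The scoped thread only reads through the shared reference; the unlock still happens on the thread that took the lock.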

      1. 1

        This is awesome. I maintain a programming language interpreter, and now I want to rewrite it using this technique. Plus, it’s cool that the tail-threaded interpreter technique is applicable to a wider domain, like protobuf parsing. What else could I parse this way?

        If I use this technique, then does it block my ability to compile to WASM, or can LLVM generate WASM code that doesn’t explode the stack when I use ‘musttail’? Tail calls are a WASM proposal, in “implementation phase”, but Firefox has been stalled on this for two years, last I checked. I’m just not sure if LLVM ‘musttail’ depends on WASM tail calls.

        1. 1

          That is a great question. I don’t know the answer to it. LLVM has had a “musttail” attribute at the IR level for a while; the Clang change just piggybacked on this existing work. I assumed this means that tail calls are supported on all targets. But I’m not sure if there are complications that would prevent this on some targets, like wasm.

        1. 4

          Great article! At first I didn’t see the significance of the new attribute, but now I see that forcing the compiler to emit a tail-call even in unoptimized builds (and breaking the build if it can’t) is important.

          The referenced article about the design of wasm3 is killer. I came across it a few months ago and wished I’d seen it a year ago when I was implementing a simple byte code interpreter. It spells out the advantages of the threaded call style (to use the old FORTH term) very clearly.

          FYI @haberman, I believe there’s a typo (thinko?) in the following sentence; “caller” should be “callee”:

          The preserve_most attribute makes the caller responsible for preserving nearly all registers, which moves the cost of the register spills to the fallback functions where we want it.

          1. 2

            You are correct, thanks for the heads up!

          1. 1

            i had heard that luaJIT was so complex that basically nobody besides mike pall could maintain it, but i didn’t realize it was written in assembly!

            enjoyed learning more about careful optimization in this article and the interesting connections between programming style & relationship with compiled output.

            1. 6

              Hi there, article author here. I’m glad you enjoyed the article. LuaJIT’s interpreter is written in assembly, but the rest (parser, optimizer, code generator, etc) are written in C.

              1. 1

                i had heard that luaJIT was so complex that basically nobody besides mike pall could maintain it, but i didn’t realize it was written in assembly!

                FWIW, PHP’s JIT is actually basically LuaJIT’s with a lot ripped out of it.