1. 18
  1. 8

    I think the reason we’ve been able to hide the distributed nature of modern CPUs is that a single die is the most reliable distributed system you can make. Latency is near zero, availability is nearly 100%, and error rates must be very low too.

    Similarly, there are lots of APIs for channeling function calls between threads in a process (for example, C++’s std::async). These work because threads in a single process aren’t subject to most of the Fallacies of Distributed Computing.
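
    A minimal sketch of how completely that hand-off is hidden (assuming a C++11 compiler; the function here is invented):

    ```cpp
    #include <future>
    #include <iostream>

    // A stand-in for real work we'd like to run on another thread.
    int expensive_sum(int a, int b) { return a + b; }

    int main() {
        // std::async may run the callable on a separate thread, but the
        // caller never has to care: get() blocks until the result arrives,
        // exactly as if this were an ordinary function call.
        std::future<int> result = std::async(std::launch::async, expensive_sum, 2, 3);
        std::cout << result.get() << "\n";  // prints 5
    }
    ```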

    1. 2

      It’s an interesting article with what seems to be a compelling thrust, but I’m struggling a bit to understand where the author is trying to take it.

      So, yes, modern computers are in and of themselves distributed systems with a myriad of buses, multi-core processors, and arbitrarily complex software systems governing their operation. That alone is an almost miraculous thing worth pondering.

      But it seems that from there the author is asserting that we don’t currently have good abstractions to help us write potentially globe-spanning distributed systems, and this is where they and I part ways.

      I work for a Major Cloud Provider, and we operate crazy-scale, globe-spanning distributed systems as a way of life. We’re practically awash in abstractions that help us achieve this. They all have different characteristics depending on precisely what they’re trying to achieve and how they’re choosing to present it, and no doubt, as with everything, they need to iterate and evolve.

      So, if what they’re really saying is “We need to continue innovating better abstractions that make running distributed systems easy and safe” then I’m in violent agreement :)

      1. 8

        I work for a Major Cloud Provider, and we operate crazy-scale, globe-spanning distributed systems as a way of life. We’re practically awash in abstractions that help us achieve this.

        Yes, but those don’t abstract away the distributed nature of the system. That’s why they differ from the abstraction in an individual computer.

        1. 2

          There’s a challenge there though, right? Human brains, by their very nature, have an incredibly difficult time visualizing parallel tasks.

          Look at the utter sh** show any kind of threaded programming has been for the last 30 years, even though far better models like Actors have been available the whole time.

          How does one both enable people to build reliable distributed systems that work and NOT hide their distributed nature?

          1. 3

            How does one both enable people to build reliable distributed systems that work and NOT hide their distributed nature?

            I think you may have misunderstood me. Today’s large-scale distributed abstractions don’t hide the distributed nature of the system, and, ostensibly, they’re used to build reliable systems.

            That differs from the abstraction in an individual computer, which does hide the distributed nature of the system, and also, ostensibly, is used to build reliable systems.

            Both kinds of abstractions, “hiding” and “non-hiding”, are heavily used. (I haven’t said anything about whether one is better than the other.)
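
            To make the contrast concrete, here’s an invented sketch (the names are made up, not taken from any real API); the only difference between the two kinds is whether the possible asynchrony shows up in the signature:

            ```cpp
            #include <future>
            #include <iostream>
            #include <string>

            // "Hiding" shape: the signature pretends the read is synchronous,
            // even if a disk or a network hop lurks behind it.
            std::string read_value_blocking() { return "42"; }

            // "Non-hiding" shape: the asynchrony is visible in the type, so
            // the caller has to decide where (and whether) to wait.
            std::future<std::string> read_value_async() {
                return std::async(std::launch::async, [] { return std::string("42"); });
            }

            int main() {
                std::cout << read_value_blocking() << "\n";
                auto fut = read_value_async();
                // ...other work could overlap with the read here...
                std::cout << fut.get() << "\n";
            }
            ```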

            1. 2

              How does one both enable people to build reliable distributed systems that work and NOT hide their distributed nature?

              I think we already have some great abstractions around distributed systems in both theory and practice, especially if you work on a big cloud. However, when it comes to an individual computer, we pretend that everything works synchronously. I think we could author our systems much more effectively by designing with distributed-systems abstractions from the beginning.

              1. 1

                Ah, you’re absolutely right. We’re still building systems using an architecture designed in the 50s and 60s, when computers were implemented using vacuum tubes, paper tape, and raw grit :)

                I do have to wonder, though: if we start thinking about building computers based on non-von Neumann architectures, will humans actually be able to reason about them and their internal workings?

                I’d argue that even WITH an essentially serial architecture, most people, myself included, can’t even begin to truly wrap their brains around everything inside a modern computer.

                It’s one of the reasons I so very much enjoy working with 8 bit era systems like my Atari 800XL. You really CAN understand everything about the machine from ‘tail to snout’ as they say :)

                1. 3

                  Ah, you’re absolutely right. We’re still building systems using an architecture designed in the 50s and 60s, when computers were implemented using vacuum tubes, paper tape, and raw grit :)

                  Most modern CPUs have a modified Harvard architecture up to the L2 or L3 cache: instructions and data travel through separate caches at the inner levels and only merge in the unified outer levels.

                  1. 2

                    Thanks for that. TIL!

                    https://en.wikipedia.org/wiki/Harvard_architecture

                    I’d not heard of the Harvard architecture, but reading about it, I can see the issues around contention are certainly widely felt whenever you talk about performance.

                    1. 1

                      While separate L1 instruction and data caches are ubiquitous, yes, I think that’s largely an implementation detail due to circuit-design constraints – they’re still the same address space. Some CPUs, e.g. x86, will even enforce coherence between them, so a store instruction to an address that happens to be present in the I-cache will invalidate that line (though others require manual I-cache invalidation for things like JITs and self-modifying code).
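
                      For the JIT case, a hedged sketch of what that manual invalidation looks like, assuming GCC/Clang’s __builtin___clear_cache builtin and POSIX mmap (on x86 the builtin typically compiles to nothing, since the hardware keeps the caches coherent):

                      ```cpp
                      #include <sys/mman.h>
                      #include <cstddef>
                      #include <cstring>

                      // Hypothetical JIT step: copy freshly generated machine code into
                      // an executable page, then make sure instruction fetch can see it.
                      void* emit_code(const unsigned char* code, std::size_t len) {
                          void* mem = mmap(nullptr, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                          if (mem == MAP_FAILED) return nullptr;
                          std::memcpy(mem, code, len);
                          // On cores without hardware I/D coherence (e.g. many ARM designs),
                          // the new bytes may still be invisible to the instruction cache.
                          __builtin___clear_cache(static_cast<char*>(mem),
                                                  static_cast<char*>(mem) + len);
                          return mem;
                      }
                      ```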

                2. 2

                  Look at the utter sh** show any kind of threaded programming has been for the last 30 years, even though far better models like Actors have been available the whole time.

                  Better for what? There are plenty of problems where threads are preferable to actors. There’s a reason threads were invented in the first place, despite already having processes which can communicate through message passing.

                  1. 3

                    I would be open to hearing this story. The version I learned is that threads arose because disk I/O was expensive, and sharing the CPU with threads could allow a program to simultaneously wait for a disk and run a computation. Today, we have better options; we can asynchronously manage our I/O.

                    1. 2

                      Actors are an abstraction implemented on top of threads. (I have implemented Actors myself.)

                      Concurrency by means of multiple communicating OS processes tends to be inefficient, because processes are expensive to create and slow to context-switch between. Messaging is expensive too. So lightweight threads in a process were a performance boost as well as easier to use.

                      The advantage of actors is they’re much easier to reason about. But I agree with you that in some cases it’s simpler to use threads as your model and just deal with mutexes.
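
                      For what it’s worth, the layering looks roughly like this; a toy sketch, not any particular actor library. The whole trick is that only the owning thread ever touches the state:

                      ```cpp
                      #include <condition_variable>
                      #include <functional>
                      #include <mutex>
                      #include <queue>
                      #include <thread>

                      // Toy actor built directly on a thread: one thread owns the state,
                      // everyone else communicates by queueing messages (closures) to it.
                      class CounterActor {
                      public:
                          CounterActor() : worker_([this] { run(); }) {}
                          ~CounterActor() {
                              send([this] { done_ = true; });
                              worker_.join();
                          }

                          // Safe to call from any thread.
                          void increment() { send([this] { ++count_; }); }

                      private:
                          void send(std::function<void()> msg) {
                              { std::lock_guard<std::mutex> lk(m_); mailbox_.push(std::move(msg)); }
                              cv_.notify_one();
                          }

                          void run() {
                              while (!done_) {
                                  std::unique_lock<std::mutex> lk(m_);
                                  cv_.wait(lk, [this] { return !mailbox_.empty(); });
                                  auto msg = std::move(mailbox_.front());
                                  mailbox_.pop();
                                  lk.unlock();
                                  msg();  // runs on the actor's thread, so count_ needs no lock
                              }
                          }

                          int count_ = 0;
                          bool done_ = false;  // only touched on the worker thread
                          std::queue<std::function<void()>> mailbox_;
                          std::mutex m_;
                          std::condition_variable cv_;
                      };
                      ```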

                      1. 4

                        in some cases it’s simpler to use threads as your model and just deal with mutexes.

                        Also, in a lot of the (IMO) good use cases for threads, “just deal with mutexes” is barely even a concern. If you are processing a bunch of independent units of work with little or no shared state, threads make the flow of control really easy to reason about and the classic pitfalls rarely come up.

                        This is arguably the situation for a pretty big percentage of multithreaded programs. For example, .NET or Java web services, where there are large numbers of framework-managed threads active and there is shared state under the covers (database connection pools, etc.) but the vast majority of the day-to-day application logic can be correctly and safely written with zero attention paid to thread-related issues.
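
                        As an invented illustration of that “independent units, nothing to lock” shape, using plain std::thread and no framework:

                        ```cpp
                        #include <thread>
                        #include <vector>

                        // Each unit of work is independent: no shared mutable state, no mutexes.
                        void process(std::vector<int>& items, size_t begin, size_t end) {
                            for (size_t i = begin; i < end; ++i)
                                items[i] *= 2;  // stand-in for real per-item work
                        }

                        int main() {
                            std::vector<int> items(1000000, 1);
                            const size_t n_threads = 4;
                            const size_t chunk = items.size() / n_threads;

                            std::vector<std::thread> pool;
                            for (size_t t = 0; t < n_threads; ++t) {
                                size_t begin = t * chunk;
                                size_t end = (t == n_threads - 1) ? items.size() : begin + chunk;
                                // Disjoint ranges: each thread touches only its own elements.
                                pool.emplace_back(process, std::ref(items), begin, end);
                            }
                            for (auto& th : pool) th.join();
                        }
                        ```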

                        1. 1

                          True dat, but what you’re describing is also a good fit for Actors or similar abstractions. Even the “under the covers” part.

                          1. 1

                            Exactly. A lot of network services that work on stateless protocols (looking at you, SIP and NNTP…) essentially have no state to mutate except for a backend database. Threads are easy abstractions because you can partition incoming work in exactly the thread/db connection pool patterns that you talk about.

                            For more stateful work (say, working on some form of distributed cache), the thread pool model can prove to be much more complicated.

                        2. 1

                          So, you’re right. My comment was off the cuff and not particularly articulate.

                          What I was getting at is that there was a time, mostly the early-ish Java era, when legions of workaday business programmers were exposed to threads and concurrency in contexts with which they were unfamiliar.

                          They then proceeded, by and large, to make a giant cock-up of it, because without some helpful abstractions to moderate them, threads are so powerful that they give people more than enough rope to hang themselves.

                          Later on, the Java community reacted to this and introduced things like ThreadPool and ThreadGroup (I think those were the names; my Java is rusty), which helped people use threads in ways that were much easier to reason about, and they had a much better time.

                          But you’re right, in the hands of a capable programmer with the right experience, threads are an incredibly powerful tool with tremendous potential.

                          1. 2

                            FWIW, the abstraction we arrive at doesn’t have to be threads, actors, or anything else. I’m personally very sympathetic to capability-passing style API boundaries (and I know a few folks here are also 😛), but what I’m trying to get at is this: the current abstractions we have in systems design are purely trying to hold onto the old synchronous model of computing. They’re not a thoughtful abstraction to help tame the complexity of modern computers; they’re literally just an accident of history. And that leaves out a lot of important things you could do with an API that does anything except pretend that every action is synchronous.