1. 47
  1. 14

    If you haven’t read it, “What Every Programmer Should Know About Memory” is an extremely well-written deep dive into the phenomenon the author is just touching on here.

    I know it is shocking to learn for anyone unfortunate enough to have a computer science degree, but computers are physical and finite things, made out of matter and powered by energy, and my O(n²) can beat your O(n) when you forget that.
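
    For concreteness, here is a minimal C sketch of the same phenomenon (with a smaller exponent gap: a “worse” O(n) linear scan against a “better” O(log n) binary search). It is illustrative only; whether and where the crossover happens depends entirely on your CPU, compiler, and flags:

    ```c
    /* Sketch: for a small, sorted, cache-resident array, a predictable
     * linear scan can beat a binary search whose branches the CPU
     * cannot guess. Numbers vary wildly across machines. */
    #include <stdio.h>
    #include <time.h>

    static int linear_search(const int *a, int n, int key) {
        for (int i = 0; i < n; i++)
            if (a[i] >= key)
                return i;
        return n;
    }

    static int binary_search(const int *a, int n, int key) {
        int lo = 0, hi = n;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] < key)
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo;
    }

    int main(void) {
        enum { N = 64, ITERS = 10000000 };
        int a[N];
        for (int i = 0; i < N; i++)
            a[i] = 2 * i;                   /* sorted; fits easily in L1 */

        volatile int sink = 0;              /* keeps the loops from being deleted */

        clock_t t0 = clock();
        for (int i = 0; i < ITERS; i++)
            sink += linear_search(a, N, (i * 7) % (2 * N));
        clock_t t1 = clock();
        for (int i = 0; i < ITERS; i++)
            sink += binary_search(a, N, (i * 7) % (2 * N));
        clock_t t2 = clock();

        printf("linear: %ld ticks, binary: %ld ticks (sink=%d)\n",
               (long)(t1 - t0), (long)(t2 - t1), sink);
        return 0;
    }
    ```

    On many machines the scan wins at this size: every access is an L1 hit in a predictable pattern, while the search’s branches depend on the data.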

    1. 9

      It is not just the compiler. Whether 4+4+4 or 3*4 is faster depends on the CPU; the instruction set itself does not give any guarantees. So even assembly language is not how the computer works?
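
      To illustrate (a sketch; what actually gets emitted depends on the compiler and target), the spelling you choose may not even survive compilation:

      ```c
      /* Hypothetical illustration: the ISA says nothing about timing, and
       * the compiler may not emit what you wrote anyway. On x86-64, GCC
       * and Clang commonly compile BOTH of these functions to the same
       * single instruction: lea eax, [rdi + rdi*2]. */
      int three_adds(int x) { return x + x + x; }
      int one_mul(int x)    { return 3 * x; }
      ```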

      1. 22

        Nowadays I’d say that no, assembly language is not how the computer works. Assembly language is also running on another abstract machine.

        1. 7

          Yeah, modern CPUs do all of the following: (micro-)instruction buffering, out-of-order / parallel scheduling, branch prediction, speculative execution…

          Assembly language is definitely not how the machine works anymore, and that’s how we end up with Meltdown and Spectre.
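
          A classic way to see this from C (a sketch; the absolute numbers are entirely microarchitecture-dependent): the two timed runs below execute the same instructions over the same values, and only the branch predictor’s hit rate differs:

          ```c
          #include <stdio.h>
          #include <stdlib.h>
          #include <time.h>

          #define N (1 << 20)

          /* Data-dependent branch: hard to predict on random data,
           * trivially predictable once the data is sorted. */
          static long filtered_sum(const int *a) {
              long s = 0;
              for (int i = 0; i < N; i++)
                  if (a[i] >= 128)
                      s += a[i];
              return s;
          }

          static int cmp_int(const void *p, const void *q) {
              return *(const int *)p - *(const int *)q;
          }

          int main(void) {
              static int a[N];
              for (int i = 0; i < N; i++)
                  a[i] = rand() % 256;

              clock_t t0 = clock();
              long s1 = filtered_sum(a);           /* random order: many mispredictions */
              clock_t t1 = clock();

              qsort(a, N, sizeof a[0], cmp_int);   /* same values, predictable branch */
              long s2 = filtered_sum(a);
              clock_t t2 = clock();

              printf("unsorted: %ld ticks, sorted: %ld ticks (sums: %ld == %ld)\n",
                     (long)(t1 - t0), (long)(t2 - t1), s1, s2);
              return 0;
          }
          ```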

        2. 4

          Hasn’t that been true since microcode was invented? ;)

          1. 2

            It’s differently true today… back in the day, reading assembly was informative in a way it isn’t now that a conditional branch can take zero cycles if it goes one way and many dozens of cycles if it goes the other. Assembly is still the most machine-like language we have, but reading it gives a much less complete picture of what the code does.

            It’s very difficult to read assembly and see that one read is very likely to hit L1 or L2 while another is likely to go to main memory, and that when that happens the delay will stall the next 50 instructions. Or that when a conditional branch goes one way, the CPU will already have prefetched a lot and the next 25 instructions are already executing, whereas when it goes the other way, the next instruction will take longer to start than those 25 take to finish.
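
            Here is a rough sketch of that first point in C, assuming a working set (64 MB here) much larger than your caches; exact ratios vary a lot by machine. Both loops read the same N slots, but one is a chain of dependent, prefetch-defeating loads and the other streams linearly:

            ```c
            #include <stdio.h>
            #include <stdlib.h>
            #include <time.h>

            #define N (1 << 24)   /* 16M ints = 64 MB, bigger than most L3 caches */

            int main(void) {
                int *next = malloc(N * sizeof *next);
                if (!next) return 1;

                /* Link every slot into one random-order cycle (Sattolo's
                 * algorithm), so each load depends on the previous one and
                 * defeats the prefetcher. (rand() alone can be too small for
                 * shuffling this range on some platforms, hence two calls.) */
                for (int i = 0; i < N; i++)
                    next[i] = i;
                for (int i = N - 1; i > 0; i--) {
                    int j = (int)(((((long)rand()) << 16) | rand()) % i);
                    int t = next[i]; next[i] = next[j]; next[j] = t;
                }

                clock_t t0 = clock();
                int p = 0;
                for (int i = 0; i < N; i++)
                    p = next[p];                  /* dependent, cache-missing reads */
                clock_t t1 = clock();

                long s = 0;
                for (int i = 0; i < N; i++)
                    s += next[i];                 /* sequential, prefetch-friendly reads */
                clock_t t2 = clock();

                printf("chase: %ld ticks, stream: %ld ticks (p=%d, s=%ld)\n",
                       (long)(t1 - t0), (long)(t2 - t1), p, s);
                free(next);
                return 0;
            }
            ```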

            1. 1

              We might need tools that explain it to us, based on the specific CPU, from several optional perspectives: “click to expand, or type this to see more.”

        3. 8

          The writer’s ability to explain this reasonably, in very simple terms, for everyone is remarkable. It’s a sign of having a good understanding of the domain. Seems obvious now, but someone could have made it way more complicated.

          1. 9

            Yeah, Steve Klabnik is awesomely good at this. I’m pretty sure he wrote or co-wrote the original “Rust for Rubyists” as well.

            1. 5

              Yes, he did, and that turned into the official Rust book, which he co-authored with Carol Nichols.

          2. 4

            So maybe one should learn C to learn how the “C abstract machine” works, then (especially if you want portability)? A computer can execute the instructions of this “C abstract machine” well enough that 95% (just a reasonable guess) of the code people write eventually executes on top of it without real problems. For the places where that’s considered inefficient, you can always drop down to something that doesn’t use the “C abstract machine”, like Fortran or the CPU’s assembly language.

            1. 3

              For the places where that’s considered inefficient, you can always drop down to something that doesn’t use the “C abstract machine”, like Fortran or the CPU’s assembly language.

              I think this problem happens in every language, including the CPU’s assembly language. If it’s not the “C abstract machine”, it’s the “Fortran abstract machine” or the “x86_64 abstract machine”. Yes, even assembly: user-level assembly language has no idea of the cache lines and cache levels, out-of-order execution, or anything else of the sort going on in current CPUs, so it is essentially executing in an abstract machine too, one that is really close to the original 8086. I don’t think there is any way of avoiding the abstract-machine “problem”.

              1. 3

                For the places where that’s considered inefficient

                That’s not the point. The point is that the model in which you program in C is a different model from what your machine executes, and you must consider both when programming C, leading to code which may seem awkward or extraneous for no self-evident reason if you want full cache saturation. Considering this does not require dropping through an escape hatch.
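
                A minimal example of that awkwardness, assuming a matrix larger than your caches: the two loop nests below are equivalent in the C abstract machine and very much not equivalent on the hardware, which is why the “right” loop order matters even though nothing in the language says so:

                ```c
                #include <stdio.h>
                #include <time.h>

                #define N 4096

                static float m[N][N];   /* 64 MB, larger than typical L3 caches */

                int main(void) {
                    for (int i = 0; i < N; i++)
                        for (int j = 0; j < N; j++)
                            m[i][j] = (float)(i + j);

                    float f = 0.0f;
                    clock_t t0 = clock();
                    for (int i = 0; i < N; i++)        /* row-major: matches C's layout */
                        for (int j = 0; j < N; j++)
                            f += m[i][j];
                    clock_t t1 = clock();
                    for (int j = 0; j < N; j++)        /* column-major: fights the layout */
                        for (int i = 0; i < N; i++)
                            f += m[i][j];
                    clock_t t2 = clock();

                    printf("row-major: %ld ticks, column-major: %ld ticks (f=%g)\n",
                           (long)(t1 - t0), (long)(t2 - t1), f);
                    return 0;
                }
                ```

                This is also why cache blocking and tiling show up in numeric code: the tiled loops look gratuitous in the abstract machine and exist purely for the real one.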

                1. 4

                  That’s not the point. The point is that the model in which you program in C is a different model from what your machine executes, and you must consider both when programming C, leading to code which may seem awkward or extraneous for no self-evident reason if you want full cache saturation. Considering this does not require dropping through an escape hatch.

                  And the model in which you program in assembly is also a different model from what your machine actually executes. It feels like we went through a short period where programmers had a good idea of what their machine actually did, and it held in general for most machines with the same feature set, and now we’ve come full circle to optimization depending on the specific machine.

                  1. 1

                    And the model in which you program in assembly is also a different model from what your machine actually executes

                    And that’s fine most of the time. But it’s important to be aware of, of course.

                    and now we’ve come full circle to optimization depending on the specific machine

                    Definitely. Depends on the shop you ask, though, I guess.

                  2. 2

                    Also, concurrency, SIMD, side channels…

                    1. 3

                      SIMD feels like a bottomless hole sometimes. You feel you’ve gotten pretty deep into vectorization, and then you find out there’s an entirely different, alien way you can do things all over again.
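
                      For anyone who hasn’t fallen in yet, this is roughly the first step down, sketched with x86 SSE intrinsics (x86-only, and it assumes n is a multiple of 4 so tail handling doesn’t obscure the idea); the “alien all over again” layers, like shuffles, masked AVX-512, or ARM SVE’s length-agnostic vectors, come after this:

                      ```c
                      #include <immintrin.h>   /* x86 SSE/AVX intrinsics */

                      /* Plain scalar reduction; compilers can often auto-vectorize this. */
                      float sum_scalar(const float *a, int n) {
                          float s = 0.0f;
                          for (int i = 0; i < n; i++)
                              s += a[i];
                          return s;
                      }

                      /* Four lanes at a time with 128-bit SSE. Assumes n % 4 == 0;
                       * a real version needs a scalar tail loop (and note the lane-wise
                       * sums change float rounding versus the scalar order). */
                      float sum_sse(const float *a, int n) {
                          __m128 acc = _mm_setzero_ps();
                          for (int i = 0; i < n; i += 4)
                              acc = _mm_add_ps(acc, _mm_loadu_ps(a + i));
                          float lanes[4];
                          _mm_storeu_ps(lanes, acc);
                          return lanes[0] + lanes[1] + lanes[2] + lanes[3];
                      }
                      ```

                      Compilers will often do this for you, right up until they can’t, which is usually the moment the hole opens up.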

                      1. 2

                        That’s why I pushed for DSLs, or parallel languages that handle it for us with optional hints. The DOD went that route with its Exascale funding of languages such as Chapel, X10, and Fortress; that was for NUMA machines and clusters. Futhark is a newer one for GPUs. I’m sure more like that could be done for SIMD.

                        1. 2

                          Many application domains aren’t DOD though 😃

                          In video game land, Enoki is quite cool as something that goes beyond local vectorization.

                          1. 2

                            Many application domains aren’t DOD though 😃

                            Thank goodness. The video games would probably suck. Except the mil sims.

                            Enoki is quite cool

                            It is! Thanks for the tip. Might get some use out of that in the future.

                  3. 1

                    There was a great blog post from a while back that really hammers home the idea of a “C abstract machine”: “C Portability Lessons from Weird Machines” outlines the radically varied hardware that C “grew up” with.

                  4. 1

                    This is the frustrating thing to me about this article (which is a good writeup).

                    Near the beginning:

                    A little over a year ago, I wrote “Should you learn C to ‘learn how the computer works’”. It was a bit controversial. I had promised two follow-up posts. It’s taken me a year, but here’s the first one.

                    And the end:

                    Because C’s abstract machine is so thin, these kinds of details can really, really matter, as we’ve seen. And this is where the good part of the “C teaches you how the computer works” meme comes in.

                    Klabnik took a year to shoot out an essay that is super easy to misinterpret as “C isn’t how computers really work (and so we shouldn’t learn it!)”, and that has become a not-uncommon meme in the Rust Evangelism Strike Force.

                    A year later, he recants, sorta kinda, but in the meantime, what happened? How much stupid argumentation resulted from this? How many C developers had to put up with erroneous Rust fanboyism on that point alone?

                    Folks, please please please be deliberate when writing for your communities on these things.

                    1. 2

                      that has become a not-uncommon meme in the Rust Evangelism Strike Force.

                      Can you link to a specific example or two?

                      1. 1

                        A year of IRC and Twitter interactions, though I’m sure you could find comments on HN or even here if you looked.