1. 36
    1. 8

      A couple of observations. In the example, if all array elements are passed in registers, how do you expect to handle variable indexing? For example:

      fn variable_extract_i64(idx: usize, arr: [i64; 3]) -> i64 {
          arr[idx]
      }
      

      My understanding is that implementing this pretty much requires writing arr to the stack and then reloading.
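
      Concretely, here’s roughly what the callee would have to do, written in source form rather than assembly (the scalar parameters are a hypothetical stand-in for the three registers):

      // With arr’s three elements arriving in registers, a variable index
      // forces the callee to write the array into its own stack frame and
      // reload with a computed offset.
      fn variable_extract_i64_lowered(idx: usize, a0: i64, a1: i64, a2: i64) -> i64 {
          let spilled: [i64; 3] = [a0, a1, a2]; // store the three registers to the stack
          assert!(idx < 3);                     // the bounds check is needed either way
          spilled[idx]                          // one indexed load from the stack
      }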

      The Go internal calling convention makes the same point (emphasis mine):

      Function calls pass arguments and results using a combination of the stack and machine registers. […] However, any argument or result that contains a non-trivial array or does not fit entirely in the remaining available registers is passed on the stack. […]

      Non-trivial arrays are always passed on the stack because *indexing into an array typically requires a computed offset, which generally isn’t possible with registers*. Arrays in general are rare in function signatures (only 0.7% of functions in the Go 1.15 standard library and 0.2% in kubelet).

      And it gives up on arrays of length greater than 1:

      Register-assignment of a value V of underlying type T works as follows: […]

      • If T is an array type of length 0, do nothing.
      • If T is an array type of length 1, recursively register-assign its one element.
      • If T is an array type of length > 1, fail. […]

      Also, how often is generating an ad-hoc calling convention worth it, when the same compilation time could be spent on more aggressive inlining/LTO?

      1. 2

        Since it’s passed by value, it could just be spilled to the stack if/when you need it. Since it’s a fixed small size, typical iteration should be able to be unrolled and therefore not require arbitrary indexing in the generated code. Or you could imagine compiling to a small switch with a case for each valid index and a default of panicking. You’re not wrong that this is more complicated and there’s almost definitely lower-hanging fruit, but IIRC inlining tends to only happen inside a crate, with the exception of generic functions and functions explicitly marked inline; LTO, which would lift that restriction, is still pretty slow and expensive.
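
        Something like this, reusing the signature from the example upthread (just a sketch of the shape, not what any compiler actually emits):

        // Each valid index becomes its own arm, so every access uses a
        // constant offset and the elements can stay in registers.
        fn variable_extract_i64(idx: usize, arr: [i64; 3]) -> i64 {
            match idx {
                0 => arr[0],
                1 => arr[1],
                2 => arr[2],
                _ => panic!("index out of bounds"),
            }
        }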

        1. 1

          Since it’s passed by value, it could just be spilled to the stack if/when you need it. Since it’s a fixed small size, typical iteration should be able to be unrolled and therefore not require arbitrary indexing in the generated code. Or you could imagine compiling to a small switch with a case for each valid index and a default of panicking.

          The point is that spilling to the stack is what the current calling convention already does. Similarly, dense switch statements are generally compiled to a lookup table, which means that you’ll end up doing a load from memory anyway (only it will be “cold” memory from a read-only data section like .rodata instead of “hot” memory like the stack).
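
          For instance, a dense match over small constants typically lowers to a bounds check plus an indexed load from a constant table (a sketch; exact codegen varies by compiler and target):

          // A dense match like this is usually compiled to a load from a
          // table in .rodata rather than a chain of compares and branches.
          fn digit_name(d: u8) -> &'static str {
              match d {
                  0 => "zero",
                  1 => "one",
                  2 => "two",
                  3 => "three",
                  _ => "many",
              }
          }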

          1. 1

            Sure, but now the decision to spill to the stack is deferred from the caller to the callee. If the callee doesn’t need it, you don’t pay for it. If it does, you’re really not paying that much over the current ABI.
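
            To make that concrete, a sketch of the two callee shapes, assuming a hypothetical register ABI for small arrays:

            // All indices are compile-time constants: under a register ABI
            // this callee never needs to touch the stack.
            fn sum3(arr: [i64; 3]) -> i64 {
                arr[0] + arr[1] + arr[2]
            }

            // Variable index: only this callee pays for the spill, and only
            // when it is actually called.
            fn pick(idx: usize, arr: [i64; 3]) -> i64 {
                arr[idx]
            }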

      2. 4

        The document preview sidebar is very cool and also very discouraging. You have to trick me into not realizing it will take a half hour to read it all!

        1. 3

          Yes, that’s neat - I’d enjoy reading about how it’s implemented.

        2. 2

          Hi, this is really interesting to me. Thank you.

          Just some thoughts: machine sympathy would mean laying out data so that hot loops are efficient to process. So I’m thinking of K, J, and APL, languages where everything is a matrix (see the sketch at the end of this comment).

          But we want everything to be in registers.

          Is there some way of combining efficient control-flow dispatch with this tabular view of software engineering?

          I’ve recently been working on coroutines and a multithreaded runtime. I’ve also been thinking about relations - as in SQL and relational algebra - and how they could accelerate the processing of coroutines.
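
          A minimal sketch of the layout idea, in Rust for concreteness (the names are made up):

          // Struct-of-arrays: each hot loop streams over one dense column,
          // which is roughly the layout K/J/APL give you for free.
          struct Particles {
              pos: Vec<f64>,
              vel: Vec<f64>,
          }

          impl Particles {
              fn step(&mut self, dt: f64) {
                  for (p, v) in self.pos.iter_mut().zip(&self.vel) {
                      *p += *v * dt; // sequential, cache-friendly, easy to vectorize
                  }
              }
          }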

          1. 2

            The underlined chapter titles break badly on Firefox for Android; it took me a while to realise the breakage was a glitch rather than intentional. Always hurts to see websites designed solely for Chrome.

            1. 3

              I see the same thing. With a slightly different font size I can also get the same thing to happen on Firefox Desktop.

              The web page uses CSS to set text-decoration-thickness: 0.25ex and text-underline-offset: 4%. When the browser renders an underline it is supposed to leave a gap for letters with descenders so that the underline doesn’t draw over them. However, it looks like in this case something about these settings, the font metrics, and/or Firefox’s algorithm leaves it thinking that the line is “too close” to some letters, and so it doesn’t draw the line under those letters.

              Chrome seems to interpret the same CSS slightly differently and draws the underline a bit farther from the text. This means the underline doesn’t get treated as intersecting the letters, so there are no gaps in it. I don’t know if one of these behaviors is correct and the other is incorrect, or if both are permissible. If you adjust the line so it is closer and/or thicker, then Chrome also starts showing similar behavior, with the line disappearing under certain letters.

              I have a bit of a hard time blaming the web site for this. The author is using standard CSS with values that don’t seem unreasonable and it renders correctly in both Firefox and Chrome on the desktop (given the font sizes in use). How much testing can we really expect somebody to do to try to find a minor cosmetic issue like this one that somehow only seems to impact Firefox on Android?

              1. 2

                FWIW, it works fine for me with Firefox on Linux (v124.0.1). Actually, I think it looks pretty awesome - I hadn’t seen any design like it before.