1. 13

    And yet, it is still necessary to download a third of a gigabyte of Node modules to build Webpack, just to be able to use Phoenix’s LiveView. Lovely article, but it does not fully cure my JavaScript fatigue, haha

    1. 9

      I followed this very helpful post to replace webpack with snowpack, which uses esbuild (a bundler written in Go) for super fast iterations and fewer dependencies: https://www.richardtaylor.dev/articles/replacing-webpack-with-snowpack-in-a-phoenix-application

      1. 2

        This is great! Thank you!

      2. 4

        Well, you aren’t forced to use LiveView. Additionally, I think that most of the deps come from Webpack, which you can replace with a “lighter” builder if you want.

        1. 2

          I am using LiveView without webpack. I used symlinks:

          [dmpk2k@bra js]$ ls -lah
          [...]
          -rw-r--r-- 1 dmpk2k dmpk2k 3.7K Feb  2 21:51 app.esm.js
          lrwxrwxrwx 1 dmpk2k dmpk2k   50 Dec  4 22:49 phoenix.js -> ../../../../../deps/phoenix/priv/static/phoenix.js
          lrwxrwxrwx 1 dmpk2k dmpk2k   70 Dec  4 22:49 phoenix_live_view.js -> ../../../../../deps/phoenix_live_view/priv/static/phoenix_live_view.js
          [...]
          

          Inside the layout:

          <script defer type="text/javascript" src="<%= Routes.static_path(@conn, "/js/phoenix.js") %>"></script>
          <script defer type="text/javascript" src="<%= Routes.static_path(@conn, "/js/phoenix_live_view.js") %>"></script>
          <script defer type="module" src="<%= Routes.static_path(@conn, "/js/app.esm.js") %>"></script>
          

          Inside app.esm.js:

          let Hooks = {}; // client hooks get registered here; an empty object if there are none
          let csrfToken = document.querySelector("meta[name='csrf-token']").getAttribute("content");
          window.liveSocket = new phoenix_live_view.LiveSocket("/live", Phoenix.Socket, {
            params: {_csrf_token: csrfToken},
            hooks: Hooks
          });
          window.liveSocket.connect();
          

          Perhaps I’m missing something by going off the beaten path, but it works fine, and life is easier. At some point I’ll make the build gzip the two files, but otherwise there isn’t much more to be gained from a bundler.

          Webpack boils an ocean to make a cup of tea.

          1. 1

            +10! There are lighter projects similar to LiveView: https://github.com/vindarel/awesome-no-js-web-frameworks/ & https://github.com/dbohdan/liveviews some without needing JS at all.

            1. 4

              The only way I can see that working is by using WebAssembly, which for me isn’t much different from using JS.

              EDIT: I have checked - the whole of Phoenix LiveView with its dependencies (the phoenix library for socket integration and morphdom) is ~173 KB unminified and ungzipped. After minification and gzipping it will be almost negligible. Most of the “bloat” comes from Webpack (as mentioned earlier), which can be replaced without much trouble by any other build tool of your preference, even just cat to join all the files together.

          1. 46

            FWIW, this is how you express conditional arguments / struct fields in Rust. The condition has to encompass the name as well as the type, not just the type as was first attempted.

            I feel like Rust has definitely obliterated its complexity budget in unfortunate ways. Every day somebody comes to the sled discord chat with confusion over some async interaction. The fixation on async-await, despite it slowing down almost every real-world workload it is applied to, and despite it adding additional bug classes and compiler errors that simply don’t exist unless you start using it, has been particularly detrimental to the ecosystem. Sure, the async ecosystem is a “thriving subcommunity” but it’s thriving in the Kuhnian sense where a field must be sufficiently problematic to warrant collaboration. There’s no if-statement community anymore because they tend to work and be reasonably well understood. With async-await, I observed that the problematic space of the overall community shifted a bit from addressing external issues through memory safety bug class mitigation to generally coping with internal issues encountered due to some future not executing as assumed.

            The problematic space of a community is a nice lens to think about in general. It is always in-flux with the shifting userbase and their goals. What do people talk about? What are people using it for? As the problematic space shifts, at best it can introduce the community to new ideas, but there’s also always an aspect of it that causes mourning over what it once represented. Most of my friends who I’ve met through Rust have taken steps to cut interactions with “the Rust community” down to an absolute minimum due to it tending to produce a feeling of alienation over time. I think this is pretty normal.

            I’m going to keep using Rust to build my database in and other things that need to go very fast, but I see communities like Zig’s as being in some ways more aligned with the problematic spaces I enjoy geeking out with in conversations. I’m also thinking about getting a lot more involved in Erlang since I realized I haven’t felt that kind of problem space overlap in any language community since I stopped using it.

            1. 31

              I was surprised to see the Rust community jump on the async-await bandwagon, because it was clear from the beginning it’s a bandwagon. When building a stable platform (e.g. a language) you wait for the current fashion to crest and fall, so you can hear the other side of the story – the people who used it in anger, and discovered what the strengths and weaknesses really are. Rust unwisely didn’t do that.

              I will note though that the weaknesses of the async-await model were apparent right from the beginning, and yet here we are. A lesson for future languages.

              1. 28

                This hits me particularly hard because I had experienced a lot of nearly-identical pain around async when using various flavors of Scala futures for a few years before picking up Rust in 2014. I went to the very first Rust conference, Rust Camp in 2015 at Berkeley, and described a lot of the pain points that had caused significant issues in the Scala community to several of the people directly working on the core async functionality in Rust. Over the years I’ve had lots of personal conversations with many of the people involved, hoping that sharing my experiences would somehow encourage others to avoid well-known painful paths. This overall experience has caused me to learn a lot about human psychology - especially our inability to avoid problems when there are positive social feedback loops that lead to those problems. It makes me really pessimistic about climate apocalypse and rising authoritarianism leading us into war and genocides, and it has convinced me of the importance of taking months and years away from work to enjoy life for as long as it is possible to do so.

                The content of ideas does not matter very much compared to the incredibly powerful drive to exist in a tribe. Later on when I read Kuhn’s Structure of Scientific Revolutions, Feyerabend’s Against Method, and Ian Hacking’s Representing and Intervening, which are ostensibly about the social aspects of science, I was blown away by how strongly their explanations of how science often moves in strange directions that may not actually cause “progress” mapped directly to the experiences I’ve had while watching Rust grow and fail to avoid obvious traps due to the naysayers being drowned out by eager participants in the social process of Making Rust.

                1. 7

                  Reminds me of the theory that Haskell and Scala appeal because they’re a way for the programmer to nerd-snipe themselves

                  1. 5

                    Thanks for fighting the good fight. Just say “no” to complexity.

                    Which of those three books you mentioned do you think is most worthwhile?

                    1. 10

                      I think that Kuhn’s Structure of Scientific Revolutions has the broadest appeal and I think that nearly anyone who has any interaction with open source software will find a tremendous number of connections to their own work. Science’s progressions are described in a way that applies equally to social-technical communities of all kinds. Kuhn is also the most heavily cited thinker in later books on the subject, so by reading his book, you gain deeper access to much of the content of the others, as it is often assumed that you have some familiarity with Kuhn.

                      You can more or less replace any mention of “paper citation” with “software dependency” without much loss in generality while reading Kuhn. Hacking and Feyerabend are more challenging, but I would recommend them both highly. Feyerabend is a bit more radical and critical, and Hacking zooms out a bit more and talks about a variety of viewpoints, including many perspectives on Kuhn and Feyerabend. Hacking’s writing style is really worth experiencing, even by just skimming something random by him, by anyone who writes about deep subjects. I find his writing to be enviably clear, although sometimes he leans a bit into sarcasm in a way that I’ve been put off by.

                    2. 4

                      If you don’t mind, what’s an example of async/await pain that’s common among languages and not to do with how Rust uniquely works? I ask because I’ve had a good time with async/await, but in plainer, application-level languages.

                      (Ed: thanks for the thoughtful replies)

                      1. 12

                        The classic “what color is your function” blog post describes what is, I think, such a pain? You have to choose in your API whether a function can block or not, and it doesn’t compose well.
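
                        A minimal Rust-flavored sketch of the same pain (the function names are made up, and block_on comes from the futures crate): the “color” of fetch_user leaks into every caller, which must either become async itself or drag in an executor.

                        async fn fetch_user(id: u64) -> String {
                            format!("user-{}", id) // imagine an awaited network call here
                        }

                        fn render_page(id: u64) -> String {
                            // A sync fn cannot `.await`. Either render_page becomes async too
                            // (the color propagates upward), or we block on an executor:
                            let user = futures::executor::block_on(fetch_user(id));
                            format!("<h1>{}</h1>", user)
                        }

                        fn main() {
                            println!("{}", render_page(7));
                        }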

                        1. 3

                          I read that one, and I took their point. All this tends to make me wonder if Swift (roughly, Rust minus borrow checker plus Apple backing) is doing the right thing by working on async/await now.

                          But so far I don’t mind function coloring as I use it daily in TypeScript. In my experience, functions that need to be async tend to be the most major steps of work. The incoming network request is async, the API call it makes is async, and then all subsequent parsing and page rendering aren’t async, but can be if I like.

                          Maybe, like another commenter said, whether async/await is a net positive has more to do with adapting the language to a domain that isn’t otherwise its strong suit.

                          1. 16

                            You might be interested in knowing that Zig has async/await but there is no function coloring problem.

                            https://kristoff.it/blog/zig-colorblind-async-await/

                            1. 3

                              Indeed this is an interesting difference at least in presentation. Usually, async/await provides sugar for an existing concurrency type like Promise or Task. It doesn’t provide the concurrency in the first place. Function colors are then a tradeoff for hiding the type, letting you think about the task and read it just like plain synchronous code. You retain the option to call without await, such that colors are not totally restrictive, and sometimes you want to use the type by hand; think Promise.all([…]).

                              Zig seems like it might provide all these same benefits by another method, but it’s hard to tell without trying it. I also can’t tell yet if the async frame type is sugared in by the call, or by the function definition. It seems like it’s a sort of generic, where the nature of the call will specialize it all the way down. If so, neat!
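
                              For what it’s worth, the same option survives in Rust (a small sketch assuming the futures crate): calling an async fn without .await hands you the plain future value, and join_all plays the role of Promise.all([…]).

                              use futures::executor::block_on;
                              use futures::future::join_all;

                              async fn fetch(id: u32) -> u32 {
                                  id * 2 // stand-in for real async work
                              }

                              fn main() {
                                  block_on(async {
                                      // Calling without .await just yields future values...
                                      let pending = vec![fetch(1), fetch(2), fetch(3)];
                                      // ...which can be combined by hand, Promise.all-style.
                                      let results = join_all(pending).await;
                                      assert_eq!(results, vec![2, 4, 6]);
                                  });
                              }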

                              1. 7

                                It seems like it’s a sort of generic, where the nature of the call will specialize it all the way down. If so, neat!

                                That’s precisely it!

                                1. 2

                                  I’ve been poking at Zig a bit since this thread; thank you for stirring my interest. :)

                        2. 6

                          Well, I think that async/await was a great thing for JavaScript, and generally it seems to work well in languages that have poor threading support. But Rust has great threading support, and Rust’s future-based strategy aimed from the beginning at copying Scala’s approach. A few loud research-oriented voices in the Rust community said “we think Scala’s approach looks great”, and that drowned out the chorus of non-academic Scala users who had spent years dealing with frustrating compilation issues, incompatibilities between different future implementations, and lots of tribalism overall - all of which ended up repeating in a similar way on the Rust async/await timeline.

                          1. 5

                            I am somewhat surprised that you say Rust’s futures are modeled after Scala’s - I assume you mean the ones that ended up in the standard library. As for commonalities: Scala’s futures also offer combinators on top of a common futures trait, and you need explicit support in libraries; that’s pretty much all that is similar to Rust’s.

                            In Scala, futures were annoying because of exceptions and meaningless stack traces. In Rust, you get the right stack traces and error propagation.

                            In Rust, futures sucked for me due to error conversions, and because borrowing was basically unsupported until async/await. Now they are still annoying because of the ecosystem split (sync vs various partially compatible async).

                            The mentioned problem of competing libraries is basically unpreventable in fields without wide consensus and would have happened with ANY future alternative. If you get humans to agree on sensible solutions and not fight about irrelevant details, you are a wizard.

                            Where I agree is that it was super risky to spend language complexity budget on async/await, even though solving the underlying generator/state-machine problem felt like a good priority. While async/await feels a bit too special-cased and hacky to be part of the language… it could be worse. And if we find a better solution for async in Rust, we won’t have to teach the current way forever.

                            Other solutions would just have different pros and cons. E.g. Go’s or Zig’s approach bakes the solution even deeper into the language, with the pro of setting a somewhat universal standard for the language.

                            1. 3

                              It was emulating Finagle from the beginning: https://medium.com/@carllerche/announcing-tokio-df6bb4ddb34 but then the decision to push so much additional complexity into the language itself so that people could have an easier time writing strictly worse systems was just baffling.

                              Having worked in Finagle for a few years before that, I tried to encourage some of the folks to aim for something lighter weight. The subset of Finagle users who felt happy about its high complexity seemed to be the ones who went to work at Twitter, where the complexity was justified; most of the rest of the community seemed pretty relieved to switch to Akka, which didn’t cause so much type noise, once it was available.

                              I don’t expect humans not to fragment now, but over time I’ve learned that it’s a much more irrational process than I had maybe believed in 2014. Mostly I’ve been disappointed about being unable to communicate with what was a tiny community about something I felt I had a lot of experience with and could help other people avoid pain around, only to watch it bloom into a huge crappy thing that now comes back into my life every day, even when I try to ignore it by just using a different feature set in my own stuff.

                          2. 3

                            I hope you will get a reply from someone with more Rust experience than me, but I imagine the primary problem is that even though you don’t have to manually free memory in Rust, you still have to think about where the memory comes from. That tends to make lifetime management more complicated, occasionally requiring you to forcibly move things onto the heap (Box) and to use identity semantics (Pin). All of this adds up to a lot of extra complexity to bake into the application the additional dynamism that async/await enables, while still maintaining the safety assurances of the borrow checker.

                            Normally, in higher level languages you don’t ever get to decide where the memory comes from, so this is a design dimension that you never get to explore.
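
                            A small standard-library-only sketch of where that surfaces (the functions are illustrative): returning one of two different futures forces an explicit move onto the heap plus a pinning promise.

                            use std::future::Future;
                            use std::pin::Pin;

                            async fn fast() -> u32 { 1 }
                            async fn slow() -> u32 { 2 }

                            // Each async fn has its own anonymous type, so to return a future of
                            // unknown concrete type you move it onto the heap (Box) and promise it
                            // won't move again (Pin), keeping self-references inside it valid.
                            fn pick(flag: bool) -> Pin<Box<dyn Future<Output = u32>>> {
                                if flag { Box::pin(fast()) } else { Box::pin(slow()) }
                            }

                            fn main() {
                                let _future = pick(true); // pinned, boxed, and not yet polled
                            }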

                          3. 2

                            I’m curious if you stuck around in Scala or pay attention to what’s going on now, because I think it has one of the best stories when it comes to managing concurrency. ZIO, Cats Effect, Monix, fs2, and Akka all have different goals and trade-offs, but the old problems of Future are easily avoided.

                          4. 6

                            I was surprised to see the Rust community jump on the async-await bandwagon, because it was clear from the beginning it’s a bandwagon.

                            I’m not surprised. I don’t know how async/await works exactly, but it definitely has a clear use case. I once implemented a 3-way handshake in C. There was some crypto underneath, but the idea, from the server’s point of view, was: receive a first message, respond, then wait for the reply. Once the reply comes and is validated, the handshake is complete. (The crypto details were handled by a library.)

                            Even that simple handshake was a pain in the butt to handle in C. Every time the server gets a new message, it needs to either spawn a new state machine, or dispatch it to an existing one. Then the state machine can do something, suspend, and wait for a new message. Note that it can go on after the handshake, as part of normal network communication.

                            That state machine business is cumbersome and error prone, and I don’t want to deal with it. The programming model I want is blocking I/O with threads. The efficiency model I want is async I/O. So having a language construct that easily lets me suspend & resume execution at will is very enticing, and I would jump to anything that gives me that - at least until I know better, which I currently don’t.

                            I’d even go further: given the performance of our machines (high latencies and high throughputs), I believe non-blocking I/O at every level is the only reasonable way forward. Not just for networking, but for disk I/O, filling graphics card buffers, everything. Language support for this is becoming as critical as generics themselves. We laughed “lol no generics” at Go, but now I do believe it is time to start laughing “lol no async I/O” as well. The problem now is to figure out how to do it. Current solutions don’t seem to be perfect (though there may be one I’m not aware of).
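
                            To make the handshake point concrete, here is a hedged Rust sketch with a toy connection type (every name is hypothetical; block_on comes from the futures crate). Each .await is exactly a spot where the C version had to suspend its hand-rolled state machine and wait for the next packet.

                            // Toy stand-in for a socket: canned messages instead of real I/O.
                            struct Conn { inbox: Vec<&'static str> }

                            impl Conn {
                                async fn recv(&mut self) -> &'static str { self.inbox.remove(0) }
                                async fn send(&mut self, msg: &str) { println!("-> {}", msg) }
                            }

                            // The whole 3-way handshake reads as straight-line code.
                            async fn handshake(conn: &mut Conn) -> Result<(), &'static str> {
                                let hello = conn.recv().await;            // suspend point #1
                                if hello != "HELLO" { return Err("bad first message"); }
                                conn.send("CHALLENGE").await;
                                let reply = conn.recv().await;            // suspend point #2
                                if reply != "RESPONSE" { return Err("bad reply"); }
                                Ok(())
                            }

                            fn main() {
                                let mut conn = Conn { inbox: vec!["HELLO", "RESPONSE"] };
                                futures::executor::block_on(handshake(&mut conn)).unwrap();
                            }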

                            1. 2

                              The whole thing with async I/O is that process creation was too slow, then thread creation was too slow, and now some might even consider coroutine creation too slow [1]. It appears that concerns that were formerly the kernel’s (managing I/O among tasks; scheduling tasks) are now being pushed out to userland. Is this the direction we really want to go?

                              [1] I use coroutines in Lua to manage async I/O and I think it’s fine. It makes the code read like plain blocking code, even though underneath it’s not.

                              1. 2

                                I don’t think it’s unreasonable to think that the kernel should have as few concerns as possible. It’s a singleton, it doesn’t run with the benefit of memory protection, and its internal APIs aren’t as stable as the ones it provides to userland.

                                … and, yes, I think a lot of async/await work is LARPing. But that’s because a lot of benchmark-oriented development is LARPing, and probably isn’t special to async/await specifically.

                                1. 1

                                  I’m not sure what you’re getting at. I want async I/O to avoid process creation and thread creation and context switches, and even scheduling to some extent. What I want is one thread per core, and short tasks being sent to them. No process creation, no thread creation, no context switching. Just jump the instruction pointer to the relevant task, and return to the caller when it’s done.

                                  And when the task needs to, say, read from the disk, then it should do so asynchronously: suspend execution, return to the caller, wait for the response to come back, and when it does resume execution. It can be done explicitly with message passing, but that’s excruciating. A programming model where I can kinda pretend the call is blocking (but in fact we’re yielding to the caller) is much nicer, and I believe very fast.

                              2. 5

                                Agreed. I always told people that async/await will be just as popular as Java’s synchronized a few years down the road. Some were surprised, some were offended, but sometimes reality is uncomfortable.

                              3. 29

                                Thank you for sharing. Zig has been much more conservative than Rust in terms of complexity, but we too have splurged a good chunk of the budget on async/await. Based on my experience producing introductory materials for Zig, async/await is by far the hardest thing to explain and probably is going to be my biggest challenge to tackle for 2021. (That said, it’s continuations; these things are confusing by nature.)

                                On the upside

                                despite it slowing down almost every real-world workload it is applied to

                                This is hopefully not going to be a problem in our case. Part of the complexity of async/await in Zig is that a single library implementation can be used in both blocking and evented mode, so in the end it should never be the case that you can only find an async version of a client library. That assumes authors are willing to do the work, but even if not, support can be added incrementally by contributors interested in having their use case supported.

                                1. 17

                                  I feel like Rust has definitely obliterated its complexity budget in unfortunate ways.

                                  I remember the time I asked one of the more well-known Rust proponents “so you think adding features improves a language” and he said “yes”. So it was pretty clear to me early on that Rust would join the feature death march of C++, C#, …

                                  Rust has many language features and they’re all largely disjoint from each other, so knowing some doesn’t help me guess the others.

                                  That’s so painfully true.

                                  For instance, it has different syntax for struct creation and function calls; the poor syntax choices also mean that structs/functions won’t get default values any time soon.

                                  ; is mandatory (what is this, 1980?), but you can leave out , at the end.

                                  The severe design mistake of using <> for generics also means you have to learn 4 different syntax variations, and when to use them.

                                  The whole module stuff is way too complex and only makes sense if you programmed in C before. I have basically given up on getting to know the intricacies, and just let IntelliJ handle uses.

                                  Super weird that both if and switch exist.

                                  Most of my friends who I’ve met through Rust have taken steps to cut interactions with “the Rust community” down to an absolute minimum due to it tending to produce a feeling of alienation over time.

                                  Yes, that’s my experience too. I have some (rather popular) projects on GitHub that I archive from time to time to not having to deal with Rust people. There are some incredibly toxic ones, which seem to be – for whatever reason – close to some “core” Rust people, so they can do whatever the hell they like.

                                  1. 6

                                    For instance, it has different syntax for struct creation and function calls

                                    Perhaps they are trying to avoid the C++ thing where you can’t tell whether foo(bar) is struct creation or a function call without knowing what foo is?
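
                                    Right - in Rust the two forms are visually distinct at the call site, so no type knowledge is needed to read them (tiny sketch, made-up names):

                                    struct Point { x: i32, y: i32 }

                                    fn point(x: i32, y: i32) -> Point {
                                        Point { x, y } // struct literal: braces and named fields
                                    }

                                    fn main() {
                                        let a = Point { x: 1, y: 2 }; // unmistakably struct creation
                                        let b = point(3, 4);          // unmistakably a function call
                                        println!("{} {}", a.x + b.x, a.y + b.y);
                                    }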

                                    The whole module stuff is way too complex and only makes sense if you programmed in C before. I have basically given up on getting to know the intricacies, and just let IntelliJ handle uses.

                                    It only makes sense to someone who has programmed in C++. C’s “module” system is far simpler and easier to grok.

                                    Super weird that both if and switch exist.

                                    Would you have preferred

                                    match condition() {
                                        true => {
                                            // ...
                                        },
                                        false => {
                                            // ...
                                        },
                                    }
                                    

                                    I think that syntax is clunky when you start needing else if.

                                    1. 1

                                      Perhaps they are trying to avoid the C++ thing where you can’t tell whether foo(bar) is struct creation or a function call without knowing what foo is?

                                      Why wouldn’t you be able to tell?

                                      Even if that was the issue (it isn’t), that’s not the problem C++ has – it’s that foo also could be 3 dozen other things.

                                      Would you have preferred […]

                                      No, I prefer having one unified construct that can deal with both usecases reasonably well.

                                      1. 2

                                        Why wouldn’t you be able to tell?

                                        struct M { };
                                        void L(M m);
                                        
                                        void f() {
                                            M(m); // e.g. M m;
                                            L(m); // function call
                                        }
                                        

                                        The only way to tell what is going on is if you already know the types of all the symbols.

                                        No, I prefer having one unified construct that can deal with both usecases reasonably well.

                                        Ok, do you have an example from another language which you think handles this reasonably well?

                                        1. 2

                                          The only way to tell what is going on is if you already know the types of all the symbols.

                                          Let the IDE color things accordingly. Solved problem.

                                          Ok, do you have an example from another language which you think handles this reasonably well?

                                          I’m currently in the process of implementing it, but I think this is a good intro to my plans.

                                          1. 1

                                            Let the IDE color things accordingly. Solved problem.

                                            The problem of course is for the writer of the IDE :)

                                            Constructs like these in C++ make the code harder to parse not only for humans but for compilers as well. This turns into real-world compile-time costs that other languages avoid.

                                            I’m currently in the process of implementing it, but I think this is a good intro to my plans.

                                            That’s interesting, but I think there’s a conflict with Rust’s goal of being a systems-level programming language. Part of that is having primitives which map reasonably well onto things that the compiler can translate into machine code. Part of the reason that languages like C have both if and switch is because switch statements of the correct form may be translated into an indirect jump instead of repeated branches. Of course, a Sufficiently Smart Compiler could optimize this even in the if case, but it is very easy to write code which is not optimizable in such a way. I think there is value to both humans and computers in having separate constructs for arbitrary conditionals and for equality. It helps separate intent and provides some good optimization hints.

                                            Another reason why this exists is for exhaustiveness checks. Languages with switch can check that you handle all cases of an enum.
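
                                            A minimal sketch of that (hypothetical enum): deleting any arm below is a compile error (“non-exhaustive patterns”), a guarantee an if/else chain can’t give you.

                                            enum State { Idle, Running, Done }

                                            fn label(s: State) -> &'static str {
                                                match s {
                                                    State::Idle => "idle",
                                                    State::Running => "running",
                                                    State::Done => "done", // remove an arm and this fails to compile
                                                }
                                            }

                                            fn main() {
                                                println!("{}", label(State::Running));
                                            }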

                                            The other half of this is that Rust is the bastard child of ML and C++. ML and C++ both have match/switch, so Rust has one too.


                                            I think you will have a lot of trouble producing good error messages with such a syntax. For example, say someone forgets an = or even both ==s. If your language does false-y and truth-y coercion, then there may be no error at all here. And to the parser, it is not clear at all where the error is. Further, this sort of extension cannot be generalized to one-liners. That is, you cannot unambiguously parse if a == b then c == d then e without line-breaks.

                                            On the subject, in terms of prior-art, verilog allows expressions in its case labels. This allows for some similar syntax constructions (though more limited since functions are not as useful as in regular programming languages).

                                    2. 3

                                      For instance, it has different syntax for struct creation and function calls; the poor syntax choices also mean that structs/functions won’t get default values any time soon.

                                      This is a good thing. Creating a struct is a meaningfully different operation from calling a function, and there’s no problem with having there be separate syntax for these two separate things.

                                      The Rust standard library provides a Default trait, with examples of how to use it and customize it. I don’t find it at all difficult to work with structs with default values in Rust.
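
                                      A minimal sketch of that (struct and fields made up): derive Default, then override only the fields you care about with struct-update syntax.

                                      #[derive(Debug, Default)]
                                      struct Config {
                                          verbose: bool, // defaults to false
                                          retries: u32,  // defaults to 0
                                          name: String,  // defaults to ""
                                      }

                                      fn main() {
                                          // Take defaults for everything except retries.
                                          let cfg = Config { retries: 3, ..Default::default() };
                                          println!("{:?}", cfg);
                                      }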

                                      The whole module stuff is way too complex and only makes sense if you programmed in C before. I have basically given up on getting to know the intricacies, and just let IntelliJ handle uses.

                                      I don’t understand this comment at all. Rust’s module system seems fairly similar to module systems in some other languages I’ve used, although I’m having trouble thinking of other languages that allow you to create a module hierarchy within a single file, like you can do in Rust with the mod keyword (C++ allows nested namespaces I think, but that’s it). I don’t see how knowing C has anything to do with understanding Rust modules better. C has no module system at all.
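
                                      For reference, an in-file hierarchy looks like this (tiny sketch with made-up names), and it is addressed with the same paths as modules split across files:

                                      mod net {
                                          pub mod http {
                                              pub fn get() -> &'static str { "GET" }
                                          }
                                      }

                                      fn main() {
                                          println!("{}", net::http::get()); // prints "GET"
                                      }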

                                      1. 2

                                        I’m having trouble thinking of other languages that allow you to create a module hierarchy within a single file

                                        Lua can do this, although it’s not common.

                                        1. 1

                                          This is a good thing.

                                          I guess that’s why many Rust devs - immediately after writing a struct - also define a fn to wrap their struct creation? :-)

                                          Creating a struct is a meaningfully different operation from calling a function

                                          It really isn’t.

                                          The Rust standard library provides a Default trait, with examples of how to use it and customize it. I don’t find it at all difficult to work with structs with default values in Rust.

                                          That’s clearly not what I alluded to.

                                          I don’t see how knowing C has anything to do with understanding Rust modules better. C has no module system at all.

                                          Rust’s module system only makes sense if you keep in mind that its main goal is to produce one big ball of compiled code in the end. In that sense, Rust’s module system is a roundabout way to describe which parts of the code end up being part of that big ball.

                                          1. 3

                                            Putting an OCaml hat on:

                                            • Struct creation and function calls are quite different. In particular it’s good to have structure syntax that can be mirrored in pattern matching, whereas function call has no equivalent in match.
                                            • Multiple modules in one file is also possible in ML/OCaml. Maybe in some Wirth language, though I’m not sure on that one.

                                            its main goal is to produce one big ball of compiled code in the end.

                                            What other goal would there be? That’s what 100% of compiled languages aim at… Comparing Rust to C, which has no notion of modules, is just weird.

                                            1. 1

                                              Struct creation and function calls are quite different. In particular it’s good to have structure syntax that can be mirrored in pattern matching, whereas function call has no equivalent in match.

                                              In what sense would this be an obstacle? I would expect that a modern language lets you match on anything that provides the required method / has the right signature. “This is a struct, so you can match on it” feels rather antiquated.

                                              What other goal would there be? That’s what 100% of compiled languages aim at… Comparing Rust to C, which has no notion of modules, is just weird.

                                              It feels like it was built by someone who never used anything but C in his life, and then went “wouldn’t it be nice if it was clearer than in C which parts of the code contribute to the result?”.

                                              The whole aliasing, reexporting etc. functionality feels like it exists as a replacement for some convenience C macros, and not something one actually would want. I prefer that there is a direct relationship between placing a file somewhere and it ending up in a specific place, without having to wire up everything again with the module system.

                                              1. 1

                                                There is documented inspiration from OCaml by Rust’s original creator. The first compiler was even written in OCaml, and a lot of names stuck (like Some/None rather than the Haskell Just/Nothing). It also has obvious C++ influences, notably :: for namespacing and <> for generics. The module system most closely reminds me of a mix of OCaml and… Python, with special file names (mod.rs, like __init__.py or something like that?), even though it’s much, much simpler than OCaml’s. Again, not just “auto-wiring” files in is a net benefit (another lesson from OCaml, I’d guess, where the build system has to clarify what’s in or out of a specific library). It makes builds more declarative.

                                                As for the matching: Rust doesn’t have active patterns or Scala-style deconstruction. In this context (matching against values) you can pre-compile pattern matching very efficiently into decision trees with constant-time access to fields by offset. This would be harder to do efficiently with “just call this deconstruct method”. This is more speculation on my side, but it squares with Rust’s efficiency concerns.

                                                1. 1

                                                  I see your point, but in that case Rust would need to disallow match guards too (because what else are guards but less reusable unapply methods?).

                                              2. 1

                                                Comparing Rust to C, which has no notion of modules, is just weird.

                                                Well there are translation units :) (though you can only import using the linker)

                                            2. 1

                                              I’m having trouble thinking of other languages that allow you to create a module hierarchy within a single file,

                                              Perl can do this.

                                              1. 3

                                                Elixir also allows you to create a module hierarchy within a single file.

                                                1. 2

                                                  And Julia. Maybe this isn’t so rare.

                                            3. 1

                                              ; is mandatory (what is this, 1980?), but you can leave out , at the end.

                                              Ugh this one gets me every time. Why Rust, why.

                                              1. 2

                                                Same in Zig? Curious to know Zig rationale for this.

                                                1. 10

                                                  In almost all languages with mandatory semicolons, they exist to prevent multi-line syntax ambiguities. The designers of Go and Lua both went to great pains to avoid such problems in their language grammars. Unlike, for example, JavaScript. This article about semicolon insertion rules causing ambiguity and unexpected results should help illustrate some of these problems.

                                                  1. 3

                                                    Pointing out Javascript isn’t a valid excuse.

                                                    Javascript’s problems are solely Javascript’s. If we discarded every concept that was implemented poorly in Javascript, we wouldn’t have many concepts left to program with.

                                                    I want semicolon inference done right, simple as that.

                                                    1. 4

                                                      That’s not what I’m saying. JavaScript is merely an easy example of some syntax problems that can occur. I merely assume that Rust, which has many more features than Go or Lua, decided not to maintain an unambiguous grammar without using semicolons.

                                                      1. 2

                                                        Why would the grammar be ambiguous? Are you sure that you don’t keep arguing from a JavaScript POV?

                                                        Not needing ; doesn’t mean the grammar is ambiguous.

                                                        1. 4

                                                          ~_~

                                                          Semicolons are an easy way to eliminate grammar ambiguity for multi-line syntax. For any language. C++ for example would have numerous similar problems without semicolons.

                                                          Not needing ; doesn’t mean the grammar is ambiguous.

                                                          Of course. Go and Lua are examples of languages designed specifically to avoid ambiguity without semicolons. JavaScript, C++, and Rust were not designed that way. JavaScript happens to be an easy way to illustrate possible problems because it has janky automatic semicolon insertion, whereas C++ and Rust do not.

                                                          1. 0

                                                            I’m completely unsure what you are trying to argue – it doesn’t make much sense. Has your triple negation above perhaps confused you a bit?

                                                            The main point is that a language created after 2000 simply shouldn’t need ;.

                                                            1. 5

                                                              ; is mandatory (what is this, 1980?), but you can leave out , at the end.

                                                              Same in Zig? Curious to know Zig rationale for this.

                                                              The rationale for semicolons. They make parsing simpler, particularly for multi-line syntax constructs. I have been extremely clear about this the entire time. I have rephrased my thesis multiple times:

                                                              In almost all languages with mandatory semicolons, they exist to prevent multi-line syntax ambiguities.

                                                              Semicolons are an easy way to eliminate grammar ambiguity for multi-line syntax.

                                                              Many underestimate the difficulty of creating a language without semicolons. Go has done so with substantial effort, and maintaining that property has by no means been effortless for them when adding new syntax to the language.

                                                              1. 0

                                                                Yeah, you know, maybe we should stop building languages that are so complex that they need explicitly inserted tokens to mark “previous thing ends here”? That’s the point I’m making.

                                                                when adding new syntax to the language

                                                                Cry me a river. Adding features does not improve a language.

                                                                1. 1

                                                                  Having a clear syntax where errors don’t occur 15 lines below the missing ) or } (as would unavoidably happen without some separator - trust me, it’s one of OCaml’s big syntax problems for toplevel statements) is a net plus, not bloat.

                                                                  What language has no semicolon (or another separator, or parenthesis, like lisp) and still has a simple syntax? Even python has ; for same-line statements. Using vertical whitespace as a heuristic for automatic insertion isn’t a win in my book.

                                                                  1. 2

                                                                    Both Kotlin and Swift have managed to make a working, unambiguous C-like syntax without semicolons.

                                                                    1. 2

                                                                      I didn’t know. That involves no whitespace/lexer trick at all? I mean, if you flatten a whole file into one line, does it still work? Is it still in LALR(1)/LR(1)/some nice fragment?

                                                                      The typical problem in this kind of grammar is that, while binding constructs are easy to delimit (var/val/let…), pure sequencing is not. If you have a = 1 b = 2 + 3 c = 4 d = f(a) semicolons make things just simpler for the parser.

                                                                      1. 1

                                                                        Why are line breaks not allowed to be significant? I don’t think I care if I can write an arbitrarily long program on one line…

                                                                    2. 0

                                                                      Using vertical whitespace as a heuristic for automatic insertion isn’t a win in my book.

                                                                      I agree completely. I love Lua in particular. You can have zero newlines yet it requires no semicolons, due to its extreme simplicity. Lua has only one ambiguous case: when a line begins with a ( and the previous line ends with a value.

                                                                      a = b
                                                                      (f or g)() -- call f, or g when f is nil
                                                                      

                                                                      Since Lua has no semantic newlines, this is exactly equivalent to:

                                                                      a = b(f or g)()
                                                                      

                                                                      The Lua manual thus recommends inserting a ; before any line starting with (.

                                                                      a = b
                                                                      ;(f or g)()
                                                                      

                                                                      But I have never needed to do this. And if I did, I would probably write this instead:

                                                                      a = b
                                                                      local helpful_explanatory_name = f or g
                                                                      helpful_explanatory_name()
                                                                      
                                                    2. 3

                                                      Also curious why Zig uses parentheses in ifs etc. I know what I’ll say is lame, but those two things frustrate me when looking at Zig code. If I could learn the rationale, it might at least make those a bit easier for me to accept and get over.

                                                      1. 3

                                                        One reason for this choice is to remove the need for a ternary operator without greatly harming ergonomics. Having the parentheses means that the blocks may be made optional which allows for example:

                                                        const foo = if (bar) a else b;
                                                        
                                                        1. 9

                                                          There’s a blog post by Graydon Hoare that I can’t find at the moment, where he enumerates features of Rust he thinks are clear improvements over C/C++ that have nothing to do with the borrow checker. Forcing if statements to always use braces is one of the items on his list, and I completely agree with it. It’s annoying that in C/C++, if you want to add an additional line to the block of a brace-less if statement, you have to remember to go back and add the braces; there have been major security vulnerabilities caused by people forgetting to do this (Apple’s “goto fail” SSL bug is the classic example).

                                                          1. 6
                                                          2. 6

                                                            The following would work just as well:

                                                            const foo = if bar { a } else { b };
                                                            

                                                            I’ve written an expression oriented language, where the parenthesis were optional, and the braces mandatory. I could use the exact same syntactic construct in regular code and in the ternary operator situation.

                                                            Another solution is inserting another keyword between the condition and the first branch, as many ML languages do:

                                                            const foo = if bar then a else b;
                                                            
                                                            1. 2

                                                              I don’t get how that’s worth making everything else ugly. I imagine there’s some larger reason. The parens on ifs really do feel terrible after using Go and Rust for so long.

                                                              1. 1

                                                                For what values of a, b, c would this be ambiguous?

                                                                const x = if a b else c
                                                                

                                                                I guess it looks a little ugly?

                                                                1. 5

                                                                  If b is actually a parenthesised expression like (2+2), then the whole thing looks like a function call:

                                                                  const x = if a (2+2) else c
                                                                  

                                                                  Parsing is no longer enough, you need to notice that a is not a function. Lua has a similar problem with optional semicolon, and chose to interpret such situations as function calls. (Basically, a Lua instruction stops as soon as not doing so would cause a parse error).

                                                                  Your syntax would make sense in a world of optional semicolons, with a parser (and programmers) ready to handle this ambiguity. With mandatory semicolons however, I would tend to have mandatory curly braces as well:

                                                                  const x = if a { b } else { c };
                                                                  
                                                                  1. 4

                                                                    Ah, Julia gets around this by banning whitespace between the function name and the opening parenthesis, but I know some people would miss that extra spacing.

                                                                  2. 3
                                                                    abs() { x = if a < 0 - a else a }
                                                                    
                                                                    1. 1

                                                                      Thanks for the example!

                                                                      I think this is another case where banning bad whitespace makes this unambiguous.

                                                                      a - b => binary
                                                                      -a => unary
                                                                      a-b => binary
                                                                      a -b => error
                                                                      a- b => error
                                                                      - a => error
                                                                      

                                                                      You can summarise these rules as “infix operators must have balanced whitespace” and “unary operators must not be followed by whitespace”.

                                                                      Following these rules, your expression is unambiguously a syntax error, but if you remove the whitespace between - and a it works.

                                                                      1. 1

                                                                        Or you simply ban unary operators.

                                                                        1. 1

                                                                          Sure, seems a bit drastic, tho. I like unary logical not, and negation is useful sometimes too.

                                                                          1. 1

                                                                            Not sure how some cryptic operator without working jump-to-declaration is better than some bog-standard method …

                                                                            1. 1

                                                                              A minus sign before a number to indicate a negative number is probably recognizable as a negative number to most people in my country. I imagine most would recognise -x as “negative x”, too. Generalising that to other identifiers is not difficult.

                                                                              An exclamation mark for boolean negation is less well known, but it’s not very difficult to learn. I don’t see why jump-to should fail if you’re using a language server, either.

                                                                              More generally, people have been using specialist notations for centuries. Some mathematicians get a lot of credit for simply inventing a new way to write an older concept. Maybe we’d be better off with only named function calls, maybe our existing notations are made obsolete by auto-complete, but I am not convinced.

                                                        2. 9

                                                          My current feeling is that async/await is the worst way to express concurrency … except for all the other ways.

                                                          I have only minor experience with it (in Nim), but a good amount of experience with concurrency. Doing it with explicit threads sends you into a world of pain with mutexes everywhere and deadlocks and race conditions aplenty. For my current C++ project I built an Actor library atop thread pools (or dispatch queues), which works pretty well except that all calls to other actors are one-way so you now need callbacks, which become painful. I’m looking forward to C++ coroutines.

                                                          1. 3

                                                            except for all the other ways

                                                            I think people are complaining about the current trend to just always use async for everything, which ends up sounding like a complaint about Rust having async at all.

                                                          2. 8

                                                            This is amazing. I had similar feelings (looking previously at JS/Scala futures) when the plans for async/await were floating around, but decided to suspend my disbelief because of how good previous design decisions in the language were. Do you think there’s some other approach to concurrency fit for a runtime-less language that would have worked better?

                                                            1. 17

                                                              My belief is generally that threads as they exist today (not as they existed in 2001 when the C10K problem was written, though that document persists as zombie perf canon no longer describing living systems) are the nicest choice for the vast majority of use cases, and that Rust-style executor-backed tasks are inappropriate even in the rare cases where M:N pays off in languages like Go or Erlang (pretty much just a small subset of latency-bound load balancers that don't perform very much CPU work per socket). When you start caring about millions of concurrent tasks, all of the sources of accidental implicit state and interaction between async tasks become a massive liability.

                                                              I think the Ada Ravenscar profile (see chapter 2 for "motivation", which starts at PDF page 7 / marked page 3) and its successful application to safety-critical hard real-time systems is worth looking at for inspiration. It can be broken down into this set of specific features if you want to dig deeper. Ada has a runtime, but I'm kind of ignoring that part of your question since it is suitable for hard real-time. In some ways it reminds me of an attempt to get the program to look like a pretty simple Petri net.

                                                              I think that message passing and STM are not utilized enough, and when used judiciously they can reduce a lot of risk in concurrent systems. STM can additionally be made wait-free and thus suitable for use in some hard real-time systems.

                                                              I think that Send and Sync are amazing primitives, and I only wish I could prove more properties at compile time. The research on session types is cool to look at, and the papers coming out around it offer a lot of inspiration for encoding various interactions safely in the type system. But it can get cumbersome, and can thus create more risk for the overall engineering effort than it removes if you're not careful.
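
                                                              As a toy illustration of what Send and Sync buy you at compile time (a minimal sketch using the standard Arc/Rc example):

                                                                  use std::rc::Rc;
                                                                  use std::sync::Arc;
                                                                  use std::thread;
                                                                  
                                                                  fn main() {
                                                                      // Arc<i32> is Send + Sync, so the compiler lets it cross threads.
                                                                      let shared = Arc::new(42);
                                                                      thread::spawn(move || println!("{shared}")).join().unwrap();
                                                                  
                                                                      // Rc<i32> is neither Send nor Sync; uncommenting the spawn below
                                                                      // is a compile error: "`Rc<i32>` cannot be sent between threads safely".
                                                                      let not_shared = Rc::new(42);
                                                                      // thread::spawn(move || println!("{not_shared}"));
                                                                      println!("{not_shared}");
                                                                  }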

                                                              A lot of the hard parts of concurrency become easier when we're able to establish a maximum bound on how concurrent we're going to be. Threads have a bit more of a forcing function for keeping this complexity minimized, because spawning is fallible under the often under-configured system thread limits. Having fixed concurrency avoids many sources of bugs and performance issues, and enables a lot of relatively unexplored wait-free algorithmic design space with bounded worst-case performance (while still usually being able to attempt a lock-free fast path and only fall back to wait-free when contention picks up). Structured concurrency often leans into this for more determinism, and I think this is an area with a lot of great techniques for containing risk.
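
                                                              A minimal sketch of that fixed-concurrency shape, using only the standard library (the numbers are illustrative): a bounded queue provides backpressure, and a fixed set of workers puts a hard bound on parallelism.

                                                                  use std::sync::{mpsc, Arc, Mutex};
                                                                  use std::thread;
                                                                  
                                                                  fn main() {
                                                                      const WORKERS: usize = 4;                    // hard upper bound on parallelism
                                                                      let (tx, rx) = mpsc::sync_channel::<u64>(8); // bounded queue = backpressure
                                                                      let rx = Arc::new(Mutex::new(rx));           // Receiver isn't Sync, so share it via a Mutex
                                                                  
                                                                      let handles: Vec<_> = (0..WORKERS).map(|_| {
                                                                          let rx = Arc::clone(&rx);
                                                                          thread::spawn(move || loop {
                                                                              // hold the lock only long enough to pull one job off the queue
                                                                              let msg = rx.lock().unwrap().recv();
                                                                              match msg {
                                                                                  Ok(job) => { let _ = job * job; } // stand-in for real work
                                                                                  Err(_) => break,                  // channel closed: shut down
                                                                              }
                                                                          })
                                                                      }).collect();
                                                                  
                                                                      for job in 0..1_000 {
                                                                          tx.send(job).unwrap();                   // blocks when the queue is full
                                                                      }
                                                                      drop(tx);                                    // closing the channel stops the workers
                                                                      for h in handles { h.join().unwrap(); }
                                                                  }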

                                                              In the end we just have code and data and risk. It’s best to have a language with forcing functions that pressure us to minimize all of these over time. Languages that let you forget about accruing data and code and risk tend to keep people very busy over time. Friction in some places can be a good thing if it encourages less code, less data, and less risk.

                                                              1. 17

                                                                I like Rust and I like threads, and I do indeed regret that most libraries have been switching to async-only. It's a lot more complex, almost a new sub-language to learn.

                                                                That being said, I don't see a better technical solution for Rust (i.e. no mandatory runtime, no implicit allocations, no compromise on performance) for people who want to manage millions of connections. Sadly, a lot of language design is driven by the use case of giant internet companies in the cloud; that's a problem they have, and I'm not sure why anyone else cares. But if you want to do that, threads start getting in the way at around 10k, maybe 100k if you tune Linux well, and even then the memory overhead and latency are not insignificant, whereas a future can be very tiny.

                                                                Ada's tasks seem awesome, but to the best of my knowledge they're for very limited concurrency (i.e. the number of tasks is small, or even fixed beforehand), so they're not a solution to this particular problem.

                                                                Of course, async/await in other languages with runtimes is just a bad choice. Python in particular could have gone with "goroutines" (for lack of a better word) like Stackless Python already had, and avoided a lot of complexity. (How do people still say Python is simple?!) At least Java's Loom project is heading in the right direction.

                                                                1. 12

                                                                  Just like some teenagers make their slow cars super loud to emulate the people they look up to who drive fast cars, we all make similar aesthetic statements when we program. I think I may write on the internet in a way that attempts to emulate a grumpy greybeard for similarly aesthetic, socially motivated reasons. The actual effect of a program or its maintenance is only a part of our expression while coding. Without thinking about it, we also code as an expression of our social status among other coders. I find myself testing random things with quickcheck, even if they don't actually matter for anything, because I think of myself as the kind of person who tests more than others. Maybe it's kind of chicken-and-egg, but I think we all do these things as statements of values, even to ourselves, even when nobody else is looking.

                                                                  Sometimes these costumes do work out in terms of the effects they grant us. But the overhead of Rust executors is just perf theater that comes with nasty correctness hazards, and it's not a good choice beyond prototyping if you're actually trying to build a system that handles millions of concurrent in-flight bits of work. It locks you into a bunch of default decisions around QoS, low-level epoll behavior, etc. that will always be suboptimal unless you rewrite a big chunk of the stack yourself, and at that point the abstraction has lost its value and just adds cycles and cognitive complexity on top of a stack you've already fully tweaked.

                                                                  1. 3

                                                                    The green process abstraction seems to work well enough in Erlang to serve tens of thousands of concurrent connections. Why do you think the async/await abstraction won’t work for Rust? (I understand they are very different solutions to a similar problem.)

                                                                    1. 4

                                                                      Not who you're asking, but here's why Rust can't have green threads (it used to, pre-1.0, but they were scrapped), as far as I understand:

                                                                      Rust is shooting for C or C++-like levels of performance, with the ability to go pretty close to the metal (or close to whatever C does). This adds some constraints, such as the need to support certain calling conventions (especially for C interop), and precludes the use of a GC. I'm also pretty sure the overhead of the probes inserted into Erlang's bytecode to check reduction counts in recursive calls would contradict that (in Rust they'd also have to be in loops, btw); afaik that's how Erlang implements its preemptive scheduling of processes. I think Go has split stacks (so that each goroutine takes less stack space) and some probes for preemption, but the costs are real, and in particular the C FFI is slower as a result. (Saying that as a total non-expert on the topic.)

                                                                      I don’t see why async/await wouldn’t work… since it does; the biggest issues are additional complexity (a very real problem), fragmentation (the ecosystem hasn’t converged yet on a common event loop), and the lack of real preemption which can sometimes cause unfairness. I think Tokio hit some problems on the unfairness side.

                                                                      1. 4

                                                                        The biggest problem with green threads is literally C interop. If you have tiny call stacks, then whenever you call into C you have to make sure there's enough stack space for it, because the C code you're calling into doesn't know how to grow your tiny stack. If you do a lot of C FFI, then you either lose the ability to use small stacks in practice (because every "green" thread winds up making an FFI call and growing its stack) or you implement some complex "stack switching" machinery (where a dedicated FFI stack is shared between multiple green threads).

                                                                        Stack probes themselves aren’t that big of a deal. Rust already inserts them sometimes anyway, to avoid stack smashing attacks.

                                                                        In both cases, you don’t really have zero-overhead C FFI any more, and Rust really wants zero-overhead FFI.

                                                                        I think Go has split stacks (so that each goroutine takes less stack space)

                                                                        No, they don't any more. Split stacks have some really annoying performance cliffs. Go instead uses movable stacks: when a goroutine runs out of stack space, the runtime copies the stack to a larger allocation, a lot like how Vec works, with all the nice "amortized linear" performance patterns that result.

                                                                      2. 3

                                                                        Two huge differences:

                                                                        • Erlang’s data structures are immutable (and it has much slower single threaded speed).
                                                                        • Erlang doesn’t have threads like Rust does.

                                                                        That changes everything with regard to concurrency, so you can't really compare the two. A comparison to Python makes more sense, and Python async has many of the same problems (mutable state, and the need to compose with code and libraries written for other concurrency models).

                                                                  2. 4

                                                                    I’d like to see a good STM implementation in a library in Rust.

                                                                2. 6

                                                                  The fixation on async-await, despite it slowing down almost every real-world workload it is applied to, and despite it adding additional bug classes and compiler errors that simply don’t exist unless you start using it, has been particularly detrimental to the ecosystem.

                                                                  I'm curious about this perspective. The number of individual threads available on most commodity machines even today is quite low, and if you're doing anything involving external requests per incoming request (serializing external APIs, rewriting HTML served by another site, reading from slow disk, etc.) and those external requests take anything longer than a few milliseconds (which is most things, assuming a commodity connection in most parts of the world, or a slower disk), then you are better off with some form of "async" (or otherwise lightweight) concurrent model of execution. I understand that badly-used synchronization can absolutely tank performance with this many "tasks", but in situations where synchronization is low (e.g. making remote calls, storing state in a DB or a separate in-memory cache), performance should be better than threaded execution.
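
                                                                  A minimal sketch of that claim, assuming the tokio crate (with its rt, macros and time features enabled; the function names are invented): two simulated IO waits overlap on a single thread instead of blocking two OS threads.

                                                                      use std::time::{Duration, Instant};
                                                                      
                                                                      #[tokio::main(flavor = "current_thread")]
                                                                      async fn main() {
                                                                          let start = Instant::now();
                                                                          // both "requests" run concurrently on one thread
                                                                          let (a, b) = tokio::join!(fake_io("api"), fake_io("disk"));
                                                                          // wall-clock time is ~100ms, not ~200ms
                                                                          println!("{a}, {b}, elapsed {:?}", start.elapsed());
                                                                      }
                                                                      
                                                                      // stand-in for a remote call or a slow disk read
                                                                      async fn fake_io(name: &str) -> String {
                                                                          tokio::time::sleep(Duration::from_millis(100)).await;
                                                                          format!("{name} done")
                                                                      }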

                                                                  Also, if I reach for Rust I’m deliberately avoiding GC. Go, Python, and Haskell are the languages I tend to reach for if I just want to write code and not think too hard about who owns which portion of data or how exactly the runtime schedules my code. With Rust I’m in it specifically to think about these details and think hard about them. That means I’m more prone to write complicated solutions in Rust, because I wouldn’t reach for Rust if I wanted to write something “simple and obvious”. I suspect a lot of other Rust authors are the same.

                                                                  1. 5

                                                                    The number of individual threads available on most commodity machines even today is quite low

                                                                    I don’t agree with the premise here. It depends more on the kernel, not the “machine”, and Linux in particular has very good threading performance. You can have 10,000 simultaneous threads on vanilla Linux on a vanilla machine. async may be better for certain specific problems, but that’s not the claim.

                                                                    Also a pure async model doesn’t let you use all your cores, whereas a pure threading model does. If you really care about performance and utilization, your system will need threads or process level concurrency in some form.

                                                                    1. 4

                                                                      I don’t agree with the premise here. It depends more on the kernel, not the “machine”, and Linux in particular has very good threading performance. You can have 10,000 simultaneous threads on vanilla Linux on a vanilla machine. async may be better for certain specific problems, but that’s not the claim.

                                                                      I wasn’t rigorous enough in my reply, apologies.

                                                                      What I meant to say was that the number of cores available on a commodity machine is quite low. Even if you spawn thousands of threads, your actual thread-level parallelism is limited to the number of cores available. If you need more kernel threads than available cores, you have to put engineering into determining how many threads to create and when.

                                                                      For IO-bound workloads (which I described in my previous post), the typical strategy is to create a thread pool and allocate threads from it. Thread pools exist so that applications don't saturate available memory with thread stacks and so the kernel doesn't drown in time spent switching threads. At that point, your most granular "unit of concurrency" is a thread in the pool. If most of your workload is IO-bound, you end up tuning pool sizes to avoid thread contention on the one hand (too few threads) and resource limits on the other (too many threads).

                                                                      You could of course build a more granular scheduler atop these threads, putting threads "to sleep" once they begin to wait on IO, but that is essentially what most async implementations are: optimizations on "thread-grained" applications. Given that you're already doing the work of creating thread pools and all the fiddly logic of locking the pool, pulling out a thread, then locking again to put it back, it's not a huge lift to deal with async tasks.

                                                                      Of course, if your workload is CPU-bound, this is all moot: the limiting resource is CPU rather than IO, so performing more work than you have CPU available necessitates queuing.

                                                                      Moreover, the context in which I was saying this is that most Rust async libraries I've seen are async because they deal with IO and not CPU, which is what async models are good at.

                                                                    2. 3

                                                                      Various downsides are elaborated at length in this thread.

                                                                      1. 2

                                                                        Thanks for the listed points. What it’s made me realize is that there isn’t really a detailed model which allows us to demonstrate tradeoffs that come with selecting an async model vs a threaded model. Thanks for some food for thought.

                                                                        My main concern with Rust async is mostly just its immaturity. Forget the code semantics; I have very little actual visibility into Tokio's (for example) scheduler without reading the code. How does it treat many small jobs? Is starvation a problem, and under what conditions? If I wanted to write a high-reliability web service with IO-bound logic, I would not want my event loop to starve a long-running request that has to wait longer on IO than a short-running one, causing long-running requests to time out and fail. With a threaded model and an understanding of my IO processing latency, I can ensure that I have the correct number of threads available with some simple math, and not be afraid of things like starvation, because I trust the Linux kernel's thread scheduler much more than Tokio's async scheduler.
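
                                                                        (For what it's worth, the "simple math" is roughly: threads per pool is about cores times (1 + wait time / compute time). A hedged sketch with made-up numbers:)

                                                                            // back-of-envelope pool sizing for an IO-bound service: for every
                                                                            // `compute_ms` of CPU work, a request waits `wait_ms` on IO, so each
                                                                            // core can keep roughly (1 + wait/compute) threads busy.
                                                                            fn pool_size(cores: usize, wait_ms: f64, compute_ms: f64) -> usize {
                                                                                (cores as f64 * (1.0 + wait_ms / compute_ms)).ceil() as usize
                                                                            }
                                                                            
                                                                            fn main() {
                                                                                // e.g. 8 cores, 90ms of IO wait per 10ms of CPU per request:
                                                                                assert_eq!(pool_size(8, 90.0, 10.0), 80); // 80 threads saturate the cores
                                                                            }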

                                                                    3. 3

                                                                      There’s no if-statement community

                                                                      That had me laughing out loud!

                                                                      1. 2

                                                                        probably because it’s if-expressions 🙃

                                                                      2. 2

                                                                        I hope I'm not opening any wounds or whacking a beehive by asking, but… what sort of problematic interactions occur with the Rust community? I follow Rust mostly as an intellectual curiosity and therefore am not in deep enough to see the more annoying aspects. When I think of unprofessional language-community behavior, my mind mostly goes to Rails during the aughts, when it was just straight-up hostile towards Java and PHP. Is Rust doing a similar thing to C/C++?

                                                                      1. 3

                                                                        Do note that this debate continued, and Linus Torvalds is still wrong on this.

                                                                        1. 2

                                                                          Not disagreeing, but that post dates to 2006.

                                                                          1. 1

                                                                            As far as I am aware, Linus is yet to acknowledge the superiority of the microkernel approach.

                                                                            1. 8

                                                                              While I find microkernels interesting (albeit only one facet of the kernel-design discussion, one that tends to overshadow the others), I'm not sure you're aware how smug and condescending you sound in this comment, and in others I've seen from you on this topic. It suffocates the potential for discussion and debate and sets a bad atmosphere.

                                                                              1. 3

                                                                                Bad atmosphere or not, I think everybody who has ever had a kernel problem in production shares the sentiment.

                                                                                What I would give for the ability to gcore and restart a misbehaving process, instead of the entire machine with several dozen tenants on it. It makes development and security updates easier too.

                                                                        1. 1

                                                                          I don’t know much about data-oriented programming, but it seems popular among some game developers. Cool to see that a book is being written on it though, I might give it a read to see what the fuss is about.

                                                                          1. 7

                                                                            The programming paradigm used by game developers is called Data-Oriented design, whose main purpose is to improve the performance of an application.

                                                                            The book is about Data-Oriented programming, a paradigm aimed at reducing the complexity of software systems.

                                                                            More about the distinction between the two in this article.

                                                                            1. 2

                                                                              Really? Those terms are so similar, my bad. That’s quite interesting though, I wonder how it works. Guess I’ll take a look at those preview chapters. Thanks for clearing that up.

                                                                              1. 3

                                                                                Uhh, when I look at the table of contents, this book is indistinguishable from one about functional programming. Functional programming can be viewed as data-oriented as well – it’s two sides of the same coin.

                                                                                Rich Hickey advocates programming with “functions and data”. The book has chapters about persistent and immutable data structures, and the author lists Clojure in his experience.

                                                                                So I hope that he explains somewhere why a new term is necessary. I don’t think it is, but I’m not interested enough in the book as is to make it worth finding out.

                                                                                1. 2

                                                                                  DOP is not a new term, and it is not the same as FP.

                                                                                  1. The term Data-Oriented programming was coined in the 2000s by Eugene Kuznetsov.
                                                                                  2. Clojure was the first programming language to embrace DOP. I don’t think that other FP languages embrace data in the same way as Clojure.
                                                                                  3. In a sense, the purpose of the book is to illustrate how to apply DOP to languages other than Clojure.
                                                                                  1. 2

                                                                                    Clojure was the first programming language to embrace DOP.

                                                                                    I’m trying to understand in what way DOP differs from more traditional functional approaches. Erlang is the one I am most familiar with, and between persistent data structures and structural pattern matching, it seems to match DOP just fine. How is Clojure different?

                                                                                    1. 1

                                                                                      Does Erlang give you generic access to the data via its information path?

                                                                                      I'll give you an example from my book. Consider a simplistic representation of library catalog data:

                                                                                      var catalog = {
                                                                                          "books": [
                                                                                            {
                                                                                              "title": "Watchmen",
                                                                                              "publicationYear": 1986,
                                                                                              "authors": [
                                                                                                {
                                                                                                  "firstName": "Alan",
                                                                                                  "lastName": "Moore"
                                                                                                },
                                                                                                {
                                                                                                  "firstName": "Dave",
                                                                                                  "lastName": "Gibbons"
                                                                                                }
                                                                                              ]
                                                                                            }
                                                                                          ]
                                                                                      }
                                                                                      

                                                                                      Assume you’d like to retrieve the first name of the first author of the first book. The information path is: [0, "authors", 0, "firstName"].

                                                                                      In DOP, we access the information via a code like this:

                                                                                      get(catalog, [0, "authors", 0, "firstName"])
                                                                                      

                                                                                      Accessing data in DOP via its information path requires only knowledge about the structure of the data (basically field names).

                                                                                      As a consequence, we leverage general-purpose data manipulation functions (provided by the language or by third party libraries) to write our business logic.

                                                                                      It makes a huge difference!
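
                                                                                      (For readers following the Rust sub-thread above, a rough analogue, assuming the serde_json crate: a JSON Pointer string is an information path, i.e. plain data you can store, pass to functions, or use as a cache key, and Value::pointer plays the role of the generic get.)

                                                                                          use serde_json::{json, Value};
                                                                                          
                                                                                          fn main() {
                                                                                              let catalog: Value = json!({
                                                                                                  "books": [{
                                                                                                      "title": "Watchmen",
                                                                                                      "publicationYear": 1986,
                                                                                                      "authors": [
                                                                                                          { "firstName": "Alan", "lastName": "Moore" },
                                                                                                          { "firstName": "Dave", "lastName": "Gibbons" }
                                                                                                      ]
                                                                                                  }]
                                                                                              });
                                                                                          
                                                                                              // the information path is just a value: store it, pass it
                                                                                              // around, or use it as a cache key
                                                                                              let path = "/books/0/authors/0/firstName";
                                                                                              assert_eq!(catalog.pointer(path), Some(&json!("Alan")));
                                                                                          }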

                                                                                      1. 4

                                                                                        To be honest, that explanation still feels extremely abstract to me, as it's not clear what get actually represents in your description. But in Erlang that kind of "path" can be handled using function argument destructuring. In the following example, the function get_thing accepts an argument containing a structure corresponding to your object of nested maps and lists, and pulls out the interesting bit in the function head, binding it to the variable FirstName, which the single line of the function body then prints.

                                                                                        % this is a comment, starting with a % character
                                                                                        
                                                                                        % the following is the main function, and we don't care
                                                                                        % about the arguments passed to the function so we use
                                                                                        % an underscore instead of naming them or destructuring
                                                                                        % them. Underscores will be used later to ignore the
                                                                                        % "rest of the list" in the pattern match that handles
                                                                                        % our "path".
                                                                                        
                                                                                        main(_) ->
                                                                                            % Catalog corresponds to "var catalog" in
                                                                                            % the above example. Variables are
                                                                                            % capitalized in Erlang.
                                                                                            Catalog = #{
                                                                                              "books" => [
                                                                                                #{
                                                                                                  "title" => "Watchmen",
                                                                                                  "publicationYear" => 1986,
                                                                                                  "authors" => [
                                                                                                    #{
                                                                                                      "firstName" => "Alan",
                                                                                                      "lastName" => "Moore"
                                                                                                    },
                                                                                                    #{
                                                                                                      "firstName" => "Dave",
                                                                                                      "lastName" => "Gibbons"
                                                                                                    }
                                                                                                  ]
                                                                                                }
                                                                                              ]
                                                                                            },
                                                                                            
                                                                                            % The final line of an Erlang function ends with a period.
                                                                                        % Expressions before it end with a comma.
                                                                                        
                                                                                            % passes the Catalog variable to the get_thing function
                                                                                            get_thing(Catalog).
                                                                                        
                                                                                        % the following is a function that uses 
                                                                                        % destructuring to pull apart the complex
                                                                                        % argument and binds the interesting item
                                                                                        % to the variable named "FirstName",
                                                                                        % and then prints it out.
                                                                                        
                                                                                        get_thing(#{"books" := [ #{"authors" := [#{"firstName" := FirstName} | _]} | _]}) ->
                                                                                          io:format("first name of first author of the first book: ~p\n", [FirstName]).
                                                                                        

                                                                                        It’s not clear to me what a non-data-oriented programming approach to this would be, though. A helpful idea from Saussure (I think): words are meaningful only due to how they are different from other words.

                                                                                        1. 2

                                                                                          Wouldn’t the ‘information path’ in this case be ["books", 0, ...]? Anyway, I think most people would be more familiar with the notation catalog.books[0].authors[0].firstName, but that’s a small nitpick. I think a more interesting point is that to a ‘dyed in the wool’ functional programmer, this would be an example of where one could use lenses to compose together powerful data access patterns, e.g. in Haskell notation imagine you have some basic lenses on the above data structures, you could compose them like:

                                                                                          catalog ^. books . ix 0 . authors . ix 0 . firstName
                                                                                          

                                                                                          (Here books, authors, and firstName are assumed field lenses and ix is the index traversal from Control.Lens; with van Laarhoven lenses the composition reads left-to-right, outermost field first.)

                                                                                          Gabriel Gonzalez has a very cool blog post about the power of lenses: https://www.haskellforall.com/2013/05/program-imperatively-using-haskell.html

                                                                                          1. 3

                                                                                            You are right: the correct information path is ["books", 0, "authors", 0, "firstName"].

                                                                                            There is an important difference between the ability to access any piece of data via its information path (like in get(catalog, ["books", 0, "authors", 0, "firstName"])) and the familiar notation that you mentioned (like in catalog.books[0].authors[0].firstName).

                                                                                            The difference is that the information path is a first-class citizen (it’s nothing more than an array!) that can be manipulated programmatically.

                                                                                            For example, we can:

                                                                                            1. Pass the information path as an argument to a function
                                                                                            2. Store the information path in a variable
                                                                                            3. Count the number of access per information path
                                                                                            4. Use the information path as a key for a cache

                                                                                            Would you say that with lenses, information path is a first-class citizen?

                                                                                            1. 1

                                                                                              Yes, lenses are normal values in the program, so they are very much first-class.

                                                                                              1. 1

                                                                                                Information paths remind me of key paths in Swift. Given a struct S, you can define a key path for S.member, store it, pass it and use it as a subscript for any value of S.

                                                                                        2. 1

                                                                                          I think it’s weird that you use Clojure as an example of Data-Oriented Programming, seeing as it is a peculiar dialect of Lisp. Lisp is famous for blurring the line between code and data, which allows for a powerful macro system. Can you elaborate on this?

                                                                                          1. 3

                                                                                            Clojure is not just an example of Data-Oriented Programming. Clojure is, as far as I know, the first language to embrace DOP at the level of the language.

                                                                                            Clojure's native data structures (maps, vectors, sets, etc.) are persistent, and their implementation is efficient both in terms of memory and computation. Since then, the implementation of Clojure's persistent data structures has been ported to other programming languages (e.g. Immutable.js for JavaScript, Paguro for Java). As a consequence, DOP is applicable to other languages.

                                                                                            Clojure is also a LISP, and as such it is a homoiconic language. However, the fact that code and data share the same syntax doesn't mean that the line between code and data is blurred. It means that we can write macros that manipulate code as if it were data.

                                                                                            In DOP, we separate code from data in the sense that data is not encapsulated into objects or lexical scope. We prefer to receive the data that we manipulate as an explicit argument.

                                                                                            I hope it makes things a bit clearer.

                                                                                            For more details, you can watch my talk “Data-Oriented programming: the secret sauce that makes Clojure systems less complex”

                                                                                            1. 2

                                                                                              The Erlang behaviors also force you to be very explicit about the state involved, causing overly complex architectures to feel kind of painful until you refactor them to be simpler. This downward pressure imposed by the language is one of my favorite aspects of it. Most languages pride themselves on making abstraction as easy as possible, but it’s hazardous to make abstraction so easy that you forget about the risk accumulating in the system with more code and data.

                                                                                        3. 1

                                                                                          I agree, it seems quite similar. From the draft Wikipedia article he links to in his article, it seems as if Data-Oriented Programming prefers more general data structures. Maybe they're opposed to strong type systems? I can't really tell myself. It also says "Also, in FP usage of lexical scope could break the clear separation between code and data that DOP requires", which just makes me more confused. Do they love global state?

                                                                                          1. 2

                                                                                            Indeed, DOP is a more natural fit on dynamically-typed languages, although I believe it could be applied to statically-typed languages also.

                                                                                            According to DOP, functions should receive the data they manipulate as an explicit argument.

                                                                                            We are allowed to represent the whole state of the system as a (big, nested) hash map. But even then, the state is passed as an explicit argument to the functions that access it.

                                                                                  1. 1

                                                                                    Some really neat ideas, but also a lot of headshaking around things like the XLA/TensorFlow dependency. The serious numerical story in Elixir doesn't exist because core doesn't care and, frankly, most Elixir folks are basically just doing web stuff as Ruby + Rails refugees.

                                                                                    Nice agitprop though if you want to boost your language’s popularity by riding the AI/ML gravy train.

                                                                                    1. 17

                                                                                      The serious numerical story in Elixir doesn’t exist because core doesn’t care and, frankly, most Elixir folks are basically just doing web stuff

                                                                                      Same applied to Python way back. That the Elixir devs are attempting to broaden their ecosystem is good.

                                                                                      1. 6

                                                                                        BEAM would not be my first choice for anything where I had to do some serious number crunching. I wonder who will even use this, and if it’ll die out like the Swift extensions for it.

                                                                                        I use Elixir because it’s good at networking, concurrency, and parsing, not anything AI.

                                                                                        1. 3

                                                                                          I had the same initial reaction and would rather bet on Julia for the future of ML. But at the very least, breaking Python’s monopoly is good and, of course, there is no good reason for Python’s ascent in this field. So, there’s no reason other ecosystems that have certain advantages over Python shouldn’t try to make a dent here.

                                                                                          A lot of ML projects use Ray (ray.io) for orchestration and workflows (or so I hear), basically implementing the actor model, and Elixir/BEAM is probably a more natural fit for that side of the problem.

                                                                                          1. 3

                                                                                            On the other hand, not everybody needs serious number crunching. This could let people in the elixir ecosystem stay within it for longer.

                                                                                            1. 1

                                                                                              Seems like they are using multistage programming to make this rather fast - I’m guessing this stuff wouldn’t be running on the BEAM directly? Could be wrong though.

                                                                                            2. 5

                                                                                              The serious numerical story in Elixir doesn’t exist because core doesn’t care

                                                                                              Ehhhh. Clearly you don’t mean “Elixir core”, as Jose is the one writing this. Even if you mean the Erlang core team, they might not care enough to lead implementation but they’re open to additions that support it: https://github.com/erlang/otp/pull/2890

                                                                                              Anyway, BEAM isn’t the logical execution environment for this stuff, it needs GPU to fly. For this use case, BEAM is a great place to write a compiler, and will function adequately as a test environment.

                                                                                              1. 1

                                                                                                ¯\_(ツ)_/¯

                                                                                              2. 5

                                                                                                FWIW, the compiler backend of Nx is pluggable, so other compilers aside from XLA can also be integrated with. XLA happened to be the first one they’ve gone with thus far.

                                                                                                What do you mean with “the core doesn’t care”?

                                                                                                1. 5

                                                                                                  Paraphrasing a colleague's gripes… Elixir has a bunch of math functionality inherited from Erlang and hasn't bothered to fix any of the problems around it.

                                                                                                  In Erlang:

                                                                                                  2> math:exp(344).
                                                                                                  2.4963287283217065e149
                                                                                                  3> math:exp(3444).
                                                                                                  ** exception error: an error occurred when evaluating an arithmetic expression
                                                                                                       in function  math:exp/1
                                                                                                          called as math:exp(3444)
                                                                                                  

                                                                                                  In Elixir:

                                                                                                  iex(3)> :math.exp(344) 
                                                                                                  2.4963287283217065e149
                                                                                                  iex(3)> :math.exp(3444)  
                                                                                                  ** (ArithmeticError) bad argument in arithmetic expression
                                                                                                      (stdlib 3.13.2) :math.exp(3444)
                                                                                                  

                                                                                                  Contrast with JS:

                                                                                                  > Math.exp(344)
                                                                                                  2.4963287283217065e+149
                                                                                                  > Math.exp(3444)
                                                                                                  Infinity
                                                                                                  

                                                                                                  What's going on here is that Erlang uses doubles internally (except for integers, but that's a different kettle of fish) but does not support the IEEE 754 special values like NaN or Infinity. That is certainly a choice they could make, but it rears its ugly head when you implement something mathematically well-behaved like the sigmoid function:

                                                                                                  iex(5)> sigmoid = fn (x) -> 1.0 / (1.0 + :math.exp(-x)) end
                                                                                                  #Function<44.97283095/1 in :erl_eval.expr/5>
                                                                                                  iex(6)> sigmoid.(-1000)                                    
                                                                                                  ** (ArithmeticError) bad argument in arithmetic expression
                                                                                                      (stdlib 3.13.2) :math.exp(1000)
                                                                                                  iex(6)> sigmoid.(-100) 
                                                                                                  3.7200759760208356e-44
                                                                                                  

                                                                                                  In JS, exp(1000) would overflow to Infinity, the denominator would become infinite, and the function would gracefully return 0. In Elixir, it explodes.
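
                                                                                                  (For comparison, a quick sketch in Rust, whose f64 has full IEEE 754 semantics: exp overflows to infinity instead of raising, so sigmoid degrades gracefully at the extremes.)

                                                                                                      fn sigmoid(x: f64) -> f64 {
                                                                                                          1.0 / (1.0 + (-x).exp())
                                                                                                      }
                                                                                                      
                                                                                                      fn main() {
                                                                                                          assert!(f64::exp(3444.0).is_infinite()); // overflows to +inf, no exception
                                                                                                          assert_eq!(sigmoid(-1000.0), 0.0);       // 1 / (1 + inf) == 0
                                                                                                          assert_eq!(sigmoid(1000.0), 1.0);        // exp(-1000) underflows to 0
                                                                                                      }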

                                                                                                  Core has had the opportunity to fix this for a long time, but has been spending cycles elsewhere.

                                                                                                  1. 5

                                                                                                    Elixir has a bunch of math functionality inherited from Erlang, and hasn’t bothered to fix any of the problems around it.

                                                                                                    I found no tickets in the Erlang (where BEAM issues belong) or Elixir bug trackers discussing infinity.

                                                                                                    1. 2

                                                                                                      For all the supposed number crunching going on, nobody’s requesting working IEEE doubles?

                                                                                                      1. 2

                                                                                                        Not really, as most of the "interesting" part of IEEE 754 is already there. What is not supported are qNaN (sNaN is supported, since its behaviour is exactly what Erlang does anyway: throw an exception) and the infinities. All other values are fully supported in Erlang. And while an exception at infinity can be irritating, in most situations you do not mind, as the result of the computation would probably be garbage to you anyway. So failing early probably gives clearer information, especially with Erlang's error-isolation approach.

                                                                                                2. 2

                                                                                                  The serious numerical story in Elixir doesn’t exist because core doesn’t care and, frankly, most Elixir folks are basically just doing web stuff

                                                                                                  That perfectly describes how I got into Elixir. Phoenix got me into the ecosystem, but I branched out pretty quickly once I understood the power of the OTP model. About two years ago I started looking into writing drone flight control software in Elixir and ran into a huge hurdle with the numerical processing portion of it; I was fiddling around with NIFs and things, but it felt like I was just compromising the robustness of the system by having it call out to my own C code.

                                                                                                  The Nx project has me very very excited that the project I put on the shelf might be viable now!

                                                                                                1. 5

                                                                                                  For Python developers, Nx currently takes its main inspirations from Numpy and JAX but packaged into a single unified library.

                                                                                                  Julia might be better inspiration; I hope they didn’t stop with the Python ecosystem.

                                                                                                  1. 2

                                                                                                    You might want to elaborate why this might be :)

                                                                                                  1. 2

                                                                                                    Long article. I started by glancing at the conclusions, which turned out to be a good idea in hindsight.

                                                                                                    Monolithic vs micro-kernel vs your-favourite-denomination-kernel. This is one of the ultimate nerd debates, and I don’t think that the answer to this question is actually very important.

                                                                                                    The highlighted part saved me a lot of reading. I dismissed the article as easily as the writer dismissed the state of the art in OS architecture; they didn't even bother to do basic research on the subject, yet felt entitled to dismiss it rudely.

                                                                                                    1. 10

                                                                                                      I think it’s a mistake to dismiss this article based on it not dealing with a single question of monolithic vs microkernel kernel architecture. Most of what it dealt with was the design of operating system userspace paradigms, which I think is as worthy of attention as kernel architecture. Certainly it’s far more visible to the end user than whether the kernel is a monolith or not.

                                                                                                      1. 2

                                                                                                        I think it’s a mistake to dismiss this article based on it not dealing with a single question of monolithic vs microkernel kernel architecture.

                                                                                                        I was otherwise fine with the article not dealing with this, but they didn’t stop at that, and had to add some disrespectful blabber.

                                                                                                        Most of what it dealt with was the design of operating system userspace paradigms

Sure, I did end up reading more, but I was understandably pissed off at how the part I highlighted was worded. Just because the author isn’t personally interested doesn’t give them licence to be dismissive like this.

                                                                                                        1. 4

                                                                                                          Hahaha, especially since they have an issue open on Github titled “How to handle when a device driver panics?”.

                                                                                                      2. 2

                                                                                                        Given some of the grief I’ve had due to faulty drivers or kernel services in production systems, I can only imagine the article’s author has little experience with production servers. I’d give a lot for the ability to gcore and restart currently in-kernel services without rebooting the entire system: less impact for fixes and upgrades, far easier debugging, likely higher overall stability, and a smaller attack surface while we’re at it.

                                                                                                        The answer to this question is very important.

                                                                                                      1. 3

                                                                                                        Hi all. I wrote this blog post to describe bytecode rewriting in an interpreter. It’s a distillation of one of the Brunthaler papers, reduced to ~100 lines of C. Please let me know what you think!
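If you want the one-line version of the trick: an instruction overwrites its own opcode with a specialised variant the first time it executes, so later passes skip the generic slow path. A toy version below (opcodes invented for this comment, not the post’s actual code):

#include <stdio.h>

/* Miniature bytecode rewriting: the generic opcode patches itself into
 * a specialised one on first execution, so subsequent iterations skip
 * the slow path entirely. */
enum { OP_PUSH, OP_ADD_GENERIC, OP_ADD_INT, OP_PRINT, OP_HALT };

int main(void) {
    int code[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD_GENERIC, OP_PRINT, OP_HALT };
    int stack[8], sp = 0, pc = 0;

    for (;;) {
        switch (code[pc]) {
        case OP_PUSH:
            stack[sp++] = code[pc + 1];
            pc += 2;
            break;
        case OP_ADD_GENERIC:
            /* slow path ran once: rewrite the instruction in place... */
            code[pc] = OP_ADD_INT;
            /* ...and fall through to the specialised handler */
        case OP_ADD_INT:
            sp--;
            stack[sp - 1] += stack[sp];
            pc += 1;
            break;
        case OP_PRINT:
            printf("%d\n", stack[--sp]);
            pc += 1;
            break;
        case OP_HALT:
            return 0;
        }
    }
}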

                                                                                                        1. 2

Your past two posts have been excellent. Given the relative rarity of harder technical content, it’s appreciated!

                                                                                                          1. 1

                                                                                                            I’m glad you like them ^_^ I’ve got some more planned that are a little more advanced. Hopefully they make it.

                                                                                                        1. 7

                                                                                                          This would interest me, but the article is behind the Medium account wall. Workaround:

                                                                                                          https://outline.com/7h2YYw

                                                                                                          1. 12

                                                                                                            Notice the “cached” button on every story in lobsters.

                                                                                                            1. 4

                                                                                                              Doh, how could I have missed that. Thanks!

                                                                                                              1. 2

This doesn’t seem to work. It links to archive.md, which just times out. I’ve never heard of archive.md before - what is it?

                                                                                                                1. 3

                                                                                                                  A bit like an on-demand Wayback Machine. It archives links.

                                                                                                                  Good chance you’re using Cloudflare for DNS resolution. Apparently there’s some disagreement between Cloudflare and archive.md, leading to failed resolution.

                                                                                                              2. 3

If you set up a cookie-autodelete plugin you’ll never see the Medium login-wall; I forget it exists until somebody like yourself (rightly) complains about it.

                                                                                                              1. 6

                                                                                                                Krste is a strong believer in macro-op fusion but I remain unconvinced. It requires decoder complexity (power and complexity), more i-cache space (power), trace caches if you want to avoid having it on the hot path in loops (power and complexity), weird performance anomalies when the macro-ops span a fetch granule and so the fusion doesn’t happen (software pain). And, in exchange for all of this, you get something that you could have got for free from a well-designed instruction set.

                                                                                                                1. 3

                                                                                                                  Why is more instruction cache needed? Is this lumping in the uop cache with L1i, or…?

                                                                                                                  1. 13

Imagine you need to clear the upper word of a variable (var &= 0xFFFFFFFF). This is very common when doing arithmetic on sizes smaller than the word size (e.g. 32-bit arithmetic on a 64-bit computer). On RISC-V it is done with two instructions: a shift left followed by a shift right. For example, consider a 16-bit unsigned addition x8 += x9:

                                                                                                                    add x8, x8, x9
                                                                                                                    slli x8, x8, 16
                                                                                                                    srli x8, x8, 16
                                                                                                                    

                                                                                                                    In the base instruction set, each instruction would require 4 bytes, for a total of 12 bytes. Of course, the compressed instruction set allows these instructions to be encoded as 2 bytes each, for a total of 6 bytes. As a point of comparison, on x86 this would be ADD ax, bx, which is encoded as 3 bytes.
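For concreteness, the C source behind a sequence like that would be something along these lines - the function name is mine, and on a 64-bit target the truncation shifts would be by 48 rather than 16:

#include <stdint.h>

/* 16-bit unsigned addition. The compiler must truncate the full-width
 * register result back to 16 bits, which is where the slli/srli pair
 * comes from (the base ISA has no 16-bit arithmetic instructions). */
uint16_t add16(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}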

Separate from the encoding-size issue is that on a typical architecture, these instructions will take at least 3 cycles to complete. They cannot be reordered or completed in parallel, because each instruction depends on the result of the previous one. One way around this is to just create a new instruction to zero out the top 16 bits of a register. However, the RISC-V people decided against doing this. Instead, they suggest that CPUs implement macro-op fusion, so the instruction decoder has to recognize that a left shift followed by a right shift of the same amount can be done in one cycle. This is tricky to get right (and no one has done it in silicon AFAIK). It also introduces a lot of complexity to optimization, similar to in-order superscalar cores (see e.g. the Pentium 1 section of this guide).

                                                                                                                    However, there is also the problem that while the above code could execute in 2 cycles with macro-op fusion, it still takes 6 bytes to encode. This takes up extra space in the instruction cache when compared to a dedicated “clear upper word” instruction (4 bytes) or more compact/irregular instruction sets like x86 (3 bytes).

                                                                                                                    (A way around this could be to cache the decoded instructions. This is not an uncommon technique, but decoded instructions are usually larger than encoded instructions. So it’s unlikely that this can be used to improve instruction cache density.)

                                                                                                                    1. 4

                                                                                                                      Thanks for the elaboration. I have been enlightened. :)

So that particular sequence of instructions would be longer, but assuming that macro-op fusion can be made to work well, is it possible to increase code density this way overall? For example, although there is no specific instruction to clear the upper bits, since there are fewer instructions to encode, could that allow the more common instructions to be shorter?

                                                                                                                      1. 6

                                                                                                                        So that particular sequence of instructions would be longer, but assuming that macro-op fusion can be made to work well, is it possible to increase the code density this way overall?

Yes (hopefully). In general, the RISC-V authors have chosen to make RISC-V very RISC-y when compared to other RISC ISAs. This manifests in few instructions, three operands for everything, and a very regular encoding in the base ISA. This has been criticized as wasting space in the base ISA, though I don’t have a link to such criticism on hand.

                                                                                                                        For example, although there is not a specific instruction to clear the upper bits, since there are fewer instructions to encode it potentially allows more common instructions to be shorter?

                                                                                                                        This is effectively what compressed instructions are for.

                                                                                                                  2. 1

The paper linked in the article appears to show that RV64GC, the compressed variant of RV64G, results in smaller program sizes than x86_64. If that’s true, wouldn’t that mean you would need less i-cache space? This isn’t my area of expertise, but I find it fascinating.

                                                                                                                    1. 3

                                                                                                                      There are a lot of variables here. One is the input corpus. As I recall, that particular paper evaluated almost exclusively C code. The generated code for C++ will use a slightly different instruction mix, for other languages the difference is even greater. To give a concrete example, C/C++ do not have (in the standard) any checking for integer overflow. It either wraps for unsigned arithmetic or is undefined for signed. This means that a+b on any C integer type up to [u]int64_t is a single RISC-V instruction. A lot of other languages (including Rust, I believe, and the implementations of most dynamic languages) depend on overflow-checked arithmetic on their fast paths. With Arm or x86 (32- or 64-bit variants), the add instructions set a condition code that you can then branch on, accumulate in a GPR, or use in a conditional move instruction. If you want to have a good fast path, you accumulate the condition code after each arithmetic op in a hot path then branch at the end and hit a slow path if any of the calculations overflowed. This is very dense on x86 or Arm.
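To make that concrete, the primitive such runtimes sit on looks something like this - a sketch using GCC/Clang’s __builtin_add_overflow, not code from the paper. On x86-64 or AArch64 the overflow bit falls out of the flags set by the add itself; on base RISC-V the compiler has to synthesise it with extra instructions:

#include <stdbool.h>
#include <stdint.h>

/* Fast-path checked addition: returns true if the signed addition
 * overflowed. On x86-64 this is an add plus a read of the overflow
 * flag; base RISC-V must recompute the condition from the operands. */
bool checked_add(int64_t a, int64_t b, int64_t *out) {
    return __builtin_add_overflow(a, b, out);
}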

                                                                                                                      RISC-V does not have condition codes. This is great for microarchitects. Condition code registers are somewhat painful because they’re an implicit data dependency from any arithmetic instruction to a load of others. In spite of this, Arm kept them with AArch64 (though dramatically reduced the number of predicated instructions, to simplify the microarchitecture) because they did a lot of measurement and found that a carefully optimised compiler made significant use of them.

                                                                                                                      RISC-V also doesn’t have a conditional move instruction. Krste likes to cite a paper by the Alpha authors regretting their choice of a conditional move, because it required one extra read port on the register file. These days, conditional moves are typically folded into the register rename engine and so are quite cheap in the microarchitecture of anything doing a non-trivial amount of register rename (they’re just an update in the rename directory telling subsequent instructions which value to use). Compilers have become really good at if-conversion, turning small if blocks into a path that does both versions and selects the results. This is so common that LLVM has a select instruction in the IR. To do the equivalent with RISC-V, you need to have logic in decode that recognises a small branch forward and converts it into a predicated sequence. That’s a lot more difficult to do than simply having a conditional move instruction and reduces code density.
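The source pattern at stake is tiny - something like the function below, which if-conversion turns into a cmov on x86-64 or a csel on AArch64, but which base RISC-V (lacking any conditional-move or select instruction) must express as a branch or with masking tricks:

/* A branchy select that compilers routinely if-convert. */
long smaller(long a, long b) {
    return a < b ? a : b;
}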

                                                                                                                      I had a student try adding a conditional move to a small RISC-V processor a few years ago and they reproduced the result that Arm used in making this decision: Without conditional moves, you need roughly four times as much branch predictor state to get the same overall performance.

                                                                                                                      Note, also, that these results predate any vector extensions for RISC-V. They are not comparing autovectorised code with SSE, AVX, Neon, or SVE. RISC-V has used up all of its 16-bit and most of its 32-bit instruction space and so can’t add the instructions that other architectures have introduced to improve code density without going into the larger 48-bit encoding space.

                                                                                                                      1. 1

The paper linked in the article appears to show that RV64GC, the compressed variant of RV64G, results in smaller program sizes than x86_64

x86-64 has pretty large binary sizes; if your compressed instruction set doesn’t produce smaller binaries than that, it should be redesigned.

I would take the measurements in that paper with a grain of salt; they aren’t comparing like with like. The Cortex-A15 benchmarks, for example, should also have been run in Thumb-2 mode. Thumb-2 causes substantial reductions in code size; it’s dubious to compare your compressed ISA to a competitor’s uncompressed ISA.

                                                                                                                        1. 1

This paper has some size comparisons against Thumb (along with Huawei’s custom extensions). This page has some as well.

                                                                                                                      1. 2

I agree that the GDB TUI is terrible. I use GDB in Emacs, and to get a stack trace or print variables I type the commands into the GUD command window. It’s not optimal, but it works. I’ve never had the graphical interface work as advertised.

                                                                                                                        Having a decent GUI interface with good memory views is a real boon to debugging low-level problems. It’s too bad that it doesn’t seem to work.

                                                                                                                        1. 1

                                                                                                                          It’s a real pity.

                                                                                                                          For as long as I’ve known about the TUI it has been in bad shape, and yet the demand for a better debug UI is very high. I don’t understand why the former, given the latter.

                                                                                                                          1. 2

                                                                                                                            I’m sure like every other foundational open source project, GDB doesn’t have enough people working on it.

                                                                                                                            Also, the philosophy is to let people make UIs on top of it, which lessens the need for the core to cater to everyone’s needs. Almost every debugger in the blog post is built on top of GDB! (or maybe EVERY single one?) So you can choose the one you want.

                                                                                                                            In my case, I noticed that the proprietary one (CLion) has the best GUI, which isn’t a surprise.

                                                                                                                            1. 2

I wouldn’t be surprised if there weren’t enough people working on GDB or LLDB, and yet GCC and LLVM seem to have no such problems, even though I’d argue that improving the debuggers would have much higher utility.

My guess is that compilers are more exciting and/or higher-status work, and attract their own share of PhD students. I have the exact same problem with the OCaml ecosystem, where the compiler sees plenty of work, but attempts to improve debugging have barely gone anywhere.

                                                                                                                          2. 1

                                                                                                                            GUD

They are under-using GUD.

See the documentation for the gdb-many-windows command.

I have in one frame the GDB command line, the local variables, the source code, the stack frames, the I/O, the breakpoints, and the threads.

                                                                                                                          1. 3

                                                                                                                            Would be interesting to see performance compared while taking into account temperatures. Both Intel and AMD chips are heavily dependent on temperature for reaching and maintaining boost clocks.

This article compares the M1 against previous-generation MacBooks, which optimise for compactness and quietness over performance. An Intel chip in a properly cooled system would perform better. They also include Ryzen desktop CPUs in the mix, which had to have been cooled using conventional desktop parts.

                                                                                                                            These numbers are useful for people choosing between MacBooks but in terms of actual performance they could be misleading.

                                                                                                                            1. 3

                                                                                                                              As far as I can tell, the article does not discuss the most important component here: the secondary storage itself. Some active benchmarking is needed here before any conclusions can be drawn.

The new M1 apparently has an SSD that is nearly twice as fast as a previous-gen Mac’s, but its numbers are typical for an NVMe drive. I rather suspect that normalizing for storage performance would render the M1 CPU itself uninteresting.

                                                                                                                              1. 2

                                                                                                                                The new M1 apparently has an SSD that is nearly twice as fast as a previous gen Mac

Only on the MacBook Air, which previously had an SSD that was 2x slower than what all the other Macs had.

                                                                                                                                1. 1

                                                                                                                                  Fair enough. Point stands though.

                                                                                                                            1. 11

I learned Erlang before Elixir, and it definitely helped me with Elixir tooling and the environment, which goes a long way when learning a new language. But, honestly, I didn’t realize how much Erlang’s unique syntax helped me with Erlang. I think since it was so different from other languages, it was easy to start “thinking” in terms of recursion, pattern matching, and concurrency. When I started with Elixir, I had to remind myself that it isn’t Lua or C or Perl; it’s Erlang with more normalized syntax. I am writing a lot more Elixir these days because I am using Phoenix, but I would prefer to write Erlang. Not that syntax is everything, but I enjoyed writing in it.

                                                                                                                              1. 9

                                                                                                                                But, honestly, I didn’t realize how much Erlang’s unique syntax helped me with Erlang

                                                                                                                                Erlang doesn’t have unique syntax, it has Prolog-like syntax. Erlang was originally prototyped in Prolog and the syntax was never changed. Fortunately, although the semantics are very different, the style of programming is quite similar between the two.

                                                                                                                                1. 4

                                                                                                                                  Ah, yes. I now remember reading that the syntax is Prolog inspired. Good reminder. I’ve never done anything in Prolog though so the syntax felt unique!

                                                                                                                                2. 2

When using it at work I did find Elixir’s syntax a bit clunkier for functional programming, if I’m honest. Prolog syntax has its quirks and is clunky in its own way, but it did seem like a better fit!

                                                                                                                                  1. 5

                                                                                                                                    Erlang code is beautifully terse. Very simple language, with minimal ornamentation.

                                                                                                                                    Elixir shines in other areas. I’m glad for the cross-pollination.

                                                                                                                                    1. 1

                                                                                                                                      Yeah, it was also good to see more people learning about pattern matching, immutability, and the actor model via Elixir. And those people bring other ideas too! So yeah, definitely glad for that.

                                                                                                                                1. 21

There is just such a huge ergonomics hit. On top of that, any time I’ve compared benchmarks between async and threaded code with the CPU frequency stabilized for repeatable results, async has usually resulted in lower throughput, and only in unrealistically low-CPU workloads have I sometimes measured latency improvements. Far worse ergonomics, more error-prone code, worse compiler inference, tons of dependencies, etc… for approximately equal or worse throughput and latency.

                                                                                                                                  Have other people run responsibly controlled benchmarks that show significant throughput improvements on modern server operating systems when using async? It’s kind of weird to me that people will go through all of this pain because some random person on the internet told them it was better, but it doesn’t seem like many people have seriously evaluated the costs and benefits.

                                                                                                                                  If you want, try out this echo example after disabling turbo boost and see for yourself:

                                                                                                                                  # build with turbo boost
                                                                                                                                  cargo build --release --bins
                                                                                                                                  
                                                                                                                                  # disable turbo boost for repeatable results
                                                                                                                                  echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
                                                                                                                                  
                                                                                                                                  # start threaded and async servers
                                                                                                                                  cargo run --release --bin req_res_threads & # starts on port 7000
                                                                                                                                  cargo run --release --bin req_res_async & # starts on port 7001
                                                                                                                                  
                                                                                                                                  # see how long it takes to receive 100k echo round trips of 4k buffers from 10 concurrent clients
                                                                                                                                  time cargo run --release --bin req_res_sender -- 7000 10 # "bad" thread per client
                                                                                                                                  time cargo run --release --bin req_res_sender -- 7001 10 # async
                                                                                                                                  
                                                                                                                                  # re-enable turbo boost for more enjoyable computing
                                                                                                                                  echo 0 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
                                                                                                                                  

On my Linux servers, threads tend to beat async throughput by 5-20%. The threaded version spawns a new thread per client; the async version uses a multi-threaded work-stealing executor.

                                                                                                                                  It’s interesting to strace the async version to see how many more syscalls can be generated when hammering on epoll.

                                                                                                                                  Anyway, can we please start measuring more? So much effort is being spent for negative gains :/

                                                                                                                                  1. 16

                                                                                                                                    It’s kind of weird to me that people will go through all of this pain because some random person on the internet told them it was better, but it doesn’t seem like many people have seriously evaluated the costs and benefits.

                                                                                                                                    It is fascinating how much damage early 2000’s “threads = slow, async = fast” FUD can do. Almost all of it doesn’t apply anymore, anyway. It seems like this meme has somehow been imparted into the collective programmer (un)conscious, and it seems impossible to root out at this point. Remember Node.js being marketed as “everything is async, therefore very fast”?

                                                                                                                                    Even today, somehow, being async is revered as a great feature. The very reason for “being async” seems to have long been forgotten by the general public. I have first hand experience with this phenomenon from teaching people Go. When they start learning, at some point, they ask the spooky question: “Does this call block a thread? I heard blocking is slow.”, and a wise Go sage promptly answers: “Fear not! The Go runtime actually uses async under the hood!”, and just like that, the pupil’s worries about performance are gone! Poof! The async magic sauce makes everything go fast, so if it’s under the hood, we need not worry about performance at all.

                                                                                                                                    In fact, I believe “being async” and exposing such primitives is an enormous disadvantage, in every single department except perhaps performance, and, even in that case, there exist multiple facets, as the benchmark linked in the parent post shows. Look at the absolutely remarkable amount of code (and the complexity thereof) the author of the blog post has to write towards the end of the article, in order to do something that is, conceptually, very simple[1]. And what have we gained from this, in the real world? A few less MBs of memory sitting around unused per connection in a server?

                                                                                                                                    I’m not buying what the async people are selling. The async programming model is strictly worse. Async code does not compose well with “regular” code, and can be difficult to reason about. Writing and debugging the reactors is difficult, especially if you want a cross-platform one, since each platform (epoll, kqueue, event ports, IOCP) has its own quirks. Having visibility into these runtimes requires a tremendous amount of extra work, because none of the existing tools understand them. Meanwhile, a thread / process, and calling read(2) is just about the same everywhere. But that would be too easy, wouldn’t it?
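To put the “thread plus read(2)” model in concrete terms, here is a sketch of an entire echo server in C - port 7000 picked arbitrarily, error handling omitted:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* One thread per connection, each blocking in read(2). No reactor,
 * no state machine, and every standard tool understands it. */
static void *echo(void *arg) {
    int fd = (int)(intptr_t)arg;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        (void)write(fd, buf, (size_t)n);
    close(fd);
    return NULL;
}

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(7000) };
    bind(srv, (struct sockaddr *)&addr, sizeof addr);
    listen(srv, 128);
    for (;;) {
        pthread_t t;
        int fd = accept(srv, NULL, NULL);
        pthread_create(&t, NULL, echo, (void *)(intptr_t)fd);
        pthread_detach(&t);
    }
}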

                                                                                                                                    When I look at much of what modern software development is like, I can’t help but feel like

                                                                                                                                    Anyway, can we please start measuring more? So much effort is being spent for negative gains :/

                                                                                                                                    alludes to a much deeper problem. You see it in many places, and the async cargo culting I tried to point out with this post is just one of them.

                                                                                                                                    [1] Here it is, in 8 lines of Go, and another ~20 for a complete working demo https://play.golang.org/p/B-ZmhxNIYPb

                                                                                                                                    1. 4

                                                                                                                                      In general, as someone who has worked a lot with servers in a language that doesn’t have async, Python[1], I will say that async programming is worse in many cases. I would not use it for anything outside the web domain. Unfortunately, at least in my career, web servers have eaten the world, and it turns out that async helps a lot there. With Python servers, it turns out that frequently they are only able to effectively make use of around 40% of a typical cloud VM[2]. If you start to get into the 60% CPU range performance quickly degrades. This also comports with my experience of Ruby at a previous job. Note that I’m also ignoring the absolutely astounding growth rate of memory usage resulting from a typical Python or Ruby code base that effectively limits the number of worker threads that you can actually run.

                                                                                                                                      Now, most of this stuff where someone writes an ETL or some other trivial thing? Yeah, just use threads and traditional concurrency primitives. I’ll note though that the article here is literally a toy example to demonstrate the trait implementations and compiler errors, not an example of best practices.

                                                                                                                                      • [1]: I do know about Python async “stuff,” but for all intents and purposes Python does not have async.
                                                                                                                                      • [2]: Say, a c5.large on AWS.
                                                                                                                                      1. 2

You do not need to expose an async interface to avoid the high memory consumption that a Python/Ruby system has, though. Go and Erlang/Elixir are two fine examples; there are plenty of others.

A multiprocess, single-threaded, synchronous, dynamically-typed, interpreted, GC’d language is the worst case for memory in a high-concurrency environment; it’s just one overhead after another. Python and Ruby are both technological dead ends in web dev for the reasons you pointed out.

                                                                                                                                        I suspect I’m drifting off from the topic at hand, however…

                                                                                                                                    2. 8

I want to upvote you even more than I can. I think it’s sad that async took so much mindshare. There are few, if any, good HTTP servers, database clients, and other libs that don’t depend on tokio or some other runtime.

                                                                                                                                    1. 7

                                                                                                                                      Use research-based methods that advantage the least prepared students

                                                                                                                                      So we should reward lack of preparation?

                                                                                                                                      Make the highest grades achievable by all students

                                                                                                                                      If everybody has the highest grade, there’s no point having a grade. It basically means the entire data range is clipped.

                                                                                                                                      Call a truce on prosecuting plagiarism on programming assignments

                                                                                                                                      So cheating is okay now too?

                                                                                                                                      In order to maintain the belief that all are equal, despite all evidence, we need to give up on all our values and integrity? People truly have lost their minds.

                                                                                                                                      1. 5

                                                                                                                                        If everybody has the highest grade, there’s no point having a grade. It basically means the entire data range is clipped.

                                                                                                                                        Ideally, it means everyone learned the material up to the highest standard. If that’s honestly what happened, it’s impossible to justify not giving everyone the highest grade.

                                                                                                                                        1. 3

If that is true, then that’s fine. But I don’t think that, in the field, you would actually have a situation where everybody has achieved the highest standard. What happens in actual practice is that some students are really bad, so the elimination of grading doesn’t remove a superfluous and pointless system, but a useful one that is honest enough to point out that some people have not achieved.

                                                                                                                                          1. 1

                                                                                                                                            It comes down to what’s on the syllabus: In order to earn a specific grade, you must demonstrate competence in this list of topics. If every student has done that for the list of topics required for an A, then not giving some of them As is a violation of the syllabus, which means the teacher is randomly violating the course’s implied contract with some of their students. If the course needs to be redesigned so fewer students get an A, that’s up to the professors to do it honestly, as opposed to picking names out of a hat and saying that these students fail even though they’re at the same level as the ones who pass.

                                                                                                                                            1. 3

                                                                                                                                              saying that these students fail even though they’re at the same level as the ones who pass.

                                                                                                                                              Does this happen?

                                                                                                                                        2. 4

                                                                                                                                          If everybody has the highest grade, there’s no point having a grade. It basically means the entire data range is clipped.

I went to a school where students were not graded at all. I am now a professional programmer with a well-paid job. An education professional is telling you that the way we are currently grading is counterproductive. Your response is ‘but it would be different in the following way’. Can you elaborate on why this is a problem?

                                                                                                                                          1. 3

A grading system is used to rank people. So 1, 2, 3, 4 would mean that 1 is more capable than 2, and 2 more than 3… If everybody is a 1, then you don’t know which one is more capable.

I think it is a matter of fact that some people are better at some things than others. And grading is feedback, both to the person and to the community as a whole, about their capability, and thus what role would give them the best comparative advantage.

                                                                                                                                            1. 2

                                                                                                                                              It’s much harder to tell where your weaknesses are relative to your competitors if you’re all getting top grades. My worry is that many weaker students will graduate without the skills they need in the market, and not even know it until too late. This is after four years and a large loan that cannot be discharged easily.

                                                                                                                                              To top it off, most(?) employers don’t look at grades anyway, so what are we achieving by this? Certainly nobody has ever asked for my grades since I graduated in the early 2000s, because even back then the market was compensating for grade inflation.

I find this argument for clipping grades puts a positive spin on what is effectively lying to the student while taking their money.

                                                                                                                                              1. 3

                                                                                                                                                It’s much harder to tell where your weaknesses are relative to your competitors if you’re all getting top grades.

Is that important? Either I can do the work or I can’t; there will always be people better or worse than me.

                                                                                                                                                My worry is that many weaker students will graduate without the skills they need in the market, and not even know it until too late. This is after four years and a large loan that cannot be discharged easily.

                                                                                                                                                This happens every day under the current system. Do you have any suggestions as to how we could resolve or minimise this problem? Free education solves the problem of the loan but the issue of graduating with insufficient skills is a much more difficult one.

                                                                                                                                                1. 2

                                                                                                                                                  Is that important?

                                                                                                                                                  You’re competing against a distribution of abilities, both when seeking and performing work.

                                                                                                                                                  Free education solves the problem of the loan but the issue of graduating with insufficient skills is a much more difficult one.

                                                                                                                                                  Agreed.

                                                                                                                                          1. 4

                                                                                                                                            I tried the C++ code with and without sorting and both versions ran in about 1.8s on my machine.

                                                                                                                                            Has something changed in compilers or hardware since 2012 when this was first posted?

                                                                                                                                            1. 6

                                                                                                                                              I have a CPU from that era (Intel i7 2600K, still going strong and I’m very happy with it) and I could reproduce it. With sort about 6.3 seconds, without sort about 16.0 seconds. My g++ version is 7.5.0. My only guess is that yes, they must have improved something about the branch prediction (in hardware or microcode?) for this effect to disappear.

                                                                                                                                              Also, wow, your machine is so much faster than mine. I didn’t know I could expect such a big single core speedup from upgrading my CPU. Maybe I should buy a new CPU after all. However, having an old CPU is really beneficial for development because if it performs well on my machine, it will perform even better on other people’s.

                                                                                                                                              1. 7

                                                                                                                                                Using old hardware for the benefit of your users. You, sir, are a hero. ∠(・`_´・ )

I thought maybe -O2 optimisation was doing something fancy, but no, it doesn’t seem to matter without -O2 either. I’m on an i7-7600U @ 2.80GHz with gcc 8.3.0, which isn’t that much newer. I wonder if I’m doing something wrong in my test.

                                                                                                                                              2. 2

                                                                                                                                                Your compiler is almost certainly using cmov (conditional move) or adc (add with carry), if it hasn’t gone vectorization crazy (e.g. some versions of clang, depending on the flags). These instructions have the same timing regardless of the value being compared, since there is no branch involved.

                                                                                                                                                If you use objdump, you’ll probably find something along these lines in your binary:

                                                                                                                                                cmp word ptr [rax], 128
                                                                                                                                                adc rdx, 0
                                                                                                                                                

                                                                                                                                                In my experience, an unrolled loop using adc is hard to beat in this context.
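For anyone checking their own disassembly, the hot loop from that benchmark is (roughly, from memory):

#include <stddef.h>

/* With optimisation enabled, compilers turn the branch into cmov/adc
 * (or vectorise the loop), so sorted and unsorted input run at the
 * same speed. */
long long conditional_sum(const int *data, size_t n) {
    long long sum = 0;
    for (size_t i = 0; i < n; ++i)
        if (data[i] >= 128)
            sum += data[i];
    return sum;
}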

                                                                                                                                                1. 2

                                                                                                                                                  Here’s the disassembly of gcc 8.3.0’s output up until the clock call at the end.

                                                                                                                                                  00000000000011a5 <main>:
                                                                                                                                                      11a5:	41 55                	push   %r13
                                                                                                                                                      11a7:	41 54                	push   %r12
                                                                                                                                                      11a9:	55                   	push   %rbp
                                                                                                                                                      11aa:	53                   	push   %rbx
                                                                                                                                                      11ab:	48 81 ec 08 00 02 00 	sub    $0x20008,%rsp
                                                                                                                                                      11b2:	49 89 e4             	mov    %rsp,%r12
                                                                                                                                                      11b5:	48 8d ac 24 00 00 02 	lea    0x20000(%rsp),%rbp
                                                                                                                                                      11bc:	00 
                                                                                                                                                      11bd:	4c 89 e3             	mov    %r12,%rbx
                                                                                                                                                      11c0:	e8 6b fe ff ff       	callq  1030 <rand@plt>
                                                                                                                                                      11c5:	99                   	cltd   
                                                                                                                                                      11c6:	c1 ea 18             	shr    $0x18,%edx
                                                                                                                                                      11c9:	01 d0                	add    %edx,%eax
                                                                                                                                                      11cb:	0f b6 c0             	movzbl %al,%eax
                                                                                                                                                      11ce:	29 d0                	sub    %edx,%eax
                                                                                                                                                      11d0:	89 03                	mov    %eax,(%rbx)
                                                                                                                                                      11d2:	48 83 c3 04          	add    $0x4,%rbx
                                                                                                                                                      11d6:	48 39 eb             	cmp    %rbp,%rbx
                                                                                                                                                      11d9:	75 e5                	jne    11c0 <main+0x1b>
                                                                                                                                                      11db:	e8 80 fe ff ff       	callq  1060 <clock@plt>
                                                                                                                                                      11e0:	49 89 c5             	mov    %rax,%r13
                                                                                                                                                      11e3:	be a0 86 01 00       	mov    $0x186a0,%esi
                                                                                                                                                      11e8:	bb 00 00 00 00       	mov    $0x0,%ebx
                                                                                                                                                      11ed:	eb 05                	jmp    11f4 <main+0x4f>
                                                                                                                                                      11ef:	83 ee 01             	sub    $0x1,%esi
                                                                                                                                                      11f2:	74 1d                	je     1211 <main+0x6c>
                                                                                                                                                      11f4:	4c 89 e0             	mov    %r12,%rax
                                                                                                                                                      11f7:	8b 08                	mov    (%rax),%ecx
                                                                                                                                                      11f9:	48 63 d1             	movslq %ecx,%rdx
                                                                                                                                                      11fc:	48 01 da             	add    %rbx,%rdx
                                                                                                                                                      11ff:	83 f9 7f             	cmp    $0x7f,%ecx
                                                                                                                                                      1202:	48 0f 4f da          	cmovg  %rdx,%rbx
                                                                                                                                                      1206:	48 83 c0 04          	add    $0x4,%rax
                                                                                                                                                      120a:	48 39 e8             	cmp    %rbp,%rax
                                                                                                                                                      120d:	75 e8                	jne    11f7 <main+0x52>
                                                                                                                                                      120f:	eb de                	jmp    11ef <main+0x4a>
                                                                                                                                                      1211:	e8 4a fe ff ff       	callq  1060 <clock@plt>
                                                                                                                                                  

                                                                                                                                                  So you’re saying cmovg is what’s removing the branching?

                                                                                                                                                  1. 4

                                                                                                                                                    (Butting in)

                                                                                                                                                    Yes, that’s right! The inner loop runs from 11f7 to 120d. It performs the addition every time, but only moves the result into the sum register if the condition is true.
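
                                                                                                                                                  In C terms, the compiler’s rewrite is roughly the following (a sketch of the semantics, not gcc’s literal output; the function names are mine, and whether you actually get a cmov depends on compiler version and flags):

                                                                                                                                                    /* Branchy form: the CPU has to predict the comparison,
                                                                                                                                                     * and mispredictions are expensive. */
                                                                                                                                                    long sum_branchy(const int *data, int n) {
                                                                                                                                                        long sum = 0;
                                                                                                                                                        for (int i = 0; i < n; i++)
                                                                                                                                                            if (data[i] > 127)
                                                                                                                                                                sum += data[i];
                                                                                                                                                        return sum;
                                                                                                                                                    }

                                                                                                                                                    /* cmov form: do the add unconditionally, then select the
                                                                                                                                                     * result. No branch, so nothing to mispredict. */
                                                                                                                                                    long sum_cmov(const int *data, int n) {
                                                                                                                                                        long sum = 0;
                                                                                                                                                        for (int i = 0; i < n; i++) {
                                                                                                                                                            long t = sum + data[i];           /* add   %rbx,%rdx */
                                                                                                                                                            sum = (data[i] > 127) ? t : sum;  /* cmp + cmovg     */
                                                                                                                                                        }
                                                                                                                                                        return sum;
                                                                                                                                                    }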

                                                                                                                                              1. 1

                                                                                                                                              If you’ve read this far and are thinking “If Redo is so great, how come it hasn’t taken over?”, here’s my answer.

                                                                                                                                              I wrote my own implementation, tested it by converting several large OSS codebases, and even built a leaderless distribution system on top of it. I was a true believer. I am no longer.

                                                                                                                                                IMHO redo hasn’t taken over for two reasons:

                                                                                                                                                1. Multiple outputs do not work; the workarounds are hacks.
                                                                                                                                                2. The build graph can’t be walked concurrently. redo uses a global lock.

                                                                                                                                              If redo inspires you, ejholmes/walk might too. There’s a big design space here.

                                                                                                                                                1. 1

                                                                                                                                                Do you consider walk an improvement over redo, and if so, how?

                                                                                                                                                  1. 1

                                                                                                                                                  I consider it a different place in the design space: it sits near make and redo, and far from Bazel. Ninja is probably somewhere in between.

                                                                                                                                                    1. 1

                                                                                                                                                    The only real difference I found was the two phases: resolving dependencies (deps) and building (exec). I don’t know what advantage that brings or whether it’s worthwhile.

                                                                                                                                                  1. 1

                                                                                                                                                  This is not a serious overview of fibers or async/await.

                                                                                                                                                    1. 1

                                                                                                                                                      Compared to?