Threads for amw-zero

    1. 2

      The primary paradigm for OLGP is interactive transactions.

      Is this meant to say OLTP, or is OLGP an acronym I haven’t heard of?

      1. 2

        Good catch, GP stands for general purpose, clarified in https://github.com/tigerbeetle/tigerbeetle/pull/2796

      2. 2

        I’m curious if anyone has any team wide practices for knowledge sharing in this area.

        I ask because “better” depends on values. For example, I’m writing a Rust tutorial and I COULD introduce more abstraction to make invalid state impossible at more layers, however that code would add indirection, and it’s unclear whether that’s meaningful or not. I chose to make the example simpler and faster to write instead of choosing to harden it against accidental misuse. Neither is better; they prioritize different values. Sometimes things are messy because the author was learning and knows better now, and sometimes they are messy because the author simply prioritized different values.

        In my ideal world developers would share strong “do this” and “not that” opinions, the always and nevers. And even some “this is okay” cases of “we can’t make this better, even though it seems suboptimal it’s fine”. Ideally listing out their values for why they feel some code is better than others, so that if we turn these feelings into a linter or shared style guide we preserve enough info to revisit old decisions/opinions. But I’ve not seen that done in a way that’s not just one dev ad-hoc writing down their preferences. I genuinely want an inclusive team-wide framework for continuous improvement and learning. Does that make sense? Any suggestions?

        The author assumes that everyone is in alignment on what the dirty tasks are and which things everyone wishes someone else would clean up, but in my experience alignment is truly the hardest part. The best devs I’ve worked with would happily nerdsnipe themselves all day every day over refactorings and tinkering. But it sometimes makes them terribly slow and the process takes forever. I feel that if we spent time learning how to improve continuously (yes, the wording strikes me as a bit on the nose) they would feel empowered to ship suboptimal things and take more risks, knowing that there was space for tidying later (I like that more than the loaded “clean” word). I know how to do that as an individual, but I don’t know how to drive that as a process as a team. Suggestions?

        1. 1

          “Cleanliness” is not formalizable. To your point, people disagree on what is or is not “clean.”

          The only antidote I’ve found for this is team-building / team “connecting.” What I mean by this is looking for the moments where you bring up a topic with a teammate, and something just clicks between you. This happened to me the other day with a teammate on the subject of Go channels for example.

          This kind of stinks, because this is a pretty fuzzy suggestion. But, I think there’s extreme value in nurturing these connections between everyone on the team. And there are things you can do to be more intentional about making them happen.

          A group code review, where you review code amongst multiple people, is another good example. It gives the team a place to analyze and comment on real code, and to share that process together to build connections.

        2. 2

          Distributed systems are flourishing. Tierless languages are what have stalled.

          Some examples of tierless languages:

          Erlang has some support for distribution built in.

          There’s a bunch I’m leaving out. It’s definitely something people keep coming back to, but I think it’s really hard.

          I tried my hand at creating one with an early incarnation of Sligh. It worked ok for very simple logic, but it’s really hard to generalize. That’s why I ended up pivoting to just focusing on the testing side of the language.

          I like the idea in theory. It just may be in the camp of visual programming languages, where it seems like the obvious way to develop, yet it simply isn’t ergonomic in practice.

          1. 2

            My take with tierless languages (well honestly Ur/Web is the one I’m most familiar with) is that they work well for the happy path but if there are any problems across the underlying tiers that the tierless abstraction bridges, debugging and inspecting these problems becomes very difficult. In fairness I haven’t used a tierless language in production (aside from some toys I built with Ur/Web which isn’t saying much.)

            My own experience in applying distributed systems has been interesting. When I first joined the industry I was still captivated by the mindshare in academia, thinking of tierless languages as interesting forward developments. The more time I spent in industry, the more I realized the real problems were always at domain boundaries, and so I spent years thinking that explicit code to work with other domains was actually useful so we could inspect issues and get ahead of them.

            As I did this for years I became fatigued at fighting the same set of problems, with slightly differing subsets, everywhere I went. I can’t count the number of times I’ve been in an incident room replaying an incident I was in 5 years earlier. It’s made me want to seek out new distributed abstractions again.

          2. 2

            No mention at all of TLA+ in the paper? TLA+ has no known axioms it relies on to my knowledge. It’s also just “simple math”, i.e. set theory and predicate logic. I didn’t see anything that was all that different from TLA+ here, except TLA+ talks way more about actual program execution, which I didn’t see that much of here.

            I mean I like the idea, for the same reason that I like TLA+: it’s actually digestible. Set theory really is simple and understandable, vs. something like dependent type theory.

            I just didn’t understand some of the criticisms, for example how is this definition of refinement different than any others? It seems like the same thing.

            1. 4

              I think TLA⁺ does have axioms, since it has to define what the temporal logic of actions is. So in A Science of Concurrent Programs Lamport goes through simpler constructions like the “Logic of Actions” which introduces sequences of states indexed by a time parameter, then on to “Raw Temporal Logic of Actions” which gets rid of the time parameter and defines the □ operator, finally ending up at the “Temporal Logic of Actions” which requires formulas to be stuttering-insensitive so introduces □[A]ᵥ and ◇⟨A⟩ᵥ. It all ultimately is built on top of the Kahn model mentioned in this paper where variables are mapped to specific values in a given state.
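
              For reference, here is a sketch of those definitions in LaTeX notation (my paraphrase, not a quote from the book; the bracketed operators are exactly what makes formulas stuttering-insensitive):

                  \begin{align*}
                  [A]_v &\;\triangleq\; A \lor (v' = v) && \text{either $A$ happens or $v$ is left unchanged (a stuttering step)} \\
                  \langle A \rangle_v &\;\triangleq\; A \land (v' \neq v) && \text{$A$ happens and $v$ actually changes} \\
                  Spec &\;\triangleq\; Init \land \Box[Next]_v \land Fairness && \text{the usual shape of a TLA specification}
                  \end{align*}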

              Lamport also cautions that TLA is weirder than ordinary predicate logic on page 69 of A Science of Concurrent Programs:

              Although elegant and useful, temporal logic is weird. It’s not ordinary math. In ordinary math, any operator Op we can define satisfies the condition, sometimes called substitutivity, that the value of an expression Op(e1, . . . , en ) is unchanged if we replace any ei by an expression equal to ei. […] However, the temporal operator □ is not substitutive.

              You’re right TLA⁺ does have a simple definition of refinement, where a system described by TLA formula P refines a system described by TLA formula Q if P ⇒ Q. However, this isn’t really unique; actually Exercise 9 of Chapter 1.1 (Volume 1) of The Art of Computer Programming has you invent a set-theoretic definition of refinement whole-cloth (it’s a fun exercise, take a look at it!).

              1. 3

                Yeah, I was mistaken; TLA does have axioms (Figure 5 in Section 5.6 of The Temporal Logic of Actions).

                I’ll check out that exercise at some point. I’m in the “refinement is one of the most important ideas in CS” camp, so any deeper insight there is always welcome.

                I am aware of some limitations of refinement, for example a program which does nothing (has no observable behaviors) refines every program vacuously. But this is just a consequence of refinement being defined in terms of subset / implication: the empty set is a subset of everything, false implies true, etc.
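
                In symbols, with $\mathrm{Beh}(P)$ for the set of behaviors of $P$, the vacuous case is just the empty set being a subset of everything:

                    \mathrm{Beh}(P) = \emptyset \;\Rightarrow\; \bigl(\mathrm{Beh}(P) \subseteq \mathrm{Beh}(Q)\ \text{for every}\ Q\bigr)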

                But I didn’t see how these refinement definitions had any benefit so far.

                1. 1

                  The vacuous refinement case is annoying because it’s definitely possible to accidentally write an inconsistent fairness constraint ensuring your program generates zero behaviors, thus vacuously satisfying any refinement relation you’re trying to check. I proposed adding a warning if this happens: https://github.com/tlaplus/tlaplus/issues/161

            2. 4

              I get the feeling that Uncle Bob developed TDD in the environment he found himself in most—with a legacy system that needs to be modified and thus, the smallest amount of code is preferred over large amounts of code. For a green-field program, I think TDD would be a miserable failure, much like Ron Jeffries’ sudoku solver.

              1. 9

                Just fyi, Bob did not invent TDD. Your thought still stands as a possibility, but wanted to point that out to readers.

                1. 3

                  TDD is test driven design. If you already have a large amount of badly written code, you can’t really design it. I’m not saying that testing is bad or that no design happens at all, just that in this context it is more of a refactoring tool than a design tool.

                  1. 6

                    As far as I know TDD means Test Driven Development, or at least that’s how I’ve always seen it. Was UB referring to Design for TDD?

                    1. 2

                      Sorry, I stand corrected. Somehow I believed all my life that the D stands for design.

                      1. 4

                        There are certainly some hardcore TDD fans who insist that TDD is a design methodology! But that’s not what it was initially coined as.

                        1. 2

                          Yes there is a school of thought that does preach the D as “design”. The idea is, you never add new code unless you’re looking at a red test, and thereby you guarantee both that your tests really do fail when they should, and that your code truly is unit testable.

                          I’m not really an advocate except in maybe rare circumstances, but that’s the idea.

                          So the original D meant “Development”, and then another camp took it further and advocated for it to mean “Design”.

                  2. 3

                    I’m curious what you mean? My experience is that TDD is much better in a greenfield project, because you’re not beholden to existing interfaces. So you can use it to experiment with what feels right a lot easier.

                    1. 2

                      In my understanding of TDD, one writes a test first, then writes only enough code to satisfy that one test. Then you write the next test, enough code to pass that test, and repeat.

                      Now, with that out of the way, a recent project I wrote was a 6809 assembler. If you were to approach the same problem, what would be your very first test? How would you approach such a project? I know me, and having to write tests before writing code would drive me insane, especially in a green field project.

                      1. 7

                        I wrote a Uxn assembler recently, and while I don’t practice TDD at all in my day-to-day, in this case it felt natural to get a sample program and a sample assembled output, add it to a test, and build enough until that test passed; then I added a second, more complex example and did the same, and so on. I ended up with 5 tests covering quite a bit of the implementation. At the start I just had the bare minimum in terms of operations and data flow. By the fifth test I had an implementation that so far has done well (it’s not perfect, but that was an explicit tradeoff; once I find limitations I’ll fix them):

                            #[test]
                            fn test_basic_assemble() {
                                let src = "|100 #01 #02 ADD BRK".to_string();
                                let program = assemble(src).unwrap();
                        
                                assert_eq!(program.symbol_table, BTreeMap::new());
                                assert_eq!(
                                    program.rom,
                                    vec![0x80, 0x01, 0x80, 0x02, 0x18, 0x00],
                                )
                            }
                        
                        1. 3

                          I’ve not done this with an assembler, but I’ve tried to do this with projects with a similar level of complexity, including a Java library for generating code at runtime. This is probably a skill issue, but I always end up with a lot of little commits, then I end up with some big design issue I didn’t anticipate and there’s a “rewrite everything” commit that ends up as a 1000 line diff.

                          I still aim to do TDD where I can, but it’s like the old 2004 saying about CSS: “spend 50 minutes, then give up and use tables.”

                        2. 4

                          First, you are totally correct that “true” TDD proponents say that you have to drive every single change with a failing test. Let me say that I don’t subscribe to that, so that might end the discussion right there.

                          But, I still believe in the value of using tests to drive development. For example, in your assembler, the first test I would write is an end to end test, probably focusing on a single instruction.

                          To get that to pass, you’ll solve so many problems that you might have spent a bunch of time going back and forth on. But writing the test gets you to make a choice. It drives the development.

                          From there, more specific components arise, and you can test those independently as you see fit. But, an assembler is an interesting example to pick, because it’s so incredibly easy to test.
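
                          Concretely, for a 6809 assembler the very first test might look something like this (a sketch in the spirit of the Uxn example upthread; assemble is a hypothetical entry point, and 0x39 is the 6809 encoding of RTS):

                              #[test]
                              fn assembles_a_single_instruction() {
                                  // Hypothetical API: assemble() takes source text and returns the emitted bytes.
                                  let rom = assemble("RTS").unwrap();
                                  // On the 6809, RTS encodes to the single byte 0x39.
                                  assert_eq!(rom, vec![0x39]);
                              }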

                          1. 1

                            First, you are totally correct that “true” TDD proponents say that you have to drive every single change with a failing test. Let me say that I don’t subscribe to that, so that might end the discussion right there.

                            Then I can counter with, “then you haven’t exactly done TDD with a greenfield project, have you?”

                            When I wrote my assembler, yes, I would write small files and work my way up, but in no way did I ever do test, fail, code, test, ad nauseam. I was hoping to get an answer from someone who does, I guess for a lack of a better term, “pure TDD” even for greenfield development, because I just don’t see it working for that. My own strawman approach to this would be:

                            #include <stdio.h>
                            #include <string.h>
                            
                            int main(void)
                            {
                              char buffer[BUFSIZ];
                              scanf("%s\n",buffer);
                              if (strcmp(buffer,"RTS") == 0)
                                putchar('\x39');
                              else
                                fprintf(stderr,"bad opcode\n");
                              return 0;
                            }
                            

                            That technically parses an opcode, generates output and passes the test “does it assemble an instruction?”

                            1. 2

                              What is the problem with the code you posted?

                              I have done “true” TDD on a greenfield project, and it was fine. It’s just an unnecessary thing to adhere to blindly. From the test you have here, you would add more cases for more opcodes, and add their functionality in turn.

                              Alternatively, you could write a test of a whole program involving many opcodes if you want to implement a bunch at once, or test something more substantial.
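
                              For instance, sticking with the hypothetical assemble() from the sketch upthread, the next test could cover a small multi-instruction program in one go; a failing variant of it is what would drive adding the next opcode:

                                  #[test]
                                  fn assembles_a_small_program() {
                                      // Still only uses opcodes the assembler already knows about.
                                      let rom = assemble("RTS\nRTS").unwrap();
                                      assert_eq!(rom, vec![0x39, 0x39]);
                                  }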

                              1. 1

                                It’s just that I would never start with that code. And if the “test” consists of an entire assembly program, can that really be TDD with a test for every change? What is a change? Or are TDD proponents more pragmatic about it than they let on? “TDD for thee, but I know best for me” type of argument.

                                1. 2

                                  Yes, you could make new tests that consist of new programs which expose some gap in functionality, and then implement that functionality.

                                  A change can be whatever you want it to be, but presumably you’re changing the code for some reason. That reason can be encoded in a test. If no new behavior is being introduced, then you don’t need to add a test, because that’s a refactor. And that’s what tests are for: to allow you to change the internal design and know if you broke existing behavior or not.

                                  1. 2

                                    I guess I was beaten over the head by my manager at my previous job. He demanded not only TDD, but the smallest possible test, and the smallest code change to get that test to pass. And if I could find an existing test to cover the new feature, all the better [1].

                                    [1] No way I was going to do that, what with 17,000+ tests.

                                    1. 2

                                      Yeah, that’s a whole different story. We were talking for a bit about what’s possible, but you’re asking whether you should do this.

                                      Dogmatic TDD is not necessary, and doesn’t even meet the desired goal of ensuring quality by checking all cases. There are better tests for getting higher coverage, for example property based tests.

                                      For me, the sweet spot is simply writing tests first when I’m struggling to make progress on something. The test gives me a concrete goal to work towards. I don’t even care about committing it afterwards.
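
                                      To make the property-based suggestion concrete: building on the Uxn test upthread, a sketch with the proptest crate could generate the operands instead of hard-coding them (assemble and the rom layout are taken from that earlier comment, so treat this as illustrative only):

                                          use proptest::prelude::*;

                                          proptest! {
                                              #[test]
                                              fn immediate_add_programs_assemble(a in any::<u8>(), b in any::<u8>()) {
                                                  // Same shape as the hand-written example, but with generated operands.
                                                  let src = format!("|100 #{:02x} #{:02x} ADD BRK", a, b);
                                                  let program = assemble(src).unwrap();
                                                  prop_assert_eq!(program.rom, vec![0x80, a, 0x80, b, 0x18, 0x00]);
                                              }
                                          }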

                    2. 7

                      I don’t get it, is it saying that test-first development is “so damn hard” because devs are lazy and stupid?

                      1. 3

                        Not only that, but they also don’t practice, and they only ship to appease management. Stupid devs.

                        1. 1

                          Sorry I didn’t get (or didn’t see) the warning about that, which I normally see.

                        2. 6

                          This was pretty much my only con with deterministic simulation testing: it’s up to you as the simulation writer to come up with fault scenarios that model the real world in any component that’s stubbed out for determinism. This isn’t a huge con, but it has been something that’s crossed my mind.

                          True integration tests are one way around that problem, by exercising the real component. It can be very difficult to create failure modes in real components though (as mentioned here, via killing / pausing processes, etc.). For example, the fsync gate in Postgres wasn’t detected for years and years, even with thousands of systems running in production. In part, this is because it was a result of hardware errors, which are nigh impossible to reproduce.

                          Still, the fault injection techniques described here are worthwhile since they will certainly uncover some bugs. My “pie in the sky” thinking is that we should design infrastructure components (down to and including the OS) to support prophecy variables, which would allow failure scenarios to be triggered in generative tests by generating values for these variables.

                          Fun to think about, but this obviously doesn’t help with testing on existing OS’s.
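
                          As a toy sketch of what I mean (hypothetical names throughout): the “prophecy” is just the future fault outcomes, generated up front by the test and handed to the component, so rare failures become ordinary, reproducible inputs.

                              use std::collections::VecDeque;

                              // Hypothetical disk wrapper whose fsync outcomes are decided ahead of time.
                              struct Disk {
                                  fsync_outcomes: VecDeque<bool>, // true = succeed, false = fail; consumed in order
                              }

                              impl Disk {
                                  fn fsync(&mut self) -> Result<(), &'static str> {
                                      match self.fsync_outcomes.pop_front() {
                                          Some(false) => Err("injected fsync failure"),
                                          _ => Ok(()), // default to success once the schedule is exhausted
                                      }
                                  }
                              }

                              #[test]
                              fn invariants_hold_under_every_short_fault_schedule() {
                                  // "Generative" in the simplest sense: enumerate every two-step schedule.
                                  for a in [true, false] {
                                      for b in [true, false] {
                                          let mut disk = Disk { fsync_outcomes: VecDeque::from(vec![a, b]) };
                                          // Drive the system under test with this disk and assert its invariants;
                                          // here we just check that the injected faults surface as errors.
                                          let mut failures = 0;
                                          for _ in 0..2 {
                                              if disk.fsync().is_err() {
                                                  failures += 1;
                                              }
                                          }
                                          let expected = [a, b].iter().filter(|ok| !**ok).count();
                                          assert_eq!(failures, expected);
                                      }
                                  }
                              }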

                          Please keep up the writing and sharing of this awesome stuff, TigerBeetle team.

                          1. 5

                            Fun to think about, but this obviously doesn’t help with testing on existing OS’s.

                            You can do some fault-injection in userland using libfiu, which uses the LD_PRELOAD trick, but I agree that it doesn’t solve all problems.

                            1. 2

                              I had never heard of libfiu, thanks for sharing. It’s an interesting design space.

                                1. 3

                                  Apparently Linux has some fault injection mechanisms as well: https://docs.kernel.org/fault-injection/fault-injection.html.

                                  This has my wheels spinning…

                                  1. 1

                                    Oh, I didn’t know about the Linux stuff. Cool, thanks.

                                    This has my wheels spinning…

                                    In what direction? I haven’t thought this through, but I wonder: if we have an interface (of the filesystem, say), we could implement that interface twice, one real implementation (that uses the POSIX open, write, read, fsync, etc.) and a fake one (that uses some in-memory representation of the filesystem). Can we somehow test the real and the fake implementations against each other? The happy path is easy, but what happens if we start injecting faults into the real implementation? We would then be forced to extend the fake so that it responds to fault injection as well, in a way that’s compatible with the real implementation. Once we’ve accounted for all faults in the fake, we can use the fake-with-faults as the basis for simulation testing a program which depends on the filesystem.

                                    Would this approach catch fsync gate? I don’t think so, but I believe it could still be useful.
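
                                    A minimal sketch of the shape I have in mind (all names hypothetical): a storage trait, an in-memory fake with a knob for injected fsync failures, and a scenario that is generic over the trait so the same checks could run against the fake and, eventually, a real POSIX-backed implementation.

                                        use std::collections::HashMap;
                                        use std::io;

                                        // The interface both implementations share.
                                        trait Storage {
                                            fn write(&mut self, path: &str, data: &[u8]) -> io::Result<()>;
                                            fn fsync(&mut self, path: &str) -> io::Result<()>;
                                        }

                                        // Fake: in-memory files plus deterministic fault injection.
                                        struct FakeStorage {
                                            files: HashMap<String, Vec<u8>>,
                                            fail_next_fsync: bool,
                                        }

                                        impl Storage for FakeStorage {
                                            fn write(&mut self, path: &str, data: &[u8]) -> io::Result<()> {
                                                self.files
                                                    .entry(path.to_string())
                                                    .or_default()
                                                    .extend_from_slice(data);
                                                Ok(())
                                            }
                                            fn fsync(&mut self, _path: &str) -> io::Result<()> {
                                                if self.fail_next_fsync {
                                                    self.fail_next_fsync = false;
                                                    return Err(io::Error::new(io::ErrorKind::Other, "injected fault"));
                                                }
                                                Ok(())
                                            }
                                        }

                                        // Contract scenario: generic over the trait, so the same steps can be
                                        // replayed against the fake and against the real implementation.
                                        fn write_then_fsync<S: Storage>(storage: &mut S) -> io::Result<()> {
                                            storage.write("wal", b"hello")?;
                                            storage.fsync("wal")
                                        }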

                                    1. 1

                                      I linked this above, but this isn’t talked about until the end of the post so here it is again: https://concerningquality.com/prophecy-variables/. If you look at the “Prophecy-Aware Dependencies…” section, this implements the exact approach you’re talking about here, minus the comparison of the fake to the real implementation.

                                      What I was thinking is, if we have fault injection built into the dependency in a deterministic way, then in theory we don’t even need a fake because we can control even the most rare error cases. Building the fake here was suggested because it’s more likely that you have no control over nondeterminism in any existing dependency today, so in the mean time you can wrap it with one of these modal fakes. And furthermore, errors are just one special case of nondeterminism. I’d like to do something like control the order of thread completions so concurrency could be controlled in tests.

                                      Basically, with enough of these in place, a real dependency could be used in a simulation test, because its determinism could be controlled.

                          2. 2

                            Is this like a property based testing approach on the real system?

                            The post says generative, so I assume the workloads are built up dynamically, which sounds pretty sick.

                            1. 4

                              Yep that’s what it sounds like. Take a look at Jepsen too: https://jepsen.io/.

                              Which has applied this to many popular databases to check for correctness guarantees.

                            2. 1

                              Boolean is itself a bad name, because Boolean logic is concerned with 0 and 1, not with true or false values.

                              In logic, the results of propositions are most often called “truth values.” There’s nothing wrong with the fact that this refers to one of the variants (true); treating that as an issue was introduced without justification. “TruthValue” would be a perfectly fine sum type name for these.

                              1. 4

                                Boolean is itself a bad name, because Boolean logic is concerned with 0 and 1, not with true or false values.

                                From the opening of its Wikipedia page: “First, the values of the variables are the truth values true and false, usually denoted 1 and 0”.

                                The name “boolean” works better for me. Good naming is about exploiting the already established “muscle-memory” associations in your audience rather than literal descriptiveness. That is, familiarity trumps correctness. If you were starting with greenfield brains, then sure “TruthValue” would be better. But ofc we’re not starting with greenfield brains.

                              2. 7

                                @tuxes do you have an Atom/RSS feed I can subscribe to for the next post?

                                1. 4

                                  I’ve been meaning to. For now my site is just a collection of HTML files. I’ll reply to your comment when I post the next one, or I add an RSS feed.

                                    1. 1

                                      Mildly unrelated, but are you just using an RSS reader app? I’m big into the personal blog space, so I’m always looking for the best way to interact with content.

                                      1. 4

                                        Yeah, I use a self-hosted reader to be able to read my feeds from my phone or a computer without dealing with sync issues. I used to use FreshRSS years ago, then transitioned to Miniflux (when FreshRSS wasn’t packaged/a module on NixOS) and that’s what I’m using to this day.

                                      1. 2

                                        whoa, TIL that search looks through the text of the articles!

                                        1. 1

                                          I don’t think there’s any issue with reposting, especially after a year.

                                          New conversations can be had.

                                          1. 4

                                            I post links to previous conversations because they are interesting, not as a criticism. (I greatly dislike comments that try to police submissions.)

                                        2. 31

                                          You’ve reached a new level of language success when the top 2 posts on any site are:

                                          • a post about a rewrite to the language
                                          • a post about criticisms of the language
                                          1. 17

                                            Yep, it indicates relevance and care. I’m happy to see Zig succeeding.

                                          2. 12

                                            Hi, one of the authors here. Following this work around the existing French Tax Code, we are now working on a domain-specific language improving transparency and maintainability of legal computations, called Catala.

                                            We also have a few papers around it, including the semantics of Catala, some fun stuff around date computation, and automatic testcase generation by concolic execution.

                                            1. 3

                                              I’ve had this idea for ages and was told several times that it wasn’t possible. I wouldn’t be surprised if some of your team has heard the same. Very glad they didn’t listen! This is awesome.

                                              1. 1

                                                France has had tax code compilers for quite a while. I think it’s probably not possible in the US because the US tax code is really ambiguous (or at least that’s my understanding as a layperson).

                                                1. 1

                                                  That very well could be the case, yeah. Not sure what that says about our tax code but I don’t see that as a positive.

                                                  1. 2

                                                    Yes, I agree. The ambiguity in the US system allows billionaires to pay far less tax, so it’s not likely to go anywhere.

                                              2. 2

                                                I knew this looked familiar. I stumbled on the automatic test case generation part a while back, which is a big area of interest of mine.

                                                1. 1

                                                  Is the DGFiP “codebase” actually available beyond the small snippets you present in your papers? Seems like it would be a fun peek into a system that … in theory shouldn’t be harmful to expose to the world?

                                                2. 3

                                                  This is a very nice resulting API. OCaml code strikes the right balance between type system power (a good amount, but not all the way to dependent types) while also having a really clean syntax (imo).

                                                  And I do like this better than the other referenced query builders, since it totally avoids building up queries as strings. This is the sweet spot of query builders to me - why use one if there’s going to be excessive string mangling?

                                                  1. 1

                                                    I really like OCaml (especially the module system as I slowly learn more about it), although I’m not sure I would describe the syntax as really clean :)

                                                  2. 1

                                                    What’s the benefit of this over creating materialized views by hand?

                                                    1. 5

                                                      To give the canonical concrete example, imagine you have tables of stories and votes, and you have a query that gives you (story, count(votes)). To refresh a normal materialized view, you’re going to have to read the entirety of each table.

                                                      In Readyset, views/caches are incrementally maintained. When a new vote is added, Readyset knows that to update the view, all it needs to do is add one to the old vote count, a much simpler and cheaper operation. Additionally, views/caches are partial, which is to say that not every (story, count(votes)) tuple will be stored in memory, only the recently accessed ones, so if a write affects a result row that is not cached, no update work will be performed.
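
                                                      A rough sketch of the difference, using a hypothetical in-process cache for the (story, count(votes)) query (this is an illustration of the idea, not Readyset’s actual implementation):

                                                          use std::collections::HashMap;

                                                          // Partial, incrementally maintained cache of
                                                          // SELECT story_id, count(*) FROM votes GROUP BY story_id
                                                          struct VoteCountCache {
                                                              counts: HashMap<u64, u64>, // only recently read stories are present
                                                          }

                                                          impl VoteCountCache {
                                                              // A new vote touches one entry instead of rescanning the votes table.
                                                              fn on_vote_inserted(&mut self, story_id: u64) {
                                                                  if let Some(count) = self.counts.get_mut(&story_id) {
                                                                      *count += 1;
                                                                  }
                                                                  // If the story is not cached, do nothing now; its count is computed
                                                                  // on demand (and then cached) the next time it is read.
                                                              }
                                                          }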

                                                      (I work for Readyset.)

                                                        1. 1

                                                          I’m always interested to see work in this area. How would you say Noria/Readyset compares to, e.g., Materialize, DBSP/Feldera, and pg_ivm? I.e., what advantages and disadvantages does Readyset have compared to those, and when would it be more appropriate to use Readyset vs one of those?

                                                        2. 2

                                                          One benefit is the automated updating of the cache, which you have to manage yourself when using a materialized view. This is often non-trivial.

                                                          1. 2

                                                            I haven’t had my coffee yet, but can’t you just refresh the view using triggers on the changed tables? Is the trigger management the non-trivial bit?

                                                            1. 2

                                                              It definitely can be non-trivial, if you want the most optimized performance. Rebuilding the materialized view has a cost, so doing it too often is not ideal.

                                                              For example, consider tables A, B, and C, and columns a, b, and c on each of those tables. On day 1, you make a materialized view that references A.a and B.c. On day 2, this is updated to also reference C.b, which requires adding an additional trigger for that column. On day 3, only C.b is referenced, which requires dropping the triggers on A.a and B.c.

                                                              Calculating the “data flow” dependencies would automatically manage these triggers rather than relying on hand-inspecting the query for data dependencies. Doing it manually is error prone, and can result in both too many or not enough columns being observed for changes.

                                                              1. 2

                                                                Consider a materialized view built from source tables with hundreds of billions of rows. Materializing the view for all rows in a traditional DBMS like Postgres may take weeks to months. In such a case, if your trigger just refreshes the whole view when a source table changes, you will never get its results.

                                                                Depending on the view or trigger, you may be able to manually implement incremental view maintenance, but it’s not always obvious how to write the trigger. Particularly views that do a lot of joins are hard to maintain by hand.

                                                          2. 22

                                                            I remember John Carmack describing in one of his Doom 3 talks how he was shocked to discover that he made a mistake in the game loop that caused one needless frame of input latency. To his great relief, he discovered it just in time to fix it before the game shipped. He cares about every single millisecond. Meanwhile, the display server and compositing window manager introduce latency left and right. It’s painful to see how the computing world is devolving in many areas, particularly in usability and performance.

                                                            1. 17

                                                              He cares about every single millisecond. Meanwhile, the display server and compositing window manager introduce latency left and right.

                                                              I will say the unpopular-but-true thing here: Carmack probably was wrong to do that, and you would be just as wrong to adopt that philosophy today. The bookkeeping counting-bytes-and-cycles side of programming is, in the truest Brooksian sense, accidental complexity which we ought to try to vanquish in order to better attack the essential complexity of the problems we work on.

                                                              There are still, occasionally, times and places when being a Scrooge, sitting in his counting-house and begrudging every last ha’penny of expenditure, is forced on a programmer, but they are not as common as is commonly thought. Even in game programming – always brought up as the last bastion of Performance-Carers who Care About Performance™ – the overwhelming majority very obviously don’t actually care about performance the way Carmack or Muratori do, and don’t have to care and haven’t had to for years. “Yeah, but will it run Crysis?” reached meme status nearly 20 years ago!

                                                              The point of advances in hardware has not been to cause us to become ever more Scrooge-like, but to free us from having to be Scrooges in the first place. Much as Scrooge himself became a kindly and generous man after the visitation of the spirits, we too can become kinder and have more generous performance budgets after being visited by even moderately modern hardware.

                                                              (And the examples of old software so often held up as paragons of Caring About Performance are basically just survivorship bias anyway – the average piece of software always had average performance for and in its era, and we forget how much mediocre stuff was out there while holding up only one or two extreme outliers which were in no way representative of programming practice at the time of their creation.)

                                                              1. 35

                                                                There is certainly a version of performance optimization where the juice is not worth the squeeze, but is there any indication that Carmack’s approach fell into that category? The given example of “a mistake in the game loop that caused one needless frame of input latency” seems like a bug that definitely should have been fixed.

                                                                I’m having a hard time following your reasons for saying Carmack was “wrong” to care so much about performance. Is there some way in which the world would be better if he didn’t? Are you saying he should have cared about something else more?

                                                                1. 14

                                                                  16ms of input latency is enormous for a fast-paced, mouse-driven game; definitely something the player can notice.

                                                                2. 18

                                                                  There are different kinds of complexity. Everything in engineering is about compromises. If you decide to trade some latency for some other benefit, that’s fine. If you introduce latency because you weren’t modelling it in your trade-off space, that’s quite another.

                                                                  1. 8

                                                                    the overwhelming majority very obviously don’t actually care about performance the way Carmack or Muratori do, and don’t have to care and haven’t had to for years. “Yeah, but will it run Crysis?” reached meme status nearly 20 years ago!

                                                                    The number of people complaining about game performance in literally any game forum, Steam reviews / comments / whatnot obviously shows that to be wrong. Businesses don’t care about performance, but actual human beings do care; the problem is the constantly increasing disconnect between business and people.

                                                                    1. 3

                                                                      Minecraft – the best-selling video game of all time – is known for both its horrid performance and for being almost universally beloved by players.

                                                                      The idea that “business” is somehow forcing this onto people (especially when Minecraft started out and initially exploded in popularity as an indie game with even worse performance than it has today) is just not supported by empirical reality, sorry.

                                                                      1. 8

                                                                        But the success is despite the game’s terrible performance, not thanks to it. Or do you think that if you asked people whether they would prefer Minecraft to be faster, they would say no? If it were not a problem, then a mod that makes a marginal performance improvement certainly would not have 10M downloads: https://modrinth.com/mod/moreculling . So people definitely do care; they just don’t have a choice, because if you want to play “minecraft” with your friends this is your only option. Just like, for instance, Slack, Gitlab or Jira are absolutely terrible, but you don’t have a choice but to use them because that’s where your coworkers are.

                                                                        1. 5

                                                                          I don’t know of any game that succeeded because of its great performance, but I know of plenty that have succeeded despite their horrible performance. While performance can improve player satisfaction, for games it’s a secondary measure of success, and it’s foolish to focus on it without the rest of the game being good to play. It’s the case for most other software as well - most of the time, it’s “do the job well, in a convenient-to-use way, and preferably fast”. There are fairly few problems where the main factor in solving them with software is speed first.

                                                                          1. 2

                                                                            I don’t know of any game that succeeded because of their great performance,

                                                                            … every competitive shooter? Do you think Counter-Strike would have succeeded if it had the performance of, say, Neverwinter Nights 2?

                                                                            1. 1

                                                                              Bad performance can kill a decent game. Good performance cannot bring success to an otherwise mediocre game. If it worked that way, my simple games that run at ~1000FPS would have taken over the world already.

                                                                          2. 3

                                                                            Or do you think if you asked people if they would prefer minecraft to be faster they would say no ?

                                                                            Even if a game was written by an entire army of Carmacks and Muratoris squeezing every last bit of performance they could get, people would almost certainly answer “yes” to “would you prefer it to be faster”. It’s a meaningless question, because nobody says no to it even when the performance is already very good.

                                                                            And the fact that Minecraft succeeded as an indie game based on people loving its gameplay even though it had terrible performance really and truly does put the lie to the notion that game dev is somehow a unique performance-carer industry or that people who play games are somehow super uniquely sensitive to performance. Gamers routinely accept things that are way worse than the sins of your least favorite Electron app or React SPA.

                                                                            1. 6

                                                                              I think a more generous interpretation of the hypothetical would be to phrase the question as: “Do you think the performance of Minecraft is a problem?”

                                                                              In that scenario, I would imagine that even people who love the game would likely say yes. At the same time, if you asked that question about some Carmack-ified game, you might get mostly “no” responses.

                                                                              1. 1

                                                                                Gamers routinely accept things

                                                                                How is accepting things an argument for anything? We are better than this as a species.

                                                                                1. 1

                                                                                  Can you clarify the claim that you are making, and why the chosen example has any bearing on it? Obviously gaming is different from other industries in some ways and the same in other ways.

                                                                          3. 7

                                                                            I think the Scrooge analogy only works in some cases. Scrooge was free to become more generous because he was dealing with his own money. In the same way, when writing programs that run on our own servers, we should feel free to trade efficiency for other things if we wish. But when writing programs that run on our users’ machines, the resources, whether RAM or battery life, aren’t ours to take, so we should be as sparing with them as possible while still doing what we need to do.

                                                                            Unfortunately, that last phrase, “while still doing what we need to do”, is doing a lot of work there. I have myself shipped a desktop app that uses Electron, because there was a need to get it out quickly, both to make money for my (small, bootstrapped) company and to solve a problem which no other product has solved. But I’ve still put in some small efforts here and there to make the app frugal for an Electron app, while not nearly as frugal as it would be if it were fully native.

                                                                            1. 6

                                                                            I used to be passionate about this too, but I really think villainizing accidental complexity is a false idol. Accidental complexity is the domain of the programmer. We will always have to translate some idealized functionality into a physically executable system. And that system should be fast. And that will always mean reorganizing the data structures and algorithms to be more performant.

                                                                              My point of view today is that implementation details should be completely embraced, and we should build software that takes advantage of its environment to the fullest. The best way to do this while also retaining the essential complexity of the domain is by completely separating specification from implementation. I believe we should be writing executable specifications and using them in model-based tests on the real implementation. The specifications disregard implementation details, making them much smaller and more comprehensible.

                                                                              I have working examples of doing this if this sounds interesting, or even farfetched.
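
                                                                            For a flavor of what I mean, here’s a toy sketch (not one of those working examples; everything below is hypothetical): the specification is a tiny model, and a model-based test replays the same operations against both the model and the implementation.

                                                                                #[derive(Clone, Copy)]
                                                                                enum Op {
                                                                                    Deposit(u32),
                                                                                    Withdraw(u32),
                                                                                }

                                                                                // Specification: a balance is just a number, and overdrafts are rejected.
                                                                                fn model(balance: u64, op: Op) -> u64 {
                                                                                    match op {
                                                                                        Op::Deposit(n) => balance + u64::from(n),
                                                                                        Op::Withdraw(n) if u64::from(n) <= balance => balance - u64::from(n),
                                                                                        Op::Withdraw(_) => balance,
                                                                                    }
                                                                                }

                                                                                // Stand-in for the real implementation under test.
                                                                                struct Account {
                                                                                    balance: u64,
                                                                                }

                                                                                impl Account {
                                                                                    fn apply(&mut self, op: Op) {
                                                                                        match op {
                                                                                            Op::Deposit(n) => self.balance += u64::from(n),
                                                                                            Op::Withdraw(n) => {
                                                                                                let n = u64::from(n);
                                                                                                if n <= self.balance {
                                                                                                    self.balance -= n;
                                                                                                }
                                                                                            }
                                                                                        }
                                                                                    }
                                                                                }

                                                                                #[test]
                                                                                fn implementation_matches_model() {
                                                                                    // In practice the operation sequence would be generated, not hand-written.
                                                                                    let ops = [Op::Deposit(10), Op::Withdraw(3), Op::Withdraw(100)];
                                                                                    let mut expected = 0u64;
                                                                                    let mut account = Account { balance: 0 };
                                                                                    for op in ops {
                                                                                        expected = model(expected, op);
                                                                                        account.apply(op);
                                                                                        assert_eq!(account.balance, expected);
                                                                                    }
                                                                                }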

                                                                              1. 3

                                                                                I agree with this view. I used to be enamored by the ideas of Domain Driven Design (referring to the code implementation aspects here and not the human aspects) and Clean/Hexagonal Architecture and whatever other similar design philosophies where the shape of your actual code is supposed to mirror the shape of the domain concepts.

                                                                                One of the easiest ways to break that spell is to try to work on a system with a SQL database where there are a lot of tables with a lot of relations, where ACID matters (e.g., you actually understand and leverage your transaction isolation settings), and where performance matters (e.g., many records, can’t just SELECT * from every JOINed table, etc).

                                                                                I don’t know where I first heard the term, but I really like to refer to “mechanical sympathy”. Don’t write code that exactly mirrors your business logic; your job as a programmer is to translate the business logic into machine instructions, not to translate business logic into business logic. So, write instructions that will run well on the machine.

                                                                              2. 3

                                                                                Everything is a tradeoff. For example, in C++, when you create a vector and grow it, it is automatically zeroed. You could improve performance by using a plain array that you allocate yourself. I usually forgo this optimization because it costs time and often makes the code more unpleasant to work with. I also don’t go and optimize the assembly by hand, unless there is no other way to achieve what I want. With that being said, performance is a killer feature and lack of performance can kill a product. We absolutely need developers who are more educated in performance matters. Performance problems don’t just cripple our own industry, they cripple the whole world which relies on software. I think the mindset you described here is defeatist and, if it proliferates, will lead to worse software.

                                                                                1. 12

                                                                                  You could improve performance by using a plain array that you allocate yourself.

                                                                                This one isn’t actually clear cut. Most modern CPUs do store allocate in L1. If you write an entire L1 line in the window of the store buffer, it will materialise the line in L1 without fetching from memory or a remote cache (just sending out some broadcast invalidates if the line is in someone else’s cache). If you zero, this will definitely happen. If you don’t and initialise piecemeal, you may hit the same optimisation, but you may end up pulling in data from memory and then overwriting it.

                                                                                If the array is big and you do this, you may find that it’s triggering some page faults eagerly to allocate the underlying storage. If you were going to use only a small amount of the total space, this will increase memory usage and hurt your cache. If you use all of it, then the kernel may see that you’ve rapidly faulted on two adjacent pages and handle a bit more eagerly in the page fault handler. This pre-faulting may also move page faults off some later path and reduce jitter.

                                                                                  Both approaches will be faster in some settings.

                                                                                  1. 4

                                                                                    Ah, you must be one of those “Performance-Carers who Care About Performance™” ;)

                                                                                    Both approaches will be faster in some settings.

                                                                                    This is so often the case, and it always worries me that attitudes like the GP lead to people not even knowing about how to properly benchmark and performance analyse anymore. Not too long ago I showed somebody who was an L4 SWE-SRE at Google a flamegraph - and he had never seen one before!

                                                                                    1. 11

                                                                                      Ah, you must be one of those “Performance-Carers who Care About Performance™” ;)

                                                                                      Sometimes, and that’s the important bit. Performance is one of the things that I can optimise for, sometimes it’s not the right thing. I recently wrote a document processing framework for my next book. It runs all of its passes in Lua. It simplifies memory management by doing a load of copies of std::string. For a 200+ page book, well under one second of execution time is spent in all of that code, the vast majority is spent in libclang parsing all of the C++ examples and building semantic markup from them. The code is optimised for me to be able to easily add lowerings from new kinds of semantic markup to semantic HTML or typeset PDF, not for performance.

                                                                                      Similarly, a lot of what I work on now is an embedded platform. Microcontrollers are insanely fast relative to memory sizes these days. The computers I learned to program on had a bit less memory, but CPUs that were two orders of magnitude slower. So the main thing I care about is code and data size. If an O(n) algorithm is smaller than an O(log(n)) one, I may still prefer it because I know n is probably 1, and never more than 8 in a lot of cases.

                                                                                      But when I do want to optimise for performance, I want to understand why things are slow and how to fix it. I learned this lesson as a PhD student, where my PhD supervisor gave me some code that avoided passing things in parameters down deep function calls and stored them in globals instead. On the old machine he’d written it for, that was a speedup. Parameters were all passed on the stack and globals were fast to access (no PIC, load a global was just load from a hard-coded address). On the newer machines, it meant things had to go via a slower sequence for PC-relative loads and the accesses to globals impeded SSA construction and so inhibited a load of optimisation. Passing the state down as parameters kept it in registers and enabled local reasoning in the compiler. Undoing his optimisation gave me a 20% speedup. Introducing his optimisation gave him a greater speedup on the hardware that he originally used.

                                                                                      1. 1

                                                                                        This is so often the case, and it always worries me that attitudes like the GP lead to people not even knowing about how to properly benchmark and performance analyse anymore.

                                                                                        I know how to and I teach it to people I work with. Just recently at work I rebuilt a major service, cut the DB queries it was doing by a factor of about 4 in the process, and it went from multi-second to single-digit-millisecond p95 response times.

                                                                                        But I also don’t pull constant all-nighters worrying that there might be some tiny bit of performance I left on the table, or switching from “slow” to “faster” programming languages, or really any of the stuff people always allege I ought to be doing if I really “care about performance”. I approach a project with a reasonable baseline performance budget, and if I’m within that then I leave it alone and move on to the next thing. I’m not going to wake up in a cold sweat wondering if maybe I could have shaved another picosecond somewhere.

                                                                                        And the fact that you can’t really respond to or engage with criticism of hyper-obsession with performance (or, you can but only through sneering strawmen) isn’t really helpful, y’know?

                                                                                        1. 2

                                                                                          And the fact that you can’t really respond to or engage with criticism of hyper-obsession with performance (or, you can but only through sneering strawmen) isn’t really helpful, y’know?

                                                                                          How were we supposed to know that you were criticizing “hyper-obsession” that leads to all-nighters, worry, and loss of sleep over shaving off picoseconds? From your other post it sounded like you were criticizing Carmack’s approach, and I haven’t seen any indication that it corresponds to the “hyper-obsession” you describe.

                                                                                          Where’s the strawman really?

                                                                                      2. 2

                                                                                        This one isn’t actually clear cut.

                                                                                        I did a consulting gig a few years ago where just switching from zero-initialising with std::vector to pre-zeroed memory from calloc was a double-digit % improvement on Linux.
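
                                                                                        Roughly the kind of change being described, sketched with hypothetical names and sizes; the point is that std::vector writes the zeros itself, while a large calloc on Linux is typically satisfied with pages the kernel has already zeroed.

                                                                                        ```cpp
                                                                                        #include <cstddef>
                                                                                        #include <cstdlib>
                                                                                        #include <vector>

                                                                                        constexpr size_t N = 256 * 1024 * 1024; // hypothetical 256 MiB scratch buffer

                                                                                        void use_vector_zeroed() {
                                                                                            // value-initialises every byte: the program itself writes the zeros
                                                                                            std::vector<unsigned char> buf(N);
                                                                                            // ... use buf ...
                                                                                        }

                                                                                        void use_calloc_zeroed() {
                                                                                            // large callocs are typically served as fresh zero pages by the kernel,
                                                                                            // so there is no explicit pass over the whole buffer
                                                                                            unsigned char *buf = static_cast<unsigned char *>(calloc(N, 1));
                                                                                            if (!buf) return;
                                                                                            // ... use buf ...
                                                                                            free(buf);
                                                                                        }
                                                                                        ```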

                                                                                    2. 3

                                                                                      I think the answer is somewhere in the middle: should game programmers in general care? Maybe that’s too broad a statement. Does id Software, producer of top-of-the-class, extremely fast shooters, benefit from someone who cares so deeply about making sure their games are super snappy? Probably yes.

                                                                                    3. 5

                                                                                      You think that’s bad? Consider the advent of “web apps” for everything.

                                                                                      On anything other than an M-series Apple computer they feel sluggish, even with absurd computer specifications. The largest improvement I felt going from an i9-9900K to an M1 was that Slack suddenly felt like a native app; going back to my old PC felt like going back to the ’90s.

                                                                                      I would love to dig into why.

                                                                                      1. 11

                                                                                        The bit that was really shocking to me was how ctrl-1 and ctrl-2 (switching Slack workspaces) took around a second on a powerful AMD laptop on Linux.

                                                                                        At work we use Matrix/Element. It has its share of issues but the performance isn’t nearly as bad.

                                                                                        1. 8

                                                                                          I don’t really see how switching tabs inside a program is related to the DRM subsystem, or to Kernel Mode Setting.

                                                                                          1. 1

                                                                                            I thought they were mentioning Ctrl-Alt-F1/F2 switching (virtual terminals), which indeed used to be slow.

                                                                                            My bad.

                                                                                      2. 7

                                                                                        There is a wide spectrum of performance in Electron apps, although it’s mostly VS Code versus everyone else. VS Code is not particularly snappy, but it’s decent. Discord also feels faster than other messengers. The rest of the highly interactive webapps in use are unbearably sluggish.

                                                                                        So I think these measured 6ms are irrelevant. I’m on Wayland Gnome and everything feels snappy except highly interactive webapps. Even my 10-year-old laptop felt great, but I retired it because some webapps were too painful (while compiling Rust felt… OK? Also browsing non-JS content sites was great).

                                                                                        Heck, my favorite comparison is to run Q2 on WASM. How can that feel so much snappier than a chat application like Slack?

                                                                                        1. 12

                                                                                          I got so accustomed to the latency that when I use something with nearly zero latency (e.g. an ’80s computer with a CRT), I get the surreal impression that the character appeared before I pressed the button.

                                                                                          1. 4

                                                                                            I had the same feeling recently with a Commodore 64.

                                                                                            It really was striking how a computer with less power than the microcontroller in my keyboard could feel so fast, but obviously when you actually give it an instruction to think about, the limitations of the computer are clear.

                                                                                            EDIT: Oh hey, I wasn’t kidding.

                                                                                            The CPU in my keyboard is 16MHz: ControllerBoard Microcontroller PDF Datasheet

                                                                                            The CPU in the Commodore 64 I was using was 0.9–1 MHz: https://en.wikipedia.org/wiki/MOS_Technology_6510

                                                                                        2. 4

                                                                                          As a user on smaller platforms without native apps, I will gladly take a web app or PWA over no access.

                                                                                          In the ’90s almost everything for personal computers was running Microsoft Windows on x86, with almost everyone running at one of the 5 different screen resolutions, so it was more reasonable to make a singular app for a singular CPU architecture & call it a day. Also security was an afterthought. To support all of these newer platforms, architectures, & device types, & to have the code in a sandbox, going the HTML + CSS + JavaScript route is a tradeoff many are willing to take for portability since browsers are ubiquitous. The weird thing is that a web app doesn’t have to be slow, & not every application has demands that warrant a native release.

                                                                                          1. 10

                                                                                            Having been around the BSD and Linux block 20+ years ago, I share the sentiment. Quirky and/or slow apps are annoying, but still better than no apps at all.

                                                                                            Besides, as far as UIs go, “native” is just… a moderately useful description at this point. macOS is the only one that’s sort of there, but that wasn’t always the case in all this time, either (remember when it shipped with two toolkits and three themes?). Windows has like three generations of UI toolkits, and one of the two on which the *nix world has mostly converged is frequently used along with things like Kirigami, making it native in the sense that it all eventually goes through some low-level Qt drawing code and color schemes kind of work, but that’s about it.

                                                                                            Don’t get me wrong, I definitely prefer a unified “native” experience; even several native options were tolerable, like back when you could tell a Windows 3.x-era application from other Windows 98 applications because the Open file… dialog looked different and whatnot, but keybindings were generally the same, widgets mostly worked the same etc.

                                                                                            But that’s a lost cause, this is not how applications are developed anymore – both because developers have lost interest in it and because most platform developers (in the wide sense – e.g. Microsoft) have lost interest in it. A rich, native framework is one of the most complex types of software to maintain, with some of the highest validation and maintenance costs. Building one, only to find almost everyone avoids it due to portability or vendor lock-in concerns unless they literally don’t have a choice, and that even then they try to use as little of it as humanly possible, is not a very good use of already scant resources in an age where most of the profit is in mobile and services, not desktop.

                                                                                            You can focus on the bad and point out that the vast majority of Electron applications out there are slow, inconsistent, and their UIs suck. Which is true, but you can also focus on the good and point out that the corpus of Electron applications we have now is a lot wider and more capable than their Xaw/Motif/Wx/Xforms/GTK/Qt/a million others – such consistency, much wow! – equivalents from 25 years ago, whose UIs also sucked.

                                                                                      3. 3

                                                                                        Ooh, someone’s implemented an IDE where I can display only the functions I care about, instead of entire files filled with tons of “related functions” that probably aren’t actually related at all to the task at hand.

                                                                                        Not sure about the rest of it, but that’s something I’ve thought we should have for a long time.

                                                                                        1. 3

                                                                                          This idea has been in the back of my head for some time, and honestly I feel like multiple systems have implemented it in different ways; I just don’t recall one that’s truly nailed it. But symbols in the program should be queryable so that you can create different “views” for many different contexts. I believe that’s what’s going on here too, but I’ve never actually checked this out.
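
                                                                                          As a rough illustration of what “queryable symbols” could mean (purely hypothetical, not how the linked tool works), a view is just a filter over a symbol index:

                                                                                          ```cpp
                                                                                          #include <functional>
                                                                                          #include <string>
                                                                                          #include <vector>

                                                                                          // hypothetical flat symbol index, e.g. produced by a language server or tags tool
                                                                                          struct Symbol {
                                                                                              std::string name;
                                                                                              std::string file;
                                                                                              int line;
                                                                                              std::vector<std::string> tags; // e.g. {"parser", "hot-path"}
                                                                                          };

                                                                                          // a "view" is the subset of symbols matching a predicate
                                                                                          std::vector<Symbol> view(const std::vector<Symbol> &index,
                                                                                                                   const std::function<bool(const Symbol &)> &pred) {
                                                                                              std::vector<Symbol> out;
                                                                                              for (const auto &s : index)
                                                                                                  if (pred(s)) out.push_back(s);
                                                                                              return out;
                                                                                          }

                                                                                          // usage: show only the functions tagged for the task at hand
                                                                                          // auto parser_view = view(index, [](const Symbol &s) {
                                                                                          //     for (const auto &t : s.tags) if (t == "parser") return true;
                                                                                          //     return false;
                                                                                          // });
                                                                                          ```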

                                                                                        2. 1

                                                                                          One of my favorite niche sub-genres of games is “programmable algorithm games.” Allowing these dungeon exploration algorithms to be written in almost any language is a cool riff on that genre.

                                                                                          1. 2

                                                                                            First of all, the idea of generating a test that checks the correctness of some other generated code is the notion of certifying compilation (also sometimes called “self-certifying”). It’s one of my favorite concepts! This is more common in formal verification, for example in the Cogent language, which generates proofs for verified filesystem implementations. But tests (especially property-based tests) can be seen as a proxy for such proofs, so generating tests instead is the same idea. I think it’s super cool and something we should do more of.
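
                                                                                            (A toy sketch of that idea, assuming nothing about the linked article’s implementation: a “compiler” that returns both the artifact and a property-based check derived from the same spec.)

                                                                                            ```cpp
                                                                                            #include <algorithm>
                                                                                            #include <functional>
                                                                                            #include <random>

                                                                                            struct Generated {
                                                                                                std::function<int(int)> impl;        // the "compiled" artifact we would ship
                                                                                                std::function<bool()>   certificate; // a test certifying it against the spec
                                                                                            };

                                                                                            // hypothetical example: specialise a clamp function for given bounds and also
                                                                                            // emit a property-based check, acting as a (weaker) stand-in for a proof
                                                                                            Generated compile_clamp(int lo, int hi) {
                                                                                                auto impl = [lo, hi](int x) { return std::min(std::max(x, lo), hi); };

                                                                                                auto certificate = [impl, lo, hi]() {
                                                                                                    std::mt19937 rng(42);
                                                                                                    std::uniform_int_distribution<int> dist(-1000, 1000);
                                                                                                    for (int i = 0; i < 10000; ++i) {
                                                                                                        int x = dist(rng);
                                                                                                        int y = impl(x);
                                                                                                        if (y < lo || y > hi) return false;             // output stays in range
                                                                                                        if (x >= lo && x <= hi && y != x) return false; // identity inside range
                                                                                                    }
                                                                                                    return true;
                                                                                                };
                                                                                                return {impl, certificate};
                                                                                            }
                                                                                            ```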

                                                                                            Second, I ran into this exact problem when trying to generate more interesting test values than random ones as inputs to property-based tests. I was looking for a combinatorial approach that could produce lots of combinations based on “categories” of values, but the solution I came up with wasn’t very generalizable and suffered from this same problem:

                                                                                            Seems ok, until we find out that the function returns only one combined value with only one test value per attribute

                                                                                            Using generators seems like a really good solution to this problem!
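
                                                                                            For what it’s worth, here’s a hedged sketch (hypothetical types and values) of the combinatorial version: enumerate category representatives per attribute and take the full cartesian product, so every combination becomes a test input rather than only one value per attribute.

                                                                                            ```cpp
                                                                                            #include <cstdint>
                                                                                            #include <limits>
                                                                                            #include <string>
                                                                                            #include <vector>

                                                                                            // hypothetical input type for a property-based test
                                                                                            struct UserInput {
                                                                                                std::string name;
                                                                                                int64_t     amount;
                                                                                            };

                                                                                            std::vector<UserInput> all_combinations() {
                                                                                                // category representatives per attribute (made-up values)
                                                                                                const std::vector<std::string> names   = {"", "a", std::string(256, 'x')};
                                                                                                const std::vector<int64_t>     amounts = {0, 1, -1,
                                                                                                                                          std::numeric_limits<int64_t>::max()};

                                                                                                std::vector<UserInput> out;
                                                                                                for (const auto &n : names)
                                                                                                    for (const auto a : amounts)
                                                                                                        out.push_back({n, a}); // 3 * 4 = 12 combined test inputs
                                                                                                return out;
                                                                                            }
                                                                                            ```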