1. 5

    Few people are excited about both monads and macros; only one person that I know comes to mind.

    Plenty of Haskell projects use both Template Haskell (macros) and monads; probably every single large-scale Haskell program includes both. Not only is there nothing wrong with mixing the two: amusingly enough, Template Haskell runs in a monad! So not only are both useful at the same time in Haskell, you don’t get macros without monads. Meanwhile, some of the most popular Haskell libraries, like lens (Control.Lens.TH) and aeson (Data.Aeson.TH), use macros. I end up using these in every Haskell program I ever write.

    Aside from that, the two concepts aren’t mathematically related in any way. They definitely aren’t both about information hiding.

    The author sadly gets both monads and macros totally wrong.

    Monads let you concentrate on functions by providing a sort of side channel for information associated with those functions

    Maybe is a monad. It sequences computations that can fail. There is no side channel of information at all there; a computation either fails or it doesn’t. Not to mention Sum, or functions ((->) r), which are also monads and can’t be thought of as having any side channel – they compose actions. Or Product, which combines functors. This is not a good mental model for monads.
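
    To make that concrete, here is a rough sketch of Maybe-style sequencing transliterated into Go generics. Everything here is invented for illustration (Option, Some, None, AndThen; Haskell’s real Maybe and >>= are the model): failure simply short-circuits, and no extra information rides along.

    ```go
    package main

    import (
    	"fmt"
    	"strconv"
    )

    // Option is a hypothetical Maybe-like type: a value or nothing.
    type Option[T any] struct {
    	val T
    	ok  bool
    }

    func Some[T any](v T) Option[T] { return Option[T]{val: v, ok: true} }
    func None[T any]() Option[T]    { return Option[T]{} }

    // AndThen plays the role of monadic bind (>>=): it sequences a
    // computation that can fail after one that may already have failed.
    func AndThen[T, U any](o Option[T], f func(T) Option[U]) Option[U] {
    	if !o.ok {
    		return None[U]() // failure short-circuits; nothing is smuggled along
    	}
    	return f(o.val)
    }

    func parse(s string) Option[int] {
    	n, err := strconv.Atoi(s)
    	if err != nil {
    		return None[int]()
    	}
    	return Some(n)
    }

    func recip(n int) Option[float64] {
    	if n == 0 {
    		return None[float64]()
    	}
    	return Some(1 / float64(n))
    }

    func main() {
    	fmt.Println(AndThen(parse("4"), recip))  // {0.25 true}
    	fmt.Println(AndThen(parse("0"), recip))  // {0 false}
    	fmt.Println(AndThen(parse("xx"), recip)) // {0 false}
    }
    ```

    Note there is no “side channel” anywhere in AndThen: the only information flowing is the value itself, or the fact of failure.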

    Macros let you hide the fact that your programming language doesn’t have features that you would like

    This is at best the C view of macros: “hack stuff together because C sucks”. It’s not the view of macros that you have in any language with a proper macro system (Lisp, Scheme, Haskell, etc.). Macros let you treat code as data, so you can do things like perform static analysis on the code, write code that generates programs, etc.
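
    Go has no macro system, but its standard library’s code-as-data tooling gives a flavor of the static-analysis half of this claim: parse source text into a syntax tree and inspect it as ordinary data. This is only a sketch (countFuncs is a made-up helper; go/parser and go/ast are the real packages):

    ```go
    package main

    import (
    	"fmt"
    	"go/ast"
    	"go/parser"
    	"go/token"
    )

    // countFuncs parses Go source text and counts its function
    // declarations -- ordinary library code operating on code-as-data.
    func countFuncs(src string) int {
    	fset := token.NewFileSet()
    	f, err := parser.ParseFile(fset, "example.go", src, 0)
    	if err != nil {
    		return 0
    	}
    	n := 0
    	ast.Inspect(f, func(node ast.Node) bool {
    		if _, ok := node.(*ast.FuncDecl); ok {
    			n++
    		}
    		return true
    	})
    	return n
    }

    func main() {
    	src := "package p\nfunc add(a, b int) int { return a + b }\nfunc sub(a, b int) int { return a - b }\n"
    	fmt.Println(countFuncs(src)) // 2
    }
    ```

    A real macro system goes further (the analyzed code can also be rewritten or generated in place), but the “code is just data” starting point is the same.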

    Macros and monads are totally unrelated ideas. Macros let you modify code as if it were data. Monads sequence actions in interesting ways.

    1.  

      Monads let you concentrate on functions by providing a sort of side channel for information associated with those functions

      This is not a good mental model for monads.

      It resembles my own mental model. Monads sweep repetitive logic under the rug and let you focus on a chain of computations. Methods from particular monads—not part of the Monad interface itself—let you access or interact with that submerged logic.

      My other mental model is “statements, but made out of expressions”

      1.  

        Beware your mental models—they may be incorrect. My own mental model for monads – that they are new types with no constants, so they disallow optimizations like constant propagation or the assumption that values aren’t volatile – does explain the IO, random, and sequential sequencing monads, but that’s not how they actually work (it’s an incorrect mental model).

    1. 2

      In our analysis, we figured out that the most frequent category of bugs caught by End-to-End tests was schema violations

      Is a schema violation something more subtle than “this was supposed to have a foo field but it doesn’t”? Or “it was supposed to be a string but it’s null”?

      How was that the common bug caught by end to end testing? Did they completely not have unit tests before? This reads like a straw man dreamed up by some kind of static typing zealot.

      (I say this as a static typing zealot.)

      1. 3

        How was that the common bug caught by end to end testing? Did they completely not have unit tests before?

        I think here it’s because the data is coming from a different service, so the E2E test fails if one service changes its data format and the client doesn’t. A unit test (or static typing) wouldn’t catch that, because the two services are different codebases.

        1. 1

          Is it common practice to change your end points in place?

          What I’ve always seen done is that each end point either never changes or only changes in backwards compatible ways. Then the server and client can independently use unit tests or types to check that they satisfy or understand the schema of that end point.

          Actually, for monoglot code bases, I’ve mostly seen services provide a client-side library that handles the versioning internally and exposes an unversioned API to the client linking it.

        2. 1

          It might be something like “it’s supposed to be a valid NANP phone number, but it’s not” (experienced that one at work).

        1. 3

          The fact that it works at all is amazing. However, 6502 is a really tough target for compiled languages. Even something as basic as having a standard function calling convention is expensive.

          1. 2

            Likewise, I’m very impressed it works. Aside from you correctly pointing out how weak stack operations are on the 6502, however, it doesn’t generate even vaguely idiomatic 6502 assembly. That clear-screen extract was horrible.

            1. 2

              The 6502 is best used by treating zero page as a lot of registers, with the same kind of calling convention as modern RISC (and x86_64) use: some number of registers that are used for passing arguments and return values and for temporary calculations inside a function (so that leaf functions don’t have to save anything), plus a certain number of registers that are preserved over function calls and that you have to save and restore if you want to use them. The rest of zero page can be used for globals, the same as .sdata referenced from a Global Pointer register on machines such as RISC-V or Itanium.

              If you do that then the only stack accesses needed are pushes and pops of a set of registers. If you generate the code appropriately then you only have to know to save N registers on function entry and restore the same N and then return on function exit. You can use a small set of special subroutines for that, saving code size. RISC-V does exactly the same thing with the -msave-restore option to gcc or clang.

              Of course for larger programs you’ll want to implement your own stack (using two zero page locations as the stack pointer) for the saved registers. 256 bytes should be enough for just the function return addresses.

              1. 1

                But I wonder how much of the zero page you can use without stepping on the locations reserved for ROM routines, particularly on the Apple II. It’s been almost three decades since I’ve done any serious programming on the Apple II, but didn’t its ROM reserve some zero-page locations for allowing redirection of ROM I/O routines? If I were programming for that platform today, I’d still want to use those routines, so that, for example, the Textalker screen reader (used in conjunction with the Echo II card) would work. My guess is that similar considerations would apply on the C64.

                1. 1

                  The monitor doesn’t use a lot. AppleSoft uses a lot more, but that’s ok because it initialises what it needs on entry.

                  https://pbs.twimg.com/media/E_xJ5oWUYAAUo3a?format=jpg&name=4096x4096

                  Seems a shame now to have defaced the manual, but in my defence I did it 40 years ago.

                2. 1

                  Now that I’ve looked into the implementation, I see they’re doing something like this, but using only 4 zero page bytes as caller-saved registers. This is nowhere near enough!

                  Even 32 bit ARM uses 4 registers, which should probably translate to 8 bytes on 6502 (four pointers or 16 bit integers).

                  x86_64, which has the same number of registers as arm32, uses six argument registers. RISC-V uses 8 argument registers, plus another 7 “temporary” registers which a called function is free to overwrite. PowerPC uses 8 argument registers.

                  6502 effectively has 128 16-bit registers (the size of pointers or int). There is no reason why you shouldn’t be at least as generous with argument and temporary registers as the RISC ISAs that have 32 registers.

                  I’d suggest maybe 16 bytes for caller-save (arguments), 16 bytes for temporaries, 32 bytes for callee-save. That leaves 192 bytes for globals (2 bytes of which will be the software stack pointer).

                  1. 1

                    Where are you going to save them? In the 256-byte stack the 6502 has? Even if the stack weren’t limited, you still only have at most 65,536 bytes of memory to work with.

                    1. 1

                      It would be cool to see how this stuff would work if it were built to expect bank-switching hardware.

                      1. 1

                        I quote myself:

                        Of course for larger programs you’ll want to implement your own stack (using two zero page locations as the stack pointer) for the saved registers. 256 bytes should be enough for just the function return addresses.

                        64k of total memory is of course a fundamental limitation of the 6502, so it’s irrelevant to the details of code generation and calling convention you use. Other than that, you want as compact code as possible, of course.

                  2. 2

                    GEOS has a pretty interesting calling convention for some of its functions (e.g. used at https://github.com/mist64/geowrite/blob/main/geoWrite-1.s#L82): Given that there’s normally no concurrency, and little recursive code, arguments can be stored directly in code:

                    jsr function
                    .byte arg1
                    .byte arg2
                    

                    function then picks apart the return address to get at the arguments, then moves it forward before returning to skip over the data. A recursive function (where the same call site might be re-entered before leaving, with different arguments) would have to build a trampoline on a stack or something like that:

                    lda #argcnt
                    jsr trampoline
                    .word function
                    .byte arg1
                    ...
                    .byte argcnt
                    

                    where trampoline creates jsr function, a copy of the arguments + rts on the stack, messes with the return address to skip the arguments block, then jumps to that newly created contraption. But I’d rather just avoid recursive functions :-)

                    1. 1

                      Having to need self-modifying code to deal with function calls is reminding me of the PDP-8, which didn’t even have a stack - you had to modify code to put your return address in.

                      1. 1

                        Are those the actual arguments and self-modifying code is used to get non-constant data there? Or are the various .byte values the address to find the argument, in Zero Page?

                        That’s pretty compact at the call site, but a lot of work in the called function to access the arguments. It would be ok for big functions that are expensive anyway, but on 6502 you probably (for code compactness) want to call a function even for something like adding two 32 bit (or 16 bit) integers.

                        e.g. to add a number at address 30-31 into a variable at address 24-25 you’d have at the caller …

                            jsr add16
                            .byte 24
                            .byte 30
                        

                        … and at the called function …

                        add16:
                            pla
                            sta ARGP
                            tax
                            pla
                            sta ARGP+1
                            tay
                            clc
                            txa
                            adc #2
                            pha
                            tya
                            adc #0
                            pha
                            ldy #0
                            lda (ARGP),y
                            tax
                            iny
                            lda (ARGP),y
                            tay
                        
                        add16_q:
                            clc
                            lda $0000,y
                            adc $00,x
                            sta $00,x
                            lda $0001,y
                            adc $01,x
                            sta $01,x
                            rts
                        

                        So the stuff between add16 and add16_q is 26 bytes of code and 52 clock cycles. The stuff in add16_q is 16 bytes of code and 28 clock cycles. The call to add16 is 5 bytes of code and 6 clock cycles.

                        It’s possible to replace everything between add16 and add16_q with a jsr to a subroutine called, perhaps, getArgsXY. That will save a lot of code (because it will be used in many such subroutines) but add even more clock cycles – 12 for the JSR/RTS plus more code to pop/save/load/push the 2nd return address on the stack (26 cycles?).

                        But there’s another way! And this is something I’ve used myself in the past.

                        Keep add16_q and change the calling code to…

                            ldx #24
                            ldy #30
                            jsr add16_q
                        

                        That’s 7 bytes of code instead of 5 (bad), and 10 clock cycles instead of 6 – but you get to entirely skip the 52 clock cycles of code at add16 (maybe 90 cycles if you call a getArgsXY subroutine instead).

                        You may quite often be able to omit the load immediate of X or Y because one or the other might be the same as the previous call, reducing the calling sequence to 5 bytes.

                        If there’s some way to make add16 more efficient I’d be interested to know, but I’m not seeing it.

                        Maybe you could get rid of all the PLA/PHA and use TSX;STX usp;LDX #1;STX usp+1 to duplicate the stack pointer in a 16-bit pointer in Zero Page, grab the return address using LDA instead of PLA, and increment the return address directly on the stack. It’s probably not much better, if at all.

                        1. 1

                          These calling conventions are provided for some functions only, and mostly the expensive ones. From the way it’s implemented for BitmapUp, without looking too closely at the macros, it seems they store the return address at a known address and index through that.

                          GEOS has pretty complex functions and normally uses virtual registers in the zero page, so I guess this is more an optimization for constant calls: no need for endless lists of lda #value; sta $02; ... in your code. As GEOS then copies the arguments into the virtual registers and just calls the regular function, the only advantage of the format is compactness.

                    1. 34

                      return err is almost always the wrong thing to do. Instead of:

                      if err := foo(); err != nil {
                      	return err
                      }
                      

                      Write:

                      if err := foo(); err != nil {
                      	return fmt.Errorf("fooing: %w", err)
                      }
                      

                      Yes, this is even more verbose, but doing this is what makes error messages actually useful. Deciding what to put in the error message requires meaningful thought and cannot be adequately automated. Furthermore, stack traces are not adequate context for user-facing, non-programming errors. They are verbose, leak implementation details, are disrupted by any form of indirection or concurrency, etc.
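
                      To make the payoff concrete, here is a hedged sketch of how two levels of such wrapping compose into one readable message. The names errTimeout, reserveRoom, and createRoom are invented for illustration; fmt.Errorf with %w is the real stdlib mechanism:

                      ```go
                      package main

                      import (
                      	"errors"
                      	"fmt"
                      )

                      var errTimeout = errors.New("request timed out")

                      // reserveRoom stands in for a database call that fails.
                      func reserveRoom() error {
                      	return fmt.Errorf("reserving room in database %q: %w", "rooms-db", errTimeout)
                      }

                      // createRoom adds its own layer of context before returning.
                      func createRoom() error {
                      	if err := reserveRoom(); err != nil {
                      		return fmt.Errorf("creating room: %w", err)
                      	}
                      	return nil
                      }

                      func main() {
                      	fmt.Println(createRoom())
                      	// creating room: reserving room in database "rooms-db": request timed out
                      }
                      ```

                      Each frame contributes one clause of context, so the final message reads as a narrative rather than a bare "request timed out".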

                      Even with proper context, having lots of error paths like this is potentially a code smell. It means you probably have broader error-strategy problems. I’d try to give some advice on how to improve the code the author provided, but it is too abstract to provide any useful insights.

                      1. 18

                        I disagree on a higher level. What we really want is a stacktrace so we know where the error originated, not manually dispensed breadcrumbs…

                        1. 32

                          maybe you do, but I prefer an error chain that was designed. A Go program rarely has just one stack, because every goroutine is its own stack. Having the trace of just that one stack isn’t really a statement about the program as a whole, since there are many stacks, not one.

                          Additionally, stack traces omit the parameters to the functions at each frame, which means that understanding the error means starting with your stack trace, then bouncing all over your code, reading it and running it in your head, in order to understand the trace. This is even more annoying if you’re looking at an error several days later in a heterogeneous environment, where you may have the additional complication of figuring out which version of the code was running when that trace originated. Or you could just have an error like “failed to create a room: unable to reserve room in database ‘database-name’: request timed out” or something similar.

                          Additionally, hand-crafted error chains are often much easier to understand for people who operate but don’t author something; they may have never seen the code before, so understanding exactly what a stack trace means may be difficult for them, especially if they’re not familiar with the language.
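
                          For what it’s worth, a designed chain stays machine-checkable too. A small sketch (loadConfig is a made-up name; fmt.Errorf’s %w verb and errors.Is are the real stdlib mechanisms):

                          ```go
                          package main

                          import (
                          	"errors"
                          	"fmt"
                          	"os"
                          )

                          // loadConfig is a hypothetical operation that annotates a
                          // low-level failure with what the program was trying to do.
                          func loadConfig() error {
                          	if _, err := os.Open("/no/such/config"); err != nil {
                          		return fmt.Errorf("loading config: %w", err)
                          	}
                          	return nil
                          }

                          func main() {
                          	err := loadConfig()
                          	fmt.Println(err) // human-readable, top-down story
                          	// ...and still inspectable by code through the chain:
                          	fmt.Println(errors.Is(err, os.ErrNotExist)) // true
                          }
                          ```

                          So the hand-crafted chain isn’t just prose: callers can still branch on the underlying cause without parsing strings.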

                          1. 6

                            I dunno. Erlang and related languages give you back a stack trace (with parameters) in concurrently running processes no problem

                            1. 5

                              It’s been ages since I wrote Erlang, but I remember that back then I rarely wanted a stack trace. My stacks were typically 1-2 levels deep: each process had a single function that dispatched messages and did a small amount of work in each one. The thing that I wanted was the state of the process that had sent the unexpected message. I ended up with some debugging modes that attached the PID of the sending process and some other information so that I could reconstruct the state at the point where the problem occurred. This is almost the same situation as Go, where you don’t want the stack trace of the goroutine, you want to capture a stack trace of the program at the point where a goroutine was created and inspect that at the point where the goroutine failed.

                              This isn’t specific to concurrent programs, though it is more common there; it’s similar for anything written in a dataflow / pipeline style. For example, when I’m debugging something in clang’s IR generation I often wish I could go back and see what had caused that particular AST node to be constructed during parsing or semantic analysis. I can’t, because all of the state associated with that stack is long gone.

                          2. 10

                            FWIW, I wrote a helper that adds tracing information.

                            I sort of have two minds about this. On the one hand, yeah, computers are good at tracking stack traces, why are we adding them manually and sporadically? OTOH, it’s nice that you can decide if you want the traces or not and it gives you the ability to do higher level things like using errors as response codes and whatnot.

                            The thing that I have read about in Zig that I wish Go had is an error trace, which is different from the stack trace: it shows how the error was created, not how the error propagates back to the execution error boundary, which is not very interesting in most scenarios.
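
                            A helper like the one mentioned above can be sketched in a few lines. This is only an illustration (the name trace is made up; runtime.Caller and %w wrapping are the real mechanisms):

                            ```go
                            package main

                            import (
                            	"fmt"
                            	"runtime"
                            )

                            // trace wraps err with the file:line of its caller, building an
                            // ad-hoc error trace as the error bubbles up.
                            func trace(err error) error {
                            	if err == nil {
                            		return nil
                            	}
                            	_, file, line, ok := runtime.Caller(1)
                            	if !ok {
                            		return err
                            	}
                            	return fmt.Errorf("%s:%d: %w", file, line, err)
                            }

                            func main() {
                            	err := trace(fmt.Errorf("i/o timeout"))
                            	fmt.Println(err) // a file:line prefix followed by the original message
                            }
                            ```

                            Because it wraps with %w, callers can opt in per call site and still unwrap the underlying error programmatically.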

                            1. 7

                              The nice thing about those error traces is that they end where the stack trace begins, so it’s seamless to the point that you don’t even need to know that they are a thing, you just get exactly the information that otherwise you would be manually looking for.

                            2. 8

                              In a multiprocess system that’s exchanging messages: which stack?

                              1. 2

                                see: erlang

                              2. 5

                                You don’t want stack traces; you want to know what went wrong.

                                A stack trace can suggest what may have gone wrong, but an error message that declares exactly what went wrong is far more valuable, no?

                                1. 8

                                  An error message is easy, we already have that: “i/o timeout”. A stack trace tells me the exact code path that led to that error. Building up a string of breadcrumbs that led to that timeout is just a poorly implemented, ad-hoc stack trace.

                                  1. 5

                                    Indeed and I wouldn’t argue with that. I love a good stack trace, but I find they’re often relied upon in lieu of useful error messages and I think that’s a problem.

                                    1. 2

                                      Building up a string of breadcrumbs that led to that timeout is just a poorly implemented, ad-hoc stack trace.

                                      That’s a bit of an over-generalization. A stack trace is inherently a story about the construction of the program that originated the error, while an error chain is a story about the events that led to an error. A stack trace can’t tell you what went wrong if you don’t have access to the program’s source code in the way that a hand crafted error chain can. A stack trace is more about where an error occurred, while an error chain is more about why an error occurred. I think they’re much more distinct than you are suggesting.

                                      and of course, if people are just bubbling up errors without wrapping them, yeah you’re going to have a bad time, but I think attacking that case is like suggesting that every language that has exceptions encourages Pokémon exception handling. That’s a bad exception-handling pattern, but I don’t think that the possibility of this pattern is a fair indictment of exceptions generally. Meanwhile you’re using examples of bad error handling practices that are not usually employed by Go programmers with more than a few weeks experience to indict the entire paradigm.

                                  2. 4

                                    Stack traces are expensive to compute and inappropriate to display to most users. Also, errors aren’t exceptions.

                                    1. 1

                                      That’s why Swift throws errors instead. Exceptions immediately abort the program.

                                    2. 3

                                      What really is the “origin” of an error? Isn’t that somewhat arbitrary? If the error comes from a system call, isn’t the origin deeper in the kernel somewhere? What if you call in to a remote, 3rd party service. Do you want the client to get the stack trace with references to the service’s private code? If you’re using an interface, presumably the purpose is to abstract over the specific implementation. Maybe the stack trace should be truncated at the boundary like a kernel call or API call?

                                      Stack traces are inherently an encapsulation violation. They can be useful for debugging your internals, but they are an anti-feature for your users debugging their own system. If your user sees a stack trace, that means your program is bugged, not theirs.

                                      1. 5

                                        I get a line of logging output: error: i/o timeout. What do I do with that? With Ruby, I get a stack trace which tells me exactly where the timeout came from, giving me a huge lead on debugging the issue.

                                        1. 5

                                          I get a line of logging output: error: i/o timeout. What do I do with that?

                                          Well, that’s a problem you fix by annotating your errors properly. You don’t need stack traces.

                                          1. 3

                                            When your Ruby service returns an HTTP 500, do you send me the stack trace in the response body? What do I do with that?

                                            Go will produce stack traces on panics as well, but that’s precisely the point here: these are two different things. Panics capture stack traces as a “better than nothing” breadcrumb trail for when the programmer has failed to account for a possibility. They are for producers of code, not consumers of it.

                                          2. 2

                                            There’s definitely competing needs between different audiences and environments here.

                                            A non-technical end user doesn’t want to see anything past “something went wrong on our end, but we’re aware of it”. Well, they don’t even want to see that.

                                            A developer wants to see the entire stack trace, or at least have it available. They probably only care about frames in their own code at first, and maybe will want to delve into library code if the error truly doesn’t seem to come from their code or is hard to understand in the first place.

                                            A technical end user might want to see something in-between: they don’t want to see “something was wrong”. They might not even want to see solely the outer error of “something went wrong while persisting data” if the root cause was “I couldn’t reach this host”, because the latter is something they could actually debug within their environment.

                                        2. 9

                                          This is one reason I haven’t gone back to Go since university - There’s no right way to do anything. I think I’ve seen a thousand different right ways to return errors.

                                          1. 9

                                          Lots of pundits say lots of stuff. One good way to learn good patterns (I won’t call them “right”) is to look at real code by experienced Go developers. For instance, if you look at https://github.com/tailscale/tailscale you’ll find pervasive use of fmt.Errorf. One thing you might not see – at least not without careful study – is how to handle code with lots of error paths. That is by its very nature harder to see, because you have to read and understand what the code is trying to do and what has to happen when something goes wrong in that specific situation.

                                            1. 6

                                              there is a right way to do most things; but it takes some context and understanding for why.

                                              the mistake is thinking go is approachable for beginners; it’s not.

                                              go is an ergonomic joy for people that spend a lot of time investing in it, or bring a ton of context from other languages.

                                              for beginners with little context, it is definitely a mess.

                                              1. 9

                                                I thought Go was for beginners, because Rob Pike doesn’t trust programmers to be good.

                                                1. 18

                                                  I’d assume that Rob Pike, an industry veteran, probably has excellent insight into precisely how good the average programmer at Google is, and what kind of language will enable them to be productive at the stuff Google makes. If this makes programming language connoisseurs sad, that’s not his problem.

                                                  1. 9

                                                    Here’s the actual quote:

                                                    The key point here is our programmers are Googlers, they’re not researchers. They’re typically, fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.

                                                    So I have to wonder who is capable of understanding a “brilliant language” …

                                                    1. 8

                                                      So I have to wonder who is capable of understanding a “brilliant language” …

                                                      Many people. They don’t work at Google at an entry-level capacity, that’s all.

                                                      There’s a subtle fallacy at work here - Google makes a lot of money, so Google can afford to employ smart people (like Rob Pike!) It does not follow that everyone who works at Google is, on average, smarter than anyone else.

                                                      (edited to include quote)

                                                      1. 8

                                                        Let’s say concretely we are talking about OCaml. Surely entry-level Googlers are capable of understanding OCaml. Jane Street teaches it to all new hires (devs or not) in a two-week bootcamp. I’ve heard stories of people quickly becoming productive in Elm too.

                                                        The real meaning of that quote is not ‘entry-level Googlers are not capable of it’, it’s ‘We don’t trust them with it’ and ‘We’re not willing to invest in training them in it’. They want people to start banging out code almost instantly, not take some time to ramp up.

                                                        1. 8

                                                          Let’s say concretely we are talking about OCaml. Surely entry-level Googlers are capable of understanding OCaml. Jane Street teaches it to all new hires (devs or not) in a two-week bootcamp.

                                                          I suspect that Jane Street’s hiring selects for people who are capable of understanding OCaml; I guarantee that the inverse happens and applicants interested in OCaml self select for careers at Jane Street, just like Erlang-ers used to flock towards Ericsson.

                                                          Google has two orders of magnitude more employees than Jane Street. It needs a much bigger funnel and is likely far less selective in hiring. Go is “the law of large numbers” manifest as a programming language. That’s not necessarily bad, just something that is important for a massive software company and far less important for small boutiques.

                                                          1. 2

                                                            applicants interested in OCaml self select for careers at Jane Street,

                                                            As I said, they teach it to all hires, including non-devs.

                                                            Google has two orders of magnitude more employees than Jane Street. It needs a much bigger funnel and is likely far less selective in hiring

                                                            Surely though, they are not so loose that they hire Tom, Dick, and Harry off the street. Why don’t we look at an actual listing and check? E.g. https://careers.google.com/jobs/results/115367821606560454-software-developer-intern-bachelors-summer-2022/

                                                            Job title: Software Developer Intern, Bachelors, Summer 2022 (not exactly senior level)

                                                            Minimum qualifications:

                                                            Pursuing a Bachelor’s degree program or post secondary or training experience with a focus on subjects in software development or other technical related field.

                                                            Experience in Software Development and coding in a general purpose programming language.

                                                            Experience coding in two of C, C++, Java, JavaScript, Python or similar.

                                                            I’m sorry but there’s no way I’m believing that these candidates would be capable of learning Go but not OCaml (e.g.). It’s not about their capability, it’s about what Google wants to invest in them. Another reply even openly admits this! https://lobste.rs/s/yjvmlh/go_ing_insane_part_one_endless_error#c_s3peh9

                                                            1. 2

                                                              And I remember when Google would require at minimum a Masters Degree before hiring.

                                                              1. 1

                                                                I had a master’s degree in engineering (though not in CS) and I couldn’t code my way out of a paper bag when I graduated. Thankfully no-one cared in Dot Com Bubble 1.0!

                                                            2. 3

                                                              They want people to start banging out code almost instantly, not take some time to ramp up.

                                                              Yes, and? The commodification of software developers is a well-known trend (and goal) of most companies. When your assets are basically servers, intangible assets like software and patents, and the people required to keep the stuff running, you naturally try to lower the costs of hiring and paying salary, just like you try to have faster servers and more efficient code.

                                                              People are mad at Rob Pike, but he just made a language for Google. It’s not his fault the rest of the industry thought “OMG this is the bee’s knees, let’s GO!” and adopted it widely.

                                                              1. 1

                                                                Yes, I agree that the commodification of software developers is prevalent today. And we can all see the result, the profession is in dire straits–hard to hire because of bonkers interview practices, hard to keep people because management refuses to compensate them properly, and cranking out bugs like no tomorrow.

                                                              2. 2

                                                                on the contrary, google provides a ton of ramp up time for new hires because getting to grips with all the internal infrastructure takes a while (the language is the least part of it). indeed, when I joined a standard part of the orientation lecture was that whatever our experience level was, we should not expect to be productive any time soon.

                                                                what go (which I do not use very much) might be optimising for is a certain straightforwardness and uniformity in the code base, so that engineers can move between projects without having to learn essentially a new DSL every time they do.

                                                                1. 1

                                                                  You may have a misconception that good programming languages force people to ‘essentially learn a new DSL’ in every project. In any case, as you yourself said, the language is the least part of the ramp-up of a new project, so even if that bit were true, it’s still optimizing for the wrong thing.

                                                                  1. 1

                                                                    no, you misunderstood what i was getting at. i was saying that go was optimising for straightforwardness and uniformity so that there would be less chance of complex projects evolving their own way of doing things, not that better languages would force people to invent their own DSLs per project.

                                                                    also the ramp-up time i was referring to was for new hires; a lot of google’s internal libraries and services are pretty consistently used across projects (and even languages via bindings and RPC) so changing teams requires a lot less ramp up than joining google in the first place.

                                                                    1. 1

                                                                      i was saying that go was optimising for straightforwardness and uniformity so that there would be less chance of complex projects evolving their own way of doing things,

                                                                      Again, the chances of that happening are not really as great as the Go people seem to be afraid it is, provided we are talking about a reasonable, good language. So let’s say we leave out Haskell or Clojure. The fear of language-enabled complexity seems pretty overblown to me. Especially considering the effort put into the response, creating an entirely new language and surrounding ecosystem.

                                                        2. 9

                                                          No, Rob observed, correctly, that in an organization of 10,000 programmers, the skill level trends towards the mean. And so if you’re designing a language for this environment, you have to keep that in mind.

                                                          1. 4

                                                            it’s not just that. It’s a language that has to reconcile the reality that skill level trends toward the mean, with the fact that the way that google interviews incurs a selection/survival bias towards very junior programmers who think they are the shit, and thus are very dangerous with the wrong type of power.

                                                            1. 4

                                                              As I get older and become, presumably, a better programmer, it really does occur to me just how bad I was for how long. I think because I learned how to program as a second grader, I didn’t get how much of a factor “it’s neat he can do it all” was in my self-assessment. I was pretty bad, but since I was being compared to the other kids who did zero programming, it didn’t matter that objectively I was quite awful, and I thought I was hot shit.

                                                            2. 4

                                                              Right! But the cargo-cult mentality of the industry meant that a language designed to facilitate the commodification of software development for a huge, singular organization escaped and was inflicted on the rest of us.

                                                              1. 4

                                                                But let’s be real for a moment:

                                                                a language designed to facilitate the commodification of software development

                                                                This is what matters.

                                                                It doesn’t matter if you work for a company of 12 or 120,000: if you are paid to program – that is, you are not a founder – the people who sign your paychecks are absolutely doing everything within their power to make you and your coworkers just cogs in the machine.

                                                                So I don’t think this is a case of “the little fish copying what big bad Google does” as much as it is an essential quality of being a software developer.

                                                                1. 1

                                                                  Thank you, yes. But also, the cargo cult mentality is real.

                                                            3. 2

                                                              Go is for compilers, because Google builds a billion lines a day.

                                                        3. 2

                                                          return errors.Wrapf(err, "fooing %s", bar) is a bit nicer.

                                                          1. 13

                                                            That uses the non-standard errors package and has been obsolete since 1.13: https://stackoverflow.com/questions/61933650/whats-the-difference-between-errors-wrapf-errors-errorf-and-fmt-errorf

                                                            1. 1

                                                              Thanks, that’s good to know.

                                                            2. 8

                                                              return fmt.Errorf("fooing %s %w", bar, err) is idiomatic.

                                                              1. 8

                                                                Very small tweak: normally you’d include a colon between the current message and the %w, to separate error messages in the chain, like so:

                                                                return fmt.Errorf("fooing %s: %w", bar, err)
                                                                
                                                            3. 1

                                                              It makes error messages useful but if it returns a modified err then I can’t catch it further up with if err == someErr, correct?

                                                              1. 2

                                                                You can use errors.Is to check wrapped errors - https://pkg.go.dev/errors#Is

                                                                Is unwraps its first argument sequentially looking for an error that matches the second. It reports whether it finds a match. It should be used in preference to simple equality checks

                                                                1. 2

                                                                  Thanks! I actually didn’t know about that.

                                                                2. 2

                                                                  Yes, but you can use errors.Is and errors.As to solve that problem. These use errors.Unwrap under the hood. This error chaining mechanism was introduced in Go 1.13 after being incubated in the “errors” package for a long while before that. See https://go.dev/blog/go1.13-errors for details.
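
                                                                  To make that concrete, here is a minimal runnable sketch; the ErrNotFound sentinel and the lookup function are invented for illustration:

                                                                  ```go
                                                                  package main

                                                                  import (
                                                                  	"errors"
                                                                  	"fmt"
                                                                  )

                                                                  // ErrNotFound is a hypothetical sentinel error, used here only for illustration.
                                                                  var ErrNotFound = errors.New("not found")

                                                                  // lookup wraps the sentinel with %w so callers can still match it.
                                                                  func lookup(key string) error {
                                                                  	return fmt.Errorf("looking up %q: %w", key, ErrNotFound)
                                                                  }

                                                                  func main() {
                                                                  	err := lookup("bar")
                                                                  	fmt.Println(err)                         // looking up "bar": not found
                                                                  	fmt.Println(err == ErrNotFound)          // false: plain equality fails on the wrapper
                                                                  	fmt.Println(errors.Is(err, ErrNotFound)) // true: errors.Is walks the Unwrap chain
                                                                  }
                                                                  ```

                                                                  The plain == comparison fails precisely because fmt.Errorf with %w returns a new wrapping error; errors.Is is what restores the ability to match the sentinel.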

                                                              1. 4

                                                                This is actually not a bad rundown, though I feel like the discussion of UB lacks the correct nuance. When referring to integer overflow:

                                                                The GNU C compiler (gcc) generates code for this function which can return a negative integer

                                                                No, it doesn’t “return a negative integer”, it has already hit undefined-behaviour-land by that point. The program might appear to behave as if a negative integer was returned, but may not do so consistently, and that is different from having a negative integer actually returned, especially since the program might even exhibit odd behaviours that don’t correspond to the value being negative or the arithmetically correct value, or which don’t even appear to involve the value at all. (Of course, at the machine level, it might do a calculation which stores a negative result into a register or memory location; but, that’s the wrong level to look at it, because the presence of the addition operation has effects on compiler state that can affect code generation well beyond that one operation. Despite the claim being made often, C is not a “portable assembler”. I’m glad this particular article doesn’t make that mistake).

                                                                1. 3

                                                                  What? The code in question:

                                                                  int f(int n)
                                                                  {
                                                                      if (n < 0)
                                                                          return 0;
                                                                      n = n + 100;
                                                                      if (n < 0)
                                                                          return 0;
                                                                      return n;
                                                                  }
                                                                  

                                                                    What the article is saying is that on modern C compilers, the check for n < 0 indicates to the compiler that the programmer is rejecting negative numbers, and because programmers never invoke undefined behavior (cough cough, yeah, right), the second check for n < 0 can be removed because of course that can’t happen!

                                                                  So what can actually happen in that case? An aborted program? Reformatted hard drive? Or a negative number returned from f() (which is what I suspect would happen in most cases)? Show generated assembly code to prove or disprove me please … (yes, I’m tired of C language lawyers pedantically warning about possible UB behavior).
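
                                                                    For what it’s worth, the whole debate can be sidestepped in the source: do the range check before the addition, so no signed overflow (and hence no UB) ever occurs and the compiler cannot delete the check. A sketch reworking the f() above:

                                                                    ```c
                                                                    #include <limits.h>
                                                                    #include <stdio.h>

                                                                    /* Variant of f() that rejects overflow *before* adding, so the
                                                                     * behaviour is fully defined for every input. */
                                                                    int f_checked(int n)
                                                                    {
                                                                        if (n < 0)
                                                                            return 0;
                                                                        if (n > INT_MAX - 100) /* n + 100 would overflow signed int */
                                                                            return 0;
                                                                        return n + 100;
                                                                    }

                                                                    int main(void)
                                                                    {
                                                                        printf("%d\n", f_checked(5));       /* 105 */
                                                                        printf("%d\n", f_checked(INT_MAX)); /* 0: overflow caught, defined */
                                                                        return 0;
                                                                    }
                                                                    ```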

                                                                  1. 3

                                                                    because programmers never invoke undefined behavior

                                                                    They shouldn’t, but they often do. That’s why articles such as the one in title should be super clear about the repercussions.

                                                                    So what can actually happen in that case?

                                                                    Anything - that’s the point. That’s what the “undefined” in “undefined behaviour” means.

                                                                    (yes, I’m tired of C language lawyers pedantically warning about possible UB behavior).

                                                                    The issue is that a lot of this “possible UB behaviour” is actual compiler behaviour, but it’s impossible to predict which exact behaviour you’ll get.

                                                                    You might be “tired of C language lawyers pedantically warning about possible UB behaviour”, but I’m personally tired of programmers invoking UB and thinking that it’s ok.

                                                                    1. 1

                                                                      They shouldn’t, but they often do.

                                                                      Yes they do, but only because there’s a lot of undefined behaviors in C. The C standard lists them all (along with unspecified, implementation and locale-specific behaviors). You want to know why they often do? Because C89 defined about 100 undefined behaviors, C99 about 200 and C11 300. It’s a bit scary to think that C code that is fine today could cause undefined behavior in the future—I guess C is a bit like California; in California everything causes cancer, and in C, everything is undefined.

                                                                      A lot historically came about because of different ways CPUs handle certain conditions—the 80386 will trap any attempt to divide by 0 [1] but the MIPS chip doesn’t. Some have nothing to do with the CPU—it’s undefined behavior if a C file doesn’t end with a new line character. Some have to do with incorrect library usage (calling va_arg() without calling va_start()).

                                                                      I’m personally tired of programmers invoking UB and thinking that it’s ok.

                                                                      Undefined behavior is just that—undefined. Most of the undefined behavior in C is pretty straightforward (like calling va_arg() incorrectly); it’s really only signed-integer math and pointers where the problems with undefined behavior get bad. Signed-integer math is bad only in that it might generate invalid indices for arrays or for pointer arithmetic (I mean, incorrect answers are still bad, but I’m more thinking of security here). Outside of that, I don’t know of any system in general use today that will trap on signed overflow [2]. So I come back to my original “What?” question. The x86 and ARM architectures have well defined signed integer semantics (they wrap! I’ve yet to come across a system where that doesn’t happen, again [2]) so is it any wonder that programmers will invoke UB and think it’s okay?

                                                                      And for pointers, I would hazard a guess that most programmers today don’t have experience with segmented architectures which is where a lot of the weirder pointer rules probably stem from. Pointers by themselves aren’t the problem per se, it’s C’s semantics with pointers and arrays that lead to most, if not all, problems with undefined behavior with pointers (in my opinion). Saying “Oh! Undefined behavior has been invoked! Abandon all hope!” doesn’t actually help.

                                                                      [1] IEEE-754 floating point doesn’t trap on division by 0.

                                                                      [2] I would love to know of a system where signed overflow is trapped. Heck, I would like to know of a system where trap representations exist! Better yet, name the general purpose systems I can buy new, today, that use sign magnitude or 1s-complement for integer math.

                                                                      1. 2

                                                                        Because C89 defined about 100 undefined behaviors, C99 about 200 and C11 300

                                                                        It didn’t define them; It listed circumstances which have undefined behaviour. This may seem nit-picky, but the necessity of correctly understanding what is “undefined behaviour” is the premise of my original post.

                                                                        A draft of C17 that I have lists 211 undefined behaviours. An article on UB - https://www.cs.utah.edu/~regehr/ub-2017-qualcomm.pdf - claims 199 for C11. I don’t think your figure of 300 is correct.

                                                                        A bunch of the C11 circumstances for UB are to do with the multi-threading support which didn’t exist in C99. In general I don’t think there’s any strong reason to believe that code with clearly well-specified behaviour now will have UB in the future.

                                                                        So I come back to my original “What?” question

                                                                        It’s not clear to me what your “what?” question is about. I elaborated in the first post on what I meant by “No, it doesn’t “return a negative integer””.

                                                                        Compilers will for eg. remove checks for impossible (in the absence of UB) conditions and other things that may be even harder to predict; C programmers should be aware of that.

                                                                        Now, if you want to argue “compilers shouldn’t do that”, I wouldn’t necessarily disagree. The problem is: they do it, and the language specification makes it clear that they are allowed to do it.

                                                                        The x86 and ARM architectures have well defined signed integer semantics

                                                                        so is it any wonder that programmers will invoke UB and think it’s okay?

                                                                        This illustrates my point: if we allow the view of C as a “portable assembly language” to be propagated, and especially the view of “UB is just the semantics of the underlying architecture”, we’ll get code being produced which doesn’t work (and worse, is in some cases exploitable) when compiled by today’s compilers.

                                                                        1. 1

                                                                          I don’t think your figure of 300 is correct.

                                                                          You are right. I recounted, and there are around 215 or so for C11. But there’s still that doubling from C89 to C99.

                                                                          No, it doesn’t “return a negative integer”, it has already hit undefined-behaviour-land by that point.

                                                                          It’s not clear to me what your “what?” question is about.

                                                                          Unless the machine in question traps on signed overflow, the code in question returns something when it runs. Just saying “it’s undefined behavior! Anything can happen!” doesn’t help. The CPU will either trap, or it won’t. There is no third thing that can happen. An argument can be made that CPUs should trap, but the reality is nearly every machine being programmed today is a byte-oriented, 2’s complement machine with defined signed overflow semantics.

                                                                          1. 1

                                                                            Just saying “it’s undefined behavior! Anything can happen!” doesn’t help

                                                                            It makes it clear that you should have no expectations on behaviour in the circumstance - which you shouldn’t.

                                                                            Unless the machine in question traps on signed overflow, the code in question returns something when it runs.

                                                                            No, as already evidenced, the “result” can be something that doesn’t pass the ‘x < 0’ check yet displays as a negative when printed, for example. It’s not a real value.

                                                                            The CPU will either trap, or it won’t

                                                                            C’s addition doesn’t map directly to the underlying “add” instruction of the target architecture; it has different semantics. It doesn’t matter what the CPU will or won’t do when it executes an “add” instruction.
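
                                                                              (If you do want the machine’s add-with-overflow-report semantics, GCC and Clang expose them explicitly via the __builtin_add_overflow extension, whose behaviour is fully defined; note this is a compiler extension, not standard C. A sketch:)

                                                                              ```c
                                                                              #include <limits.h>
                                                                              #include <stdio.h>

                                                                              int main(void)
                                                                              {
                                                                                  int r;
                                                                                  /* Defined behaviour: r receives the wrapped two's-complement result,
                                                                                   * and the return value reports whether the true sum overflowed. */
                                                                                  if (__builtin_add_overflow(INT_MAX, 100, &r))
                                                                                      printf("overflowed, wrapped result: %d\n", r);

                                                                                  if (!__builtin_add_overflow(5, 100, &r))
                                                                                      printf("ok: %d\n", r); /* 105 */
                                                                                  return 0;
                                                                              }
                                                                              ```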

                                                                  2. 1

                                                                    Yes, the code generated does in fact return a negative integer. You shouldn’t rely on it, another compiler may do something different. But once compiled undefined behaviour isn’t relevant anymore. The generated x86 does in fact contain a function that may return a negative integer.

                                                                    Again, it would be completely legal for the compiler to generate code that corrupted memory or ejected your CD drive. But this statement is talking about the code that happened to be generated by a particular run of a particular compiler. In this case it did in fact emit a function that may return a negative number.

                                                                    1. 1

                                                                      When we talk about undefined behaviour, we’re talking about the semantics at the level of the C language, not the generated code. (As you alluded, that wouldn’t make much sense.)

                                                                      At some point you have to map semantics between source and generated code. My point was, you can’t map the “generates a negative value” of the generated code back to the source semantics. We only say it’s a negative value on the basis that its representation (bit pattern) is that of a negative value, as typically represented in the architecture, and even then we’re assuming that for instance some register (for example) that is typically used to return values does in fact hold the return value of the function …

                                                                      … which it doesn’t, if we’re talking about the source function. Because that function doesn’t return once undefined behaviour is invoked; it ceases to have any defined behaviour at all.

                                                                      I know this is highly conceptual and abstract, but that’s at the heart of the message - C semantics are at a higher level than the underlying machine; it’s not useful to think in terms of “undefined behaviour makes the function return a negative value” because then we’re imposing artificial constraints on undefined behaviour and what it is; from there, we’ll start to believe we can predict it, or worse, that the language semantics and machine semantics are in fact one-to-one.

                                                                      I’ll refer again to the same example as was in the original piece: the signed integer overflow occurs and is followed by a negative check, which fails (“is optimised away by the compiler”, but remember that optimisation preserves semantics). So, it’s not correct to say that the value is negative (otherwise it would have been picked up by the (n < 0) check); it’s not guaranteed to behave as a negative value. It’s not guaranteed to behave any way at all.

                                                                      Sure, the generated code does something and it has much stricter semantics than C. But saying that the generated function “returns a negative value” is lacking the correct nuance. Even if it’s true that in some similar case, the observable result - from some particular version of some particular compiler for some particular architecture - is that the number always appears to be negative, this is not something we should in any way suggest is the actual semantics of C.

                                                                    2. 0

                                                                      Of course, at the machine level, it might do a calculation which stores a negative result into a register or memory location; but, that’s the wrong level to look at it, because the presence of the addition operation has effects on compiler state that can affect code generation well beyond that one operation.

                                                                      Compilers specifically have ways of ensuring that there is no interference between operations, so no. This is incorrect. Unless you want to point to the part of the GCC and Clang source code that decides unexpectedly to stop doing that?

                                                                      1. 1

                                                                        In the original example, the presence of the addition causes the following negative check (n < 0) to be omitted from the generated code.

                                                                        Unless you want to point to the part of the GCC and Clang source code that decides unexpectedly to stop doing that?

                                                                        If that’s at all a practical suggestion, perhaps you can go find the part that ensures “that there is no interference between operations” and point that out?

                                                                        1. 1

                                                                          In the original example, the presence of the addition causes the following negative check (n < 0) to be omitted from the generated code.

                                                                          Right, because register allocation relies upon UB for performance optimization. It’s the same in both GCC and Clang (Clang is actually worse with regards to its relentless use of UB to optimize opcode generation; presumably this is also why they have more tooling around catching errors and sanitizing code). This is a design feature from the perspective of compiler designers. There is absolutely nothing in the literature to back up your point that register allocation suddenly faceplants on UB – I’d be more than happy to read it if you can find it, though.

                                                                          If that’s at all a practical suggestion, perhaps you can go find the part that ensures “that there is no interference between operations” and point that out?

                                                                          *points at the entire register allocation subsystem*

                                                                          But no, the burden of proof is on you, as you made the claim that the register allocator and interference graph fails on UB. It is up to you to prove that claim. I personally cannot find anything that backs your claim up, and it is common knowledge (backed up by many, many messages about this on the mailing list) that the compiler relies on Undefined Behaviour.

                                                                          Seriously, I want to believe you. I would be happy to see another reason of why having the compiler rely on UB is a negative point. For this reason I also accept a code example where you can use the above example of UB to cause the compiler to clobber registers and return an incorrect result. The presence of a negative number alone is not sufficient as that does not demonstrate register overwriting.

                                                                          1. 2

                                                                            There is absolutely nothing in the literature to back up your point that register allocation suddenly faceplants on UB

                                                                            What point? I think you’ve misinterpreted something.

                                                                            you made the claim that the register allocator and interference graph fails on UB

                                                                            No, I didn’t.

                                                                          2. 1

                                                                            It isn’t the addition; the second check is omitted because n is known to be greater than or equal to 0. Here’s the example with value range annotations for n.

                                                                            int f(int n)
                                                                            {
                                                                                // [INT_MIN, INT_MAX]
                                                                                if (n < 0)
                                                                                {
                                                                                    // [INT_MIN, -1]
                                                                                    return 0;
                                                                                }
                                                                                // [0, INT_MAX]
                                                                                n = n + 100;
                                                                                // [100, INT_MAX] - overflow is undefined so n must be >= 100 
                                                                                if (n < 0)
                                                                                {
                                                                                    return 0;
                                                                                }
                                                                                return n;
                                                                            }
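                                                                            Given those ranges, the second n < 0 test is provably dead. As an illustrative sketch (not actual compiler output), the function the optimizer effectively compiles is:

                                                                            ```c
                                                                            /* Sketch of the optimized form: after the first check n is >= 0,
                                                                             * and since signed overflow is undefined, n + 100 is assumed to
                                                                             * be >= 100, so the second "n < 0" check has been removed. */
                                                                            int f_optimized(int n)
                                                                            {
                                                                                if (n < 0)
                                                                                {
                                                                                    return 0;
                                                                                }
                                                                                return n + 100;
                                                                            }
                                                                            ```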
                                                                            
                                                                            1. 2

                                                                              You’re correct that I oversimplified it. The tone of the person I responded to was combative and I couldn’t really be bothered going into detail again on something that I’ve now gone over several times in different posts right here in this discussion.

                                                                              As you point out, it’s the combination of “already compared to 0” and “added a positive integer” that makes the final comparison to 0 redundant. The original point stands: the semantics of C, and in particular the possibility of UB, mean that a simple operation can affect later code generation.

                                                                              Here’s an example that works without interval analysis: (edit: or rather, that requires slightly more sophisticated analysis):

                                                                              int f(int n)
                                                                              {
                                                                                  int orig_n = n;
                                                                                  n = n + 100;
                                                                                  if (n < orig_n)
                                                                                  {
                                                                                      return 0;
                                                                                  }
                                                                                  return n;
                                                                              }
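                                                                              Here the reasoning needs no value ranges: signed overflow is undefined, so in every defined execution n + 100 >= orig_n holds, the comparison folds to false, and the branch is deleted. An illustrative sketch of the result:

                                                                              ```c
                                                                              /* Sketch of the optimized form: "n + 100 < n" can never be true
                                                                               * without signed overflow, which is UB, so the compiler may drop
                                                                               * the "return 0" branch entirely. */
                                                                              int f_optimized(int n)
                                                                              {
                                                                                  return n + 100;
                                                                              }
                                                                              ```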
                                                                              
                                                                      1. 7

                                                                        The quote from the C standard is not quite the right one. %d is a perfectly valid conversion specification. The problem arises with the sentence afterward:

                                                                        If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

                                                                        Also note that the cast syntax in this example is only valid in C++. If you try to compile that code as C, you’ll get an error. It’s a bit strange to me that the author used some ambiguous-looking C++-specific syntax to argue against writing C.

                                                                        That said, I agree that variadic functions can be very error-prone. You miss out on type-checking for the variadic arguments, so they should only be used sparingly and with extra caution.
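                                                                        As a minimal sketch of the distinction (format_demo is a hypothetical helper, checked via snprintf): %d with a double argument is undefined, while %f, or an explicit cast that is valid in both C and C++, matches the types correctly.

                                                                        ```c
                                                                        #include <stdio.h>
                                                                        #include <string.h>

                                                                        /* Hypothetical demonstration: formats 42.0 two well-defined ways.
                                                                         * Returns 1 if both results are as expected. */
                                                                        int format_demo(void)
                                                                        {
                                                                            char buf[64];

                                                                            /* Correct: %f matches the double argument. */
                                                                            snprintf(buf, sizeof buf, "%f", 42.0);
                                                                            if (strcmp(buf, "42.000000") != 0)
                                                                                return 0;

                                                                            /* Also correct: an explicit cast makes %d match its argument. */
                                                                            snprintf(buf, sizeof buf, "%d", (int)42.0);
                                                                            if (strcmp(buf, "42") != 0)
                                                                                return 0;

                                                                            /* Undefined behaviour -- do not do this:
                                                                             * printf("%d\n", 42.0); */
                                                                            return 1;
                                                                        }
                                                                        ```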

                                                                        1. 11

                                                                          The code, as presented, won’t compile under any actual C compiler. The line:

                                                                          printf("%d\n",double(42));
                                                                          

                                                                          is not valid C syntax. The fact that he used #include <cstdio> tells me he used a C++ compiler, which may very well allow such dubious syntax (I don’t know, I don’t program in C++). This blog post reads to me as an anti-C screed by someone who doesn’t program in C.

                                                                          1. 3

                                                                            double(1) is a call to the double primitive type’s intrinsic constructor. It’s not really a type cast, though that’s effectively what happens (at least as I understand it).

                                                                            1. 1

                                                                              It’s completely legit syntax in C++.

                                                                              The principle behind it is that user-defined types should generally be able to do everything primitive types can do, and to some extent vice-versa. This is why C++ supports operator overloading, for example.

                                                                              In this case allowing primitive types to use constructor-like syntax means you can write something like T(42) in a template and it will compile and do the expected thing whether T is a user-defined class or the primitive double type.

                                                                              1. 3

                                                                                Nobody said it was legitimate syntax in C++. They said it was illegitimate syntax in C.

                                                                          1. 18

                                                                            The whole damn thing.

                                                                            Instead of having this Frankenstein’s monster of different OSs and different programming languages and browsers that are OSs and OSs that are browsers, just have one thing.

                                                                            There is one language. There is one modular OS written in this language. You can hot-fix the code. Bits and pieces are stripped out for lower powered machines. Someone who knows security has designed this thing to be secure.

                                                                            The same code can run on your local machine, or on someone else’s machine. A website is just a document on someone else’s machine. It can run scripts on their machine or yours. Except on your machine they can’t run unless you let them and they can’t do I/O unless you let them.

                                                                            There is one email protocol. Email addresses can’t be spoofed. If someone doesn’t like getting an email from you, they can charge you a dollar for it.

                                                                            There is one IM protocol. It’s used by computers including cellphones.

                                                                            There is one teleconferencing protocol.

                                                                            There is one document format. Plain text with simple markup for formatting, alignment, links and images. It looks a lot like Markdown, probably.

                                                                            Every GUI program is a CLI program underneath and can be scripted.

                                                                            (Some of this was inspired by legends of what LISP can do.)

                                                                            1. 24

                                                                              Goodness, no - are you INSANE? Technological monocultures are one of the greatest non-ecological threats to the human race!

                                                                              1. 1

                                                                                I need some elaboration here. Why would it be a threat to have everyone use the same OS and the same programming language and the same communications protocols?

                                                                                1. 6

                                                                                  One vulnerability to rule them all.

                                                                                  1. 2

                                                                                    Pithy as that sounds, it is not convincing for me.

                                                                                    Having many different systems and languages in order to have security by obscurity by having many different vulnerabilities does not sound like a good idea.

                                                                                    I would hope a proper inclusion of security principles while designing an OS/language would be a better way to go.

                                                                                    1. 4

                                                                                      It is not security through obscurity, it is security through diversity, which is a very different thing. Security through obscurity says that you may have vulnerabilities but you’ve tried to hide them so an attacker can’t exploit them because they don’t know about them. This works as well as your secrecy mechanism. It is generally considered bad because information disclosure vulnerabilities are the hardest to fix and they are the root of your security in a system that depends on obscurity.

                                                                                      Security through diversity, in contrast, says that you may have vulnerabilities but they won’t affect your entire fleet. You can build reliable systems on top of this. For example, the Verisign-run DNS roots use a mixture of FreeBSD and Linux and a mixture of bind, unbound, and their own in-house DNS server. If you find a Linux vulnerability, you can take out half of the machines, but the other half will still work (just slower). Similarly, a FreeBSD vulnerability can take out half of them. A bind or unbound vulnerability will take out a third of them. A bind vulnerability that depends on something OS-specific will take out about a sixth.

                                                                                      This is really important when it comes to self-propagating malware. Back in the XP days, there were several worms that would compromise every Windows machine on the local network. I recall doing a fresh install of Windows XP and connecting it to the university network to install Windows update: it was compromised before it was able to download the fix for the vulnerability that the worm was exploiting. If we’d only had XP machines on the network, getting out of that would have been very difficult. Because we had a load of Linux machines and Macs, we were able to download the latest roll-up fix for Windows, burn it to a CD, redo the install, and then do an offline update.

                                                                                      Looking at the growing Linux / Docker monoculture today, I wonder how much damage a motivated individual with a Linux remote arbitrary-code execution vulnerability could do.

                                                                                      1. 1

                                                                                        Sure, but is this an intentional strategy? Did we set out to have Windows and Mac and Linux in order that we could prevent viruses from spreading? It’s an accidental observation and not a really compelling one.

                                                                                        I’ve pointed out my thinking in this part of the thread https://lobste.rs/s/sdum3p/if_you_could_rewrite_anything_from#c_ennbfs

                                                                                        In short, there must be more principled ways of securing our computers than hoping multiple green field implementations of the same application have different sets of bugs.

                                                                                      2. 3

                                                                                        A few examples come to mind, though: Heartbleed (which affected anyone using OpenSSL) and Spectre (anyone using the x86 platform). Also, Microsoft Windows for years had plenty of critical exploits because it had well over 90% of the desktop market.

                                                                                        You might also want to look up the impending doom of bananas, because over 90% of bananas sold today are genetic clones (it’s basically one plant) and there’s a fungus threatening to kill the banana market. A monoculture is a bad idea.

                                                                                        1. 1

                                                                                          Yes, for humans (and other living things) the idea of immunity through obscurity (to coin a phrase) is evolutionarily advantageous. Our varied responses to COVID are one such immediate example. It does have the drawback that it makes it harder to develop therapies, since we see population specificity in responses.

                                                                                          I don’t buy that we need to employ the same idea in an engineered system. It’s a convenient back-ported bullet-list advantage of having a chaotic mess of OSes and programming languages, but it certainly wasn’t intentional.

                                                                                          I’d rather have an engineered, intentional robustness to the systems we build.

                                                                                          1. 4

                                                                                            To go in a slightly different direction—building codes. The farther north you go, the steeper roofs tend to get. In Sweden, one needs a steep roof to shed snow buildup, but where I live (South Florida, just north of Cuba) building such a roof would be a waste of resources because we don’t have snow—we just need a shallow angle to shed rain water. Conversely, we don’t need codes to deal with earthquakes, nor does California need to deal with hurricanes. Yet it would be so much simpler to have a single building code in the US. I’m sure there are plenty of people who would love to force such a thing everywhere if only to make their lives easier (or for rent-seeking purposes).

                                                                                            1. 2

                                                                                              We have different houses for different environments, and we have different programs for different use cases. This does not mean we need different programing languages.

                                                                                        2. 2

                                                                                          I would hope a proper inclusion of security principles while designing an OS/language would be a better way to go.

                                                                                          In principle, yeah. But even the best security engineers are human and prone to fail.

                                                                                          If every deployment was the same version of the same software, then attackers could find an exploitable bug and exploit it across every single system.

                                                                                          Would you like to drive in a car where every single engine blows up, killing all inside the car? If all cars are the same, they’ll all explode. We’d eventually move back to horse and buggy. ;-) Having a variety of cars helps mitigate issues other cars have–while still having problems of its own.

                                                                                          1. 1

                                                                                            In this heterogeneous system we have more bugs (assuming the same rate of bugs everywhere) and fewer reports (since there are fewer users per system) and a more drawn out deployment of fixes. I don’t think this is better.

                                                                                            1. 1

                                                                                              Sure, you’d have more bugs. But the bugs would (hopefully) be in different, distinct places. One car might blow up, another might just blow a tire.

                                                                                              From an attacker’s perspective, if everyone drives the same car and the attacker knows that the flaws of one car are reproducible with a 100% success rate, then the attacker doesn’t need to spend time/resources on other cars. The attacker can just reuse the exploit and continue to rinse and repeat. All are vulnerable to the same bug. All can be exploited in the same manner reliably, time after time.

                                                                                              1. 3

                                                                                                To go by the car analogy, the bugs that would be uncovered by drivers rather than during the testing process would be rare ones, like, if I hit the gas pedal and brake at the same time it exposes a bug in the ECU that leads to total loss of power at any speed.

                                                                                                I’d rather drive a car a million other drivers have been driving than drive a car that’s driven by 100 people. Because over a million drivers it’s much more likely someone hits the gas and brake at the same time and uncovers the bug which can then be fixed in one go.

                                                                                  2. 3
                                                                                    1. 1

                                                                                      Yes, that’s probably the LISP thing I was thinking of, thanks!

                                                                                    2. 2

                                                                                      I agree completely!

                                                                                      We would need to put some safety measures in place, and there would have to be processes defined for how you go about suggesting/approving/adding/changing designs (that anyone can be a part of), but otherwise, it would be a boon for the human race. In two generations, we would all be experts in our computers and systems would interoperate with everything!

                                                                                      There would be no need to learn new tools every X months. The UI would familiar to everyone, and any improvements would be forced to go through human testing/trials before being accepted, since it would be used by everyone! There would be continual advancements in every area of life. Time would be spent on improving the existing experience/tool, instead of recreating or fixing things.

                                                                                      1. 2

                                                                                        I would also like to rewrite most stuff from the ground up. But monocultures aren’t good. Orthogonality in basic building blocks is very important. And picking the right abstractions to avoid footguns. Some ideas, not necessarily the best ones:

                                                                                        • proven correct microkernel written in rust (or similar borrow-checked language), something like L4
                                                                                        • capability based OS
                                                                                        • no TCP/HTTP monoculture in networks (SCTP? pubsub networks?)
                                                                                        • are our current processor architectures anywhere near sane? could safe concurrency be encouraged at a hardware level?
                                                                                        • fewer walled gardens and less centralisation
                                                                                        1. 2

                                                                                          proven correct microkernel written in rust (or similar borrow-checked language), something like L4

                                                                                          A solved problem. seL4, including support for capabilities.

                                                                                          1. 5

                                                                                            seL4 is proven correct by treating a lot of things as axioms and by presenting a programmer model that punts all of the difficult bits to get correct to application developers, making it almost impossible to write correct code on top of. It’s a fantastic demonstration of the state of modern proof tools, it’s a terrible example of a microkernel.

                                                                                            1. 2

                                                                                              FUD unless proven otherwise.

                                                                                              Counter-examples exist; seL4 can definitely be used, as demonstrated by many successful uses.

                                                                                              The seL4 foundation is getting a lot of high profile members.

                                                                                              Furthermore, Genode, which is relatively easy to use, supports seL4 as a kernel.

                                                                                        2. 2

                                                                                          Someone wrote a detailed vision of rebuilding everything from scratch, if you’re interested. 1

                                                                                            1. 11

                                                                                              I never understood this thing.

                                                                                              1. 7

                                                                                                I think that is deliberate.

                                                                                            2. 1

                                                                                              And one leader to rule them all. No, thanks.

                                                                                              1. 4

                                                                                                Well, I was thinking of something even worse - design by committee, like for electrical stuff, but your idea sounds better.

                                                                                              2. 1

                                                                                                We already have this, dozens of them. All you need to do is point guns at everybody and make them use your favourite. What a terrible idea.

                                                                                              1. 21

                                                                                                I’d like a much smaller version of the web platform, something focused on documents rather than apps. I’m aware of a few projects in that direction but none of them are in quite the design space I’d personally aim for.

                                                                                                1. 6

                                                                                                  Well, “we” tried that with PDF and it was still infected with featureitis, and Acrobat Reader is yet another web browser. Perhaps unsurprising considering Adobe’s track record, but if you factor in their proprietary extensions (there’s JavaScript in there, 3D models, there used to be Flash and probably still is somewhere..) it followed the same general trajectory and timeline as the W3C soup. Luckily much of that failed to get traction (tooling, proprietary and web network effect all spoke against it) and thus it is still more thought of as “a document”.

                                                                                                  1. 20

                                                                                                    This is another example of “it’s not the tech, it’s the economy, stupid!” The modern web isn’t an adware-infested cesspool because of HTML5, CSS, and JavaScript; it’s a cesspool because (mis)using these tools makes people money.

                                                                                                    1. 5

                                                                                                      Yeah exactly, for some examples: Twitter stopped working without JS recently (what I assume must be a purposeful decision). Then I noticed Medium doesn’t – it no longer shows you the whole article without JS. And Reddit has absolutely awful JS that obscures the content.

                                                                                                      All of this was done within the web platform. It could have been good, but they decided to make it bad on purpose. And at least in the case of Reddit, it used to be good!

                                                                                                      Restricting or rewriting the platform doesn’t solve that problem – they are pushing people to use their mobile apps and sign in, etc. They will simply use a different platform.

                                                                                                      (Also note that these platforms somehow make themselves available to crawlers, so I use https://archive.is/, ditto with the NYTimes and so forth. IMO search engines should not jump through special hoops to see this content; conversely, if they make their content visible to search engines, then it’s fair game for readers to see.)

                                                                                                      1. 4

                                                                                                        I’ll put it like this: I expect corporate interests to continue using the most full-featured platforms available, including the web platform as we know it today. After all, those features were mostly created for corporate interests.

                                                                                                        That doesn’t mean everybody else has to build stuff the same way the corps do. I think we can and should aspire for something better - where by better in this case I mean less featureful.

                                                                                                        1. 4

                                                                                                          That doesn’t mean everybody else has to build stuff the same way the corps do. I think we can and should aspire for something better - where by better in this case I mean less featureful.

                                                                                                          The trick here is to make sure people use it for a large value of people. I was pretty interested in Gemini from the beginning and wrote some stuff on the network (including an HN mirror) and I found that pushing back against markup languages, uploads, and some form of in-band signaling (compression etc) ends up creating a narrower community than I’d like. I fully acknowledge this might just be a “me thing” though.

                                                                                                          EDIT: I also think you’ve touched upon something a lot of folks are interested in right now as evidenced by both the conversation here and the interest in Gemini as a whole.

                                                                                                          1. 3

                                                                                                            I appreciate those thoughts, for sure. Thank you.

                                                                                                          2. 2

                                                                                                            That doesn’t mean everybody else has to build stuff the same way the corps do.

                                                                                                            I agree, and you can look at https://www.oilshell.org/ as a demonstration of that (both the site and the software). But all of that is perfectly possible with existing platforms and tools. In fact it’s greatly aided by many old and proven tools (shell, Python) and some new-ish ones (Ninja).

                                                                                                            There is value in rebuilding alternatives to platforms for sure, but it can also be overestimated (e.g. fragmenting ecosystems, diluting efforts, what Jamie Zawinski calls CADT, etc.).


                                                                                                            Similar to my “alternative shell challenges”, I thought of a “document publishing challenge” based on my comment today on a related story:

                                                                                                            The challenge is if the platform can express a widely praised, commercial multimedia document:

                                                                                                            https://ciechanow.ski/gears/

                                                                                                            https://ciechanow.ski/js/gears.js (source code is instructive to look at)

                                                                                                            https://news.ycombinator.com/item?id=22310813 (many appreciative comments)

                                                                                                            1. 2

                                                                                                              Yeah, there are good reasons this is my answer to “if you could” and not “what are your current projects”. :)

                                                                                                              I like the idea of that challenge. I don’t actually know whether my ideal platform would make that possible or not, but situating it with respect to the challenge is definitely useful for thinking about it.

                                                                                                              1. 1

                                                                                                                Oops, I meant NON-commercial! That was of course the point.

                                                                                                                There is non-commercial content that makes good use of recent features of the web.

                                                                                                          3. 4

                                                                                                            Indeed - tech isn’t the blocker to fixing this problem. The tools get misused because the economic incentives overpower the ones from the intended use. Sure, you can nudge development in a certain direction by providing references, templates, frameworks, documentation, what have you - but whatever replacement comes along also needs to provide enough economic incentives to minimise the appeal of abuse. Worse still, it has to be deployed at a tipping point where the value added exceeds the inertia and network effect of the current Web.

                                                                                                            1. 2

                                                                                                              I absolutely believe that the most important part of any effort at improving the situation has to be making the stuff you just said clear to everyone. It’s important to make it explicit from the start that the project’s view is that corporate interests shouldn’t have a say in the direction of development, because the default is that they do.

                                                                                                              1. 2

                                                                                                                I think the interests of a corporation should be expressible and considered through some representative, but given the natural advantage an aggregate has in terms of resources, influence, “network effect”, … they should also be subject to scrutiny and transparency that match their relative advantage over other participants. Since that rarely happens, effect instead seem to be that the Pareto Principle sets in and the corporation becomes the authority in ‘appeal to authority’. They can then lean back and cash in with less effort than anyone else. Those points are moot though if the values of the intended tool/project/society aren’t even expressed, agreed upon or enforced.

                                                                                                                1. 1

                                                                                                                  Yes, I agree with most of that, and the parts I don’t agree with are quite defensible. Well said.

                                                                                                          4. 2

                                                                                                            Yes, I agree. I do think that this is largely a result of PDF being a corporate-driven project rather than a grassroots one. As somebody else said in the side discussion about Gemini, that’s not the only source of feature creep, but I do think it’s the most important factor.

                                                                                                          5. 5

                                                                                                            I’m curious about what direction is that too. I’ve been using and enjoying the gemini protocol and I think it’s fantastic.

                                                                                                            Even the TLS part seems great, since it would allow some simple form of client authentication, but in a very anonymous way.

                                                                                                            1. 7

                                                                                                              I do like the general idea of Gemini. I’m honestly still trying to put my thoughts together, but I’d like something where it’s guaranteed to be meaningful to interact with it offline, and ideally with an experience that looks, you know… more like 2005 than 1995 in terms of visual complexity, if you see what I mean. I don’t think we have to go all the way back to unformatted text, it just needs to be a stable target. The web as it exists right now seems like it’s on a path to keep growing in technical complexity forever, with no upper bound.

                                                                                                              1. 9

                                                                                                                I have some thoughts in this area:

                                                                                                                • TCP/IP/HTTP is fine (I disagree with Gemini there). It’s HTML/CSS/JS that are impossible to implement on a shoestring.

                                                                                                                • The web’s core value proposition is documents with inline hyperlinks. Load all resources atomically, without any privacy-leaking dependent loads.

                                                                                                                • Software delivery should be out of scope. It’s only needed because our computers are too complex to audit, and the programs we install keep exceeding their rights. Let’s solve that problem at the source.

                                                                                                                I’ve thought about this enough to make a little prototype.

                                                                                                                1. 5

                                                                                                                  It’s of course totally fine to disagree, but I genuinely believe it will be impossible to ever avoid fingerprinting with HTTP. I’ve seen stuff, not all of which I’m at liberty to talk about. So from a privacy standpoint I am on board with a radically simpler protocol for that layer. TCP and IP are fine, of course.

                                                                                                                  I agree wholeheartedly with your other points.

                                                                                                                  That is a really cool project! Thank you for sharing it!

                                                                                                                  1. 4

                                                                                                                    Sorry, I neglected to expand on that bit. My understanding is that the bits of HTTP that can be used for fingerprinting require client (browser) support. I was implicitly assuming that we’d prune those bits from the browser while we’re reimplementing it from scratch anyway. Does that seem workable? I’m not an expert here.

                                                                                                                    1. 6

                                                                                                                      I’ve been involved with Gemini since the beginning (I wrote the very first Gemini server) and I was at first amazed at just how often people push to add HTTP features back into Gemini. A little feature here, a little feature there, and pretty soon it’s HTTP all over again. Prune all you want, but people will add those features back if it’s at all possible. I’m convinced of that.

                                                                                                                      1. 4

                                                                                                                        So you’re saying that a new protocol didn’t help either? :)

                                                                                                                        1. 4

                                                                                                                          Pretty much. At least Gemini drew a hard line in the sand rather than trying to prune an existing protocol. But people like their uploads and markup languages.

                                                                                                                          1. 2

                                                                                                                            Huh. I guess the right thing to do, then, is design the header format with attention to minimizing how many distinguishing bits it leaks.

                                                                                                                      2. 1

                                                                                                                        Absolutely. There is nothing very fingerprintable in a minimal valid HTTP request.
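                                                                                                                        For concreteness, a sketch in C of roughly the smallest valid HTTP/1.1 request (example.com is a placeholder host, and the helper function is invented for illustration). Nothing in it varies from client to client:

                                                                                                                        ```c
                                                                                                                        #include <string.h>

                                                                                                                        /* Roughly the smallest valid HTTP/1.1 request: request line, Host
                                                                                                                           header, blank line. example.com is a placeholder. Nothing here
                                                                                                                           differs between clients, so it leaks essentially no identifying bits. */
                                                                                                                        static const char minimal_request[] =
                                                                                                                            "GET / HTTP/1.1\r\n"
                                                                                                                            "Host: example.com\r\n"
                                                                                                                            "Connection: close\r\n"
                                                                                                                            "\r\n";

                                                                                                                        /* Returns nonzero if the request text contains the given header name. */
                                                                                                                        static int has_header(const char *name)
                                                                                                                        {
                                                                                                                            return strstr(minimal_request, name) != NULL;
                                                                                                                        }
                                                                                                                        ```

                                                                                                                        The fingerprinting surface comes from what browsers add on top of this (User-Agent, Accept-*, header ordering, and so on), not from the protocol’s minimum.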

                                                                                                                  2. 5

                                                                                                                    …but I’d like something where it’s guaranteed to be meaningful to interact with it offline

                                                                                                                    This is where my interest in store-and-forward networks lies. I find that a lot of what I do on the internet is pulling down content (reading threads, comments, articles, documentation), and I push content (respond to things, upload content, etc.) much less frequently. For that situation (which I realize is fairly particular to me) a store-and-forward network would make offline-first interaction a first-class citizen.

                                                                                                                    I distinguish this from IM (like Matrix, IRC, Discord, etc) which is specifically about near instant interaction.

                                                                                                                    1. 1

                                                                                                                      I agree.

                                                                                                                2. 2

                                                                                                                  Have you looked at the gemini protocol?

                                                                                                                  1. 2

                                                                                                                    I have, see my other reply.

                                                                                                                1. 2

                                                                                                                  Feeling like a computer-illiterate dinosaur thanks to Microsoft Windows [1]. All I need to do is transfer a file so a fellow cow-orker has a copy. It’s on my Linux laptop, not the Microsoft laptop I have. On Windows, on the command line, I do:

                                                                                                                  cd Desktop
                                                                                                                  scp blahblah@linuxlaptop:file-to-transfer .
                                                                                                                  

                                                                                                                  And that works—I mean, the file is there after I transfer it. Then I need to copy it somehow to cloud-drop-drive-box-whatever-the-hell-it’s-called and I can’t find the file in the file manager … thingy. I click on Desktop in the file manager … thingy, and it’s not there. Apparently, the “Desktop” directory from the command line isn’t the same “Desktop” directory in the file manager … thingy. No, I have to navigate from the C: drive (no, there is no shortcut to my “account” on the Microsoft laptop—WTF?) through Users, then me, then Desktop and lo’, there it is. Then I have to find the upload link in Teams, which … is @#$@#$ somewhere on that god-forsaken interface. I’m finally directed to it by my fellow cow-orker and … it fails, probably because I’m behind honest-to-God DSL and not a local gigabit ethernet connection at the office.

                                                                                                                  I’ve been using computers since the mid-80s. Unix since the early 90s. Mac OS from about 2005. Why am I unable to use them anymore? What happened?

                                                                                                                  [1] I went from MS-DOS to Unix, managing to skip entirely Windows, except in the past year when new management at the company I work for shoved a Microsoft laptop at me. I don’t use it at all except for Microsoft Teams. Yes, that is insane. But the new corporate overlords are a Microsoft shop, except for the department I work in, which still requires Unix (Solaris, specifically).

                                                                                                                  1. 4

                                                                                                                    Summary: The author has a lot of expectations that the mainstream no longer cares about.

                                                                                                                    But it is true that because Unix in practice is a PITA to manage - especially the gnarly userspace, with its tendency to try to support every niche variant, even ones from 30 years ago - and since storage is so cheap, the mainstream industry turned to packaging each app with its own userspace.

                                                                                                                    Hardcore Unixers on BSD are not having a great time, and I can sympathize. The whole world is moving in a direction that makes them less relevant. Each Linux-specific extension embraced by the mainstream, like cgroups/containers or systemd, is a barrier to the interoperability they relied on.

                                                                                                                    1. 2

                                                                                                                      Okay, fine, each application now comes with its own copies of shared libraries … and the point of having shared libraries is … ? If the point of a shared library is to, you know, share it among processes, then what the hell is gained with Docker when you bundle all the shared libraries an application uses with it? There’s no sharing going on, which (in my opinion) defeats the actual purpose of shared libraries.

                                                                                                                      1. 6

                                                                                                                        what the hell is gained with Docker when you bundle all the shared libraries an application uses with it?

                                                                                                                        Reliable and (somewhat) reproducible builds, deployment, and behavior at runtime.

                                                                                                                        defeats the actual purpose of shared libraries.

                                                                                                                        Yes. No one cares. Computers have gigabytes of RAM and terabytes of storage. And in industrial applications, people don’t even work on one system anyway. One part of the system gets 5 computers, the other 10, and so on.

                                                                                                                        Shared libraries were always and still are a PITA, and are simply not a worthwhile tradeoff anymore. People run even the most trivial applications built on stuff like Electron, throwing away gigabytes of memory upfront. Why would anyone care about a couple of megabytes?

                                                                                                                        1. 1

                                                                                                                          It depends a bit on how it’s implemented, but library sharing can still happen with a container-based workflow, in three ways:

                                                                                                                          • Containers may include multiple individual programs. These can trivially share mappings to libraries that are common to them.
                                                                                                                          • Containers are made up of layers. If two containers are built on top of the same base layer then any mapping of pages in this layer (e.g. libc) will be shared. If they use different base layers then they don’t get sharing, but this is a graceful fallback for when they have different libraries.
                                                                                                                          • In a lot of server deployments, you’re running one container per VM, so you’re relying on page deduplication for sharing. In Linux, you can turn on KSM even if you’re running multiple containers in a single VM, so you can share identical pages even if they’re from different base layers.

                                                                                                                          The shared base layer bit also helps with the ‘reduced distribution size’ benefit of shared libraries.

                                                                                                                          In addition, you get deterministic builds and non-interference between containers. This is the same underlying idea as PC-BSD’s old PBI format, which would ship all dependencies bundled with an app and use hard links to deduplicate them. If two containers contain the same thing, there are multiple ways in which they can share on-disk and in-memory resources, but when they want to have different versions then everything gracefully falls back to the non-shared case.
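                                                                                                                          As a sketch of the shared-base-layer case (image, package, and file names here are made up): two images that start from the same base reference that base’s layers once on disk, and containers from both can share the mapped pages of, say, its libc:

                                                                                                                          ```dockerfile
                                                                                                                          # app-a/Dockerfile -- app-b's Dockerfile would start FROM the same base.
                                                                                                                          # The debian:bookworm-slim layers are stored once; only the COPY layer
                                                                                                                          # below differs between the two images.
                                                                                                                          FROM debian:bookworm-slim
                                                                                                                          COPY app-a /usr/local/bin/app-a
                                                                                                                          CMD ["/usr/local/bin/app-a"]
                                                                                                                          ```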

                                                                                                                      1. 3

                                                                                                                        I personally don’t use IDEs, and not for any of the reasons mentioned in the post. I don’t use them for two reasons:

                                                                                                                        1. I don’t want to have to learn “yet another editor!” I’ve been programming long enough to remember language-specific IDEs (and to some extent, they still are language-specific; see the next point). They were rarely configurable (again, the early ones), and they certainly didn’t work like my preferred editor at the time. Things might be better today, but

                                                                                                                        2. I have yet to find one that didn’t crash on me. I have this uncanny ability to crash the darned things. The last one crashed on a simple two-file C program (one C file, one header). Even though it claimed to support “C/C++”, it didn’t. I had valid C code, and yet, it would crash immediately upon loading the project. I suspect I know why [1], but didn’t bother with submitting a bug report for fear of hearing “What? Don’t do that!” because it was really a C++ IDE, not a C IDE.

                                                                                                                        I think I’m just more of a language maven than a tool maven.

                                                                                                                        [1] Because I used ‘class’ as a field name in a structure definition, which is a perfectly cromulent use of ‘class’ in C code.
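                                                                                                                        For illustration, a sketch of the kind of code that triggers this (struct and field names are invented): it is valid C precisely because class is not a keyword in C, while any parser that treats the file as C++ rejects it.

                                                                                                                        ```c
                                                                                                                        /* Legal C: "class" is not reserved in C, so it can name a struct field.
                                                                                                                           A C++ compiler, or an IDE that parses all C as C++, errors out on it.
                                                                                                                           Struct and field names are invented for illustration. */
                                                                                                                        struct element {
                                                                                                                            const char *class; /* e.g. a CSS class name */
                                                                                                                            int id;
                                                                                                                        };

                                                                                                                        static const char *element_class(const struct element *e)
                                                                                                                        {
                                                                                                                            return e->class;
                                                                                                                        }
                                                                                                                        ```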

                                                                                                                        1. 10

                                                                                                                          I’m not a huge fan of IDEs but this article is mostly nonsense.

                                                                                                                          This in particular I take exception to:

                                                                                                                          Less Code, Better Readability

                                                                                                                          Not having Autocomplete / Code Generation guides you naturally to writing more compact and idiomatic Code in the Language of your Choice. It helps you to learn language-specific features and syntactic sugar. Consider

                                                                                                                          System.out.println("hello");

                                                                                                                          printf("hello");

                                                                                                                          You see that in C, where using an IDE is uncommon, the language/libraries/frameworks naturally become more easily readable and writable.

                                                                                                                          I see this as a win for IDEs especially in large projects because the cost of avoiding useless abbreviations is minimal.

                                                                                                                          ShoeApi.GetPriceAndFeatures();
                                                                                                                          

                                                                                                                          is way more readable and explicit than

                                                                                                                          Api.GetPrcFeat();
                                                                                                                          

                                                                                                                          Maybe not the most realistic example, but you get what I’m saying - we’ve all seen code like the latter and had no clue what it does without drilling into the method.

                                                                                                                          Also since when does having autocomplete equal full-blown IDE?

                                                                                                                          1. 4

                                                                                                                            The reason the C library and Unix functions have such short names is, in part, because the original C linker only looked at the first 8 characters of symbol names.

                                                                                                                            (Likewise, I seem to recall that the original Unix filesystem only allowed 8-character filenames, forcing shell commands to be terse. I may be wrong on this; but most filesystems of that era had really limited names, some only 6 characters, which is why the classic Adventure game was also called ADVENT.)

                                                                                                                            1. 3

                                                                                                                              I think you might be conflating Unix with MS-DOS. The early Unix file systems allowed 14 character names, while MS-DOS limited filenames of 8 (well, 8 characters for the name, plus 3 for the extension).

                                                                                                                              1. 8

                                                                                                                                I guess Ken could have spelt creat with an e after all.

                                                                                                                              2. 3

                                                                                                                                The reason the C library and Unix functions have such short names is, in part, because the original C linker only looked at the first 8 characters of symbol names.

                                                                                                                                This was actually part of the first C standard, and it was limited to “6 significant initial characters in an external identifier”. (Look for “2.2.4.1 Translation limits” here.)

                                                                                                                                This was almost certainly due to limitations from FORTRAN and the PDP-11. PDP-11 assembly language only considered the first 6 characters. (See section 3.2.2, point 3.) FORTRAN (in)famously only used 6 characters to distinguish names. If you wanted interoperability with those systems, which were still dominant in the 80s, then writing everything to fit in 6 characters made sense.
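                                                                                                                                A sketch of what that limit could mean in practice (function names and values are invented): under the C89 minimum of 6 significant initial characters, a strictly minimal linker was allowed to conflate these two external names, since both begin with getPri:

                                                                                                                                ```c
                                                                                                                                /* Both external names share their first 6 characters ("getPri"),
                                                                                                                                   so a C89 implementation honoring only the minimum guarantee could
                                                                                                                                   treat them as the same symbol. Names and values are invented. */
                                                                                                                                long getPriceInCents(void) { return 4999; }
                                                                                                                                long getPrintWidth(void)   { return 80; }
                                                                                                                                ```

                                                                                                                                Modern linkers distinguish arbitrarily long names, so this is only a concern for historical toolchains.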

                                                                                                                              3. 3
                                                                                                                                ShoeApi.GetPriceAndFeatures();
                                                                                                                                

                                                                                                                                is way more readable and explicit than

                                                                                                                                Api.GetPrcFeat();
                                                                                                                                

                                                                                                                                I see a few independent criteria in these comparisons:

                                                                                                                                1. The inclusion of the namespace (System. / ShoeApi. / Api vs… not)
                                                                                                                                2. Low vs. high context names (ShoeApi vs. Api)
                                                                                                                                3. Abbreviated names (GetPriceAndFeatures vs GetPrcFeat)

                                                                                                                                I prefer to elide the namespace if it appears more than a couple times. (X; Y; Z; over HighContextName.X; HighContextName.Y; HighContextName.Z;)

                                                                                                                                I prefer high context names, especially when there’s already a namespace. (service.Shoes.Api over service.Shoes.ShoeApi)

                                                                                                                                I prefer to not abbreviate. (GetPriceAndFeatures over GetPrcFeat … though both aren’t great.)

                                                                                                                                Best (if I must use a Java-like): Api.Fields(Price, Features).Get()

                                                                                                                              1. 3
                                                                                                                                • Drinking – not as much, tapering down now due to the line below this
                                                                                                                                • Going through boxing training again and exercising every day
                                                                                                                                • Mentally and physically preparing myself for schooling
                                                                                                                                • Struggling my way through the hell known as Python

Edit: Installing a distro on my PC. I just destroyed my setup by accident by deleting a bunch of root dirs. I already know… the moment I close anything out or reboot, this setup is toast.

                                                                                                                                1. 2

                                                                                                                                  What’s the current struggle with python?

                                                                                                                                  1. 3

                                                                                                                                    Everything. I dislike it completely at a fundamental level.

I hate the syntax, I hate the forced style for the syntax, I hate its tooling, I hate its ecosystem, and I hate its errors, which can get quite ridiculous (such as a missing or extra line stopping the whole script).

                                                                                                                                    The only reason I’m able to struggle through it is because I want to give it one more honest chance and write down all of my feelings about it.

                                                                                                                                    Every once in a while, I come across a nice feature that surprises me.

                                                                                                                                    But that’s it. It’s an honestly irredeemable language for me.

                                                                                                                                    1. 1

                                                                                                                                      Interesting! There are certainly aspects of python I dislike, but I mostly enjoy it.

                                                                                                                                      What languages do you prefer?

                                                                                                                                      1. 2

                                                                                                                                        Rust, BASH, PHP, Lua, Racket, and C++, in no particular order.

                                                                                                                                        1. 1

                                                                                                                                          You know, I haven’t spent much time with Lua and it has its own issues (cough global variables cough), but it is a really lovely little language. I like that it feels much simpler than python. If I were better with it I might not feel the need for python.

                                                                                                                                          1. 2

About the only positive thing I might say about Python (as someone who agrees with Phate6660 about Python, and loves using Lua) is that it comes with batteries, unlike Lua, which ships with a very minimal library. For what I use Lua for, that’s not an issue, but it is one of the main criticisms of Lua.

                                                                                                                                            1. 3

That’s actually a strike against Python when trying to use it in the kind of environment Lua is meant for (embedding in another program).

                                                                                                                                1. 1

                                                                                                                                  But why now?

Why ignore the problem for 25 years, then turn around and admit it, when null and undefined have been a problem for longer than most of us have been alive?

                                                                                                                                  1. 22

Because we are living through a programming language renaissance: it seems that major languages (Java, JavaScript, C++, Python) had a period of stability/stagnation, when they were considered “done”, as they had fully implemented their vision of OOP. Then the OOP craze subsided, and people started to look into adopting more FP ideas. Hence, all those languages finally started to get basic conveniences such as:

                                                                                                                                    • statement-level type inference (auto in C++, var in Java) / gradual types (TypeScript, mypy)
                                                                                                                                    • optional and result monads (?. in JS, optional/outcome in C++, optional in Java. Curiously, Python seems to get by without them)
• data types (records in Java, <=> operator/hash in C++, NamedTuple/@dataclass in Python. Curiously, JS seems to get by without them)
                                                                                                                                    • sum types / union types / pattern matching (pattern matching in Java, variant in C++, pattern matching in Python, union types in TS)
• destructuring declarations (auto [x, y] = in C++, richer unpacking syntax in Python, let { foo } in TypeScript. Nothing yet in Java, I think?)
                                                                                                                                    • async/await (curiously, Java seems to have chosen stackful coroutines instead)
• and, of course, lambdas. JS doesn’t get enough credit for being the first mainstream language with closures, though the syntax was later shortened. Python curiously got stuck in a local optimum, where lambda was pretty innovative at the time but now feels restrictive.
                                                                                                                                    1. 1

data types… Curiously, JS seems to get by without them

                                                                                                                                      FWIW the way anonymous objects (and their types in typescript) work in JS is just as convenient for casually structuring data as e.g. NamedTuple in Python.

                                                                                                                                      1. 1

I consider structural eq/hash/ord part of being a data type. I think JS still doesn’t have those?
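For what that means in practice, here’s a sketch using Python’s NamedTuple (the Shoe type is hypothetical): independently constructed values get field-by-field equality, hashing, and ordering for free.

```python
from typing import NamedTuple

class Shoe(NamedTuple):
    name: str
    price: float

a = Shoe("runner", 59.99)
b = Shoe("runner", 59.99)

print(a == b)        # True: compared field by field, not by identity
print(len({a, b}))   # 1: equal values hash the same, so sets deduplicate
print(a < Shoe("walker", 10.0))  # True: lexicographic field ordering
```

In JS, two separately constructed object literals with the same fields are never `===`, which is the gap being pointed out.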

                                                                                                                                        1. 1

                                                                                                                                          No and it’s never getting them, but oh well, eq is in lodash.

                                                                                                                                    2. 3

The “billion dollar mistake” hasn’t been ignored; modern languages (from the last 5-10 years) usually treat null and undefined completely differently than in the past. Nowadays they are treated as algebraic data types (like tagged unions) that must be explicitly checked by the developer, avoiding runtime NullReferenceExceptions.

New syntax, such as ?., makes these checks as easy as possible.

//C: this value may or may not be null
SomeStruct *s = get_struct();
//This may or may not be an error
int value = s->some_field;

//TypeScript
type ReturnValue = SomeStruct | null;
const s: ReturnValue = get_struct();
//The compiler will not allow this
const value = s.some_field;
//you have to do this
const value: number | null = s !== null ? s.some_field : null;
//or with the new syntax
const value = s?.some_field;


                                                                                                                                      Where it really shines, is chaining multiple checks together:

//Turn this:
interface I {
  field1?: {
    field2?: {
      field3?: number
    }
  }
}
function func(arg: I | undefined) {
  if (!arg) {
    return;
  }
  if (!arg.field1) {
    return;
  }
  if (!arg.field1.field2) {
    return;
  }
  return arg.field1.field2.field3;
}

//into this:
const func = (arg: I | undefined) => arg?.field1?.field2?.field3;
                                                                                                                                      
                                                                                                                                      1. 1

                                                                                                                                        Er, that’s my point. The problem has been known for fifty years, solutions have been known for a long time as well. Why did TS/JS ignore the problem until now?

                                                                                                                                        1. 5

                                                                                                                                          Because it used to have much bigger problems to deal with first.

                                                                                                                                          1. 3

                                                                                                                                            Just because someone knows the problem doesn’t mean it’s in everybody’s understanding, or on everybody’s roadmap.

                                                                                                                                            1. 1

                                                                                                                                              The problem was solved in a few languages for a long time (e.g. Haskell) - it’s just popularity.

                                                                                                                                          2. 2

                                                                                                                                            Honestly, having null and undefined doesn’t bother me at all, as long as they are part of the type system- as is the case in TypeScript.

                                                                                                                                            They mean subtly different things. The most common convention is that null means “this value has been explicitly set to empty/bottom/default” and undefined means “this value has not been set”. (I know this is not the case for the TypeScript compiler project that just prefers undefined for everything)

                                                                                                                                            It would be better to just have an Option<T> type, IMO, but TypeScript is trying to be a thin veneer over JavaScript, so that’s out of scope for it.

                                                                                                                                            1. 2

                                                                                                                                              In my opinion, null and undefined actually conflate three different cases:

                                                                                                                                              1. This variable has not been defined
                                                                                                                                              2. This variable has not been given a value
                                                                                                                                              3. This variable has no value (or value not found)
                                                                                                                                              1. 3

                                                                                                                                                Interesting. If I understand your list, #1 refers to a variable name just not existing at all in scope (never declared with var, let, const), #2 refers to a let or var declaration, but no assignment: let foo;, and #3 is setting something to null.

                                                                                                                                                I think you’re right and that makes sense, but do you see any value in differentiating scenarios 1 and 2?

                                                                                                                                                Also, if I understand and remember correctly, #1 is not allowed in TypeScript at all. So I think, for TypeScript, your list and my described convention is the same.

                                                                                                                                                1. 1

                                                                                                                                                  For languages where you do not have to pre-declare variables, then yes, there is a value in differentiating between 1 and 2.

                                                                                                                                          1. 5

Is it just me, or does a dataset of 200,000 pictures seem a bit small?

                                                                                                                                            1. 8

                                                                                                                                              For now. Who’s to say authorities won’t ask to scan photos for known terrorists, criminals, or political agitators? Or how long until Apple is “forced” to scan phones directly because pedophiles are avoiding the Apple Cloud?

                                                                                                                                              1. 11

                                                                                                                                                That’s not how the technology works. It matches known images only. Like PhotoDNA—the original technology used for this purpose—it’s resistant to things like cropping, resizing, or re-encoding. But it can’t do things like facial recognition, it only detects a fixed set of images compiled by various authorities and tech companies. Read this technical summary from Apple.

                                                                                                                                                FWIW, most major tech companies that host images have been matching against this shared database for years. Google Photos, Drive, Gmail, DropBox, OneDrive, and plenty more things commonly used on both iPhones and Androids. Apple is a decade late to this party—I’m genuinely surprised they haven’t been doing this already.

                                                                                                                                                1. 5

Apple already scans photos when they hit iCloud.

The difference is that now they’re making your phone scan its own photos before they ever leave your device.

                                                                                                                                                  1. 3

                                                                                                                                                    Only if they are uploaded to iCloud. I understand it feels iffy that the matching against known bad hashes is done on-device, but this could be a way to implement E2E for iCloud Photos later on.

                                                                                                                                              2. 4

                                                                                                                                                But one match is enough. The goal is to detect, not to rate.

                                                                                                                                              1. 24

I agree with most of what’s said in the article. However, it misses the forest for the trees a bit, at least considering the introduction. Programs that take seconds to load or display a list are not just ignoring cache optimizations. They’re using slow languages (or language implementations, for the pedants out there) like cpython, where even the simplest operations require a dictionary lookup, or using layers and layers of abstractions like electron, or making http requests for the most trivial things (I suspect it’s what makes slack slow; I know it’s what makes GCP’s web UI absolutely terrible). A lot of bad architectural choices, too.

                                                                                                                                                Cache optimizations can be important but only as the last step. There’s a lot to be fixed before that, imho.

                                                                                                                                                1. 16

Even beyond that, I think there are more baseline things going on: most developers don’t even benchmark or profile. In my experience the most egregious performance problems I’ve seen have been straight-up bugs, and they don’t get caught because nobody’s testing. And the profiler basically never agrees with what I would have guessed the problem was. I don’t disagree with the author’s overall point, but it’s rare to come across a program that’s slow enough to be a problem that doesn’t have much lower-hanging fruit than locality issues.

                                                                                                                                                  1. 3

I agree so much! I’d even say that profiling is one half of the problem (statistical profiling, that is, like perf). The other half is tracing, which nowadays can be done with very convenient tools like Tracy or the chrome trace visualizer (“catapult”) if you instrument your code a bit so it can spit out json traces. These give insights into where time is actually spent.
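As a sketch of what that instrumentation can look like (the `span` helper is my own invention, but the JSON follows the Chrome trace-event format that catapult and similar viewers load):

```python
import json
import time
from contextlib import contextmanager

events = []

@contextmanager
def span(name):
    # Record a "complete" ("X") trace event: begin timestamp plus
    # duration, both in microseconds, as the trace-event format expects.
    start = time.perf_counter()
    try:
        yield
    finally:
        dur = time.perf_counter() - start
        events.append({"name": name, "ph": "X", "pid": 1, "tid": 1,
                       "ts": start * 1e6, "dur": dur * 1e6})

with span("handle_request"):
    with span("parse_body"):
        sum(range(100_000))

# Dump {"traceEvents": [...]} to a .json file and open it in the viewer.
print(json.dumps({"traceEvents": events})[:60])
```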

                                                                                                                                                    1. 1

                                                                                                                                                      Absolutely. Most developers only benchmark if there’s a serious problem, and most users are so inured to bad response times that they just take whatever bad experience they receive and try to use the app regardless. Most of the time it’s some stupid thing the devs did that they didn’t realize and didn’t bother checking for (oops, looks like we’re instantiating this object on every loop iteration, look at that.)
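A hypothetical example of that bug class, the kind a profiler flags immediately even though the code looks innocent:

```python
import timeit

STOPWORDS = ["the", "a", "an", "of", "and"]

def filter_slow(words):
    # Bug: a fresh set is built on *every* loop iteration.
    return [w for w in words if w not in set(STOPWORDS)]

def filter_fast(words):
    stop = set(STOPWORDS)  # built once, reused for every membership test
    return [w for w in words if w not in stop]

words = ["the", "cache", "of", "doom"] * 1000
assert filter_slow(words) == filter_fast(words)
print(timeit.timeit(lambda: filter_slow(words), number=50))
print(timeit.timeit(lambda: filter_fast(words), number=50))
```

A profiler would show `set()` called once per element in the slow version; eyeballing the code rarely does.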

                                                                                                                                                    2. 9

                                                                                                                                                      Programs that take seconds to load or display a list are not just ignoring cache optimizations.

That’s right. I hammered on the cache example because it’s easy to show what a massive difference it can make, but I did not mean to imply that it’s the only reason. Basically, any time we lose track of what the computer must do, we risk introducing slowness. Now, I don’t mean that having layers of abstraction or using dictionaries is inherently bad (they will likely have a performance cost, but it may be reasonable to pay it to reach another objective), but we should make these choices intentionally rather than going by rote, by peer pressure, by habit, etc.

                                                                                                                                                      1. 5

                                                                                                                                                        The article implies the programmer has access to low level details like cache memory layout, but if you are programming in Python, Lua, Ruby, Perl, or similar, the programmer doesn’t have such access (and for those languages, the trade off is developer ease). I’m not even sure you get to such details in Java (last time I worked in Java, it was only a year old).

                                                                                                                                                        The article also makes the mistake that “the world is x86”—at work, we still use SPARC based machines. I’m sure they too have cache, and maybe the same applies to them, but micro-optimizations are quite difficult across different architectures (and even across the same family but different generations).

                                                                                                                                                        1. 6

                                                                                                                                                          The article implies the programmer has access to low level details like cache memory layout, but if you are programming in Python, Lua, Ruby, Perl, or similar, the programmer doesn’t have such access

The level of control that a programmer has is reduced in favor of other tradeoffs, as you said, but there’s still some amount of control. Often, it’s found in those languages’ best practices. For example, in Erlang one should prefer binaries for text rather than strings, because binaries are a contiguous sequence of bytes while strings are linked lists of characters. Another example: in Python it’s preferable to accumulate small substrings in a list and then use the join method rather than using concatenation (full += sub).
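A quick sketch of the Python case (CPython can sometimes optimize += on strings in place, so the measured gap varies, but join is the reliably linear choice):

```python
import timeit

def build_concat(parts):
    full = ""
    for s in parts:
        full += s  # may copy the accumulated string each time: O(n^2) worst case
    return full

def build_join(parts):
    return "".join(parts)  # single pass over the pieces: O(n)

parts = ["chunk"] * 10_000
assert build_concat(parts) == build_join(parts)
print(timeit.timeit(lambda: build_concat(parts), number=20))
print(timeit.timeit(lambda: build_join(parts), number=20))
```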

                                                                                                                                                          The article also makes the mistake that “the world is x86”—at work, we still use SPARC based machines. I’m sure they too have cache, and maybe the same applies to them, but micro-optimizations are quite difficult across different architectures (and even across the same family but different generations).

                                                                                                                                                          I don’t personally have that view, but I realize that it wasn’t made very clear in the text, my apologies. Basically what I want myself and other programmers to be mindful of is mechanical sympathy — to not lose track of the actual hardware that the program is going to run on.

                                                                                                                                                          1. 4

                                                                                                                                                            I know a fun Python example. Check this yes implementation:

                                                                                                                                                            def yes(s):
                                                                                                                                                              p = print
                                                                                                                                                              while True:
                                                                                                                                                                p(s)
                                                                                                                                                            
                                                                                                                                                            yes("y")
                                                                                                                                                            

                                                                                                                                                            This hot-loop will perform significantly better than the simpler print(s) because of the way variable lookups work in Python. It first checks the local scope, then the global scope, and then the built-ins scope before finally raising a NameError exception if it still isn’t found. By adding a reference to the print function to the local scope here, we reduce the number of hash-table lookups by 2 for each iteration!

                                                                                                                                                            I’ve never actually seen this done in real Python code, understandably. It’s counter-intuitive and ugly. And if you care this much about performance then Python might not be the right choice in the first place. The dynamism of Python (any name can be reassigned, at any time, even by another thread) is sometimes useful but it makes all these lookups necessary. It’s just one of the design decisions that makes it difficult to write a high-performance implementation of Python.
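Whether the alias trick still pays off on a given interpreter is easy to check; a minimal timeit sketch, assuming CPython (on recent versions the gap may be small or gone, which is exactly why it's worth measuring rather than assuming):

```python
import timeit

# Hypothetical micro-benchmark: look up a builtin on every iteration
# versus aliasing it to a local name first.
global_lookup = "for _ in range(1000): len('x')"
local_alias = "f = len\nfor _ in range(1000): f('x')"

t_global = timeit.timeit(global_lookup, number=2000)
t_local = timeit.timeit(local_alias, number=2000)
print(f"global lookup: {t_global:.4f}s  local alias: {t_local:.4f}s")
```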

                                                                                                                                                            1. 3

                                                                                                                                                              That’s not how scoping works in Python.

                                                                                                                                                              The Python parser statically determines the scope of a name (where possible.) If you look at the bytecode for your function (using dis.dis) you will see either a LOAD_GLOBAL, LOAD_FAST, LOAD_DEREF, or LOAD_NAME, corresponding to global, local, closure, or unknown scope. The last bytecode (LOAD_NAME) is the only situation in which multiple scopes are checked, and these are relatively rare to see in practice.
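This is easy to see with the dis module (a sketch; exact opcode names vary somewhat across CPython versions):

```python
import dis

def local_alias(s):
    p = print      # statically known local: stored/loaded with *_FAST opcodes
    p(s)

def global_lookup(s):
    print(s)       # resolved with LOAD_GLOBAL (module globals, then builtins)

# Compare the generated bytecode for the two lookup strategies.
dis.dis(local_alias)
dis.dis(global_lookup)
```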

                                                                                                                                                              The transformation from LOAD_GLOBAL to LOAD_FAST is not uncommon, and you see it in the standard library: e.g., https://github.com/python/cpython/blob/main/Lib/json/encoder.py#L259

                                                                                                                                                              I don’t know what current measurements of the performance improvement look like, after LOAD_GLOBAL optimisations in Python 3.9, which reported 40% improvement: https://bugs.python.org/issue26219 (It may be the case that the global-to-local transformation is no longer considered a meaningful speed-up.)

                                                                                                                                                              Note that the transformation from global-to-local scope, while likely innocuous, is a semantic change. If builtins.print or the global print is modified in some other execution unit (e.g., another thread,) the function will not reflect this (as global lookups can be considered late-bound, which is often desirable.)
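That semantic difference is observable; a small sketch in which rebinding builtins.print (as another execution unit might) is seen by a late-bound lookup but not by a function that captured the name at definition time:

```python
import builtins

def late_bound(s):
    # `print` is resolved at call time (globals, then builtins),
    # so it sees any later rebinding.
    print(s)

def early_bound(s, _print=print):
    # `_print` was bound once, when the function was defined.
    _print(s)

log = []
real_print = builtins.print
builtins.print = log.append        # simulate another execution unit rebinding it
try:
    late_bound("late")             # lands in `log`, not on stdout
finally:
    builtins.print = real_print

early_bound("early")               # still uses the original print
assert log == ["late"]
```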

                                                                                                                                                              1. 8

                                                                                                                                                                I think this small point speaks more broadly to the dissatisfaction many of us have with the “software is slow” mindset. The criticisms seem very shallow.

                                                                                                                                                                Complaining about slow software or slow languages is an easy criticism to make from the outside, especially considering that the biggest risk many projects face is failure to complete or failure to capture critical requirements.

                                                                                                                                                                Given a known, fixed problem with decades of computer science research behind it, it’s much easier to focus on performance—whether micro-optimisations or architectural and algorithmic improvements. Given three separate, completed implementations of the same problem, it’s easy to pick out which is the fastest and also happens to have satisfied just the right business requirements to succeed with users.

                                                                                                                                                                I think the commenters who suggest that performance and performance-regression testing should be integrated into the software development practice from the beginning are on the right track. (Right now, I think the industry is still struggling with getting basic correctness testing and documentation integrated into software development practice.)

                                                                                                                                                                But the example above shows something important. Making everything static or precluding a number of dynamic semantics would definitely give languages like Python a better chance at being faster. But these semantics are—ultimately—useful, and it may be difficult to predict exactly when and where they are critical to satisfying requirements.

                                                                                                                                                                It may well be the case that some languages and systems err too heavily on the side of allowing functionality that reduces the aforementioned risks. (It’s definitely the case that Python is more dynamic in design than many users make use of in practice!)

                                                                                                                                                                1. 2

                                                                                                                                                                  Interesting! I was unaware that the parser (!?) did that optimization. I suppose it isn’t difficult to craft code that forces LOAD_NAME every time (say, by reading a string from stdin and passing it to exec) but I find it totally plausible that that rarely happens in non-pathological code.

                                                                                                                                                                  Hm. For a lark, I decided to try it:

                                                                                                                                                                  >>> def yes(s):
                                                                                                                                                                  ...  exec("p = print")
                                                                                                                                                                  ...  p(s)
                                                                                                                                                                  ... 
                                                                                                                                                                  >>> dis.dis(yes)
                                                                                                                                                                    2           0 LOAD_GLOBAL              0 (exec)
                                                                                                                                                                                2 LOAD_CONST               1 ('p = print')
                                                                                                                                                                                4 CALL_FUNCTION            1
                                                                                                                                                                                6 POP_TOP
                                                                                                                                                                  
                                                                                                                                                                    3           8 LOAD_GLOBAL              1 (p)
                                                                                                                                                                               10 LOAD_FAST                0 (s)
                                                                                                                                                                               12 CALL_FUNCTION            1
                                                                                                                                                                               14 POP_TOP
                                                                                                                                                                               16 LOAD_CONST               0 (None)
                                                                                                                                                                               18 RETURN_VALUE
                                                                                                                                                                  >>> yes("y")
                                                                                                                                                                  Traceback (most recent call last):
                                                                                                                                                                    File "<stdin>", line 1, in <module>
                                                                                                                                                                    File "<stdin>", line 3, in yes
                                                                                                                                                                  NameError: name 'p' is not defined
                                                                                                                                                                  
                                                                                                                                                            2. 5

                                                                                                                                                              and for those languages, the trade off is developer ease

                                                                                                                                                              I heard Jonathan Blow make this point on a podcast and it stuck with me:

We’re trading off performance for developer ease, but is it really that much easier? It’s not like “well, we’re programming in a visual language and just snapping bits together in a GUI, and it’s slow, but it’s so easy we can make stuff really quickly.” Like Python is easier than Rust, but is it that much easier? In both cases, it’s a text-based OO language. One just lets you ignore types and memory lifetimes. But Python is still pretty complicated.

                                                                                                                                                              Blow is probably a little overblown (ha), but I do think we need to ask ourselves how much convenience we’re really buying by slowing down our software by factors of 100x or more. Maybe we should be more demanding for our slow downs and expect something that trades more back for it.

                                                                                                                                                              1. 2

                                                                                                                                                                Like Python is easier than Rust, but is it that much easier?

                                                                                                                                                                I don’t want to start a fight about types but, speaking for myself, Python became much more attractive when they added type annotations, for this reason. Modern Python feels quite productive, to me, so the trade-off is more tolerable.

                                                                                                                                                                1. 1

                                                                                                                                                                  It depends upon the task. Are you manipulating or parsing text? Sure, C will be faster in execution, but in development?

                                                                                                                                                                  At work, I was told to look into SIP, and I started writing a prototype (or proof-of-concept if you will) in Lua (using LPeg to parse SIP messages). That “proof-of-concept” went into production (and is still in production six years later) because it was “fast enough” for use, and it’s been easy to modify over the years. And if we can ever switch to using x86 on the servers [1], we could easily use LuaJIT.

                                                                                                                                                                  [1] For reasons, we have to use SPARC in production, and LuaJIT does not support that architecture.

                                                                                                                                                            3. 7

The trick about cache optimizations is that they can be a case where, sure, individually you’re shaving nanoseconds off, but sometimes those operations are alarmingly common in the program flow and worth addressing before any higher-level fixes.

                                                                                                                                                              To wit: I worked on a CAD system implemented in Java, and the “small optimization” of switching to a pooled-allocation strategy for vectors instead of relying on the normal GC meant the difference between an unusable application and a fluidly interactive one, simply because the operation I fixed was so core to everything that was being done.

                                                                                                                                                              Optimizing cache hits for something like mouse move math can totally be worth it as a first step, if you know your workload and what code is in the “hot” path (see also sibling comments talking about profiling).

                                                                                                                                                              1. 6

                                                                                                                                                                They’re using slow languages (or language implementations, for the pedantics out there) like cpython where even the simplest operations require a dictionary lookup

                                                                                                                                                                I take issue with statements like this, because the majority of code in most programs is not being executed in a tight loop on large enough data to matter. The overall speed of a program has more to do with how it was architected than with how well the language it’s written in scores on microbenchmarks.

Besides, Python’s performance cost isn’t just an oversight. It’s a tradeoff that provides benefits elsewhere in flexibility and extensibility. Problems like serialization are trivial because of meta-programming and reflection. Complex string manipulation code is simple because the GC tracks references for you and manages the cleanup. Building many types of tools is simpler because you can easily hook into stuff at runtime. Fixing an exception in a Python script is a far more pleasant experience than fixing a segfault in a C program that hasn’t been built with DWARF symbols.
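For instance, the serialization point: reflection lets one generic function handle any simple object with no per-class code (a sketch; the class and function names are hypothetical):

```python
import json

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

def to_json(obj):
    # vars() reflects over the instance's attributes at runtime,
    # so no per-class serialization code is needed.
    return json.dumps(vars(obj), sort_keys=True)

assert to_json(Point(1, 2)) == '{"x": 1, "y": 2}'
```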

                                                                                                                                                                Granted, modern compiled languages like Rust/Go/Zig are much better at things like providing nice error messages and helpful backtraces, but you’re paying a small cost for keeping a backtrace around in the first place. Should that be thrown out in favor of more speed? Depending on the context, yes! But a lot of code is just glue code that benefits more from useful error reporting than faster runtime.

                                                                                                                                                                For me, the choice in language usually comes down to how quickly I can get a working program with limited bugs built. For many things (up to and including interactive GUIs) this ends up being Python, largely because of the incredible library support, but I might choose Rust instead if I was concerned about multithreading correctness, or Go if I wanted strong green-thread support (Python’s async is kinda meh). If I happen to pick a “fast” language, that’s a nice bonus, but it’s rarely a significant factor in that decision making process. I can just call out to a fast language for the slow parts.

                                                                                                                                                                That’s not to say I wouldn’t have mechanical sympathy and try to keep data structures flat and simple from the get go, but no matter which language I pick, I’d still expect to go back with a profiler and do some performance tuning later once I have a better sense of a real-world workload.

                                                                                                                                                                1. 4

                                                                                                                                                                  To add to what you say: Until you’ve exhausted the space of algorithmic improvements, they’re going to trump any microoptimisation that you try. Storing your data in a contiguous array may be more efficient (for search, anyway - wait until you need to insert something in the middle), but no matter how fast you make your linear scan over a million entries, if you can reframe your algorithm so that you only need to look at five of them to answer your query then a fairly simple data structure built out of Python dictionaries will outperform your hand-optimised OpenCL code scanning the entire array.
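A sketch of that point (names and sizes hypothetical): a one-time index answers each query by touching a handful of entries, while even a tight linear scan must touch them all:

```python
# A million (key, value) records, queried by key.
records = [(i, f"value-{i}") for i in range(1_000_000)]

def scan(key):
    # O(n) per query: micro-optimisable, but asymptotically worse.
    for k, v in records:
        if k == key:
            return v
    return None

index = dict(records)              # one-time O(n) build

def lookup(key):
    # O(1) expected per query: the algorithmic improvement.
    return index.get(key)

assert scan(999_999) == lookup(999_999) == "value-999999"
```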

                                                                                                                                                                  The kind of microoptimisation that the article’s talking about makes sense once you’ve exhausted algorithmic improvements, need to squeeze the last bit of performance out of the system, and are certain that the requirements aren’t going to change for a while. The last bit is really important because it doesn’t matter how fast your program runs if it doesn’t solve the problem that the user actually has. grep, which the article uses as an example, is a great demonstration here. Implementations of grep have been carefully optimised but they suffer from the fact that requirements changed over time. Grep used to just search ASCII text files for strings. Then it needed to do regular expression matching. Then it needed to support unicode and do unicode canonicalisation. The bottlenecks when doing a unicode regex match over a UTF-8 file are completely different to the ones doing fixed-string matching over an ASCII text file. If you’d carefully optimised a grep implementation for fixed-string matching on ASCII, you’d really struggle to make it fast doing unicode regex matches over arbitrary unicode encodings.

                                                                                                                                                                  1. 1

                                                                                                                                                                    The kind of microoptimisation that the article’s talking about makes sense once you’ve exhausted algorithmic improvements, need to squeeze the last bit of performance out of the system, and are certain that the requirements aren’t going to change for a while.

                                                                                                                                                                    To be fair, I think the article also speaks of the kind of algorithmic improvements that you mention.

                                                                                                                                                                  2. 3

                                                                                                                                                                    Maybe it’s no coincidence that Django and Rails both seem to aim at 100 concurrent requests, though. Both use a lot of language magic (runtime reflection/metaprogramming/metaclasses), afaik. You start with a slow dynamic language, and pile up more work to do at runtime (in this same slow language). In this sense, I’d argue that the design is slow in many different ways, including architecturally.

                                                                                                                                                                    Complex string manipulation code is simple because the GC tracks references for you

                                                                                                                                                                    No modern language has a problem with that (deliberately ignoring C). Refcounted/GC’d strings are table stakes.

                                                                                                                                                                    I personally dislike Go’s design a lot, but it’s clearly designed in a way that performance will be much better than python with enough dynamic features to get you reflection-based deserialization.

                                                                                                                                                                  3. 1

All the times I had an urge to fire up a profiler, the problem was either an inefficient algorithm (worse big-O) or repeated database fetches (inefficient cache usage). Never have I found that performance was bad because of slow abstractions. Of course, this might be because the software I work with (Python web services) has a lot of experience behind it in crafting good, fast abstractions. You can find new people writing Python who don’t use them, which results in bad performance, but that is quickly learned away. What is important if you want to write performant Python code is to use as little “pure Python” as possible. Python is a great glue language, and it works best when it is used that way.

                                                                                                                                                                    1. 1

                                                                                                                                                                      Never have I found that performance was bad because of slow abstractions.

                                                                                                                                                                      I have. There was the time when fgets() was the culprit, and another time when checking the limit of a string of hex digits was the culprit. The most surprising result I’ve had from profiling is a poorly written or poorly compiled library.

                                                                                                                                                                      Looking back on my experiences, I would have to say I’ve been surprised by a profile result about half the time.

                                                                                                                                                                    2. 1

                                                                                                                                                                      As a pedantic out here, I wanted to say that I appreciate you :)

                                                                                                                                                                    1. 8

I don’t think I’m ever going to not hate TypeScript. What’s worse is that I feel it would most likely both keep things cleaner in the long run and help noobs get into my codebase more easily. It feels like a semi-enforced documentation step.

But I hate it. I hate how it looks. I hate having a compile step. I hate having to annotate everything. I hate how everyone thinks you need it for modern development and that without it you’re coding in the stone age. I hate that it’s really just suggesting types, since I can use any as a void*, which, given my aversion, I almost always do whenever I’ve had to deal with it. I even hate that the article’s author is using a ligature-based font - and that one isn’t even fair.

                                                                                                                                                                      /rant

                                                                                                                                                                      If someone started a project with it I could understand why. It really does make a lot of sense. But I’ll never use it of my own free will.

                                                                                                                                                                      1. 3

Some people may have thought I wrote the above rant (including about the font). What I really dislike is how it’s treated as de facto. It bothers me when a team implements Haskell on the back-end because they value types, strict management of IO, no any, and ergonomic composition for a functional code base, and then says “yup, TypeScript and fp-ts is good enough for our front-end”.

                                                                                                                                                                        1. 2

                                                                                                                                                                          I mean, TS with strict mode, noUncheckedIndexedAccess, plus all the ESLint rules that forbid any and all other unsafety is getting pretty close to Haskell in terms of type safety IMO.
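As a concrete illustration of what noUncheckedIndexedAccess buys you (this is a sketch; the firstWord function is hypothetical, not from any comment above): with the flag on, indexing an array yields T | undefined, so the compiler forces a check, much like pattern matching on Haskell’s Maybe.

```typescript
// With "noUncheckedIndexedAccess": true in tsconfig.json, words[0] has
// type `string | undefined`, so using it without a check is a compile
// error -- comparable to `listToMaybe` returning Maybe in Haskell.
function firstWord(words: string[]): string {
  const w = words[0]; // string | undefined under the flag
  if (w === undefined) {
    return "<empty>";
  }
  return w.toUpperCase(); // narrowed to string here
}

console.log(firstWord(["hello", "world"])); // HELLO
console.log(firstWord([]));                 // <empty>
```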

                                                                                                                                                                          1. 2

                                                                                                                                                                            Out of curiosity, have you used Haskell professionally?

                                                                                                                                                                            1. 2

No, although from what I gather from people who have, it’s used quite differently from Haskell in academic contexts. Far fewer weird tricks (lenses was the example they gave), and a willingness to accept IO in places where it is fundamentally necessary (e.g. managing application state shared across concurrency boundaries).

                                                                                                                                                                            2. 2

Without an IO monad, do-notation, and infix composition operators, it’s not close in ergonomics.

                                                                                                                                                                              1. 2

                                                                                                                                                                                Well, Promise is the equivalent of the IO monad, which makes async/await equivalent to do in that domain. Not having general monads for failure or immutable state is clunky though. Composition/infix can be emulated with “fluent” object interfaces.

                                                                                                                                                                                So yes, they’re obviously very different languages, but the point is that TS can be a very safe, expressive type system if you use it right.
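The Promise/do-notation correspondence in the comment above can be sketched like this (the fetchUser and fetchOrderCount functions are hypothetical stand-ins, assumed purely for illustration):

```typescript
// async/await sequences Promises much as do-notation sequences IO
// actions. Both stand-ins below fake effects with pure values.
type User = { id: number; name: string };

async function fetchUser(id: number): Promise<User> {
  return { id, name: "user" + id }; // stand-in for a real lookup
}

async function fetchOrderCount(user: User): Promise<number> {
  return user.id * 2; // stand-in for a real query
}

// Roughly: do { u <- fetchUser 1; n <- fetchOrderCount u; pure ... }
async function report(): Promise<string> {
  const u = await fetchUser(1);
  const n = await fetchOrderCount(u);
  return `${u.name}: ${n} orders`;
}

report().then(console.log); // "user1: 2 orders"
```

The analogy is only to sequencing; as the reply below this comment notes, Promise differs from IO in evaluation strategy.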

                                                                                                                                                                                1. 1

I’d disagree, not because your perspective of similarity is wrong, but because those differences matter a lot to me. Having seen a lot of libraries with .pipe() et al. all over the place, it’s very hard to read when you know the author would rather be in a language where this style is supported first-class. Comparing Promise isn’t great either: it’s not synchronous; it’s eagerly invoked when constructed, so many libraries wrap it in a thunk; canceling is a pain and rarely supported; and the error handling with reject and catch is a mess across libraries, instead of branching on a Result or Either.

                                                                                                                                                                          2. 2

                                                                                                                                                                            Please note that I’ve used TS in many different contexts beyond your standard browser or node server and a portion of my frustration comes from trying to get TS running in atypical environments.

                                                                                                                                                                            For me TS is still frustrating with new projects. I’ve found that the configuration for getting a new TS project up and running with tests and build targets is very complex and fragile. The few times I’ve tried to start with TS I found it cost me so much time in setup that I would have been better off sticking with plain JS. In addition, I have encountered runtime errors multiple times despite compiling in strict mode and using as much typing as possible.

Given this, I do think it provides some minimal benefit, around things like catching basic type and name errors, that can make it worthwhile for most projects to use.

                                                                                                                                                                            1. 2

                                                                                                                                                                              How much experience do you have with languages that require compilation? I ask, because my experience (not with Typescript, but in general) is that I tend to dislike dynamic languages (even though I love using Lua) because the types are so loose (I learned C early on in my career). I have a table in Lua, I have no idea what can be in it, whereas in C, I know because I can look up the structure definition.

                                                                                                                                                                              1. 1

                                                                                                                                                                                C++, Go, and Java. Java professionally. I dislike them all equally on the typing front. If anything it would be nice if types just magically sprung up after I’ve finished writing the rough draft of the code and had no enforcement until I’ve left the code base for a while. The thing is that while I’m in the code I know what I’m doing. It’s after I come back from working on something else I’m lost.

                                                                                                                                                                                1. 1

                                                                                                                                                                                  Have you ever used languages with whole program type inference? (E.g., Haskell, F#) That might be what you are looking for.

                                                                                                                                                                            1. 5

                                                                                                                                                                              I thought that writing to one union member and reading from another is UB in C? Or does that apply only to members of different type?

                                                                                                                                                                              1. 2

Only in C++, not C. Unions (along with memcpy) are the ‘correct’ way to do type punning.

                                                                                                                                                                                1. 4
                                                                                                                                                                                2. 1

                                                                                                                                                                                  It is (probably due to possible “trap values” [1]) but there’s a lot of code that assumes one can do that.

                                                                                                                                                                                  [1] In over 30 years of coding, I’ve yet to program on an architecture that has a “trap value”. I’ve yet to program on something other than a byte-oriented, 2’s complement machine.

                                                                                                                                                                                1. 28

                                                                                                                                                                                  I just can’t shake the feeling that Kubernetes is Google externalizing their training costs to the industry as a whole (and I feel the same applies to Go).

                                                                                                                                                                                  1. 9

                                                                                                                                                                                    Golang is great for general application development, IME. I like the culture of explicit error handling with thoughtful error messages, the culture of debugging with fast unit tests, when possible, and the culture of straightforward design. And interfaces are great for breaking development into components which can be developed in parallel. What don’t you like about it?

                                                                                                                                                                                    1. 12

                                                                                                                                                                                      It was initially the patronizing quote from Rob Pike that turned me off Go. I’m also not a fan of gofmt [1] (and I’m not a fan of opinionated software in general, unless I’m the one controlling the opinions [2]). I’m also unconvinced about the whole “unit testing” thing [5]. Also, it’s from Google [3]. I rarely mention it, because it goes against the current zeitgeist (especially at the Orange Site), and really, what can I do about it?

                                                                                                                                                                                      [1] I’m sorry, but opening braces go on their own line. We aren’t developing on 24 line terminals anymore, so stop shoving your outdated opinions in my face.

                                                                                                                                                                                      [2] And yes, I realize I’m being hypocritical here.

                                                                                                                                                                                      [3] Google is (in my opinion, in case that’s not apparent) shoving what they want on the entire industry to a degree that Microsoft could only dream of. [4]

                                                                                                                                                                                      [4] No, I’m not bitter. Really!

[5] As an aside, but up through late 2020, my department had a method of development that worked (and it did not involve anything resembling a “unit test”)—in 10 years we only had two bugs get to production. In the past few months there’s been a management change and a drastic change in how we do development (Agile! Scrum! Unit tests über alles! We want push button testing!) and so far, we’ve had four bugs in production.

                                                                                                                                                                                      Way to go!

                                                                                                                                                                                      I should also note that my current manager retired, the other developer left for another job, and the QA engineer assigned to our team also left for another job (but has since come back because the job he moved to was worse, and we could really use him back in our office). So nearly the entire team was replaced back around December of 2020.

                                                                                                                                                                                      1. 11

                                                                                                                                                                                        I can’t even tell if this is a troll post or not.

                                                                                                                                                                                        1. 1

                                                                                                                                                                                          I can assure you that I’m not intentionally trolling, and those are my current feelings.

                                                                                                                                                                                        2. 2

                                                                                                                                                                                          I’m sorry, but opening braces go on their own line. We aren’t developing on 24 line terminals anymore, so stop shoving your outdated opinions in my face.

                                                                                                                                                                                          I use a portrait monitor with a full-screen Emacs window for my programming, and I still find myself wishing for more vertical space when programming in curly-brace languages such as Go. And when I am stuck on a laptop screen I am delighted when working on a codebase which does not waste vertical space.

                                                                                                                                                                                          Are you perhaps younger than I am, with very small fonts configured? I have found that as I age I find a need for large and larger fonts. Nothing grotesque yet, but I went from 9 to 12 to 14 and, in a few places, 16 points. All real 1/72” points, because I have my display settings configured that way. 18-year-old me would have thought I am ridiculous! Granted, you’ve been at your current employer at least 10 years, so I doubt you are 18🙂

                                                                                                                                                                                          I’m also unconvinced about the whole “unit testing” thing … my department had a method of development that worked (and it did not involve anything resembling a “unit test”)—in 10 years we only had two bugs get to production. In the past few moths there’s been a management change and a drastic change in how we do development (Agile! Scrum! Unit tests über alles! We want push button testing!) and so far, we’ve had four bugs in production.

I suspect that the increase in bugs has to do with the change in process rather than the testing regime. Adding more tests on its own can only lead to more bugs if either incorrect tests flag correct behaviour as bugs (leading to buggy ‘bugfixes,’ or rework to fix the tests), or if correct tests for unimportant bugs lead to investing resources inefficiently, or if the increased emphasis leads to worse code architecture or rework rewriting old code to conform to the new architecture (I think I covered all the bases here). OTOH, changing development processes almost inevitably leads to poor outcomes in the short term: there is a learning curve; people and secondary processes must adapt &c.

                                                                                                                                                                                          That is worth it if the long-term outcomes are sufficiently better. In the specific case of unit testing, I think it is worth it, especially in the long run and especially as team size increases. The trickiest thing about it in my experience has been getting the units right. I feel pretty confident about the right approach now, but … ask me in a decade!

                                                                                                                                                                                          1. 2

                                                                                                                                                                                            Are you perhaps younger than I am, with very small fonts configured?

                                                                                                                                                                                            I don’t know, you didn’t give your age. I’m currently 52, and my coworkers (back when I was in the office) often complained about the small font size I use (and have used).

                                                                                                                                                                                            I suspect that the increase in bugs has to do with the change in process rather than the testing regime.

                                                                                                                                                                                            The code (and it’s several different programs that comprise the whole thing) was not written with unit testing in mind (even though it was initially written in 2010, it’s in C89/C++98, and the developer who wrote it didn’t believe in unit tests). We do have a regression test that tests end-to-end [1] but there are a few cases that as of right now require manual testing [2], which I (as a dev) can do, but generally QA does a more in-depth testing. And I (or rather, we devs did, before the major change) work closely with the QA engineer to coordinate testing.

                                                                                                                                                                                            And that’s just the testing regime. The development regime is also being forced changed.

                                                                                                                                                                                            [1] One program to generate the data required, and another program that runs the eight programs required (five of which aren’t being tested but need to be endpoints our stuff talks to) and runs through 15,800+ tests we have (it takes around two minutes). It’s gotten harder to add tests to it (the regression test is over five years old) due to the nature of how the cases are generated (automatically, and not all cases generated are technically “valid” in the sense we’ll see it in production).

                                                                                                                                                                                            [2] Our business logic module queries two databases at the same time (via UDP—they’re DNS queries), so how does one automate the testing of result A returns before result B, result B returns before result A, A returns but B times out, B returns and A times out? The new manager wants “push button testing”.

                                                                                                                                                                                            1. 1

                                                                                                                                                                                              [2] Our business logic module queries two databases at the same time (via UDP—they’re DNS queries), so how does one automate the testing of result A returns before result B, result B returns before result A, A returns but B times out, B returns and A times out? The new manager wants “push button testing”

                                                                                                                                                                                              Here are three options, but there are many others:

                                                                                                                                                                                              1. Separate the networking code from the business logic, test the business logic
                                                                                                                                                                                              2. Have the business logic send to a test server running on localhost, have it send back results ordered as needed
                                                                                                                                                                                              3. Change the routing configuration or use netfilter to rewrite the requests to a test server, have it send back results ordered as needed.

                                                                                                                                                                                              Re-ordering results from databases is a major part of what Jepsen does; you could take ideas from there too.
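Option 1 above (separating the lookups from the business logic) makes the orderings testable deterministically. A sketch, with hypothetical names and a trivial stand-in for the real business logic: inject the two lookups as functions, and let the test decide when each resolves or times out.

```typescript
// Business-logic stand-in: query A and B concurrently, treat a timeout
// sentinel as a miss, and combine whatever arrived. The 50 ms timeout
// is an arbitrary test value, not the production one.
type Lookup = () => Promise<string>;
const TIMEOUT = Symbol("timeout");

async function combine(lookupA: Lookup, lookupB: Lookup): Promise<string> {
  const guard = (p: Promise<string>) =>
    Promise.race([
      p,
      new Promise<typeof TIMEOUT>(r => setTimeout(() => r(TIMEOUT), 50)),
    ]);
  const [a, b] = await Promise.all([guard(lookupA()), guard(lookupB())]);
  return `${a === TIMEOUT ? "-" : a}/${b === TIMEOUT ? "-" : b}`;
}

// Test helper: a lookup the test resolves by hand, fixing the order.
function controllable(): { lookup: Lookup; resolve: (v: string) => void } {
  let resolve!: (v: string) => void;
  const p = new Promise<string>(r => { resolve = r; });
  return { lookup: () => p, resolve };
}

async function demo() {
  // Case: B returns before A.
  const a = controllable(), b = controllable();
  const result = combine(a.lookup, b.lookup);
  b.resolve("B");
  a.resolve("A");
  console.log(await result); // "A/B"

  // Case: A returns, B times out.
  const c = controllable(), d = controllable();
  const r2 = combine(c.lookup, d.lookup);
  c.resolve("A"); // d is never resolved
  console.log(await r2); // "A/-"
}

demo();
```

The same helper covers “A before B” and “B times out” by swapping which side the test resolves, so all four orderings become push-button cases.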

                                                                                                                                                                                              1. 1
                                                                                                                                                                                                1. Even if that was possible (and I wish it was), I would still have to test the networking code to ensure it’s working, per the new regime.
                                                                                                                                                                                                2. That’s what I’m doing
                                                                                                                                                                                                3. I’m not sure I understand what you mean by “routing configuration”, but I do understand what “netfilter” is, and my response to that is—the new regime wants “push button testing,” and if there’s a way to automate that, then that is an option.
                                                                                                                                                                                                1. 2
                                                                                                                                                                                                  1. Yes, of course the networking code would still need to be tested.

                                                                                                                                                                                                    Ideally, the networking code would have its own unit tests. And, of course, unit tests don’t replace integration tests. Test pyramid and such.

                                                                                                                                                                                                  2. 🚀

                                                                                                                                                                                                  3. netfilter can be automated. It’s an API.

                                                                                                                                                                                                  What’s push button testing?

                                                                                                                                                                                                  1. 1

                                                                                                                                                                                                    You want to test the program. You push a button. All the tests run. That’s it. Fully automated testing.

                                                                                                                                                                                                    1. 1

                                                                                                                                                                                                      👌🏾

                                                                                                                                                                                                      Everything I’ve worked on since ~2005 has been fully and automatically tested via continuous integration. IMHO it’s a game changer.

                                                                                                                                                                                          2. 1

                                                                                                                                                                                            Would love to hear about your prior development method. Did adopting the new practices have any upsides?

                                                                                                                                                                                            1. 4

                                                                                                                                                                                              First off, our stuff is a collection of components that work together. There are two front-end pieces (one for SS7 traffic, one for SIP traffic) that then talk to the back-end (that implements the business logic). The back-end makes parallel DNS queries [1] to get the required information, muck with the data according to the business logic, then return data to the front-ends to ultimately return the information back to the Oligarchic Cell Phone Companies. Since this process happens as a call is being placed we are on the Oligarchic Cell Phone Companies network, and we have some pretty short time constraints. And due to this, not only do we have some pretty severe SLAs, but any updates have to be approved 10 business days before deployment by said Oligarchic Cell Phone Companies. As a result, we might get four deployments per year [2].

                                                                                                                                                                                              And the components are written in a combination of C89, C++98 [3], C99, and Lua [4].

                                                                                                                                                                                              So, now that you have some background, our development process. We do trunk based development (all work done on one branch, for the most part). We do NOT have continuous deployment (as noted above). When working, we developers (which never numbered more than three) would do local testing, either with the regression test, or another tool that allows us to target a particular data configuration (based off the regression test, which starts eight programs, five of which are just needed for the components being tested). Why not test just the business logic? Said logic is spread throughout the back-end process, intermixed with all the I/O it does (it needs data from multiple sources, queried at the same time).

Anyway, code is written, committed (main line), tested, fixed, committed (main line), repeat, until we feel it’s good. And the “tested” part not only includes us developers, but also QA at the same time. Once it’s deemed working (using both regression testing and manual testing), we then officially pass it over to QA, who walks it down the line from the QA servers, staging servers and finally (once we get permission from the Oligarchic Cell Phone Companies) into production, where not only devops is involved, but QA and the developer whose code is being installed (at 2:00 am Eastern, Tuesday, Wednesday or Thursday, never Monday or Friday).

                                                                                                                                                                                              Due to the nature of what we are dealing with, testing at all is damn near impossible (or rather, hideously expensive, because getting actual cell phone traffic through the lab environment involves, well, being a phone company (which we aren’t), very expensive and hard to get equipment, and a very expensive and hard to get laboratory setup (that will meet FCC regulations, blah blah yada yada)) so we do the best we can. We can inject messages as if they were coming from cell phones, but it’s still not a real cell phone, so there is testing done during deployment into production.

It's been a 10-year process, and it had gotten better, until this past December.

Now it's all Agile, scrum, stories, milestones, sprints, and unit testing über alles! As I told my new manager, why bother with a two-week sprint when the Oligarchic Cell Phone Companies have a two-year sprint? It's not like we ever did continuous deployment. Could more testing be done automatically? I'm sure, but there are aspects that are very difficult to test automatically [5]. Also, more branch development. I wouldn't mind this so much, except we're using SVN (for reasons that are mostly historical at this point), and branching is … um … not as easy as in git. [6] And the new developer sent me diffs to ensure his work passes the tests. When I asked him why he didn't check the new code in, he said he was told by the new manager not to, as it could "break the build." But we've broken the build before; all we do is fix the code and check it in [8]. But no, no "breaking the build," even though we don't do continuous integration, nor continuous deployment, and what deployment process we do have pins the Jenkins build number of whatever does get pushed (or considered "gold").

Is there any upside to the new regime? Well, I have rewritten the regression test (for the third time now) to include such features as "delay this response" and "did we not send a notification to this process". I should note that this is code for us, not for our customer, which, need I remind people, is the Oligarchic Cell Phone Companies. If anyone is interested, I have spent June and July blogging about this (among other things).

                                                                                                                                                                                              [1] Looking up NAPTR records to convert phone numbers to names, and another set to return the “reputation” of the phone number.
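Converting a phone number into a name you can query NAPTR records for is the standard ENUM scheme (RFC 6116): strip the number to its digits, reverse them, dot-separate them, and append the `e164.arpa` suffix. A minimal sketch of that mapping (the function name is mine, and I'm assuming the public `e164.arpa` tree rather than whatever private tree the author's data sources actually use):

```python
def enum_domain(number: str) -> str:
    """Map an E.164 phone number to its ENUM domain name (RFC 6116).

    E.g. +1-555-123-4567 -> 7.6.5.4.3.2.1.5.5.5.1.e164.arpa
    """
    digits = [c for c in number if c.isdigit()]
    return ".".join(reversed(digits)) + ".e164.arpa"
```

The resulting domain is what you hand to your DNS resolver's NAPTR query; the returned records then carry the regexp/replacement rules that yield a name or URI.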

                                                                                                                                                                                              [2] It took us five years to get one SIP header changed slightly by the Oligarchic Cell Phone Companies to add a bit more context to the call. Five years. Continuous deployment? What’s that?

                                                                                                                                                                                              [3] The original development happened in 2010, and the only developer at the time was a) very conservative, b) didn’t believe in unit tests. The code is not written in a way to make it easy to unit test, at least, as how I understand unit testing.

                                                                                                                                                                                              [4] A prototype I wrote to get my head around parsing SIP messages that got deployed to production without my knowing it by a previous manager who was convinced the company would go out of business if it wasn’t. This was six years ago. We’re still in business, and I don’t think we’re going out of business any time soon.

[5] As I mentioned, we have multiple outstanding requests to various data sources, and other components that are notified via a "fire and forget" mechanism (UDP, but it's all on the same segment) that the new regime wants to ensure get notified correctly. Think about that for a second: how do you prove a negative? That is, how do you prove that something that wasn't supposed to happen (like a component being notified when it shouldn't be) in fact didn't happen?
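
The usual compromise for this kind of negative test is to listen where the notification would arrive and assert that nothing shows up within a timeout. A sketch of that idea (the function name and timeout are mine, and it assumes the fire-and-forget messages are plain UDP datagrams, as described above):

```python
import socket

def no_notification_within(port: int, timeout: float = 0.5) -> bool:
    """Return True if no UDP datagram arrives on `port` within `timeout` seconds.

    Note the limitation: this only bounds the wait. It cannot prove that a
    late datagram will never arrive, so the negative is never truly proven.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("127.0.0.1", port))  # port 0 = any free ephemeral port
        sock.settimeout(timeout)
        try:
            sock.recvfrom(65535)
            return False  # a notification did arrive
        except socket.timeout:
            return True   # nothing arrived within the window
    finally:
        sock.close()
```

Which is exactly the author's point: the best such a test can say is "nothing arrived in N seconds," not "nothing will ever arrive."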

[6] I think we're the only department left using SVN; the rest of the company has switched to git. Why are we still on SVN? 1) Because the Solaris [7] build servers aren't configured to pull from git yet, and 2) the only redeeming feature of SVN is the ability to check out a subdirectory, which, given the layout of our repository and how devops want the build servers configured, is used extensively. I did look into using git submodules, but man, what a mess. It totally doesn't work for us.

                                                                                                                                                                                              [7] Oh, did I neglect to mention we’re still using Solaris because of SLAs? Because we are.

                                                                                                                                                                                              [8] Usually, it’s Jenkins that breaks the build, not the code we checked in. Sometimes, the Jenkins checkout fails. Devops has to fix the build server [7] and try the call again.

                                                                                                                                                                                              1. 2

                                                                                                                                                                                                As a result, we might get four deployments per year [2]

                                                                                                                                                                                                AIUI most agile practices are to decrease cycle time and get faster feedback. If you can’t, though, then you can’t! Wrong practices for the wrong context.

                                                                                                                                                                                                I feel for you.

                                                                                                                                                                                                1. 1

                                                                                                                                                                                                  Thank you! More grist for my “unit testing is fine in its place” mill.

                                                                                                                                                                                                  Also: hiring new management is super risky.