Threads for Loup-Vaillant

  1. 4

    Although the keys of this initial encryption are known to observers of the connection

    I haven’t looked at the specs yet. Is that true? Isn’t that horrible?

    1. 6

      I think it’s fundamentally unavoidable. At the point that a browser initiates a connection to a server, the server doesn’t yet know which certificate to present. DH alone doesn’t authenticate that you haven’t been MITM’d.

      1. 5

        It’s not unavoidable if both parties can agree on a PSK (pre-shared key) out-of-band, or from a previous session - and IIRC, the TLS 1.3 0-RTT handshake which is now used by QUIC can negotiate PSKs or tickets for use in future sessions once key exchange is done. But for the first time connection between two unknown parties, it is certainly unavoidable when SNI is required, due to the aforementioned inability to present appropriate certificates.

        1. 2

          On the other hand, if you have been MitM’d you’ll notice it instantly (and know that the server certificate has been leaked to Mallory in the Middle). And now every connection you make is broken, including the ones they did not want to block. I see to ways of avoiding that:

          1. Don’t actually MitM.
          2. Be a certificate authority your users “trust” (install your public key in everyone’s computers, mostly).
          1. 2

            No, but DH prevents sending a key across the wire, making them known and prevents passive observers from reading ciphertext. Wouldn’t it make sense to talk to the server first?

            1. 3

              Without some form of authentication (provided by TLS certificates in this case), you have no way to know whether you’re doing key exchange with the desired endpoint or some middlebox, so you don’t really gain anything there.

              1. 3

                You gain protection against passive observers, thereby increasing costs of attackers trying to snoop on what services people connect to. Also when you then anyways end up receiving the certificate you at worst retro-actively could verify you weren’t snooped at, which is more than you have when it’s actually that you send a key that allows you to decrypt, which still sounds odd to me.

                1. 3

                  What you’re suggesting is described on https://www.ietf.org/id/draft-duke-quic-protected-initial-04.html This leverages TLS’s encrypted client hello to generate QUIC’s INITIAL keys.

              2. 1

                I don’t know how much sense it makes? Doing a DH first adds more round trips to connection start, which is the specific thing QUIC is trying to avoid, and changes the way TLS integrates with the protocol, which affects implementability, the main hurdle QUIC has had to overcome.

                1. 1

                  I get that, but how does it make sense to send something encrypted when you send the key to decrypt it with it? You might as well save that step, after all the main reason to encrypt something is to prevent it from being read.

                  EDIT: How that initial key is sent isn’t part of TLS, is it? It’s part of QUIC-TLS (RFC9001). Not completely sure, but doesn’t regular 0-RTT in TLSv1.3 work differently?

                  1. 5

                    The purpose of encrypting initial packets is to prevent ossification.

                    1. 1

                      Okay, but to be fair that kind of still makes it seem like the better choice would be unauthenticated encryption that is not easily decryptable.

                      I know 0RTT is a goal but at least to me it seems like the tradeoff isn’t really worth it.

                      Anyways thanks for your explanations. It was pretty insightful.

                      I guess I’ll read through more quic and TLS on the weekend if I have time.

                      1. 1

                        The next version of QUIC has a different salt which prevents ossification. To achieve encryption without authentication, the server and the client can agree on a different salt. There’s a draft describing this approach, I think.

                    2. 1

                      how does it make sense to send something encrypted when you send the key to decrypt it with it?

                      According to https://quic.ulfheim.net/ :

                      Encrypting the Initial packets prevents certain kinds of attacks such as request forgery attacks.

              3. 2

                It’s not more horrible than the existing TLS 1.3 :-) I sent out a link to something that may be of interest to you.

                1. 0

                  It’s only the public keys that are known, and if they did their job well, they only need to expose ephemeral keys (which are basically random, and thus don’t reveal anything). In the end, the only thing an eavesdropper can know is the fact you’re initiating a QUIC connection.

                  If you want to hide that, you’d have to go full steganography. One step that can help you there is making sure ephemeral keys are indistinguishable from random numbers (With Curve25519, you can use Elligator). Then you embed your abnormally high-entropy traffic in cute pictures of cats, or whatever will not raise suspicion.

                  1. 1

                    This is incorrect, see RFC9001. As a passive observer you have all the information you need to decrypt the rest of the handshake. This is by design and is also mentioned again in the draft that rpaulo mentioned.

                    The problems with this are mentioned in 9001, the mentioned draft and the article.

                    1. 1

                      Goodness, I’m reading section 7 of the RFC right now, it sounds pretty bad. The thing was devised in 2012, we knew how to make nice handshakes that leak little information and for heaven’s sake authenticate everything.

                      As a passive observer you have all the information you need to decrypt the rest of the handshake.

                      Now I’m sure it’s not that bad. I said “It’s only the public keys that are known”. You can’t be implying we can decrypt or guess the private keys as well? And as a passive observer at that? That would effectively void encryption entirely.

                1. 4

                  I love the idea of an E-Ink laptop. I only hope it has a backlight as well.

                  1. 3

                    I was really hoping that the line of research of layered displays that produced the OLPC display would give me an eInk display under an OLED. I’d love to be able to use a colour eInk layer (which draws power only on change) for everything that’s static and then OLED (which draws power only when lit) for videos on top and possibly a mouse cursor (a glowing cursor on a matte screen would be very easy to find!).

                    1. 1

                      Having a backlight in a pinch is a good idea, but if you use it regularly during standard usage then it wrecks battery life and defeats the point of having a passive display.

                      The problem isn’t the feature itself, but that people tend to assume that if a device has a feature built in, it’s meant to be used.

                      1. 5

                        I have a backlight on my Kobo reader that is permanently on, but I set it close to the minimum for reading in a pitch-black room. It also happens to help in low-light conditions. Battery life is still quite long. I’d say at least 48 hours, not counting the time it stays asleep (total time between two recharges is measured in weeks).

                        From this one data point, I guess a backlight is unlikely to be the most power hungry thing on a laptop.

                        1. 1

                          My experience (Kobo Clara HD) is the same. When I moved from an early-generation Kobo Touch with no frontlight to the Clara, I thought I’d have to turn the frontlight off during the day to get the kind of battery life I was used to. But instead, I keep it turned on, usually around 10-25% brightness, and get similar battery life (a couple weeks of normal usage). The frontlight also improves the contrast even in a well-lit room.

                        2. 1

                          I find that it is easier on my eyes so for me it makes sense and doesn’t defeat the purpose. That said I only use the backlight in the evenings, anyhow.

                      1. 3

                        I’ve recently wondered whether there is merit to the idea of taking the crypto acceleration instructions in most normal CPU’s and turning them into a dedicated co-processor, maybe also with some dedicated RAM. Not really for performance reasons, but rather for isolation; I’m imagining a single, in-order core with basically a single algorithm programmed into it and no bus, clock, or anything else shared with the main CPU. You can then prove that the co-processor always runs at a fixed rate for given input, know for a fact that nothing else is doing anything that can tamper with its performance (since it’s a single execution thread dedicated to a particular calculation, ideally one that is in-order), and it would (hopefully) be easier to control the side-channels one can observe from it. The main processor would not be able to ask the coprocessor about its clock rate, load/store latency, power usage, or anything else that could leak info about what it’s doing. Maybe you could even have multiple coprocessors to aid throughput, multiplexed by the OS; another process might be able to tell that you’re using a crypto coprocessor, but nothing else apart from “started using it” and “stopped using it”.

                        Then the rest of our programs could go off and use whatever optimizations hardware wants to implement, while the time-sensitive parts get their own dedicated sandbox.

                        1. 2

                          I think a “I’m doing crypto now” mode could be more useful (fixed clocks, fixed memory access times, fixed operation times). Might slow down some algorithms, but is guaranteed to not have any timing side-channels (and maybe no power side-channels?). Would be annoying to handle this in kernel though (what do you do if a process put a core into this mode and got preempted? Do you keep that mode on? Do you turn it off and turn it back on when putting the process back on? Do you only allow mode change in the kernel to be accessed by a syscall?)

                          1. 1

                            A lot of these assurances would be provided by executing your crypto with the help of a trusted execution environment or straight up a hardware security module.

                            1. 1

                              Tempting, but may be costly.

                              If you want something generic capable of implementing many primitives, you may require quite a bit of silicone even if the pathologically straight-line code we see in crypto does not benefit from out of order execution or even a cache hierarchy. You’ll still need sizeable ALUs, and efficient multiplication (an array of 64->128 multipliers would be awfully nice), and a form of SIMD to feed all those units.

                              If however you want something cheap yet fast, you’ll probably need to settle on some hardware friendly primitive like Keccak. Asymmetric crypto may still be a problem though, except perhaps if we use a binary field elliptic curve (about which I’ve heard security is not as settled as it is for prime field curves). The biggest problem is it’s quite inflexible.

                              Unless the world comes crashing down, I don’t see hardware vendors proposing either alternative. Except for some niches, but then we already have FPGA or even ASIC implementations in specific places.

                            1. 9

                              In a DevGAMM presentation, Jon Blow managed to convince himself and other game developers that high level languages are going to cause the “collapse of civilisation” due to a loss of some sort of “capability” to do low-level programming, and that abstraction will make people forget how to do things.

                              I believe this is a gross misrepresentation of Jon Blow’s actual argument.

                              If I got it correctly, the gist of Blow’s argument is that loss of knowledge may collapse of civilizations. He points at the collapse of the Mediterranean Bronze Age, which in all likely hood had many intertwined causes, and one contributing factor was loss of knowledge.

                              Then he notes that something like loss of knowledge seems to be going on in our field. More and more we delegate low-level efforts to ever more concentrated teams of experts, to the point where the world at large seems to be rather ignorant of what happens under the hood. And as a consequence, we get stuff like criminally slow programs like Photoshop.

                              When I say “criminally” I am not even exaggerating. Slow program waste the time of all their users, and sufficiently popular program can easily lose cumulated lifetimes. We could argue that wasting a cumulative 60 years is just as bad as accidentally killing someone.

                              In our quest for better and better programmer productivity, we forgot that performance is not a niche concern. For interactive programs, anything short of “instantaneous” is slower than ideal. And then there’s energy consumption and the cost of silicon. Making a high performance chip is incredibly polluting. There’s a good chance that slower chips could be much better for the environment (not to mention our wallets). But for slower chips to be good enough, we need our programs to speed the hell up.

                              So, are high level languages causing the collapse of our civilisation? Not quite. The problem is more that fewer and fewer people know (or even care to know) what happens underneath, and there may come a point where this become unsustainable.

                              1. 5

                                We could argue that wasting a cumulative 60 years is just as bad as accidentally killing someone.

                                The time spent waiting for an image to render in Photoshop, or for a compile to finish, is not blank nothingness ripped out of our existence. It’s a time the mind can spend to reflect, to plan, to do other tasks.

                                1. 1

                                  Good point, my 1 to 1 scale was incorrect.

                                  Still, the sheer time wasted can be fairly huge. Assuming the following.

                                  • 10 second wasted per work day.
                                  • 5 day work week, 40 weeks per year.
                                  • 1 million such users
                                  • over 10 years.

                                  That’s about 300 years worth of wasted time. About 5 lifetimes. (I think I’m being very conservative here.)

                                  So as you pointed out, those 300 years aren’t completely lost. They are partially lost. I just wonder how much is actually lost here. How many lifetimes of waiting must we cause for it to be just as bad as wasting an actual lifetime? in my opinion it’s most probably somewhere between 5 and 100. And there’s definitely an upper bound.

                                  1. 2

                                    Software (like Adobe Photoshop) exists at a local maximum of reliability - development resources - backwards compatibility - target hardware availability - profit.

                                    One could imagine a version that was very fast and let the user not wait at all, but randomly corrupted images. No-one would like that, even if it saved virtual lives.

                                    There’s a corrective to bad software: it’s called the market. So far, no-one has managed to displace Photoshop from its dominance as a PC-based image editor. Maybe someone will, and Adobe will shift their priority and local maximum to address that shortcoming.

                                    In the meantime, I’d be more worried about stuff that actually kills people, like pollution, non-optimal access to healthcare, and wars.

                                    (Edit seen today: Polluted air cuts global life expectancy by two years

                                    1. 1

                                      I agree, to a point.

                                      The market is not perfect. Because it is made up of individuals, a bit of lost time is hardly noticed. Moreover, users have basically now way of distinguishing between necessary lost time and avoidable lost time. Plus, the incentives are all wrong, especially for productivity software: a new feature hardly anybody will use can sell, because it grows the list of things the software can do. Taking too much time to boot up however annoys everyone a tiny little bit — just not enough to actually lose the sale.

                                      Yet, when you think about it, if you’re loosing cumulated decades of time across all your users, it’s kind of your moral duty to spend at least a couple weeks to fix the issue.

                                1. 4

                                  The parallel between societies and software is a great find! The big thing that I disagree with though is:

                                  and a fresh-faced team is brought in to, blessedly, design a new system from scratch. (…) you have to admit that this system works.

                                  My experience is the opposite. No customer is willing to work with a reduced feature set, and the old software has accumulated a large undocumented set of specific features. The new-from-scratch version will have to somehow reproduce all of that, all the while having to keep up with patching done to the old system that is still running as the new system is under development. In other words, the new system will never be completed.

                                  In short, we have no way to escape complexity at all. Once it’s there, it stays. The only thing we can do to keep ourselves from collapse as described in the article is avoid creating complexity in the first place. But as I think is stated correctly, that is not something most organisations are particularly good at.

                                  1. 11

                                    No customer is willing to work with a reduced feature set…

                                    Sure they are, because the price for the legacy system keeps going up. They eventually bite the bullet. That’s been my experience, anyway. The evidence is that products DO actually go away, in fact, we complain about Google doing it too much!

                                    Yes, some things stay around basically forever, but those are things that are so valuable (to someone) that someone is willing to pay dearly to keep them running. Meanwhile, the rest of the world moves on to the new systems.

                                    1. 3

                                      Absent vandals ransacking offices, perhaps this is what ‘collapse’ means in the context of software; the point where its added value can no longer fund its maintenance.

                                      1. 1

                                        Cost is one way to look at it, but it’s much harder to make this argument in situations like SaaS. The cost imposed on the customer is much more indirect than when it’s software the customer directly operates. You need to have a deprecation process that can move customers onto the supported things in a reasonable fashion. When this is done well, there is continual evaluation to reduce the bleeding from new adoption of a feature that’s going away while migration paths are considered.

                                        I think the best model for looking at this overall is the Jobs To Be Done (JTBD) framework. Like many management tools, it can actually be explained in to a software engineer on a single page rather than requiring a book, but people like to opine.

                                        You split out the jobs that customers need done which are sometimes much removed from the original intent of a feature. These can then be mapped onto a solution, or the solution can be re-envisioned. Many people don’t get to the bottom of the actual job the customer is currently doing and then they deprecate with alternatives that only partially suit task.

                                      2. 4

                                        My experience is the opposite. No customer is willing to work with a reduced feature set

                                        Not from the same vendor. But if they’re lucky enough not to be completely locked in, once the first vendor’s system is sufficiently bloated and slow and buggy, they might be willing to consider going to the competition.

                                        It’s still kind of a rewrite, but the difference this time is that one company might go under while another rises. (If the first company is big enough, they might also buy the competition…)

                                      1. 4

                                        So, the problem we’re trying to solve here is that JSON parsing is a bottleneck. Their solution is to do some cheaper pre-processing so they have less parsing to do. But then I have to ask:

                                        Why are we using a slow text format to begin with?

                                        I understand that not all past decisions can be reversed, and sometimes there’s no choice because of reasons¹. Still, before we even contemplate solving a problem, I believe we should think whether we could avoid it entirely.

                                        In this case: how about using a custom binary format instead of JSON? You may not even have to roll your own if something like MessagePack is fast enough despite being pretty generic. And if you do roll your own, chances your format could be simple enough that it wouldn’t take more effort than implementing raw filtering.


                                        [1]: Like, text is more readable and easier to debug when all your tools are text based. We started with JSON in our prototype and now our “production grade prototype” that calls itself an application is built around JSON and we’re stuck.

                                        1. 1

                                          JSON is the most common data interchange format. It’s hard to change that. Maybe if you control all of the structure, then you could, but not everyone can control how the data is presented to them, and how their data will be presented to others. Gigabyte sized JSON files usually don’t come from internal services, they will come from external ones, and similarly, such JSON files are usually produced for external systems, not internal ones. Could this be changed? Maybe, I’d say that SQLite is becoming common enough to replace many of the large sized batch datasets in use, but there is still nothing that is fairly universal for streaming data that wouldn’t create additional friction for others besides JSON.

                                          1. 3

                                            Granted, sometimes it is what it is.

                                            My experience has been different however. I’ve encountered JSON a number of times in my career, and every single time, it was used to exchange data between two subsystems we both controlled. I know that because we specified the exact shape of the JSON data we were exchanging. So JSON was not chosen because it was standard, it was chosen because it was perceived to be easy.

                                            Another thing to consider here is that JSON does not reduce friction that much: after parsing you still get fairly unstructured data: sure there are the nested objects and lists, but it still leaves much data being represented as raw strings, for which you’ll need a bespoke parser anyway. Heck, my most recent encounter involved encoding binary data in base64 just so I could send it through JSON! And on top of that there’s the schema your data conforms to. Whether you formalise it or not, the data you have will be structured in a certain way, and that too begets custom code.

                                            You’ll have custom code anyway, so why not go all the way and design a custom data format? It won’t be more than a couple hundred lines of code more than what you needed to do on top of JSON. It might even save you code in some cases (because your format is tailored to your data, you don’t have to fight it). Really, I suspect the friction you speak of is more of a mental block. Custom data format are not nearly as hard as they’re perceived to be.

                                            1. 1

                                              JSON is good because it’s mostly self-describing, and there are tools for whatever platform to use it. It’s of course isn’t as good if you control both sides, and I usually in such cases I weigh whether it is worth going to the next thing over. Creating your own custom data format isn’t as easy as you say, because often you’ll need to implement that in several different languages, and even then, making your own format more efficient than JSON isn’t that easy. You’d probably use an already existing format in most cases, but now you need to decide, which format? Now you have to weigh your options, and that is extra work that you need to do. JSON is usually good enough to offset the time investment to choose something else.

                                              1. 2

                                                JSON is good because it’s mostly self-describing

                                                Not quite. It’s textual. What makes text special is the sheer ubiquity of associated tools: editors, terminals… to the point where we came to believe that text is “human readable”, even though it’s not (we need viewers to read it, and editors to modify it).

                                                Creating your own custom data format isn’t as easy as you say, because often you’ll need to implement that in several different languages

                                                Sure, depending on what language we use. Though note that a simple binary format is easy to implement in C. Much easier than an equivalent textual format in fact. From there all you need is language bindings (another thing that’s ubiquitous is being able to talk to C).

                                                even then, making your own format more efficient than JSON isn’t that easy.

                                                Boy, you have no idea how inefficient text formats are. Here’s the crux: text formats are terminator based, while binary formats are length based (mostly). When you parse a textual format, you need to scan each character until you find the terminator. We have fancy techniques like finite state automata and LR parsing to make that faster, but the fundamental problem remains. Binary formats however tend to specify the length of their fields right there at the beginning. This lets you parallelise parsing if you ever need to, or skip fields that you are not interested in.

                                                You want to be more efficient than JSON? Start with TLV encoding: Type, Length, Value. It’s very simple, and it goes a long way.

                                                You’d probably use an already existing format in most cases

                                                Yes. Please everyone consider MessagePack.

                                                now you need to decide, which format?

                                                Like you didn’t need to decide when you chose JSON? If the choice of JSON was easy, the binary equivalent is just as easy: it’s MessagePack: like JSON, only it’s binary and leaner and faster.

                                                JSON is usually good enough to offset the time investment to choose something else.

                                                If you don’t want to think, the choice is already made: it’s MessagePack. And don’t worry about language support, your favourite language already has like 3 implementations.


                                                Okay, I’m being a liittle annoying over MessagePack, but I mean it: it’s basically JSON, only better. You just need a reader that’s not a text editor to visualise it.

                                                1. 1

                                                  Msgpack is fine, but I think CBOR has a lot going for it. I appreciate the tag system for example. It should be easier or as easy to parse than msgpack.

                                                  Sure, depending on what language we use. Though note that a simple binary format is easy to implement in C. Much easier than an equivalent textual format in fact.

                                                  Please don’t. That’s how buffer overflows happen. Use a library and an existing format, like you say; hopefully it’ll even be fuzzed and battle-tested.

                                                  Otherwise I agree, it’s interesting to see how people think that json is fast and “good enough” when really it sucks at storing floats, integers, binary blobs, etc. It’s only acceptable for lists, dictionaries, and unicode text, and even then you pay the cost of escaping/unescaping your text.

                                                  1. 2

                                                    Please don’t. That’s how buffer overflows happen. Use a library and an existing format, like you say; hopefully it’ll even be fuzzed and battle-tested.

                                                    • Yes, C is unsafe, and we should consider using something else whenever possible. However, for pure computations like parsing or cryptography, it also has the advantage of being everywhere, effectively making it extremely portable. This portability is why I still use it even though its almost always technically inferior.

                                                    • Yes, better use an existing format when available, provided it does what I need. I’ll take a look at CBOR one of those days.

                                                    • What makes binary formats easier to implement also makes them safer. There’s less room for error in general, but one crucial thing is that most of the time, you can just read a small size field, then know right away how much memory you need to allocate. Then you just loop and you’re done. Textual formats however force you to allocate before you know the size, and that is so much more dangerous.

                                                    • That said, in my experience fuzzing is the bare minimum where C code is concerned. Even my custom code is going to have property based tests, automatically generated correct and incorrect inputs and sanitizers and Valgrind and all that jazz. We’re talking about processing adversarial inputs from an unsafe language after all.

                                        1. 3

                                          I’ve noticed that if you convert an image file to a plain BMP and run that through gzip, the result will almost always be noticeably smaller than the corresponding PNG. Pretty impressive, given that PNGs also are using zlib compression internally.

                                          1. 6

                                            I suspect those (most?) PNG images simply are badly compressed. I bet running them through an optimised PNG compressor would also produce noticeably smaller results. Likely even smaller than plain gzip.

                                            1. 11

                                              I wrote a png library. One time a user emailed me saying he was amazed at how small the files were and asked what the secret was… I was perplexed because I did the bare minimum and let stock zlib do the compression itself with no special settings.

                                              Turns out the other program was adding a bunch of metainfo mine didn’t, and that metainfo made the file appear bloated.

                                              1. 3

                                                Then I wonder exactly how common “good PNG compression” actually is?

                                                Since my original observation was largely anecdotal, I downloaded the dataset provided in the original article and did a comparison. I found that a gzipped BMP file was smaller than the original PNG in over half of the cases (54/94). After applying the same compression as the author (namely oxipng --opt max --strip safe), gzipped BMPs were still smaller for nearly a quarter of the files (23/94). Admittedly, usually not by much. But the fact that it is at all competitive with an optimizer (and one that takes an order of magntiude longer to run than gzip) is pretty noteworthy.

                                                1. 2

                                                  I use oxipng for better compression.

                                                2. 2

                                                  Given that PNG is a gzip-compressed bitmap, if you’re seeing consistent savings this way there must be something terribly wrong with the PNG encoder you have. Instead of a DIY PNG-equivalent format maybe use a PNG optimizer?

                                                  1. 1

                                                    Well, I don’t use such a format, since no existing software reads gzipped BMPs natively. I just have noticed that it does pretty well compared to PNG. And my experience is that it is pretty consistent. Using a PNG optimizer will improve this in a lot of cases, but not all.

                                                  2. 1

                                                    Funny, isn’t it? I was inspired by that to create the lossless image format farbfeld which relies on external compression and keeps up quite well with PNG.

                                                  1. 2

                                                    Copied from r/programing:

                                                    A few months ago I identified two categories of code format tools: at one end of the spectrum we have rule enforcers, and at the other end we have canon enforcers.

                                                    With rule enforcers, we have a set of rules the code must adhere to, and anything that breaks those rules is incorrect, but within the confines of those rules, you can do whatever you please. For instance, assuming lines are limited to 25 characters, the following would be incorrect:

                                                    void foo(int a, int b, int c, int d)
                                                    {
                                                        int bar = a + b - c * d;
                                                    }
                                                    

                                                    On the other hand, there may be several correct ways to fix it:

                                                    void foo(int a, int b,
                                                             int c, int d)
                                                    {
                                                        int bar =
                                                            a + b - c * d;
                                                    }
                                                    
                                                    void foo(int a,
                                                             int b,
                                                             int c,
                                                             int d)
                                                    {
                                                        int bar = a + b
                                                                - c * d;
                                                    }
                                                    

                                                    With canon enforcers, there’s one way to format code. Anything different is basically incorrect, and ends up being formatted back to the One True Style.

                                                    Now some of you may say that limiting lines to 25 columns is a tad restrictive. How about raising that limit to 80? With that, the first version I showed above becomes correct. It’s a win!

                                                    Well, it depends. I can feel like 80 columns is a bit large, and really, most of the time a limit of 25 is okay. We have to have a hard limit, but it’s nice to have a soft (unenforced) lower limit as well. Besides, what if variables are related in some logical way? In such a case, the most readable code might look like this:

                                                    void foo(int a1, int a2,
                                                             int b1, int b2)
                                                    {
                                                        int bar = (a1 + a2)
                                                                - (b1 * b2);
                                                    }
                                                    

                                                    Enter the canon enforcer. With those, if I chose to set the limit at 80 column, they will prevent unneeded line breaks. But this completely breaks the spirit of a soft lower limit!! And I can kiss semantic groupings goodbye.


                                                    In a real project I’m working on, the architects wrote in the code style that we ought to observe an 80 column soft limit. But the formatting tool they gave us is configured company wide to 120 columns. And because clang-format is a canon enforcer, it means I cannot break lines that would take between 80 and 120 characters.

                                                    A similar problem occurs for function calls. Either the whole call fits in less than 120 columns, and it has to be a single line, or it does not, and the only accepted style is one argument per line. So instead of:

                                                    do_stuff_with_buffers(buffer1, buffer1_size,
                                                                          buffer2, buffer2_size,
                                                                          buffer3, buffer3_size);
                                                    

                                                    I was forced into this less readable:

                                                    do_stuff_with_buffers(buffer1,
                                                                          buffer1_size,
                                                                          buffer2,
                                                                          buffer2_size,
                                                                          buffer3, buffer3_size);
                                                    

                                                    Pretty infuriating.

                                                    Now if you don’t care about style, canon enforcers are great: they prevent your carelessness from polluting the code base too much. I however do care. and when the tool forces me into something that is clearly less readable than what I’m trying to achieve, within the confines of the official code guidelines, I die a little inside.

                                                    Having some rules is good. But don’t overdo it.

                                                    1. 5

                                                      whether you need all this heavy optimization work is an open question: most of your code runs seldom (or never), particularly in the bloatware that is produced nowadays, and the search for runtime performance has taken on a somewhat religious meaning. We try to eke out every cycle of speed, yet never worry about the huge amounts of code involved, which increase memory traffic and cache misses.

                                                      Religious or not, I feel that our search for runtime performance is failing miserably. Many of our programs are slow. They take forever to boot, eat up a crazy amount of memory, and often the only meaningful difference from what we had 20 years ago is in the eye candy. I like my eye candy, but it rarely justifies the slowdowns that come with it.

                                                      My feeling is that in general, we aren’t serious about performance. We pay lip service to it, but we rarely ascertain where we actually need it. In the worst cases, we get a false sense of having done what we could, all while stuck in a freakishly high local minimum.

                                                      This reminds me of Casey Muratori’s philosophies of optimisations:

                                                      • Optimisation: where you measure stuff and then make it leaner or faster. (Only use it when you really need it.)
                                                      • Non-pessimisation: where you simplify what the CPU has to do, and avoid unneeded work. (Should be used pretty much all the time.)
                                                      • Fake optimisation: where you rely on heuristics outside of their original domain. (Avoid. Obviously.)

                                                      I get the feeling that Forth leans heavily towards non-pessimisation.

                                                      1. 10

                                                        Them: “Leslie Lamport may not be a household name,[…]”

                                                        Me: “The hell he isn’t.” [rage close]

                                                        (I opened it back up and read it anyway. It was actually really interesting. But my rage close was real.)

                                                        1. 17

                                                          There are whole households out there that don’t have a single graduate degree in them. Amazing, I know!

                                                          That said, I didn’t actually know LL was the one behind TLA+, so it was a useful read for me too. (Also, it turns out he actually does look somewhat like the fluffy lion on the cover of the LaTeX book!)

                                                          1. 3

                                                            This is a hypertext book that I intended to be the way to learn TLA+ and PlusCal, until I realized that people don’t read anymore and I made the video course.

                                                            1. 3

                                                              Yeah, I knew he did Lamport clocks but didn’t know he was also the guy who did LaTeX and TLA.

                                                              1. 4

                                                                Inverse for me, I never made the connection between LaTeX Lamport and clock Lamport.

                                                                1. 4

                                                                  One did happen before the other.

                                                                  1. 10

                                                                    Can we really be sure about that.

                                                              2. 3

                                                                I only learned about his writing LaTeX after reading his TLA+ book. Dude doesn’t have much ego, and he only made his name better known when he started actively advocating for formal methods.

                                                              3. 4

                                                                I saw that and ran a Twitter poll

                                                                without googling, can you name one thing Leslie Lamport is known for?

                                                                • yes: 70.1%
                                                                • no: 29.9%

                                                                Not as good as I’d like, but not as bad as the article makes it sound. Granted, this is a heavily skewed demographic. I ran the same poll at work and got 0% yes.

                                                                1. 3

                                                                  And even then, people tend to know him more for LaTeX than for his much more important work on distributed computing. Which I’ve heard bothered him.

                                                                  1. 5

                                                                    I imagine that there are more people writing professional documents than working on distributed computers.

                                                                    1. 1

                                                                      On the other hand, I imagine that there are more people using distributed computers than people writing documents. Probably.

                                                                    2. 5

                                                                      I can kind of see why that bothered him. Many people viewed TeX as the real, manly typesetting system, and LaTeX was for weaklings. Lamport received the impression of a hippie simplifying Knuth - more an enthusiastic college lecturer than a “real” computer scientist.

                                                                      OTOH LaTeX made TeX usable, and facilitated the distribution of a lot of science. That has to count for something.

                                                                    3. 2

                                                                      I know him for Lamport clocks and not much else :)

                                                                      1. 1

                                                                        Heheh. I saw the Twitter poll and thought of this story. I follow you on Twitter now.

                                                                    1. 25

                                                                      Yeah yeah, mention Rust. Rust is too complicated to implement by one person.

                                                                      I’m not sure that’s a practical metric by which to judge a tool. The C compilers that provide a practical foundation for modern software development were not implemented by one person either.

                                                                      In general Turing completeness is necessary but not sufficient: it’s just one facet of what makes a language practically useful. There are many other properties that end up resulting in costs someone has to pay to use a language; e.g., is it memory safe, or will engineers and users alike be on the hook for an unknown number of egregious memory safety bugs?

                                                                      1. 12

                                                                        Also mrustc has been implemented mostly by one person.

                                                                        1. 2

                                                                          I knew this would be brought up; you know the effort they’ve had to do to achieve this? An incredible amount.

                                                                          1. 8

                                                                            It’s 100K lines of code, and majority of it was developed over a 2-3 year period (with ongoing development to catch up with evolution of Rust). The number of commits and lines of code happens to be close to TCC:

                                                                            It does take a couple of shortcuts: it’s a Rust-to-C compiler (no machine code generation) and it doesn’t perform borrow checking (the Rust language is carefully designed to make it optional. Lifetimes are purely a compile-time lint, and don’t affect generated code or its behavior).

                                                                            I think overall in terms of implementation difficulty Rust is somewhere between C and C++. Parsing of Rust is much simpler than C++, and Rust has fewer crufty language features than C++ (there’s one way to initialize a variable), but some features are big-ish (borrow checker, type inference).

                                                                            How hard it is to implement mainly depends on how good quality of implementation you want to have. For example, LLVM is 85× larger than mrustc and tcc, with over 130× more commits. It’s a 20-year collaborative effort, likely not possible to do by a single person. The main rustc project is also maximalist like that, because it isn’t merely an effort to get Rust working, but to make it fast, efficient, user-friendly, well-documented, reliable, portable, etc., so much much more work went into it beyond just the language implementation.

                                                                            1. 2

                                                                              I cannot speak for mrustc, but 100k loc for tcc is bullshit. Just counting sources and headers in the top level, I get 55k loc (the remainder is taken up by tests and win32 headers). Close to 40k is taken up by target-specific code. The core compiler is about 10k loc.

                                                                              1. 1

                                                                                openhub stats I’ve quoted are for the whole repo, and I see 57K .c and 38K .h in there. This includes tests, so it’s indeed more than just the compiler.

                                                                                1. 2

                                                                                  If I run a word count on everything in the ‘src’ directory of mrustc, I get about 130k loc. I therefore conclude that mrustc’s rust compiler is approximately 10x larger and more complex than tcc’s c compiler. Recall that tcc also includes assemblers and linkers, and supports many targets.

                                                                              2. 0

                                                                                I mean if 3 years is not a lot of effort then cheers to you! You must be an absolute coding beast.

                                                                                1. 15

                                                                                  I feel like this is a fairly disingenuous and dismissive argument - your original post stated that “Rust is too complicated to implement by one person.” The comment you were responding to was making the point that not only is there an implementation of Rust by primarily one person, but a single-contributor C implementation is a comparable size and would theoretically take a similar amount of effort to implement. People here aren’t trying say it’s not a lot of effort, but that it does exist and you may be trivializing the amount of effort needed for a C implementation.

                                                                                  1. 3

                                                                                    Sorry, I didn’t mean to dismiss anything! Isn’t the statement still true if it’s been mentioned they still got help?… Regardless the general sentiment is right. I should have said instead that it’s not reasonable!

                                                                                    I may very well be trivializing the effort for a C implementation. In my mind C’s type system, lack of borrow checker, and other features make its implementation maybe a magnitude easier. I could be completely wrong though and please elaborate if that’s the case!

                                                                                    1. 4

                                                                                      A non-optimizing C89 or C90 compiler is relatively simple to implement, with only minor inconveniences from the messy preprocessor, bitfields, parsing ambiguities of dangling else and typedef (did you know it can be scoped and nested and this affects syntax around it!?). The aren’t any things that are hard per-se, mostly just tedious and laborious, because there’s a lot of small quirks underneath the surface (e.g. arrays don’t always decay to pointers, sizeof evaluates things differently, there are rules around “sequence points”).

                                                                                      There are corners of C that most users don’t use, but compiler in theory needs to support, e.g. case doesn’t have to be at the top level of switch, but can be nested inside other arbitrary code. C can generate “irreducible” control flow, which is hard to reason about and hard to optimize. In fact, a lot of optimization is pretty hard due to aliasing, broken const, and the labyrinth of what is and isn’t UB described in the spec.

                                                                                      1. 3

                                                                                        There are corners of C that most users don’t use, but compiler in theory needs to support, e.g. case doesn’t have to be at the top level of switch, but can be nested inside other arbitrary code

                                                                                        It’s worth noting that, since you said ‘non-optimising’ these things are generally very easy in a non-optimising compiler. You can compile C more or less one statement at a time, including case statements, as long as you are able to insert labels after you insert a jump to them (which you can with most assembly languages). Similarly, sequence points matter only if you’re doing more than just evaluating expressions as you parse them.

                                                                                        The original C compiler ran on a computer that didn’t have enough memory for a full parsed AST and so the language had to support incremental code generation from a single-pass compiler.

                                                                          2. 9

                                                                            LLVM was originally just Chris Latner. I think the question isn’t “Can one person build it?” It’s “Can one person build it to the point where it has enough value for other people to work on it too?”

                                                                            1. 5

                                                                              LLVM was originally just Chris Latner

                                                                              Several of the folks in / formerly in Vikram Adve’s group at UIUC would be quite surprised to learn that.

                                                                              1. 1

                                                                                I actually looked at Wikipedia first before my comment, but that made it seems like it was Latner’s project under Adve’s mentorship. I’ll take your word for it that it was a group effort from the start.

                                                                            2. 3

                                                                              This was my first thought as well. There are a lot of very useful things that are too complicated to be implemented by one person - the current state of Linux probably falls into that category, and I know that at least I wouldn’t want to go back to even a version from 5 years ago, much less back to a version that could have been implemented by a single person.

                                                                              1. 2

                                                                                …And there are a lot of useful things that are simple enough for one person to implement! :D

                                                                                1. 3

                                                                                  Ha, I agree with that, was mostly just highlighting that I don’t feel like “too complicated to implement by one person” is a good reason to dismiss Rust’s potential usefulness.

                                                                                  For myself, I originally got frustrated with Rust not allowing me to do things; eventually, I realized that it was statically removing bad habits that I’d built in the past. Now I love when it yells at me :)

                                                                              2. 1

                                                                                [Tool] is too complicated to implement by one person.

                                                                                I’m not sure that’s a practical metric by which to judge a tool

                                                                                I am. Short term, that means the tool will cost much less: less time to make, fewer bugs, more opportunities for improvement. Long term it means other people will be able to rebuild it from scratch if they need to. At a lower cost.

                                                                                1. 3

                                                                                  The flip side of this is that the tool will do much less. A wooden hammer is a tool that a single person can make. A hammer with a steel head that can drive in nails requires a lot more infrastructure (smelting the ore and casting the head are probably large enough tasks that you’ll need multiple people before you even get to adding a wooden handle). An electric screwdriver requires many different parts made in different factories. If I want to fix two pieces of wood together than a screw driven by an electric screwdriver is both easier to use and produces a much better result than a nail driven by a wooden hammer.

                                                                                  1. 1

                                                                                    Obviously I was limiting my analysis to software tools, where the ability of a single person to make it is directly tied to its complexity.

                                                                                    One fair point you do have is how much infrastructure the tool sits upon. Something written in Forth needs almost nothing besides the hardware itself. Something written in Haskell is a very different story. Then you need to chose what pieces of infrastructure you want to depend on. For instance, when I wrote my crypto library I chose C because of it’s ubiquity. It’s also a guarantee of fairly extreme stability. There’s a good chance that the code I write now will still work several decades from now. If I wanted to maximise safety instead, I would probably have picked Rust.

                                                                                    1. 6

                                                                                      Obviously I was limiting my analysis to software tools, where the ability of a single person to make it is directly tied to its complexity.

                                                                                      My point still applies. A complex software tool allows me to do more. In the case of a programming language, a more complex compiler allows me to write fewer bugs or more features. The number of bugs in the compiler may be lower for a compiler written by a single person but I would be willing to bet that the number of bugs in the ecosystem is significantly higher.

                                                                                      The compiler and standard library are among the best places for complexity in an ecosystem because the cost is amortised across a great many users and the benefits are shared similarly. If physical tools were, like software, zero marginal cost goods, then nail guns, pillar drills, band saws, and so on would all be ubiquitous. If you tried to make the argument that you prefer a manual screwdriver to an electric one because you could build one yourself if you needed then you’d be laughed at.

                                                                                      For instance, when I wrote my crypto library I chose C because of it’s ubiquity. It’s also a guarantee of fairly extreme stability

                                                                                      It also gives you absolutely no help in writing constant-time code, whereas a language such as Low* allows you to prove constant-time properties at the source level. The low* compiler probably depends on at least a hundred person-years of engineering but I’d consider it very likely that the EverCrypt implementations of the same algorithms would be safer to use than your C versions.

                                                                                      1. 2

                                                                                        I reckon amortized cost is a strong argument. In a world where something is build once and used a gazillion times the cost analysis is very different from something that only has a couple users. Which is why by the way I have a very different outlook for Oberon and Go: the former were used in a single system, and the cost of a more powerful compiler could easily outweigh the benefits across the rest of the system; while Go set out to be used by a gazillion semi-competent programmers, and the benefit of some conspicuously absent features would be multiplied accordingly.

                                                                                        Honestly, I’m not sure where I stand. For the things I make, I like to keep it very very simple. On the other hand, If I’m being honest with myself I have little qualms sitting on a mountain of complexity, provided such foundation is solid enough.

                                                                                        Do you have a link to Low*? My search engine is failing me.

                                                                                        1. 2

                                                                                          Do you have a link to Low*? My search engine is failing me.

                                                                                          This paper is probably the best place to start

                                                                                2. 1

                                                                                  The C compilers that provide a practical foundation for modern software development were not implemented by one person either.

                                                                                  Right but there are many C compilers which were written by one person and still work. To me, that’s the important part. Thank you for your thoughts!

                                                                                  1. 2

                                                                                    Why is that important?

                                                                                    1. 1

                                                                                      It’s important because fast forward 300 years and no one uses your language anymore. It must be reasonable the future humans can write a compiler on their own if they want to run your program.

                                                                                      I’m really trying to encourage people thinking beyond their lives in the software realm lately, just as we need to do the same for the ecosystem.

                                                                                      1. 3

                                                                                        trying to build software to last 300 years seems like it would limit hardware development
                                                                                        and everyone implements C compatibility in their new hardware so that people will use it
                                                                                        if people can figure out quantum computers and computers not based on binary, they’ll probably need to figure out what the next C will be for that new architecture
                                                                                        if you want your software to last 300 years, write it in the most readable and easy-to-understand manner, and preserve it’s source so people can port it in the future

                                                                                        1. 3

                                                                                          And this is why C is not good for longevity, but languages which are more abstracted. Thank you for that! Completely agree with what you’re thinking here.

                                                                                          1. 3

                                                                                            i don’t think the biggest blockers to software longevity is language choices or even hardware, it’s the economy/politics of it… long lasting anything doesn’t fit in well with our throw-away society, and since it can’t be monetized, the capitalist society snubs it’s nose at it

                                                                                            1. 2

                                                                                              Hehe, an interesting thread of thought we could travel down here. I’ll just say I agree to a degree.

                                                                                        2. 3

                                                                                          It’s important because fast forward 300 years and no one uses your language anymore. It must be reasonable the future humans can write a compiler on their own if they want to run your program.

                                                                                          If you’re considering a person 300 years in the future then you should also consider that they will have tools 300 years more advanced than ours. 30 years ago, writing a simple game like space invaders was weeks worth of programming, now it’s something that you can do in an afternoon, with significantly better graphics. In the same time, parser generators have improved hugely, reusable back ends are common, and so on. In 300 years, it seems entirely feasible that you’d be able to generate a formal specification for a language from a spoken description and synthesise an optimising compiler directly from the operational semantics.

                                                                                          1. 1

                                                                                            You’re right, I haven’t considered this! I don’t know what to say immediately other than I think this is very important to think about. I’d like to see what others have to comment on this aspect too…!

                                                                                            1. 1

                                                                                              you should also consider that they will have tools 300 years more advanced than ours.

                                                                                              Unless there has been a collapse in between. With climate change and peak oil, we have some serious trouble ahead of us.

                                                                                              1. 5

                                                                                                In which case, implementing the compiler is one of the easiest parts of the problem. I could build a simple mechanical computer that could execute one instruction every few seconds out of the kind of materials that a society with a Victorian level of technology could produce, but that society existed only because coal was readily accessible. I’ve seen one assessment that said that if the Victorians had needed to use wood instead of coal to power their technology they’d have completely deforested Britain in a year. You can smelt metals with charcoal, but the total cost is significantly higher than with coal (ignoring all of the climate-related externalities).

                                                                                                Going from there to a transistor is pretty hard. A thermionic valve is easier, but it requires a lot of glass blowing (which, in turn, requires an energy-dense fuel source such as coal to reach the right temperatures) and the rest of a ‘50s-era computer required fairly pure copper, which has similar requirements. Maybe a post-collapse civilisation would be lucky here because there’s likely to be fairly pure copper lying around in various places.

                                                                                                Doping silicon to produce integrated circuits requires a lot of chemical infrastructure. Once you can do that, the step up to something on the complexity of a 4004 is pretty easy but getting lithography to the point where you can produce an IC powerful enough to run even a fairly simple C program is nontrivial. Remember that C has a separate preprocessor, compiler (which traditionally had a separate assembler), and linker because it was designed for computers that couldn’t fit more than one of those in RAM at a time. Even those computers were the result of many billions of dollars of investment from a society that already had mass production, mass mining, and large-scale chemistry infrastructure.

                                                                                                C code today tends to assume megabytes of RAM, at a minimum. Magnetic core storage could do something like 1 KiB in something the size of a wardrobe. Scaling up production to the point where 1 MiB is readily available requires ICs, so any non-trivial C program is going to have a dependency on at least ’80s-era computing hardware.

                                                                                                TL;DR: If a society has collapsed and recovered to the point where it’s rediscovering computers, writing a compiler for a fairly complex language is going to be very low cost in comparison to building the hardware that the compiler can target.

                                                                                                1. 1

                                                                                                  Well, I wasn’t anticipating such a hard collapse. I was imagining a situation where salvage is still a thing, or where technology doesn’t regress that far. Still, you’re making a good point.

                                                                                                  1. 4

                                                                                                    That’s an interesting middle ground. It’s hard for me to imagine a scenario in which computers are salvageable but storage is all lost to the point where a working compiler is impossible to find. At the moment, flash loses its ability to hold charge if not powered for a few years but spinning rust is still fine, as is magnetic tape, for a much longer period, so you’d need something else to be responsible for destroying them. Cheap optical storage degrades quite quickly but there are archive-quality disks that are rated for decades. If anything, processors and storage are more fragile.

                                                                                                    In the event of a collapse of society, I think it’s a lot more likely that copies of V8 would survive longer than any computer capable of running them. The implicit assumption in the idea that the compiler would be a bottleneck recovering from a collapse of society is that information is more easily destroyed than physical artefacts. This ignore the fact that information is infinitely copyable, whereas the physical artefacts in question are incredibly complex and have very tight manufacturing tolerances.

                                                                                                    Of course, this is assuming known threats. It’s possible that someone might create a worm that understands a sufficiently broad range of vulnerabilities that it propagates into all computers and erases all online data. If it also propagates into the control systems for data warehouses then it may successfully destroy a large proportion of backups. Maybe this could be combined with a mutated bacterium that ate something in optical disks and prevented recovering from backup DVDs or whatever. Possibly offline storage will completely go out of fashion and we’ll end up with all storage being some form of RAM that is susceptible to EMP and have all data erased by a solar flare.

                                                                                                    1. 1

                                                                                                      It really depends on what we can salvage, and what chips can withstand salvage operations. In a world where we stop manufacturing computers (or at least high-end chips), I’d expect chips to fail over the years, and the most complex ones will likely go first. And those that don’t will be harder to salvage for various reasons: how thin their connection pins are, ball arrays, multi-layer boards requirements, and the stupidly fast rise times that are sure to cause cross-talk and EMI problems with the hand made boards of a somewhat collapsed future.

                                                                                                      In the end, many of us may be stuck with fairly low-end micro controllers and very limited static memory chips (forget about controlling DRAM, it’s impossible to do even now without a whole company behind you). In that environment, physical salvage is not that horrible, but we’d have lost enough computing power that we’ll need custom software for it. Systems that optimise for simplicity, like Oberon, might be much more survivable in this environment.

                                                                                                      C code today tends to assume megabytes of RAM, at a minimum.

                                                                                                      In this hypothetical future, that is relevant indeed. Also, I believe you. But then the first serious project I wrote in C, Monocypher, requires only a couple KB of stack memory (no heap allocation) for everything save password hashing. The compiled code itself fits requires less than 40KB of memory. Thing is, I optimised it for simplicity and speed, not for memory usage (well, I did curb memory use a bit when I’ve heard I had embedded users).

                                                                                                      I suspect that when we optimise for simplicity, we also tend to use less resources as a side effect.


                                                                                                      Now sure, those simple systems will take no time to rebuild from scratch… if we have the skills. In our world of bigger and faster computers with a tower of abstraction taller than the Everest, I feel most of us simply don’t have those skills.

                                                                                                      1. 4

                                                                                                        Now sure, those simple systems will take no time to rebuild from scratch… if we have the skills. In our world of bigger and faster computers with a tower of abstraction taller than the Everest, I feel most of us simply don’t have those skills.

                                                                                                        While it’s an interesting thought exercise, but I think this really is the key point. The effort in salvaging a working compiler to be able to run some tuned C code in a post-apocalyptic future may be significantly higher than just rewriting it in assembly for whatever system you were able to salvage (and, if you can’t salvage an assembler, you can even assemble it by hand after writing it out on some paper. Assuming cheap paper survives - it was very expensive until a couple of hundred years ago).

                                                                                                        Most of us probably don’t have the skills to reproduce the massive towers of abstraction that we use today from scratch but my experience teaching children and young adults to program suggests that learning to write simple assembly routines is a skill that a large proportion of the population could pick up fairly easily if necessary. If anything, it’s easier to teach people to write assembly for microcontrollers than JavaScript for the web because they can easily build a mostly correct mental model of how everything works in the microcontroller.

                                                                                                        Perhaps more importantly, it’s unlikely that any software that you write now will solve an actual need for a subsistence level post-apocalyptic community. They’re likely to want computers for automating things like irrigation systems or monitoring intrusion sensors. Monocypher is a crypto library that implements cryptosystems that assume an adversary who had access to thousands of dedicated ASICs trying to crack your communications. A realistic adversary in this scenario would struggle to crack a five-wheel Enigma code and that would be something that you could implement in assembly in a few hours and then send the resulting messages in Morse code with an AM radio.

                                                                                                        1. 1

                                                                                                          Most of us probably don’t have the skills to reproduce the massive towers of abstraction that we use today from scratch but my experience teaching children and young adults to program suggests that learning to write simple assembly routines is a skill that a large proportion of the population could pick up fairly easily if necessary.

                                                                                                          I feel a little silly for not having thought of that. Feels obvious in retrospect. If people who have never programmed can play Human Resource Machine, they can probably learn enough assembly to be useful.

                                                                                                          Perhaps more importantly, it’s unlikely that any software that you write now will solve an actual need for a subsistence level post-apocalyptic community.

                                                                                                          Yeah, I have to agree there.

                                                                                            2. 2

                                                                                              Today’s humans were able to create Rust, so I don’t see why future humans wouldn’t. Future humans will probably just ask GPT-3000 to generate the compiler for them.

                                                                                              If you’re thinking about some post-apocalyptic scenario with a lone survivor rebuilding the civilisation, then our computing is already way beyond that. In the 1960’s you were able to hand-stitch RAM, but to even hold source code of modern software, let alone compile and run it, you need more technology than a single person can figure out.

                                                                                              C may be your point of reference, because it’s simple by contemporary standards, but it wasn’t a simple language back when the hardware was possible to comprehend by a single person. K&R C and single-pass C compilers for PDP-11 are unusable for any contemporary C programs, and C is too complex and bloated for 8-bit era computers.

                                                                                              1. 1

                                                                                                If GPT can do that for us then hey, I will gladly gladly welcome it. I’m not thinking about a post-apocalyptic scenario but I can see the relationship to it.

                                                                                              2. 2

                                                                                                But why one person? I think we’ll still write software in teams in 2322, if we write software at all by that point instead of flying spaceships and/or farming turnips in radioactive wastelands. The software was written by teams today, and I think, if it needs to be rewritten, it will be rewritten by teams in the future.

                                                                                                1. 1

                                                                                                  I would also be careful about timespans here. computers haven’t been around for a century yet, so who knows what things will be like 100 years from now? I don’t even know if it’s possible to emulate an ENIAC and run old punch card code on modern hardware, that’s the sort of change we’ve seen in just 75y. maybe multicore x86 machines running windows/*nix/BSD will seem similarly arcane 300y from now.

                                                                                              3. 1

                                                                                                Wouldn’t a published standard be more important to future programmers? Go might be a wonderful language, but is there a standards document I can read from which an implementation can be written from?

                                                                                            1. 8

                                                                                              I suppose it depends on the company, time, and luck, and “YMMV” as always. However, my experience working in staff roles was quite miserable, and many of my friends had the same experience.

                                                                                              Your manager may report to the COO (or the CEO in smaller companies), but it may not mean anything for either of you. If executives see you as a cost center that steals money from the real business, you will have to fight tooth and nail to keep your department funded. You may not even win: at quite a few places I’ve seen, such internal departments were staffed mainly by inexperienced people who would leave for a better job as soon as they could find one. But when disaster happens, you will be blamed for everything.

                                                                                              I’m pretty sure there are companies that don’t mistreat their staff IT personnel, but no assumption is universal.

                                                                                              1. 9

                                                                                                IME: the harder it is for execs to see that “person/group X does their job which directly leads to profit” the more of an uphill battle it is. Even a single hop can have a big effect: note the salary differences between skilled sales people and skilled engineers.

                                                                                                1. 5

                                                                                                  Can confirm. This is particularly challenging for “developer experience” or “productivity” teams, where all of the work is definitionally only an indirect contribution to the bottom line—even if an incredibly important and valuable one.

                                                                                                  1. 2

                                                                                                    Gotta be able to sell everything you do. It’s hard when metrics are immaterial but in those specific areas, you have to be showing “oh, I save business line X this many person-hours daily/weekly/etc.” constantly in order to advance

                                                                                                    1. 5

                                                                                                      As an idea that sounds good, but in practice no one knows how to even estimate that in a lot of categories of important tech investment for teams like mine. I have spent a non-trivial amount of time with both academic and industrial literature on the subject, and… yeah, nobody blows how to measure or even guesstimate this stuff in a way that I could remotely sign my name to in good conscience.

                                                                                                  2. 1

                                                                                                    note the salary differences between skilled sales people and skilled engineers.

                                                                                                    The latter usually have a higher salary or total compensation so I’m not sure if I understood your point. Maybe sales make more in down-market areas of the industry that don’t pay more than $100k for programmers if they can help it?

                                                                                                    1. 5

                                                                                                      $100k for programmers exists in the companies that have effectively scaled up their sales pipeline. Most programmers work on some kind of B2B software (like the example in the article, internal billing for an electricity company), where customers don’t number in the millions, engineer salaries have five digits, and trust me, their package can’t touch the compensation of the skilled sales person who manages the expectations of a few very important customers.

                                                                                                      1. 3

                                                                                                        I can confirm that I have never worked for companies where the sales people were paid less than the engineers. At least not to my knowledge.

                                                                                                        In fact, in most companies I worked for, the person I reported to had a sales role.

                                                                                                        1. 2

                                                                                                          I think a good discriminant for this might be software-as-plumbing vs. software-is-the-product. I suspect SaaS has driven down the costs a lot of glue type stuff like this.

                                                                                                    2. 5

                                                                                                      I’ve had exactly the opposite experience. Being in staff roles has been the most enjoyable because we could work on things that had longer term payoffs. When I’ve been a line engineer we weren’t allowed to discuss anything unless it would increase revenue that quarter. The staff roles paid slightly less but not too much less.

                                                                                                      1. 2

                                                                                                        I had a similar experience. I worked on a devops team at a small startup, and we did such a good job that when covid hit and cuts needed to be made, our department was first on the chopping block. I landed on my feet just fine, finding a job that paid 75% more (and have since received a promotion and a couple of substantial raises), but I was surprised to learn that management may keep a floundering product/dev org over an excellent supporting department (even though our department could’ve transitioned to dev and done a much better job).

                                                                                                      1. 5

                                                                                                        Yes it matters.

                                                                                                        At least with C++ developers can slowly learn the more arcane part of the language language while they use it. A bit more difficult with Rust.

                                                                                                        Furthermore, it might be possible to implement some form of borrow checking for existing languages.

                                                                                                        Any language should be easy to learn. That’s true for C and python. Language popularity is highly correlated with ease of learning. And this is true for all the new languages out there that try to do fancy things: most developers do not care.

                                                                                                        Personally, all I would ever want, is something mostly like C/C++, with pythonic features, easier to read and use, faster to compile, without a GC, statically compiled, without sophisticated things.

                                                                                                        1. 16

                                                                                                          I wouldn’t call C easy to learn. It probably has less essential complexity than Rust has, but there’s still a lot of fiddly details to learn that wouldn’t come up in languages created decades later with garbage collection and better tooling and syntactic defaults.

                                                                                                          1. 8

                                                                                                            A couple issues I found when wanting to learn C is all of the variation because of its history. What tooling should I use? Which conventions should I follow? Which version is the current version?

                                                                                                            The various C standards are not conveniently discoverable and even when you map them out, they’re written in reference to past standards. So to get the set of rules you have to mentally diff K&R C with a handful of other standards published over 40 years, etc. Starting completely from no knowledge and trying to figure out “What are the complete set of rules for the most modern version of C?” is nearly impossible. At least that has been my experience and biggest struggle when trying to get started with C multiple times over the years.

                                                                                                            Then I constantly see veteran C programmers arguing with each other about correct form, there seems to be far less consensus than with modern languages.

                                                                                                            1. 4

                                                                                                              I’d say C is easy to learn but hard to master. But that could be said about a lot of languages.

                                                                                                              1. 2

                                                                                                                I think there is a big difference to what is the absolute minimum you can learn.

                                                                                                                You can “learn” C with programs that compile and run the happy path mostly correctly. The probably have tons of bugs and security issues but you are using the language.

                                                                                                                Rust forces you to handle these issues up front. This does make the minimal learning longer but the total learning to be a “production ready coder” is probably actually shorter.

                                                                                                              2. 14

                                                                                                                Man, I was terrified when I was learning C++. I would stick to the parts I was “comfortable” with, but when I would call someone else’s code (or a library) I couldn’t reliably know how the features they used would intersect with mine. And the consequences very often were debugging core dumps for hours. I’m no Rust fanboy, but if you’re going to have a language as complicated as Rust or C++, I’d rather learn with one that slaps my hand when doing something I probably oughtn’t do.

                                                                                                                1. 11

                                                                                                                  So, Nim once ARC lands?

                                                                                                                  1. 3

                                                                                                                    Is that not the case already ? I unfortunately do not use Nim often these days so I might be out of touch, but if I recall correctly arc/orc are available but not the default.

                                                                                                                    EDIT: Yeah, It seems to use ref counting by default currently but the doc advise to use orc for newly written code Cf: https://nim-lang.github.io/Nim/mm.html

                                                                                                                  2. 8

                                                                                                                    Any language should be easy to learn. That’s true for C and python. Language popularity is highly correlated with ease of learning.

                                                                                                                    All other things being equal, yes, ease of learning is good. But at some point one may have to sacrifice ease of learning to make the expert’s work easier or more reliable or faster or leaner. Sometimes that’s what has to be done to reach the level of quality we require.

                                                                                                                    If it means some programmers can’t use it, so be it. It’s okay to keep the incompetents out.

                                                                                                                    1. 8

                                                                                                                      I was mostly with you, but “incompetents” is harsh.

                                                                                                                      1. 4

                                                                                                                        Can we at least agree that there is such a thing as incompetent programmers? I’m all for inclusivity, but at some point the job has to get done. Also, people can learn. It’s not always easy, but it’s rarely impossible.

                                                                                                                        1. 4

                                                                                                                          There are, but generally they’re not going to be successful whether they use Rust or another language. There are inexperienced developers who aren’t incompetent but just haven’t learned yet who will have an easier time learning some languages than Rust, and there are also experienced programmers who simply don’t know Rust who will also have an easier time learning other languages than Rust. Since the incompetent programmers will fail with or without Rust, it seemed like you were referring to the other groups as incompetent.

                                                                                                                          1. 5

                                                                                                                            Ah, the permanent connotation of “incompetent” eluded me. I was including people who are not competent yet. You only want to keep them out until they become competent.

                                                                                                                            My original point was the hypothesis that sometimes, being expert friendly means being beginner hostile to some extent. While it is possible (and desirable) to lower the learning curve as much as we reasonably can, it’s rarely possible to flatten it down to zero, and in some cases, it just has to be steep.

                                                                                                                            Take oscilloscopes for instance. The ones I was exposed to in high school were very simple. But the modern stuff I see now is just pouring buttons all over the place like a freaking airliner! That makes them much scarier to me, who have very little skill in electronics. But I also suspect all these buttons are actually valuable to experts, who may have lots of ways to test a wide variety of circuits. And those button give them a more direct access to all that goodness.

                                                                                                                            In the end, the question is, are steep learning curve worth it? I believe that in some cases, they are.

                                                                                                                            1. 3

                                                                                                                              That makes sense. Thanks for clarifying. I don’t know if I have a strong opinion, but I do believe that there are cases that require extreme performance and that often requires expertise. Moreover, having been a C++ programmer for a time, I’m grateful that where C++ would accept a broken program, Rust slaps my hand.

                                                                                                                      2. 2

                                                                                                                        True. Ada Programming language easy to learn but not widely accepted or used

                                                                                                                      3. 4

                                                                                                                        At least with C++ developers can slowly learn the more arcane part of the language language while they use it. A bit more difficult with Rust.

                                                                                                                        I’m not sure what parts of Rust you consider “arcane”. The tough parts to learn, borrow checking and lifetimes, aren’t “arcane” parts of Rust; they are basically it’s raison d’être.

                                                                                                                        Any language should be easy to learn.

                                                                                                                        Ideally languages would be as simple/easy as they can be to meet their goals. But a language might be the easiest-to-learn expression of a particular set of goals and still be tough to learn – it depends on the goals. Some goals might have a lot of inherent complexity.

                                                                                                                        1. 3

                                                                                                                          If carefully aware of escape analyses as a programmer, you might realize that with Go, and, while programming Go for several years, I’m by no means a Go fanboy.

                                                                                                                          In particular, I conjecture that you could write a program in Go that does not use GC, unless the standard library functions you use themselves use GC.

                                                                                                                          I need to learn Rust, i realize, having written that prior sentence, and having been originally a C fan.

                                                                                                                          1. 6

                                                                                                                            Personally, I would strongly recommend the O’Reilly “Programming Rust, 2nd ed.” For me it was a breakthrough that finally allowed me to write Rust and not get stuck. It may not be “perfectly shiny” what I write, but before that, I often stumbled into some situations I just couldn’t get out of. Now I understand enough to be able to at least find some workaround - ugly or not, but it lets me go on writing.

                                                                                                                            Also, coming from Go (with a history of C++ long ago beforehand), one thing I had to get over and understand “philosophically” was the apparent lack of simplicity in Rust. For this, my “a ha” moment was realizing, that the two languages make different choices in a priorities triangle of: simplicity vs. performance vs. security. Go does value all 3, but chooses simplicity as the highest among them (thus GC, nulls, etc; but super approachable lang spec and stdlib APIs and docs). Rust does value all 3 too, but chooses performance AND security as the highest. Thus simplicity necessarily is just forced to the back-seat, with a “sorry, man; yes, we do care about you, but now just please stay there for a sec and let us carry out the quarrel we’re having here; we’ll come back to you soon and really try to look into what you’d like us to hear.” And notably the “AND” here is IMO a rather amazing feat, where before I’d assume it just has to often be an “or”. Also this theory rather nicely explains to me the sparking and heated arguments around the use of unsafe in the community - it would appear to happen around the lines where the “AND” is, or looks like it is, kind of stretching/cracking.

                                                                                                                          2. 2

                                                                                                                            Personally, all I would ever want, is something mostly like C/C++, with pythonic features, easier to read and use, faster to compile, without a GC, statically compiled, without sophisticated things.

                                                                                                                            I think you’re looking for Myddin (still WIP) or possibly Hare. Whether they’re “pythonic” is debatable though.

                                                                                                                            1. 1

                                                                                                                              Any language should be easy to learn.

                                                                                                                              Not only easy to run. Easy. Because why would one make it difficult if it clearly can be made easy? The whole point of programming languages is providing simoler alternatives to the targets of their compilers.

                                                                                                                            1. 8

                                                                                                                              C does not provide maps, and when I really need one I can implement it in less than 200 lines (won’t be generic, though). Were I to design a language like Hare or Zig, providing an actual (hash) map implementation would be way down my list of priorities. Even if it belongs in the standard library, my first order of business would be to make sure we can implement that kind of things.

                                                                                                                              In fact, Go made a mistake when it provided maps directly without providing general purpose generics. That alone hinted at a severe lack of orthogonality. If maps have to be part of the core language, that means users can’t write one themselves. Which means they probably can’t write many other useful data structures. As Go authors originally did, you could fail to see that if the most common ones (arrays, hash tables…) are already part of the core language.

                                                                                                                              The most important question is not whether your language has maps. It’s whether we can add maps that really matters. Because if we can’t, there’s almost certainly a much more serious root cause, such as the lack of generics.

                                                                                                                              1. 11

                                                                                                                                I think this is too a one-sided debate. Generics have benefits and drawbacks (to argument from authority, see https://nitter.net/graydon_pub/status/1036279571341967360).

                                                                                                                                Go’s original approach of providing just three fundamental generic data structures (vec, map, and chan) definitely was a worthwhile experiment in language design, and I have the feeling that it almost worked.

                                                                                                                                1. 10

                                                                                                                                  At this point I’d argue that the benefits of even the simplest version of generics (not bounded, template-style or ML-functor-style, whatever) are so huge compared to the downsides, that it’s just poor design to create a new statically typed language without them. It’s almost like creating a language without function calls.

                                                                                                                                  Go finally fixed that — which doesn’t fix all the other design issues like zero values or lack of sum types — but their initial set of baked-in generic structures was necessary to make the language not unbearable to use. If they hadn’t baked these in, who would use Go at all?

                                                                                                                                  1. 3

                                                                                                                                    other design issues like zero values

                                                                                                                                    Could you share more here? I agree about Go generics, but its zero values are one thing I miss when using other imperative languages. They’re less helpful in functional languages, but I even miss zero values when using OCaml in an imperative style.

                                                                                                                                    1. 5

                                                                                                                                      Zero values are:

                                                                                                                                      • not always something that makes sense (what’s the 0 value for a file descriptor? an invalid file descriptor, is what. For a mutex? same thing.) The criticism in a recent fasterthanlime article points this out well: Go makes up some weird rules about nil channels because it has to, instead of just… preventing channels from being nil ever.
                                                                                                                                      • error prone: you add a field to a struct type, and suddenly you need to remember to update all the places you create this struct
                                                                                                                                      • encouraging bad programming by not forcing definition to go with declaration. This is particularly true in OCaml, say: there are no 0 values, so you always have to initialize your variables. Good imperative languages might allow var x = undefined; (or something like that) but should still warn you if a path tries to read before writing to the field.
                                                                                                                                      1. 3

                                                                                                                                        nitpick: Go’s sync.Mutex has a perfectly valid and actually useful zero value: an unlocked mutex.

                                                                                                                                        That said, I broadly agree with you; some types simply do not have a good default, and the best solution is not to fudge it and require explicit initialization.

@mndrix, note that there is a middle ground that gives you the best of both worlds: Haskell and Rust both have a Default type class/trait that can be defined for types for which it makes sense. Then you can just write

(in Haskell, where def comes from the data-default package):

-- import Data.Default (def)
let foo = def
 in ...
                                                                                                                                        

(or Rust, where the concrete type must be known to the compiler):

let foo: Foo = Default::default(); // or, equivalently, Foo::default()
                                                                                                                                        

                                                                                                                                        Note you can even write this in Go, it just applies to more types than it should:

func Zero[T any]() T {
    var ret T // every Go type has a zero value, so this compiles for any T
    return ret
}

// Use, with a concrete type:
foo := Zero[int]()
                                                                                                                                        

                                                                                                                                        You could well define some mechanism for restricting this to certain types, rather than just any. Unfortunately, it’s hard for me to see how you could retrofit this.
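For illustration, here is a hedged sketch of what such a restriction could look like (Defaulter, New, and Port are hypothetical, not standard library): a self-referential constraint makes zero-construction opt-in, but built-in types and types you don't own can never opt in, which is part of why retrofitting looks hard.

// Only types that provide a Default() T method satisfy this constraint.
type Defaulter[T any] interface {
    Default() T
}

func New[T Defaulter[T]]() T {
    var zero T
    return zero.Default() // zero only exists so we can reach the method
}

type Port struct{ N int }

func (Port) Default() Port { return Port{N: 8080} }

// Use:
p := New[Port]() // Port{N: 8080}
// New[int]()    // compile error: int has no Default method
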

                                                                                                                                        1. 1

                                                                                                                                          Thank you for the correction!

                                                                                                                                        2. 3

                                                                                                                                          not always something that makes sense (what’s the 0 value for a file descriptor? an invalid file descriptor, is what. For a mutex? same thing.)

Partially agreed on a mutex (though on at least some platforms, a 0 value for a pthread mutex is an uninitialised, unlocked mutex, and will be lazily initialised on the first lock operation). If you bias your fd numbers by one, then a 0 value corresponds to -1, which is always invalid and is a useful placeholder. But your example highlights something very important: the not-present value may be defined externally.
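The biasing trick is easy to sketch in Go (the FD wrapper below is hypothetical, just to show the encoding): store fd+1 internally, so the zero value decodes to the conventionally invalid -1.

// The zero value of FD means "no descriptor".
type FD struct{ raw int } // invariant: raw == real fd + 1, or 0 for "none"

func FromRaw(fd int) FD  { return FD{raw: fd + 1} }
func (f FD) Valid() bool { return f.raw != 0 }
func (f FD) Raw() int    { return f.raw - 1 } // the zero value yields -1
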

I saw a vulnerability last year that was a direct result of zero initialisation of a UID field. A zero value on *NIX means root. If you hit the code path that accidentally skipped initialising the field properly, then the untrusted thing would run as root. Similarly, on most *NIX systems (all that I know of, though POSIX doesn't actually mandate this), fd 0 is stdin, which is (as you point out) a terrible default.

                                                                                                                                          Any time you’re dealing with an externally defined interface, there’s a chance that either there is no placeholder value or there is a placeholder value and it isn’t 0.

                                                                                                                                          1. 2

                                                                                                                                            not always something that makes sense

Agreed. However, my experience is that zero values are sensible for roughly 90% of types, and Go's designers made the right Huffman-coding decision here: make the common case the cheap one.

                                                                                                                                            The criticism in a recent fasterthanlime article points this out well: Go makes up some weird rules about nil channels because it has to, instead of just… preventing channels from being nil ever.

                                                                                                                                            For anyone who comes along later, I think this is the relevant fasterthanlime article. Anyway, the behavior of nil and closed channels is well-grounded in the semantics of select with message passing, and quite powerful in practice. For me, this argument ends up favoring zero values, for channels at least.
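For readers who haven't seen the idiom, a minimal sketch (the merge helper is hypothetical): a nil channel blocks forever, so nil-ing a drained channel disables its case in the select.

func merge(a, b <-chan int, out chan<- int) {
    for a != nil || b != nil {
        select {
        case v, ok := <-a:
            if !ok {
                a = nil // closed: this case now blocks forever, i.e. is disabled
                continue
            }
            out <- v
        case v, ok := <-b:
            if !ok {
                b = nil
                continue
            }
            out <- v
        }
    }
    close(out)
}
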

                                                                                                                                            you add a field to a struct type, and suddenly you need to remember to update all the places you create this struct

                                                                                                                                            My experience has been that they’d all be foo: 0 anyway. Although in practice I rarely use struct literals outside of a constructor function in Go (same with records in OCaml) because I inevitably want to enforce invariants and centralize how my values are created. In both languages, I only have to change one place after adding a field.
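Something like this hypothetical constructor, say; a newly added field gets one considered value here instead of a zero value at every literal:

type server struct {
    addr    string
    retries int
    timeout time.Duration
}

// newServer is the single place that knows every field and its invariants.
func newServer(addr string) *server {
    return &server{addr: addr, retries: 3, timeout: 10 * time.Second}
}
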

                                                                                                                                            by not forcing definition to go with declaration

The definition is there, but it's implicit. I guess I don't see much gained by having a repetitive = 0 on each declaration, like I often encounter in C.

                                                                                                                                            1. 1

                                                                                                                                              what’s the 0 value for a file descriptor?

                                                                                                                                              standard input

                                                                                                                                              1. 2

                                                                                                                                                I hope you’re not serious. I mean, sure, but it makes absolutely no sense whatsoever that leaving a variable uninitialized just means “use stdin” (if it’s still open).

                                                                                                                                                1. 1

                                                                                                                                                  File descriptor 0 is standard input on unix systems. (Unless you close it and it gets reused, of course, leading to fun bugs when code expects it to be standard input.)

                                                                                                                                                  1. 1

                                                                                                                                                    As ludicrous as it would be, it would be a natural default value to have for careless language implementers, and before you know it users come to expect it. Even in C, static variables are all zero initialised and using one on read(2) would indeed read from standard input.
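For what it's worth, the analogous Go footgun also compiles and runs (assuming a *NIX system; syscall.Read wraps read(2)):

var fd int // never initialised: the zero value 0 is stdin on *NIX
buf := make([]byte, 32)
n, err := syscall.Read(fd, buf) // quietly reads from standard input
fmt.Println(n, err)             // import "fmt" and "syscall"
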

                                                                                                                                                    I’m sure we can point out various language quirks or weird idioms that started out that way.

                                                                                                                                                2. 1

                                                                                                                                                  A zero-value file descriptor is invalid, sure, but a zero-value mutex is just an unlocked mutex. Why would that be invalid?

                                                                                                                                                3. 1

                                                                                                                                                  There is no “zero” postal code, telephone number or user input.

                                                                                                                                              2. 7

Thing is, Go is not only statically typed, it is also garbage collected.

As such, it is quite natural for it to use heap allocation for (almost) everything, and compensate for this with a generational GC. Now things get a little more complicated if they want to support natively sized integers (OCaml uses 31/62-bit integers to keep one bit to distinguish them from pointers, so the GC isn't confused), but the crux of the issue is that when you do it this way, generics become dead simple: everything is a pointer, and that's it. The size of objects is often just irrelevant. It may sometimes be a problem when you want to copy mutable values (so one might want an implicit size field), but for mere access, since everything is a pointer, size does not affect the layout of your containing objects.

This is quite different from C++ and Rust, whose manual memory management and performance goals kinda force them to favour the stack and avoid pointers. So any kind of generic mechanism there will have to take into account the fact that every type might have a different size, forcing them towards specialisation-based templates, which may be more complex to implement (especially if they want to be clever and specialise by size instead of by type).

What’s clear to me is that Go’s designers didn’t read Pierce’s Types and Programming Languages, and the resulting ignorance caused them to fool themselves into thinking generics were complicated. No they aren’t. It would have taken a couple additional weeks to implement them at most, and that would have saved time elsewhere (for instance, they wouldn’t have needed to make maps a built-in type; maps could have been pushed out to the standard library).

I have personally implemented a small scripting language to pilot a test environment. It also had to handle all C numeric types, because the things it tested were low level. I went for static typing for better error reporting, local type inference to make things easier on the user, and added an x.f() syntax with a simple type-based static dispatch on the first argument to get an OO feel. I quickly realised that some of the functions I needed required generics, so I added generics. It wasn’t perfect, but it took me about a week. I know that generics are simple.

The reason the debate there is so one-sided is that Go should have had generics from the start. The benefits are enormous, the drawbacks very few. It’s not more complex for users who don’t use generics, generic data structures can still be used as if they were built in, it hardly complicates the implementation, and it improves orthogonality across the board.

                                                                                                                                                “LoL no generics” was the correct way to react, really.

                                                                                                                                                1. 8

                                                                                                                                                  It still seems to me that you are overconfident in this position. That’s fanboyism from my side, but Graydon certainly read TAPL, and if Graydon says “there’s a tradeoff between expressiveness and cognitive load” in the context of Go’s generics, it does seem likely that there’s some kind of tradeoff there. Which still might mean “LoL no generics” is the truth, but not via a one-sided debate.

                                                                                                                                                  Having covered meta issues, let me respond to specific points, which are all reasonable, but also are debatable :)

First, I don’t think the GC/no-GC line of argument holds for Go, at least in a simple form. Go deliberately distinguishes value types and pointer types (up to having a dedicated syntax for pointers), so the “generics are easy ‘cause everything is a pointer” argument doesn’t work. You might have said that in Go everything should have been a pointer, but that’s a more complex argument (especially in the light of Java trying to move away from that).

Second, “It’s not more complex for users who don’t use generics” – this I think is just in general an invalid line of argumentation. It holds in specific contexts: when you own the transitive closure of the code you are working with (handmade-style projects, or working on specific things at the base of the stack, like crypto libraries, alone or in a very small and tightly-knit team). For “industrial” projects (and that’s the niche for Go), users simply don’t have the luxury of ignoring parts of the language. If you work on an average codebase with >10 programmers and >100k lines of code, the codebase will use everything which is accepted by the compiler without warnings.

Third, I personally am not aware of languages which solve the generics problem in a low-cognitive-load way. A survey:

C++ is obviously pretty bad in terms of complexity – instantiation-time errors, until recently a separate “weird machine” for compile-time computations, etc.

Rust – it does solve inscrutable instantiation-time errors, but at the cost of a far more complex system, which is incomplete (still waiting for GATs), doesn’t compose with other language features (async traits, const traits), and still includes a “weird machine” for compile-time evaluation.

Zig. Zig is exciting – it fully and satisfactorily solves the “weird machine” problem by using the same language for compile-time (including parametric polymorphism) and run-time computation. It’s also curious in that, as far as I understand, it essentially “ignores TAPL” – there are no generics in the type system. It does, however, hit “instantiation-time errors” at full speed. It seems to me that building tooling for such a language would be pretty hard (it would be the anti-Go in this sense). But yeah, Zig so far for me is one of the languages which might have solved generics.

Haskell – it’s a bit hard to discuss what even is the generics impl in Haskell, as it’s unclear which subset of pragmas we are talking about, but I think any subset generally leads to galactic-brain types.

OCaml – with questionable equality semantics, and modular implicits still only “coming soon”, I think it’s clear that generics are not solved there yet. It also has a separate language for functors, which seems pretty cognitively complex.

Scala – complexity-wise, I think it’s a super-set of both Haskell and Java? I don’t have a working memory of Scala to suggest specific criticisms, but I tend to believe the “Scala is complex” meme. Although maybe it’s much better in Scala 3?

Java – Java’s type system is broken in a trivial way (covariant arrays) and in a (couple of?) interesting ways (they extended type inference when they added lambdas, and that inference allowed the materialization of some un-denotable types which break the type system). I am also not sure that it’s “LoL covariant arrays”: some more modern typed languages make the same decision, citing reduction of cognitive load. And variance indeed seems to be quite a complex topic – Effective Java (I think?) spends quite some pages explaining “producer extends, consumer super”.

C# – I know very little about C#, but it probably can serve as a counter-example to “generics in GC languages are simple”. Like Go, C# has value types, and, IIRC, it implements generics by just-in-time monomorphisation, which seems to require quite a bit of machinery.


Now, what I think makes these systems complicated is not just parametric polymorphism, but bounded quantification. The desire to express not only <T>, but <T: Ord>. Indeed, to quote TAPL,

                                                                                                                                                  This chapter introduces bounded quantification, which arises when polymorphism and subtyping are combined, substantially increasing both the expressive power of the system and its metatheoretic complexity.

I do think that there’s an under-explored design space of non-bounded generics, and I very much agree with @c-cube . I am not quite convinced that it would work, or that Go should have been SML without functors and with channels, but that also doesn’t seem obviously worse than just three generic types! The main doubt for me is that having both interfaces and unbounded generics feels weird. But yeah, once I have spare time for implementing a reasonably complete language, unbounded generics is what I’d go for!
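In Go terms, the distinction looks roughly like this (Max needs the cmp package from Go 1.21; both functions are merely illustrative):

// Unbounded <T>: the body can only move values of T around.
func First[T any](xs []T) T { return xs[0] }

// Bounded <T: Ord>: the cmp.Ordered constraint licenses < and >.
func Max[T cmp.Ordered](a, b T) T {
    if a > b {
        return a
    }
    return b
}
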

                                                                                                                                                  EDIT: forgot Swift, which absolutely tops my personal chart of reasons why adding generics to a language is not a simple matter: https://forums.swift.org/t/swift-type-checking-is-undecidable/39024.

                                                                                                                                                  1. 1

                                                                                                                                                    Graydon certainly read TAPL

                                                                                                                                                    Ah. that’s one hypothesis down then. Thanks for the correction.

It holds in specific contexts: when you own the transitive closure of the code you are working with (handmade-style projects, or working on specific things at the base of the stack, like crypto libraries, alone or in a very small and tightly-knit team). For “industrial” projects (and that’s the niche for Go), users simply don’t have the luxury of ignoring parts of the language. If you work on an average codebase with >10 programmers and >100k lines of code, the codebase will use everything which is accepted by the compiler without warnings.

OK, while I do have some experience with big projects, my best work by far was in smaller ones (including my crypto library, which by the way did not even need any generics to implement). What experience I do have with bigger projects, however, has shown me that most of the time (that is, as long as I don’t have to debug something), the only pieces of the language I have to care about are those used in the API of whatever I’m using. And those tend to be much more reasonable than whatever was needed to implement them. Take the C++ STL for an extreme example: when was the last time you actually specified the allocator of a container? Personally I’ve never done it in over 10 years of being paid to work with C++.

                                                                                                                                                    I personally am not aware of languages which solve generics problem in a low cognitive-load way

                                                                                                                                                    I have written one. Not public, but that language I’ve written for test environments? It had generics (unbounded, no subtyping), and I didn’t even tell my users. I personally needed them to write some of the functions of the standard library, but once that was done, I thought users would not really need them. (Yeah, it was easier to add full blown generics than having a couple ad-hoc generic primitives.)

Now, what I think makes these systems complicated is not just parametric polymorphism, but bounded quantification. The desire to express not only <T>, but <T: Ord>.

                                                                                                                                                    Yeah, about subtyping…

In the code I write, which I reckon has been heavily influenced by an early exposure to OCaml (without the object part), I almost never use subtyping. Like, maybe 3 or 4 times in my entire career, two of which were in languages that didn’t have closures (C++98 and C). If I design a language for myself, subtyping will be way down my list of priorities. Generics and closures will come first, and with closures I’ll have my poor man’s classes in the rare cases I actually need them. Even in C I was able to add virtual tables by hand the one time I had to have subtype polymorphism (it’s in my crypto library; it’s the only way I found to support several EdDSA hashes without resorting to a compilation flag).
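A sketch of that escape hatch, transposed to Go for consistency with the rest of the thread (the original was C, and the hash shape is hypothetical): a struct of function values doubles as a hand-written vtable, and closures capturing state give the poor man’s class.

type hash struct {
    update func(p []byte)
    sum    func() []byte
}

func newToyHash() hash {
    var state []byte // captured by the closures; a toy accumulator, not a real hash
    return hash{
        update: func(p []byte) { state = append(state, p...) },
        sum:    func() []byte { return state },
    }
}
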

I’ve heard of subtyping being successfully used elsewhere. Niklaus Wirth took tremendous advantage of it with his Oberon programming language and operating system. But I believe he didn’t have parametric polymorphism or closures either, and in this case I reckon subtyping & class-based polymorphism are an adequate substitute.

                                                                                                                                                    Type classes (or traits) however are really enticing. I need more practice to have a definite opinion on them, though.

                                                                                                                                                    1. 2

                                                                                                                                                      Ah. that’s one hypothesis down then. Thanks for the correction.

                                                                                                                                                      To clarify, graydon, as far as I know, didn’t participate in Go’s design at all (he designed Rust), so this has no bearing on your assumption about designers of Go.

                                                                                                                                                      1. 2

                                                                                                                                                        Crap! Well, at least his argument has value.

                                                                                                                                                  2. 2

                                                                                                                                                    If generics are simple, can you read this issue thread and tell everyone else how to fix comparable to be consistent? TIA.

                                                                                                                                                    1. 2

                                                                                                                                                      Generics are simple under a critical condition: Design them from the ground up

Done after the fact, in a system not designed for them, of course they’re going to be difficult. Then again, why waste a couple of weeks on up-front design when you can afford years of waiting and months of pain?

                                                                                                                                                      1. 3

                                                                                                                                                        I don’t agree that generics are simple, but I agree that designing them in from the start is vastly easier than trying to retrofit them to an existing language. There are a lot of choices in how generics interact with your type system (especially in the case of a structural type system, which Go has, and even more so in an algebraic type system). If you build generics in as part of your type system from the start then you can explore the space of things allowed by generics and the other features that you want. If you don’t, then you may find that the point that you picked in the space of other things that you want is not in the intersection of that and generics.

                                                                                                                                                        1. 0

                                                                                                                                                          I don’t agree that generics are simple

                                                                                                                                                          I kinda had to change my mind on that one. Generics can be very simple in some contexts (like my own little language), but I see now that Go wasn’t one of them.

                                                                                                                                                          If you build generics in as part of your type system from the start then you can explore the space of things allowed by generics and the other features that you want. If you don’t, then you may find that the point that you picked in the space of other things that you want is not in the intersection of that and generics.

One thing I have taken for granted since I got out of college in 2007 is that we want generics. If your language is even slightly general purpose, it will need generics. Even for my little specialised language, I didn’t plan to add generics, but some functions in my standard library required them. So the idea of even attempting to design a language without generics feels like an obvious waste of time to me. Likewise for closures and sum types, by the way.

There is one thing that can make me change my mind: systematic experimentation in my niche of choice. I design my language, and I write a real program with it, trying to use as few features as possible. For instance, my experience writing a cryptographic library in C convinced me that such libraries don’t need generics at all. Surprisingly though, I did see a case for subtype polymorphism on some occasions, but that happens rarely enough that an escape hatch like writing your vtable by hand is good enough. I believe Jonathan Blow is doing something similar for his gaming language, and Niklaus Wirth definitely did the same for Pascal (when he devised Modula and Oberon).

                                                                                                                                                          1. 2

One thing I have taken for granted since I got out of college in 2007 is that we want generics. If your language is even slightly general purpose, it will need generics

                                                                                                                                                            I completely agree here. I wrote a book about Go that was finished just after Go reached 1.0 and, even then, I thought it was completely obvious that once you’ve realised that you need generic maps you should realise that you will need other generic types. Having maps (and arrays and slices) as special-case generics felt like a very bad decision.

                                                                                                                                                            Mind you, I also thought that having a language that encouraged concurrency and made data races undefined behaviour, but didn’t provide anything in the type system to allow you to define immutable types or to limit aliasing was also a terrible idea. It turns out that most of the folks I’ve spoken to who use Go use it as a statically compiled Python replacement and don’t use the concurrency at all. There was a fantastic paper at ASPLOS a couple of years back that looked at concurrency bugs in Go and found that they are very common.

                                                                                                                                                            1. 1

I think Oberon and Go have a lot in common. Both were created by an experienced language designer late in his career, emphasising “simplicity” over everything else and leaving out even basic things such as enums, even though he had created more expressive languages earlier on.

                                                                                                                                                              1. 2

I tend to be more sympathetic to Wirth’s decisions, because he was working in a closed ecosystem he completely controlled and understood. I mean, there were fewer than 5 people working on the entire OS + compiler + main applications, and, in the case of the Lilith computer and FPGA Oberon, even the hardware!

He could apply criteria such as “does this optimisation make the entire compiler bootstrap faster?” That one is about speed, but I can see the idea that every single piece of complexity has to pay for itself. It was not enough for a feature to be beneficial; the benefits had to outweigh the costs. And he could readily see when they did, because he could fit the entire system in his head.

                                                                                                                                                                Go is in a different situation, where from the start it was intended for a rather large audience. Thus, the slightest benefit to the language, even if it comes at a significant up-front cost, is liable to pay huge dividends as it becomes popular. So while I can believe even generics may not be worth the trouble for a 10K LOC system (the size of the Oberon system), that’s a different story entirely when people collectively write hundreds of millions of lines of code.

                                                                                                                                                                1. 2

                                                                                                                                                                  The best characterisation of Go and Oberon I can come up with is »stubborn«.

                                                                                                                                                      2. 2

While Go is garbage collected, it is still very much “value oriented” in that it gives the programmer control over memory layout, much like C++ and Rust. Just making everything a pointer and plugging your ears to the issues that brings isn’t solving the problem.

I’m glad that you added generics to your small scripting language for a test environment in 1 week. I don’t think that speaks much to the difficulty of adding them to a different language with a different type system and different goals. When they started the language, things like “very fast compile times” were a very high priority, with the slowness of C++’s template system heavily motivating that goal. It would be inane to start a project to avoid a problem in a language and then cause the exact same problem in it.

                                                                                                                                                        So, they didn’t want to implicitly box everything based on their experience with Java, and they didn’t want to template everything based on their experience with C++. Finding and implementing a middle ground is, in fact, difficult. Can you name any languages that avoid the problems described in https://research.swtch.com/generic?

                                                                                                                                                        The problem with “lol no generics” is that the word “generics” sweeps the whole elephant under the rug. There are a fractal of decisions to make when designing them with consequences in the type system, runtime, compiler implementation, programmer usability, and more. I can’t think of any languages that have the exact same generics system. Someone who prefers Rust generics can look at any other language and say “lol no traits”, and someone who prefers avoiding generics entirely (which, I promise, is a reasonable position to hold) may look around and say “lol 2 hour compile times”. None of those statements advance any conversation or are a useful way to react.

                                                                                                                                                        1. 1

                                                                                                                                                          I can’t think of any languages that have the exact same generics system. Someone who prefers Rust generics can look at any other language and say “lol no traits”,

                                                                                                                                                          Aren’t Rust traits analogous to Swift protocols?

                                                                                                                                                          1. -1

While Go is garbage collected, it is still very much “value oriented” in that it gives the programmer control over memory layout, much like C++ and Rust.

                                                                                                                                                            That kind of changes everything… Oops. (Really, I mean it.)

When they started the language, things like “very fast compile times” were a very high priority, with the slowness of C++’s template system heavily motivating that goal.

Several things conspire to make C++ compile times slow. The undecidable grammar, the complex templates, and the header files. Sure we have pre-compiled headers, but in general, those are just copy pasta that are being parsed and analysed over and over and over again. I’ve seen bloated header-only libraries add a full second of compilation time per .cpp file. And it was a logging library, so it was included everywhere. Properly isolating it solved the problem, and the overhead was reduced to one second for the whole project.

                                                                                                                                                            A simpler grammar is parsed basically instantly. Analysis may be slower depending on how advanced static checks are, but at least in a reasonable language it only has to happen once. Finally there’s code generation for the various instantiations, and that may take some time if you have many types to instantiate. But at least you don’t have to repeat the analysis, and the most efficient optimisations don’t take that much compilation time anyway.

                                                                                                                                                            1. 3

                                                                                                                                                              The undecidable grammar, the complex templates, and the header files. Sure we have pre-compiled headers, but in general, those are just copy pasta that are being parsed and analysed over and over and over again.

The parsing and analysis isn’t the whole problem. The C++ compilation model is an incremental evolution of Mary Allen Wilkes’ design from the ’70s, which was specifically designed to allow compiling complex programs on machines with around 2 KiB of memory. Each file is compiled separately and then pasted together in a completely separate link phase. In C++, inline and templated functions (including methods on templated classes) are emitted in every compilation unit that uses them. If 100 files use std::vector<int>::push_back then 100 instances of the compiler will create that template instantiation (including semantic analysis), generate IR for it, optimise it, and (if they still have calls to it left after inlining) spit out a copy of it in a COMDAT in the final binary. It will then be discarded at the end.

                                                                                                                                                              Sony has done some great work on a thing that they call a ‘compilation database’ to address this. In their model, when clang sees a request for std::vector<int>::push_back, it queries a central service to see if it’s already been generated. It can skip the AST generation and pull the IR straight from the service. Optimisers can then ignore this function except for inlining (and can provide partially optimised versions to the database). A single instance is emitted in the back end. This gives big compile time speedups, without redesigning the language.

                                                                                                                                                              It’s a shame that the Rust developers didn’t build on this model. Rust has a compilation model that’s more amenable to this kind of (potentially distributed) caching than C++.

                                                                                                                                                          2. 1

What’s clear to me is that Go’s designers didn’t read Pierce’s Types and Programming Languages,

                                                                                                                                                            I’ll just leave this here https://www.research.ed.ac.uk/en/publications/featherweight-go

                                                                                                                                                            1. 0

                                                                                                                                                              I meant back when Go first came out. That was over 12 years ago, in 2009. This paper is from 2020.

Nevertheless, the present thread has significantly lowered my confidence in that claim. I am no longer certain Go’s designers failed to read Pierce’s work or its equivalent; I now merely find it quite plausible.

                                                                                                                                                            2. 1

What’s clear to me is that Go’s designers didn’t read Pierce’s Types and Programming Languages . . .

Do you really think that Ken Thompson, Rob Pike, and Robert Griesemer were ignorant to this degree? That they made the design decisions they did based on a lack of theoretical knowledge?

                                                                                                                                                              1. 2

None of them is known for statically typed functional languages, and I know for a fact there is little cross-talk between that community and the rest of the world. See Java, designed in 1995, twenty years after ML showed the world not only how to do generics, but how neat sum types are. Yet Java’s designers chose to have null instead, and generics came only years later. Now I kinda forgive them for not including generics at a time when they likely believed class-based polymorphism would replace parametric polymorphism (a.k.a. generics), but come on, adding null when we had a 20-year-old better alternative out there?

So yeah, ignorance is not such an outlandish hypothesis, even with such people. (Edit: apparently one of them did read TAPL, so that should falsify the ignorance hypothesis after all.)

But that’s not my only hypothesis. Another possibility is contempt for their users. In their quest for simplicity, they may have thought the brains of Go programmers would be too weak to behold the frightening glory of generics. That instead they’d stay in the shelter of more familiar languages like Python or C. I’m not sure how right they may have been on that one, to be honest. There are so many people that don’t see the obvious truth that programming is a form of applied maths (some of them explicitly fled maths), that I can understand they may panic at the first sight of an unspecified type. But come on, we don’t have to use generics just because they’re there. There’s no observable difference between a built-in map and one that uses generics. Users can use the language now, and learn generics later. See how many people use C++’s STL without knowing the first thing about templates.

                                                                                                                                                                Yet another hypothesis is that they were in a real hurry, JavaScript style, and instead of admitting they were rushed, they rationalised the lack of generics like it was a conscious decision. Perhaps they were even told by management to not admit to any mistake or unfinished job.

                                                                                                                                                                1. 6

                                                                                                                                                                  But that’s not my only hypothesis. Another possibility is contempt for their users. … Yet another hypothesis is that they were in a real hurry, JavaScript style, and instead of admitting they were rushed, they rationalised the lack of generics like it was a conscious decision. Perhaps they were even told by management to not admit to any mistake or unfinished job.

                                                                                                                                                                  This is exhausting and frustrating. It’s my own fault for reading this far, but you should really aim for more charity when you interpret others.

                                                                                                                                                                  1. 3

                                                                                                                                                                    In the words of Rob Pike himself:

                                                                                                                                                                    The key point here is our programmers are Googlers, they’re not researchers. They’re typically, fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.

                                                                                                                                                                    I’d say that makes it quite clear.

                                                                                                                                                                    1. 0

                                                                                                                                                                      Ah, I didn’t remember that quote, thank you. That makes the contempt hypothesis much more plausible.

That being said, there’s a simple question of fact that is very difficult to ascertain: what is the “average programmer” capable of understanding and using, and at what cost? I personally have a strong intuition that generics don’t introduce unavoidable complexity significant enough to make people’s lives harder, but I’m hardly aware of any scientific evidence to that effect.

We need psychologists and sociologists to study this.

                                                                                                                                                                    2. 1

I’ve run out of charitable interpretations, to be honest. Go’s designers made a mistake, plain and simple. And now that generics have been added, that mistake has mostly been fixed.

                                                                                                                                                                      1. 1

                                                                                                                                                                        I’m surprised you’re saying this after noticing that you didn’t know basic things about Go’s type system and implementation and learning those details “changes everything” (which, in my opinion, is commendable). Indeed, you’ve also apparently learned many facts about the authors and their history in this thread. Perhaps this is a good moment to be reflective about the arguments you’re presenting, with how much certainty you’re presenting them, and why.

                                                                                                                                                                        1. 2

                                                                                                                                                                          A couple things:

                                                                                                                                                                          • I still think that omitting generics from a somewhat general purpose language past 2005 or so is a mistake. The benefits are just too large.
                                                                                                                                                                          • I’ve seen comments about how the standard library itself had to jump through some hoops that wouldn’t be there if Go had generics from the start. So Go authors did have some warning.
                                                                                                                                                                          • Go now has generics, even though adding them after the fact is much harder. There can be lots of reasons for this change, but one of them remains an admission of guilt: “oops we should have added generics, here you are now”.

                                                                                                                                                                          So yeah, I still believe beyond reasonable doubt that omitting generics back then was a mistake.

                                                                                                                                                                    3. 3

                                                                                                                                                                      All of your “analyses” are rooted in a presumption of ignorance, or malice, or haughty superiority, or some other bad-faith foundation. Do you really thing that’s the truth of the matter?

                                                                                                                                                                      There are so many people that don’t see the obvious truth that programming is a form of applied maths

                                                                                                                                                                      Some programming is a form of applied math. Most programming, as measured by the quantity of code which exists and is maintained by human beings, is not. Most programming is the application of computational resources to business problems. It’s imperative, it’s least-common-denominator, and it’s boring.

                                                                                                                                                                      1. 1

                                                                                                                                                                        All of your “analyses” are rooted in a presumption of ignorance, or malice, or haughty superiority, or some other bad-faith foundation. Do you really thing that’s the truth of the matter?

                                                                                                                                                                        The only way it’s false is if omitting generics was the right thing to do. I don’t believe that for a second. It was a mistake, plain and simple. And what could possibly cause mistakes, if not some form of incompetence or malice?

                                                                                                                                                                        Most programming is the application of computational resources to business problems. It’s imperative, it’s least-common-denominator, and it’s boring.

                                                                                                                                                                        It’s also maths. It’s also the absolutely precise usage of a formal notation that ends up being transformed into precise instructions for an (admittedly huge) finite state machine. Programs are still dependency graphs, whose density is very important for maintainability — even the boring ones.

                                                                                                                                                                        It’s not the specific kind of maths you’ve learned in high school, but it remains just as precise. More precise in fact, given how unforgiving computers are.

                                                                                                                                                                        1. 2

                                                                                                                                                                          The only way it’s false is if omitting generics was the right thing to do. I don’t believe that for a second.

                                                                                                                                                                          “The right thing to do” is a boolean outcome of some function. That function doesn’t have a a single objective definition, it’s variadic over context. Can you not conceive of a context in which omitting generics was the right thing to do?

                                                                                                                                                                          1. 2

                                                                                                                                                                            I see some:

                                                                                                                                                                            1. Designing a language before Y2K. Past 2005, it is too easy to know about them to ignore them.
                                                                                                                                                                            2. Addressing a specific niche for which generics don’t buy us much.
                                                                                                                                                                            3. Generics are too difficult to implement.
                                                                                                                                                                            4. Users would be too confused by generics.
                                                                                                                                                                            5. Other features incompatible with generics are more important.

                                                                                                                                                                            Go was designed too late for (1) to fly with me, and it is too general purpose for (2). I even recall seeing evidence that its standard library would have significantly benefited from generics. I believe Rust and C++ have disproved (3) despite Go using value types extensively. And there’s no way I believe (4), given my experience in OCaml and C++. And dammit, Go did add generics after the fact, which disavows (4), mostly disproves (3), and utterly destroys (5). (And even back then I would have a hard time believing (5), generics are too important in my opinion.)

                                                                                                                                                                            So yeah, I can come up with various contexts where omitting generics is the right think to do. What I cannot do is find one that is plausible. If you can, I’m interested.

                                                                                                                                                                            1. 1

                                                                                                                                                                              [Go] is too general purpose for (2). I even recall seeing evidence that its standard library would have significantly benefited from generics.

                                                                                                                                                                              You don’t need to speculate about this stuff, the rationale is well-defined and recorded in the historical record. Generics were omitted from the initial release because they didn’t provide value which outweighed the cost of implementation, factoring in overall language design goals, availability of implementors, etc. You can weight those inputs differently than the authors did, and that’s fine. But what you can’t do is claim they were ignorant of the relevant facts.

                                                                                                                                                                              I believe Rust and C++ have disproved (3)

                                                                                                                                                                              A language is designed as a whole system, and its features define a vector-space that’s unique to those features. Details about language L1 don’t prove or disprove anything about language L1. The complexity of a given feature F1 in language L1 is completely unrelated to any property of that feature in L2. So any subjective judgment of Rust has no impact on Go.

                                                                                                                                                                              Go did add generics after the fact, which disavows (4), mostly disproves (3), and utterly destroys (5).

                                                                                                                                                                              Do you just not consider the cost of implementation and impact on the unit whole as part of your analysis? Or do you weight these things so minimally as to render them practically irrelevant?

                                                                                                                                                                              Generics did not materially impact the success of the goals which Go set out to solve initially. Those goals did not include any element of programming language theory, language features, etc., they were explicitly expressed at the level of business objectives.

                                                                                                                                                                              1. 1

                                                                                                                                                                                You don’t need to speculate about this stuff, the rationale is well-defined and recorded in the historical record.

                                                                                                                                                                                What record I have read did not convince me. If you know of a convincing article or discussion thread, I’d like to read it. A video would work too.

                                                                                                                                                                                I believe Rust and C++ have disproved (3)

                                                                                                                                                                                A language is designed as a whole system […]

                                                                                                                                                                                I picked Rust for a specific reason: manual memory management, which means value types everywhere, and the difficulties they imply for generics. That said, I reckon that Go had the additional difficulty of having suptyping. But here’s the thing: in a battle between generics and subtyping, if implementing both is too costly, I personally tend to sacrifice subtyping. In a world of closures, subtyping and suptype polymorphism simply are not needed.

                                                                                                                                                                                Do you just not consider the cost of implementation and impact on the unit whole as part of your analysis?

                                                                                                                                                                                I’m not sure what you mean there… I think pretty much everyone agrees that designing and implementing generics up front is much easier than doing so after the fact, in a system not designed for them. If the Go team/community were able to shoulder the much higher cost of after-the-fact generics, then they almost certainly could have shouldered the cost of up-front generics back then —even though the team was much smaller.

                                                                                                                                                                                Generics did not materially impact the success of the goals which Go set out to solve initially.

                                                                                                                                                                                Well if they just wanted to have a big user base, I agree. The Google brand and the reputation of its designers did most of that work. As for real goals, they’re the same as any language: help the target audience write better programs for cheaper in the target niche. And for this, I have serious doubts about the design of Go.

                                                                                                                                                                                Now as @xigoi pointed out, Go authors targetted noobs. That meant making the language approachable by people who don’t know the relevant theory. That didn’t mean making the language itself dumb. Because users can’t understand your brilliant language doesn’t mean they won’t be able to use it. See every C++ tutorial ever, where you’re introduced to its features bit by bit. For instance when we learn I/O in C++ we don’t get taught about operator overloading (<< and >> magically work on streams, and we don’t need to know why just yet). Likewise we don’t learn template meta programming when we first encounter std::vector.

                                                                                                                                                                                People can work with generics before understanding them. They won’t write generic code just yet, but they absolutely can take advantage of already written code. People can work with algebraic data types. They won’t write those types right away, but they can absolutely take advantage of the option type for the return value of functions that may fail.

                                                                                                                                                                                A language can be brilliant and approachable. Yet Go explicitly chose to be dumb, as if it was the only way to be easy to work with. Here’s the thing though: stuff like the lack of generics and sum types tends to make Go harder to work with. Every time someone needed a generic data structure, they had to sacrifice type safety and resort to various conversion to and from the empty interface. Every time someone needs to report failures to the caller, they ended up returning multiple values, making things not only cumbersome, but also fairly easy to miss —with sum types at least the compiler warns you when you forget a case.

                                                                                                                                                                                It’s all well and good to design a language for other people to use, but did they study the impact of their various decision on their target audience? If I’m writing a language for myself I can at least test it on myself, see what feels good, what errors I make, how fast I program… and most of my arguments will be qualitative. But if I’m writing for someone else, I can’t help but start out with preconceived notions about my users. Maybe even a caricature. At some point we need to test our assumptions.

                                                                                                                                                                                Now I say that, such studies are bloody expensive, so I’m not sure what’s the solution there. When I made my little language, I relied on preconceived notions too. We had the requirements of course, but all I knew wast that my users weren’t programmers by trade. So I made something that tries its best to get out of their way (almost no explicit typing by default), and reports errors early (static typing rules). I guess I got lucky, because I was told later that they were happy with my language (and home grown languages have this reputation for being epically unusable).

                                                                                                                                                                                1. 1

                                                                                                                                                                                  But here’s the thing: in a battle between generics and subtyping, if implementing both is too costly, I personally tend to sacrifice subtyping. In a world of closures, subtyping and suptype polymorphism simply are not needed.

                                                                                                                                                                                  Do you consider generics and subtyping and polymorphism and other programming language properties means to an end, or ends in themselves?

                                                                                                                                                                                  1. 0

                                                                                                                                                                                    Of course they’re a means to an end, why do you even ask?

                                                                                                                                                                                    In the programs I write, I need inheritance or subtyping maybe once a year. Rarely enough that using closures as a poor’s man classes is adequate. Heck, even writing the odd virtual table in C is enough in practice.

                                                                                                                                                                                    Generics however were much more useful to me, for two purposes: first, whenever I write a new data structure or container, it’s nice to have it work on arbitrary data types. For standard libraries it is critical: you’ll need what, arrays & slices, hash tables, maybe a few kind of trees (red/black, AVL…).

                                                                                                                                                                                    I don’t write those on a day to day basis, though. I’ve also grown susceptible to Mike Acton’s arguments that being generic often causes more problems than it solves, at least when speed matters. One gotta shape one’s program to one’s data, and that makes generic data structures much less useful.

                                                                                                                                                                                    My second purpose is less visible, but even more useful: the right kind of generics help me enforce separation of concerns to prevent bugs. That significantly speeds up my development. See, when a type is generic you can’t assume anything about values of that type. At best you can copy values around. Which is exactly what I’m looking for when I want to isolate myself from that type. That new data structure I’m devising just for a particular type? I’m still going to use generics if I can, because they make sure my data structure code cannot mess with the objects it contains. This drastically reduces the space of possible programs, which is nice when the correct programs are precisely in that space. You can think of it as defining bugs out of existence, like Ousterhout recommends in A Philosophy of Software Design.

                                                                                                                                                                                2. 1

                                                                                                                                                                                  A language is designed as a whole system, and its features define a vector-space that’s unique to those features.

                                                                                                                                                                                  How do you add two languages, or multiply a language by an element of a field?

                                                                                                                                                              1. 2

                                                                                                                                                                A key aspect of Rust is allowing company lock-in. Once you have large teams, you find that lack of a large standard library allows you to create a proprietary custom one. You can create a library, or even an ecosystem, that does prevents skills from transferring to a new position and thus lowers your staffing costs.

                                                                                                                                                                It is easily defended as ‘cutting edge’.

                                                                                                                                                                This leads to the odd question: is it a good idea for a developer to work in Rust?

                                                                                                                                                                1. 27

                                                                                                                                                                  A key aspect of Rust is allowing company lock-in. Once you have large teams, you find that lack of a large standard library allows you to create a proprietary custom one. You can create a library, or even an ecosystem, that does prevents skills from transferring to a new position and thus lowers your staffing costs.

                                                                                                                                                                  This might be more convincing if, say, crates.io didn’t exist and if Cargo wasn’t built entirely around making a shared ecosystem easy, but as it stands this idea seems, to put it mildly, preposterous – and suggests you’re reaching for uncharitable assertions.

                                                                                                                                                                  The job market for developers with experience in a language with a famously batteries-included standard lib, Python, currently pays a lot less than it does for Rust (there are more Python jobs, of course, but your average salary is about $30,000 less per: https://www.zdnet.com/article/heres-how-much-money-you-can-make-as-a-developer-in-2021/). There’s zero reason to imagine that a larger standard library size is correlated with higher pay, or vice-versa.

                                                                                                                                                                  1. 16

                                                                                                                                                                    I’ve used Rust as several jobs and never seen a company create any kind of alternate standard library for it, proprietary or otherwise. There are a number of well-known crates used for specific things - e.g. reqwests, serde, nom - that are open source, easily available via crates.io, and applicable to projects across multiple firms. Which is similar to the situation in other modern languages with easily-accessible package ecosystems like Python and JavaScript, and is the opposite of discouraging skill transfer.

                                                                                                                                                                    1. 14

                                                                                                                                                                      The exact same thing can be said of C and its tiny stdlib. Did it lead to this situation you describe?

                                                                                                                                                                      1. 12

                                                                                                                                                                        You can create a library, or even an ecosystem, that does prevents skills from transferring to a new position and thus lowers your staffing costs.

                                                                                                                                                                        Someone would have to be unusually incompetent to be unable to transfer their skills from one standard lib to another. Fundamentals, people! Once you know them, it’s mostly a matter of knowing how stuff is named. You won’t hit the ground running, but then again nobody does. Even when you already know the standard stuff there will be tons of proprietary and bespoke code to wade through.

                                                                                                                                                                      1. 11

                                                                                                                                                                        As someone who is rather new to languages like C (I only recently got into it by making a game with it), I have a few newbie questions:

                                                                                                                                                                        • Why do people want to replace C? Security reasons, or just old and outdated?

                                                                                                                                                                        • What does Hare offer over C? They say that Hare is simpler than C, but I don’t understand exactly how. Same with Zig. Do they compile to C in the end, and these languages just make it easier for user to write code?

                                                                                                                                                                        That being said, I find it cool to see these languages popping up.

                                                                                                                                                                        1. 33

                                                                                                                                                                          Why do people want to replace C? Security reasons, or just old and outdated?

                                                                                                                                                                          • #include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from
                                                                                                                                                                          • C’s macro system is very, very error prone and very easily abused, since it’s basically a glorified search-and-replace system that has no way to warn you of mistakes.
                                                                                                                                                                          • There are no methods for structs, you basically create struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages)
                                                                                                                                                                          • C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).
                                                                                                                                                                          • C’s standard library is really tiny, so you end up creating your own in the process, which you end up carrying around from project to project.
                                                                                                                                                                          • C’s standard library isn’t really standard, a lot of stuff isn’t consistent across OS’s. (I have agreeable memories of that time I tried to get a simple 3kloc project from Linux running on Windows. The amount of hoops you have to jump through, tearing out functions that are Linux-only and replacing them with an ifdef mess to call Windows-only functions if you’re on compiling on Windows and the Linux versions otherwise…)
                                                                                                                                                                          • C’s error handling is completely nonexistant. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (for each possible returned error), but if you do that, you need to have the actual return value as a pointer argument.
                                                                                                                                                                          • C has no anonymous functions. (Whether this matters really depends on your coding style.)
                                                                                                                                                                          • Manual memory management without defer is a PITA and error-prone.
                                                                                                                                                                          • Weird integer type system. long long, int, short, etc which have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)

                                                                                                                                                                          EDIT: As Forty-Bot noted, one of the biggest issues are null-terminated strings.

                                                                                                                                                                          I could go on and on forever.

                                                                                                                                                                          What does Hare offer over C?

                                                                                                                                                                          It fixes a lot of the issues I mentioned earlier, as well as reducing footguns and implementation-defined behavior in general. See my blog post for a list.

                                                                                                                                                                          They say that Hare is simpler than C, but I don’t understand exactly how.

                                                                                                                                                                          It’s simpler than C because it comes without all the cruft and compromises that C has built up over the past 50 years. Additionally, it’s easier to code in Hare because, well, the language isn’t trying to screw you up every 10 lines. :^)

                                                                                                                                                                          Same with Zig. Do they compile to C in the end, and these languages just make it easier for user to write code?

                                                                                                                                                                          Zig and Hare both occupy the same niche as C (i.e., low-level manual memory managed systems language); they both compile to machine code. And yes, they make it a lot easier to write code.

                                                                                                                                                                          1. 15

                                                                                                                                                                            Thanks for the great reply, learned a lot! Gotta say I am way more interested in Hare and Zig now than I was before.

                                                                                                                                                                            Hopefully they gain traction. :)

                                                                                                                                                                            1. 15

                                                                                                                                                                              #include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from

                                                                                                                                                                              This and your later point about not being able to associate methods with struct definitions are variations on the same point but it’s worth repeating: C has no mechanism for isolating namespaces. A C function is either static (confined to a single compilation unit) or completely global. Most shared library systems also give you a package-local form but anything that you’re exporting goes in a single flat namespace. This is also true of type and macro definitions. This is terrible for software engineering. Two libraries can easily define different macros with the same name and break compilation units that want to use both.

                                                                                                                                                                              C++, at least, gives you namespaces for everything except macros.

                                                                                                                                                                              C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).

                                                                                                                                                                              The lack of type checking is really important here. A systems programming language is used to implement the most critical bits of the system. Type checks are incredibly important here, casting everything via void* has been the source of vast numbers of security vulnerabilities in C codebases. C++ templates avoid this.

                                                                                                                                                                              C’s standard library is really tiny, so you end up creating your own in the process, which you end up carrying around from project to project.

                                                                                                                                                                              This is less of an issue for systems programming, where a large standard library is also a problem because it implies dependencies on large features in the environment. In an embedded system or a kernel, I don’t want a standard library with file I/O. Actually, for most cloud programming I’d like a standard library that doesn’t assume the existence of a local filesystem as well. A bigger problem is that the library is not modular and layered. Rust’s nostd is a good step in the right direction here.

                                                                                                                                                                              C’s error handling is completely nonexistant. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (for each possible returned error), but if you do that, you need to have the actual return value as a pointer argument.

                                                                                                                                                                              From libc, most errors are not returned, they’re signalled via the return and then stored in a global (now a thread-local) variable called errno. Yay. Option types for returns are really important for maintainable systems programming. C++ now has std::optional and std::variant in the standard library, other languages have union types as first-class citizens.

                                                                                                                                                                              Manual memory management without defer is a PITA and error-prone.

                                                                                                                                                                              defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.

                                                                                                                                                                              C has no anonymous functions. (Whether this matters really depends on your coding style.)

                                                                                                                                                                              Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this, C does not.

                                                                                                                                                                              1. 6

                                                                                                                                                                                defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.

                                                                                                                                                                                True. I’m more saying that defer is the baseline here; without it you need cleanup: labels, gotos, and synchronized function returns. It can get ugly fast.

                                                                                                                                                                                Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this, C does not.

                                                                                                                                                                                I disagree, depends on what you’re doing. I’m doing a roguelike in Zig right now, and I use anonymous functions quite extensively for item/weapon/armor/etc triggers, i.e., where each game object has some unique anonymous functions tied to the object’s fields and can be called on certain events. Having closures would be nice, but honestly in this use-case I didn’t really feel much of a need for it.

                                                                                                                                                                              2. 3

                                                                                                                                                                                Note that C does have “standard” answers to a lot of these.

                                                                                                                                                                                C’s macro system is very, very error prone and very easily abused, since it’s basically a glorified search-and-replace system that has no way to warn you of mistakes.

                                                                                                                                                                                The macro system is the #1 thing keeping C alive :)

                                                                                                                                                                                There are no methods for structs, you basically create struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages)

                                                                                                                                                                                Aside from macro stuff, the typical way to address this is to use a struct of function pointers. So you’d create a wrapper like

                                                                                                                                                                                do_stuff(struct *foo)
                                                                                                                                                                                {
                                                                                                                                                                                    foo->do_stuff(foo);
                                                                                                                                                                                }
                                                                                                                                                                                

                                                                                                                                                                                C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).

                                                                                                                                                                                Note that typically there is a “base class” which either all “subclasses” include as a member (and use offsetof to recover the subclass) or have a void * private data pointer. This doesn’t really escape the problem, however in practice I’ve never run into a bug where the wrong struct/method gets combined. This is because the above pattern ensures that the correct method gets called.

                                                                                                                                                                                C’s error handling is completely nonexistant. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (for each possible returned error), but if you do that, you need to have the actual return value as a pointer argument.

                                                                                                                                                                                Well, there’s always errno… And if you control the address space you can always use the upper few addresses for error codes. That said, better syntax for multiple return values would probably go a long way.

                                                                                                                                                                                C has no anonymous functions. (Whether this matters really depends on your coding style.)

                                                                                                                                                                                IIRC gcc has them, but they require executable stacks :)

                                                                                                                                                                                Manual memory management without defer is a PITA and error-prone.

                                                                                                                                                                                Agree. I think you can do this with GCC extensions, but some sugar here would be nice.

                                                                                                                                                                                Weird integer type system. long long, int, short, etc which have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)

                                                                                                                                                                                Arguably there should be fixed width types, size_t, intptr_t, and regsize_t. Unfortunately, C lacks the last one, which is typically assumed to be long. Rust, for example, gets this even more wrong and lacks the last two (c.f. the recent post on 129-bit pointers).


                                                                                                                                                                                IMO you missed the most important part, which is that C strings are (by-and-large) nul-terminated. Having better syntax for carrying a length around with a pointer would go a long way to making string support better.

                                                                                                                                                                              3. 9

                                                                                                                                                                                Even in C’s domain, where C lacks nothing and is fine for what it is, I would criticize C for maybe 5 things, which I would consider the real criticism:

                                                                                                                                                                                1. It has undefined behaviour, of the kind that has come to mean that the compiler may disobey the source code. It turns working code into broken code just by switching compiler or inlining some code that wasn’t inlined before. You can’t necessarily point at a piece of code and say it was always broken, because UB is a runtime phenomenon. Not reassuring for a supposedly lowlevel language.
                                                                                                                                                                                2. Its operator precedence is wrong.
                                                                                                                                                                                3. Integer promotion. Just why.
                                                                                                                                                                                4. Signedness propagates the wrong way: Instead of the default type being signed (int) and comparison between signed and unsigned yielding unsigned, it should be opposite: There should be a nat type (for natural number, effectively size_t), and comparison between signed and unsigned should yield signed.
                                                                                                                                                                                5. char is signed. Nobody likes negative code points.
                                                                                                                                                                                1. 6

                                                                                                                                                                                  the kind that has come to mean that the compiler may disobey the source code. It turns working code into broken code

                                                                                                                                                                                  I’m wary of this same tired argument cropping up again, so I’ll just state it this way: I disagree. Code that invokes undefined behavior is already broken; changing compiler can’t (except perhaps in very particular circumstances, which I don’t think you were referring to) introduce undefined behaviour; it can change the observable behaviour when UB is invoked.

                                                                                                                                                                                  A compiler can’t “disobey the source code” whilst conforming to the language standard. If the source code does something that doesn’t have defined semantics, that’s on the source code, not the compiler.

                                                                                                                                                                                  “It’s easy to accidentally invoke undefined behaviour in C” is a valid criticism, but “C compilers breaks code” is not.

                                                                                                                                                                                  You can’t necessarily point at a piece of code and say it was always broken

                                                                                                                                                                                  You certainly can in some instances. But sure, for example, if some piece of code dereferences a pointer and the value is set somewhere else, it could be undefined or not depending on whether the pointer is valid at the point it is dereferenced. So code might be “not broken” given certain constraints (eg that the pointer is valid), but not work properly if those constraints are violated, just like code in any language (although in C there’s a good chance the end result is UB, which is potentially more catastrophic).

                                                                                                                                                                                  I’m not saying C is a good language, just that I think this particular criticism is unfair. (Also I think your point 5 is wrong, char can be unsigned, it’s up to the implementation).

                                                                                                                                                                                  1. 7

                                                                                                                                                                                    Thing is, it certainly feels like the compiler is disobeying the source code. Signed integer overflow? No problem pal, this is x86, that platform will wrap around just fine! Right? Riiight? Oops, nope, and since the compiler pretends UB does not exist, it just deleted a security check that it deemed “dead code”, and now my hard drive has been encrypted by a ransomware that just exploited my vulnerability.

                                                                                                                                                                                    Though I agree with all the facts you laid out, and with the interpretation that UB means the program is already broken even if the generated binary didn’t propagate the error. But Chandler Carruth pretending that UB does not invoke the nasal demons is not far. Let’s not forget that UB means the compiler is allowed to cause your entire hard drive to be formatted, as ridiculous as it may sound. And sometimes it actually happens (as it did so many times with buffer overflow exploits).

                                                                                                                                                                                    Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.

                                                                                                                                                                                    1. 3

                                                                                                                                                                                      Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.

                                                                                                                                                                                      I feel like “disobeying the code” and “not doing what I intended it to do due to the code being wrong” are still two sufficiently different things that it’s worth distinguishing.

                                                                                                                                                                                      1. 4

                                                                                                                                                                                        Okay, it is worth distinguishing.

But it is also worth noting that C is quite special. This UB business repeatedly violates the principle of least astonishment, especially under the modern interpretation, where compilers systematically assume UB does not exist and treat any code path that hits UB as “dead code”.

The original intent of UB was much closer to implementation-defined behaviour. Signed integer overflow was originally UB because some platforms crashed or otherwise went bananas when it occurred. But the expectation was that on platforms that behave reasonably (like x86, which wraps around), we’d get the reasonable behaviour. Then compiler writers (or should I say their lawyers) noticed that, strictly speaking, the standard didn’t make that expectation explicit, and in the name of optimisation started to invoke nasal demons even on platforms that could have done the right thing.

                                                                                                                                                                                        Sure the code is wrong. In many cases though, the standard is also wrong.

                                                                                                                                                                                        1. 4

                                                                                                                                                                                          I agree with some things but not others that you say, but these arguments have been hashed out many times before.

                                                                                                                                                                                          Sure the code is wrong

                                                                                                                                                                                          That’s the point I was making. Since we agree on that, and we agree that there are valid criticisms of C as a language (though we may differ on the specifics of those), let’s leave the rest. Peace.

                                                                                                                                                                                    2. 4

                                                                                                                                                                                      But why not have the compiler reject the code instead of silently compiling it wrong?

                                                                                                                                                                                      1. 2

                                                                                                                                                                                        It doesn’t compile it wrong. Code with no semantics can’t be compiled incorrectly. You’re making the exact same misrepresentation as in the post above that I responded to originally.

                                                                                                                                                                                        1. 3

                                                                                                                                                                                          Code with no semantics shouldn’t be able to be compiled at all.

                                                                                                                                                                                          1. 1

I’d almost agree, though I can think of some cases where such code could exist for a reason (and I’ll bet that such code exists in real code bases). In particular, hairy macro expansions and the like, which produce code that is never executed (or at least won’t be executed in the case where it would be UB) in order to perform compile-time type-safety checks; see the sketch below. IIRC there are a few such things used in the Linux kernel. There are probably plenty of other cases; there’s a lot of C code out there.
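
For instance, here is a sketch of that pattern, modelled loosely on the Linux kernel’s typecheck() macro (it relies on GNU C statement expressions):

    /* Compile-time type check: the pointer comparison is never used
       for anything at run time -- it exists only so the compiler
       warns when `x` does not have the expected type. */
    #define typecheck(type, x)                    \
        ({  type          __dummy;                \
            __typeof__(x) __dummy2;               \
            (void)(&__dummy == &__dummy2);        \
            1;                                    \
        })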

                                                                                                                                                                                            In practice though, a lot of code that potentially exhibits UB only does so if certain constraints are violated (eg if a pointer is invalid, or if an integer is too large and will result in overflow at some operation), and the compiler can’t always tell that the constraints necessarily will be violated, so it generates code with the assumption that if the code is executed, then the constraints do hold. So if the larger body of code is wrong - the constraints are violated, that is - the behaviour is undefined.

                                                                                                                                                                                            1. 1

                                                                                                                                                                                              In particular, hairy macro expansions etc which produce code that isn’t even executed (or won’t be executed in the case where it would be UB

                                                                                                                                                                                              That’s why it’s good to have a proper macro system that isn’t literally just find and replace.

                                                                                                                                                                                              In practice though, a lot of code that potentially exhibits UB only does so if certain constraints are violated

                                                                                                                                                                                              True, and I’m mostly talking about UB that can be detected at compile time, such as f(++x, ++x).
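
For what it’s worth, compilers can already flag that particular case at compile time; a sketch (the function f is a made-up name, and GCC’s -Wsequence-point and Clang’s -Wunsequenced are the relevant warnings):

    #include <stdio.h>

    void f(int a, int b) { printf("%d %d\n", a, b); }

    int main(void) {
        int x = 0;
        f(++x, ++x);   /* unsequenced modifications of x: UB */
        return 0;
    }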

                                                                                                                                                                                  2. 6

                                                                                                                                                                                    Contrary to what people are saying, C is just fine for what it is.

                                                                                                                                                                                    People complain about the std library being tiny, but you basically have the operating system at your fingers, where C is a first class citizen.

Then people complain C is not safe; yes, that’s true, but with a set of best practices you can keep things under control.

People complain you don’t have generics; you don’t need them most of the time.

Projects like nginx, SQLite and redis, not to speak of the Nix world, prove that C is a perfectly fine language. Also, most of the popular Python libraries nowadays are written in C.

                                                                                                                                                                                    1. 25

                                                                                                                                                                                      Hi! I’d like to introduce you to Fish in a Barrel, a bot which publishes information about security vulnerabilities to Twitter, including statistics on how many of those vulnerabilities are due to memory unsafety. In general, memory unsafety is easy to avoid in languages which do not permit memory-unsafe operations, and nearly impossible to avoid in other languages. Because C is in the latter set, C is a regular and reliable source of security vulnerabilities.

                                                                                                                                                                                      I understand your position; you believe that people are morally obligated to choose “a set of best practices” which limits usage of languages like C to supposedly-safe subsets. However, there are not many interesting subsets of C; at best, avoiding pointer arithmetic and casts is good, but little can be done about the inherent dangers of malloc() and free() (and free() and free() and …) Moreover, why not consider the act of choosing a language to be a practice? Then the choice of C can itself be critiqued as contrary to best practices.
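
To illustrate the free()-and-free() point, a minimal sketch of my own (not from any of the projects mentioned): nothing in the language stops the second call, and the behaviour is undefined.

    #include <stdlib.h>

    int main(void) {
        char *p = malloc(16);
        if (p == NULL)
            return 1;
        free(p);
        free(p);   /* double free: UB, often corrupts allocator state */
        return 0;
    }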

nginx is well-written, but Redis is not. SQLite is not written in just C, but in several languages combined, including SQL and TH1 (“test harness one”); this latter language exists specifically for testing that SQLite behaves properly. All three have had memory-unsafety bugs. This suggests that even well-written C, or C in combination with other languages, is unsafe.

                                                                                                                                                                                      Additionally, Nix is written in C++ and package definitions are written in shell. I prefer PyPy to CPython; both are written in a combination of C and Python, with CPython using more C and PyPy using more Python. I’m not sure where you were headed here; this sounds like a popularity-contest argument, but those are not meaningful in discussions about technical issues. Nonetheless, if it’s the only thing that motivates you, then consider this quote from the Google Chrome security team:

                                                                                                                                                                                      Since “memory safety” bugs account for 70% of the exploitable security bugs, we aim to write new parts of Chrome in memory-safe languages.

                                                                                                                                                                                      1. 3

                                                                                                                                                                                        I am curious about your claim that Redis is not well-written? I’ve seen other folks online hold it up as an example of a well-written C codebase, at least in terms of readability.

I understand that readable is not the same as secure, but would like to understand where you are coming from on this.

                                                                                                                                                                                        1. 1

                                                                                                                                                                                          It’s 100% personal opinion.

                                                                                                                                                                                      2. 9

                                                                                                                                                                                        Projects like nginx, SQLite and redis, not to speak about the Nix world prove that C is perfectly fine of a language.

                                                                                                                                                                                        Ah yes, you can see the safety of high-quality C in practice:

https://nginx.org/en/security_advisories.html
https://www.cvedetails.com/vulnerability-list/vendor_id-18560/product_id-47087/Redislabs-Redis.html

                                                                                                                                                                                        Including some fun RCEs, like CVE-2014-0133 or CVE-2016-8339.

                                                                                                                                                                                        1. 2

I also believe C will still have a place for a long time. I know I’m a newbie with it, but making a game in C (using Raylib) has been pretty fun. It’s simple and to the point… And I don’t mind making mistakes, really; that’s how I learn best.

                                                                                                                                                                                          But again it’s cool to see people creating new languages as alternatives.

                                                                                                                                                                                        2. 4

                                                                                                                                                                                          What does Hare offer over C?

                                                                                                                                                                                          Here’s a list of ways that Drew says Hare improves over C:

                                                                                                                                                                                          Hare makes a number of conservative improvements on C’s ideas, the biggest bet of which is the use of tagged unions. Here are a few other improvements:

                                                                                                                                                                                          • A context-free grammar
                                                                                                                                                                                          • Less weird type syntax
                                                                                                                                                                                          • Language tooling in the stdlib
                                                                                                                                                                                          • Built-in and semantically meaningful static and runtime assertions
                                                                                                                                                                                          • A lightweight system for dependency resolution
                                                                                                                                                                                          • defer for cleanup and error handling
                                                                                                                                                                                          • An optional build system which you can replace with make and standard tools

                                                                                                                                                                                          Even with these improvements, Hare manages to be a smaller, more conservative language than C, with our specification clocking in at less than 1/10th the size of C11, without sacrificing anything that you need to get things done in the systems programming world.

                                                                                                                                                                                          It’s worth reading the whole piece. I only pasted his summary.

                                                                                                                                                                                        1. 40

                                                                                                                                                                                          I tried out this language while it was in early development, writing some of the standard library (hash::crc* and unix::tty::*) to test the language. I wrote about this experience, in a somewhat haphazard way. (Note, that blog post is outdated and not all my opinions are the same. I’ll be trying to take a second look at Hare in the coming days.)

In general, I feel like Hare just ends up being a Zig without comptime, or a Go without interfaces, generics, GC, or runtime. I really hate to say this about a project where the authors have put in such a huge amount of effort over the past year or so, but I just don’t see its niche – the lack of generics means I’d always use Zig or Rust instead of Hare or C. It really looks like Drew looked at Zig, said “too bloated”, and set out to create his own version.

                                                                                                                                                                                          Another thing I find strange: why are you choosing to not support Windows and macOS? Especially since, you know, one of C’s good points is that there’s a compiler for every platform and architecture combination on earth?

                                                                                                                                                                                          That said, this language is still in its infancy, so maybe as time goes and the language finds more users we’ll see more use-cases for Hare.

                                                                                                                                                                                          In any case: good luck, Drew! Cheers!

                                                                                                                                                                                          1. 10

                                                                                                                                                                                            why are you choosing to not support Windows and macOS?

                                                                                                                                                                                            DdV’s answer on HN:

                                                                                                                                                                                            We don’t want to help non-FOSS OSes.

                                                                                                                                                                                            (Paraphrasing a lot, obvs.)

                                                                                                                                                                                            My personal 2c:

                                                                                                                                                                                            Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.

                                                                                                                                                                                            1. 13

                                                                                                                                                                                              Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.

                                                                                                                                                                                              I think that’s the consequence of not planning for Windows support in the first place. Rust’s standard library was built without the assumption of an underlying Unix-like system, and it provides good abstractions as a result.

                                                                                                                                                                                              1. 5

                                                                                                                                                                                                Amos talks about that here: Go’s file APIs assume a Unix filesystem. Windows support was kludged in later.

                                                                                                                                                                                              2. 5

                                                                                                                                                                                                Windows and Mac/iOS don’t need help from new languages; it’s rather the other way around. Getting people to try a new language is pretty hard, let alone getting them to build real software in it. If the language deliberately won’t let them target three of the most widely used operating systems, I’d say it’s shooting itself in the foot, if not in the head.

                                                                                                                                                                                                (There are other seemingly perverse decisions too. 80-character lines and 8-character indentation? Manual allocation with no cleanup beyond a general-purpose “defer” statement? I must not be the target audience for this language, is the nicest response I have.)

                                                                                                                                                                                                1. 2

                                                                                                                                                                                                  Just for clarity, it’s not my argument. I was just trying to précis DdV’s.

                                                                                                                                                                                                  I am not sure I agree, but then again…

                                                                                                                                                                                                  I am not sure that I see the need for yet another C-replacement. Weren’t Limbo, D, Go, & Rust all attempts at this?

                                                                                                                                                                                                  But that aside: there are a lot of OSes out there that are profoundly un-Unix-like. Windows is actually quite close, compared to, say, Oberon or classic MacOS or Z/OS or OpenVMS or Netware or OS/2 or iTron or OpenGenera or [cont’d p94].

                                                                                                                                                                                                  There is a lot of diversity out there that gets ignored if it doesn’t have millions of users.

                                                                                                                                                                                                  Confining oneself to just OSes in the same immediate family seems reasonable and prudent to me.

                                                                                                                                                                                              3. 10

                                                                                                                                                                                                My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.

                                                                                                                                                                                                1. 20

                                                                                                                                                                                                  You could say that generics and macros are complex, relative to the functionality they offer.

                                                                                                                                                                                                  But I would put comptime in a different category – it’s reducing complexity by providing a single, more powerful mechanism. Without something like comptime, IMO static languages lose significant productivity / power compared to a dynamic language.

                                                                                                                                                                                                  You might be thinking about things from the tooling perspective, in which case both features are complex (and probably comptime even more because it’s creating impossible/undecidable problems). But in terms of the language I’d say that there is a big difference between the two.

                                                                                                                                                                                                  I think a language like Hare will end up pushing that complexity out to the tooling. I guess it’s like Go where they have go generate and relatively verbose code.

                                                                                                                                                                                                  1. 3

                                                                                                                                                                                                    Yup, agree that zig-style seamless comptime might be a great user-facing complexity reducer.

                                                                                                                                                                                                    1. 16

                                                                                                                                                                                                      I’m not being Zig-specific when I say that, by definition, comptime cannot introduce user-facing complexity. Unlike other attributes, comptime only exists during a specific phase of compiler execution; it’s not present during runtime. Like a static type declaration, comptime creates a second program executed by the compiler, and this second program does inform the first program’s runtime, but it is handled entirely by the compiler. Unlike a static type declaration, the user uses exactly the same expression language for comptime and runtime.

                                                                                                                                                                                                      If we think of metaprogramming as inherent complexity, rather than incidental complexity, then an optimizing compiler already performs compile-time execution of input programs. What comptime offers is not additional complexity, but additional control over complexity which is already present.

                                                                                                                                                                                                      To put all of this in a non-Zig context, languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming. Some folks argue that this introduces complexity. But the complexity of the Python runtime is there regardless of whether modules get an extra code-execution phase; the extra phase provides expressive power for users, not new complexity.

                                                                                                                                                                                                      1. 8

                                                                                                                                                                                                        Yeah, but I feel like this isn’t what people usually mean when they say some feature “increases complexity.”

I think they mean something like: now I must know more to navigate this world. There will be, on average, a wider array of common usage patterns that I will have to understand. You can say that the complexity was already there anyway, but if, in practice, it was usually hidden, and now it’s not, doesn’t that matter?

                                                                                                                                                                                                        then an optimizing compiler already performs compile-time execution of input programs.

                                                                                                                                                                                                        As a concrete example, I don’t have to know about a new keyword or what it means when the optimizing compiler does its thing.

                                                                                                                                                                                                        1. 2

A case can be made that complexity in this sense both “matters” and is a “good thing” to surface, since surfacing it improves code quality:

                                                                                                                                                                                                          Similar arguments can be used for undefined behavior (UB) as it changes how you navigate a language’s world. But for many programmers, it can be usually hidden by code seemingly working in practice (i.e. not hitting race conditions, not hitting unreachable paths for common input, updating compilers, etc.). I’d argue that this still matters (enough to introduce tooling like UBSan, ASan, and TSan at least).

                                                                                                                                                                                                          The UB is already there, both for correct and incorrect programs. Providing tools to interact with it (i.e. __builtin_unreachable -> comptime) as well as explicit ways to do what you want correctly (i.e. __builtin_add_overflow -> comptime specific lang constructs interacted with using normal code e.g. for vs inline for) would still be described as “increases complexity” under this model which is unfortunate.

                                                                                                                                                                                                          1. 1

                                                                                                                                                                                                            The UB is already there, both for correct and incorrect programs.

Unless one is purposefully using a specific compiler (or set thereof) that actually defines the behaviour the standard didn’t, the program is incorrect. That it happens to generate correct object code with this particular version of that particular compiler on those particular platforms is just dumb luck.

Thus, I’d argue that tools like MSan, ASan, and UBSan don’t introduce any complexity at all. They just reveal the complexity of UB that was already there, and they do so reliably enough that they actually relieve me of some of the mental burden I previously had to shoulder.
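
As a concrete illustration (my own sketch), the signed-overflow example from earlier in the thread can be compiled with UBSan, which turns the silent UB into a diagnostic:

    /* overflow.c -- compile with:  cc -fsanitize=undefined overflow.c
       Running ./a.out prints a diagnostic along the lines of
       "runtime error: signed integer overflow: 2147483647 + 1 cannot
       be represented in type 'int'" instead of silently invoking UB. */
    #include <limits.h>

    int main(void) {
        int x = INT_MAX;
        return (x + 1) & 1;   /* UBSan reports the overflowing addition */
    }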

                                                                                                                                                                                                        2. 5

                                                                                                                                                                                                          languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming.

                                                                                                                                                                                                          Python doesn’t allow compile-time metaprogramming for any reasonable definition of the word. Everything happens and is introspectable at runtime, which allows you to do similar things, but it’s not compile-time metaprogramming.

                                                                                                                                                                                                          One way to see this is that sys.argv is always available when executing Python code. (Python “compiles” byte code, but that’s an implementation detail unrelated to the semantics of the language.)

                                                                                                                                                                                                          On the other hand, Zig and RPython are staged. There is one stage that does not have access to argv (compile time), and another one that does (runtime).

                                                                                                                                                                                                          Related to the comment about RPython I linked here:

                                                                                                                                                                                                          http://www.oilshell.org/blog/2021/04/build-ci-comments.html

                                                                                                                                                                                                          https://old.reddit.com/r/ProgrammingLanguages/comments/mlflqb/is_this_already_a_thing_interpreter_and_compiler/gtmbno8/

                                                                                                                                                                                                          1. 4

                                                                                                                                                                                                            Yours is a rather unconventional definition of complexity.

                                                                                                                                                                                                            1. 5

                                                                                                                                                                                                              I am following the classic paper, “Out of the Tar Pit”, which in turn follows Brooks. In “Abstractive Power”, Shutt distinguishes complexity from expressiveness and abstractedness while relating all three.

                                                                                                                                                                                                              We could always simply go back to computational complexity, but that doesn’t capture the usage in this thread. Edit for clarity: Computational complexity is a property of problems and algorithms, not a property of languages nor programming systems.

                                                                                                                                                                                                              1. 3

                                                                                                                                                                                                                Good faith question: I just skimmed the first ~10 pages of “Out of the Tar Pit” again, but was unable to find the definition that you allude to, which would exclude things like the comptime keyword from the meaning of “complexity”. Can you point me to it or otherwise clarify?

                                                                                                                                                                                                                1. 4

                                                                                                                                                                                                                  Sure. I’m being explicit for posterity, but I’m not trying to be rude in my reply. First, the relevant parts of the paper; then, the relevance to comptime.

                                                                                                                                                                                                                  On p1, complexity is defined as the tendency of “large systems [to be] hard to understand”. Unpacking their em-dash and subjecting “large” to the heap paradox, we might imagine that complexity is the amount of information (bits) required to describe a system in full detail, with larger systems requiring more information. (I don’t really know what “understanding” is, so I’m not quite happy with “hard to understand” as a concrete definition.) Maybe we should call this “Brooks complexity”.

                                                                                                                                                                                                                  On p6, state is a cause of complexity. But comptime does not introduce state compared to an equivalent non-staged approach. On p8, control-flow is a cause of complexity. But comptime does not introduce new control-flow constructs. One could argue that comptime requires extra attention to order of evaluation, but again, an equivalent program would have the same order of evaluation at runtime.

                                                                                                                                                                                                                  On p10, “sheer code volume” is a cause of complexity, and on this point, I fully admit that I was in error; comptime is a long symbol, adding size to source code. In this particular sense, comptime adds Brooks complexity.

                                                                                                                                                                                                                  Finally, on a tangent to the top-level article, p12 explains that “power corrupts”:

                                                                                                                                                                                                                  [I]n the absence of language-enforced guarantees (…) mistakes (and abuses) will happen. This is the reason that garbage collection is good — the power of manual memory management is removed. … The bottom line is that the more powerful a language (i.e. the more that is possible within the language), the harder it is to understand systems constructed in it.

                                                                                                                                                                                                                  comptime and similar metaprogramming tools don’t make anything newly possible. It’s an annotation to the compiler to emit specialized code for the same computational result. As such, they arguably don’t add Brooks complexity. I think that this argument also works for inline, but not for @compileError.

                                                                                                                                                                                                      2. 18

                                                                                                                                                                                                        My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.

Yeah, I can see that. But under what conditions would I care how small, big, or ice-cream-covered the compiler is? Building/bootstrapping for a new platform is a one-time thing, but writing code in the language isn’t. I want the language to make it as easy as possible on me when I’m using it, and omitting features that have been around since the 1990s isn’t helping.

                                                                                                                                                                                                        1. 8

                                                                                                                                                                                                          Depends on your values! I personally see how, eg, generics entice users to write overly complicated code which I then have to deal with as a consumer of libraries. I am not sure that not having generics solves this problem, but I am fairly certain that the problem exists, and that some kind of solution would be helpful!

                                                                                                                                                                                                          1. 3

                                                                                                                                                                                                            In some situations, emitted code size matters a lot (and with generics, that can quickly grow out of hand without you realizing it).

                                                                                                                                                                                                            1. 13

                                                                                                                                                                                                              In some situations

I see what you mean, but I think in those situations it’s not too hard to, you know, refrain from using generics. I see no reason to force all language users to not use that feature. Unless Hare is specifically aiming for that niche, which I don’t think it is.

                                                                                                                                                                                                              1. 4

                                                                                                                                                                                                                There are very few languages that let you switch between monomorphisation and dynamic dispatch as a compile-time flag, right? So if you have dependencies, you’ve already had the choice forced on you.
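
To spell out the trade-off in plain C (a sketch of my own; cmp_int and DEFINE_MAX are made-up names): dynamic dispatch keeps one copy of the code behind a function pointer, while monomorphisation stamps out one specialised copy per type, which is where the code-size growth comes from.

    #include <stdlib.h>

    /* Dynamic dispatch: one shared sort routine (qsort), parameterised
       at run time by a comparison function pointer. */
    int cmp_int(const void *a, const void *b) {
        const int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }
    /* usage: qsort(xs, n, sizeof(int), cmp_int); */

    /* Monomorphisation: the macro generates a separate function per
       type, so every instantiation adds to the emitted code size. */
    #define DEFINE_MAX(T) \
        static T max_##T(T a, T b) { return a > b ? a : b; }

    DEFINE_MAX(int)      /* emits max_int */
    DEFINE_MAX(double)   /* emits max_double */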

                                                                                                                                                                                                                1. 6

                                                                                                                                                                                                                  If you don’t like how a library is implemented, then don’t use it.

                                                                                                                                                                                                                  1. 2

                                                                                                                                                                                                                    Ah, the illusion of choice.

                                                                                                                                                                                                          2. 10

                                                                                                                                                                                                            Where is the dividing line? What makes functions “not complex” but generics, which are literally functions evaluated at compile time, “complex”?

                                                                                                                                                                                                            1. 14

                                                                                                                                                                                                              I don’t know where the line is, but I am pretty sure that this is past that :D

                                                                                                                                                                                                              https://github.com/diesel-rs/diesel/blob/master/diesel_cli/src/infer_schema_internals/information_schema.rs#L146-L210

                                                                                                                                                                                                              1. 17

                                                                                                                                                                                                                Sure, that’s complicated. However:

1. that’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere, and I am not convinced that complexity that is abstracted away and provides value is a bad thing, as much of the “let’s go back to simpler times” discourse seems to imply. I’d rather someone take the time to solve something once than have me solve it every time, even if with simpler code.

2. Is this just complex, or is it actually doing more than the equivalent in other languages? Rust allows for expressing constraints that are not easily (or at all) expressible in other languages, and static types allow for expressing more constraints than dynamic types in general.

                                                                                                                                                                                                                In sum, I’d reject a pull request with this type of code in an application, but don’t mind it at all in a library.

                                                                                                                                                                                                                1. 4

                                                                                                                                                                                                                  that’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere,

                                                                                                                                                                                                                  I find that’s rarely the case. It’s often possible to tweak the approach to a problem a little bit, in a way that allows you to simply omit huge swaths of complexity.

                                                                                                                                                                                                                  1. 3

Possible, yes. Often? Not convinced. Practical? I am willing to bet some money that it’s not.

                                                                                                                                                                                                                    1. 7

                                                                                                                                                                                                                      I’ve done it repeatedly, as well as seeing others do it. Occasionally, though admittedly rarely, reducing the size of the codebase by an order of magnitude while increasing the number of features.

                                                                                                                                                                                                                      There’s a huge amount of code in most systems that’s dedicated to solving optional problems. Usually the unnecessary problems are imposed at the system design level, and changing the way the parts interface internally allows simple reuse of appropriate large-scale building blocks and subsystems, reduces the number of building blocks needed, and drops entire sections of translation and negotiation glue between layers.

                                                                                                                                                                                                                      Complexity rarely needs to be somewhere – and where it does need to be, it’s in often in the ad-hoc, problem-specific data structures that simplify the domain. A good data structure can act as a laplace transform for the entire problem space of a program, even if it takes a few thousand lines to implement. It lets you take the problem, transform it to a space where the problem is easy to solve, and put it back directly.

                                                                                                                                                                                                                2. 7

                                                                                                                                                                                                                  You can write complex code in any language, with any language feature. The fact that someone has written complex code in Rust with its macros has no bearing on the feature itself.

                                                                                                                                                                                                                  1. 2

                                                                                                                                                                                                                    It’s the Rust culture that encourages things like this, not the fact that Rust has parametric polymorphism.

                                                                                                                                                                                                                    1. 14

                                                                                                                                                                                                                      I am not entirely convinced – to me, it seems there’s a high correlation between languages with parametric polymorphism and languages with culture for high-to-understand abstractions (Rust, C++, Scala, Haskell). Even in Java, parts that touch generics tend to require some mind-bending (producer extends consumer super).

                                                                                                                                                                                                                      I am curious how Go’s generic would turn out to be in practice!

                                                                                                                                                                                                                      1. 8

                                                                                                                                                                                                                        Obligatory reference for this: F# Designer Don Syme on the downsides of type-level programming

                                                                                                                                                                                                                        I don’t want F# to be the kind of language where the most empowered person in the discord chat is the category theorist.

                                                                                                                                                                                                                        It’s a good example of the culture and the language design being related.

                                                                                                                                                                                                                        https://lobste.rs/s/pkmzlu/fsharp_designer_on_downsides_type_level

                                                                                                                                                                                                                        https://old.reddit.com/r/ProgrammingLanguages/comments/placo6/don_syme_explains_the_downsides_of_type_classes/

                                                                                                                                                                                                                        which I linked here: http://www.oilshell.org/blog/2022/03/backlog-arch.html

                                                                                                                                                                                                              2. 3

                                                                                                                                                                                                                In general, I feel like Hare just ends up being a Zig without comptime, or a Go without interfaces, generics, GC, or runtime. … I’d always use Zig or Rust instead of Hare or C.

                                                                                                                                                                                                                What if you were on a platform unsupported by LLVM?

                                                                                                                                                                                                                When I was trying out Plan 9, lack of LLVM support really hurt; a lot of good CLI tools these days are being written in Rust.

                                                                                                                                                                                                                1. 15

                                                                                                                                                                                                                  Zig has rudimentary plan9 support, including a linker and native codegen (without LLVM). We’ll need more plan9 maintainers to step up if this is to become a robust target, but the groundwork has been laid.

                                                                                                                                                                                                                  Additionally, Zig has a C backend for those targets that only ship a proprietary C compiler fork and do not publish ISA details.

                                                                                                                                                                                                                  Finally, Zig has the ambitions to become the project that is forked and used as the proprietary compiler for esoteric systems. Although of course we would prefer for businesses to make their ISAs open source and publicly documented instead. Nevertheless, Zig’s MIT license does allow this use case.

                                                                                                                                                                                                                  1. 2

                                                                                                                                                                                                                    I’ll be damned! That’s super impressive. I’ll look into Zig some more next time I’m on Plan 9.

                                                                                                                                                                                                                  2. 5

                                                                                                                                                                                                                    I think that implies that your platform is essentially dead ( I would like to program my Amiga in Rust or Swift or Zig, too) or so off-mainstream (MVS comes to mind) that those tools wouldn’t serve any purpose anyway because they’re too alien).

                                                                                                                                                                                                                    1. 5

                                                                                                                                                                                                                      Amiga in Rust or Swift or Zig, too)

                                                                                                                                                                                                                      Good news: LLVM does support 68k, in part to many communities like the Amiga community. LLVM doesn’t like to include stuff unless there’s a sufficient maintainer base, so…

                                                                                                                                                                                                                      MVS comes to mind

                                                                                                                                                                                                                      Bad news: LLVM does support S/390. No idea if it’s just Linux or includes MVS.

                                                                                                                                                                                                                      1. 1

                                                                                                                                                                                                                        Good news: LLVM does support 68k Unfortunately, that doesn’t by itself mean that compilers (apart from clang) get ported, or that the platform gets added as part of a target triple. For instance, Plan 9 runs on platforms with LLVM support, yet isn’t supported by LLVM.

                                                                                                                                                                                                                        Bad news: LLVM does support S/390. I should have written VMS instead.

                                                                                                                                                                                                                        1. 1
                                                                                                                                                                                                                      2. 2

                                                                                                                                                                                                                        I won’t disagree with describing Plan 9 as off-mainstream ;) But I’d still like a console-based Signal client for that OS, and the best (only?) one I’ve found is written in Rust.

                                                                                                                                                                                                                  1. 16

                                                                                                                                                                                                                    I ignored almost everything and went straight to the bit I have some expertise on: cryptography.

                                                                                                                                                                                                                    • I like the relative lack of bloat. We could argue that their cryptographic library is not complete, but that can be fixed.
                                                                                                                                                                                                                    • I like that (apparently) slices are used for the API. Having written a cryptographic library in C, I saw how we are reading from and writing to buffers all the time, and having to specify their length explicitly means my functions have many more arguments than I would have liked.
                                                                                                                                                                                                                    • I like the choice of primitives. Except perhaps AES (slow or vulnerable to timing attacks on pure software implementations), but I understand its appeal in the face of widespread hardware support.

                                                                                                                                                                                                                    There is one thing I’d like to insist on, that I realised fairly late in my cryptographic career: naive key exchange is not high-level enough.

                                                                                                                                                                                                                    By “key exchange”, I mean the kind of key exchange that happens in NaCl’s crypto_box(): a key exchange proper followed by a hash, so the two parties have a shared key. Nowadays it’s not quite enough to just exchange Alice’s and Bob’s long term keys, we also want stronger properties like forward secrecy and key compromise impersonation resistance. To do that, you need a full key exchange protocol involving 2 or 3 messages in most cases. I don’t know what Hare is actually using here (the key exchange link is not live), but if they don’t have it already, something like Noise would be a good addition some time in the future.

                                                                                                                                                                                                                    1. 2

                                                                                                                                                                                                                      Speaking of verification, I have a seemingly simple problem the systems I tried it on (TLA+, Coq) seem to be unable to address (or, more likely, I don’t have the tools).

                                                                                                                                                                                                                      So I have these two integers a and b, that are in [0, K] (where K is a positive integer). I would like to prove the following:

                                                                                                                                                                                                                      • a + b ≤ 2 K
                                                                                                                                                                                                                      • a × b

                                                                                                                                                                                                                      Should be easy, right? Just one little snag: K is often fairly big, typically around 2^30 (my goal here is to prove that a given big number arithmetic never causes limb overflow). I suspect naive SAT solving around Peano arithmetic is not going to cut it.

                                                                                                                                                                                                                      1. 4

                                                                                                                                                                                                                        This should be pretty easy for any modern SMT solver to prove. I’m not exactly an expert, but this seems to work for Z3:

                                                                                                                                                                                                                        ; Declare variables/constants
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (declare-const K Int)
                                                                                                                                                                                                                        (declare-const a Int)
                                                                                                                                                                                                                        (declare-const b Int)
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        ; Specify the properties of the variables
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (assert (> K 0))
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (assert (>= a 0))
                                                                                                                                                                                                                        (assert (>= b 0))
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (assert (<= a K))
                                                                                                                                                                                                                        (assert (<= b K))
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        ; Now let's prove facts
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (push) ; Save context
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        ; Note how I've actually inserted the opposite statement of what you are trying to prove, see below as to why
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (assert (> (+ a b) (* 2 K)))
                                                                                                                                                                                                                        (check-sat)
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        ; If you get an `unsat` answer, it means your statement is proved
                                                                                                                                                                                                                        ; If instead you get a `sat` answer, you can use the (get-model) command here
                                                                                                                                                                                                                        ; to get a set of variable assignments which satisfy all the assertions, including
                                                                                                                                                                                                                        ; the assertion stating the opposite of what you are trying to prove
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (pop) ; Restore context
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        (assert (> (* a b) (* K K)))
                                                                                                                                                                                                                        (check-sat)
                                                                                                                                                                                                                        
                                                                                                                                                                                                                        ; See above for the comment about the (get-model) command
                                                                                                                                                                                                                        

                                                                                                                                                                                                                        Save that to a file and then run z3 <file.smt>.

                                                                                                                                                                                                                        Z3 should give you 2 unsat answers in a fraction of a second, which means that your 2 statements were proven to be true. Notably, it proves this for any K > 0 (including 2^30, 2^250, 2^1231823, etc…)

                                                                                                                                                                                                                        As far as I understand, the biggest gotcha is that you have to negate the statement that you are trying to prove and then let the SMT solver prove that there is no combination of values for the a, b and K integers that satisfy all the assertions. It’s a bit unintuitive at first, but it’s not hard to get used to it.

                                                                                                                                                                                                                        1. 3

                                                                                                                                                                                                                          my goal here is to prove that a given big number arithmetic never causes limb overflow

                                                                                                                                                                                                                          I’m not exactly sure what you mean here, is it that you’re using modulo arithmetic? If not, I’ve got a little proof here in Coq:

                                                                                                                                                                                                                          Theorem for_lobsters : forall a b k : nat,
                                                                                                                                                                                                                            a<=k /\ b<=k -> a+b <= 2*k /\ a*b <= k*k.
                                                                                                                                                                                                                          Proof.
                                                                                                                                                                                                                            split.
                                                                                                                                                                                                                            - lia.
                                                                                                                                                                                                                            - now apply PeanoNat.Nat.mul_le_mono.
                                                                                                                                                                                                                          Qed.
                                                                                                                                                                                                                          

                                                                                                                                                                                                                          I think even if you’re doing modulo arithmetic, it shouldn’t be too hard to prove the given lemmas. But you might need to put some tighter restrictions on the bounds of a and b. For example requiring that a and b are both less than sqrt(k) (though this is too strict).

                                                                                                                                                                                                                          1. 1

                                                                                                                                                                                                                            My, I’m starting to understand why I couldn’t prove that trivial theorem:

                                                                                                                                                                                                                            • I’m not sure what “split” means, though I guess it splits conjunction in the conclusion of the theorem into 2 theorems…
                                                                                                                                                                                                                            • I have no idea what lia means.
                                                                                                                                                                                                                            • I have no idea how PeanoNat.Nat.mul_le_mono applies here. I guess I’ll have to look at the documentation.

                                                                                                                                                                                                                            Thanks a lot though, I’ll try this out.

                                                                                                                                                                                                                            1. 2

                                                                                                                                                                                                                              I’m not sure what “split” means, though I guess it splits conjunction in the conclusion of the theorem into 2 theorems…

                                                                                                                                                                                                                              Yep!

                                                                                                                                                                                                                              I have no idea what lia means.

                                                                                                                                                                                                                              The lia tactic solves arithmetic expressions. It will magically solve a lot of proofs that are composed of integer arithmetic. The docs on lia can be found here. Note that I omitted an import statement in the snippet above. You need to prepend From Coq Require Import Lia to use it.

                                                                                                                                                                                                                              I have no idea how PeanoNat.Nat.mul_le_mono applies here. I guess I’ll have to look at the documentation.

                                                                                                                                                                                                                              The mul_le_mono function has the following definition, which almost exactly matches the goal. I found it using Search "<=".

                                                                                                                                                                                                                              PeanoNat.Nat.mul_le_mono
                                                                                                                                                                                                                                   : forall n m p q : nat, n <= m -> p <= q -> n * p <= m * q
                                                                                                                                                                                                                              

                                                                                                                                                                                                                              I used now apply ... which is shorthand for apply ...; easy. The easy tactic will try to automatically solve the proof using a bunch of different tactics. You could do without the automation and solve the goal with apply PeanoNat.Nat.mul_le_mono; destruct H; assumption, if you’re so inclined.

                                                                                                                                                                                                                              I hope this is helpful!

                                                                                                                                                                                                                        1. 1

                                                                                                                                                                                                                          Crap.

                                                                                                                                                                                                                          Crap crap.

                                                                                                                                                                                                                          Crap crap crap crap crap.

                                                                                                                                                                                                                          1. 2

                                                                                                                                                                                                                            game over man, game over.