Threads for pzel

    1. 8

      Nice and heartening story. I wonder what it’s like for true systems consultants nowadays. Whether it’s still possible to throw one’s weight around and override the client’s wishes/opinions about implementation in the interest of fulfilling their wishes about outcomes. Somehow I get the impression that nowadays, if you’re not moving customers “to the cloud” or at least to K8S, you’re not finding gigs easily. If anyone has experiences to the contrary, please let me know.

    2. 4

      Is this actually specific to Go? I feel like it could be about nearly any language from the 1980s.

      1. 10

        You could write a regex to make this apply to anything with garbage collection and not too many features. The author also carelessly (and offensively) misapplies the concept of, and the word, “Tao”. Water’s flowing may be its wu wei, its lack of doing, its being: the water is flowing downhill, rather than downhill-ness being some innate property of the water. It doesn’t have “a Tao”. Tao isn’t just “work[ing] with the grain”; it’s understanding the natural order that resulted in the grain, so as to take the most effortless next action. I guess it’s just advertising copy so it’s not that important, but if I were a Taoist I’d be pretty miffed.

        1. 3

          Don’t be miffed. Maybe you’re already a Taoist dreaming that you’re a WilhelmVonWeiner.

      2. 2

        But then you get only 1% of the clicks…

      3. 2

        Given Go is essentially a reskin of Algol 68…

        1. 2

          What isn’t, these days?

    3. 3

      Regarding the bit about RSI-fodder ergonomics: I will again shamelessly plug my layout which solves these ;)

    4. 1

      Regarding this bit:

      It seeks to simplify modules by combining structures and signatures into one construct, and using a linking mechanism to avoid a lot of the boilerplate around functors.

      I think sml# has gone partway there with its notion of separate compilation and interface files.

    5. 19

      Very much an article about ‘how this regretted outcome came to be’. Some key paragraphs:

      A couple of teams, our most enthusiastic, early adopters of Elm, completed their migration away from React. Having worked hard to embrace Elm’s nirvana of type-safe, pure functional programming, the last thing those teams wanted to do was break out their increasingly rusty React skills whenever they contributed a change to a design system component.

      […]

      It seemed we were faced with a choice: Elm or React. Continuing to support both was fast becoming unsustainable for us.

      The thing that ultimately tipped the balance in React’s favour for us was that we acquired another company whose entire codebase was written in React, and whose team knew nothing about Elm. Overnight, we went from a company that was writing about equal amounts of Elm and React (and which might well have decided to double down on Elm) to one that was writing about 75% React.

      […]

      Elm is really designed to be your whole front end stack. We have gotten around this fact by investing in tools (i.e. super cool hacks) which allow Elm to integrate with our blended stack. But there are some consequences to using Elm in this way: […] we don’t get to use Elm as our single dependency — it is actually just one more (big) piece of complexity for the rest of our tools and code to consider. This means that we don’t see the benefits of Elm either as a low-maintenance front end stack, nor as a way to guarantee consistent low-cost context switching.

      […]

      we believe that choosing a single language and framework (React) for new projects is the best path for Culture Amp, as it will buy us economies of scale within the front end practice.

      1. 6

        So basically, “the benefits of using Elm didn’t weigh enough against the costs of integrating Elm with other systems”?

        Seems to me like a pretty decent example of why “Pinky & The Brain” systems seldom go anywhere. ;_;

        1. 18

          Seems to me like a pretty decent example of why “Pinky & The Brain” systems seldom go anywhere. ;_;

          My takeaway was more like “React is why we can’t have nice things” which … I kind of already knew was true.

          Extremely depressing but at the end of the day having better technology can’t overcome the enormous momentum of an industry that has enthusiastically embraced a mediocre alternative.

          1. 8

            … which in fairness isn’t exactly React’s fault; convergence on mediocrity (a.k.a. “best practices”) is an ongoing challenge in many industries.

            You want to have a defined minimum bar, but that’s almost certainly going to be a low bar.

            So then many pressures - the ability to hire quickly (including to easily assess candidates) and at scale, ability to treat engineering staff as fungible, ability to pay other people to solve your problems (or even look them up on StackExchange) - all come to bear.

            Some for better, some for worse, but you wind up with your industry’s equivalent of React. Cf. Java, or dietary guidance for children that’s a decade behind the science.

            1. 5

              Cf. Java

              Java’s old, but it’s anything but mediocre. Tastes vary; I get that. But it has been a high quality project for decades, and if they weren’t in such a rush to begin with (when Microsoft was trying to destroy Java), it would probably have way less tech debt (e.g. the bloated and super-complicated Taligent stuff that IBM donated, and lots of other “core” libraries like crypto that show every sign of having been rushed).

              I came from the C++ world, which had dramatically worse tool-chains, dramatically worse (or usually, just missing) libraries, and where a programming mistake was fatal (at least to the program). At the time, there were widely used “high level” languages like Perl, PHP, Visual Basic, Access, PowerBuilder, Gupta, DBase, FoxPro, Excel…. So in comparison, Java was shockingly better than the alternatives.

              And with Guy Steele involved, it started with a solid language spec and runtime spec. Almost unheard of back then.

              There’s nothing wrong with not liking a language, but Java is anything but mediocre.

              1. 3

                This feels a lot like damning with faint praise.

                if they weren’t in such a rush to begin with (when Microsoft was trying to destroy Java), it would probably have way less tech debt

                We’re talking about the Java that actually exists, not a hypothetical possible Java which might have been.

                Yeah, sure it’s better than C++ and PHP and VB, but we’re talking about a language that took decades to add something as fundamental as first-class functions. Even in the mid-90s when Java came out, closures weren’t exactly a cutting edge new technology.

                Just because it’s better than the three worst languages I can think of doesn’t mean it’s not mediocre.

                1. 3

                  we’re talking about a language that took decades to add something as fundamental as first-class functions

                  OK.

                  That doesn’t make a language mediocre.

            2. 2

              I’m interested in what was the “better” to Java’s “worse”, at the time Java was being pushed into popularity by Sun.

          2. 5

            Yeah, I find that when “worse is better” is invoked to retroactively justify the dominance of POSIX and C, the prevalent emotional response (including my own, guilty as charged) is positive. “Yeah! Worse is better! All that other stuff was so overengineered! Thank goodness for worse-is-better or we’d be stuck with Multics and Ada”

            But looking at all the “worse is better” situations playing out now, I can’t help, like you, feeling a bit depressed about all the beauty and elegance that is being trampled underfoot by giants (such as React, Javascript, Kubernetes, etc). Perhaps the better-is-better technologists of the past felt the same? Lispers have the unique privilege of having been on the side of better-is-better for several decades now, constantly being sidelined by pareto-sufficient copies of lispy technologies.

            I feel this says something about survivorship bias, 20:20 hindsight, etc.

            (*updated for typo)

            1. 2

              Kubernetes itself is amazingly engineered.

              It’s the problems that it’s trying to solve (a giant legacy layer cake of entropy) that we should all be embarrassed by 🤣

              (Although you probably were talking about all of the new layers of the cake that sprouted up on top of Kubernetes. Yeah, agreed on that.)

        2. 16

          Ah, there’s a few other nuances:

          Around this same time the momentum around Elm’s own development and that of its tooling was losing steam. … Culture Amp is a medium-sized tech company that can afford to contribute back to the open source ecosystem it depends upon, but in Elm’s case it was beginning to feel like we would have to contribute more than we would get back to make it work well for us.

          The breaking changes of Elm 0.18 → 0.19 were not unreasonable, and yet it took a small group of volunteers across multiple teams about a year to do it… When no one is finding the time and motivation to keep a technology healthy in your stack, you can infer how people feel about it.

          And the people on the front line were feeling the same about the cost/benefit analysis:

          I let these engineers know that I was thinking about moving Elm from “adopt” to “contain”, I asked them what they thought, and I listened.

          Every single one of them said they understood and agreed with the decision.

          1. 14

            I really appreciated that the author did one-on-ones with all the ICs about it.

        3. 9

          I’m not at all an Elm fan, but I think it actually did. It wasn’t sustainable long-term but the short-term and medium-term benefits were enough for them to get products launched and the business made apparently sustainable. I agree with Kevin (the author)’s assertion that:

          Just because a relationship ends doesn’t make it a failure.

        4. 3

          Seems to me like a pretty decent example of why “Pinky & The Brain” systems seldom go anywhere. ;_;

          I disagree with that characterisation. It’s obvious that Elm isn’t a good fit for all environments, but there are still plenty out there where the self-contained ecosystem is an advantage. Whether or not that’s enough to keep the project going is another question though.

          1. 2

            As a counterpoint, it seems like most successful languages have a smooth on-ramp for integration into existing code bases and ecosystems, of which the latter include type systems these days.

    6. 2

      Can anyone recommend a good retrospective on what happened to OpenID? Seems like that was an attempt to address this type of problem, but it never took off. Google, FB, Github, etc. branded their own versions instead. What was the problem? Technical shortcomings? Economic misalignment? Branding?

      1. 4

        Economic misalignment would be the euphemism. The data-collecting companies realized it would be pretty valuable to have a record of who was logging in to what sites when, so they became OIDC providers and disallowed logging in to themselves with OpenID (or each other’s OIDC tokens).

      2. 4

        Like several other popular technologies of the era, it suffered a Second System collapse. OpenID2 was widely deployed, then most of the same players started deploying OAuth2 for other purposes, people realised these protocols had a large overlap in tech (if not in purpose) and worked to create OIDC. While waiting for OIDC, the OpenID2 ecosystem collapsed, since there was no point in supporting the “old way”, and we were left with just OAuth2 + proprietary login extensions, which is still what we use today.

      3. 1

        Also, there is the availability issue, exactly as I’ve described in this article.

        What happens if the ID provider decides to delete your account? With email-based systems, as long as you remember your password (and provided the other service provider doesn’t make you verify your email address on each login), you still have a chance to log in and switch to a different email address.

        With OpenID, OAuth, SAML, etc. you aren’t that lucky. In one go, your ID provider can disable your entire on-line presence.

        1. 3

          But if “identity provider” emerged as a more widespread concept for a service, it could be made illegal for identity providers to delete your account in the way you’ve described. A law could say that an identity provider could decline to continue to provide service, but must provide a method for discontinued users to transfer their identity to a different provider.

          1. 4

            I think this is the solution to OP’s problem statement. It should be illegal to cut off someone’s access to their email account without recourse. For example, I can imagine that if you’re in bad standing, the provider can cut off outgoing email, but still let you log in and receive emails. This type of law would be much better at de-incentivizing digital de-personing than giving your government central power over your email. To say nothing of the privacy implications of that!

          2. 2

            Which law? A US law? For US citizens? Just like US constitutional rights apply only to US citizens, and the rest of the world has exactly zero rights with any US-based company?

            But setting politics aside (because I bet the EU has the same approach, let alone China or Russia), such a law would be practically meaningless for the global community. A law can cover only that country’s companies and that country’s people; it can’t cover other countries’ citizens. Moreover, I think a law can only be applied on that country’s “land”. The internet doesn’t have a geographical position on the map.

            The only body that could perhaps come up with such a law is the UN, but even the human rights charter is full of exceptions like (paraphrasing) “unless it’s against a local law”.

            1. 3

              Well, the GDPR is technically only applicable to legal persons that gather the data of European citizens, but due to difficulty of implementation the rest of us not infrequently also receive GDPR protections. So even one major jurisdiction implementing such a law would have effects on the entire internet.

              Your original post also advocates government action; I don’t understand why you are so averse to it here.

              1. 2

                Your original post also advocates government action; I don’t understand why you are so averse to it here.

                In my article I advocate that each government provide its own citizens, as a public service, with an email forwarding service. That is completely different from governments imposing requirements on private companies by law.

                The differences are mainly the following:

                • because the government provides something to its own citizens (and others that are assimilated to citizens), it can’t (like the US is doing with foreigners) just trample over other people’s rights;
                • because most governments are democratic (at least in theory), and because citizens support them through paying taxes, we could (at least in theory) hold them accountable (especially through voting) if they run amok; (which can’t be said about businesses;)
                • because in most civilized countries there are laws mandating that governments (and other institutions) answer queries (at least in theory; in the US these are FOIA requests), we could (at least in theory) audit what the government is actually doing; (which can’t be said about businesses;)
                • because in the end, if one needs to resort to any legal recourse, it’s more practical to do it in your own country than in some faraway land;
                • because a country can’t just pack up its loot and abandon a particular market;

                My main point is this: our current judicial system seems to be bound to the borders where both the business and the client are; if one or both are from different jurisdictions, things get murky. Also, the government can’t hide behind “freedom of association” or “economic opportunity”.

    7. 9

      This reminds me of my favorite talk I saw at strangeloop 2021: https://thestrangeloop.com/2021/prevent-phishing-and-impersonation-with-trust-loops.html

      The idea behind the talk is that people having “a” (singular) identity is a purely modern invention, and we might benefit from having our identities be a function of our relationships. So my rock climbing gym buddy doesn’t need to know my government identity and whatnot, I am just “Andrew Climbing Gym” in his contacts. The presenter made a chat app based on that idea, where you don’t even need to know someone’s phone number (necessarily a property of a singular identity) to message them; alternative P2P introduction methods are used.

      1. 2

        I agree that sometimes multiple quasi-anonymous identities are better.

        But in the article above I advocate the opposite: very stable, irrevocable email addresses (perhaps with any number of aliases), to be used where the user needs a long-term, stable, irrevocable address, for example with cloud providers (DNS, VPS, etc.) that already know the identity of the person (or at least of the one paying the monthly bill).

        At the moment your email provider (Google, for example) can, at any time and for any reason, just terminate your email account, leaving you unable to authenticate with the rest of your digital life.

        1. 3

          Whilst I think the effort is misguided in its implementation, the W3C’s DID spec is intended to address this very problem.

        2. 2

          I get what you’re saying but your reply is still written in the language of people having one true identity of which others are simply derivations or obfuscations. I do think labeling “things your government and major financial institutions know about you” as “the” identity of a person is missing something important. Me qua “Andrew Climbing Gym” is as much my true identity as “Andrew Helwer, USCIS #123456789”. All these different identities should be separated, not derived from a single point of failure that makes phishing so effective & valuable to criminals.

          1. 1

            I think we, as technologists, always chase the “perfect solution” (be it anonymity, privacy, security, scalability, etc.) and think it will suit everybody, or that everybody is searching for the same features. However, for many people, having a way to recover their on-line presence is perhaps much more important than hiding from Big Brother.

            Let’s look at the current IT consumer landscape: WhatsApp, Facebook, TikTok, Chrome, iOS and Android have clearly won the market, even though they are personal data and tracking black holes. The average consumer doesn’t care…

            As another example, last week I asked the community for a simple text editor that supports GnuPG encryption / decryption out of the box (without too much fuss); the community’s answer: crickets… Either Emacs plugins or complex GnuPG frontends. The end result: nobody uses PGP!

        3. 1

          Just don’t let your government try to do that (depending on your government), or you’ll get a disaster like https://en.wikipedia.org/wiki/De-Mail

          1. 2

            Reading the criticism page on Wikipedia, there are indeed many concerns, from privacy to legal implications.

            However, what I’ve proposed is sort of similar to, and sort of the opposite of, what De-Mail is. Namely I want:

            • the government to provide one with one or multiple irrevocable email addresses, fully interoperable with the current SMTP deployments;
            • the law covering the above shouldn’t span more than this simple requirement: the government provides the user with an irrevocable email address; nothing more; the user is free to use it or not;

            The main advantage here is for regular folks, who very easily get locked out of their on-line accounts because they have forgotten their email password (so account recovery or reset usually doesn’t work), or because the email address they used to sign up 10 years ago no longer works due to various buyouts and mergers.

            (I have family members that have dealt with this, and for some time now I’ve been the “password manager” for my immediate family. If I happen to get hit by a bus, they’ll undoubtedly have to start over in a few years.)

            1. 1

              While I agree with your assessment about how dangerous our reliance on private “identity providers” is, just imagine the catastrophic consequences of a world where your government runs critical IT infrastructure.

              1. Very rarely are government IT services run well (I know there are exceptions but I’d say it’s Pareto-abysmal)
              2. A single point of failure would be an irresistible target for all kinds of criminals. The possibilities for fraud, theft and extortion are endless.

              I believe we need diversity, standards, and legal accountability, and not authoritarian centralization.

              Edit: IT services

              1. 2

                just imagine the catastrophic consequences of a world where your government runs critical IT infrastructure.

                Well, governments already run critical IT infrastructure that they can’t seem to get a handle on (at least in my country, Romania):

                • there is a national health-care information system that often has downtime, making patients queue or revisit the pharmacy or the doctor at a later date;
                • there is the population and identity database (ID-cards, birth-certificates, driving-licenses, passports, etc.); only they know how secure all of this is;
                • there is the fiscal authority (equivalent to the IRS) that has data about all our taxes;
                • there is the public pension system that has data about all contributors;
                • and possibly countless other centralized databases that most likely are wide open…

                So, running an email forwarding service doesn’t pose much more risk than any of the above. And hopefully they would get it running right, as it’s orders of magnitude simpler than any of the other services…

    8. 12

      Very reasonable observations. I find more and more developers, even in the “younger generation”, getting burned on these hyper-specialized build systems and falling back to Make. I think it’s a good thing. Make is clunky but, as the poster notes, it does the job you ask it to do and you know it will continue to do it ten years from now.

      1. 10

        Make also has a bunch of problematic things. The biggest one is that it has no way of expressing a rule that produces more than one output; it also has no way of tracking staleness other than modification times. It also can’t express things that may need more than a single run. You can’t build LaTeX reliably with Make, for example, because it does a single pass and must be rerun until it reaches a fixed point. You often end up with fake files to express things that will run once, such as patching files.
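
        (For the “run once” case, the usual workaround is a stamp file; a minimal sketch, with made-up names:)

        # Stamp file standing in for “the patch has been applied”: patching
        # edits sources in place, so there is no natural output for make to track.
        .patch-applied: fixes.patch
        	patch -p1 < fixes.patch
        	touch .patch-applied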

        The annoying thing is that many of the complex replacements don’t solve these problems.

        1. 8

          GNU make supports rules that produce more than one output. See “Rules with Grouped Targets” on this page.
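
          For example, something like this (grouped targets need GNU make 4.3 or newer):

          # One yacc run produces both files; “&:” declares them a group,
          # so the recipe runs once, not once per target.
          foo.c foo.h &: foo.y
          	yacc -d foo.y
          	mv y.tab.c foo.c
          	mv y.tab.h foo.h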

        2. 6

          I’ve recently started using just, which (as per their docs) “avoids much of make’s complexity and idiosyncrasies”. Based on my limited use it looks like a promising alternative.

          1. 3

            It’s a handy tool but it has a major omission in my opinion: no freshness tracking. It always runs all the commands; it doesn’t track whether a task’s dependencies are up to date and the command can be skipped.
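
            For comparison, a sketch of the check make does for free (made-up names; expensive-transform stands in for any slow command):

            # Skipped entirely when out.txt is already newer than in.txt;
            # a just recipe with the same body runs every single time.
            out.txt: in.txt
            	expensive-transform in.txt > out.txt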

        3. 2

          That’s why there’s latex-mk: a program that simply runs LaTeX the necessary number of times. It is also a large set of helpful Make definitions for TeX files, so you don’t even need to teach it how to build. It knows about all the LaTeX flavours and related tools like pdflatex, tex2page, bibtex etc. The simplest possible latex-mk file is just

          NAME = foo
          include /usr/local/share/latex-mk/latex.gmk
          

          Then running make pdf, make ps etc would build foo.pdf, foo.ps etc from foo.tex, but it can be as complex as you want it to be.

          1. 1

            I use latex-mk, but it also has problems. For example, I was never able to work out how to hook it so that it can run steps to produce files that a TeX file includes if they don’t exist.

            1. 1

              That’s a bit of an odd requirement. What kind of situation requires that? I guess you could run some nasty eval to expand to make targets based on awk or grep output from your LaTeX sources, in GNU Make at least.
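
              Something along these lines, concretely (an untested sketch with made-up file names, assuming the \includegraphics arguments include file extensions):

              # Scrape the included figures out of the LaTeX source and declare them
              # as prerequisites, so make builds missing ones via their own rules.
              FIGS := $(shell grep -o '\\includegraphics{[^}]*}' paper.tex | sed 's/.*{\(.*\)}/\1/')

              paper.pdf: paper.tex $(FIGS)
              	pdflatex paper.tex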

              1. 1

                Basically every LaTeX document I write pulls in things from elsewhere. For example, most of the figures in my books were drawn with OmniGraffle and converted to pdf with a makefile. I want that to be driven from latex-mk so that it can run that rule if I actually include the PDF (and so I don’t have to maintain a build system that drives a build system with limited visibility). For papers, there’s usually a build step that consumes raw data and runs some statistics to produce something that can be included. Again, that ends up being driven from a wrapper build.

        4. 2

          It’s been a long time since I worked on a LaTeX-only codebase requiring multiple compilation passes. I’m spoiled by pandoc + markdown for most of the documents I must write. I’ve heard that pandoc is a competent layer for a (la)tex -> pdf compiler instead of using pdflatex or xelatex or whatever directly. Have you seen pandoc being used in that way, primarily to avoid the multiple compilation pass madness behind pandoc’s abstraction thereof? I’ve also used tectonic a bit for debugging more complex pandoc markdown->tex->pdf builds, and it abstracts away the need for multiple passes.

          1. 4

            I’ve been able to use pandoc to compile markdown books, but I struggled to use it well with TikZ or Beamer. LaTeX just has too many dark corners.

          2. 1

            I use TeX primarily because most academic venues offer only LaTeX or Word templates and it’s the lesser of two evils. If I didn’t have to match an existing style, I’d use SILE.

        5. 1

          The annoying thing is that many of the complex replacements don’t solve these problems.

          I guess build2 would qualify as one of those complex replacements. Let’s see:

          The biggest one is that it has no way of expressing a rule that produces more than one output

          Check: we have a notion of target groups. You can even have groups where the set of members is discovered dynamically.

          also has no way of tracking staleness other than modification times

          Check: a rule is free to use whatever method it sees fit. We also keep track of changes to options, set of inputs, environment variables that affect a tool, etc.

          For example, we have the venerable in rule which keeps track of changes to the variable values that it substitutes in the .in file.

          It also can’t express things that may need more than a single run.

          Hm, I don’t know, this feels like a questionable design choice in a tool, not in a build system. And IMO the sane way to deal with this is to just run the tool a sufficient number of times from a recipe, say, in a loop.

          Let me also throw some more problematic things in make off the top of my head:

          • Everything is a string, no types in the language.

          • No support for scoping/namespaces, everything is global (hurts especially badly in non-recursive setups).

          • Recipe language (shell) is not portable. In particular, it’s unusable on Windows without another “system” (MSYS2, Cygwin, etc).

          • Support for separate source/output directories is a hack (VPATH).

          • No abstraction over file extensions (so you end up with hello$(EXE) all over the place).

          • Pattern rules do not support multiple stems (in build2 we have regex-based pattern rules which are a lot more flexible: https://build2.org/release/0.14.0.xhtml#adhoc-rules).

          1. 1

            Agreed on all of the other criticisms of Make. I’m a bit surprised that build2 can’t handle the dynamic dependency case, since I thought you needed that for your approach to handling C++ modules.

            I’d be interested in whether build2 can reproduce latex-mk’s behaviour. A few interesting things:

            • latex needs rerunning if it still has unresolved cross references, but not if the number doesn’t go down.
            • bibtex needs running if latex complained about a missing bbl file or before running latex if a bib file used by a prior run has changed.

            There are a few more subtleties. Doing the naive thing of always running latex bibtex latex latex takes build times from mildly annoying to an impediment to productive work, so is not an acceptable option. Latex-mk exists, so there’s no real need for build2 to be able to do this (though being able to build my experiments, the thing that analyses the result, the graphs, and the final paper from a single build system would be nice), but there are a lot of things such as caching and generated dependencies that can introduce this kind of pattern and I thought it was something build2 was designed to support.

            1. 1

              I’m a bit surprised that build2 can’t handle the dynamic dependency case, since I thought you needed that for your approach to handling C++ modules.

              It can’t? I thought I’d implemented that. And we do support C++20 modules somehow (at least with GCC). Unless we are talking about different “dynamic dependencies”. Here is what I am referring to: https://build2.org/release/0.15.0.xhtml#dyndep

              Doing the naive thing of always running latex bibtex latex latex […]

              I am probably missing something here, but why can’t all this be handled within a single recipe or a few cooperating recipes, something along these lines:

              latex
              if (latex_complained_about_missing_bbl)
                bibtex
                latex
              end
              while (number_of_unresolved_cross_references_is_not_zero_and_keeps_decreasing)
                latex
              end
              
      2. 3

        Am a big fan of Make. It is clunky and hard to debug, but it sits just at the right level of abstraction. I’ve seen more and more posts of people realizing it’s useful beyond the original use case of compiling C. There is room, I think, for a successor that addresses its flaws (see David’s comment) and expands to cover modern use cases (distribution, reproducibility, scheduling, orchestration). The challenge is in finding a compact set of primitives to support that and keep it simple, i.e. not Bazel.

        1. 5

          Ninja, maybe? That’s my hope at least. I like its approach of “do exactly what it’s told, and use a higher level tool to tell it what to do”.

        2. 5

          You may want to read Build Systems à la Carte or watch the talk about it by Simon Peyton Jones (audio warning: it’s quite bad). Shake seems to be what you’re looking for; unfortunately you have to write the Shakefile in Haskell and have GHC installed, which can be a bit steep as a requirement.

          Circling back to JS, I had half an idea to use the Shake model described in that paper to implement it in JS, so I could replace Jake, which is a good tool but shares many of the problems that Make has.

        3. 3

          remake has made a huge difference for me, in terms of making Makefiles far more debuggable.

          1. 1

            Oh, remake sounds amazing, it was not on my radar, thanks!

    9. 3

      Ever since coming across this paper/tutorial many years ago, I’ve been on the lookout for more materials from Abdulaziz Ghuloum, but sadly it seems he has had no online footprint after the incremental compiler stuff was released.

      1. 6

        I think he moved away and started a restaurant, last I looked.

      2. 2

        Jeremy Siek has a course[1] and a book[2] to go with it, which seem to build on Ghuloum’s work, updated to generate code for x86_64 as well.

        [1] https://iucompilercourse.github.io/IU-Fall-2022/ [2] https://github.com/IUCompilerCourse/Essentials-of-Compilation

    10. 24

      “Why does Thunderbird look so old, and why does it take so long to change?”

      If that actually does bother anyone, can’t they just use a different email client? Wouldn’t it be better to keep Thunderbird working with its existing UI, for those of us that have gotten used to it over the past 20 years?

      1. 16

        I think Thunderbird has lost its priorities.

        No one actually cares how modern an email client looks. No one is wearing their email client on their head as a fashion statement. No one is reading their email in a closet because they can’t stand the idea that someone would peek over their shoulder and see them using UI from 2004.

        Thunderbird needs some reorganization and usability updates but modernizing the design is not one of them.

        Having worked at plenty of software companies, I find rewrites are often just large feature requests cobbled together under the guise of rewriting the base of the application, as the only way to achieve the cluster of new features. Is the old software really that bad, or has it grown in a disorganized way, and/or do you just not understand it?

        1. 23

          I dunno, I won’t use a mail app that looks weird and old. I’d consider using Thunderbird if it looked good and worked better than Mail.app.

          1. 3

            real talk: Thunderbird looks better than it did a couple years back! I booted into it for the first time in 4 or so years and was like “oh this is pretty good”

            Granted, I’m in KDE and before I was using it on Mac. But I feel like it’s pretty good for “email power usage”.

            There’s a legit complaint about how the panes are laid out by default, but I think that you can get pretty far by just moving around existing components.

        2. 12

          UI and UI conventions for email have been pretty continuously evolving since we first got GUI mail clients. And that’s without considering that UI conventions in general evolve over time; an application that doesn’t match the UI conventions of the surrounding system is likely to see itself replaced by one that does.

          Which is why keeping the same UI for decades on end is not something that usually works in mass-market software.

          1. 14

            I can confidently say I’ve never stopped using a useful piece of software because they hadn’t updated their UI to keep up with whatever fashion trend is currently hip. On the other hand, I have (repeatedly) stopped using software after a UI overhaul screwed up my existing comfort with the software, opening the door for me to look around and see what other software, while equally alien, might have better functionality.

            Messing with the look and feel of your application is an exercise in wagering your existing users in pursuit of new ones. In commercial projects, that can often make sense. In FLOSS, it rarely does, as your existing users, particularly those that will be the most annoyed by changes to their workflows, are also the most likely to contribute to the project, either through thorough bug reports or pull requests.

            1. 16

              I think it is important to consider the possibility that you are not representative of the mass-market software user. Or, more bluntly: if you were representative of the mass-market software user, the market would behave very differently than what we observe.

              1. 14

                I dunno, I don’t think Thunderbird users in general are representative of the mass-market software user. Most of those are just using Gmail or Outlook.com. Desktop MUA users are few and far between these days and use them specifically because they have specialized mail management requirements.

              2. 6

                I don’t see where APhilisophicalCat claims to be “representative of the mass-market software user”. I rather would interpret their “In commercial projects, that can often make sense. In FLOSS, it rarely does” as disagreeing with you on whether Thunderbird is “mass-market software”. (I don’t use Thunderbird and I claim no stake in this.)

                1. 2

                  I think the people building Thunderbird think of it as, or at least aspire to it being, a mass-market application.

                  (standard disclaimer: I am a former employee of the Mozilla Corporation; I have not worked there since the mid-2010s; when I did work there the stuff I worked on wasn’t related to Thunderbird; I am expressing my own personal opinion based on reading the public communications of the Thunderbird team rather than any sort of lingering insider knowledge or perspective)

          2. 1

            I don’t think things evolved, I think we just have a bunch of ways to stylize a list and where buttons to other lists might go. There are trends but at the end of the day you’re reading a list of things.

            The point I was trying to make is that sometimes rewrites are just shifting complexity and that you can satisfy both crowds by working on an old tech stack. Not that there isn’t a market for whatever-UI-trend email apps.

            1. 4

              I don’t think things evolved, I think we just have a bunch of ways to stylize a list and where buttons to other lists might go.

              I remember when Gmail first came out, and introduced a broad audience to the switch from email clients that present things in terms of individual messages, to clients that present things in terms of conversations.

              From an underlying technical perspective this may feel like nothing – just two ways of styling lists, why does it matter? – but from a user interface and experience perspective it was a gigantic shift. It’s rare these days to see an email client or interface that still clings to the pre-Gmail message-based UI conventions.

              The same is true for other UI conventions like labeling/tagging, “snooze” functions, etc. etc. Sure, in some sense your inbox is still a list or at most a list of trees, but there are many different ways to present such a thing, and which way you choose to present does in fact matter to end users.

              1. 1

                Exactly and there isn’t one crowd; you should aim to appease both. Even if Gmail started a new trend.

        3. 6

          I think a lot of people don’t like using “ugly” software. Definitely matters more to nontechies than it does to techies, I think.

          1. 3

            Even techies will look elsewhere if the app gives off the vibe of something from the Windows 2000 era, and thus probably has other problems too (say, scaling).

            But current Thunderbird on the desktop looks fine?!

            1. 4

              There’s a certain group (small?) that likes these old interfaces though. Enough that things like SerenityOS are quite popular. Or maybe it’s just me.

              1. 3

                I can appreciate a very technical but functional UI where you can find everything you need, even if it doesn’t look that fresh. And then there is also the “jankiness” factor, like Handbrake, which looks very janky but exposes all of ffmpeg’s configuration flags as a UI. In my experience there is a big divide between applications that just look old but provide a lot of functionality, even looking janky at first, and apps that simply got thrown together in a short time and weren’t updated to keep up with modern requirements. One example of the latter: on F-Droid, a very old-looking app can be a good indicator that it has never been updated to modern Android permission requirements. Or your desktop application just doesn’t support scaling and is either tiny on high-DPI or blurry; god forbid you move it between different DPI displays.

              2. 1

                Yup, that’s why Sylpheed/Claws were considered examples of a great UI for desktop email: https://www.claws-mail.org/screenshots.php

          2. 2

            Techies are just as, if not more, aesthetically conscious when it comes to software than non-techies. They just have different aesthetic preferences.

        4. 4

          I agree with the above. I’d much prefer it if this announcement were about fundamental things related to the internals of Thunderbird, not the chrome:

          • Random slowness due to background processes
          • Weird display issues when loading email folders which haven’t been opened for a while
          • No ability to import/export to other email formats
          • Search and indexing – serious improvements here would be very welcome

          I say this as a long-time Thunderbird user who loves the app and hopes to continue using it (my Thunderbird profile is 11 years old). Don’t fix what’s not broken, but do fix what is.

      2. 7

        If that actually does bother anyone, can’t they just use a different email client?

        Name one other usable, open-source, mail client that runs on Windows.

        “Looking old” covers a lot of things. There are a bunch of things in Thunderbird that are annoying and probably hard to fix without some significant redesign. Off the top of my head:

        • it sometimes gets confused by resolution changes (when you plug in an external monitor or use Remote Desktop) and you end up with a window that’s too small to resize and have to restart the app.
        • It does a load of blocking things on the main thread, which can cause the UI to freeze in exciting ways.
        • It has modal dialogs for a bunch of things, which freeze other parts of the UI.
        • The settings interface is hard to find and hard to navigate: different settings are split between different items in different top-level menus.
        • The way the text widget is built makes it impossible to switch between plain and rich text compose if you change your mind part way through writing.

        It’s also not clear to me how well the HTML email renderer view is sandboxed. Apple’s Mail.app opens emails in separate sandboxed renderer processes; I’m not sure to what extent Thunderbird can take advantage of this underlying functionality from Firefox, because it’s designed to isolate tabs and all of Thunderbird’s data lives in one tab.

        1. 1

          Sylpheed? But now we’re talking about super-esoteric software. I can’t imagine the user who uses Sylpheed and thinks “Thunderbird isn’t usable enough”.

      3. 3

        Familiarity is a usability feature. And I’m sure most UX people are aware of this on some level, but it’s rare to see it articulated.

    11. 3

      Nice exposition of the 100r philosophy. I know Devine is on here but I don’t know his handle: Devine, have you taken a look at mu?

      1. 5

        Yes, I use the Mastodon instance they admin and we have lots of discussion/debate/feedback.

      2. 4

        They’re @neauoire.

    12. 19

      I’ve not written anything before, sorry if the typing style is annoying. I just hope it’s clear enough to get what I feel the point of Forth is across.

      1. 13

        This is my favorite type of lobsters post – where a programmer describes a problem and then shows the reader how they solve it. It does not matter one bit if the style is a bit rambly. If anything, it makes it more fun to read. I loved this write-up and I’m looking forward to more materials on forth from you.

      2. 4

        I enjoyed the style and your sense of humor. And I hope you got some good sleep.

    13. 6

      For something not emulating a tty, http://man.9front.org/1/rio under section “Text windows”.

      1. 6

        tt[0] (2/3 of a tty) is a small, neat implementation of the ideas found in 9term.

        [0] https://github.com/leahneukirchen/tt

      2. 5

        Heh. Came here to say “try Plan 9 for a decades-old implementation of something better than TTY emulation” and found you had beaten me to it :)

        1. 2

          Yeah, that was my instinct, too. You can just simplify and get all that stuff for free, or complexify and attempt to recreate a lot of functionality on your own. For what it’s worth, emacs’ M-x shell with some tweaks can work very similarly to an acme win window. And you get all the cut/paste/edit/hover functionality you have in emacs.

    14. 7

      I could not follow why all this is bad. Does it make it impossible to install Linux on any computer with this chip?

      There were statements in the article that made it sound to me like the author considered it a problem that the chip made it harder to pirate copyrighted works, i.e. stealing content from creators, which didn’t sound right to me.

      1. 6

        According to an MS employee commenting on this (@david_chisnall), it’s a certification requirement from MS to support alternative root certs, which allows Linux distros to be installed and booted with a full secure boot chain.

        I found the article to be fairly unclear on why Pluton is bad, as well; the only “bad” thing is that it theoretically will make it easier to prevent software piracy and cheating in online games. Which doesn’t seem… bad, to me?

      2. 4

        Yup there were a number of red flags for me as well.

        I’ll believe that Linux really is impossible on these chips when we see systems in the wild in the hands of skilled hackers :)

        1. 4

          As I understand it, Linux on Pluton PCs will be about as available as, say, LineageOS on Android devices. If the vendor doesn’t allow unlocking the bootloader, you’re probably out of luck.

      3. 7

        “Pirating copyrighted works” is an unavoidable side effect of general-purpose computing. I’d rather not throw out general-purpose computing in order to appease Disney and Time-Warner.

        1. -2

          While it may sound easier on the conscience to think that this is a Robin Hood (the English myth, not the trading platform) kind of situation, at its foundation it isn’t. It’s indie authors getting their books stolen and actors getting lower revenue because of lost viewership. It’s singers not getting royalties.

          If you are against big publishing/producing houses making money, support Indie artists. Making it easier to steal their works is not a solution.

          1. 12

            It’s singers not getting royalties.

            The biggest thieves of royalties are arguably the music industry.

            Making it easier to steal their works is not a solution.

            You can’t “steal” a work. You can create and distribute reproductions without permission or attribution and fail to pay royalties, but my possession of a song doesn’t exclude your access to it. The “theft” framework is a meme perpetuated almost entirely to the benefit of publishers and rights-holders (not artists!) in order to legitimize incredibly shitty and abusive behavior.

            support Indie artists

            I do, and I even pay for my tracks! But a lot of those artists I’m only aware of due to running across their music under circumstances where perhaps the licensing wasn’t as well audited as it could be.

            1. 2

              I have bought more indie music on Bandcamp in the last two years than I bought any music in the previous twenty. But a lot of that is because they offer DRM-free tracks in your choice of formats, generally at a very reasonable price. That’s not something you can say of, for example, the movie industry.

            2. 2

              I’ve known quite a few musicians and they had huge pirate music collections.

          2. 6

            I don’t see how locking down computers to such a level helps artists, especially independent ones, and as a consumer I hate buying DRMed content and avoid it where I can.

    15. 18

      Agreed on everything but Copilot. The freedom to study how the software works is a fundamental attribute of free software. Learning is not covered by the GPL’s requirements. Copilot sometimes copy-pastes code (honestly, who doesn’t) but broadly it learns. This is entirely in keeping with open source.

      If we’re gonna set a standard that you can’t use things you learnt from software under an open-source license when writing commercial software, we might as well shutter either the entire software industry or the entire open-source movement, because literally everybody does that. It’s how brains work!

      And of course, it’s not like being off Github is gonna prevent MS from feeding your project into Copilot 2.

      1. 65

        Copilot does not learn.

        Like all of these neural network “AIs”, it’s just a pattern recognition system that launders the work of many humans into a new form, which the corporation can profit from but the humans cannot. It’s piracy for entities rich enough to train and operate such an AI, and unethical enough to acquire the training data, but you or I would still be punished for pirating from the corporation. Whether or not it is legal is irrelevant to me (I’m in favor of abolishing copyright), but we must recognize the increasing power imbalance between individuals and corporations such “AI” represents.

        Copilot understands nothing of what it writes. It learns nothing and knows nothing. It is not sentient or alive, no matter how tempting it is to anthropomorphize it.

        1. 16

          I think “pattern recognition system that launders the work of many humans into a new form” is just a rude way to phrase “learning.”

          Define “understands.” Define “knows.” I think transformers derive tiered abstract patterns from input that they can generalize and apply to new situations. That’s what learning is to me.

          1. 21

            The standard philosophical definition of knowledge is justified true belief. Copilot and other AIs make the belief part problematic, so bracket that. But they don’t justify things well at all. Justification is a social process of showing why something is true. The AIs sound like total bullshit artists when asked to justify anything. I don’t think Copilot “knows” things any more than a dictionary does yet.
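
            (Schematically, with K/B/J standing for “knows”/“believes”/“is justified in believing”, the tripartite analysis reads something like

            K_S(p) \iff p \land B_S(p) \land J_S(p)

            i.e. S knows that p iff p is true, S believes p, and S’s belief in p is justified.)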

            1. 2

              Putting aside Gettier cases, that’s not what I understand “justified” to mean. You just need to have a reason for holding the belief. With AI, reinforcement learning is the justification.

              The point of “justified belief” is just that it’s not knowledge if you just guess that it’s raining outside, even if it is in fact raining.

              1. 9

                The definition that @carlmjohnson is quoting is Plato’s, and ever since Plato put it forth, knowledge theorists have been bickering about what “justified” means. The history of ideas after the age of Boethius or so isn’t quite my strong point, so I’ll leave that part to someone else, but FWIW most classical definitions of justification either don’t readily apply to reinforcement learning, or, if they do, it fails them quite badly.

                That being said, if you want to go forth with that definition, it’s very hard to frame a statistical model’s output as belief in the first place, whether justified or not. Even for the simplest kinds of statistical models (classification problems with binary yes/no output) it’s not at all clear how to formulate what belief the model possesses. For example, it’s trivial to train a model to recognize whether a given text is an Ancient Greek play or not. But when you feed it a piece of text, the question that the model is “pondering” isn’t “Is this an Ancient Greek play?”, but “Should I say yes?”, just like any other classification model. If subjected to the right laws and statements, a model that predicts whether a statement would cause someone to be held in contempt of court might also end up telling you whether a given text is an Ancient Greek play with reasonable accuracy, too. “Is this an Ancient Greek play?” and “Is this statement in contempt of court?” are not equivalent questions, but the model will happily appear to answer both with considerable accuracy.

                The model is making an inference about the content (“This content is of the kind I say yes to / the kind I say no to”), but because the two kinds cannot be associated with a distinct piece of information about the subject being fed to the model, I don’t think it can be said to constitute a belief. It’s not a statement that something is the case, because it’s not clear what it asserts to be the case or not: there are infinitely many different classification problems that a model might turn out to solve satisfactorily.

                1. 4

                  In Greek, “justified” was some variation on “logos”: an account. Obviously everyone and their Buridan’s ass has a pet theory of justification, but I think it’s fair to interpret Plato’s mooted definition (it’s rejected in the dialogue IIRC!) as being “the ability to give an account of why the belief is true”. This is the ability which Socrates finds that everyone lacks, and why he says he knows that he knows nothing.

                  1. 7

                    Ugh, it’s really tricky. This comes up in two dialogues: Theaetetus, where knowledge gets defined as “true judgement with an account” (which IIRC is the logos part), and where that definition is plainly rejected in the end. The other one is Meno, where it’s discussed in terms of the difference between true belief and knowledge, but the matter is not definitively resolved.

                    I was definitely wrong to say it was Plato’s – I think I edited my comment which initially said “is effectively Plato’s” because I thought it was too wordy but I was 100% wrong to do it, as Plato doesn’t actually use this formulation anywhere (although his position, or rather a position that can be inferred from the dialogues, is frequently summarized in these terms). (Edit: FWIW this is a super frequent problem with modern people talking about ancient sources and one of the ways you can probably tell I’m an amateur :P)

                    I think it’s fair to interpret Plato’s mooted definition (it’s rejected in the dialogue IIRC!) as being “the ability to give an account of why the belief is true”.

                    You may know of this already, but just in case your familiarity with modern philosophy is as spotty as mine, only with holes in different places, and if you’re super curious and patient, you’re going to find Gettier’s “Is Justified True Belief Knowledge?” truly fascinating. It’s a landmark paper that formalizes a whole lot of objections to this, some of them formulated as early as the 15th century or so.

                    The counter-examples Gettier comes up with are better from a formal standpoint, but Russell famously formulated one that’s really straightforward.

                    Suppose I’m looking at a clock which shows it’s two o’clock, so I believe it’s two o’clock. And it really is two o’clock, so it appears that I possess a belief that is both justified (I just looked at the clock!) and true (it really is two o’clock). I can make a bunch of deductions that are going to be true, too: for example, if I were to think that thirty minutes from now it’s going to be half past two, I’d be right. But, though I haven’t realized it, that clock in fact stopped working yesterday at two. (Bear with me, we’re talking about clocks from Russell’s age.) My belief is justified, and it’s true, but only by accident: what I have is not knowledge, but sheer luck. I could’ve looked at the clock at half past two and held the same justified belief, but it would’ve been false, suggesting that an external factor may also be involved in whether a belief is true or not, justified or not, and thus knowledge or not, besides the inherent truth and justification of a statement.

                    1. 7

                      The counter-examples Gettier comes up with are better from a formal standpoint, but Russell famously formulated one that’s really straightforward.

                      I love collecting “realistic” Gettier problems:

                      • You’re asked a question and presented with a multiple choice answer. You can rule out 3 of the answers by metagaming (one is two orders of magnitude different from the others, etc)
                      • I give you 100 reasons why I believe X. You examine the first 30 of them and they’re all completely nonsensical. In fact, every argument but #41 is garbage. Argument #41 is irrefutable.
                      • I believe “Poets commit suicide more often than the general population”, because several places say they commit suicide at 30x the rate. This claim turns out to be bunk, and a later investigation finds it’s more like 1.1x.
                      • I encounter a bug and know, from all my past experience with bugs like it, that it’s probably reason X. I have not actually looked at the code, and don’t even know what language it’s programmed in – which turns out to be one notable for not having X-type bugs. The developers were doing something extremely weird that subverted that guarantee, though, and it is in fact X.
                      • I find an empirical study convincingly showing X. The data turns out to have been completely faked. This is discovered by an unrelated team of experts who then publish an empirical study convincingly showing X’, which is an even stronger claim than X.
                      1. 3

                        My favourite ones come from debugging, that’s actually what got me into this in the first place (along with my Microwave Systems prof stubbornly insisting that you should know these things, even if engineers frown upon it, but that’s a whole other story):

                        • While debugging an Ethernet adapter’s driver, I am pinging another machine and watching the RX packet count of an interface go up, so I believe packets are being received on that interface, and the number of packets received on my machine matches the number of packets that the other machine is sending to it. Packets are indeed being received on the interface. I made a stupid copy-paste error in the code: I’m reading from the TX count register and reporting that as the RX count. It only shows the correct value because sending a ping packet generates a single response packet, so the two counts happen to match.
                        • An RTOS’ task overflows its stack (this was proprietary, it’s complicated) and bumps into another task’s stack, corrupting it. I infer the system crashes because of the stack corruption. Indeed, I can see task A bumping into task B’s stack, then A yields to B, and B eventually jumps to whatever garbage is on the stack, thus indeed crashing the system. There’s actually a bug in the process manager which causes the task table to become corrupted: A does overflow its stack, but B’s stack is not located where A is overflowing. When A yields to B, the context is incorrectly restored, and B looks for its stack somewhere other than where it actually is, loading the stack pointer with an incorrect value. It just so happens that, because B is usually started before A, the bug is usually triggered by B yielding to A, but A just sits in a loop and toggles a couple of flags, so it’s never doing anything with the stack and never crashes, even though its stack does eventually get corrupted, too.

                        I got a few other ones but it’s really late here and I’m not sure I’m quite coherent by now :-D.

                    2. 2

                      I’m familiar with Gettier cases. I never dove very deep into the literature. It always struck me that a justification is not just a verbal formulation but needs some causal connection to the fact of the matter: a working clock causes my reasoning to be correct but a stopped clock has no causal power etc. I’m sure someone has already worked out something like this and brought out the objections etc etc but it seems like a prima facie fix to me.

                2. 1

                  Yes, IMO the belief is “how may this text continue?” However, efficiently answering this question requires implicit background knowledge. In a similar sense, our brains may be said to only have information about “what perpetuates our existence” or “what makes us feel good.” At most we can be said to have knowledge of the electric potentials applied to our nerves, as Plato also made hay of. However, as with language models, a model of the unseen world arises as a side effect of the compression of sensory data.

                  Actually, language models are fascinating to me because they’re a second-order learner. Their model is entirely based on hearsay; GPT-3 is a pure gossip. My hope for the singularity is that language models will be feasible to make safe because they’ll unavoidably pick up the human ontology by imitation.

                  1. 5

                    Yes, IMO the belief is “how may this text continue?”

                    That’s a question, not a belief – I assume you meant “This text may continue ”. This has the same problem: that’s a belief that you are projecting onto the model, not necessarily one that the model formulates. Reasoning by analogy is an attractive shortcut but it’s an uneasy one – we got gravity wrong because of it for almost two thousand years. Lots of things “may be said” about our brains, but not all of them are true, and not all of them apply to language models.

                    1. 1

                      Sure, but by that metric everything that anyone has ever said is a belief that person is projecting. I think that language models match the pattern of having a belief, as I understand it.

                      a belief that you are projecting onto the model, not necessarily one that the model formulates

                      You’re mixing up meta-levels here: I believe that the model believes things. I’m not saying that we should believe that the model believes things because the model believes that; rather, (from my perspective) we should believe it because it’s true.

                      In other words, if I model the learning process of a language model, the model in my head of the process fits the categories of “belief” and “learning”.

                      1. 6

                        I think that language models match the pattern of having a belief, as I understand it.

                        Observing that a model follows the pattern of a behaviour is not the same as observing that behaviour though. For example, Jupiter’s motion matches the pattern of orbiting a fixed Earth on an epicycle, but both are in fact orbiting the Sun.

                        FWIW, this is an even weaker assumption than the one I’m making above – it’s not that no statements are made, only that we observe something akin to statements being made. I’m specifically arguing that the statements the model appears to make (whether “it” makes them or not) are not particular enough to discriminate any information that the model holds about the world outside of itself and, thus, do not qualify as beliefs.

                        1. 1

                          If the world had a different state, the model would have different beliefs - because the dataset would contain different content.

                          Also, Jupiter is in fact orbiting a fixed Earth on an epicycle. There is nothing that inherently makes that view less true than the orbiting-the-sun view. But I don’t see how that relates at all.

              2. 3

                The problem is that the training objective pushes the model toward reproducing the data distribution it was trained on. It’s completely orthogonal to truth about reality, in exactly the same way as guessing the state of the weather without evidence.

                1. 3

                  The data is sampled from reality… I’m not sure what you think evidence is, that training data does not satisfy.

                  It’s exactly the same as guessing the weather from a photo of the outside, after having been trained on photo/weather pairs.

                  1. 9

                    The data for language models in general is sampled from strings collected from websites, which includes true statements but also fiction, conspiracy theories, poetry, and just language in general. “Do you really think people would get on the Internet and tell lies” is one of the oldest jokes around for a reason.

                    You can ask GPT-3 what the weather is outside, and it’ll give you an answer that is structured like a real answer would be, but has no relation to the actual weather outside your location or whatever data centers collectively host the darned thing. It looks like a valid answer, but there’s no reason to believe it is one, and it’s dangerous to infer that anything like training on photo/weather pairs is happening when nobody built that into the actual model at hand.

                    Copilot in particular is no better - it’s more focused on code specifically, but the fact that someone wrote code does not mean that code is a correct or good solution. All Copilot can say is that it’s structured in a way that resembles other structures it’s seen before. That’s not knowledge of the underlying semantics. It’s useful and it’s an impressive technical achievement - but it’s not knowledge. Any knowledge involved is something the reader brings to the table, not the machine.

                    1. 2

                      Oh, I’ll readily agree that Copilot probably generates “typical code” rather than “correct code.” Though if it’s like GPT-3, you might be able to prompt it to write correct code. That might be another interesting avenue for study.

                      “However, this code has a bug! If you look at line”…

                      1. 2

                        I’ve experimented with this a bit and found it quite pronounced – if you feed Copilot code written in an awkward style (comments like “set x to 1”, badly named variables), you will get code that reflects that style.

          2. 20

            IMHO it’s perilous and not quite fair to decide what a machine should and shouldn’t be allowed to do by semantic convention. “Machine learning” was one uninspired grant writer away from going down in history as, say, “statistically-driven autonomous process inference and replication”, and we likely wouldn’t have had this discussion because anything that replicates code is radioactive for legal teams.

            Copilot is basically Uber for copy-pasting from Stack Overflow. It’s in a legally gray area because the legal status of deriving works via statistical models is unclear, not because Microsoft managed to finally settle the question of what constitutes learning after all. And it’s probably on the more favourable side of gray shades because it’s a hot tech topic so it generates a lot of lobbying money for companies that can afford lawyers who can make sure it stays legally defensible until the next hot tech topic comes up.

            Also, frankly, I think the question of whether what Copilot does constitutes learning or not is largely irrelevant, and that the question of whether Copilot-ing one’s code should be allowed is primarily rooted in entitlement. Github is Microsoft’s platform so, yes, obviously, they’re going to do whatever they can get away with on it, including things that may turn out to be illegal, or things that are illegal but will be deemed legal by a corrupt judge, or whatever. If you don’t want $evil_megacorp to do things with your code, why on Earth was your code anywhere near $evil_megacorp’s machines in the first place?

            This cannot be a surprise to anyone who’s been in this field for more than a couple of years. Until a court rules otherwise, “fair” is whatever the people running a proprietary platform decide is fair. If anyone actually thought Github was about building a community and helping people do great things together or whatever their mission statement is these days, you guys, I have a bridge in Manhattan, I’m selling it super cheap, the view is amazing, it’s just what you need to take your mind off this Copilot kerfuffle, drop me a line if you wanna buy it.

            (Much later edit: I know Microsoft is a hot topic in FOSS circles so just to be clear, lemme just say that I use Github and have zero problem with Copilot introducing the bugs that I wrote in other people’s programs :-D).

            1. 1

              If machine learning was called “data replication”, it would be misnamed. And if it was called “pattern inference”, it would just be a synonym for learning… I wouldn’t care about Codex if I thought it was just a copypaste engine. I don’t think it is, though. Does it occasionally copypaste? Sure, but sometimes it doesn’t, and those are the interesting cases for me.

              I don’t think this at all comes down to Github being Microsoft’s platform so much as Github being the biggest repo in one place.

              I’m not at all defending Microsoft for the sake of Microsoft here, mind. I hate Microsoft and hope they die. I just think this attack does not hold water.

              1. 9

                If machine learning was called “data replication”, it would be misnamed.

                I beg to differ! Machine learning is a misnomer for statistically-driven autonomous process inference and replication, not the other way ’round!

                I’m obviously kidding, but what I want to illustrate is that you shouldn’t apply classical meaning to an extrapolated term. A firewall is neither a wall nor is it made of fire, and fire protection norms don’t apply to it. Similarly, just because it’s called machine learning doesn’t mean you should treat it as human learning and apply the same norms.

                1. 2

                  I don’t think machine learning learns because it’s called machine learning, I think it learns because pattern extraction is what I think learning is.

                  1. 7

                    I realize that. I want to underline that, while machine learning may be superficially analogous to human learning, just like a firewall is superficially analogous to a wall made of fire, it does not mean that it should be treated the same as human learning in all regards.

                    1. 2

                      I don’t think it should be treated the same as human learning in all regards either. I think it’s similar to human learning in some ways and dissimilar in others, and the similarities are enough to call it “learning”.

      2. 14

        Do you think Microsoft would be okay with someone training an AI on the leaked Windows source code and using it to develop an operating system or a Windows emulator?

        1. 5

          You don’t even have the right to read that. That said, I think it should be legal.

          1. 14

            I’m not asking whether it should be legal, but whether Microsoft would be happy about it. If not, it’s hypocritical of them to make Copilot.

            1. 7

              Oh by no means will I argue that Microsoft are not hypocritical. I think it’s morally valid though, and whether Microsoft reciprocates shouldn’t enter into it.

          2. 4

            Bit of a niggle, but it depends on the jurisdiction, really. Believe it or not, there exist jurisdictions where the Berne Convention is not recognized and as such it is perfectly legal to read it.

      3. 14

        I’d personally relicense all my code to a license that specifically prohibits it from being used as input for a machine-learning system.

        This is specifically regarding text and images, but the principle applies.

        https://gerikson.com/m/2022/06/index.html#2022-06-25_saturday_01

        “It would violate Freedom Zero!” I don’t care. Machines aren’t humans.

        1. 15

          Machines aren’t humans.

          Exactly this. I think anthropomorphising abstract math executed in silicon is a trap for our emotional and ethical “senses”. We cannot fall for it. Machines and algorithms aren’t humans, aren’t even alive in any sense of the word, and this must inform our attitudes.

          1. 1

            Machines aren’t humans. That’s fine, but irrelevant.

            Machines aren’t alive. Correct, but irrelevant.

            If the rule doesn’t continue to make sense when we finally have general AI or meet sapient aliens, it’s not a good rule.

            That said, we certainly don’t have any human-equivalent or gorilla-equivalent machine intelligences now. We only have fuzzy ideas about how meat brains think, and we only have fuzzy ideas about how transformers match input to output, but there’s no particular reason to consider them equivalent. Maybe in 5 or 10 or 50 years.

            1. 2

              If the rule doesn’t continue to make sense when we finally have general AI or meet sapient aliens, it’s not a good rule.

              Won’t happen. If it does happen, we all die very soon afterwards.

              I think the rule is good. We could come up with a different rule: oxygen in the atmosphere is a good thing. If we reach general AI or meet sapient aliens, they might disagree. Does that mean the rule was bad all along? I feel similar about anthropomorphising machines. It’s not in our ecological interest to do so.

        2. 5

          Source distribution is like the only thing that’s not covered by Freedom Zero so you’re good there 🤷🏻‍♀️

          Arguably the GPL and the AGPL implicitly prohibit feeding it to Copilot.

          (I personally don’t mind my stuff being used in copilot so don’t shoot the messenger on that.

          (I don’t mind opposition to copilot either, it sucks. Just, uh, don’t tag me.))

          1. 1

            Do we have a lawyer’s take here, because I’d be very interested.

            1. 4

              It’s the position of the Software Freedom Conservancy according to their web page. 🤷🏻‍♀️ It hasn’t been tried in court.

        3. 1

          I’m on board; however, I would, at least personally, make an exception if the machine-learned tool and its data/neural net were free, libre, and open source too. Of course, the derivative work also needs to not violate the licenses.

      4. 10

        Learning is not covered by the GPL’s requirements.

        For most intents and purposes, licences legally cover it as “creation of derived works”, otherwise why would “clean room design” ever exist. Just take a peek at the decompiled sources, you’re only learning after all.

        1. 5

          I think this depends on the level of abstraction. There’s a difference in abstraction between learning and copying - otherwise, clean room design would itself be a derivative work.

          1. 12

            I don’t understand what you mean. Clean-room implementation requires not having looked at the source of the thing you’re re-implementing. If you read the source code of a piece of software to learn, then come up with an independent implementation yourself, you haven’t done a clean-room implementation.

            1. 3

              Cleanroom design requires having read documentation of the thing you are reimplementing. So some part of the sequence read -> document -> reimplement has to break the chain of derivation. At any rate, my contention is that training a neural network to learn a concept is not fundamentally different from getting a human to document a leaked source code. You’re going from literal code to abstract knowledge back down to literal code.

              Would it really change your mind if OpenAI trained a second AI on the first AI in-between?

              1. 4

                At any rate, my contention is that training a neural network to learn a concept is not fundamentally different from getting a human to document a leaked source code.

                I think it’s quite different in the sense that someone reading the code’s purpose may come up with an entirely different algorithm to do the same thing. This AI won’t be capable of that – it is only capable of producing derivations. Sure, it may mix and match from different sources, but that’s not exactly the same as coming up with a novel approach. For example, unless there’s something like it in the source you feed it, I doubt the “AI” would be able to come up with something like Quake III’s fast inverse square root.

                1. 2

                  You can in theory get Codex to generate a comment from code, and then code from the comment. So this sort of process is entirely possible with it.

                  It might be an interesting study to see how often it picks the same algorithm given the same comment.

              2. 2

                In copyright law, we have usually distinguished between an interface and an implementation. The difference there is always gonna be fuzzy, because law usually is. But with AI approaches, there’s no step which distinguishes the interface from the implementation.

        2. 3

          One problem here is the same sort of thing that came up in the Oracle/Google case — what do you do with things that have one “obvious” way to do them? If I’m the first person to write an implementation of one of those “obvious” functions in a given language, does my choice of license on that code then restrict everyone else who ever writes in that language?

          And a lot (though of course not all) of the verbatim-copying examples that people have pointed out from Copilot have been on standard/obvious/boilerplate type code. It’s not clear to me that licensing ought to be able to restrict those sorts of things, though the law is murky enough that I could see well-paid attorneys arguing it either way.

    16. 10

      Excellent post, I strongly agree with this. One of the reasons I enjoy Elm is its radical simplicity (and, relatedly, its stability). I’ve been using Elixir more recently, and there’s a clear overhead to learning it and using it in comparison.

      Some of its complexity seems particularly unnecessary – e.g. there shouldn’t really be special syntax for calling anonymous functions, and it’s not great to have different ways to access fields in structs vs maps, and so on (see the sketch below). Possibly this highlights the “vertical” aspect of complexity – everything sits on top of a deep stack of other stuff these days. Perhaps the design of Elixir was constrained by the peculiarities of Erlang and OTP.
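
      To make both complaints concrete, a quick sketch – this is plain Elixir, runnable as-is in IEx:

      # Anonymous functions need a dot to be called:
      add = fn a, b -> a + b end
      add.(1, 2)    # => 3 – plain add(1, 2) would look up a named function instead

      # Maps support both dot access and bracket (Access) syntax:
      m = %{name: "Ada"}
      m.name        # => "Ada"
      m[:name]      # => "Ada"

      # Structs support only dot access out of the box:
      defmodule User do
        defstruct name: nil
      end
      u = %User{name: "Ada"}
      u.name        # => "Ada"
      # u[:name]    # raises – structs don’t implement the Access behaviour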

      Elixir still takes a commendably conservative approach to features in comparison to other languages like Rust, TypeScript and C++.

      In general, I feel like there’s an extremely cavalier approach to complexity at all levels of our industry, whether it’s designing PLs or writing software using them. That’s probably why, no matter how much tools and technologies seem to improve (on paper), it still takes a long time to create even an ostensibly simple piece of software.

      1. 14

        I’m finding that the most complex part of learning yet another language isn’t the language, it’s the tooling and ecosystem.

        TypeScript itself was fun and easy … but figuring out that config file the translator uses was annoying, and ditto for TSLint and Jester, and don’t get me started on the seventeen flavors of JS modules. Stuff like Grunt can die in a fire. (I now understand why some folks go nuts for radical simplicity and end up living in a cave running Gopher on their 6502-based retro computer: Node and the web ecosystem drove them to it. So much worse than C++, even with CMake.)

        1. 3

          Yes, that is true but I think there’s a compounding effect on complexity which means complexity at all levels is important to minimise. As an example, a more complex language requires more complex tools and more attempts at getting the tools right, which leads to the proliferation you mention. Some features can instead lead to proliferation of libraries (Smart pointers or regular pointers? Which of fifteen flavours of strings? Macros or regular functions? Etc.)

          Putting languages aside, isn’t it the same kind of disregard for complexity in all other areas (protocols, APIs, libraries etc.) that ultimately results in unwieldy ecosystems?

          We tend to underestimate how small bits of complexity compound and snowball, I think.

        2. 3

          I strongly agree with this. Coming from Clojure on the JVM, I found it much more difficult to work on a project written in ClojureScript (supposedly the “same” language on a different runtime) than it was to learn Mirah (completely different language on the JVM).

          Of course there are other factors like completely different paradigms that come out of left field (logic programming, category theory, etc) to make things difficult, but other than that, once you’ve got three or four languages under your belt, learning a new one is much easier than learning a new runtime.

      2. 4

        Some of its complexity seems particularly unnecessary - eg there shouldn’t really be special syntax for calling anonymous functions, it’s not great to have different ways to access fields in structs vs maps, and so on. Possibly this highlights the “vertical” aspect of complexity - everything sits on top of a deep stack of other stuff these days. Perhaps the design of Elixir was constrained by the peculiarities of Erlang and OTP.

        Joe Armstrong discussed this in an article.

        It’s because there is a bug in Erlang. Modules in Erlang are sequences of FORMS. The Erlang shell evaluates a sequence of EXPRESSIONS. In Erlang FORMS are not EXPRESSIONS.

        The two are not the same. This bit of silliness has been Erlang forever but we didn’t notice it and we learned to live with it.

        https://joearms.github.io/published/2013-05-31-a-week-with-elixir.html
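
        To see the difference Joe is pointing at, here is a quick IEx session (illustrative, not from the article) – in Elixir, a module definition is just another expression, so it works in the shell:

        iex> defmodule Greeter do
        ...>   def hello(name), do: "Hello, #{name}!"
        ...> end
        iex> Greeter.hello("Joe")
        "Hello, Joe!"

        The equivalent is impossible in the Erlang shell, because a function definition is a form, not an expression, and has to live in a compiled module.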

        1. 2

          My Erlang is super rusty so I’m a bit fuzzy on this, but it looks to me like “forms” in this context basically means what other languages call statements; is that accurate?

          It’s refreshing to see someone admitting their mistakes so openly.

      3. 2

        In general, I feel like there’s an extremely cavalier approach to complexity at all levels of our industry, whether it’s designing PLs or writing software using them.

        This has been frustrating me to no end, especially in the last ~4 years. My pet theory is that the industry has been overtaken by a “tech consumerist” mindset, where the impact of incorporating new, shiny “tech products” into your codebase is greatly overestimated, while the cost of the added complexity is underestimated. Curious to hear if you have a take on how this cavalier attitude became so prevalent.

        1. 2

          Very few major projects are considered minimalist. The knock-on effect of this is that it steers the imagination away from minimalism. From a maximalist viewpoint, PLs such as Lisp, Smalltalk, or ML look like toys because there seemingly isn’t enough stuff to learn.

        2. 2

          It’s often a question of choosing where to put complexity. In Verona, we’d really like to have Smalltalk-like user-defined flow control (though with static typing, so that it can all be inlined in an ahead-of-time compiler with no overhead), where a match expression, a function call / lambda, and (checked) exception throwing / catching are the only things in the language, and everything else is part of the standard library. This is a great thing for having a minimal language, and it also means that complex control-flow constructs – such as a range-based for loop iterating over every other element of a collection in reverse order – are as first-class as while loops.

          We pushed quite a long way on this, but in a language that has linear types, such user-defined constructs add a huge amount of complexity to the type system. For example, consider this trivial case:

          var x : iso; // Linear pointer to the entry point of a region.
          if (a)
          {
            doSomethingWith(x);
          }
          else
          {
            doSomethingDifferentWith(x);
          }
          

          The two lambdas (braces are used to define zero-argument lambdas in Verona) both need to capture x, but x is a linear type and so they can’t both capture it. There are some details around the fact that else is actually an infix operator applied to the result of if, which I’m going to ignore for now, but even without that, we need to be able to express in the type system that:

          • The if...else construct takes two lambdas that take zero arguments.
          • Each lambda must be callable once.
          • We don’t need to be able to call both lambdas together and so it’s fine for calling one to implicitly invalidate the other.
          • The construct will definitely call one or the other, and so any properties that are true at the end of both (e.g. that x is still valid and was not consumed) can be assumed for type checking the surrounding block.

          This is incredibly complex to make work at all, making it work and be something that mere mortals can use is basically impossible, so we’ve punted on it for our MVP and will build some control-flow constructs into the language.
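
          For what the non-linear version of this looks like, Elixir (discussed elsewhere in this thread) is a handy existence proof: its if is an ordinary macro in the standard library (Kernel), not primitive syntax. A minimal sketch of the same idea – the names here are made up for illustration:

          defmodule MyControlFlow do
            # Expands my_if into a case expression at compile time;
            # case and lambdas are the primitives, if is library code.
            defmacro my_if(condition, do: then_branch, else: else_branch) do
              quote do
                case unquote(condition) do
                  x when x in [false, nil] -> unquote(else_branch)
                  _ -> unquote(then_branch)
                end
              end
            end
          end

          # Usage:
          require MyControlFlow
          MyControlFlow.my_if(1 < 2, do: "yes", else: "no")   # => "yes"

          Of course, this is exactly the easy case: without linearity there is nothing like the “calling one lambda invalidates the other” constraint to express.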

    17. 35

      As I was reading this wonderful writeup, I had a nagging feeling that most certainly ‘someone on the internet’ would bring some half-assed moralizing to bear on the author. And sure enough, it’s the first comment on lobsters.

      I think it’s a beautiful and inspiring project, making one think about the passage of time and how natural processes are constantly acting on human works.

      @mariusor, I recommend you go troll some bonsai artists, they too are nothing but assholes who carve hearts in trees.

      1. 8

        We do have an ethical obligation to consider how our presence distorts nature. Many folks bend trees for many purposes. I reuse fallen wood. But we should at least consider the effects we have on nature, if for no other reason than that we treat nature like we treat ourselves.

        I could analogize bonsai to foot-binding, for example. And I say that as somebody who considered practicing bonsai.

        1. 12

          Foot binding is a social act in which women are deliberately crippled in exchange for access to certain social arrangements in which they don’t need to be able to walk well. The whole practice collapsed once the social arrangement went away. It’s very different than just getting a cool gauge piercing or whatever.

        2. 7

          Thank you Corbin for addressing the substance of my admittedly hot-headed comment. It did give me food for thought.

          I am definitely in agreement with you on the need to consider the impact of our actions on the environment. I have a bunch of 80-year old apple trees in my yard which were definitely derailed, by human hands, from their natural growth trajectory. This was done in the interest of horticulture, and I still benefit from the actions of the now-deceased original gardener. All in all I think the outcome is positive, and perhaps will even benefit others in the future if my particular heritage variety of apple gets preserved and replicated in other gardens. In terms of environmental impact, I’d say it’s better for each backyard to have a “disfigured” but fruitful apple tree than to not have one, and rely on industrial agriculture for nutrition.

          Regarding the analogy with foot-binding, which I think does hold to a large extent (i.e. it involves frustrating the built-in development pattern of another, without the other’s consent) – the key difference is of course the species of the object of the operation.

          1. 7

            Scale matters too, I think.

            I’m a gardener who grows vegetables, and I grow almost everything from seed – it’s fun and cheap. That means many successive rounds of culling: I germinate seeds, discard the weakest and move the strongest to nursery pots, set out the strongest starts to acclimatize to the weather, plant the healthiest, and eventually thin the garden to only the strongest possible plants. I may start the planting season with two or three dozen seeds and end up with two plants in the ground. Then throughout the year, I harvest and save seeds for next year, often repeating the same selecting/culling process.

            Am I distorting nature? Absolutely, hundreds of times a year - thousands, perhaps, if I consider how many plants I put in the ground. But is my distortion significant? I don’t think so; I don’t think that, even under Kant’s categorical imperative, every back-yard gardener in the universe selecting for their own best plants is a problem. It fed the world, after all!

            1. 3

              My friend who is a botanist told me about research he did into how naïve selection produces worse results. Assume you have a plot with many variants of wheat, and at the end of the season you select the best in the bunch for next year. If you’re not careful, the ones you select are the biggest hoarders of nutrients. If you had a plot with all that genotype, it would do poorly, because they’re all just expertly hoarding nutrients away from each other. The ones you want are the ones that are best at growing themselves while still sharing nutrients with their fellow plants. It’s an interesting theory, and he’s done some experimental work to show that it applies in the real world too.

              1. 2

                The ones you want are the ones that are best at growing themselves while still sharing nutrients with their fellow plants.

                So maybe you’d also want to select some of the ones next to the biggest plant to grow in their own trials as well.

      2. 3

        I think it’s a beautiful and inspiring project, making one think about the passage of time and how natural processes are constantly acting on human works.

        I mean… on the one hand, yes, but then on the other hand… what, we ran out of ways to make one think about the passage of time and how natural processes are constantly acting on human works without carving into things, so it was kind of inevitable? What’s wrong with just planting a tree in a parking lot and snapping photos of that? It captures the same thing, minus the tree damage and leaving an extra human mark on a previously undisturbed place in the middle of the forest.

        1. 14

          As I alluded to in my comment above, we carve up and twist apple trees so that they actually give us apples. If you just let them go wild you won’t get any apples. Where do you get your apples from? Are you going to lecture a gardener who does things like grafting, culling, etc., to every tree she owns?

          The same applies here: the artist applied his knowledge of tree biology and his knowledge of typography to get a font made by a tree. I think that’s pretty damn cool. I am very impressed! You can download a TTF! How cool is that?

          Also, it’s not ‘in the middle of a forest’, but on his parents’ property, and the beech trees were planted by his parents. It’s his family’s garden and he’s using it to create art. I don’t get the condemnation, I think people are really misapplying their moral instincts here.

          1. 5

            Are you going to lecture a gardener who does things like grafting, culling, etc., to every tree she owns?

            No, only the gardeners who do things like grafting, culling etc. just to write a meditative blog post about the meaning of time, without otherwise producing a single apple :-). I stand corrected on the forest matter, but I still think carving up trees just for the cool factor isn’t nice. I also like, and eat, beef, and I am morally conflicted about it. But I’m not at all morally conflicted about carving up a living cow just for the cool factor, as in, I also think it’s not nice. Whether I eat fruit (or beef) has no bearing on whether stabbing trees (or cows) for fun is okay.

            As for where I get my apples & co.: yes, I’m aware that we carve up and twist apple trees to give us apples. That being said, if we want to be pedantic about it, back when I was a kid, I had apples, a bunch of different types of plums, sour cherries, pears and quince from my grandparents’ garden, so yeah, I know where they come from. They pretty much let the trees go wild. “You won’t get any apples” is very much a stretch. They will happily make apples – probably not enough to run a fruit selling business off of them, but certainly enough for a family of six to have apples – and, as I very painfully recall, you don’t even need to pick them if you’re lazy, they fall down on their own. The pear tree is still up, in fact, and at least in the last 35 years it’s never been touched in any way short of picking the pears on the lowest two or three branches. It still makes enough pears for me to make pear brandy out of them every summer.

            1. 6

              I concede your point about the various approaches as to what is necessary and unnecessary tree “care” :)

              No, only the gardeners who do things like grafting, culling etc. just to write a meditative blog post about the meaning of time, without otherwise producing a single apple :-).

              But my argument is that there was an apple produced, by all means. You can enjoy it here: https://bjoernkarmann.dk/occlusion_grotesque/OcclusionGrotesque.zip

      3. 3

        Eh. I hear what you’re saying, but you can’t ignore the fact that “carving letters into trees” has an extremely strong cultural connection to “idiot disrespectful teenagers”.

        I can overlook that and appreciate the art. I do think it’s a neat result. But then I read this:

        The project challenges how we humans are terraforming and controlling nature to their own desires, which has become problematic to an almost un-reversible state. Here the roles have been flipped, as nature is given agency to lead the process, and the designer is invited to let go of control and have nature take over.

        Nature is given agency, here? Pull the other one.

      4. 3

        You see beautiful and wonderful writeup, I see an asshole with an inflated sense of self. I think it’s fair that we each hold to our own opinions and be at peace with that. Disrespecting me because I voiced it is not something I like though.

        1. 15

          I apologize for venting my frustration at you in particular.

          This is a public forum though, and just as you voiced your opinion in public, so did I. Our opinions differ, but repeatedly labeling others as “assholes” (you did in your original post and in the one above) sets up a heated tone for the entire conversation. I took the flame bait, you might say.

          Regarding ‘inflated sense of self’ – my experience with artists in general (I’ve lived with artists) is that it’s somewhat of a common psychological theme with them, and we’re better off judging the art, not the artist.

    18. 4

      Thank you so much for the writeup, nomemory. I seem to be in the minority but I simply adore blog posts/essays that are like: “here is some code, let’s talk about how it works”. What an excellent ‘advent of code’ idea, too! Something to try to emulate come December ’22.

    19. 4

      Out of all the languages I’ve worked with during my career, Erlang and POSIX shell have felt the most ‘solid’. I can look at a shell script or Erlang program written in the last 30 years and be confident that I

      1. can understand it
      2. can run it on modern systems

      Erlang definitely doesn’t shine in the pure-functional department. Its type system is very lispy, not offering the wonders that ML-style type systems give you. … yet despite all that, it’s a very fun and elegant programming environment, and I could see it hanging around long-term.

    20. 4

      Very cool writeup Ntietz. I think as more and more applications diverge from the old-school request -> DB-work -> render-output webapp model, we’ll find ourselves “breaking the rules” more often.

      This type of architecture makes me happy – Erlang/Elixir programs can very often really capitalize on this pattern (see, for example, caching user-local data in a Phoenix Channel for the duration of a socket’s existence).
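
      For anyone who hasn’t seen the pattern, a minimal sketch of what that looks like – the module and the Accounts context are made up for illustration, but join/3, handle_in/3, and Phoenix.Socket.assign/3 are the standard channel API:

      defmodule MyAppWeb.ProfileChannel do
        use Phoenix.Channel

        # Hit the database once, at join time, and cache the result in the
        # socket assigns for as long as the socket lives.
        def join("profile:" <> user_id, _params, socket) do
          profile = MyApp.Accounts.get_profile!(user_id)  # hypothetical context fn
          {:ok, Phoenix.Socket.assign(socket, :profile, profile)}
        end

        # Later messages on the channel read the cached copy instead of the DB.
        def handle_in("get_profile", _params, socket) do
          {:reply, {:ok, socket.assigns.profile}, socket}
        end
      end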

      1. 1

        Elixir and the BEAM definitely make this easy to do and can be used to great effect. I’m really excited to see what comes about with Phoenix LiveView (and the similar projects in other languages) leveraging connection state and lots of backend processing.