Threads for ubernostrum

  1.  

    I wasn’t aware you could calculate the sin of an author. Are there additional formulas for the cos and tan?

    But more seriously: a lot of what the author complains about is the result of applying pretty standard guidelines of composition (intro/topic statement, “tell them what you’re going to tell them”, etc.). Are they being misapplied, or at least imperfectly applied, in many of these cases? Yes, but that’s because they’re being applied by amateurs – it’s likely that none of the people writing these are actually professional writers. And I’d bet money that their training for this consists entirely of maybe a seminar or workshop in grad school where a bunch of oversimplified “rules” for how to write a paper were presented and now they’re just rote-following those rules.

    This is something I also see plenty of in tech conference presentations – you can always tell who has no idea about public speaking and is just reading their slides, who just Googled a basic how-to and is re-enacting exactly what it said, and who has enough experience to actually be comfortable and know which “rules” to follow and which to break (and when and why).

    1. 5

Makes you wonder how something like this got through with, apparently, no prior announcement or communication of any kind. Surely they knew they’d have to revert this?

      1.  

        From the link, it looks as if this is a change in git, and GitHub’s API is just a thin wrapper around the git tool. The problem isn’t that git changed it, it’s the combination of three things:

        • The behaviour of the git tool changed.
        • Various services such as GitHub exposed the functionality of git directly.
        • Various consumers of that service depended on the GitHub service to provide a stable output.

        I’m honestly quite surprised that this is how it works at GitHub, but then I wasn’t aware of the git archive command. The svn equivalent is svn export, which creates a file tree that you can then tar up with whatever tool you want.

        1.  

It feels like the kind of change that usually wouldn’t even be noteworthy in a new git release, so nobody thought to check for things relying on the prior compression algorithm’s exact output, or to make a big warning post about it in advance.

        1. 5

          Wait, who is going around updating their copyright years??

Ignoring anything else: you can’t simply put a new copyright year on your work to extend the period of coverage; the copyright term runs from when you first published the work, at the latest. If January comes around and you slap a new year on your work, you’re not accomplishing anything: an updated copyright date only applies to new material, so you can update a copyright header when you make some other copyrightable change.

That said, given that Disney has ensured the length of copyright approaches the heat death of the universe, I’m not sure there’s real value in that either.

          1. 6

            The argument goes:

            • The initial copyright date for a software project is the date when it was first written
            • But if the project undergoes any type of ongoing development, such that new code is being added to it, then that new code has a copyright date of whenever the new code was written
            • Therefore the copyright statement should reflect the range of dates involved in the various individual bits of code that make up the project

            So a project begun in 2015 would initially have “Copyright 2015”. Then if more code was added in 2016, it would become “Copyright 2015-2016”. And so on.

            Or at least that’s what I understand the argument to be for why the years need updating.

            The analogy would be a blog that’s kept over multiple years – each individual entry is copyrighted as of its date of authorship, so the blog’s sidebar or footer would display a pair of years in its copyright statement, reflecting the range of dates of copyright of the constituent entries.

            1. 6

I understand updating the year when you make a copyrightable change, but some projects (e.g. FreeBSD) go and update the copyright on unmodified files at the start of a year, and this completely confuses me.

              1. 1

Copyright terms are based on the death of the author, not the publication date, anyway.

          1. 8

            I was confused reading this until I realized the author is using an unusual and non-standard meaning of “rationalist”.

            A rationalist, in ordinary use, is someone who uses evidence and reasoning to make decisions that don’t have clearly defined right or wrong answers. Ask a rationalist to design a perfectly secure system and they’ll resign. Ask them to improve security and their decisions (while often unintuitive) will generally result in designs that are more resistant to malicious activity.

            The author is using “rationalist” to describe someone who applies an inflexible mechanical process, such as trying to design a secure system by enumerating every possible “source of insecurity” and then applying compensating controls. You’ll see a lot of this in auditor-driven security reports such as SOC 2.

            Not sure if the author is reading these comments (I don’t want to butt in on their thread directly), but if they are, it might be worth trying to rephrase their posts for clarity.

            1. 15

              I believe the author is using “rationalist” in the sense it has come to have in tech-y corners of the internet. Which is not someone who falls into the intellectual tradition of, e.g., Descartes, but rather a person enamored of “Bayesianism” and likely associated things like LessWrong, Effective Altruism/“longtermism”, Eliezer Yudkowsky, etc.

              It’s similar to how, say, “Java” in most contexts likely refers either to the island or to coffee, but in a tech-oriented forum almost certainly refers to the programming language and/or its associated ecosystem.

              1. 6

I think it’s unreasonable to attempt a redefinition of a millennia-old philosophical tradition based on some obscure blog and the actions of someone known primarily for writing the world’s longest Harry Potter fanfiction.

                1. 9

                  Whether you think it’s “unreasonable” isn’t really relevant – if you read a lot of tech and tech-adjacent forums, you’re going to find that in those forums, “rationalist” refers to the modern-day Bayesian people almost every time.

                  1. 6

                    if you read a lot of tech and tech-adjacent forums,

                    I do.

                    you’re going to find that in those forums, “rationalist” refers to the modern-day Bayesian people almost every time.

                    This is not my experience.

                    1. 5

                      This is not my experience.

                      This is simply the day you learn about it. No shame about it. Some day in your life you didn’t know about Coke.

                      But yes “rationalists” on hackernews or here often* refers to something different than the classical meaning you might learn in a logic or philosophy course.

                      *not always

                      1. 9

                        I’ve been an active poster on HN since 2009, and older communities (Slashdot, etc) before then.

                        I’m not trying to be rude here, but I think you’re significantly over-estimating how many people (even on “tech-y corners of the internet”) have even heard of LessWrong. That population is almost certainly orders of magnitude smaller than the people who took several quarters of logic and/or philosophy as part of their undergrad degree.

If you want to claim there’s a set of very young (or very parochial) folks in a Twitter/Tumblr bubble who think “rationalists” refers to fans of a particular blog, then I’ll accept that, but I would also say that’s all the more reason not to put any weight on their opinions.

                      2. 3

Fully agreed. I’ve never stumbled across this term used in such a way, so I’m siding with unreasonable :P

                      1. 5

                        Oh my god.

                        I love that you have citations for this but … oh my god.

I knew HPMOR is longer than War and Peace, but I had never expected someone would write three and a half million words of fanfic about anything, much less Harry Potter.

                        A Second Chance by Breanie

                        Part 2 of The Kismet Trilogy

                        Three and a half million words and it’s part of a trilogy!

                        I can’t handle this right now, or maybe ever.

                        1. 2

                          I mean… it’s been 6,400 business days since HP1 was released, and the author would have had to average 560 words a day to reach that number. Hardly impossible, especially for a creative person.
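The back-of-the-envelope arithmetic checks out (a quick sanity check of the numbers in the comment above):

```python
# Rough sanity check: sustained daily output since HP1's release.
words_per_day = 560
business_days = 6_400
total_words = words_per_day * business_days
print(f"{total_words:,}")  # 3,584,000 -- about three and a half million words
```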

                        2. 4

                          Jeepers, you have to get to page 6 in the AO3 listing to hit 660k words. Fanfic is such an amazing thing (I don’t partake, I just find this kind of remixing culture interesting).

                          1. 1

                            I partake, and yeah it’s pretty great.

                            One of my favorite serials (With This Ring) has been updating with ~1000 words per day every day for the past 13 years!

                      2. 6

                        Of course, the rationalists (LessWrong, EA, Yudkowsky) are against enumerating modelable threats and assuming you thus have the problem handled, certainly in the case of AI. A big point is that we have to account for the fact that we can be hideously limited in our understanding of a problem.

                        (Like, Eliezer is all about “navigating a sea of unknown unknowns”.)

                        That said, if the author is calling out LW specifically, unless this is about AI alignment, I have no idea what they are talking about. The LW “canon” and community has no particular opinion on infosec.

                        (I also don’t know what Chapman is talking about. He doesn’t seem like he’s directly responding to anything Eliezer or anyone on LW said. Though I do actually, as a LWite, think he’s wrong on his own merits.)

                        1. 3

                          It’s similar to how, say, “Java” in most contexts likely refers either to the island or to coffee, but in a tech-oriented forum almost certainly refers to the programming language and/or its associated ecosystem.

                          No, it’s not similar to that. “Java” has multiple meanings, but these meanings are not competing, e.g. “learning Java” or “drinking Java” clearly refer to different kinds of Java, and it is clear from context which one you mean. This is different from saying things like “I’m a rationalist”, “this is a rationalist view”, “rationalists are wrong” and the listener assuming a different kind of rationalist than what you meant.

                          1. 4

                            Perhaps a better analogy is when C.S. Peirce found the term “pragmatism” being co-opted and used by others – particularly William James – in ways he didn’t like. People “stealing” established terms like this is not exactly a new phenomenon and, like I pointed out, in tech and tech-adjacent forums it’s almost always safe to assume “rationalist” means the Bayesian/Yudkowsky people.

                            (Peirce worked around it by renaming his philosophy “pragmaticism”, claiming the word was too ugly for anyone else to want to take it from him)

                          2. 3

Not sure what you’re getting at here. First, LessWrong doesn’t call its thing “rationalism”; it calls it “rationality”. I think the slight change in terminology is important, since it helps highlight that it’s an explicit departure from previous, similar-looking traditions.

Second, there’s a reason people become enamoured with Bayesianism: probability theory is correct. See the first 2–3 chapters of E. T. Jaynes’ Probability Theory: The Logic of Science for more details, but the crux is that probability theory requires surprisingly few axioms, and those axioms would be unreasonable to deny.

Now the problem with probability theory (as anyone sufficiently familiar with it knows) is that it’s also intractable. Correctly reasoning about anything complex enough brings about an unmanageable combinatorial explosion, and we end up dealing with problems worse than NP-complete all the time. In most cases we can only apply an approximation of probability theory, and that’s clearly incorrect. Our best hope is to get close enough, and to my knowledge we don’t yet have a theory of how to “correctly” approximate correct reasoning.

                            Kind of frustrating, really: we have this beautiful Theory of Truth, that crushes our dreams of ever reaching absolute certainty. But at least we can get close enough, right? But then computational theory crushes our dreams again by pointing out we cannot apply our beautiful Theory of Truth in the first place.

                            I believe this is why LessWrong is called “LessWrong” instead of “OvercomingBias” (that was on purpose) or “MoreRight”. We cannot completely overcome biases, and forget about being right. The best we can hope for is making fewer mistakes than the other folks.

                        1. 11

                          It had 105 million downloads in 2022.

                          “Sorta popular”, right. Only 290,000 downloads per day. Lesson one: humility.

A lot of programmers use Stack Overflow today. (I wish there were a popular nonprofit alternative I could recommend.)

                          On the one hand, oof, but on the other hand, I very much like lobste.rs not being “popular”.

                          1. 9

                            Only 290,000 downloads per day.

                            Keep in mind that package-index download counts are tricky numbers, because there are a lot of mirrors of the big popular indexes, and a lot of badly-configured CI tools which download the whole dependency tree from scratch on every run. And in the JS world the way installation works means that a single npm install might actually download multiple copies of any given package in certain circumstances (example: packages foo and bar both depend on package baz, but on different versions of it; npm install will download two copies of baz in that case).

                            It’s still a popular package. But always take download counts with a similarly-sized grain of salt.

                          1. -4

                            So, reinventing Nix?

                            1. 13

I think that’s a bit of an unfair take. They are talking about making it as easy as possible for newbies to bootstrap a Python environment on Windows/Linux/macOS. If your answer to that is Nix, the bootstrapping becomes a nightmare.

So how do I set up for this project?

Either you install this whole operating system where everything works very differently from what you’re used to, or you install this Nix CLI tool. Ah, you’re using Windows? Sorry, not unless you first install this WSL thing.

Now, imagine you want to add a new dependency. You may need to put it in pyproject and then use this glue called poetry2nix, which sometimes works and sometimes doesn’t. If it doesn’t work, maybe you can add the dependency directly via python3XPackages.package, and if it’s not there, then good luck: you either open a patch against nixpkgs’ poetry2nix adding your package, or learn how to package a Python library and contribute that to nixpkgs. The other option is to create a virtualenv the old-fashioned way and then use Nix as a kind of pyenv.

                              I don’t think any of that sounds better than what is being proposed in the link.

                              1. 3

                                I don’t mean to dismiss the author’s work, but to point out the continued insular choices of the Python core teams. Instead of installing Nix, the author asks us to install Cargo and then go through a standard Rust-project workflow; they are comparable in complexity and extent.

                                You may choose to continue using pyproject.toml, Poetry, virtualenv, pyenv, etc. but the direct path is to use nix-shell to configure an entire development environment in an atomic step. The list of packages can be contained to a single line in a single file; here is an example from a homelab application which I updated recently.

                                Contributing to nixpkgs is not trivial, but it is not difficult either. Here is a recent PR I authored for adding a Python package. It’s shorter than a setup.py, in my experience! Also, you don’t have to contribute new packages to nixpkgs; instead, you can add them to your local Nix expressions on a per-project basis.

                                Please also keep in mind that all of this discussion is within the context of Python packaging difficulties. Languages without extension modules don’t require all of this effort; all we need instead is to install a runtime directly from an upstream package-builder, whether that’s a distro, vendor, or third-party packager. We should imagine that a language is either designed to have lots of extensions and be an integrator of features, or designed for monolithic applications which reimplement every feature and are suitable for whole-program transformation. Python picked both, and so gets neither.

                                1. 10

                                  I don’t understand how Nix would even be an alternative option when the goal is to support MacOS, Linux, and Windows.

                                  1. 2

                                    Hypothetically, just as a thought experiment and nothing else, I think maybe a case could be made that running Nix inside WSL and cross-compiling from that to Windows might be sorta acceptable. I don’t think that’s a realistic thing to propose: it’s a ton of work for starters, and the payoff would be pretty dubious since you’d have this really long painful edit/test cycle.

                                    1. 1

                                      Doesn’t WSL solve that problem out of the gate?

                                      1. 8

WSL is like Electron: it makes it easy for a developer to provide something to the user, but the thing provided is much worse than a corresponding native solution. I’d struggle to integrate a WSL app with my native Windows PowerShell scripts.

                                        1. 7

                                          The short answer is no.

                                          The longer answer is: python has supported Windows natively for over a decade (how well might be up for debate, but it was supported), it’s not reasonable for them to suddenly say “use Linux inside Windows or get fucked”, and it’s not reasonable to expect them to do so, either.

                                          1. 3

                                            I don’t have any statistics, but I would bet that the vast majority of Windows users (corporate IT managed machines) can’t enable WSL. Python is actually very easy to install on Windows with the Microsoft Store. Requiring users to enable WSL and understand how to use linux would be a large obstacle.

                                            1. 2

                                              Or just with the installer from python.org, which will install to AppData by default (I think? At least if you choose the install just for me option), so no admin permissions needed.

                                              1. 3

                                                All the cool kids use winget

                                        2. 11

                                          I don’t mean to dismiss the author’s work, but to point out the continued insular choices of the Python core teams.

                                          No, you are doing what you always do: pushing your preferred tools as the only acceptable tools, such that all development on all other tools must cease and all people everywhere must adopt only and exclusively your preferred tools. And along the way you throw in the usual (un)healthy dose of bashing anyone who dares to develop other tools, since obviously it’s bad and wrong for them to do so when the One True Thing has already been invented and thus they must be doing that for bad reasons.

Sometimes you do this with PyPy versus CPython. Sometimes with functional programming/category theory versus other paradigms. Sometimes with Nix versus literally everything. But it’s always the same basic dismissive/attacking approach.

                                          Maybe don’t do that?

                                          1. 2

                                            I really appreciate this comment. It helps me understand you.

                                          2. 5

I just read the rest of your comment, and boy, there’s so much more I disagree with.

                                            but the direct path is to use nix-shell

To whom? There’s a world of people for whom Nix is a non-starter. Everyone using Visual Studio. Or working on computers they don’t fully control (enterprise developers). Or people who like Bluetooth to work, so they can’t use Linux (this is half in jest, half serious). “There’s no silver bullet” applies to your favorite thing too.

                                            all we need instead is to install a runtime directly from an upstream package-builder

What language is like that? Ruby has C extensions, JavaScript has them, Java has JNI. Even Go, which is famous for reinventing the wheel a lot, has cgo. In every single language that isn’t C you will, at some point, have problems trying to install a package that needs to compile something in another language.

                                            The reason it happens so much more in python is actually kind of a feature, not a bug: python was designed to be easily extendable, specifically in C, although that feature was perhaps not as well designed as we would like, in hindsight.

                                            We should imagine that a language is either designed to have lots of extensions and be an integrator of features, or designed for monolithic applications which reimplement every feature and are suitable for whole-program transformation.

                                            Maybe in a perfect world but … I don’t think any languages really fit this binary, well, binarilly (?). At most some are more at one end than the other, but I’m honestly struggling to find utility in the whole classification really.

                                            1. 2

                                              Everyone using Visual Studio. Or working on computers they don’t fully control (enterprise developers).

                                              I don’t understand, why would Nix be a blocker in those contexts? If you don’t fully control the computer, wouldn’t you have trouble installing all the Rust thingies anyhow?

                                                1. 1

                                                  I don’t think that addresses my question. I genuinely don’t get why Nix would be a blocker to people using Visual Studio (VS Code(?)), are plugins sandboxed, or unable to interact with binaries/run commands in some other way?

                                                  1. 5

You’re applying things I said about one thing to other things I didn’t intend them to be applied to.

One of my disagreements is with the idea that Nix is some sort of ideal goal that every developer is converging on. This idea breaks down as soon as you realize that people writing C# in Visual Studio (not VS Code) will never adopt something like Nix unless it’s fully integrated with Windows, like every single other tool they use.

                                                    The other disagreement is with the idea that the way the project linked can currently be used is the final interface: it clearly isn’t, they clearly say it will be a single binary in the future.

                                                    Only the second one has anything to do with Python tools. The first one is just a criticism of the idea of nix as the best thing ever that everyone should use and can do no wrong.

                                            2. 4

                                              the author asks us to install Cargo and then go through a standard Rust-project workflow

This is clearly a very early-stage prototype; I didn’t see any claims that this is the final interface that people should use today.

                                              In fact, straight from the readme:

The Vision

The goal is for posy to act as a kind of high-level frontend to python: you install posy, then run posy [args] some_python.py and it takes care of everything up until entering the python interpreter. That includes:

                                              installing Python (posy is a pure-rust single-file binary; it doesn’t assume you have anything else installed) (…)

                                              1. 3

                                                Has nix managed to one-up rust on its evangelism task force?

                                              2. 2

                                                Not that I know a lot about nix but the only similarity I see between nix and this proposal is that they are both made with code?

                                                Ok, to be fair, they are made with code and related to managing packages. And they mention immutability somewhere in their description.

                                                So, 3 similarities. Maybe I am wrong. Still sounds like a far-fetched comparison.

                                              3. 1

                                                I for one would like to strongly encourage anyone who would like to make an attempt at “reinventing Nix”, since a thing that is like Nix but avoids some of its pain points could potentially be delightful.

                                              1. 49

                                                Airline computer systems are astonishing. A few years ago, I turned up at a check-in desk for a two-hop flight and discovered that I didn’t have an included bag for one hop. The person at the desk helpfully told me that it was because I had booked the two hops separately and they could replace it with a single booking for the two hops to get a free checked bag. To do this without losing my seat, they needed to do an atomic delete and insert into the database. This was accomplished by telephoning the DBA, asking them to lock out all updates from every terminal except theirs for a minute, and then doing the two transactions.

                                                A lot of airlines used to be vulnerable to the same issue with their predictive pricing models, where they’d dynamically adjust pricing based on demand. 15-20 years ago, this was easy to game by going to a load of different travel web sites and starting the process of reserving a ticket. Each of the sites would place a hold on a ticket for the flight, which would trigger a price increase. A few hours later, if you didn’t complete the purchases, the holds would time out and the back-end system would detect a sudden drop in demand and adjust the price down a lot. At that point you could book the flights for a lot less than the original price.

                                                1. 7

                                                  About ten years ago I boarded a flight and had the same seat on my boarding pass as another passenger.

                                                  The crew were apologetic and explained “you must have checked in at two different counters at exactly the same time”.

                                                  At the time I thought to myself “surely it can’t be that simple”. But maybe it was!

                                                  1. 5

At the same time, it’s kind of amazing how the ancient mainframe-era systems still hold up today, and how you occasionally see bits and pieces of their interfaces “leaking” outward. For example, many airlines used to have odd password restrictions for their online accounts – you couldn’t use certain letters, like “Q” or “Z”, in your password. I’ve been told by people whose knowledge I trust that this was due to the ultimate backend system being originally designed decades ago for travel agents to use via telephone keypad, where “Q” and “Z” could not be entered. These days I think most of the US airlines have upgraded/modernized at least enough of the intermediate layers that this is no longer a problem, but it amused me at the time.

                                                    As to your story of the DBA: the hardcore frequent-flyer people (I used to fly a lot, and so spent a fair amount of time in their forums) are pretty good at figuring out patterns and cycles of when airlines do various types of data load/data change operations, since often it affects which fare buckets are available and thus what kinds of deals you can get. That community does plenty of impressive reverse engineering of internal processes, and the stuff you learn from just reading their forums can be fascinating.

                                                  1. 5

                                                    One thing I’ve really begun to like since figuring it out: in nox, Python version is a first-class selector.

                                                    In tox, you can associate a given testenv with a particular Python version by declaring the basepython key for that testenv. And you can build a matrix of Python versions with the generative features. But there’s no easy way to run tox and say “only run the testenvs that go with this Python version”. You can select on lots of things, but not Python version. Which means that for CI systems which tend to run each Python version as its own job/container, you end up needing a plugin that will selectively enable the correct testenvs based on Python version. I used to use tox-travis for this, and then tox-gh-actions when I switched to GitHub Actions CI. But doing this always required duplicating information: you need to specify your testenv/python-version matrix all over again for the plugin so that it knows which envs go with which Python versions.

                                                    Meanwhile in nox, you declare the associated Python version(s) for a given session in the nox.session decorator, and then can run nox --python <version> to select only sessions associated with <version>. This means you don’t need any kind of duplicate mapping or even a plugin; as long as the active Python version is available as a variable in the CI config you can pass it as a command-line argument to nox and run only the correct sessions for that version.
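A minimal sketch of what that looks like in practice – the session names and commands here are illustrative, but the `python=` argument to `nox.session` and the `--python` selector are the mechanism described above:

```python
# noxfile.py -- minimal sketch; session names and commands are hypothetical.
import nox

# Each session declares the interpreter version(s) it should run under.
@nox.session(python=["3.10", "3.11", "3.12"])
def tests(session):
    session.install("pytest")
    session.run("pytest")

# A session can also be pinned to a single version.
@nox.session(python="3.12")
def lint(session):
    session.install("flake8")
    session.run("flake8", "src")
```

Then `nox --python 3.11` runs only the sessions declared for 3.11 (here, just `tests`), so CI can pass its active interpreter version straight through without maintaining a duplicate version map.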

                                                    1. 1

                                                      With the one exception of pre-releases (that matters only to the 0.1%) which I’ve hilariously opened a bug about just yesterday: https://github.com/wntrblm/nox/issues/685

                                                      I could probably write some extractor with regexp-fu but debugging CI is a nightmare.

                                                      edit: I’ve figured it out with a bit of bash-fu: https://github.com/hynek/environ-config/pull/48/files

                                                      edit 2: I’ve updated the post re: this topic

                                                    1. 2

                                                      The whole article was great, but I’m going to just start quoting

                                                      Furthermore, I consider that the PyPA must be destroyed.

                                                      in random unrelated other threads now because it is absolutely correct and it tickles my classicist funnybone.

                                                      The thing that should come first is making it easy for developers to accomplish common tasks. Everything else should be second to that. You aren’t going to fix all workflows with v1. That’s fine. Focus on solving a vertical slice and expand.

                                                      Just look at ESBuild. Before that people would defend Webpack by saying that all the complexity existed for a reason. It did but also most user needs could be solved with a zero configuration tool, and the remaining people can do something else.

                                                      1. 1

                                                        in random unrelated other threads now because it is absolutely correct and it tickles my classicist funnybone.

                                                        Considering that various peoples’ anti-PyPA crusades already basically pushed people who were actively trying to help improve packaging to give up because they got sick of all the hate directed at them, maybe you should find something else to “tickle your funnybone”.

                                                        1. 1

                                                          Please wait until I actually do the absurd thing before chiding me for it.

                                                      1. 20

                                                        Inadequate search is my top misfeature of Fediverse. With virtually non-existent discoverability within Fediverse I just don’t have a good reason to use it.

                                                        People getting upset that the stuff they post might get discovered is strange to me. The toots are out there. If one’s afraid someone might go through the whole timeline, they shouldn’t have posted it in the first place. If someone’s out there to get you, your indignation won’t stop them. And if it’s possible to do it manually, it can be automated in the browser. At the moment the Fediverse cannot both propagate your content and protect it from being read. I mean, people are not deleting their old toots because someone might read them, but they still get upset when someone reads them.

                                                        If that level of privacy is what they’re looking for they have to accept that Fediverse is not it. Then they might start moving towards a more private solution. Until then no amount of negative emotions will get them any closer. Denying reality will not give them what they want.

                                                        1. 7

                                                          With virtually non-existent discoverability within Fediverse

                                                          What does discovery look like to you on other platforms? How do you use twitter (or other platforms) to find new people/content?

                                                          The toots are out there. If one’s afraid someone might go through the whole timeline they shouldn’t have posted it in the first place.

                                                          There’s a difference between “here’s this thing, and if you happen across it organically that’s cool” and “here’s this thing, I want you to slurp it up into your indexing service”, and that’s where they’re coming from. I am of two minds: I think overall people do want better ways to find new people to follow, but there’s also a desire to be found only organically. And it turns out most of the long-time users want organic interactions.

                                                          If that level of privacy is what they’re looking for they have to accept that Fediverse is not it. […] Denying reality will not give them what they want.

                                                          If anyone is denying reality, I think it’s everybody creating the indexers. They have found the level of privacy they need; claiming the fediverse doesn’t provide this when it clearly and explicitly does is just wrong. Just because the protocol itself doesn’t provide the explicit protections you’re talking about doesn’t mean the fediverse and the people inhabiting it haven’t solved these problems through non-technical means.

                                                          To use everyone’s favorite quote, “your scientists were so preoccupied with whether they could, they didn’t stop to think if they should”. The communities that have found a home in the fediverse are simply asking all these developers writing crawlers and indexers to ask if they should.

                                                          1. 9

                                                            How do you use twitter (or other platforms) to find new people/content?

                                                            On Twitter I can search for anything and quickly find people who talk about the thing I searched. Note that Twitter provides full-text search, and in near real time too. So I can search not only for my favourite programming language but also for current events. The latter is especially useful because I’ve found a double-digit number of people who have expertise in areas I don’t, and I didn’t even know what those areas were called, so I couldn’t have found those people “organically”.

                                                            Tangentially, isn’t coming across a tweet/toot in search more valuable? People are actively looking for a specific thing, so their interest is probably higher than if they’d randomly stumbled upon a toot/tweet. Anyway…

                                                            On Mastodon, though… I can only search my local instance, and only by tags. Not all current events or other things one might want to search for have a tag. Not all interesting people are on my instance.

                                                            Just because the protocol itself doesn’t provide explicit protections you’re talking about doesn’t mean the fediverse and the people inhabiting it haven’t solved these problems through non technical means.

                                                            Well… Did they? As OP pointed out people just can’t know for sure what any particular instance does. Like, if I set up my private instance and follow a bunch of people I’ll have all their toots in my db. It’s all one SQL query away and no one would even know. I’d argue that “asking nicely” not to be indexed is not a viable solution.

                                                            The communities that have found a home in the fediverse are simply asking all these developers writing crawlers and indexers to ask if they should.

                                                            And what would happen when any given developer answers that they, in fact, should? The current reality of the Fediverse doesn’t quite have an answer to that, does it?

                                                            1. 2

                                                              Tangentially, isn’t coming by a tweet/toot in search more valuable?

                                                              I mean I think that is what people want. You find a profile by seeing something that someone else boosted, or by it being in your local or federated timelines, meaning someone in your orbit is following them, or follows someone who boosted them. The ideal (I think) being that the content filters through the network in an organic way, and not through a centralized index.

                                                              I can only search local instance and tags only. Not all current events or other things one might want to search for have a tag. Not all interesting people are on my instance.

                                                              This is definitely true, but I think people would argue they don’t want to be your news source for current events. If you want news, find a news account you like, or go to news websites.

                                                              To be clear I say “people would” and similar vague noncommittal things because I am still trying to figure out my own opinions on the topic, though I do currently lean a lot towards the side of “you need to make your fancy new product explicitly opt-in”.

                                                              Well… Did they? As OP pointed out people just can’t know for sure what any particular instance does. Like, if I set up my private instance and follow a bunch of people I’ll have all their toots in my db.

                                                              Sure, and the entire network doesn’t operate without that. But the concern is not you individually doing those things, it’s you doing that at a mass scale with the sole intent of indexing the network. In the case of Searchtodon I think the point where the line was crossed is when it went from being “i downloaded these files to my mac and can use spotlight to search them on my own machine” to being “i created a server which will download those files for you to search, and the indexes are colocated with other users’ data”.

                                                              And yes, you can of course do a ton of things without people knowing, and possibly do more nefarious things like sentiment analysis and account correlation, and there wouldn’t be any easy way to know. But given the community feedback, projects like this are clearly not welcome. A lot of people are fiercely opposed to this. And if you were to do something like this and kept it secret, and the greater community found out, I’m not positive anyone but a FAANG company could weather it.

                                                              And what would happen when any given developer would answer that they, in fact, should? Current reality of Feriverse doesn’t quite have an answer to that, does it?

                                                              I mean, isn’t the community reaction itself a sign that you’re probably not going to be able to say “yes, we should do this”? That same reaction is proof that they actually do have an answer: “we will fight you tooth and nail on this, until your project is untenable or until we perish”.

                                                              1. 5

                                                                Sure, and the entire network doesn’t operate without that. But the concern is not you individually doing those things, it’s you doing that at a mass scale with the sole intent of indexing the network. In the case of Searchtodon I think the point where the line was crossed is when it went from being “i downloaded these files to my mac and can use spotlight to search them on my own machine” to being “i created a server which will download those files for you to search, and the indexes are colocated with other users’ data”.

                                                                … colocated with other users’ already publicly available data. I still don’t see what the hoopla is since there isn’t any difference.

                                                                Isn’t the community reaction itself a sign […]

                                                                You mention the “community” a lot in your posts—why do you consider only those who oppose fediverse search to be a part of it? What about the rest of us fedizens who actually want it? Why is their reaction the only one to be considered?

                                                                1. 3

                                                                  I mean their reaction is the one I would give weight to because they’re the ones saying “I don’t want you doing this with my data”, while you’re asking “what about my desire to have their data used this way?”

                                                                  Yes I am saying the community a lot because from my perspective the overwhelming majority of people do not want this. The numbers on the searchtodon post are impressive for the fediverse, but I saw a ton of posts from instance admins speaking out against it, so yeah, I think it’s fair to say the community in this case, because that is my view of the network.

                                                                  And sure fine do what you want, just be prepared to be instance blocked, and make sure your instance admin knows you’re going to use a service like this, because if people who don’t want you using their data like this find out that you’re using their data, they are going to instance block you. I’m not saying it’s right or wrong, and it is something the network is prepared to accept and designed for, I’m just saying this is a known consequence, and a lot of instance admins aren’t willing to be cut off from that chunk of the network for that reason.

                                                                  At the end of the day, you’re free to do what you want with the fediverse and things published in the activitypub format. A ton of fascists have setup their own mastodon (or other ap) servers, they also have been disconnected from the greater network because nobody wants to deal with them. The software is open source and the API is public and well defined, do what you want, be ready to be locked out of parts of the network for it.

                                                                  This is why I keep saying people are trying to get around the social problem with a technical solution. Yes, you can do all these things. But here are the known consequences of doing it. And no technical solution can get around that.

                                                                  1. 3

                                                                    Yes I am saying the community a lot because from my perspective the overwhelming majority of people do not want this. The numbers on the searchtodon post are impressive for the fediverse, but I saw a ton of posts from instance admins speaking out against it, so yeah, I think it’s fair to say the community in this case, because that is my view of the network.

                                                                    This is not really how it works, for two reasons:

                                                                    1. Feedback has a tendency to be largely negative regardless of the actual average feelings of people, largely because it’s only those who have strong negative feelings about something who will be moved to make sufficient noise to get noticed. People who think a search function would be mildly useful might never post about it; people who think it would be really useful might make a single post. People who think it’s evil and must be fought to the bitter end will dedicate hours or days or weeks to nonstop angry posting about it. I had to unfollow someone I knew and liked IRL because they had basically turned my timeline into a sewer with the constant angry boosts and angry posts and angry threads over the search thing.
                                                                    2. The Fediverse is often more of an echo chamber than other social networks, precisely because it only shows you things from people you’ve chosen to follow and/or actively seek out. Which means that if many people in your social circle are against a thing, all it tells you is that many people in your social circle are against the thing. It tells you nothing whatsoever about what the broader “Fediverse community” thinks or feels.

                                                                    And sure fine do what you want, just be prepared to be instance blocked, and make sure your instance admin knows you’re going to use a service like this, because if people who don’t want you using their data like this find out that you’re using their data, they are going to instance block you.

                                                                    Having read quite a bit of the search drama, and also being an occasional reader of #fediblock and having seen some earlier kerfuffles like the CISA thing, I think the likelier outcome is that a relatively small number of instances are going to increasingly isolate themselves through their own aggressive defederation policies, helped along by other instance admins just getting tired of dealing with them.

                                                                    At any rate, I do not think you have sufficiently established that you do or can speak on behalf of some sort of “fediverse community”, or a majority or plurality thereof, so please stop doing so.

                                                            2. 9

                                                              There’s a difference between “here’s this thing, and if you happen across it organically that’s cool” and “here’s this thing, I want you to slurp it up into your indexing service”, and that’s where they’re coming from

                                                              I’d say that’s the difference between the public and unlisted post scopes.

                                                              Just because the protocol itself doesn’t provide explicit protections you’re talking about

                                                              But it gives guidelines, with the post scopes as above.

                                                              For what it’s worth, this was the most reasonable search engine that came out from all of the twitter immigrants. And yet some extreme people still bullied it off. It will only take one developer who doesn’t have the best intentions to stand their ground with a way more aggressive indexing strategy than the one Searchtodon used to ruin it for everyone.

                                                              1. 2

                                                                Oh I fully agree this is the most reasonable system that’s been introduced so far, but it’s equally clear that even the way this was handled is unacceptable to the community.

                                                                I do however strongly disagree with your framing of “bullied it off”. The greater fediverse community sees tools like this as an existential threat, and they’ve been very clear about it. And yet similar scraping/indexing projects continue to pop up without really talking through their idea with the community, or figuring out how to work with it. Instead they’ve all been announced as “hey, here’s this tool that’s going to slurp up your data, you’re welcome”.

                                                                And to be clear, I really do think searchtodon did a good deal of homework on figuring out how the community would feel, and did look into the criticisms of past attempts and tried to address their concerns. But they also didn’t spread their idea out and solicit feedback until it was already done.

                                                                As for “one developer who [will …] stand their ground”, I don’t think any of them are truly going to be prepared for the mountain of legal paperwork they’re going to encounter if they don’t back down. Namely, the same people blasting these services in posts right now, and who are covered by GDPR, CCPA, and any privacy laws going into effect in several other states too, are going to be filing data access requests and data deletion requests, and will be applying legal pressure as well. It’s going to take a fairly large company to be able to weather that, and at that point, only a handful would be willing to keep going rather than fold and tear down the service.

                                                                1. 7

                                                                  As for “one developer who [will …] stand their ground”, I don’t think any of them are truly going to be prepared for the mountain of legal paperwork they’re going to encounter if they don’t back down. Namely, the same people blasting these services in posts right now, and who are covered by GDPR, CCPA, and any privacy laws going into effect in several other states too, are going to be filing data access requests and data deletion requests, and will be applying legal pressure as well.

                                                                  I think it is far more likely that this would turn into asymmetric warfare – Fediverse search and discovery has enough money behind it that somebody’s going to handle the initial “storm” of GDPR/CCPA/etc. attempts and come out the other side with a product. Meanwhile, if people do try to weaponize such things against a search/discovery service, it opens the door for it to be weaponized in the other direction against the, frankly, mostly under-resourced “indie” instance admins who have been most strongly against search/discovery features. They are the ones who are overwhelmingly more likely to fold, along with their instances, when legal paperwork starts coming in from angry strangers.

                                                                  So as righteous and tempting as it sounds, I think the net effect would be the opposite of what’s desired.

                                                                  Also:

                                                                  The greater fediverse community sees tools like this as an existential threat, they’ve been very clear about it.

                                                                  I have seen a number of instance admins who treat it this way, and who claim to speak on behalf of some large majority of all Fediverse admins/users. I have not yet seen evidence that they actually do speak on behalf of such a majority, or that “the greater fediverse community” is an accurate label for them.

                                                                  1. 2

                                                                    Fediverse search and discovery has enough money behind it that somebody’s going to handle the initial “storm” of GDPR/CCPA/etc. attempts and come out the other side with a product

                                                                    does it? from who? and what strategy do they have for monetizing this product with an increasingly hostile user base?

                                                                    This whole “search has backing” line is never able to answer the question of who’s doing this and why they’re doing it. Because so far it’s been entirely people who recently left twitter, without any corporate backing. Clearly the problem is not a lack of resources if some larger entity wanted to do this, because multiple developers have shown this is something they can hack together in a couple weeks in their spare time.

                                                                    Hell, if search has backing why haven’t they gone to any of the larger instances and tried to work with them specifically with the goal of using their data as a starting point?

                                                                    There seems to be this view, especially from people in the tech industry, that just because twitter did something one way, it is inevitable that the fediverse will do it too. But the problem is that twitter did a lot of things simply because they were trying to figure out how to make money at VC required return rates. This is not a problem most server administrators have. There is no forcing function of “we need to show more people more content because it means we can serve more ads”.

                                                                    1. 8

                                                                      does it? from who? and what strategy do they have for monetizing this product with an increasingly hostile user base?

                                                                      Several people who’ve attempted to build “thoughtful” search/discovery have been clear that they are aware of efforts to build much less “thoughtful” versions, at least some of which are alleged to already be indexing, if not yet publicly offering search functionality.

                                                                      Also I would ask who is the “increasingly hostile user base”? What I saw of the last round of this was basically a lot of “normies” who were like “oh cool, search would be a useful thing to have”, and a handful of instance admins who were prepared to harass and abuse anyone right off the internet for even daring to suggest building such a thing. I don’t doubt that those admins sincerely hold their beliefs, but I do very much strongly doubt that they speak for a majority, or even a significant minority, of Fediverse users. And mostly when I’ve tried to explain the conflict to people who weren’t familiar with it I’ve been bombarded with questions about “Wait, so they want to post publicly, but also not have it show up publicly?” Which seems to be the way the aforesaid “normies” view the whole thing.

                                                                      There seems to be this view, especially from people in the tech industry, that just because twitter did something one way, it is inevitable that the fediverse will do it too.

                                                                      One of the things people will use any social media platform for is talking about current events and other topics that interest them. One of the things people will want, as part of that, is the ability to find, connect, and interact with others who are talking about the same events and topics. The expected user-experience solution to that is a search box into which you can put your terms and get back relevant results. Twitter did not invent that, nor is Twitter the only social media platform nor service in general ever to have such search functionality, and there is nothing whatsoever Twitter-specific or “Twitter did it that way” about wanting to have a search box. And Mastodon at least already has limited search functionality; what’s missing is the ability to usefully search beyond one’s local instance for arbitrary terms (rather than for specific handles or hashtags, which currently can be searched for).

                                                                      So, effectively, trying to prevent search and discovery from happening is saying “Attention all social humans! Stop being human and social! By Order Of The Admins!” This is not going to work; somebody’s going to build it, and right now everybody who tries to build it thoughtfully is being harassed into oblivion, so unfortunately that means it’s going to be built by non-thoughtful people who don’t care about being harassed over it.

                                                                  2. 6

                                                                    I am not sure where you find this greater fediverse community. The bubble I’m in seemed absolutely fine with searchtodon. Even the ones that were really angry at previous attempts essentially summed up their thoughts as “ah, this seems reasonable, why not?”. I honestly feel like this is a loud minority of people bullying others off. It happened maybe 10 times already, even before twitter’s shenanigans. I don’t think this is healthy for the network and only drives curious people who want to improve it away.

                                                                    I honestly don’t know what searchtodon could’ve changed from its initial plan that wouldn’t have made it completely useless. There are just some people who don’t want search, period, and they react violently to anyone trying to work on it. I’ve seen threats against developers’ lives from them on some previous occasions. I don’t know what else to call it but bullying.

                                                                    As for a developer needing to fight legal threats: they can just release the source code. There are plenty of actors with semi-malicious intentions who would run such software – enough that trying to take them all down would be foolish to attempt. After one goes down, another would pop up. So honestly, I’d rather have a reasonable search that some don’t like than a malicious one that everyone hates.

                                                                2. 8

                                                                  You may find the berrypicking paper to be a good starting point for understanding search and indexing as tools for humans doing research.

                                                                  If anyone is denying reality I think it’s everybody creating the indexers.

                                                                  I don’t know if you recall the AOL era of the early 90s. At first, the idea was that all content would be curated, organized, and available via keyword search. (A keyword is kind of like a Mastodon hashtag.) However, the advent of full-text indexing led to today’s modern reputation-ranked full-text search interface. With the hindsight of history, I think that you have it exactly backwards: if anybody is denying reality, it’s the folks publishing their data publicly and then politely asking not to be indexed.

                                                                  1. 3

                                                                    I mean look, heed the warning of every index-related project that’s come before, or don’t. But be prepared for a wave of people to do everything in their power to stop you and limit you. You’re approaching this as a technical problem to be solved, when it’s not, it’s a social problem, and the society in question has said “this is unacceptable, we will not permit this”.

                                                                    If this problem truly was only technical, we wouldn’t have seen multiple fediverse index related projects collapse due to community pressure this week, let alone over the past several years.

                                                                    1. 6

                                                                      If record labels, scientific journals, video-game publishers, and even governments cannot stop proliferation of (meta)data, then I am genuinely unable to understand what structural differences protect the Fediverse from similar proliferation. Information tends to be a free good simply because it is not scarce, and attempts at artificial scarcity of data are incongruous with information theory.

                                                              1. 13

                                                                This means the difference between 2022.08.1 and 2022.09.2 is impossible to tell. Was 2022.08.1 the second release in August, or a hotfix for a break in 2022.08.0? Which release is more stable? Should I expect them to be compatible? There just isn’t any information here to make a decision.

                                                                As I understand it, the pro-CalVer argument is that SemVer and other “compatibility semantics as part of the version number” approaches are basically wishful thinking – that there’s no way to actually do it and satisfy all or even most consumers, since consumers will have varying ideas and expectations about those semantics. So, the argument goes, better to use something that explicitly cannot contain compatibility semantics; this way everyone knows they have to go to the changelog every time, rather than make potentially unfounded assumptions based solely on the version number.

                                                                So the CalVer supporter’s reply to the above quote would be “yes, that’s deliberate and is in fact the whole point of using a version scheme like CalVer”.
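To make that concrete, a CalVer string like 2022.08.1 decomposes into nothing but a date plus a serial number, so the only question a program can answer from it is “which release came later?” Here is a tiny illustrative parser (my own sketch, not any official CalVer tooling; real projects vary in which fields they use):

```python
def parse_calver(v):
    """Split a 'YYYY.0M.MICRO' CalVer string into comparable integers.

    Illustrative only: the tuple orders releases chronologically,
    and deliberately says nothing about compatibility.
    """
    year, month, micro = v.split(".")
    return (int(year), int(month), int(micro))

# Ordering is all you get; "hotfix or second release?" is unanswerable.
print(parse_calver("2022.09.2") > parse_calver("2022.08.1"))  # True
```

The comparison works, but whether 2022.08.1 was a feature release or an emergency fix is simply not encoded, which is exactly the “go read the changelog” behavior the scheme intends.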

                                                                1. 9

                                                                  consumers will have varying ideas and expectations about those semantics

                                                                  I don’t buy this. It is true as an academic argument, but on a practical level SemVer works very well. It doesn’t have to be perfect, it has to be good enough.

                                                                  1. 7

                                                                    It is true as an academic argument, but on a practical level SemVer works very well.

                                                                    The “scandal” of the Python cryptography package rewriting its compiled extension from C to Rust is my go-to example of SemVer expectation problems. The package wasn’t even using or advertising SemVer, and the rewrite of its compiled extension did not break or change any of the package’s documented public API (so would not have been a SemVer violation even if the package were trying to follow SemVer), but it was still widely condemned as an unacceptable compatibility break and a violation of consumer expectations.

                                                                    So this is very much a practical and not an “academic” thing.

                                                                    1. 10

                                                                      Was it widely condemned?

                                                                      I got the impression that it was condemned by a very small but very loud minority - people who cared about alpha mainframe architectures possibly?

                                                                      I didn’t follow the situation super closely though, so maybe there was more of a universal condemnation than I picked up.

                                                                      1. 2

                                                                        I remember far fewer voices of reason in that whole situation. Some of it was, I think, driven by the choice of Rust specifically (for some people that’s practically a culture-war target nowadays), but I would not have wanted to be a cryptography maintainer as that was going on.

                                                                        1. 3

                                                                          I would not have wanted to be a cryptography maintainer as that was going on

                                                                          Me neither, but perhaps this is more reflective of the community than of the versioning scheme used?

                                                                          Edit: I should be clear that I’m not part of that community.

                                                                      2. 3

                                                                        Works very well does not mean perfect. I don’t see any incompatibility here.

                                                                        1. 2

                                                                          I’m not asking for “perfect”, though. And I don’t think “works very well” is how I’d describe the situation I mentioned.

                                                                          1. 3

                                                                            I’m not applying it to that situation. That seems like one place it may have broken down.

                                                                            When I say works very well, I’m talking about my experience of how it has made it possible to keep the many dependencies of many applications up to date with very little breakage. I think of it as a hint, or a strong signal, not an absolute guarantee. The alternative is having no signal and adding a larger burden on every user of the dependencies.

                                                                      3. 3

                                                                        My problem with SemVer is that, due to the “spacebar heating” effect, pretty much anything can be a breaking change, so what the author considers an actually breaking change is highly subjective.

                                                                        1. 2

                                                                          This may be true for applications, where the set of interfaces has to be clearly defined to be able to define breakage, but for libraries, breaking changes are really easy to identify. You bump the patch version if the change did not affect the ABI, the minor version if you only added to the interface (not affecting all applications linked against a smaller minor version) and the major version if you changed the interface (i.e. changed existing methods or data structure).

                                                                          Regarding applications, I like the approach to build even those as libraries and the GUI as a thin layer on top. This makes it easy to write different interfaces to one single program (e.g. a GUI, a TUI, a web interface and maybe even a local REST-API or other form of IPC at the same time). Then you can apply the library rules and resolve this conflict. :)
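A minimal sketch of the library rules above, in Python (the function name and the signatures-as-strings representation are my own illustration, not any real tool): given the public interface before and after a change, it picks the bump those rules prescribe.

```python
def bump_kind(old_api, new_api):
    """Pick a SemVer bump from the simplified library rules above.

    old_api / new_api map public symbol names to their signatures
    (represented here as plain strings for illustration).
    """
    # A changed or removed symbol breaks existing callers: major bump.
    if any(name not in new_api or new_api[name] != sig
           for name, sig in old_api.items()):
        return "major"
    # Purely additive changes keep old callers working: minor bump.
    if any(name not in old_api for name in new_api):
        return "minor"
    # Interface untouched: patch bump.
    return "patch"

print(bump_kind({"open": "(path)"}, {"open": "(path)", "close": "(fd)"}))  # minor
print(bump_kind({"open": "(path)"}, {"open": "(path, mode)"}))             # major
```

Note that, as the replies point out, a purely semantic break (same names, same signatures, different behavior) would sail right through a diff like this, which is where the simple rules stop being sufficient.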

                                                                          1. 2

                                                                            The library rules seem too simple. What about backwards-incompatible semantic changes that don’t touch the interface?

                                                                            1. 0

                                                                              This naturally also constitutes a major version bump, which can be trivially deduced from the rules I gave.

                                                                              1. 2

                                                                                Would you consider the removal of spacebar heating a breaking change?

                                                                                1. 1

                                                                                  No, because it wasn’t specified in the documentation.

                                                                          2. 2

                                                                            There’s a weird corollary of this where sometimes libraries that are actually good at maintaining backwards compatibility in practice end up with really large major version numbers. Their authors care a lot about backwards compatibility, so they bump major any time they change anything that could possibly be depended on. This includes stuff that almost certainly no real API consumer cares about, like the joke about spacebar heating. As a result they end up on version “27.0.0” after a few years. Meanwhile consumers originally written against their “0.1.0” prerelease have yet to get broken.

                                                                            1. 1

                                                                              SemVer also signals intent as to the scope of the change. What else, besides always reading the change notes (if there are any) or just trying it out, do you suggest?

                                                                              1. 1

                                                                                Assume all changes are breaking and invest in good test automation.

                                                                        1. 5

                                                                          This thread appears to be a response from someone who has significant experience in the space.

                                                                          A key quote:

                                                                          This only scratches the surface of the technical complexity here. The reason there are so many tools for managing Python dependencies is because Python is not a monoculture and different folks need different things.

                                                                          For the record, I agree with the linked thread and believe that people who post this sort of “fix Python packaging by just doing this” article are unhelpfully naïve at best.

                                                                          1. 8

                                                                            Another interesting quote is

                                                                            You gotta deal with the Python Core team and the steering council. They have consistently abdicated the details of packaging to the community. They aren’t, at this time, very interested in taking over packaging and telling the community how to manage their dependencies.

                                                                            That’s useful to contrast with Rust, where it’s a usual talking point how early on Mozilla very deliberately hired bundler’s creator to implement a package manager for Rust.

                                                                            1. 1

                                                                              I agree with most of that thread, but I will pile one more hot take on top of this part of it:

                                                                              So you want to fix Python packaging: you fucking can’t. get lost.

                                                                              You also probably don’t fucking need to.

                                                                              In the past 23ish years of using python, it has only occasionally been a headache for me. And most of those occasions pre-dated virtual environments. Compared to perl, bash, java, php, ruby, or C++, using other people’s software in my python programs has been, broadly, a cakewalk. Distributing my C++ extensions has occasionally been a headache, but no more so than distributing C++ libraries for people to use from C++ programs.

                                                                              Once I learned my lessons in the early 2000s about not touching the system python, the situation has only occasionally been a hassle.

                                                                              And these days, a few minutes’ fiddling with pip-tools or poetry is almost always the worst of it. At least on Linux or Linux-adjacent things.

                                                                              As thorny as some corners of the ecosystem can be, I’ll take it over node. And CPAN. And ruby gems. And pkg-config. And Conan. And maven. And likely a pile of other things I’ve had to touch for varying amounts of time since the late 90s.

                                                                            1. 22

                                                                              The part about how XHTML failed to take root and how web devs were stupid/lazy for not wanting to produce syntactically valid XML feels a bit too hindsighty.

                                                                              It wasn’t until HTML5 that browsers actually agreed on how to deal with malformed HTML. The early HTML standard also played fast and loose, and constructs like

                                                                                   <b>foo <i>bar</b> baz</i>
                                                                              

                                                                              were commonplace, as was the use of <p> as a paragraph break rather than a paragraph wrapper.
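One way to see why this was painful: a plain tokenizer, such as Python’s standard-library html.parser (my choice of tool here, purely for illustration), will happily accept the mis-nested markup above and just hand the out-of-order end tags to the caller. Deciding how to repair the tree was left to each consumer, which is exactly where browsers diverged before HTML5 pinned the algorithm down.

```python
from html.parser import HTMLParser

class EventLogger(HTMLParser):
    """Record the raw start/end/data events the parser emits."""
    def __init__(self):
        super().__init__()
        self.events = []
    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))
    def handle_endtag(self, tag):
        self.events.append(("end", tag))
    def handle_data(self, data):
        self.events.append(("data", data))

p = EventLogger()
p.feed("<b>foo <i>bar</b> baz</i>")
# The </b> arrives while <i> is still open; the parser reports it
# as-is and leaves the mis-nesting for the caller to resolve.
print(p.events)
```

No error, no repair: building a sensible tree out of that event stream is the consumer’s problem, and every pre-HTML5 consumer solved it slightly differently.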

                                                                              It’s easy now to say everything should’ve been an AST from the start, and throw shade at web people, but it was a much harder sell when there was way more legacy code and content around that simply wasn’t well formed.

                                                                              OP makes the point that custom escapers need to match what browsers do, but that assumes browsers even agree. In the heyday of IE6-8, it was totally common to use subtle differences in CSS parsing and interpretation to write, e.g., a selector that targeted one browser version while remaining inert on others.

                                                                              Yeah it was icky, but there was a reason string concatenation was the common practice. You had little other choice. And it wasn’t the fault of yolo’ing PHP devs, but Microsoft, one of the biggest software houses around.

                                                                              On top of this, XHTML made the fundamental mistake of trying to dictate a standard without offering any compelling reason to switch to it. All it added was busywork. HTML5 instead made the saner choice of not using a full blown XML wrapper, and was better for it, while offering actual improvements people wanted.

                                                                              1. 15

                                                                                While I agree that for most people the benefits of XHTML were not exactly exciting, I wouldn’t say they were nonexistent. Being able to process HTML with a standard XML parser, rather than having to find things that can parse HTML specifically, is a benefit. Being able to apply standard XML transformation tools is a benefit. Though I would argue the most significant benefit is namespacing, which is a feature of XHTML I use even today.

                                                                                The lack of namespacing in HTML has resulted in a lot of crude hacks, like the data- prefix, which I feel is pretty unfortunate. HTML5 ultimately adds SVG support simply by the fiat of saying that <svg> is SVG, but there’s not really any clean way to integrate new schemas. I think this actually significantly undermines many of the uses of HTML/XML; namespacing allows you to freely annotate documents however you like while keeping them still readable by existing browsers.
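As a small illustration of what namespacing bought you (the m: vocabulary and its URI here are invented for the example): a custom schema can ride along inside an XHTML document, and any standard XML parser, such as Python’s ElementTree, can pull it back out with no HTML-specific handling.

```python
import xml.etree.ElementTree as ET

# A minimal XHTML fragment annotated with a made-up "meta" vocabulary.
doc = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:m="http://example.com/meta">
  <body>
    <p m:topic="versioning">Some annotated paragraph.</p>
  </body>
</html>"""

root = ET.fromstring(doc)
ns = {"x": "http://www.w3.org/1999/xhtml"}
p = root.find(".//x:p", ns)
# Namespaced attributes come back in Clark notation: {uri}localname.
print(p.get("{http://example.com/meta}topic"))  # versioning
```

A browser unaware of the m: namespace would simply ignore the attribute and render the paragraph, which is the “annotate freely, degrade gracefully” property the data- prefix only crudely approximates.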

                                                                                Also, while ultimately there was this negative reaction to XHTML, honestly that came relatively late. I remember the heyday of alistapart.com and CSS Zen Garden where, frankly, there was a very positive ethos to web design. People cared about the idea of the semantic web, of using the semantics of the various HTML tags correctly, of not using tables for layout, of using CSS properly, and using XHTML. People would proudly put those W3C ‘valid XHTML’ badges on their site (remember those?). It wasn’t that uncommon for people to talk about how they were annoyed they couldn’t serve their pages as application/xhtml+xml yet because IE didn’t support it. So it seems to me there was actually a fair bit of enthusiasm and good faith engagement at first. To my recollection, the ‘rejection’ of XHTML really came with the HTML5 movement, which above all else seemed quite focused on the web as an application platform and less as the semantic hypertext platform envisaged by the W3C.

                                                                                1. 19

                                                                                  XHTML was a mess, and I think you have rose-tinted glasses/nostalgia looking back on it. I was there, active in web standards communities in the early to mid 2000s, and I recall it very differently.

                                                                                  Evan Goer’s “XHTML 100” experiment is still online and is a stark reminder of what things really were like back then. For those unfamiliar: in 2003, he picked a good-sized sample of websites run by “alpha geek” web designers/developers, people who advocated for and presumably understood the state of the art at the time. And he ran three simple tests on their sites: (1) does the home page validate as XHTML; (2) do three secondary pages validate as XHTML; (3) does the site serve the XHTML content-type to user agents which accept it? 74% of sites – and remember, these are the personal sites of “alpha geek” folks who are supposed to know what they’re doing! – failed the first test. Only one, out of 119 in the starting sample, passed all three tests.

                                                                                  And that’s kind of where XHTML was for much of its early days; something that people were weirdly enthusiastic about and pushing heavily, despite largely not doing it correctly (and getting away with it because browsers were still rendering their sites with forgiving HTML parsers).

                                                                                  Then people like Mark Pilgrim started pointing out deeper problems with XHTML – there are whole labyrinthine tangles of multi-spec interactions you can get into when you serve XHTML-as-XML-over-HTTP, like a sequence of bytes that is a well-formed XHTML document if served as application/xhtml+xml but not if served as text/xml, or issues of validating versus non-validating parsers, or issues of which things are CDATA in HTML 4 but PCDATA in XHTML, or the different DOM APIs depending on whether a document was parsed as HTML or XHTML… and all for what? For what benefit? The go-to demo was always MathML, but people found ways to do math without bringing on the pain of going full XML.

                                                                                  The thing that killed XHTML, finally, was W3C’s architecture astronautics around updates; XHTML 1.1 already went in that direction, and the ultimate failure of XHTML 2 just kind of sealed it. They had become, effectively, completely disconnected from both browser vendors and web authors, and were off in their own little bubble doing stuff that they thought was elegant, and everybody else got tired of saying “no, what we need is something that’s practical”, and went off and formed WHATWG and the rest is history.

                                                                                  1. 5
                                                                                    • Having to deal with long, convoluted namespaces and other strict XML aspects was terrible. And there were multiple doctypes, as I recall, strict vs. transitional, which was a bad idea. And the idea that you can just embed foreign content into a page is by itself a bit absurd. It’s pointless unless it’s supported by every client in use.

                                                                                    • Semantic HTML was, sorry to say it, mostly a cult. And the A List Apart people were part of that. They came up with absurd CSS hacks just to avoid having to add a wrapper div or span, out of a misguided sense of purity. In HTML5 it’s really no better: you should look up the guidance for e.g. the article tag. It basically guarantees nobody is ever going to do anything useful by looking for that tag.

                                                                                    • XML tooling is awful. Nobody sane uses XSLT, and everyone switched their APIs to JSON the minute it became commonly supported.

                                                                                  2. 7

                                                                                    It wasn’t until HTML5 that browsers actually agreed on how to deal with malformed HTML.

                                                                                    If you construct the DOM (or create valid XML some other way) on the server, browser mistakes while processing invalid documents do not matter.

                                                                                    The problem was the organic growth of the web technologies – XHTML arrived too late, when everyone was already used to producing garbage („it will be displayed somehow anyways“) and to consuming garbage („there are lots of invalid documents on the web and I want to see them, so I need a browser that displays random tag soups“). On the other hand, that organic growth was also one of the web’s big opportunities, and the web took full advantage of it.

                                                                                    P.S. What if we allowed invalid ELF or other executables and invented some heuristics to run them approximately how they were intended to run? We could find functions/calls with similar names, or skip missing ones, and it would run somehow. It would lead to the existence of a lot of garbage executable files; people would get used to it and would not bother to create valid binaries. And then it would be really difficult to fix this state.

                                                                                    1. 5

                                                                                      P.S. What if we allowed invalid ELF or other executables and invented some heuristics to run them approximately how they were intended to run? We could find functions/calls with similar names, or skip missing ones, and it would run somehow. It would lead to the existence of a lot of garbage executable files; people would get used to it and would not bother to create valid binaries. And then it would be really difficult to fix this state.

                                                                                      That’s not super far away from a reasonable description of some of the compatibility hacks lurking in Windows. Those are API-level things as opposed to executable format things, but that kind of chicanery certainly occurred. The slope wasn’t as slippery as you suggest, but it certainly left a mess for quite some time.

                                                                                      1. 1

                                                                                        It would lead to the existence of a lot of garbage executable files; people would get used to it and would not bother to create valid binaries. And then it would be really difficult to fix this state.

                                                                                        Correct. And XHTML was a misguided way to try and fix that.

                                                                                      2. 5

                                                                                        On top of this, XHTML made the fundamental mistake of trying to dictate a standard without offering any compelling reason to switch to it. All it added was busywork.

                                                                                        XHTML was about XML data. It was about treating web documents as data, and potentially even evaluating them under other document/data semantics using schemas. All of our existing tools to work with XML and XML Schemas were thus enabled through this unification, and presumably we’d have had a world where we’re returning XML from our web APIs instead of JSON, and that XML could be data for someone to process with a consumer app, or could be a document to be marked up and rendered via a user agent. Instead we live in a world where we return HTML or JSON, rather than one unified form of data.

                                                                                        1. 3

                                                                                          It wasn’t until HTML5 that browsers actually agreed on how to deal with malformed HTML. The early HTML standard also played fast and loose, and constructs like

                                                                                          <b>foo <i>bar</b> baz</i>

                                                                                          That’s not what I recall. Browsers would (usually) render it correctly, but HTML 1.0 was a dialect of SGML, which was quite explicit that this was not permitted.

                                                                                          1. 3

                                                                                            HTML was modeled on (or inspired by) SGML, but it has never been a dialect of it. In much the same way that some English grammarians like to claim that English is derived from Latin and so should imitate Latin habits, the W3C wrote into the HTML 2.0 standard (and perhaps earlier formal ones) that HTML was derived from SGML without it actually being so. Since early W3C standards were essentially documenting existing browser practice instead of defining new practices, this redefinition was as ineffective as you would expect. Browsers paid no attention to the claim that HTML was SGML based and so should support various SGML features or behave in (incompatible with existing practice) SGML-ish ways. HTML5, which was set up to document existing practices, finally put a stake in this by saying explicitly that HTML was a custom format merely inspired by SGML.

                                                                                            (You can read a polite version of this in the Parsing HTML documents section of the HTML(5) standard. I have some feelings about this whole area because I was there at the height of the XHTML wars, and not on the XHTML side.)

                                                                                            1. 1

                                                                                              I first learned HTML back in the 1.0 days, when the img tag was not supported by all browsers, and the example that you gave was in every HTML tutorial as an example of something that might work but was ill-formed. There were a lot more HTML parsers back when you could write one in a few hundred lines of code, and they did not all handle that example consistently.

                                                                                        1. 9

                                                                                          I don’t dislike make, and even use it quite a lot, both at work and outside, but let’s be realistic here:

                                                                                          It’s already available everywhere.

                                                                                          No, it’s not, not on Windows, where even if you have something like Git Bash, it won’t have make. And even after installing it, using it on Windows is extremely annoying, especially if you want to use PowerShell.

                                                                                          Anyone telling me to “well, don’t use Windows” will be cursed with 10 years of intermittent production bugs that don’t leave stack traces.

                                                                                          It’s fast.

                                                                                          That much is true.

                                                                                          It’s language-agnostic

                                                                                          Ish. It’s not shell agnostic, at least not enough to support something like PowerShell, which means your target rules still need to be in shell script. Which sucks as soon as anything more complex is needed.

                                                                                          It’s simple (…)

                                                                                          LoL. Maybe there’s a BSD/POSIX make that is simple, but, again, realistically, everyone uses GNU make, and that is not simple. It certainly doesn’t make it simple to do most of the things you might want to do with it if you’re not compiling C, which is my case, and the article’s case.

                                                                                          Make is one of those things that when you stop to really look at it, is not actually good for most of the things we hammer it into. It’s just less bad and less inconvenient than all other alternatives.

                                                                                          1. 7

                                                                                            To be fair, he says make is a bad choice if you need to use Windows.

                                                                                            1. 3

                                                                                              That’s what I get for skimming instead of reading with attention =P

                                                                                            2. 4

                                                                                              No, it’s not, not on Windows […] It’s not shell agnostic, at least not enough to support something like PowerShell, which means your target rules still need to be in shell script. Which sucks as soon as anything more complex is needed.

                                                                                              Very true. This was one of the main reasons for us starting build2 – we needed Windows to be a first-class citizen. To achieve this we had to invent our own shell-like language (called Buildscript; while at it we’ve also fixed a couple of major POSIX shell annoyances, like having to quote variable expansions or not failing on errors by default). Here is an example of a rule written in Buildscript. It works the same everywhere, including on Windows.

                                                                                              Having said that:

                                                                                              Anyone telling me to “well, don’t use Windows” will be cursed with 10 years of intermittent production bugs that don’t leave stack traces.

                                                                                              You mean the same as if you do use Windows? Supporting Windows in build2 was and is a constant source of pain and frustration. So if you can avoid using Windows, by all means do. You will save yourself a lot of grief.

                                                                                              1. 2

                                                                                                I’ll check that out, sounds interesting!

                                                                                                You mean the same as if you do use Windows

Hehe, I feel you. But yeah, I don’t love Windows, but I’m currently working at a bank and I don’t get much leeway in choosing technologies, hence the comment XD

                                                                                              2. 2

                                                                                                So, at several companies now I’ve worked with Makefile-driven setups, and even been one of the people responsible for building and maintaining them.

As a developer experience it is of course not perfect, but it’s also better than a lot of the alternatives. If what you want is a tool that lets you specify a set of tasks to execute, and have all the developers on the team be able to use it, then basically every criticism that can be leveled at make can also be leveled at other automation/task-orchestration tools. Lots of them are not fully cross-platform, or require extra installation/upkeep, or have limitations from being designed with a specific task domain in mind, etc.

                                                                                                As a lowest-common-denominator of “available” and “relatively easy to write the config for”, make does a pretty decent job.

                                                                                                Anyone telling me to “well, don’t use Windows” will be cursed with 10 years of intermittent production bugs that don’t leave stack traces.

                                                                                                Remember that we’re talking about web development here, so there aren’t Windows-specific builds – in my own case it’s all backend services that will deploy to containers. And the entire team is on various flavors of Unix-ish operating systems, with no developers on Windows as their daily driver OS. So this is a purely irrelevant objection.

                                                                                                1. 1

                                                                                                  So this is a purely irrelevant objection.

                                                                                                  Bit harsh. Bit of a fallacy, too: you or your team not using windows doesn’t mean no one does it, anywhere.

                                                                                                  1. 1

                                                                                                    Neither harsh nor fallacious. Your whole objection seems to be asking what to do about developers who are on Windows and builds that are broken on Windows. But the whole point of my comment is: there are teams out there who do not have “Windows builds” and who have no developers on Windows.

If I were building software that required OS-specific builds, or working with people who were on Windows, I’d choose a different task automator. But I’m not, so make poses no Windows-related problems to me.

                                                                                                    And the entire context of the main article is web applications, which are platform-independent, so the objection of a build breaking/failing on Windows is… irrelevant, because a web application does not have platform-specific builds like that.

                                                                                                    1. 1

I’m not sure why you started talking about broken builds? I didn’t mention them, and the makefile in the article has more .PHONY targets than file-dependent ones, so it’s using make as more of a task runner than a builder. Which is fine, I do that too.

                                                                                                      Your whole objection seems to be (…) builds that are broken on Windows.

                                                                                                      It isn’t, though? My objections were mostly that make isn’t really available everywhere, nor is it really simple.

                                                                                                      web applications, which are platform-independent

                                                                                                      The application can be as platform independent as you want, but unless you’re also doing all your development in a browser, your development environment isn’t. Your tools have to work with every platform that anyone in your team uses. And there are a lot of teams using Windows out there.

By the way, I did see (after writing the first comment, granted) that Windows is mentioned at the end of the article, so it seems like the article itself puts at least an asterisk on the whole “make is available everywhere” argument.

                                                                                                      1. 0

                                                                                                        I’m not sure why you started talking about broken builds? I didn’t mention it,

                                                                                                        You started off with:

                                                                                                        Anyone telling me to “well, don’t use Windows” will be cursed with 10 years of intermittent production bugs that don’t leave stack traces.

                                                                                                        That seems to be a pretty clear claim about builds that are broken on Windows. Which, for the third(?) time now, isn’t a thing with web applications because they don’t have platform-specific builds.

                                                                                                        Your tools have to work with every platform that anyone in your team uses. And there are a lot of teams using Windows out there.

                                                                                                        And there are equally a lot of teams not using Windows out there. Like the teams I work with. Like (presumably) the teams the author of the article works with. Dev tooling being non-portable to Windows is a non-issue if nobody on the dev team is using Windows. If a team doesn’t use Windows, it is OK for them to use some tools that are not portable to Windows. Really. I’m not sure why this seems to be such a difficult point to get across to you.

                                                                                                        1. 1

                                                                                                          You started off with (…)

That was a joke, completely independent from the main point. It’s not “clearly” about broken builds on Windows at all; it’s about the snarky responses that can come up when someone mentions using Windows.

                                                                                                          And there are equally a lot of teams not using Windows out there.

We’re talking in circles here. I’m not saying there’s someone using Windows on every single development team in the known universe; I’m saying “some people use Windows”.

I don’t know why you think answering “some people use Windows” with “some people don’t!” is some kind of argument that proves I’m completely wrong? It’s not even a disagreement; it’s a logical conclusion: if only “some” people use Windows, necessarily there are people who don’t.

                                                                                                          (…) is a non-issue if nobody on the dev team is using Windows.

                                                                                                          You keep repeating this and I am out of ways to try to communicate that I am not talking specifically about your team, or any other you worked with in the past, I am talking in the general sense.

Can we agree to disagree? Because at the end of the day I’m pretty sure we’re both just gonna keep using make anyway, so it’s a pointless discussion.

                                                                                                          1. 1

                                                                                                            You argued pretty strongly against Makefiles on grounds of non-portability to Windows. All I’m pointing out is that in the particular field of programming the original article’s author was discussing, this is often a non-concern – in web development, you don’t need a Windows-specific build of your application, and the number of developers who use Windows as their daily operating system for doing dev work is extremely small.

                                                                                                            So basically you weren’t arguing against the article; you were arguing against a hypothetical alternate universe version where the article’s author worked in a field of programming where Windows support is much more important. And I said that’s not really relevant, because it isn’t relevant.

                                                                                              1. 22

                                                                                                To save people misunderstanding from just the title: this proposal would not remove or turn off the GIL by default. It would not let you selectively enable/remove the GIL. It would be a compile-time flag you could set when building a Python interpreter from source, and if used would cause some deeply invasive changes to the way the interpreter is built and run, which the PEP goes over in detail.

                                                                                                It also would mean that if you use any package with compiled extensions, you would need to obtain or build a version compiled specifically against the (different) ABI of a Python interpreter that was compiled without the GIL. And, as expected, the prototype is already a significant (~10%) performance regression on single-threaded code.

                                                                                                1. 6

It’s pretty clear this is for scientific computing applications, and no doubt the expectation is that the major scientific libraries will distribute no-GIL versions. I suspect that users who fall within the motivating use case will accept the slowdown in Python execution in order to benefit from the ability to exchange data in-process and use multiple cores.

                                                                                                  1. 3

The ABI break is a pretty big burden. It being a build option is certainly good, but any org that decides to onboard will have a lot of work on their hands.

                                                                                                  1. 7

An entire leap cycle of 400 years has 146097 days, which is 0 mod 7. Within those 400 years, each date ends up with a slight bias toward certain days of the week.

                                                                                                    1. 3

                                                                                                      365 mod 7 == 1 which means that if we have consecutive years of 365 days each, each date in them would “advance” by one weekday. Suppose that year n and year n + 1 are both 365-day years; if January 8 falls on a Sunday in year n, it will fall on a Monday in year n + 1.

                                                                                                      But a leap year causes each date to “advance” by two weekdays (366 mod 7 == 2). Or, more relevantly to this issue, a leap year causes each date to skip one weekday. There are 97 leap years in a 400-year leap cycle, so each date must do 97 such skips in each leap cycle. Since 97 mod 7 == 6, it’s not possible for these skips to be distributed evenly among all seven weekdays. Thus, it must be the case that the same date does not fall on each weekday an equal number of times per leap cycle.
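The arithmetic above is easy to check directly. A quick sketch in plain Python, assuming only the standard Gregorian leap rule:

```python
# Check the leap-cycle arithmetic: 97 leap years per 400-year cycle,
# 146097 total days, and 97 mod 7 == 6 so the "skips" can't spread
# evenly over the seven weekdays.
leap_years = sum(
    1 for y in range(400)
    if (y % 4 == 0 and y % 100 != 0) or y % 400 == 0
)
days = 400 * 365 + leap_years

assert leap_years == 97      # leap years per cycle
assert days == 146097        # days per cycle
assert days % 7 == 0         # the cycle restarts on the same weekday
assert leap_years % 7 == 6   # uneven distribution of weekday skips
```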

                                                                                                    1. 2

                                                                                                      I suppose the title should be “has happened.” I’m very stale on stats, but maybe if you ran it out to, say, 3000 it might be different.

                                                                                                      1. 3

                                                                                                        It is, but Sunday is still the top spot, and Monday is still the bottom spot.

                                                                                                        Sunday	206
                                                                                                        Tuesday	206
                                                                                                        Friday	205
                                                                                                        Thursday	202
                                                                                                        Wednesday	202
                                                                                                        Saturday	199
                                                                                                        Monday	198
                                                                                                        
                                                                                                        1. 1

                                                                                                          If you want to just tinker manually with dates and see the frequency numbers for which weekdays they fall on, here’s a Python script. To calculate for January 8, run python weekday_frequency.py 1 8. Sub in other month/day arguments as you like.
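The linked script isn’t reproduced here, but a minimal version consistent with the invocation described above might look like this (the 400-year window starting at 2000 is an assumption, though any full Gregorian cycle gives the same totals):

```python
# weekday_frequency.py -- sketch of a script that counts which weekday
# a given month/day falls on across one full 400-year Gregorian cycle.
import sys
from datetime import date

def weekday_counts(month, day, start_year=2000, cycle=400):
    counts = {}
    for year in range(start_year, start_year + cycle):
        try:
            name = date(year, month, day).strftime("%A")
        except ValueError:
            continue  # e.g. Feb 29 in a non-leap year
        counts[name] = counts.get(name, 0) + 1
    return counts

if __name__ == "__main__":
    # Usage: python weekday_frequency.py MONTH DAY
    month, day = int(sys.argv[1]), int(sys.argv[2])
    for name, n in sorted(weekday_counts(month, day).items(),
                          key=lambda kv: -kv[1]):
        print(f"{name}\t{n}")
```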

                                                                                                        1. 7

I know it’s not a favorite among rustaceans, but if you care about compile times and short development cycles, this is where Go shines.

I would go as far as claiming that short compile times for larger projects are the whole point of using Go.

                                                                                                          1. 3

                                                                                                            FYI: for my own domain (backend web services), if I were going to switch to a statically-typed + compiled language, Go would still be very far down the list – the top choice would almost certainly be C#. The tooling and ecosystem are mature and fantastically rich, the performance is great (both compile-time and run-time), and the language has learned a lot both from predecessors and from itself.

                                                                                                            1. 1

                                                                                                              Using dependencies is much easier and faster with Go, though.

I’ve tried using SDL2 from 20 different programming languages. I really liked the C# version, but getting the SDL2 dependency to work was much harder than in Go. I ended up calling SDL2 directly from C# instead of using an intermediate library.

                                                                                                              https://github.com/xyproto/sdl2-examples

                                                                                                              Also, as far as I know, C# does not have the equivalent to defer in Go.

                                                                                                          1. 18

There’s a reason we stopped using CGI the way we did, though. You wouldn’t want to use escript for this, unless you’re happy to wait for BEAM startup on every single request. Then there’s the memory used per request, so a few slow requests can kill your server. And you lose in-process caching and any connection pooling too. I find this a bit annoying every time someone says that Lambda is just CGI these days…

                                                                                                            CGI has some basic uses, but these days, I can’t think of any reason I would use it instead of FastCGI or something similar.

                                                                                                            1. 3

                                                                                                              CGI has some basic uses, but these days, I can’t think of any reason I would use it instead of FastCGI or something similar.

                                                                                                              One of the nicest things is if you write a program in such a way it works with basic cgi, it will also work in almost any other deploy system. You can transparently switch it to fastcgi or scgi or an embedded http server since the handler, by definition, will not rely on any of that cross-request in-process stuff (though you can still provide it if you want as a bonus, of course). There’s a lot of benefit in this - you can also do horizontal scaling with the model quite easily.

                                                                                                              a few slow requests can kill your server.

This tends to be the case on a lot of setups; CGI is fairly resilient to it, since you can always spawn more processes or kill individual requests from the outside.

                                                                                                              1. 2

                                                                                                                Exactly. The reason they listed CGI for Perl is because presumably Perl didn’t have anything better. CGI is very much “the simplest thing that could possibly work”, but it has terrible performance since it launches a new process to handle every request. It’s ok for a toy project, but not much more.

                                                                                                                1. 6

                                                                                                                  but it has terrible performance since it launches a new process to handle every request

                                                                                                                  We should be careful to distinguish between Unix processes and interpreters / VMs

                                                                                                                  A process in C / C++ / Rust should start in 1 ms or so, which is plenty fast for many sites.

The real problem is that Perl’s “successors” like Python and Ruby start an order of magnitude slower – 30 ms is probably a minimum, and it’s easy to get up to 300 ms once you import libraries.

                                                                                                                  node.js and Java have additional JIT startup time, and basically the JIT will not get warmed up at all.

A program that proves the point is cgit, which is a set of CGIs written in C, and you can see it’s plenty fast.

                                                                                                                  https://git.zx2c4.com/cgit/

                                                                                                                  (Not making any comment on the wisdom of CGIs in C here :) But they made it work )

                                                                                                                  1. 4

                                                                                                                    My blog is a CGI program in C and it can return a page in 0.01 seconds. I’ve never really had an issue with performance in the 23 years I’ve been using the program (and that’s even when I first started out using a 66MHz 32-bit x86 Linux server in 1999). I think establishing the TLS connection is more system intensive than my CGI program.

                                                                                                                    1. 2

                                                                                                                      It pains me to think of launching a process as “fast”, but yeah, you’re right about that. A ms isn’t unreasonable latency.

Of course even a C/Rust CGI will slow down if it has to do something like open a connection to a database server or parse a big config file.

                                                                                                                      FastCGI isn’t much harder to implement and saves you the startup overhead. (Although TBH I can’t recall whether it was FastCGI I used or a different similar protocol.)

                                                                                                                      1. 2

                                                                                                                        I think basically what happened is that everything else got so slow – Python/Ruby startup time, TLS, web pages do 100 network round trips now, etc. – that Unix processes are fast now :)

It’s basically like bash saying it’s “too big and too slow” in the man page written in the ’90s, but now it’s fast and small compared to modern programs.

                                                                                                                        Database connections are an issue, especially on shared hosting, but there seems to be this newfound love for sqlite which would almost eliminate that :)


                                                                                                                        I currently use FastCGI on Dreamhost, but my impression was that it was kind of a weird protocol without many implementations. It seems to be almost a PHP implementation technique – for some reason, it’s not widely used for Python / node.js / etc.

I actually had to download an ancient version of a FastCGI Python library to get it to work. It’s clear to me that very few people use it on shared hosting.

Which is a shame, because as mentioned in this thread, AWS Lambda is basically FastCGI – it has cold starts, and then it’s fast.

                                                                                                                        1. 4

                                                                                                                          FastCGI was a big deal ~2000, less so now. There used to be a lot of stuff using it. It still has currency in PHP-land because php-fpm is the only reasonable way to run PHP besides Apache mod_php, and people have various reasons for not wanting to run Apache, or not wanting to have mod_php in their Apache.

                                                                                                                        2. 1

For the uninitiated, could you break down the difference between CGI and FastCGI? I understand the flow of CGI: a request comes in, the program runs, and its stdout is sent back to the client.

But in FastCGI, are there persistent processes involved?

                                                                                                                          1. 2

                                                                                                                            Yeah, with fastcgi the launcher can spawn and kill worker processes as needed, and each worker process can handle an arbitrary number of requests in sequence. They might live a long time or a short time depending on load and configuration.

                                                                                                                            You can abstract it so a single program works as cgi or fastcgi pretty easily.

                                                                                                                            There’s also a scgi alternative, which aims to be a simplified version of fastcgi but is generally the same thing, just the communication protocol is a bit different.
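As a sketch of that abstraction using only Python’s standard library (wsgiref; the handler and its names are illustrative, not from any particular project):

```python
# One gateway-agnostic handler: deployable under plain CGI today, and
# under a persistent server (or a FastCGI/SCGI adapter) later, unchanged.
import os
from wsgiref.handlers import CGIHandler

def app(environ, start_response):
    # Relies only on per-request data, never on cross-request state,
    # which is what makes it portable across deploy models.
    start_response("200 OK", [("Content-Type", "text/plain")])
    body = "Hello from " + environ.get("PATH_INFO", "/") + "\n"
    return [body.encode()]

# Under classic CGI the web server sets GATEWAY_INTERFACE and a fresh
# process handles exactly one request:
if os.environ.get("GATEWAY_INTERFACE"):
    CGIHandler().run(app)
# The same `app` could instead be served by a persistent process, e.g.
# wsgiref.simple_server.make_server("", 8000, app).serve_forever(),
# with no changes to the handler itself.
```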

                                                                                                                            1. 1

So, simply more control instead of directly spawning a process on each visit. Awesome, gotcha!

                                                                                                                      2. 5

To be fair, that’s what “lambda”/Function-as-a-Service things (AWS) do. Worse, they bring up an entire virtual machine when a request is made. They seem to be used in production, albeit in very narrow workflows where the latency requirements aren’t tight.

                                                                                                                        1. 8

That’s exactly what I mentioned above… No, Lambda is not even close to CGI. Lambda spawns an environment on your first request and retains it for a predefined time, so that subsequent requests reuse it. Under enough load it will spawn more of those environments to keep responses quick. This allows for in-memory caching and preserved database connections. There’s also no “entire virtual machine”, but a thin implementation called Firecracker, which is not much slower to start than a standard process. It comes with lots of resource-usage controls and scaling options, which CGI itself has no concept of. If you’re fine with the first request in a batch taking over 100ms (which honestly most services are), you shouldn’t suffer from Lambda latency. (Officially listed as 150ms: https://firecracker-microvm.github.io/ )

It’s closer to a FastCGI container autoscaled in a cluster.

                                                                                                                          1. 4

                                                                                                                            I guess there’s a line to be drawn on whether you consider “CGI” and “CGI with bolted-on process/env persistence” to be completely separate things.

                                                                                                                            I’m not sure I do.

                                                                                                                            1. 5

If you don’t, then everything is CGI with bolted-on persistence. Literally anything that handles HTTP requests is CGI with bolted-on persistence if Lambda is. Why even stop there – qmail is CGI with a bolted-on email protocol. That would make “CGI” itself meaningless.

                                                                                                                              Or from the other side - since CGI is defined as starting a new process per connection, which implies memory separation between requests, why would anything that doesn’t implement that core idea be called CGI?

                                                                                                                              1. 4

                                                                                                                                There are a lot of “CGI-ish” protocols which have kept many of the conventions of CGI even if they haven’t kept exactly the same mechanics as CGI. So, as I pointed out in another comment, Python’s WSGI literally reuses a lot of the special names used by CGI; it just puts them as keys in a hash table instead of as names of env vars.

                                                                                                                                To me, it doesn’t matter if the particular mechanics allow for persistence; if the overall conventions of how I’m supposed to understand a “request” are the same as CGI, I don’t meaningfully distinguish it, as a programming model, from CGI.
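For instance, a minimal WSGI application (a sketch; the handmade environ dict below just illustrates the point) receives the same names the CGI spec defines as environment variables, only as dictionary keys:

```python
def app(environ, start_response):
    # REQUEST_METHOD, PATH_INFO, QUERY_STRING, etc. are taken verbatim
    # from CGI's environment-variable names -- WSGI just delivers them
    # as keys in a dict instead of via os.environ.
    body = f"{environ['REQUEST_METHOD']} {environ['PATH_INFO']}".encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

# Driving it by hand with a CGI-shaped dict, no server required:
collected = {}

def fake_start_response(status, headers):
    collected["status"] = status

result = app({"REQUEST_METHOD": "GET", "PATH_INFO": "/hello"},
             fake_start_response)
```

The programming model is the same request-in, response-out exchange; only the transport of the metadata changed.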

                                                                                                                                1. 2

And FastCGI, which is the de facto way of running PHP under nginx and others.

                                                                                                                                  1. 1

This has truly been one of the most entertaining comment threads I have read. The Lambda thing that AWS has is interesting. I don’t use any AWS services (I only know that their products are all Amazon employees ever talk about), but Lambda does sound like CGI for 2022 (CGI, redone with modern tech).

                                                                                                                        2. 3

The process that it launches can be very lightweight, though. A lot of CGI scripts I’ve seen punted most of the logic to another process: for example, connecting to a daemon and putting all of the logic there, or simply serving a static file and prodding an IPC channel that coalesced messages and regenerated the file if it had grown stale.

                                                                                                                          1. 2

No, it’s because Perl was huge (much as Rails was once huge) back when nobody had anything better. Perl has long since gotten a Rack/WSGI-like API (PSGI) and a nice little ecosystem of servers, middlewares, and whatnot (and, of course, the ability to run anything compliant with that API under CGI or FastCGI, if that’s what floats your boat). But there was a time when “dynamic web content” meant “CGI”, and “CGI” meant “Perl”.

                                                                                                                        1. 4

                                                                                                                          You’re one of today’s lucky 10,000.

                                                                                                                          WSGI and ASGI both still feel really similar to CGI to me. There are some nice affordances there (which would generally also be in FastCGI) that I don’t want to give up for production-y things, but there’s a certain beauty to just reading a request off standard input and spitting your response to standard output.
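That stdin/stdout model really is that small. A minimal CGI-style responder, sketched in Python (parameterized over its streams so it can be exercised without a web server; real CGI would use `os.environ`, `sys.stdin`, and `sys.stdout`):

```python
import io

def run_cgi(environ, stdin, stdout):
    # A CGI script gets request metadata via environment variables,
    # the request body on standard input, and writes headers plus body
    # to standard output. That is essentially the whole protocol.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length)
    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write(f"you sent {len(body)} bytes via "
                 f"{environ.get('REQUEST_METHOD')}\n")

out = io.StringIO()
run_cgi({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "5"},
        io.StringIO("hello"), out)
```

No sockets, no framework: the server sets the environment, pipes the body in, and relays whatever comes out.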

FWIW, I also thought of them as being primarily a Perl thing for a long time. I think that was mostly because the university server where we could publish our web sites only allowed Perl in the shebangs of CGI scripts. It was some time before I realized that was just an administrative restriction on that specific server, not something inherent to CGI.

                                                                                                                          1. 2

                                                                                                                            WSGI basically is CGI, but instead of passing things in env vars with standard names, they’re passed in a hash table with standard keys.

                                                                                                                            ASGI is more of a break from the CGI model. If you use it only to do HTTP it can still be made to feel kinda CGI-like, but if you use it to do another protocol like WebSockets the differences really show up.
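A sketch of why it diverges: an ASGI application is an async callable driven by a stream of event messages (the `http.request` / `http.response.*` message types come from the ASGI spec; the fake `receive`/`send` callables are just for driving it by hand):

```python
import asyncio

async def app(scope, receive, send):
    # ASGI delivers the exchange as event dicts rather than one
    # CGI-style environment-plus-body. The same callable shape also
    # carries WebSocket events, which has no CGI analogue at all.
    assert scope["type"] == "http"
    await receive()  # consume the http.request event
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/plain")]})
    await send({"type": "http.response.body", "body": b"hello from ASGI"})

async def main():
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent

messages = asyncio.run(main())
```

For plain HTTP this can still be squinted at as CGI-ish, but the event-driven shape is what lets one interface handle long-lived, bidirectional protocols.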

                                                                                                                            1. 1

I think the last time I used anything through CGI or, really, FastCGI was the software I rose to be in charge of around 2015 and last used in 2017. The app was written in C in the mid-2000s and ran perfectly inside Apache the entire time. It was a collection of executables. I never really got down into the C myself, but the CGI part was absolutely never a problem for us.

I remember some talk when “serverless” entered the vocabulary a few years ago, and someone in the crowd asking if they could just ship a binary that talked FastCGI. I think the speaker was talking about Lambda and OpenWhisk, and at the time the answer was no. I was surprised.

                                                                                                                              1. 2

AWS went with their own system for some reason and didn’t officially allow custom bootstraps for years, but now you can find almost every connector. For example, lambda-wsgi (https://pypi.org/project/aws-lambda-wsgi/) or lambda-fastcgi (https://bref.sh/).

(You could do it in a hacky way before, though: deploy a Python script which only did exec(your_custom_binary).)