1. 31

    I prefer to see this type of project, one that builds upon what it considers the good parts of systemd, over the systemic refusal and dismissal I’ve mostly seen elsewhere.

    1. 15

      Same. Too often I see “critiques” of systemd that essentially boil down to personal antipathy against its creator.

      1. 22

         I think it makes sense to take into account how a project is maintained. It’s not too dissimilar to how one might judge a company by the quality of their support department: will they really try to help you out if you have a problem, or will they just apathetically shrug it off and do nothing?

         In the case of systemd, real problems have been caused by the way it’s maintained. It’s not very good IMO. Of course, some people go (way) too far in this with an almost visceral hate, but you can say that about anything: there are always some nutjobs that go way too far.

        1. 3

          Disclaimer: I have not paid close attention to how systemd has been run and what kind of communication has happened around it.

          But based on observing software projects both open and closed, I’m willing to give the authors of any project (including systemd) the benefit of the doubt. It’s very probable that any offensive behaviour they might have is merely a reaction to suffering way too many hours of abuse from the users. Some people have an uncanny ability to crawl under the skin of other people just by writing things.

          1. 6

            There’s absolutely a feedback loop going on which doesn’t serve anyone’s interests. I don’t know “who started it” – I don’t think it’s a very interesting question at this point – but that doesn’t really change the outcome at the end of the day, nor does it really explain things like the casual dismissal of reasonable bug reports after incompatible changes and the like.

            1. 4

              I think that statements like “casual dismissal” and “reasonable bug reports” require some kind of example.

            2. 3

               tbf, Lennart Poettering, the person people are talking about here, is a very controversial personality. He can come across as an absolutely terrible know-it-all. I don’t know if he is like this in private, but I have seen him hijack a conference talk by someone else. He was in the audience and basically got himself a mic and challenged anything that was said. The person giving the talk did not back down, but it was really quite something to see. This was either at FOSDEM or at a CCC event, I can’t remember; I think it was the latter. It was really intense and over the top. There are many articles and controversies around him, so I think it is fair that people take that into account when they look at systemd.

              People are also salty because he basically broke their sound on linux so many years ago, when he made pulseaudio. ;-) Yes, that guy.

              Personally I think systemd is fine, what I don’t like about it is the eternal growth of it. I use unit files all the time, but I really don’t need a new dhcp client or ntp client or resolv.conf handler or whatever else they came up with.

              1. 4

                tbf, Lennart Poettering, the person people are talking about here is a very controversial personality.

                In my experience, most people who hate systemd also lionize and excuse “difficult” personalities like RMS, Linus pre-intervention, and Theo de Raadt.

                I think it’s fine to call out abrasive personalities. I also appreciate consistency in criticism.

        2. 4

          Why?

          1. 7

            At least because it’s statistically improbable that there are no good ideas in systemd.

            1. 1

              Seems illogical to say projects that use parts of systemd are categorically better than those that don’t, considering that there are plenty of bad ideas in systemd, and they wouldn’t be there unless some people thought they were good.

              1. 2

                Seems illogical to say projects that use parts of systemd are categorically better than those that don’t

                Where did I say that though?

                1. 2

                  I prefer to see this type of project that builds upon what it considers the good parts of systemd

                   Obviously any project that builds on a part of systemd will consider that part to be good. So I read this as a categorical preference for projects that use parts of systemd.

          2. 2

             There have been other attempts at this: uselessd (which is now abandoned) and s6 (which still seems to be maintained).

            1. 4

              I believe s6 is more styled after daemontools rather than systemd. I never looked at it too deeply, but that’s the impression I have from a quick overview, and also what the homepage says: “s6 is a process supervision suite, like its ancestor daemontools and its close cousin runit.”

              A number of key concepts are shared, but it’s not like systemd invented those.

              1. 1

                 I saw a bunch of folks using s6 in Docker, but afaik it’s some of the least user-friendly software I’ve ever used.

          1. 1

             I’m surprised at how often GPS threatens to break time. It would be nice if the protocols could be updated to use time scales less tied to human units like weeks, but updating satellites is quite a brittle thing.

            1. 6

              The week number was already extended from 10 bits to 13 bits. The first satellites supporting it were launched in 2005 and the new format signal has been broadcast since 2014. For capable receivers, there is no rollover until 2137 (at which point, if GPS is still flying and hasn’t been further upgraded, the 157-year ambiguity will presumably be much easier to resolve than a 19-year one).

              This is more of a “gpsd (not GPS) tried to be clever and failed” issue. They could have simply had their logic for 10-bit week numbers resolve the ambiguity in such a way that it always returns the first matching time after the build date of the copy of gpsd itself. That would have been 100% reliable for anyone who upgrades their software at least once every two decades. If that’s not good enough then maybe they could think about a ratchet. What they ended up with instead… wasn’t really smart, it just looked that way.
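
               Roughly the kind of logic I mean, as a sketch in Rust rather than gpsd’s actual code (the constants and function name here are made up for illustration):

                 const GPS_EPOCH_UNIX: i64 = 315_964_800; // 1980-01-06 00:00:00 UTC
                 const SECS_PER_WEEK: i64 = 7 * 24 * 3600;

                 /// Resolve a week number transmitted modulo 1024 to the first full
                 /// week whose start is not before `reference_unix` (e.g. the build date).
                 fn resolve_week(week_mod_1024: i64, reference_unix: i64) -> i64 {
                     let ref_week = (reference_unix - GPS_EPOCH_UNIX) / SECS_PER_WEEK;
                     let mut week = (ref_week / 1024) * 1024 + week_mod_1024;
                     if week < ref_week {
                         week += 1024; // the 10-bit counter has wrapped since the reference date
                     }
                     week
                 }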

              1. 1

                 Do note that the exact same bug would have happened in 2003, as it was not the week-counter rollover that caused it but the fact that there hadn’t been a leap second for 256 weeks, which is the modulus in which the week of the next leap-second date is given. Who knows how many similar footguns there are, but I’d think it would be best to take advantage of the fact that we now have better equipment and rework the signals to be harder to misinterpret.
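
                 To make the footgun concrete, here is a sketch (made-up Rust, not any receiver’s or gpsd’s real code) of how a naive comparison against a week number that is only broadcast modulo 256 picks the wrong 256-week window once the real gap grows too large:

                   // The nav message gives the week of the leap-second event only modulo 256
                   // (an 8-bit field), so naive reconstruction against the current week
                   // silently lands in the wrong 256-week window once the real gap exceeds
                   // 256 weeks.
                   fn leap_second_is_pending(current_week: u32, wn_lsf_mod_256: u32) -> bool {
                       // Assume the event lies in the same 256-week window as the current week.
                       let reconstructed = (current_week / 256) * 256 + wn_lsf_mod_256;
                       reconstructed >= current_week
                   }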

                1. 2

                  Do note that the exact same bug would have happened in 2003

                  Yes, because it’s bad code. The solution is to have less bad code. Involving leap seconds in this was a mistake.

                  but I’d think it would be best to utilize the fact that we now have better equipment and rework the signals to be harder to misinterpret.

                  Again, already done. Not in the exact way you’re asking for, but in a way that works just fine in the real world. This is gpsd attempting to do its best with receivers relying on the old format (and fumbling it).

                  1. 1

                     A better solution is to make it harder to write bad code. GPS-handling code will be written yet another thousand times; do we need to keep open the possibility of the exact same bug every time?

                    But I do have to agree on the fact that it has been fixed. The week on which the rollover will happen is now specified with the same modulo as the week so such bugs should hopefully no longer occur with new receivers.

                    1. 1

                      The week on which the rollover will happen is now specified with the same modulo as the week

                      This statement is nonsense.

              2. 1

                It’s not just satellites, it’s every receiver deployed in the last 40 years, many in neglected but safety- or life-critical use.

                1. 1

                  It’s possible to add extra signals, it has already been done. Having one with 40 bit seconds counter should definitely be possible without interfering with existing signals, and it would decrease the fragility of GPS time tracking massively.

                  1. 1

                    How exactly does that improve the situation of the receivers existing in the wild?

                    1. 1

                      It improves the situation of the receivers that will exist in the wild. This exact error happened 17 years ago to some Motorola receivers. The frequency of such occurrences is rare enough for information to be common knowledge even amongst those who implement GPS, but with enough impact that letting such errors repeat is asking for trouble.

              1. 50

                To be honest, I’m really sad and concerned about people opting to use WSL. It really looks like another EEE scheme. If you want to have freedom in the long run, please, please participate in the FOSS desktop ecosystem: help debug driver issues, help maintain HCLs to make it easy for new users to pick knowingly well-supported hardware, at the very least, report the bugs you see.

                WSL opens up a way for MS to push the restricted boot for desktops without people noticing who would otherwise be the first to notice. Monopolistic proprietary software vendors are not friends of FOSS. Never were, and never will be. Don’t trade essential liberty for temporary convenience.

                1. 10

                  Unfortunately, “the linux desktop” is also wholly captured by monopolistic proprietary software vendors, who are also not friends of FOSS. The systemd debacle, the dbus debacle, all the renderer debacles, the app store debacles, the various gnome debacles – it’s monopolistic proprietary tasteless foisted broken politically-contrived nonsense all the way down. It’s hardly surprising that people use WSL which at least has the benefit of you being able to run productivity software that consistently works.

                  1. 4

                    With big corporations one can never know, but it seems to me that in recent years Microsoft have been a good citizen of the open-source software world. I’m willing to give them the benefit of the doubt.

                     That being said, I’m also concerned about the native Linux desktop losing traction, but I think this has been happening steadily for a while now. I used to know a ton of people running Linux as their primary OS, and now almost everyone’s on macOS and WSL. Perhaps most people don’t care much about the underlying principles, as long as they get their work done.

                    1. 22

                       Well, and from your very post it’s clear where it eventually leads. macOS isn’t the “polished UNIX experience” people thought it always would be. I believe we should make the system we want, or else we risk ending up with nowhere left to migrate to.

                      That said, I do get my work done on a Linux desktop and nothing crashes for me on 8th gen Intel NUC hardware.

                      1. 20

                        I’m one of them. I’m actually the last person to switch out of… about 20 regulars? in what was once a local LUG. I still run Linux and OpenBSD pretty much everywhere I can except for my main working machine, which runs macOS. I’m not happy about it, but it’s also not a matter of convenience.

                        It’s not that I’ve sacrificed the principles of open source but, realistically, I do not trust the vision that’s currently prevalent in the world of FOSS desktop. It’s a vision that I don’t understand – seeking to provide software for users who understand and care about the technical aspects of fundamental liberties, who can file bug reports and test fixes from a devel branch, or even submit patches, who are willing to follow HCLs before buying new hardware, but are also confused by too many customisation options, and intimidated by small icons and buttons. It produces software that I find less capable with each release, and which makes it harder and harder for me to make things that others find useful, to work on new things, to learn things about computers and things other than computers, and so on.

                        Showing up with patches – or heaven forbid, bug reports – that do not demonstrate a sufficient understanding of these principles is met with hostility by many communities, and I honestly have no interest in even trying anymore. I’ve no intention of dying on hills held by designers who asked four interns to perform these actions and rate how hard it was, thus unlocking the secret of how to build the UIs that will finally bring upon us the Year of Linux on the Desktop.

                        And I honestly have no interest in going the underground route, either – running dwm, i3, or resurrecting my FVWM config from 15 years ago and going back to using mc, pine, bc and emacs under xterm. I like file managers with nice-looking icons and proportional fonts, I like graphical email clients and, generally, all these programs that try (and sometimes fail) to figure out what would be good tomorrow, rather than to get to the essence of what was good thirty years ago.

                        But I also think that it’s not a good idea to develop professional software – terminal emulators, text editors and IDEs, CAD/CAE tools, system administration tools, whatever – by using UX and UI design principles for novice users and freemium apps. All you get is software designed for people who don’t want to run it in the first place, and deliberately made worse for those who do want to run it.

                        I wish there was a better way, but if the best way to write things that run under Linux and help others do cool things with it is to get a Macbook or a Surface and SSH into a headless machine, or run WSL, I can do that. (And honestly, I totally get the appeal, too. After a few months with iTerm 2 – which is also open source, and I mean GPL, the commie kind of open source – I don’t want to touch Gnome Terminal ever again in my entire life).

                        1. 5

                           That we agree on principle 1, “professionals first”, comes as no surprise ;-)

                           In that vein though, can you think of concrete/specific examples that illustrate the conflict of interests/disconnect? The easiest I have from the OSX brand of face-meets-palm would be the dialog: “Terminal would like to access your Contacts”. Unpacking it speaks volumes about what is going on.

                          1. 2

                             My favourite one lately is from Windows, which insists on automatically rebooting in order to install updates in the middle of the night, so that you’re greeted by a full-screen ad for Edge – which is now also your default browser, sorry, “a better and more secure Windows experience” – in the morning. That works out great on paper, but in practice, lots of us who use computers for work just wake up to the Bitlocker screen and five minutes of “please wait, installing updates”.

                            From Linux land… I’ve honestly ragequit a long time ago, I kept poking it for three or four years (by which I mean I mostly ran a bunch of old console apps and a few Qt apps) and eventually gave in and bought a Mac. I mostly have a bunch of bad memories from 2012-2017 or so, after which my interactions with it were largely limited to taking point releases of XFCE, Gnome and KDE for test drives and noping the fsck out back to FVWM. So most of my complaints are probably either out of date, or oddly generic (everything’s huge and moves a lot and that’s not nice to my ageing eyes).

                            Plus… there’s this whole approach to “improving” things, you know?

                             When I got this stupid Mac, it was uncanny to see how many of the things I used back in 2006 or 2007, when I last used a Mac, are still there and work the same way. Even Finder has the same bugs :). Meanwhile, there are things I liked in KDE 3.5 that literally got rewritten and broken three times since then, like custom folder icons support. Which is now practically useless anyway, since icons from modern themes look pretty much the same below 64x64px – I get “looks great in screenshots”, but guys, I have folders with hundreds of datasheets; there’s no way I can ever find anything in there if I can only see like 12 files at a time.

                            (Edit: FWIW, I think we pretty much agree on all twelve :P)

                            1. 2

                               My favourite one lately is from Windows, which insists on automatically rebooting in order to install updates in the middle of the night, so that you’re greeted by a full-screen ad for Edge – which is now also your default browser, sorry, “a better and more secure Windows experience” – in the morning. That works out great on paper, but in practice, lots of us who use computers for work just wake up to the Bitlocker screen and five minutes of “please wait, installing updates”.

                               Somewhat ironic how denial of service is rephrased as a double-plus-good security measure, no? Coming from a SCADA angle, the very idea of an update of any sort borders on the exotic (and erotic), but the value of that contract seems to have been twisted into a means of evaluating change by forcing it on users and studying the aftermath. The browsers are perhaps the most obvious offenders, but regardless of the source it is quite darn disgusting.

                              From Linux land… I’ve honestly ragequit a long time ago, I kept poking it for three or four years (by which I mean I mostly ran a bunch of old console apps and a few Qt apps) and eventually gave in and bought a Mac.

                               So amusingly enough, I was a die-hard FOSS desktop user from the mid-90s until the arrival of OS X (though raised in Solaris lands). Most of my bills went to pay for the cluster of PPC Mac minis I used to do my dirty deeds (the biggest of endians). Come 10.6 it was clear that Apple’s trajectory was “fsck you devs, we’re through” and I took it personally. Left all of it to rot, returned to FOSS and was dismayed by what the powers that be had done to the place. The tools I was working on toward “solving oscilloscope envy by reshaping the debugger” had to be reused to build a desktop that “didn’t change beneath my feet as I was walking around”.

                               I get “looks great in screenshots”, but guys, I have folders with hundreds of datasheets; there’s no way I can ever find anything in there if I can only see like 12 files at a time.

                               I would like to run an experiment on you, but the infrastructure is lacking for the time being – here is one of those things where VR performs interesting tricks. Posit that a few hundred datasheets are projected onto a flat surface that is textured onto a sphere. You are inside of that sphere with a head-mounted display. How long would it take your cognition to find “that one sheet” among the others, versus scrolling through a listview…

                              1. 1

                                 the value of that contract seems to have been twisted into a means of evaluating change by forcing it on users and studying the aftermath. The browsers are perhaps the most obvious offenders, but regardless of the source it is quite darn disgusting.

                                IMHO this trench war of updates, where users are finding new ways to postpone them and companies (in this case, Microsoft) are finding new ways to make sure updates happen, is entirely self-inflicted at the companies’ end.

                                Way back when big updates were in the form of service packs, there was generally no question about whether you should update or not. If you were running bleeding-edge hardware you’d maybe postpone it for a week or two, to let the early adopters hit the bad bugs, but that was it. As for the smaller, automatic updates, people loathed them mainly because of the long shutdown times, but it was generally accepted that they at least caused no harm.

                                Nowadays, who knows. During lockdown I had to go halfway across the city to my parents’ house twice to get my overly-anxious mother (elementary school teacher who’s one or two years away from retirement, so pretty scared when it comes to tech) past the full-screen ads with no obvious close buttons, restore Firefox as a default browser and so on, while she commandeered my other parental unit’s 15 year-old, crawling laptop to hold the damn classes. No wonder everyone dodges updates for as long as they can.

                                 I would like to run an experiment on you, but the infrastructure is lacking for the time being – here is one of those things where VR performs interesting tricks. Posit that a few hundred datasheets are projected onto a flat surface that is textured onto a sphere. You are inside of that sphere with a head-mounted display. How long would it take your cognition to find “that one sheet” among the others, versus scrolling through a listview…

                                I’m sure you’ve thought about this for longer than I have, but I suspect there are two things that determine success in this case:

                                1. An organisation system that matches the presentation (e.g. alphabetical order for a one-column list view)

                                2. Being able to focus on a sufficiently large sample that you can browse the list without moving your eyes back and forth too much

                                 1. is pretty obvious, I guess. I like to go through listviews because these things are sorted alphabetically, and while many of them have very stupid names like slau056.pdf (which is actually MSP430x4xx Family User’s Guide), I usually sort of know which one I’m looking for, because manufacturers tend to follow different, but pretty stable conventions. As long as they’re laid out in a way that makes it easy to “navigate the list” (in stricter terms, in a way that preserves ordering to some degree, and groups initial navigation options – i.e. sub-directories – separately so they’re easy to reach, wtf GTK…), it’s probably fine.

                                 2. probably bears some explanation because it’s the reason why large icons suck so much. Imagine you have a directory with 800 items and you’re looking for one that’s in the middle. If you can only see 10-12 at a time, then a) it takes a lot of time to hit the exact window with the file you’re looking for, and b) the tiniest amount of scroll moves up and down by a whole page of items. So you get to go back and forth between dozens of 10-item pages, and often overshoot, then undershoot the one you’re looking for dozens of times, all while wiggling your eyes all over the window.

                                I dunno what to think about the inside of a sphere. My knee-jerk reaction is to say I’d get dizzy and that a curved, but field-of-view-sized surface might be a better fit. But gentleman skeptics once complained that trains would be draughty, too, and it turned out they were, but also that it was more than worth it. Just like flying machines became a thing once we finally figured out imitating birds is just not the right way to go about it, we’re probably going to make real progress in organising and browsing information only at the point where we stop imitating libraries, so I think this meets the essential prerequisites for success ;-).

                              2. 1

                                (everything’s huge and moves a lot and that’s not nice to my ageing eyes).

                                For what it’s worth, the Reduce Motion accessibility setting helps with some of that.

                            2. 6

                              I think the worst thing that happened to FOSS is UX designers (and that includes things like Flatpak).

                              It doesn’t matter whether the reason for “software won’t do X anymore” is some evil mega-corp or some arrogant UX designer, the result is the same.

                              (And before anyone slides in with a “let me mansplain UX to you”: I’m good, thanks.)

                              1. 7

                                The field of UX has massively regressed in the last 15-20 years, everywhere, not just in the FOSS world. It’s a cargo cult at this point. Even many (most?) of the organisations that allegedly practice “metrics-driven” design routinely get it so wrong it’s not even hilarious – they make fancy graphics from heaps of data, but they have no control groups, no population sampling, and interpretations are not checked against subject feedback (which is often impossible to get anyway, since the data comes from telemetry), so it all boils down to squinting at the data until it fits the dogma.

                            3. 11

                               It’s a common thing for people to discuss what {company X} thinks about {idea Y}. But if you work at a corporation for at least a while, you learn there’s no such sentiment. It’s closer to: {high-level exec A} thinks that {area B} is a great way to expand and {investing in C} is the way to do it. Soon the person in position “A” may change, “B” may have a good replacement, and money pumped into “C” may turn out not to have a good return. Microsoft doesn’t like or dislike anything. Managers with enough power temporarily like some strategies more than others.

                               We had Ballmer on one extreme, now we’ve got Satya who seems like the other extreme. In 5 years we may have either: Ballmer++ deciding to sue Valve over Proton, or Satya++ deciding to open-source parts of the Windows kernel, or someone in the middle who approves something nice for FOSS and destroys something nice for FOSS because they don’t care about that aspect at all, or anyone in between.

                              Corporations don’t deserve the benefit of the doubt. Some execs do. Just remember they’ll be out in a few years.

                              1. 2

                                This is very true.

                                 I think this phenomenon is laid out very well in the book The Dictator’s Handbook. It claims that countries, companies, and other entities don’t have opinions or preferences; the people at the top and at every level do, and they are looking after themselves.

                                Disclaimer: not the author of the book; it’s just one of my favorites, and I recommend it for everyone to read.

                              2. 4

                                With big corporations one can never know, but it seems to me that in recent years Microsoft have been a good citizen of the open-source software world. I’m willing to give them the benefit of the doubt.

                                What have they done, or stopped doing, to earn this praise?

                                1. 7

                                  They’ve released a lot of core tech as open source: VS Code, .NET Core, some Windows programs such as Terminal and Calc, etc. They’ve supported a lot of other open source things less directly: NuGet, Python packages, etc. They’ve more or less stopped campaigning, advertising and litigating against open source stuff the way they did up through the mid/late 2000’s – see here for entry points to some good examples.

                                  I’ll happily give Microsoft the benefit of the doubt, but we’ll see if they continue this strategy of being nice for another 5-10 years, or whether we enter the “extinguish” phase, more or less the same way Google has in the last 5 years. If Microsoft thinks they can make more money being nice than being evil, then that’s what they’ll do; that’s the only decision path that matters to them.

                                  1. 12

                                    They won’t abuse their monopoly too much, because they don’t actually have one. It’s no secret that they open-sourced .Net to make it a viable option for web on Linux servers, and even though they dominate desktop, desktop itself has competition from 1) the browser (see: web apps) and 2) phones/tablets (there’s zero difference between a laptop and a tablet with a keyboard). They’re playing nice because they know they’re an underdog and can’t afford to act otherwise.

                                    Of the three platforms (browser/desktop/touchscreens), the Browser is most controlled by Google, and phones/tablets are most controlled by Google. This conveniently lines up with the “Google is the new Microsoft” meme.

                                2. 2

                                  I also think Microsoft has done way better, and I really think they are likely to continue supporting and embracing open source. It seems like the company culture has shifted in a very fundamental way.

                                  That being said, we should still operate under the assumption that they won’t. Trust, but verify. Same reason you should be wary of signing a CLA that assigns copyright, even if you really, really trust the company the CLA comes from.

                                  1. 1

                                    I used to know a ton of people running Linux as their primary OS, and now almost everyone’s on macOS and WSL. Perhaps most people don’t care much about the underlying principles, as long as they get their work done.

                                    My main principle was always to get certain things done. For a long time (say, from the year 2000 to 2018) Linux was absolutely number one for that, for me. Now it’s Apple, but by a small margin.

                                    I’m fairly certain that it will never be Windows except for games. And that’s becoming a small margin as well.

                                    From the article:

                                    As you can see there’s nothing fancy about it. I wanted to build a decent workstation, not a gaming rig. Still, I opted to get a decent discrete GPU, as there were a few PC games that I was hoping to eventually play (StarCraft II, Tomb Raider, Diablo III, etc).

                                    So this is absolutely and obviously a gaming rig :)

                                  2. 2

                                    I think that asking people to pay a cost (in time, complexity and frustration) to try and match their desktop workflows from Mac OS or Windows in Linux is a fool’s errand, and in addition, the “real enemy” isn’t Microsoft (Windows) or Apple, but rather Google/Amazon/Microsoft (Azure/Github). I think the real challenges to software freedom are in confronting the big SaaS providers, and I would love to hear what the various freedom advocates think we should do about that.

                                    1. 1

                                      big SaaS providers

                                      What do the big SaaS providers have to do with the state of linux on the desktop?

                                      1. 2

                                        Not much! But they have a lot to do with the state of software freedom, or rather, the lack thereof.

                                  1. 3

                                    Because “web components” are a marketing term for a solution in search of a problem.

                                    1. 1

                                      Why? I think they are rather useful when it comes to cleanly separating functionality and enabling easier styling.

                                      1. 1

                                        I wrote up my thoughts here: https://blog.carlmjohnson.net/post/2020/web-components/

                                         Since then I will say I’ve gotten slightly more appreciation for the context of having a dumb CMS in which you want to drop snippets of HTML and not have them clash. For that, I guess WC are as good as anything else (they can still clash but you have to be bad at naming for that to happen).

                                         But it’s sort of assuming a context where you have no control. If you have any control, you should just build the thing you want, and the top-level tag can just be <div data-my-component> and it’s just as “semantic” as <my-component>.

                                         So for example, in my CMS at work, I write up shortcodes, so I can control what the shortcodes do exactly and have them hook into the main CSS and JS on the page instead of needing to isolate them in their own bubbles. If I had to use a CMS but was not able to write my own shortcodes, it might make sense as a strategy to use WC. But it doesn’t buy you much and it brings in a lot of problems: for example, the same blog as this story has a post about what a pain it is to do focus tricks with the shadow DOM.

                                    1. 1

                                      No.

                                       That said, I think the article is surprisingly nuanced, but I fear that the laissez-faire attitude shown in posts like this is a reason many people equate web developers with junior developers.

                                      1. 17

                                         Did you even read the post? The framework takes 1.4 kB out of 13.9 kB total, and as a result you get a self-contained component you can insert in any page. He also provides a framework-less package for those who already have the Svelte package in their dependencies. That seems like a well-thought-out and reasonable approach.

                                        Also I would argue that either you use an existing framework or you need to reinvent one, or parts of it. You might end up with less than 1.4 kB overhead but probably with some subtle bugs and performance issues that have already been solved in other frameworks, not to mention the wasted time.

                                        1. 5

                                          In addition to all the problems other people mentioned already, it’s also a matter of longevity.

                                          Do users really want to have a component that invisibly bundles some framework, which they only become fully aware of when the framework ends up being abandoned after one JS hype cycle (~18 months) and the security issues start piling up?

                                          1. 1

                                             And if you don’t use a framework, will the component remain secure forever? Of course it won’t; it still requires maintenance work, and whether that means updating the framework, switching to a better framework, or updating bespoke code, it’s maintenance work. Things don’t remain miraculously bug-free and vulnerability-free just because you don’t use a framework.

                                             If you are arguing that maintaining bespoke code from a random developer (who might have moved on) is easier than updating a widely used framework, then I guess we simply disagree. I know I’d pick a framework any time rather than reinvent my own.

                                            Also that “JS hype cycle” meme is getting a bit old, and I wonder where you pull that 18 months from. Svelte has been around for 4 years, React for 8 years, etc.

                                          2. 1

                                            wouldn’t it be more reasonable to defer to the browser/OS for selecting input characters? unless I’m misunderstanding the purpose of the emoji picker.

                                            1. 9

                                              The framework takes 1.4 kB out of 13.9 kB total, and as a result you get a self-contained component you can insert in any page.

                                              And this is precisely how your app ends up with 15 versions of Svelte built-in with say 9 of them having security bugs. Or weird bugs where version N+1 of the library destroys version N’s global state.

                                              1. 2

                                                I think you meant to reply to /u/lau

                                              2. 2

                                                For sure, I would love for browsers to standardize on some kind of emoji picker, or to just delegate to the OS’s built-in picker. I wrote down some thoughts here on that.

                                                1. 2

                                                  In theory yes.

                                                  But in practice a lot of sites these days want custom “emoji” which would be difficult to design into an OS/browser picker.

                                                  For example look at Android where just about every keyboard has an emoji picker but many apps (especially messaging apps) have their own button to add emoji.

                                                  1. 1

                                                    custom emoji would not be present in a pre-packaged web component anyway

                                            1. 1

                                               Probably some effort to finally get the language change that removes the remnants of modifiers, in favor of annotations.

                                              After that, removing the @static annotation and replacing it with a module construct instead.

                                              1. 1

                                                that’s another gripe, there is no such type as a Path in Julia - it just uses strings. Why not? I honestly don’t know, other than perhaps the Julia devs wanted to get 1.0 out and didn’t have time to implement them.

                                                Well this is just how paths are represented in the Unix C API: plain old nul-terminated strings. And if I remember correctly Windows isn’t much different.

                                                By the way, Rust’s PathBuf is just a wrapper over OsString. There’s nothing fancy under the hood: https://doc.rust-lang.org/src/std/path.rs.html#1076-1078

                                                1. 7

                                                  The problem isn’t the underlying implementation, the problem is that paths are not strings, they just happen to be represented as them. It’d be like if Rust used Vec<u8> as its string type instead of String/str, or if instead of std::time::Instant you had u32 or whatever.
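
                                                   Concretely, a dedicated type carries path-aware operations that a bare string makes you hand-roll (a small illustration, nothing Julia- or library-specific):

                                                     use std::path::Path;

                                                     fn main() {
                                                         let p = Path::new("/data/sheets/archive.tar.gz");

                                                         // Path knows about separators and extensions…
                                                         assert_eq!(p.extension().unwrap(), "gz");
                                                         assert_eq!(p.file_name().unwrap(), "archive.tar.gz");

                                                         // …whereas with a plain string you'd be slicing on '.' and '/'
                                                         // by hand and silently getting edge cases and other platforms wrong.
                                                     }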

                                                  1. 2

                                                     The point is that OsString has different implementations based on what the underlying operating system’s APIs use.¹

                                                     This is a requirement for many cases, ranging from “OS paths allow a superset of the bytes that would be valid in the language’s string encoding” down to avoiding “we helpfully converted the OS paths to UTF-8 for you and now we can’t find the file using that string anymore, because OS path → language string → OS path doesn’t result in the same bytes”.


                                                    ¹ https://doc.rust-lang.org/std/ffi/struct.OsString.html
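
                                                     A Unix-only sketch of that round-trip problem, using nothing beyond the standard library:

                                                       use std::ffi::OsStr;
                                                       use std::os::unix::ffi::OsStrExt;
                                                       use std::path::Path;

                                                       fn main() {
                                                           // A perfectly legal Unix file name containing a non-UTF-8 byte.
                                                           let raw = OsStr::from_bytes(b"report-\xE9.pdf");
                                                           let path = Path::new(raw);

                                                           // "Helpfully" converting to a normal string replaces the byte with U+FFFD…
                                                           let lossy = path.to_string_lossy().into_owned();

                                                           // …so the resulting string no longer names the same file on disk.
                                                           assert_ne!(Path::new(&lossy).as_os_str(), path.as_os_str());
                                                       }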

                                                    1. 2

                                                      Rust’s PathBuf is also a gigantic pain in the ass to use. Paths should be lists of path components, IMO. The string is just a serialization format.

                                                      1. 2

                                                        Not to mention that the semantics of Path::join (i. e. PathBuf::push) are just crazy.

                                                        Yeah, I want to have an operation that does two completely different things without telling me which one actually happened! /s
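
                                                         For anyone who hasn’t run into it, this is the behaviour in question (plain std, no third-party code):

                                                           use std::path::{Path, PathBuf};

                                                           fn main() {
                                                               let base = Path::new("/srv/app");

                                                               // Joining a relative path appends, as you'd expect…
                                                               assert_eq!(base.join("config.toml"), PathBuf::from("/srv/app/config.toml"));

                                                               // …but joining an absolute path silently replaces the whole base,
                                                               // and nothing in the return type tells you which of the two happened.
                                                               assert_eq!(base.join("/etc/passwd"), PathBuf::from("/etc/passwd"));
                                                           }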

                                                        1. 1

                                                          What would the type of the individual components be?

                                                          1. 3

                                                            I built something like this a while ago:

                                                            I had AbsolutePaths and RelativePaths (to prevent invalid path operations at compile-time) with PathSegments that were either OsStrings or placeholders like <ROOT_DIR>, <HOME_DIR>, <CACHE_DIR> etc. (that the library understood to serialize and deserialize such that you could e. g. use these paths in config files without having to manually implement this for each use-case).
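
                                                             A minimal sketch of that shape, with made-up names rather than the actual library’s API:

                                                               use std::ffi::OsString;

                                                               enum PathSegment {
                                                                   Literal(OsString),
                                                                   HomeDir,  // placeholders expanded when (de)serializing, e.g. for config files
                                                                   CacheDir,
                                                               }

                                                               struct RelativePath(Vec<PathSegment>);
                                                               struct AbsolutePath(Vec<PathSegment>);

                                                               impl AbsolutePath {
                                                                   // Only "absolute + relative" exists, so the invalid combinations
                                                                   // (relative + absolute, absolute + absolute) simply don't type-check.
                                                                   fn join(mut self, rel: RelativePath) -> AbsolutePath {
                                                                       self.0.extend(rel.0);
                                                                       self
                                                                   }
                                                               }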

                                                      1. 3

                                                        Here’s what I posted on the Julia Zulip. For context, I’ve been using Julia for several years and am a minor contributor to the core language and ecosystem.

                                                        Spicy. Thanks for posting.

                                                        I think 2, 3 and 4 are serious problems for common use cases, but disagree that they’re impossible to solve. At least for smaller programs, a fully ahead of time compiler (like the GPU ones we already have) could solve this and it’s a project that gets some intermittent interest. PackageCompiler is suitable for some use cases, too.

                                                        Agree with 5, 6, 7, 8. Tho I think some fairly simple support for declaring a function to be a required part of an interface could be enough to mostly deal with that.

                                                        Generally, I do methodswith(AbstractDict) or whatever, but it’s definitely not exactly what I want.

                                                        Agree that the type system situation is a bit odd and the Rust-style interface system makes much more sense to me.

                                                        Definitely something to be said that inheritance of data makes implementing new types very quick (tho it can also be dangerous and confusing). I think we could get most of the ease by having some commonly accepted way of defining methods for the interface of the wrapped type that just forward to the wrapped type. There are some macros for this, but there’s not a commonly accepted way of doing it and you have to discover what the interface is first (and you need to keep your subclass up to date with changes to that interface).

                                                        Undecided on 9, 10. I mostly agree with 10, but also know that others like the filter and map functions as-is.

                                                        I think everyone agrees that not having a proper Path type was a mistake and there are semi-frequent threads about introducing one and eventually deprecating our use of strings as paths.

                                                        Unfortunately the IO and paths stuff was copied from python shortly before python introduced the path types, so we inherited their mistakes (as they inherited the mistakes of other programming languages).

                                                        1. 4

                                                          I think everyone agrees that not having a proper Path type was a mistake and there are semi-frequent threads about introducing one and eventually deprecating our use of strings as paths.

                                                          Unfortunately the IO and paths stuff was copied from python shortly before python introduced the path types, so we inherited their mistakes (as they inherited the mistakes of other programming languages).

                                                           They could always fix it. It’s a bit like climate change: the longer you wait, the more painful the change becomes.

                                                          Same with …

                                                          map, filter and split are eager, returning Array.

                                                          … and a few other things. (I think the only language still in denial about this issue is Scala.)

                                                          Yes, that may require backtracking on …

                                                          Julia released 1.0 in 2018, and has been committed to no breakage since then.

                                                          … but only shows that people shouldn’t make promises they can’t keep.

                                                          For me, such promises are a sign of language design immaturity – it’s the 21st century, design your language to provide facilities to deal with necessary changes, instead of promising not to change anything!

                                                          Every language needs a well-defined process for deprecation and removal of language and library items, simply winging it is not an option.

                                                          1. 3

                                                            Julia v2 is coming soon and some of these things are scheduled to be changed then, so there’s the deprecation plan :)

                                                             map and friends may change in v2 as well. I’m undecided on replacing them with generators, but I’m confident the Julia contributors will do something sensible with them.

                                                            Yeah, we agree that we should have a Path type, but there’s limited engineering resource and it’s just not enough of a priority for anyone yet, so no one has done it yet, tho third party packages have existed for a while: https://github.com/rofinn/FilePaths.jl

                                                            1. 2

                                                              My advice after having done this a few times:

                                                              Fix everything you can. Don’t put things off. When the time of Julia 2 → Julia 3 comes you absolutely want to have less broken things to fix than you had in Julia 1 → Julia 2.

                                                        1. 15

                                                          There are a few (many) issues with this post. I feel like the author didn’t completely grasp the idea behind 2FA.

                                                          2FA solutions usually combine 2 elements of the following categories (more information here):

                                                          • Something you know
                                                          • Something you have
                                                          • Something you are

                                                           Services asking for a phone number for 2FA don’t treat it as an additional password; they use it to send out tokens which are used to verify “possession” of this phone number. Otherwise, a phone number is an easy-to-guess and worse-than-average password. After many SIM-swapping incidents, most of the big services also allow you to create a TOTP token, which is completely anonymous (and not the same as a password!)

                                                          Additionally, just taking a plain hash of a phone number doesn’t actually improve the security all that much. The input space of phone numbers is relatively small and easily enumerated: an attacker might just do a brute-force search for the correct phone number. Using a slower hash and a salt will slow this down, but to properly mangle phone numbers some form of encryption is needed. (and even then, there might be some issues)

                                                          As for collisions, you should be able to test this yourself and verify that no 2 phone numbers hash to the same SHA256 hash. I think this will even be the case for md5. Generally, to hash a secret, you should use a hash that is specifically created to hash passwords (Scrypt/Argon2id). These hashes are slow by design, so brute forcing passwords becomes more difficult.
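
                                                           To put a number on “easily enumerated”: a rough sketch of the brute force, assuming the third-party sha2 and hex crates and treating numbers as plain 8-digit strings for illustration:

                                                             use sha2::{Digest, Sha256};

                                                             // Walk the entire 8-digit space and stop at the number whose salted
                                                             // SHA-256 matches the leaked hash. A salt only rules out precomputed
                                                             // tables; it doesn't make this loop meaningfully slower.
                                                             fn find_number(target_hex: &str, salt: &[u8]) -> Option<u64> {
                                                                 (0u64..100_000_000).find(|n| {
                                                                     let mut hasher = Sha256::new();
                                                                     hasher.update(salt);
                                                                     hasher.update(format!("{n:08}").as_bytes());
                                                                     hex::encode(hasher.finalize()) == target_hex
                                                                 })
                                                             }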

                                                          1. 4

                                                            Not sure you fully got it, I think the idea is that they hash the phone number so that it isn’t available in case a hacker gets access to the database, but they can still send an SMS verification token if you type in your phone number.

                                                            1. 14

                                                              A hacker will be able to use dictionary attack to recover the phone numbers from their hashes. Phone numbers have too little entropy to resist that. It’s going to be a only small road bump, even if you use a relatively expensive hash function.

                                                              1. 3

                                                                 On top of that, I’m pretty sure for every major service you’ll be able to determine the subset of numbers your target may use, simply by looking at the TLD of their email (you already have that info, or you wouldn’t be trying to break the 2FA at that point) and then looking up on Wikipedia which number ranges are used by the top 3 mobile providers. When we’re talking about Amazon you might just look at the language I write my reviews in and you know which country to look for; the same problem applies to any other service that has localized user content. And last but not least, you’ll have to explain to other people why entering your mobile number anywhere else is suddenly a security hazard for this specific 2FA algorithm.

                                                                 (Please just use a hardware key, TOTP, or something else that isn’t based on how cheap an IMSI catcher or number transfer is in your country. We’ve had state-wide attacks on people’s accounts via SMS 2FA.)

                                                                1. 0
                                                                  1. 9

                                                                    The salt doesn’t make it significantly harder to guess one particular user’s phone number. The input space is still just all legal phone numbers, which, coupled by a fast hash algorithm, isn’t that big. In fact, my desktop runs sha256 on every single 7 digit number (the size of phone numbers in Norway), with a 64-byte salt, in under 4 seconds, using fairly naïve (but multi-threaded) C.

                                                                    The salt just means that you have to spend on average <2 seconds per user, it makes it so you can’t make a complete table which maps a hash to a phone number. Throw a few GPUs at the problem and those <2 seconds per user becomes milliseconds per user.

                                                                    (The source code I threw together to test, in case you wanna check my work: https://p.mort.coffee/alh.c - with the sha256 implementation from https://github.com/ckolivas/cgminer/blob/master/sha2.h and https://github.com/ckolivas/cgminer/blob/master/sha2.c)

                                                                    EDIT: I messed up the code. I was accidentally running through almost the entire range of numbers in parallel and then running through it again single threaded. Here’s the fixed code: https://p.mort.coffee/NPQ.c - It actually runs through the entire range of Norwegian phone numbers, from 0 to 9999999, in 231 milliseconds. You don’t even need the GPUs anymore. Those 11-digit UK and US phone numbers will still be a problem, but depending on context, there may still be tricks you can do to knock a few bits of entropy off the search space.

                                                                    1. 1

                                                                       Salt is a protection against rainbow tables. The phone number space is so tiny you don’t even need to bother with rainbow tables. A single Raspberry Pi can brute-force all phone numbers in the world in under an hour.

                                                                  2. 1

                                                                    I’m sure I don’t get it. If the company only stores my hashed number, how do they reverse it to send me an SMS?

                                                                    1. 2

                                                                      The idea is that you have to reenter your phone number, which they then use to both verify and text you.

                                                                      1. 1

                                                                        They don’t, they ask you for it every time you log in.

                                                                  1. 5

                                                                    Looks like Vulkan is putting quite some pressure on proprietary API vendors.

                                                                    1. 12

                                                                       Isn’t this… really big?

                                                                      1. 15

                                                                        It does seem like it. This is, to my knowledge, the first hugely popular I/O library which now lets its users use io_uring in a way which just looks like normal async file I/O.

                                                                        Rust seems like it is in a special position in that it’s a language with good enough performance for io_uring to matter, but with powerful enough facilities to make clean abstractions on top of io_uring possible.

                                                                        1. 6

                                                                           Isn’t the problem that Rust bet the farm on readiness-based APIs, and now it turns out (surprise) that completion-based APIs are generally “better” and finally coming to Linux (after Windows completely embarrassed Linux on that matter for like a decade)?

                                                                          1. 1

                                                                             It’s not a problem in practice. Rust’s futures model handles io-uring just fine. There was some debate over how to handle “cancellations” of futures, e.g. when Rust code wants to just forget that it asked the OS for bytes from a TCP socket. But the “ringbahn” research prototype found a clean solution to that problem.

                                                                            Actually, that entire blog is a wealth of information about Rust futures.

                                                                            1. 1

                                                                              found a clean solution

                                                                              I’d call that a stretch, considering that the “solution” pretty much foregoes futures altogether (and with that async/await) and largely rolls its own independent types and infrastructure.

                                                                              So I’m not seeing how this is evidence for:

                                                                               futures model handles io-uring just fine

                                                                               I’d say it’s evidence of the opposite.

                                                                              Actually, that entire blog is a wealth of information about Rust futures.

                                                                              Actually, that blog is the reason why I asked the question in the first place.

                                                                              1. 1

                                                                                I’m getting a little out of my depth here, but my understanding is that ringbahn (which inspired the tokio implementation) is meant to be used under the hood by a futures executor, just like epoll/kqueue are used under the hood now. It’s a very thin interface layer.

                                                                                Basically, from application code you start up a TCP socket using an async library with io-uring support. Then whenever you read from it and await, the executor will do ringbahn-style buffer management and interface with io-uring.
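To make that concrete, here's a minimal sketch of what the application side could look like. The runtime (`uring_rt`) and its owned-buffer `read` are hypothetical stand-ins for an io_uring-backed executor in the ringbahn/tokio style, not a real crate's API:

```rust
use std::io;

// `uring_rt` is a hypothetical io_uring-backed runtime; the names and
// signatures here are illustrative only, not an existing crate.
async fn read_greeting() -> io::Result<Vec<u8>> {
    // Application code reads like ordinary async/await.
    let stream = uring_rt::net::TcpStream::connect("127.0.0.1:8080").await?;

    // The buffer is moved into the call and handed back on completion,
    // because the kernel owns it while the submission is in flight. If the
    // future is dropped mid-read, the runtime keeps the buffer alive until
    // the kernel reports completion -- which is the cancellation problem
    // ringbahn addresses.
    let buf = vec![0u8; 4096];
    let (result, mut buf) = stream.read(buf).await;
    let n = result?;
    buf.truncate(n);
    Ok(buf)
}
```

The only visible difference from an epoll-style API is the buffer ownership in `read`; the submission and completion ring bookkeeping stays inside the executor.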

                                                                          2. 1

There’s also the hugely popular https://github.com/libuv/libuv/pull/2322

                                                                            (Since libuv isn’t modular it hasn’t officially landed yet, but the way I understand it, both projects are at about the same level of completion)

                                                                          3. 4

Everything about Rust is big now =)

                                                                            1. 4

                                                                              Rust is the epitome of Big FOSS.

                                                                          1. 38

                                                                            FWIW the motivation for this was apparently a comment on a thread about a review of the book “Software Engineering at Google” by Titus Winters, Tom Manshreck, and Hyrum Wright.

                                                                            https://lobste.rs/s/9n7aic/what_i_learned_from_software_engineering

                                                                            I meant to comment on that original thread, because I thought the question was misguided. Well now that I look it’s actually been deleted?

Anyway, the point is that the empirical question isn’t really actionable IMO. You could “answer it” and it still wouldn’t tell you what to do.

                                                                            I think you got this post exactly right – there’s no amount of empiricism that can help you. Software engineering has changed so much in the last 10 or 20 years that you can trivially invalidate any study.

                                                                            Yaron Minsky has a saying that “there’s no pile of sophomores high enough” that is going to prove anything about writing code. (Ironically he says that in advocacy of static typing, which I view as an extremely domain specific question.) Still I agree with his general point.


                                                                            This is not meant to be an insult, but when I see the names Titus Winters and Hyrum Wright, I’m less interested in the work. This is because I worked at Google for over a decade and got lots of refactoring and upgrade changelists/patches from them, as maintainer of various parts of the codebase. I think their work is extremely valuable, but it is fairly particular to Google, and in particular it’s done without domain knowledge. They are doing an extremely good job of doing what they can to improve the codebase without domain knowledge, which is inherent in their jobs, because they’re making company-wide changes.

                                                                            However most working engineers don’t improve code without domain knowledge, and the real improvements to code require such knowledge. You can only nibble at the edges otherwise.

@peterbourgon said basically what I was going to say in the original thread – this advice is generally good in the abstract, but it lacks context.

                                                                            https://lobste.rs/s/9n7aic/what_i_learned_from_software_engineering

                                                                            The way I learned things at Google was to look at what people who “got things done” did. They generally “break the rules” a bit. They know what matters and what doesn’t matter.

Jeff Dean and Sanjay Ghemawat indeed write great code and early in my career I exchanged a few CLs with them and learned a lot. I also referenced a blog post by Paul Buchheit in The Simplest Explanation of Oil.

For those who don’t know, he was the creator of Gmail, working on it for 3 years as a side project (and Gmail was amazing back then, faster than desktop MS Outlook, even though it’s rotted now.) He mentions in that post how he prototyped some ads with the aid of some Unix shell. (Again, ads are horrible now, a cancer on the web – back then they were useful and fast. Yes really. It’s hard to convey the difference to someone who wasn’t a web user then.)

As a couple other anecdotes, I remember a coworker complaining that Guido van Rossum’s functions were too long. (Actually I somewhat agreed, but he did it in service of getting something done, and it can be fixed later.)

I also remember the Java readability review of Bram Moolenaar (author of Vim), where he basically broke all the rules and got angry at the system. (For a brief time I was one of the people who picked the Python readability reviewers, so I’m familiar with this style of engineering; I had to manage some disputes between reviewers and applicants.)

So you have to take all these rules with a grain of salt. These people can obviously get things done, and they all do things a little differently. They don’t always write as many tests as you’d ideally like. One of the things I tried to do as the readability reviewer was to push back against dogma and get people to relax a bit. There is value to global consistency, but there’s also value to local domain-specific knowledge. My pushing back was not really successful, and Google engineering has gotten more dogmatic and sclerotic over the years. It was not fun to write code there by the time I left (over 5 years ago).


                                                                            So basically I think you have to look at what people build and see how they do it. I would rather read a bunch of stories like “Coders at Work” or “Masterminds of Programming” than read any empirical study.

                                                                            I think there should be a name for this empirical fallacy (or it probably already exists?) Another area where science has roundly failed is nutrition and preventative medicine. Maybe not for the same exact reasons, but the point is that controlled experiments are only one way of obtaining knowledge, and not the best one for many domains. They’re probably better at what Taleb calls “negative knowledge” – i.e. disproving something, which is possible and valuable. Trying to figure out how to act in the world (how to create software) is less possible. All things being equal, more testing is better, but all things aren’t always equal.

                                                                            Oil is probably the most rigorously tested project I’ve ever worked on, but this is because of the nature of the project, and it isn’t right for all projects as a rule. It’s probably not good if you’re trying to launch a video game platform like Stadia, etc.

                                                                            1. 8

Anyway, the point is that the empirical question isn’t really actionable IMO. You could “answer it” and it still wouldn’t tell you what to do.

                                                                              I think you got this post exactly right – there’s no amount of empiricism that can help you.

                                                                              This was my exact reaction when I read the original question motivating Hillel’s post.

I even want to take it a step further and say: Outside a specific context, the question doesn’t make sense. You won’t be able to measure it accurately, and even if you could, there would be such huge variance depending on other factors across teams where you measured it that your answer wouldn’t help you win any arguments.

                                                                              I think there should be a name for this empirical fallacy

                                                                              It seems especially to afflict the smart and educated. Having absorbed the lessons of science and the benefits of skepticism and self-doubt, you can ask of any claim “But is there a study proving it?”. It’s a powerful debate trick too. But it can often be a category error. The universe of useful knowledge is much larger than the subset that has been (or can be) tested with a random double blind study.

                                                                              1. 5

I even want to take it a step further and say: Outside a specific context, the question doesn’t make sense. You won’t be able to measure it accurately, and even if you could, there would be such huge variance depending on other factors across teams where you measured it that your answer wouldn’t help you win any arguments.

                                                                                It makes a lot of sense to me in my context, which is trying to convince skeptical managers that they should pay for my consulting services. But it’s intended to be used in conjunction with rhetoric, demos, case studies, testimonials, etc.

It seems especially to afflict the smart and educated. Having absorbed the lessons of science and the benefits of skepticism and self-doubt, you can ask of any claim “But is there a study proving it?”. It’s a powerful debate trick too. But it can often be a category error. The universe of useful knowledge is much larger than the subset that has been (or can be) tested with a random double blind study.

                                                                                I’d say in principle it’s Scientism, in practice it’s often an intentional sabotaging tactic.

                                                                                1. 1

                                                                                  It makes a lot of sense to me in my context, which is trying to convince skeptical managers that they should pay for my consulting services. But it’s intended to be used in conjunction with rhetoric, demos, case studies, testimonials, etc.

                                                                                  100%.

                                                                                  I should have said: I don’t think it would help you win any arguments with someone knowledgeable. I completely agree that in the real world, where people are making decisions off rough heuristics and politics is everything, this kind of evidence could be persuasive.

                                                                                  So a study showing that “catching bugs early saves money” functions here like a white lab coat on a doctor: it makes everyone feel safer. But what’s really happening is that they are just trusting that the doctor knows what he’s doing. Imo the other methods for establishing trust you mentioned – rhetoric, demos, case studies, testimonials, etc. – imprecise as they are, are probably more reliable signals.

                                                                                  EDIT: Also, just to be clear, I think the right answer here, the majority of the time, is “well obviously it’s better to catch bugs early than later.”

                                                                                  1. 2

                                                                                    the majority of the time

                                                                                    And in which cases is this false? Is it when the team has lots of senior engineers? Is it when the team controls both the software and the hardware? Is it when OTA updates are trivial? (Here is a knock-on effect: what if OTA updates make this assertion false, but then open up a huge can of security vulnerabilities, which overall negates any benefit that the OTA updates add?) What does a majority here mean? I mean, a majority of 55% means something very different from a majority of 99%.

This is the value of empirical software study. Adding precision to assertions (such as understanding that a 55% majority is a bit pathological but a 99% majority certainly isn’t). Diving into data and being able to understand and explore trends is another benefit. Humans are motivated to categorize their experiences around questions they wish to answer, but it’s much harder to answer questions that the human hasn’t posed yet. What if it turns out that catching bugs early or late is pretty much immaterial, and the real defect rate is simply a function of experience and seniority?

                                                                                    1. 1

This is the value of empirical software study.

I think empirical software study is great, and has tons of benefits. I just don’t think you can answer all questions of interest with it. The bugs question we’re discussing is one of those.

                                                                                      And in which cases is this false? Is it when the team has lots of senior engineers? Is it when the team controls both the software and the hardware? Is it when OTA updates are trivial? (Here is a knock-on effect: what if OTA updates make this assertion false, but then open up a huge can of security vulnerabilities, which overall negates any benefit that the OTA updates add?)

                                                                                      I mean, this is my point. There are too many factors to consider. I could add 50 more points to your bullet list.

                                                                                      What does a majority here mean?

                                                                                      Something like: “I find it almost impossible to think of examples from my personal experience, but understand the limits of my experience, and can imagine situations where it’s not true.” I think if it is true, it would often indicate a dysfunctional code base where validating changes out of production (via tests or other means) was incredibly expensive.

What if it turns out that catching bugs early or late is pretty much immaterial, and the real defect rate is simply a function of experience and seniority?

One of my points is that there is no “turns out”. If you prove it in one place, it won’t translate to another. It’s hard even to imagine an experimental design whose results I would give much weight to. All I can offer is my opinion that this strikes me as highly unlikely for most businesses.

                                                                                      1. 4

Why is software engineering such an outlier when we’ve been able to measure so many other things? We can measure vaccine efficacy and health outcomes (among disparate populations with different genetics, diets, culture, and life experiences), we can measure minerals in soil, we can analyze diets, heat transfer, we can even study government policy, elections, and even personality, though it’s messy. What makes software engineering so much more complex and context dependent than even a person’s personality?

                                                                                        The fallacy I see here is simply that software engineers see this massive complexity in software engineering because they are software experts and believe that other fields are simpler because software engineers are not experts in those fields. Every field has huge amounts of complexity, but what gives us confidence that software engineering is so much more complex than other fields?

                                                                                        1. 3

                                                                                          Why is software engineering such an outlier when we’ve been able to measure so many other things?

                                                                                          You can measure some things, just not all. Remember the point of discussion here is: Can you empirically investigate the claim “Finding bugs earlier saves overall time and money”? My position is basically: “This is an ill-defined question to ask at a general level.”

                                                                                          We can measure vaccine efficacy and health outcomes (among disparate populations with different genetics, diets, culture, and life experiences)

                                                                                          Yes.

                                                                                          we can measure minerals in soil, we can analyze diets, heat transfer,

                                                                                          Yes.

                                                                                          we can even study government policy

In some ways yes, in some ways no. This is a complex situation with tons of confounds, and also a place where policy outcomes in some places won’t translate to other places. This is probably a good analog for what makes the question at hand difficult.

                                                                                          and even personality

                                                                                          Again, in some ways yes, in some ways no. With the big 5, you’re using the power of statistical aggregation to cut through things we can’t answer. Of which there are many. The empirical literature on “code review being generally helpful” seems to have a similar force. You can take disparate measures of quality, disparate studies, and aggregate to arrive at relatively reliable conclusions. It helps that we have an obvious, common sense causal theory that makes it plausible.

                                                                                          What makes software engineering so much more complex and context dependent than even a person’s personality?

                                                                                          I don’t think it is.

                                                                                          Every field has huge amounts of complexity, but what gives us confidence that software engineering is so much more complex than other fields?

I don’t think it is, and this is not where my argument is coming from. There are many questions in other fields that are just as unsuited to empirical investigation as: “Does finding bugs earlier save time and money?”

                                                                                          1. 2

In some ways yes, in some ways no. This is a complex situation with tons of confounds, and also a place where policy outcomes in some places won’t translate to other places. This is probably a good analog for what makes the question at hand difficult.

That hasn’t stopped anyone from performing the analysis and using these analyses to implement policy. That the analysis of this data is imperfect is beside the point; it still provides some amount of positive value. Software is in the data dark ages in comparison to government policy; what data-driven decision has been made among software engineering teams? I don’t think we even understand whether Waterfall or Agile reduces defect rates or time to ship compared to the other.

                                                                                            With the big 5, you’re using the power of statistical aggregation to cut through things we can’t answer. Of which there are many. The empirical literature on “code review being generally helpful” seems to have a similar force. You can take disparate measures of quality, disparate studies, and aggregate to arrive at relatively reliable conclusions. It helps that we have an obvious, common sense causal theory that makes it plausible.

                                                                                            What’s stopping us from doing this with software engineering? Is it the lack of a causal theory? There are techniques to try to glean causality from statistical models. Is this not in line with your definition of “empirically”?

                                                                                            1. 5

That hasn’t stopped anyone from performing the analysis and using these analyses to implement policy. That the analysis of this data is imperfect is beside the point; it still provides some amount of positive value.

                                                                                              It’s not clear to me at all that, as a whole, “empirically driven” policy has had positive value? You can point to successful cases and disasters alike. I think in practice the “science” here is at least as often used as a veneer to push through an agenda as it is to implement objectively more effective policy. Just as software methodologies are.

                                                                                              Is it the lack of a causal theory?

                                                                                              I was saying there is a causal theory for why code review is effective.

                                                                                              What’s stopping us from doing this with software engineering?

Again, some parts of it can be studied empirically, and should be. I’m happy to see advances there. But I don’t see the whole thing being tamed by science. The high-order bits in most situations are politics and other human stuff. You mentioned it being young… but here’s an analogy that might help with where I’m coming from. Teaching writing, especially creative writing. It’s equally ad-hoc and unscientific, despite being old. MFA programs use different methodologies and writers subscribe to different philosophies. There is some broad consensus about general things that mostly work and that most people do (workshops), but even within that there’s a lot of variation. And great books are written by people with wildly different approaches. There are some nice efforts to leverage empiricism like Steven Pinker’s book and even software like https://hemingwayapp.com/, but systematization can only go so far.

                                                                                          2. 2

                                                                                            We can measure vaccine efficacy and health outcomes (among disparate populations with different genetics, diets, culture, and life experiences)

                                                                                            Good vaccine studies are pretty expensive from what I know, but they have statistical power for that reason.

                                                                                            Health studies are all over the map. The “pile of college sophomores” problem very much applies there as well. There are tons of studies done on Caucasians that simply don’t apply in the same way to Asians or Africans, yet some doctors use that knowledge to treat patients.

                                                                                            Good doctors will use local knowledge and rules of thumb, and they don’t believe every published study they see. That would honestly be impossible, as lots of them are in direct contradiction to each other. (Contradiction is a problem that science shares with apprenticeship from experts; for example IIRC we don’t even know if a high fat diet causes heart disease, which was accepted wisdom for a long time.)

                                                                                            https://www.nytimes.com/2016/09/13/well/eat/how-the-sugar-industry-shifted-blame-to-fat.html

                                                                                            I would recommend reading some books by Nassim Taleb if you want to understand the limits of acquiring knowledge through measurement and statistics (Black Swan, Antifragile, etc.). Here is one comment I made about them recently: https://news.ycombinator.com/item?id=27213384

Key point: acting in the world, i.e. decision making under risk, is fundamentally different from scientific knowledge. Tinkering and experimentation are what drive real changes in the world, not planning by academics. He calls the latter “the Soviet-Harvard school”.

                                                                                            The books are not well organized, but he hammers home the difference between acting in the world and knowledge over and over in many different ways. If you have to have scientific knowledge before acting, you will be extremely limited in what you can do. You will probably lose all your money in the markets too :)


Update: after Googling the term I found in my notes, I’d say “Soviet-Harvard delusion” captures the crux of the argument here. One short definition is: the (unscientific) overestimation of the reach of scientific knowledge.

                                                                                            https://www.grahammann.net/book-notes/antifragile-nassim-nicholas-taleb

                                                                                            https://medium.com/the-many/the-right-way-to-be-wrong-bc1199dbc667

                                                                                            https://taylorpearson.me/antifragile-book-notes/

                                                                                            1. 2

                                                                                              This sounds like empiricism. Not in the sense of “we can only know what we can measure” but in the sense of “I can only know what I can experience”. The Royal Society’s motto is “take nobody’s word for it”.

                                                                                              Tinkering and experimentation are what drive real changes in the world, not planning by academics.

                                                                                              I 100% agree but it’s not the whole picture. You need theory to compress and see further. It’s the back and forth between theory and experimentation that drives knowledge. Tinkering alone often ossifies into ritual. In programming, this has already happened.

                                                                                              1. 1

                                                                                                I agree about the back and forth, of course.

                                                                                                I wouldn’t agree programming has ossified into ritual. Certainly it has at Google, which has a rigid coding style, toolchain, and set of languages – and it’s probably worse at other large companies.

                                                                                                But I see lots of people on this site doing different things, e.g. running OpenBSD and weird hardware, weird programming languages, etc. There are also tons of smaller newer companies using different languages. Lots of enthusiasm around Rust, Zig, etc. and a notable amount of production use.

                                                                                                1. 1

                                                                                                  My bad, I didn’t mean all programming has become ritual. I meant that we’ve seen instances of it.

                                                                                              2. 1

                                                                                                Good vaccine studies are pretty expensive from what I know, but they have statistical power for that reason.

                                                                                                Oh sure, I’m not saying this will be cheap. In fact the price of collecting good data is what I feel makes this research so difficult.

                                                                                                Health studies are all over the map. The “pile of college sophomores” problem very much applies there as well. There are tons of studies done on Caucasians that simply don’t apply in the same way to Asians or Africans, yet some doctors use that knowledge to treat patients.

We’ve developed techniques to deal with these issues, though of course, you can’t draw a conclusion with extremely low sample sizes. One technique frequently used in meta-studies to compensate for individual studies with low statistical power is called post-stratification.
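For reference, a hedged sketch of the textbook post-stratified estimator from survey sampling (the general form; nothing here is specific to software studies): split the pooled sample into strata whose population shares are known, and reweight each stratum’s sample mean by its share.

```latex
% Post-stratified estimate of a population mean:
% H strata, known population shares N_h / N, per-stratum sample means \bar{y}_h.
\hat{\bar{Y}}_{\mathrm{post}} = \sum_{h=1}^{H} \frac{N_h}{N}\, \bar{y}_h
```

Which stratum variables would make sense for software studies (team size, domain, seniority) is of course exactly the contextual question being argued about in this thread.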

                                                                                                Good doctors will use local knowledge and rules of thumb, and they don’t believe every published study they see. That would honestly be impossible, as lots of them are in direct contradiction to each other. (Contradiction is a problem that science shares with apprenticeship from experts; for example IIRC we don’t even know if a high fat diet causes heart disease, which was accepted wisdom for a long time.)

I think medicine is a good example of empiricism done right. Sure, we can look at modern failures of medicine and nutrition and use these learnings to do better, but medicine is significantly more empirical than software. I still maintain that if we can systematize our understanding of the human body and medicine, we can do the same for software, though like a soft science, definitive answers may stay elusive. Much work over decades went into the medical sciences to define what it even means to have an illness, to feel pain, to see recovery, or to combat an illness.

                                                                                                I would recommend reading some books by Nassim Taleb if you want to understand the limits of acquiring knowledge through measurement and statistics (Black Swan, Antifragile, etc.). Here is one comment I made about them recently: https://news.ycombinator.com/item?id=27213384

Key point: acting in the world, i.e. decision making under risk, is fundamentally different from scientific knowledge. Tinkering and experimentation are what drive real changes in the world, not planning by academics. He calls the latter “the Soviet-Harvard school”.

I’m very familiar with Taleb’s Antifragile thesis and the “Soviet-Harvard delusion”. As someone well versed in statistics, I find these theses both pedestrian (Antifragile itself being a pop-science look into a field of study called Extreme Value Theory) and old (Maximum Likelihood approaches to decision theory are susceptible to extreme/tail events, which is why in recent years Bayesian and Bayesian Causal analyses have become more popular. Fisher was aware of this weakness and explored other branches of statistics such as Fiducial Inference). (Also I don’t mean this as criticism toward you, though it’s hard to make this tone come across over text. I apologize if it felt offensive, I merely wish to draw your eyes to more recent developments.)

To draw the discussion to a close, I’ll try to summarize my position a bit. I don’t think software empiricism will answer all the questions, nor will we get to a point where we can rigorously determine that some function f exists that can model our preferences. However, I do think software empiricism together with standardization can offer us a way to confidently produce low-risk, low-defect software. I think modern statistical advances have offered us ways to understand more than the statistical approaches of the ‘70s could, and that we can use many of the newer techniques from the social and medical sciences (e.g. Bayesian methods) to prove results. I don’t think that, even if we start a concerted approach today, our understanding will get there in a matter of a few years. Getting there would mean undoing decades of software practitioners building systematized analyses from their own experiences, and shifting the culture away from the individual as artisan toward standardization of both how we communicate results (what is a bug? how does it affect my code? how long did it take to find? how long did it take to resolve? etc.) and how we describe team conditions (our team has n engineers, our engineers have x years of experience, etc.), standardization that we just don’t have now. I have hope that eventually we will begin to both standardize and understand our industry better, but in the near term this will be difficult.
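To make the standardization point concrete, here’s a hedged sketch of what a shared defect record could look like, using only the fields mentioned above; the type and field names are hypothetical, not an existing schema or standard.

```rust
// Hypothetical standardized defect record. The fields mirror the questions
// above (what is a bug? how long to find? how long to resolve? under what
// team conditions?); names are illustrative, not an existing standard.
struct DefectReport {
    /// Short classification of what went wrong ("what is a bug?").
    category: String,
    /// Where the defect was introduced and where it was detected.
    introduced_in: Phase,
    detected_in: Phase,
    /// Hours from introduction to detection, and from detection to resolution.
    hours_to_find: f64,
    hours_to_resolve: f64,
    /// Team conditions at the time, so results can be compared across teams.
    team_size: u32,
    median_years_experience: f64,
}

enum Phase {
    Design,
    Implementation,
    CodeReview,
    Testing,
    Production,
}
```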

                                                                                    2. 5

                                                                                      Here’s a published paper that purposefully illustrates the point you’re trying to make: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC300808/. It’s an entertaining read.

                                                                                      1. 1

                                                                                        Yup I remember that from debates on whether to wear masks or not! :) It’s a nice pithy illustration of the problem.

                                                                                      2. 2

                                                                                        Actually I found a (condescending but funny/memorable) name for the fallacy – the “Soviet-Harvard delusion” :)

                                                                                        An (unscientific) overestimation of the reach of scientific knowledge.

                                                                                        I found it in my personal wiki, in 2012 notes on the book Antifragile.

                                                                                        Original comment: https://lobste.rs/s/v4unx3/i_ing_hate_science#c_nrdasq

                                                                                      3. 3

                                                                                        I’m reading a book right now about 17th century science. The author has some stuff to say about Bacon and Empiricism but I’ll borrow an anecdote from the book. Boyle did an experiment where he grew a pumpkin and measured the dirt before and after. The weight of the dirt hadn’t changed much. The only other ingredient that had been added was water. It was obvious that the pumpkin must be made of only water.

                                                                                        This idea that measurement and observation drive knowledge is Bacon’s legacy. Even in Bacon’s own lifetime, it’s not how science unfolded.

                                                                                        1. 2

                                                                                          Fun fact: Bacon is often considered the modern founder of the idea that knowledge can be used to create human-directed progress. Before him, while scholars and astronomers used to often study things and invent things, most cultures still viewed life and nature as a generally haphazard process. As with most things in history the reality involves more than just Bacon, and there most-certainly were non-Westerners who had similar ideas, but Bacon still figures prominently in the picture.

                                                                                          1. 1

                                                                                            Hm interesting anecdote that I didn’t know about (I looked it up). Although I’d say that’s more an error of reasoning within science? I realized what I was getting at could be called the Soviet-Harvard delusion, which is overstating the reach of scientific knowledge (no insult intended, but it is a funny and memorable name): https://lobste.rs/s/v4unx3/i_ing_hate_science#c_nrdasq

                                                                                            1. 1

                                                                                              To be fair, the vast majority of the mass of the pumpkin is water. So the inference was correct to first order. The second-order correction of “and carbon from the air”, of course, requires being much more careful in the inference step.

                                                                                            2. 2

                                                                                              So basically I think you have to look at what people build and see how they do it. I would rather read a bunch of stories like “Coders at Work” or “Masterminds of Programming” than read any empirical study.

Perhaps, but this is already what happens, and I think it’s about time we in the profession raise our standards, both of pedagogy and of practice. Right now you can do a casual search on the Web and find respected talking heads explaining how their philosophy is correct, despite being in direct contrast to another person’s philosophy. This behavior is reinforced by the culture wars of our times, of course, but there’s still much more aimless discourse than there is consistency in results. If we want to start taking steps to improve our practice, I think it’s important to understand what we’re doing right and more importantly what we’re doing wrong. I’m more interested here in negative results than positive results. I want to know where as a discipline software engineering is going wrong. There’s also a lot at stake here purely monetarily; corporations often embrace a technology methodology and pay for PR and marketing about their methodology to both bolster their reputations and to try to attract engineers.

I think there should be a name for this empirical fallacy (or it probably already exists?) Another area where science has roundly failed is nutrition and preventative medicine.

I don’t think we’re even at the point in our empirical understanding of software engineering where we can commit this fallacy. What do we even definitively understand about our field? I’d argue that psychology and sociology have stronger well-known results than what we have in software engineering even though those are very obviously soft sciences. I also think software engineers are motivated to think the problem is complex and impossible to study empirically for the same reason that anyone holds their work in high esteem; we believe our work is complicated and requires highly contextual expertise to understand. However if psychology and sociology can make empirical progress in their fields, I think software engineers most definitely can.

                                                                                              1. 2

                                                                                                Do you have an example in mind of the direct contradiction? I don’t see much of a problem if different experts have different opinions. That just means they were building different things and different strategies apply.

                                                                                                Again I say it’s good to “look at what people build” and see if it applies to your situation; not blindly follow advice from authorities (e.g. some study “proved” this, or some guy from Google who may or may not have built things said this was good; therefore it must be good).

I don’t find a huge amount of divergence in the opinions of people who actually build stuff, vs. talking heads. If you look at what John Carmack says about software engineering, it’s generally pretty level-headed, and he explains it well. It’s not going to differ that much from what Jeff Dean says. If you look at their C++ code, there are even similarities, despite drastically different domains.

                                                                                                Again the fallacy is that there’s a single “correct” – it depends on the domain; a little diversity is a good thing.

                                                                                                1. 4

                                                                                                  Do you have an example in mind of the direct contradiction? I don’t see much of a problem if different experts have different opinions. That just means they were building different things and different strategies apply.

Here are two fun ones I like to contrast: The Unreasonable Effectiveness of Dynamic Typing for Practical Programs (Vimeo) and The advantages of static typing, simply stated. Two separate authors who came to different conclusions from similar evidence. While yes, their lived experience is undoubtedly different, these are folks who are espousing (mostly, not completely) contradictory viewpoints.

I don’t find a huge amount of divergence in the opinions of people who actually build stuff, vs. talking heads. If you look at what John Carmack says about software engineering, it’s generally pretty level-headed, and he explains it well. It’s not going to differ that much from what Jeff Dean says. If you look at their C++ code, there are even similarities, despite drastically different domains.

Who builds things though? Several people build things. While we hear about John Carmack and Jeff Dean, there are folks plugging away at the Linux kernel, on io_uring, on capability object systems, and all sorts of things that many of us will never be aware of. As an example, Sanjay Ghemawat is someone who I wasn’t familiar with until you talked about him. I’ve also interacted with folks in my career who I presume you’ve never interacted with and yet have been an invaluable source of learnings for my own code. Moreover these experience reports are biased by their reputations; I mean of course we’re more likely to listen to John Carmack than some Vijay Foo (not a real person, as far as I’m aware) because he’s known for his work at id Software, even if this Vijay Foo may end up having as many actionable insights as John Carmack, or more. Overcoming reputation bias and lack of information about “builders” is another side effect I see of empirical research. Aggregating learnings across individuals can help surface lessons that otherwise would have been lost due to structural issues of acclaim and money.

                                                                                                  Again the fallacy is that there’s a single “correct” – it depends on the domain; a little diversity is a good thing.

This seems to be a sentiment I’ve read elsewhere, so I want to emphasize: I don’t think there’s anything wrong with diversity, and I don’t think empirical software engineering does anything to diminish it. Creating complicated probabilistic models of spaces necessarily involves many factors. We can create a probability space which has all of the features we care about. Just condition against your “domain” (e.g. kernel work, distributed systems, etc) and slot your result into that domain. I don’t doubt that a truly descriptive probability space will be very high dimensional here but I’m confident we have the analytical and computational power to perform this work nonetheless.

                                                                                                  The real challenge I suspect will be to gather the data. FOSS developers are time and money strapped as it is, and excluding some exceptional cases such as curl’s codebase statistics, they’re rarely going to have the time to take the detailed notes it would take to drive this research forward. Corporations which develop proprietary software have almost no incentive to release this data to the general public given how much it could expose about their internal organizational structure and coding practices, so rather than open themselves up to scrutiny they keep the data internal if they measure it at all. Combating this will be a tough problem.

                                                                                                  1. 2

Yeah I don’t see any conflict there (and I’ve watched the first one before). I use both static and dynamic languages and there are advantages and disadvantages to each. I think any programmer should be comfortable using both styles.

                                                                                                    I think that the notion that a study is going to change anyone’s mind is silly, like “I am very productive in statically typed languages. But a study said that they are not more productive; therefore I will switch to dynamically typed”. That is very silly.

It’s also not a question that’s ever actionable in reality. Nobody says “Should I use a static or dynamic language for this project?” More likely you are working on an existing codebase, OR you have a choice between say Python and Go. The difference between Python and Go would be a more interesting and accurate study, not static vs. dynamic. But you can’t do an “all pairs” comparison via scientific studies.

                                                                                                    If there WERE a study definitely proving that say dynamic languages are “better” (whatever that means), and you chose Python over Go for that reason, that would be a huge mistake. It’s just not enough evidence; the languages are different for other reasons.

                                                                                                    I think there is value to scientific studies on software engineering, but I think the field just moves very fast, and if you wait for science, you’ll be missing out on a lot of stuff. I try things based on what people who get things done do (e.g. OCaml), and incorporate it into my own work, and that seems like a good way of obtaining knowledge.

                                                                                                    Likewise, I think “Is catching bugs earlier less expensive” is a pretty bad question. A better scientific question might be “is unit testing in Python more effective than integration testing Python with shell” or something like that. Even that’s sort of silly because the answer is “both”.

                                                                                                    But my point is that these vague and general questions simply leave out a lot of subtlety of any particular situation, and can’t be answered in any useful way.

                                                                                                    1. 2

                                                                                                      I think that the notion that a study is going to change anyone’s mind is silly, like “I am very productive in statically typed languages. But a study said that they are not more productive; therefore I will switch to dynamically typed”. That is very silly.

While the example of static and dynamic typing is probably too broad to be meaningful, I don’t actually think this would be true. It’s a bit like saying “Well I believe that Python is the best language and even though research shows that Go has properties <x, y, and z> that are beneficial to my problem domain, well I’m going to ignore them and put a huge prior on my past experience.” It’s the state of the art right now; trust your gut and the guts of those you respect, not the other guts. If we can’t progress from here I would indeed be sad.

It’s also not a question that’s ever actionable in reality. Nobody says “Should I use a static or dynamic language for this project?” More likely you are working on an existing codebase, OR you have a choice between say Python and Go. The difference between Python and Go would be a more interesting and accurate study, not static vs. dynamic. But you can’t do an “all pairs” comparison via scientific studies.

                                                                                                      Sure, as you say, static vs dynamic languages isn’t very actionable but Python vs Go would be. And if I’m starting a new codebase, a new project, or a new company, it might be meaningful to have research that shows that, say, Python has a higher defect rate but an overall lower mean time to resolution of these defects. Prior experience with Go may trump benefits that Python has (in this synthetic example) if project time horizons are short, but if time horizons are long Go (again in the synthetic example) might look better. I think this sort of comparative analysis in defect rates, mean time to resolution, defect severity, and other attributes can be very useful.

                                                                                                      Personally, I’m not satisfied by the state of the art of looking at builders. I think the industry really needs a more rigorous look at its assumptions and even if we never truly systematize and Fordify the field (which fwiw I don’t think is possible), I certainly think there’s a lot of progress for us to make yet and many pedestrian questions that we can answer that have no answers yet.

                                                                                                      1. 2

                                                                                                        Sure, as you say, static vs dynamic languages isn’t very actionable but Python vs Go would be. And if I’m starting a new codebase, a new project, or a new company, it might be meaningful to have research that shows that, say, Python has a higher defect rate but an overall lower mean time to resolution of these defects.

                                                                                                        Python vs Go defect rates also seem to me to be far too general for an empirical study to produce actionable data.

                                                                                                        How do you quantify a “defect rate” in a way that’s relevant to my problem, for example? There are a ton of confounds: genre of software, timescale of development, size of team, composition of team, goals of the project, etc. How do I know that some empirical study comparing defect rates of Python vs. Go in, I dunno, the giant Google monorepo, is applicable to my context? Let’s say I’m trying to pick a language to write some AI research software, which will have a 2-person team, no monorepo or formalized code-review processes, a target 1-year timeframe to completion, and a primary metric of producing figures for a paper. Why would I expect the Google study to produce valid data for my decision-making?

                                                                                                      2. 1

                                                                                                        Nobody says “Should I use a static or dynamic language for this project?”

Somebody does. Somebody writes the first code on a new project and chooses the language. Somebody sets the corporate policy on permissible languages. It would be amazing if even a tiny input to these choices were real evidence instead of just perceived popularity and personal familiarity.

                                                                                                2. 2

                                                                                                  I meant to comment on that original thread, because I thought the question was misguided. Well now that I look it’s actually been deleted?

                                                                                                  Too many downvotes this month. ¯\_(ツ)_/¯

                                                                                                  1. 1

                                                                                                    This situation is not ideal :(

                                                                                                1. 1

                                                                                                  Finished a fix for some gnarly problems with GeoTools, hoping to get some time to work on language-related stuff again (annotations instead of modifiers, unified condition expressions, etc.).

                                                                                                  1. 2

                                                                                                    Can someone expand on the relevance of this?

                                                                                                    1. 4

                                                                                                      “Plan 9 is not for you” — http://fqa.9front.org/fqa0.html#0.1.3

                                                                                                      1. 4

                                                                                                        It’s a release of a niche operating system. Normally there’s not much to say about these, but the switch to git9 and git/fs is really cool. I want to check out more of that.

                                                                                                      1. 2

                                                                                                        Some real underappreciated gems in there!

                                                                                                        1. 7

                                                                                                          Seems like a product release page.

                                                                                                          Still super neat, but this makes me increasingly believe that Deno is some kind of low-key startup play.

                                                                                                          1. 13

They hinted at that in the original Deno company announcement post. https://deno.com/blog/the-deno-company

I personally think it’s good that OSS initiatives try to experiment with different sustainability models, even if there are plenty of ways it can go wrong. To me it’s a wake-up call for the tech world: we must start getting creative to make sure our projects are sustainable, because right now the predominant model of caring about OSS seems to be to stay asleep until a company buys the project, and only when it’s too late do people wake up and rage-fork it (see Audacity).

                                                                                                            1. 1

Isn’t that exactly what this venture is optimized for? Minimizing the time until it gets sold, while maximizing the money they get for it?

I would be honestly surprised if users don’t get beautiful-journey-ed in short order.

                                                                                                              1. 2

Maybe, but looking at what’s happening with OSS lately, trying to stay virtuous without a real business plan ends up in the same place anyway. I think being deliberate has a chance of making things work out better in the end. That said, only time will tell for Deno.

                                                                                                            2. 5

To me it’s great news that Deno finally has a product.

I kept wondering why they were building a niche version of Node.js, but it turns out they have been building their own Cloudflare Workers. That makes a lot more sense now.

                                                                                                            1. 19

                                                                                                              This was much better than I thought, and worth reading for sql fans.

                                                                                                              My main disagreement is that this conflates two things:

                                                                                                              1. Sql being seriously suboptimal at the thing it’s designed for; and distinctly
                                                                                                              2. Sql being bad at things general purpose programming languages are good at.

There’s value in a restricted language with a clearly defined conceptual model that meets well-defined design goals. Despite serious flaws, sql is quite good at its core mission of declarative relational querying.

                                                                                                              In many ways the porosity story is not bad - for example Postgres lets you embed lots of languages. I think a lot of the criticisms here really mean that more than one language is needed, and integrating them smoothly is the issue.

                                                                                                              For me, better explicit declaration of what extensions are required for a query to run would make things more maintainable. I think the criticisms in the article around compositionality are in the right area at least - much more clarity would be better here.

                                                                                                              In terms of an upgrade path - if we accept that basically sql is pretty sound but too ad hoc - then this is a very similar problem to that of shell programming. I find the “smooth upgrade path” theory of oil shell plausible (and I’d add that Perl in many ways was a smooth upgrade from shell) although many more people have attempted smooth upgrade paths than have succeeded.

                                                                                                              My best guess as to how to do it would be to implement an alternative but similar and principled language on top of at least two popular engines - probably drawn from the set of SQLite, Postgres, and MySQL - that accommodates the different engines being different and allows their differences to be exposed in a convenient way. If you can get the better query language into at least two of those, you’ll be reaching a large audience who are actually trying to do real work. All easier said than done, of course.

                                                                                                              1. 15

                                                                                                                Sql being bad at things general purpose programming languages are good at.

                                                                                                                I think this (and what follows) is a misinterpretation.

The core idea is not to change things such that SQL is suddenly good at GP tasks, but to adopt the things from GP languages that worked well there and will also work well in the SQL context; for instance:

                                                                                                                • Sane scoping rules.
                                                                                                                • Namespaces.
                                                                                                                • Imports.
                                                                                                                • Some kind of generic programming.

                                                                                                                These things alone would enable people to write “cross-database SQL standard libraries” that would make it easier to write portable SQL (which the database vendors are obviously not interested in).
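
To make that portability point concrete, here is a small Python sketch (the function and dialect names are invented purely for illustration) of the kind of per-dialect shim that application code has to carry today – exactly the sort of thing a cross-database standard library could absorb:

```python
# Hypothetical illustration: the per-dialect SQL shims applications hand-roll today
# because non-trivial SQL is not portable. A shared, importable SQL standard
# library would be the natural home for this instead.

def trunc_to_month(column: str, dialect: str) -> str:
    """Return a SQL expression that truncates a timestamp column to the month."""
    if dialect == "postgres":
        return f"date_trunc('month', {column})"
    if dialect == "sqlite":
        # SQLite has no date_trunc; emulate it with strftime.
        return f"strftime('%Y-%m-01', {column})"
    raise ValueError(f"unsupported dialect: {dialect}")

print(trunc_to_month("created_at", "postgres"))  # date_trunc('month', created_at)
print(trunc_to_month("created_at", "sqlite"))    # strftime('%Y-%m-01', created_at)
```

Multiply that by every date, string, and aggregation quirk across 20 databases and you get the grueling translation work mentioned below.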

Which would then free up resources for people who want to improve communication with databases in other ways¹ – because having to write different translation code for 20 different databases and their individual accumulation of 20 years of “quirks” is a grueling task.

                                                                                                                principled language on top of at least two popular engines - probably drawn from the set of SQLite, Postgres, and MySQL - that accommodates the different engines being different and allows their differences to be exposed in a convenient way

I think most of the ecosystem’s weakness comes from any non-trivial SQL code being non-portable. I would neither want “differences exposed in a convenient way”, nor would I call a language that did that “principled”.


                                                                                                                ¹ E. g. why does shepherding some data from a database into a language’s runtime require half a dozen copies and conversions?

                                                                                                                1. 2

                                                                                                                  I guess maybe I just disagree on the problem. I don’t think portability is a very important goal, and I would give it up before pretty much anything else.

                                                                                                                  1. 5

Portability is not the important goal in itself; it’s simply the prerequisite for getting anything done, including the things you may consider important goals.

Because without it, everyone trying to improve things is isolated into their own database-specific silo, and you have seen the effect of this over the last decades: little to no fundamental improvement in how we use or interact with databases.

                                                                                                                    1. -2

                                                                                                                      No I don’t think so.

                                                                                                                2. 7

                                                                                                                  My best guess as to how to do it would be to implement an alternative but similar and principled language on top of at least two popular engines - probably drawn from the set of SQLite, Postgres, and MySQL

That’s exactly what I did with Preql, by making it compile to SQL (https://github.com/erezsh/Preql)

                                                                                                                  Still waiting for the audience :P

                                                                                                                  1. 3

Yeah, but (a) it’s not available out of the box, and (b) it’s not obvious there’s a smooth upgrade path here, or even that this is the language people want. Which is only somewhat of a criticism – lots of new things are going to have to be tried before one sticks.

                                                                                                                  2. 2

                                                                                                                    Sql being seriously suboptimal at the thing it’s designed for; and distinctly

                                                                                                                    Sql being bad at things general purpose programming languages are good at.

Excellent point. Bucketing those concerns would make this “rant” even better! I do think that stuff falls into both buckets (though what falls into the first bucket is trivially solvable, especially with frameworks like linq or ecto, or gql overlays). The second category, though, does reflect that people do want optimization for some of those things, and it’s worth thinking about how a “replacement for sql” might want to approach them.

                                                                                                                  1. 14

This was an absolutely brilliant article! It was fantastically well researched and written by someone with expert knowledge of the domain. I’m learning so much from reading it.

The argument about representing JSON objects in SQL was not persuasive to me. I do not really understand why this would be desirable. I see the SQL approach as a more statically-typed one, where you would process JSON objects and ensure they fit a predefined structure before inserting them into SQL. For a more dynamic approach where you just throw JSON objects into a database, you have MongoDB. On that note, I think the lack of union types in SQL is a feature more than a limitation, isn’t it?

                                                                                                                    Excellent point about JOIN syntax being verbose, and the lack of sugar or any way to metaprogram and define new syntax. The query language could be so much more expressive and easy to use.

                                                                                                                    It totals ~16kloc and was mostly written by a single person. Materialize adds support for SQL and various data sources. To date, that has taken ~128kloc (not including dependencies) and I estimate ~15-20 engineer-years

I think these line counts say a lot! The extra work of trying to fulfill all the criteria of the SQL standard isn’t necessary for implementing a database system. A more compact language specification would allow implementations to be shorter and make the language much easier for people to learn.

                                                                                                                    The overall vibe of the NoSQL years was “relations bad, objects good”.

The whole attitude of the NoSQL movement put me off it a lot. Lacking types and structure never sounded like an improvement to me – more like people just wanted to skip the boring work of declaring tables and such. But that work is the foundation for things running smoothly, so I think the more dynamic approach will often bite you in the end. Then again, the author’s explanation of GraphQL honestly sold me on it; after reading this I would be very open to using that in the future rather than SQL.

                                                                                                                    Strategies for actually getting people to use the thing are much harder.

This is a frustrating part about innovation in programming, but honestly I believe the ideas he has presented are too significant an improvement for people not to start using them.

                                                                                                                    1. 7

                                                                                                                      If you have data encoded in a JSON format, it often falls naturally into sets of values with named fields (that’s the beauty of the relational model) so you can convert it into a SQL database more or less painlessly.

                                                                                                                      On the other hand, if you want to store actual JSON in a SQL database, perhaps to run analytical queries on things like “how often is ‘breakfast’ used as a key rather than as a value”, it’s much more difficult, because “a JSON value” is not a thing with a fixed representation. A JSON number might be stored as eight bytes, but a JSON string could be any length, never mind objects or lists. You could create a bunch of SQL tables for each possible kind of JSON value (numbers, strings, booleans, objects, lists) but if a particular object’s key’s value can be a number or a string, how do you write that foreign key constraint?

                                                                                                                      Sure, most applications don’t need to query JSON in those ways, but since the relational model is supposed to be able to represent any kind of data, the fact that SQL falls flat on its face when you try to represent one of the most common data formats of the 21st century is a little embarrassing.

                                                                                                                      That’s what the post means by “union types”. Not in the C/C++ sense of type-punning, but in the sense of “a single data type with a fixed number of variants”.
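
For a concrete sense of the awkwardness, here is a minimal sketch using Python’s built-in sqlite3 (table and column names invented for illustration) of the usual workaround: a type tag plus one nullable column per variant, with a CHECK constraint standing in for the exhaustiveness a real sum type would give you.

```python
# One common way to fake a union/sum type in SQL: tag + nullable variant columns.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE json_value (
    id      INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL CHECK (kind IN ('null', 'bool', 'number', 'string', 'array', 'object')),
    as_bool INTEGER,
    as_num  REAL,
    as_str  TEXT,
    -- arrays and objects need further child tables keyed by parent id
    CHECK (
        (kind = 'bool'   AND as_bool IS NOT NULL AND as_num IS NULL AND as_str IS NULL) OR
        (kind = 'number' AND as_num  IS NOT NULL AND as_bool IS NULL AND as_str IS NULL) OR
        (kind = 'string' AND as_str  IS NOT NULL AND as_bool IS NULL AND as_num IS NULL) OR
        (kind IN ('null', 'array', 'object') AND as_bool IS NULL AND as_num IS NULL AND as_str IS NULL)
    )
);
CREATE TABLE object_entry (
    object_id INTEGER NOT NULL REFERENCES json_value(id),
    key       TEXT NOT NULL,
    value_id  INTEGER NOT NULL REFERENCES json_value(id)
);
""")
conn.execute("INSERT INTO json_value (kind, as_str) VALUES ('string', 'breakfast')")
conn.commit()
```

Even with all that ceremony, the object_entry.value_id foreign key can only say “points at some JSON value”; there is no way to constrain it to “a number or a string”, which is exactly the union-type gap described above.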

                                                                                                                      1. 4

                                                                                                                        A JSON number might be stored as eight bytes

                                                                                                                        Sorry to nitpick, but a JSON number can be of any length. I think what you were thinking of was JavaScript, in which numbers are represented as 64-bit values.

                                                                                                                        1. 1

                                                                                                                          No, the json standard provides for a maximum number of digits in numbers. Yes I know this because of a bug from where I assumed json numbers could be any length.

                                                                                                                          Edit: I stand corrected - I’m certain I saw something in the standard about a limit (I was surprised) but it seems there isn’t. That said various implementations are allowed to limit the length they process. https://datatracker.ietf.org/doc/html/rfc7159#section-6

                                                                                                                          1. 5

                                                                                                                            Which standard? ECMA-404 doesn’t appear to have a length limitation on numbers. RFC 8259 says something much more specific:

                                                                                                                            This specification allows implementations to set limits on the range and precision of numbers accepted. Since software that implements IEEE 754 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision. A JSON number such as 1E400 or 3.141592653589793238462643383279 may indicate potential interoperability problems, since it suggests that the software that created it expects receiving software to have greater capabilities for numeric magnitude and precision than is widely available.

                                                                                                                            In fewer words, long numbers are syntactically legal but might be incorrectly interpreted depending on which implementation is decoding.
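
To see that interoperability caveat in practice, here’s a quick Python illustration: the literal below is syntactically legal JSON, but the default decoder maps it to an IEEE 754 double and silently rounds, while opting into Decimal preserves the digits.

```python
# Same legal JSON number, two different decodings.
import json
from decimal import Decimal

doc = '{"pi": 3.141592653589793238462643383279}'

print(json.loads(doc)["pi"])                       # 3.141592653589793 (rounded to a double)
print(json.loads(doc, parse_float=Decimal)["pi"])  # 3.141592653589793238462643383279
```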

                                                                                                                            1. 1

The ECMA-404 standard doesn’t talk about any numerical limits at all, and RFC 7159 talks about implementation-specific limitations, which a) is kind of obvious, because RAM isn’t unlimited in the real world, and b) doesn’t buy you anything if you are implementing a library that needs to deal with JSON as it exists in the wild.

So yes, JSON numbers can be of unlimited magnitude and precision, and any correct parsing library had better deal with this.

                                                                                                                        2. 5

                                                                                                                          Lacking types and structure never sounded like an improvement to me - more like people just wanted to skip the boring work of declaring tables and such.

                                                                                                                          To some degree it’s the same as the arguments in favor of dynamically-typed languages. Just s/tables/variable types/, etc.

Also, remember the recent post which included corbin’s (?) quote about “you can’t extend your type system across the network” — that was about RPC, but it applies to distributed systems as well, and the big win of NoSQL originally was horizontal scaling, i.e. distributing the database across servers.

                                                                                                                          [imaginary “has worked at Couchbase for ten years doing document-db stuff” hat]

                                                                                                                          1. 3

                                                                                                                            The whole attitude of the NoSQL movement put me off it a lot. Lacking types and structure never sounded like an improvement to me - more like people just wanted to skip the boring work of declaring tables and such.

                                                                                                                            I always thought that NoSQL came about because people didn’t feel like dealing with schema migrations. I’ve certainly dreaded any sort of schema migration that did more than just add or remove columns. But I never actually tried using NoSQL “databases” so I can’t speak about whether or not they actually help.

                                                                                                                            1. 13

In practice you still need to do migrations, in the form of deploying your code to write the new column in a backwards-compatible way and then later removing that backwards-compatible layer. The intermediate deployments that allow the new and old code to live side by side, as well as a safe rollback, are required whether you use sql or not. The only difference is that you don’t have to actually run a schema migration. A downside of this is that it’s much easier to miss what actually turns out to be a schema change in a code review, since there are no explicit “migration” files to look for.
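
As a rough Python sketch of that pattern (plain dicts standing in for documents in a schemaless store; all field names invented), the read path has to tolerate both shapes during the transition while the write path only ever produces the new one:

```python
# "Migration in application code": support old and new document shapes on read,
# write only the new shape, and drop the compatibility branch once backfill is done.

def read_full_name(doc: dict) -> str:
    # New shape: separate first/last fields.
    if "first_name" in doc:
        return f"{doc['first_name']} {doc['last_name']}"
    # Old shape: a single "name" field; keep supporting it until backfill is done.
    return doc["name"]

def write_user(first: str, last: str) -> dict:
    # Always write the new shape; the old one is only read, never produced.
    return {"first_name": first, "last_name": last}

print(read_full_name({"name": "Ada Lovelace"}))        # old document
print(read_full_name(write_user("Grace", "Hopper")))   # new document
```

Nothing in a code review flags read_full_name as “a schema change”, which is the downside mentioned above.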

                                                                                                                              1. 10

This! You’re basically sweeping dirt under the carpet. One day you’re going to have to deal with it.

                                                                                                                              2. 11

                                                                                                                                In my experience this leads to data inconsistencies and the need to code defensively or otherwise maintain additional application code.

                                                                                                                                1. 9

                                                                                                                                  Not if you’re hopping jobs every 1-2 years. If you’re out the door quickly enough, you can honestly claim you’ve never run into any long-term maintainability issues with your choice of technologies.

                                                                                                                                2. 3

                                                                                                                                  I always thought that NoSQL came about because people didn’t feel like dealing with schema migrations.

                                                                                                                                  I think that’s unlikely, most NoSQL people probably have no idea what schema migrations are.

                                                                                                                              1. 1

                                                                                                                                Desperately trying to find some articles I read in the last few months, but cannot manage to find anymore.

                                                                                                                                Either my Google-fu is failing me, or stuff just disappeared from the (searchable) web.

                                                                                                                                1. 5

What does “teaching/coding language” mean? What are you even trying to teach? I think the subtext is that all the things the IDE makes convenient, things you would need to learn if you were using Vim, are not “coding”? Am I right?

                                                                                                                                  Why not just write that then? I am not saying IDE’s are bad, but now that we are implicitly redefining words, why don’t you just let me redefine coding as “non-HTML based cross-platform GUI coding”? Now F# suddenly doesn’t seem like a good “coding language” ;)

                                                                                                                                  Here comes my typical Haskeller retort: because F# doesn’t have a sufficiently strong type system (effectful code based on higher-ranked polymorphism is not popular in F#), you are not really doing FP. Without higher-rank polymorphism, you’re stuck in the pure part of FP, which is the easy one. F# jumps the shark on the hard parts, and it isn’t convenient to write modularized effectful code.

                                                                                                                                  So is the implication here that “teaching language” means you don’t need to worry about effects? I think teaching should be about doing it right. If F# guides you down a wrong path, is it really a good teaching language?

                                                                                                                                  I’d agree more to something like “beginner FP language”.

                                                                                                                                  1. 5

                                                                                                                                    because F# doesn’t have a sufficiently strong type system (effectful code based on higher-ranked polymorphism is not popular in F#), you are not really doing FP. Without higher-rank polymorphism, you’re stuck in the pure part of FP, which is the easy one. F# jumps the shark on the hard parts, and it isn’t convenient to write modularized effectful code.

                                                                                                                                    For anyone unfamiliar with these terms: functional programming and higher-rank polymorphism are completely unrelated concepts.

                                                                                                                                    Functional programming is about avoiding mutation and side effects, and higher-rank polymorphism is a type-checking feature. Functional code doesn’t stop being functional if you remove the type-checking step from your build, so it’s not possible for higher-rank polymorphism to be a requirement for doing FP.

                                                                                                                                    1. 6

                                                                                                                                      I don’t understand this FP gatekeeping.

                                                                                                                                      Functional programming is about avoiding mutation and side effects, and higher-rank polymorphism is a type-checking feature. Functional code doesn’t stop being functional if you remove the type-checking step from your build

                                                                                                                                      A couple years back, there were lots of posts about JavaScript being functional. At that time, I remember hearing arguments that closures and functions were what functional programming is really about.

Personally, I’m inclined to slightly agree with the grandparent comments: without higher-ranked types it is tricky to represent IO in an immutable fashion. Since the alternative is to bail out and just use mutation (as F# does), it isn’t as functional as it could be (by your own definition of “avoiding mutation and side effects”).

                                                                                                                                      1. 1

without higher-ranked types it is tricky to represent IO in an immutable fashion. Since the alternative is to bail out and just use mutation (as F# does), it isn’t as functional as it could be (by your own definition of “avoiding mutation and side effects”).

                                                                                                                                        Elm is a typed pure functional language without higher-ranked types, and Elm is even stricter about purity of effects than Haskell, which has unsafePerformIO. (Elm has no equivalent of that.)

                                                                                                                                        I don’t think there’s anything particularly tricky about it!

                                                                                                                                        1. 5

Elm chooses to limit pretty strongly what sort of IO the user can do, to the point that I’m not sure it counts as a general-purpose programming language. For example: what is the type of a function that reads a file? How do you capture those effects? You can definitely avoid using monads for IO, but IMO it leads to less powerful or less usable systems. For example: Haskell initially modeled IO using infinite lists (and it wasn’t very usable).

unsafePerformIO is a compiler hack. It isn’t part of the Haskell specification; it is just used for FFI interop with other languages. EDIT: the legitimate use cases for unsafePerformIO are extremely rare and are (to my knowledge) always about some performance optimization made outside of Haskell 98.

                                                                                                                                          1. 1

                                                                                                                                            You can definitely avoid using monads for IO, but IMO it leads to less powerful or less usable systems.

                                                                                                                                            Elm does use monads for I/O, it just doesn’t call them that.

                                                                                                                                            Here is Elm’s equivalent of Haskell’s IO (except that it includes error handling, which Haskell’s IO doesn’t):

                                                                                                                                            https://package.elm-lang.org/packages/elm/core/latest/Task#Task

                                                                                                                                            Nothing tricky about it imo.

                                                                                                                                            1. 2

                                                                                                                                              Can I define my own Task to call a C function using FFI? Can’t find it.

                                                                                                                                              1. 3

                                                                                                                                                Only if your name starts with Evan. ;-)

                                                                                                                                                1. 1

                                                                                                                                                  Elm compiles to JavaScript, so no.

                                                                                                                                                  Given that the thread was about whether functional programming has anything to do with higher-ranked types, I’ll assume this change of topic means that discussion is over.

                                                                                                                                                2. 2

Oh man, that link reminds me of Elm’s time-handling approach, which made me want to take Evan’s hand and tell him “Kid, it’s not as easy as you think it is. You think you had a really smart idea there, but really, you didn’t.”

                                                                                                                                                  1. 1

                                                                                                                                                    Interesting - I had the opposite reaction. I’ve never used a time API I liked as much as Elm’s. What didn’t you like about it?

                                                                                                                                                    1. 1

From an ergonomic perspective, an example problem (of many!) is the choice to make time zones explicit. I suppose that this comes from the SPA tradition, where ad-hoc user input (“Choose your language”) or Web browser APIs are used to establish a preferred presentation language instead of content negotiation. However, just like with the Unicode sandwich technique, we can have a UTC Sandwich design where times on Earth are handled uniformly, and only the outermost presentation layers need to worry about timezones.

                                                                                                                                                      Worse, in my opinion, is the decision to make access to timers unprivileged. This invites timing side-channels. I know of no solutions besides making them explicit parameters instead of allowing them to be ambiently imported.

                                                                                                                                                      In general, Elm’s libraries and designs are oriented towards SPAs but not towards backend services, ruining the hope of so-called “isomorphic” deployments where identical code runs in the Web browser and the backend.

                                                                                                                                                      1. 1

                                                                                                                                                        From an ergonomic perspective, an example problem (of many!) is the choice to make time zones explicit. […] However, just like with the Unicode sandwich technique, we can have a UTC Sandwich design where times on Earth are handled uniformly, and only the outermost presentation layers need to worry about timezones.

                                                                                                                                                        I personally like this technique - explicit time zones are one of my favorite parts about the design of the elm/time package - but fair enough if your preferences differ from mine!

                                                                                                                                                        Worse, in my opinion, is the decision to make access to timers unprivileged. This invites timing side-channels. I know of no solutions besides making them explicit parameters instead of allowing them to be ambiently imported.

                                                                                                                                                        Hmm, how would a language (any language!) support even basic current-time use cases (like obtaining a current timestamp) without making timing attacks possible?

                                                                                                                                                        1. 1

                                                                                                                                                          Time-zone handling isn’t just a preference. We’ll necessarily incur an extra table lookup if we want to decode a historical timestamp which is relative to some time zone, so it’s slower. That extra table has to be from an old time-zone map, so we must store an ever-growing number of old time-zone maps, so it’s bigger. And it is another thing that programmers might get incorrect, so it’s got more potential for bugs.

                                                                                                                                                          By “explicit parameters” for timers, I mean that the application’s entry points would accept timer values as parameters, and invoking those timer values would produce timestamps. Timing attacks are still possible, but they no longer have the potential to run rampant through a codebase. For a practical example, this Monte module implementing Python’s timeit algorithm is parameterized upon some timer object, but doesn’t have automatic access to any privileged clocks or timers. This module drawing some console graphics explicitly requests the system clock Timer and syntactically reveals where it is used.
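
For illustration, a minimal Python sketch of the same idea (the function names are mine, not from the Monte modules linked above): the measuring code receives its timer as a parameter, so timing authority is visible at the call site and can be stubbed out rather than ambiently imported.

```python
# Timers as explicit parameters: only code that is handed a timer can time things.
import time

def benchmark(work, timer, rounds: int = 1000) -> float:
    """Run `work` `rounds` times and return elapsed seconds as measured by `timer`."""
    start = timer()
    for _ in range(rounds):
        work()
    return timer() - start

# The caller decides whether real timing authority is handed in...
print(benchmark(lambda: sum(range(100)), time.perf_counter))

# ...and tests (or untrusted code paths) can receive a fake clock instead.
fake_now = iter([0.0, 1.5])
print(benchmark(lambda: None, lambda: next(fake_now)))  # 1.5
```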

                                                                                                                                                          1. 1

                                                                                                                                                            We’ll necessarily incur an extra table lookup if we want to decode a historical timestamp which is relative to some time zone, so it’s slower.

                                                                                                                                                            I see - so, to check my understanding: for reasons beyond your control, it’s stored in the database (or you get it from an external data source) relative to a particular time zone, and you want to decode it (and then work with it?) while staying relative to that time zone?

                                                                                                                                          2. 2

                                                                                                                                            functional code doesn’t stop being functional if you remove the type-checking

                                                                                                                                            Like the “Rust is NP-hard” post shows, the type system is also used for code generation. So I can’t even run my code after removing the type checker.

                                                                                                                                            We don’t need to mince words about what FP is, because this is a discussion about what F# is missing. I should have said “pure typed FP”. Would you agree with me then?

                                                                                                                                            1. 3

                                                                                                                                              I should have said “pure typed FP”. Would you agree with me then?

                                                                                                                                              Also no; a counterexample would be Elm, which is a typed pure functional language without higher-rank polymorphism!

                                                                                                                                              1. 6

                                                                                                                                                You can also have an effect system (like Koka) and still not have higher-kinded types. These are only necessary if you want to abstract over monads.