1. 27

Nobody knows how to correctly install and package Python apps.

That’s a relief. I thought I was the only one.

1. 5

Maybe poetry and pyoxidize will have a baby and we’ll all be saved.

One can hope. One can dream.

1.

I think the same goes for running Python web apps. I had a conversation with somebody here… and we both agreed it took us YEARS to really figure out how to run a Python web app. Compared to PHP where there is a good division of labor between hosting and app authoring.

The first app I wrote was CGI in Python on shared hosting, and that actually worked. So that’s why I like Unix – because it’s simple and works. But it is limited because I wasn’t using any libraries, etc. And SSL at that time was a problem.

Then I moved from shared hosting to a VPS. I think I started using mod_python, which is the equivalent of mod_php – a shared library within Apache.

Then I used a CherryPy server and WSGI (mod_python predates WSGI). I think it was behind Apache.

Then I moved to gunicorn behind nginx, and I still use that now.

But at the beginning of this year, I made another small Python web app with Flask. I managed to configure it on shared hosting with FastCGI, so Python is just like PHP now!!! (Although I wouldn’t do this for big apps, just personal apps).

So I went full circle … while all the time I think PHP stayed roughly the same :) I just wanted to run a simple app and not mess with this stuff.

There were a lot of genuine improvements – gunicorn is better than CherryPy, nginx is easier to configure than Apache, and FastCGI is better than CGI and mod_python – but it was a lot of catching up with PHP, IMO. Also, FastCGI is still barely supported.
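For reference, everything in that progression from CherryPy onward speaks the same WSGI interface, which is what makes swapping servers possible. A minimal WSGI app (a sketch, using nothing outside the stdlib) looks like this:

```python
# A minimal WSGI application. gunicorn, CherryPy's WSGI server, and
# flup's FastCGI bridge all call this same (environ, start_response)
# interface, which is what makes the server swaps described above possible.
def app(environ, start_response):
    body = b"Hello from WSGI\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]  # an iterable of byte strings
```

Pointing any of those servers at it is then a deployment detail, e.g. `gunicorn module:app`.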

1.

I’d make an exception to this point: “…unless you’re already a Python shop.” I did this at $job and it’s going okay because it’s just in the monorepo where everyone has a Python toolchain set up. No installation required (thank god).

1.

Why is Python’s packaging story so much worse than Ruby’s? Is it just that dependencies aren’t specified declaratively in Python, but in code (i.e. setup.py), so you need to run code to determine them?

1.

Gemfile and gemspec are both just Ruby DSLs and can contain arbitrary code, so that’s not much different. One difference is that PyPI routinely distributes binary blobs, called “wheels”, that can be built in arbitrarily complex ways, whereas rubygems always builds from source.

2.

I just run pkg install some-python-package-here using my OS’s package manager. ;-P It’s usually pretty straightforward to add Python projects to our ports/package repos.

1. 4

I build static site generators for almost every project I have that needs a site (my personal blog build script, for example), except for the ones that only have a single page (like dbcore.org). Each time it’s ~200 lines of code that lasts me years with only minor tweaks. I much prefer that to dealing with breaking changes from third-party generators over the years. My basic formula is a file system walker + a markdown parser (when I’m not feeling lazy) + jinja or an equivalent template engine for layouts. Actually, my personal blog doesn’t even use jinja; it just uses Python string templates, which have… worked ok for years. Eventually I add crazy things like RSS feed generators, and I’ve been thinking about adding static site search, where indexes are built from the posts during deployment and you just use JavaScript to search them in the browser. Having a link checker and a spell check pass might also be nice. And after reading all that you might say it’s a waste of time to build your own, and on bigger projects I’d probably agree with you.
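That walker + template formula can be sketched in a couple of dozen lines of stdlib Python. This is a hypothetical minimal version – the directory names and the layout are made up, and it uses string.Template in place of jinja or a markdown parser:

```python
import os
import shutil
from string import Template

# Minimal static-site-generator sketch: walk a content tree, render each
# .txt page through a string.Template layout, and copy other files through.
# "content/", "site/", and the layout below are illustrative placeholders.
LAYOUT = Template("<html><body><h1>$title</h1>\n$body</body></html>\n")

def build(src="content", dst="site"):
    for root, _dirs, files in os.walk(src):
        out_dir = os.path.join(dst, os.path.relpath(root, src))
        os.makedirs(out_dir, exist_ok=True)
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(".txt"):
                # Convention for this sketch: first line is the title,
                # the rest is the body.
                with open(path) as f:
                    title, _, body = f.read().partition("\n")
                out = os.path.join(out_dir, name[:-4] + ".html")
                with open(out, "w") as f:
                    f.write(LAYOUT.substitute(title=title, body=body))
            else:
                shutil.copy(path, out_dir)  # static assets pass through
```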
But again, the benefit is that you can start very quickly, start small, and only worry about breaking changes in the generator when you want to.

1. 3

If you’re worried about breaking changes, isn’t the solution to just not upgrade?

1. 1

But then you’re stuck on an unsupported version. So you’ve now taken responsibility for their whole ecosystem.

1. 2

I don’t see how being on an unsupported version matters if it works for you and you don’t want it to change? Just like your self-built one works for you and isn’t a pain because you basically never change it.

1. 1

As soon as you do want to make your own changes, you have to deal with the breaking changes too. When you build it yourself, the only changes are ones you want to make.

1. 2

If I use, say, Jekyll or Hugo, then I can use a specific version of it. I’m running it offline, on trusted data, so I don’t care about security vulnerabilities. If there are bugs, I can either fix them or work around them. If I upgrade then there may be some breaking changes, and I can decide whether the new features that I get for free are worth the pain, and I can benefit from an ecosystem full of useful things like BibTeX-compatible bibliography engines, FreeBSD man page links, and other fun stuff. If I write my own SSG, then I can use a specific version of it – the one that I wrote. I’m running it offline, on trusted data, so I don’t care about security vulnerabilities. If there are bugs, I can either fix them or work around them. I can’t upgrade except by writing more code, and I don’t benefit from any other contributions or from a wider ecosystem. Every feature that I want, I must implement myself. It’s really hard for me to see how the second option is better than the first.

1. 2

Before I write more I’ll clarify that I’m really not trying to convince anyone per se. :) Just sharing my experience.
Maybe a differentiating factor for you and me: do you generally find it easier to contribute to projects you don’t own (and that are driven by a community of many people) or projects that you were the solo dev on? For me it’s always easier to contribute to my own projects than to get into existing ones. I’m not saying mine are better in any way, just that mine are smaller and I already know them. Maybe the point that comes closest to an objective argument is code size. How many lines of code is Hugo or Jekyll, including dependencies? Those are lines of code you’re on the hook for.

1. 1

> Maybe a differentiating factor for you and me: do you generally find it easier to contribute to projects you don’t own (and that are driven by a community of many people) or projects that you were the solo dev on?

It can go either way. For a project with a good test suite and CI infrastructure already in place, it’s easier for me to contribute than to a project that I set up myself and haven’t set up the relevant infrastructure for yet, as long as upstream is receptive. There’s also a difference between three categories of change:

• Changes that I upstream, which must be useful to others and production quality.
• Changes that I carry locally (or in a downstream fork), which can be small tweaks that break it for others but improve it for me.
• Changes that are layered on top of something else. With Jekyll, there’s a lot of scripting available in Liquid and there’s a plug-in interface, so I can maintain downstream plugins if I need to, but everything I’ve actually needed was either already implemented by someone else in a plugin or possible for me to implement with Liquid and carry in my site’s repo.

> Maybe the point that comes closest to an objective argument is code size. How many lines of code is Hugo or Jekyll, including dependencies? Those are lines of code you’re on the hook for.

That cuts both ways. A lot of the code in Jekyll does stuff that I want.
The lack of it in a from-scratch implementation is code that I would have to write. I don’t have to maintain those features in the upstream repo; I only need to worry about when they change things in ways that affect my own site. This happens, but it has generally meant that I need to spend a few minutes tweaking a couple of lines of Liquid for a major version bump. That’s orders of magnitude less effort for me than implementing something like Jekyll-Scholar from scratch. And if I did, I’d probably end up using some of the same libraries, so I’d be just as vulnerable to API changes (if not more so, since Jekyll-Scholar provides interfaces designed for end users, whereas libraries provide interfaces intended for developers).

1. 6

I know enough of ddevault to understand why he went with IRC instead of Matrix. But I think it is the wrong choice. There’s a reason why sr.ht uses git instead of CVS (or RCS). Similarly, IRC should be replaced with Matrix.

1. 12

I’m not sure Matrix is obviously better than IRC, especially not on a protocol level; it might have more features than IRC right now, but I think half the point of the project was to try and fix that disparity (?) We already have lots of Matrix clients that work pretty well; why should people not be allowed to work on IRC clients, too, especially since we don’t have as much development going on there?

1. 9

I find that a very weird statement. If you’re talking about an org of a certain size, you can say it makes sense that they choose X over Y. But this is a very small org that provides a service to its paying customers, most probably because they use IRC themselves. Nothing should be replaced if people are happy to use it.

1. 4

Matrix is still, AFAIK, Not Great™ to operate, because the server guzzles resources and lacks a lot of moderation features (which can be hard to implement due to the DAG).

1. 2

FWIW I’m running Synapse with a ton of open channels across multiple homeservers and I’m not running into any resource issues on a VPS with 2 GB of RAM (previously 1, but it was running a bit tight) and some swap. I expect Dendrite, the new Go homeserver, to cut resource usage down significantly once it stabilizes.

1. 1

So on the resource front, I think it definitely depends on the server implementation you use. I’ve moved to the conduit homeserver implementation, which is using 500 MB RES in some high-volume channels. I don’t have numbers on hand, but dendrite and synapse both used gigs of RES mem, IIRC. Now admittedly, compare that to an ircd, which is no doubt much lower. As far as mod features go, yeah, matrix could use more things in the spec, which will probably be hard to implement.

1.

Synapse has improved a lot.

2. 4

I don’t think this is a relevant critique. I do, personally, think that Matrix is the better protocol and I use it myself, but sr.ht uses IRC themselves and is just offering a service to its paying customers to use IRC. If you find that valuable, you can pay for it or use it along with your existing sr.ht account, and if you don’t, you don’t need to. 🤷 If they used XMPP instead, they could even offer similar XMPP services.

1. 3

Do you mind clarifying what parallels you’re drawing between CVS vs Git and IRC vs Matrix?

1. 7

CVS and IRC are hosted on a central server; Git and Matrix are distributed/federated.

1. 10

That makes sense; I’m not sure that alone is an argument that Matrix is an unqualified better choice than IRC, though. Matrix is a very heavy ecosystem that has relatively few implementations. Actually setting up a homeserver is an arduous process, and the homeservers tend to be pretty resource intensive, which presents scalability issues, especially for a more “independent” service which is not backed by a cloud monopolist with compute resources coming out the wazoo.
IRC is not federated, but the relative ease with which IRC servers are spun up and their undemanding operational requirements make it far more effectively decentralised as an ecosystem than Matrix. Matrix also seems not to fit a lot of the ‘ethos’ that sourcehut espouses: in-house developed software that’s built for purpose and aims to be pragmatic in terms of both use and design, often using technologies and workflows that free software developers already regularly use. IRC fits into this category much more neatly than Matrix does. It also feels to me personally that the advantages of federation are not as pronounced in real-time (synchronous) chat as in source control.

1. 1

> IRC is not federated

What do you mean by this? IRC networks consist of many interconnected servers run by different people.

1.

IRC is a closed federation. Matrix/XMPP are open federations (that can be limited by allow/denylists or firewalls).

1.

IRC servers can have allow/denylists and firewalls too. What makes it closed and the others open?

2. 1

IRC can be distributed through server-to-server connections, but IRC is not a federated protocol, because these networks share a common view of users and channels for as long as they are connected, and there is no way to bridge communications with other networks at the level of the protocol itself. Compare and contrast with XMPP and Matrix, where it’s perfectly possible to communicate with others on federated servers that have absolutely no relation to your homeserver, and there is a clear delineation of the ownership of identities and rooms.

1. 2

I don’t see any substance to the idea that IRC servers can’t federate with each other while XMPP/Matrix servers can. All federated protocols form networks which can become fragmented by mismatches between software and policies.
You also contrast “a common view of users and channels” with a “clear delineation of the ownership of identities and rooms.” This arises from a fundamental difference in what the protocols offer, namely that XMPP offers persistent identities while IRC does not, but that has no bearing on whether a protocol supports federation. For a non-federated contrast to IRC/XMPP/Matrix, see ICQ.

2. 2

A more obvious choice in that case might be XMPP. Or even SMTP (see Delta Chat).

1. 1

IRC is federated.

3. 3

Perhaps you could lay out why ddevault would disagree, considering that he is unable to respond here (due to a series of incidents that lobste.rs has decided to keep secret).

1.

1. 4

“Computer science” was the phrase that caused me some trouble here. I spent some time trying to understand what the article was saying, and I concluded that the article managed, at the same time, to be both elitist and trivial. The statement that “computer science was originally invented to be taught to everyone, but not for economic advantage” is loaded. Perhaps it is a reaction against the “We need to teach everyone to code.” craze, but then I found elsewhere in the article a note about how not teaching everyone computers was a threat to a democratic society, so I really could not place it. Computers are a tool. Some people build the tools, and a lot of other people use them to do what they actually want to do, like make art or run a business. I’d like to compare the use of computers with the use of cars. When cars first came out they were fiddly things, and for a while you had to be mechanically inclined to use them. Then they became more and more user friendly, because for everyone who liked to spend evenings under the motor there were a thousand for whom the car was a tool to improve their quality of life. We got to the nice position in society where you don’t have to know anything about internal combustion, lithium ions, gears or electric motors to use a car for business or pleasure.
Through the efforts of those employing computer science, we are approaching the state where you don’t have to know about bits and bytes to use computers for business or pleasure, and we have been at a reasonable spot with that for many decades now. It is not a threat to the free world. If we need to teach artists and entrepreneurs computer science so they can use computers, we have failed in the same way as if we had to teach them thermodynamics and electrical engineering so they can drive a car.

1. 4

Academics aren’t incentivized to create something like this, because doing so is just “applied” research, which tends not to be as prestigious. You don’t get to write many groundbreaking papers by taking a bunch of existing ideas and putting them together nicely. Consider electronic voting as a simple example. With a paper ballot, the set of people who can audit the process is huge: basically, anyone who is numerate and not too badly visually impaired. Any candidate who has enough support to stand a chance in a fair election can find people who can turn up at polling stations and monitor the ballots. Contrast that with an electronic scheme, where (ignoring the difficulties of accessing the code) the number of people who can audit the election is very small. There are a lot of examples like this where power is concentrated in the hands of a small number of people who understand a particular system.

> I’d like to compare the use of computers with the use of cars.

I think that is an incredibly misleading analogy. 100 years ago, a car was a machine to get you from A to B. Today, a car is a machine to get you from A to B. The value of the car is directly related to how well it performs that specific task. The task is reasonably well defined and (aside from a few changes to traffic legislation) really hasn’t changed much over the last century. The most valuable car would be one that requires zero maintenance and drives itself.
Having cars all do the same thing makes traffic management easier and improves efficiency in the system overall. In contrast, the value of a computer comes from the fact that it can be made to do new things. The larger the space of new things you are able to make a computer do, the more valuable the computer is to you. If enough people need to do a specific thing then there may be some off-the-shelf software that does it already, but as soon as you want to do something more specialised you need to make the computer do something new. This may be something simple, such as entering a new formula into a spreadsheet or writing a macro to automate a task in a word processor, but it is still fundamentally a specific thing that you are making the computer do beyond what everyone else does with it.

> If we need to teach artists and entrepreneurs computer science so they can use computers we have failed in the same way as if we had to teach them thermodynamics and electrical engineering so they can drive a car.

If we are going to say that the only tasks someone should do with a general-purpose computing device are the set of things that an elite of programmers have permitted them to do, then we have failed them in a far worse way.

1. 1

> It is not a threat to the free world.

Events with social media, misinformation, and the like beg to differ.

> as if we had to teach them thermodynamics and electrical engineering so they can drive a car.

Even mechanics don’t need to know either of those, so the comparison feels forced.

1. 11

I have a computer science degree from a fairly well-known college. I disagree pretty strongly with a lot of my friends and colleagues who studied computer science at the same time and place I did about exactly what specific things are the problems with social media, what information constitutes misinformation, and what the correct political or technological responses to these issues are.
Expecting people who study computer science to magically have the right answers to these fundamentally political questions is like expecting everyone who knows how to take apart a car engine to magically have the right answers to public-policy questions about what the right road tolls should be and whether it’s a good or bad idea to build a highway in a given location.

1. 1

No, of course I don’t expect any X to magically Y in any context. What I do expect is that better-educated populations are harder to control at scale. Would teaching person X that computers can process information in ways A, B, or C make them realise Facebook is dangerous? No, of course not, just as teaching person X to read would not have broken the stranglehold of the church by itself.

1. 1

I would agree that better-educated populations are harder to control. However, I lump any “practical” computer science degree in with the least educational vocational schools and schools which don’t take the liberal arts seriously. A liberal arts degree is an education… at some schools. But not others. It’s just as useless as computer science degrees that don’t head for theory land and get lost there. To the point where I tell most people who want to code professionally and wish to go to university that they ought to study anything but computer science.

1. 1

Oh, of course, most of most current university systems is a shit show. But most people wouldn’t get CS exposure at university anyway, because most people don’t go to university. Anyway, we’re veering far out of the topic space now, I fear…

2. 3

> Events with social media, misinformation, and the like beg to differ.

How strong is the evidence that learning computer science makes one less likely to fall for non-computer-science-related misinformation?
To the extent there’s a correlation, it seems like it’d be pretty hard to disentangle “learning computer science” (which is what we’re talking about getting everyone to do) from “being predisposed to learn computer science.”

1. 1

More that knowing anything about the power of computing would make people less likely to blindly give over all their data and attention to a single black-box program.

1. 11

Is that actually true, though? I know tons of people with CS degrees who are totally fine with Facebook, etc. At the same time, I know a bunch of people who aren’t terribly technical who are concerned about that stuff. I think it actually has much more to do with general civic awareness than technical skills.

2. 3

But are social media and misinformation really caused by computers? I thought so for a while. It’s easier to get into bubbles; it makes sense. But lately I am not so sure anymore. Looking a few decades back, you still had crazy terrorists, but you also had really crazy cults, mass suicides, and terrorists with horribly obscure beliefs. At the same time, it’s easier than ever to get opposing opinions, even in circumstances where one is watched a lot of the time. I think to some degree it’s easier to look professional, but then judging things purely by looks has always been wrong. Yes, I think people have to learn not to blindly trust everything, but people had to do that with conspiracy theories in books, newspapers, and TV too. Besides that, people really should not forget how history has also changed society over these decades: the Vietnam war, the cold war, the wars in the Middle East. Huge amounts of lies were told by governments across the world. There are good reasons for people to distrust governments. We have a situation now where we see different media telling different stories, but we have had that before: Catholics vs. Protestants, different parties having their own newspapers.
Of course, today media spreads faster, maybe too fast, evoking emotions like the other things I’ve mentioned did. Globalisation, everyone knowing English, and information “warfare” being possible for smaller groups or individuals are all part of that. However, with the history in mind, one needs to put things into perspective. And in my opinion it’s not too different from terrorist organizations of any kind having better weapons, because there are better weapons. I think social media, and a culture that doesn’t care about making stupid ideas public, just makes things that used to be thought in private, or talked about in homes, bars, etc., more public. We get a mirror of society that is real time, rather than having investigative journalists infiltrate personal meetings. Now you know how your cousin, your uncle, etc. actually think about the world. I think being public about all things also led to other changes. People dare to talk about their sexuality, about diseases, and about many other former taboos for the same reasons. I think a lot of this is two-sided. I certainly think social media is to blame for all sorts of things, but I do think a lot of what it is criticized for is making symptoms visible and most likely amplifying them. I don’t think the root causes are often found in social media.

1. 1

This is incredible. An optimization to make querySelector(sel) call getElementById when sel[0] == “#” seems like something I would expect browsers to do…

1. 3

What about querySelector("#t1 .c1")? Or <div id="foo:bar">? Or <div id="foo\bar">? You have to pay the costs of parsing, no matter what.

1. 1

If the selector is anything but a raw id you couldn’t have used getElementById anyway, but recognizing “this is just an id” should be much cheaper than parsing full CSS expressions.
I’m not sure what the exact point of your funky-id examples was, but both are invalid, so even if a browser chooses to accept them, it seems fine to pay a performance penalty for using invalid IDs.

1. 4

My point is that the browser needs to tokenize and parse the entire query string before it can decide to shortcut into getElementById. The parsing is the expensive part. Those examples are both valid IDs, straight out of MDN. How you escape them is what differs between querySelector and getElementById.

1. 1

I also highly recommend checking out xonsh.

1. 3

> Deploying apps for the Linux desktop is hard.

Honestly lost me here. x86_64 binaries will work on 80% of up-to-date systems, and everyone else can maintain a package in their distro of choice if they want it. Heck, even several-year-old nonfree binary blobs often work fine, much more than half the time.

1. 10

Shipping a binary is easy. Shipping a binary that depends on libraries is where you get into Fun Time, due to the varying package managers.

1. 2

Most binaries I download just list the libraries needed and make it my problem to install them. Some ship .so files in the bundle, which seems popular with nonfree software especially, for some reason.

1. 7

That’s a pretty bad user experience compared to application distribution on Windows/macOS. Sure, the dev can package it for various distributions, or volunteers can, but that still results in a lot of work.

1. 1

With AppImage things just work, in my experience. I build on CentOS 7 and it works on bleeding-edge Arch Linux.

1. 4

This is an interesting thread on writing Makefiles which are POSIX-compatible. The interesting thing is that it’s very hard or impossible, at least if you want to keep some standard features like out-of-tree builds. I’ve never restricted myself to writing portable Makefiles (I use GNU extensions freely), but I previously assumed it wasn’t that bad.
That this is so hard is maybe a good example of why portability to different dependencies is a bad goal when your dependencies are already open source and portable. As many posters in the thread say, you can just use gmake on FreeBSD. The same goes for many other open source dependencies: if the software is open source, portability to alternatives to that software is not really important.

1. 4

> you can just use gmake on FreeBSD.

I can, but I don’t want to. If you want to require a specific tool or dependency, fine, that’s your prerogative; just don’t force your idea of the tool’s cost on me. Own your decision: if it impacts me, be frank about it. Just don’t bullshit me that it doesn’t impact me merely because the cost for you is less than the cost for me. The question of why I don’t use X instead of Y is nobody’s business but mine. I fully understand and expect that you might not care about Y; please respect my right not to care about X.

1. 11

That’s very standard rhetoric about portability, but the linked thread shows it’s not so simple in this case: it’s essentially impossible to write good, portable Makefiles.

1. 5

Especially considering how low the cost of using GNU Make is compared to, say, switching OS/architecture.

1. 2

It’s just as easy to run BSD make on Linux as it is to run GNU make on the BSDs, yet if I shipped my software to Linux users with a BSD makefile and told them to install BSD make, there would hardly be a person who wouldn’t scoff at the idea. Yet Linux users expect BSD users not to complain when they do the exact same thing. Why is this so hard to understand: the objection is not that you have to run some software dependency, the objection is people telling you that you shouldn’t care about the nature of the dependency because their cost for that dependency is different from yours.
I don’t think that your software is bad because it uses GNU make, and I don’t think that using GNU make makes you a bad person, but if you try to convince me that “using GNU make is not a big deal”, then I don’t want to ever work with you.

1. 2

Are BSD makefiles incompatible with GNU make? I actually don’t know.

1. 2

The features, syntax, and semantics of GNU and BSD make are disjoint. Their intersection is POSIX make, which has almost no features. …but that’s not the point at all.

1. 2

If they use BSD-specific extensions, then yes.

2. 2

POSIX should really standardize some of GNU make’s features (e.g. pattern rules) and/or the BSDs should just adopt them.

1. 5

I get the vibe at this point that BSD intentionally refuses to make improvements to their software specifically because those improvements came from GNU, and they really hate GNU. Maybe there’s another reason, but why else would you put up with a program that is missing such a critically important feature and force your users to go thru the absurd workarounds described in the article when it would be so much easier and better for everyone to just make your make better?

1. 4

> I get the vibe at this point that BSD intentionally refuses to make improvements to their software specifically because those improvements came from GNU, and they really hate GNU.

Really? I’ve observed the opposite. For example, glibc refused to adopt the strl* functions from OpenBSD’s libc, in spite of the fact that they were useful and widely implemented, and the refusal to merge them explicitly called them ‘inefficient BSD crap’ in spite of the fact that they were no less efficient than the existing strn* functions. glibc implemented the POSIX _l-suffixed versions but not the full set from Darwin libc. In contrast, you’ll find a lot of ‘added for GNU compatibility’ functions in FreeBSD libc, and the *BSD utilities have ‘for GNU compatibility’ in a lot of places.
Picking a utility at random: FreeBSD’s du has two flags that are listed in the man page as first appearing in the GNU version, whereas GNU du does not list any as coming from the BSDs (though -d, at least, was originally in FreeBSD’s du – the lack of it in GNU and OpenBSD du used to annoy me a lot, since most of my du invocations used -d0 or -d1).

1. 2

The two are in no way mutually exclusive.

2. 1

> Maybe there’s another reason, but why else would you put up with a program that is missing such a critically important feature and force your users to go thru the absurd workarounds described in the article when it would be so much easier and better for everyone to just make your make better?

Every active software project has an infinite set of possible features or bug fixes; some of them will remain unimplemented for decades. glibc’s daemon function, for example, has been broken under Linux since it was implemented. The BSD Make maintainers just have a different view of the importance of this feature. There’s no reason to attribute negative intent.

1. 1

> The BSD Make maintainers just have a different view of the importance of this feature

I mean, I used to think that too, but after reading the article and learning the details I have a really hard time continuing to believe that. We’re talking about pretty basic everyday functionality here.

2. 1

Every BSD is different, but most BSDs are minimalist-leaning. They don’t want to add features, not because GNU has them, but because they only add things they’ve really decided they need. It’s an anti-bloat philosophy. GNU, on the other hand, is basically founded on the mantra “if it’s useful, then add it”.

1. 6

I really don’t understand the appeal of the kind of philosophy that results in the kind of nonsense the linked article recommends. Why do people put up with it? What good is an “anti-bloat philosophy” if it treats “putting build files in directories” as some kind of super-advanced edge case?
Of course, when dealing with people who claim to be “minimalist”, it’s always completely arbitrary where they draw the line, but this is a fairly clear-cut instance of people having lost sight of the fact that the point of software is to be useful.

1. 3

The article under discussion isn’t the result of a minimalist philosophy; it’s the result of a lack of standardisation. BSD make grew a lot of features that were not part of POSIX. GNU make also grew a similar set of features, at around the same time, with different syntax. FreeBSD and NetBSD, for example, both use bmake, which is sufficiently powerful to build the entire FreeBSD base system. The Open Group never made an effort to standardise any of these extensions, and so you have two completely different syntaxes. The unfortunate thing is that both GNU Make and bmake accept all of their extensions in a file called Makefile, in addition to looking for files called GNUmakefile / BSDmakefile in preference to Makefile, which leads people to believe that they’re writing a portable Makefile and to complain when another make implementation doesn’t accept it.

2. 7

But as a programmer, I have to use some build system. If I chose Meson, that’d be no problem; you’d just have to install Meson to build my software. Ditto if I chose CMake. Or mk. Why is GNU make any different here? If you’re gonna wanna compile my software, you better be prepared to get my dependencies onto your machine, and GNU make is probably gonna be one of the easiest build systems for a BSD user to install. As a Linux user, if your build instructions told me to install bsdmake or meson or any other build system, I wouldn’t bat an eye, as long as that build system is easy to install from my distro’s repos.

1. 3

Good grief, why is this so difficult to get through? If you want to use GNU make, or Meson, or whatever, then do that! I use GNU make too! I also use Plan 9’s mk, which few people have installed, and even fewer would want to install. That’s not the point.
The problem here has nothing to do with intrinsic software properties at all, I don’t know why this is impossible for Linux people to understand. If you say “I am using GNU make, and if you don’t like it, tough luck”, that’s perfectly fine. If you say “I am using GNU make, which can’t cause any problem for you because you can just install it” then you are being ignorant of other people’s needs, requirements, or choices, or you are being arrogant for pretending other people’s needs, requirements, or choices are invalid, and of course in both cases you are being patronizing towards users you do not understand. This has nothing to do with GNU vs. BSD make. It has nothing to do with software, even. It’s a social problem. if your build instructions told me to install bsdmake or meson or any other build system, I wouldn’t bat an eye, as long as that build system is easy to install from my distro’s repos. And this is why Linux users do not understand the actual problem. They can’t fathom that there are people for whom the above way of doing things is unacceptable. It perfectly fine not to cater to such people, what’s not fine is to demand that their reasoning is invalid. There are people to whom extrinsic properties of software are far more important than their intrinsic properties. It’s ironic that Linux people have trouble understanding this, given this is the raison d’etre for the GNU project itself. 1. 5 I think the question is “why is assuming gmake is no big deal any different than assuming meson is no big deal?” And I think your answer is “those aren’t different, and you can’t assume meson is no big deal” but you haven’t come out and said that yet. 2. 1 I can, but I don’t want to. Same. Rewriting my Makefiles is so annoying, that so far I have resigned to just calling gmake on FreeBSD. Maybe one day I will finally do it. I never really understood how heavily GNUism “infected” my style of writing software, until I switched to the land of the BSD. 3. 
2 What seems to irk BSD users the most is putting gnuisms in a file called Makefile; they see the file and expect to be able to run make, yet that will fail. Naming the file GNUMakefile is an oft-accepted compromise. I admit I do not follow that rule myself, but if I ever thought a BSD user would want to use my code, I probably would follow it, or use a Makefile-generator. 1. 4 I’d have a lot more sympathy for this position if BSD make was actually good, but their refusal to implement pattern rules makes it real hard to take seriously. 1. 2 I’d have a lot more sympathy for this position if BSD make was actually good bmake is able to build and install the complete FreeBSD source tree, including both kernel and userland. The FreeBSD build is the most complex make-based build that I’ve seen and is well past the level of complexity where I think it makes sense to have hand-written Makefiles. For the use case in mind, it’s worth noting that you don’t need pattern rules, bmake puts things in obj or $OBJDIRPREFIX by default.
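The GNUmakefile compromise is easy to check for yourself: GNU make looks for a file named GNUmakefile before Makefile, so GNU-specific rules can live there while a plain (or stub) Makefile stays portable. A small sketch, assuming `make` resolves to GNU make (as on most Linux systems):

```shell
# Sketch: GNU make prefers GNUmakefile over Makefile, so GNU-only rules
# can live in GNUmakefile without hijacking the plain `make` entry point.
# (Assumes `make` is GNU make, as on a typical Linux box.)
set -e
dir=$(mktemp -d)
cd "$dir"

printf 'all:\n\t@echo from-GNUmakefile\n' > GNUmakefile
printf 'all:\n\t@echo from-Makefile\n'    > Makefile

make   # GNU make reads GNUmakefile and prints "from-GNUmakefile"
```

A BSD make run in the same directory (with no BSDmakefile present) would read Makefile instead, so each camp sees rules written in its own dialect.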

2. 1

That this is so hard is maybe a good example of why portability to different dependencies is a bad goal when your dependencies are already open source and portable.

I mean, technically you are right, but in my opinion, you are wrong because of the goal of open source.

The goal of open source is to have as many people as possible using your software. That is my premise, and if it is wrong, the rest of my post does not apply.

But if that is the goal, then portability to different dependencies is one of the most important goals! The reason is that it shows the user empathy. Making things as easy as possible for users is being empathetic towards them, and while they may not notice that you did it, subconsciously, they do. They don’t give up as easily, and in fact, sometimes they even put extra effort in.

I saw this when porting my bc to POSIX make. I wrote a configure script that uses nothing other than POSIX sh. It was hard, mind you, I’m not denying that.

But the result was that my bc was so portable that people started using it on the BSDs without my knowledge, and one of those users decided to spend effort to demonstrate that my bc could make serious performance gains, and helped me realize them once I made the decision to pursue that. He also convinced FreeBSD to make my bc the system default for FreeBSD 13.

Having empathy for users, in the form of portability, makes some of them want to give back to you. It’s well worth it, in my opinion. In fact, I just spent two days papering over the differences between filesystems on Windows and on sane platforms so that my next project could be portable enough to run on Windows.

(Oh, and my bc was so portable that porting it to Windows was little effort, and I had a user there help me improve it too.)

1. 4

The goal of open source is to have as many people as possible using your software.

I have never heard that goal before. In fact, given current market conditions, open source may not be the fastest way if that is your goal. Millions in VC to blow on marketing does wonders for user acquisition.

1. 1

That is true, but I’d also prefer to keep my soul.

That’s the difference. One is done by getting users organically, in a way that adds value. The other is a way to extract value. Personally, I don’t see Open Source as having an “extract value” mindset in general. Some people who write FOSS do, but I don’t think FOSS authors do in general.

2. 4

The goal of open source is to have as many people as possible using your software.

I actually agree with @singpolyma that this isn’t necessarily a goal. When I write software and then open source it, it’s often stuff I really don’t want many people to use: experiments, small tools or toys, etc. I mainly open source it because the cost to me of doing it is negligible, and I’ve gotten enough neat random bits and pieces of fun or interesting stuff out of other people’s weird software that I want to give back to the world.

On the other hand, I’ve worked on two open source projects whose goal was to be “production-quality” solutions to certain problems, and knew they weren’t going to be used much if they weren’t open source. So, you’re not wrong, but I’d turn the statement around: open source is a good tool if you want as many people as possible using your software.

1. 5

That’s not how you spell ZFS.

Bcachefs is too new and unproven. It will eat your data if you let it, and using it is letting it.

1. 10

Bcachefs is intended to succeed the current go-to options: Btrfs and ZFS. All filesystems have to start somewhere.

1. 4

Is there a guide to the planned improvements over ZFS?

1. 9

Apparently it has a much smaller codebase than ZFS or btrfs (and in fact, it’s slightly smaller than ext4) while being cleaner and more flexible.

1. 6

Not a technical improvement, but if all goes right, it would be included in the mainline kernel with a known-valid licence - which I don’t think zfs will ever achieve.

1. 4

Not a technical improvement, but if all goes right, it would be included in the mainline kernel with a known-valid licence - which I don’t think zfs will ever achieve.

…on Linux, at least ;)

1. 3

At the same time it seems to be GPLv2-licensed, which means that a merge into the BSDs is also something that we probably will not see. So almost all users of ZFS will not bother with it.

1. 2

Maybe that’s for the best. Both worlds can have great filesystems, and the two worlds don’t need to use the same great filesystems. I’m getting more and more convinced that the less BSD people and Linux people have to interact, the better.

1. 1

I’m still hopeful once HAMMER2 is more mature, it’ll get ported to other systems including Linux.

2. 3

To me, two things stand out compared to ZFS (at least as promised):

1. True tiered storage support, unlike current L2ARC / slog in ZFS;

2. Better expansion of RAIDZ* disks.

1. 1

Number one is 100% correct, and no one should use an slog thinking it’s tiered storage. The only tiered storage ZFS does is tier-0 in memory (ARC, not L2ARC) and tier-1 to your pool, rate-limited by the slowest vdev. The ZFS devs have turned down requests for true tiered storage many times, so I don’t think it’s going to happen anytime in the next 5 years. (You can get it working by using ZFS with a dm-writecache device as the underlying “disk” but I think all bets are probably off as to the integrity of the underlying data.)

But for number two, draid is probably the correct answer. I think it’s superior to what bcachefs is offering.

3. 2

But why not use what’s there, ZFS or a port of HAMMER, instead? If your focus is “reliability and robustness”, code that is used in production seems better than creating new code with new bugs.

1. 3

But why not use what’s there, ZFS or a port of HAMMER, instead? If your focus is “reliability and robustness”, code that is used in production seems better than creating new code with new bugs.

Because licenses, to start somewhere (the Linux kernel can’t include BSD code, for example). It’s also hard to argue that HAMMER is proven due to its extremely small user base, even if it might be interesting from a technical standpoint.

1. 6

Linux can include BSD licensed code - as long as there’s no advertising clause, it’s compatible with GPL.

1. 1

Linux can include BSD licensed code - as long as there’s no advertising clause, it’s compatible with GPL.

I stand corrected!

1. 1

This is true, but note that ZFS is CDDL, not BSD. See https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/

…but HAMMER would be fine from a license perspective.

2. 4

the Linux kernel can’t include BSD code, for example

What makes you think so? There’s quite a bit of code in the Linux kernel that started out BSD licensed.

1. 2

It’s totally legal to go either direction (BSD code mixed with GPL code or GPL code mixed with BSD code). There is a cultural stigma against the other license, depending on which license camp one is in. I.e., most BSD people dislike the GPL license and go out of their way to replace GPL’d code with BSD-licensed code (and the opposite for BSD code in Linux land).

As a more recent example, GUIX didn’t get invented in a vacuum; it’s partly (mostly?) GPL people unhappy with the Nix license.

One extra gotcha as @singpolyma points out, GPL code mixed with BSD code enforces the GPL license on the binaries, usually making BSD proponents unhappy.

1. 3

This is very misleading because if you add GPL to BSD then the combined work/a derived work can no longer be distributed as BSD.

2. 2

the Linux kernel can’t include BSD code

BSD is GPL-compatible, but the reverse isn’t true: BSD code can’t include GPL’d code.

1. 2

Technically BSD codebases can include all the GPL’d code they want. They can even keep the BSD license on the parts that don’t depend on the GPL part. The binaries would be GPL though and there are other reasons purists might avoid this.

2. 1

due to its extremely small user base

So, like, bcachefs? or nilfs2 (which is in mainline…).

1. 1

It feels to me that the major feature is rollbacks. Now, while I can understand that (after all, it’s why people like having backups and ZFS snapshots, and boot environments by extension), I just don’t really see why having this integrated into the package manager matters. Installing and uninstalling software seems to have worked fine over the past few decades.

Is state/data/configuration somehow managed in a special way?

When I think of such scenarios the burden is on getting stuff that is not managed by package managers back into order.

On the topic of running multiple versions. While I have very rarely wanted to do that, mostly for debugging or for bad upgrade paths of software while not having backups, from the article I understand that certain services are being run multiple times with different versions. If that is correct, I’m very curious how that is done in relation to sockets (Unix, TCP, etc.): how does other software decide where to connect to? Just the address, or is there something else in how packages are built or handled that helps with deciding?

1. 7

It feels to me that the major feature is rollbacks.

I’m a keen NixOS and Guix user, and I don’t consider this to be directly important, though I can see why it’s seen that way.

On the topic of running multiple versions.

I also don’t run multiple versions of things.

The biggest benefit that NixOS/Guix introduces for me is that I can treat my machine as code, with high fidelity and efficiently.

In Linux distributions such as Debian/Arch/CentOS, I consider my machine to be a 30GB mutable blob (the contents of /etc, /usr, etc.). Updates are mutations to this 30GB blob, and I have low confidence in how it’s going to behave. Cfgmgmt /automates/ this, but automating low-confidence steps still results in low confidence.

For NixOS/Guix, I consider my machine to be about 100kB (my Guix config). I can understand this 100kB, and when I change the system, I change this 100kB and accurately know what state my machine will be in after it’s changed: the update is actually a replace.

Is state/data/configuration somehow managed in a special way?

~Everything in /etc is managed via NixOS/Guix.

~Nothing in /var/ is managed via NixOS/Guix. NixOS/Guix reduces my state space, but /var is still state I have to care about. (Actually, on my desktop no state is preserved in /var between boots: every boot has a fresh computer smell.)

1. 1

Hey, thanks a lot for your response. I have some naive questions then.

In your desktop system, in many cases that blob, the state one cares about would probably be in $HOME. What about that?

What do you mean by “/etc is managed”? Say I have a configuration that would usually lie there. Where is it now? Say I want to customize it, what would I do? Say the configuration syntax changes, what would I do?

I understand your comparison with mutable blob vs declared state; after all, that’s the same approach that other kinds of software often use, be it configuration management, some cloud/service orchestration tools, and honestly a lot of software that has the word declarative in the first few sentences. In practical use I see these systems fall apart very quickly, because a lot of the time it’s more a changing state in the way one would define a suite of database migrations.

So for a simple example. Let’s take /etc. That’s the configuration. You in many situations can copy that to a new system and it’s fresh and the way you want. Various package managers also can output a list of which packages are installed in a format that can be read so you usually have /usr covered as well. Because of that I don’t usually see this part as a big issue. After all that’s in a way how many distro installers look at things. /boot is similar. /usr should not be touched, though sometimes it can be an emergency hack, but I prefer to have it read-only other than on changes by the package manager.

That leaves /var and /home, which sounds at least somewhat similar to what you are saying (correct me if I’m wrong). So in my understanding what is done is more that the system makes sure that what should be actually is? Talking about upgrades, removals, etc. not leaving stuff behind? I guess that makes quick hacks hard or impossible? Don’t get me wrong I’d actually consider that a good thing. /var on desktop might not have much needed state, but in many situations that state would be in /home.

Anyways, thank you again for your response. I guess at this point it might make sense if I took a closer look at it myself. I just am curious about practical experiences, because I completely understand that declaratively describing a system tends to look very nice on paper, but in many situations (also because of badly designed software) is more like simply writing a setup shell script and maybe running it each boot, just that shell scripts tend to be more flexible for better and for worse. Of course having a solution that does that for you with a good abstraction is interesting. That’s why lately I’ve been thinking about where we handle big blobs that we sometimes want to modify in a predictive manner and had to think about database schemas and migrations. Thanks again, have a nice day. :)

1. 4

What do you mean by “/etc is managed”? Say I have a configuration that would usually lie there. Where is it now? Say I want to customize it, what would I do? Say the configuration syntax changes, what would I do?

The contents live as part of your nix configuration. It’s both awesome and frustrating at times. Nix tries to overlay its view of reality onto the config file format, so say it’s nginx: instead of trying to write nginx config like:

    http {
        sendfile on;
    }

in nix you would write something like:

    services.nginx.http.sendfile = true;

and then when nix goes to build the system, it will generate an nginx config file for you. This allows for some nice things, like services.nginx.recommendedOptimisation = true; and it will fill in a lot of boilerplate for you.

Not all of nix is this integrated with the config file(s), so sometimes you get some oddities, or sometimes the magic that nix adds isn’t very clear and you have to go dig around to see what it’s actually doing. Another downside is that it means a re-build every time you want to change a minor thing in 1 application. The upside: nix is usually really good about not restarting the entire world and will try to just restart that 1 application that changed.

This is just an off-shoot of the declarative process, and some will call it a feature, especially in production, but it can be annoying in development. You can turn all of that off and just say this app will read from /var/etc/nginx/nginx.conf and leave nginx.conf entirely in your control. This is handy when moving to nix, or maybe in development of a new service or something.

As far as the mutable state of applications, nix mostly punts on this, and makes it YOUR problem. There are some nix config options that packagers of apps can take advantage of, so say on upgrades, if you installed PG11 originally, it won’t willy-nilly upgrade you to PG12. It makes you do that yourself. So you get all the new bits except PG11 will still run. All that said this stuff isn’t perfect, so testing is your friend.

1. 1

You can turn all of that off and just say this app will read from /var/etc/nginx/nginx.conf and leave nginx.conf entirely in your control.

My goodness, do you have a guide or blogpost or something for this way of going about it? That’d be super helpful. I’ve tried Nix a few times and this is exactly where I go crazy. I can store real config files in git too; just let me do that! (and avoid the Nix language!)

1. 2

https://search.nixos.org/options?channel=21.05&from=0&size=50&sort=relevance&type=packages&query=services.nginx is where I’d start to look for how to disable the config management part of the nginx service. If that wasn’t enough, I’d go to the corresponding file in nixpkgs.

If a module doesn’t meet my needs, and I can’t easily make it do so, sometimes I will write my own module, to have full control over it. I still reuse the nginx package, and would typically start my module by copy-pasting and trimming the existing one. 95% of the time, the provided modules do exactly what I want.

https://github.com/NixOS/rfcs/blob/master/rfcs/0042-config-option.md aims to make the “please let me take control over the service” usecase easier.

1. 1

Well, doing this in some ways defeats one of the big reasons for Nix, but there are valid use-cases. For nginx, this is what I do:

    # this sets the config file to /etc/nginx/nginx.conf, perfect for us using consul-template.
    services.nginx.enableReload = true;
    services.nginx.config = "#this should be replaced.";

Now it’s on you to maintain /etc/nginx/nginx.conf. For us, we use consul-template (run via systemd as more nix configuration) and it generates the config. But you are free to replace it (after deploy) manually. I.e., nix will over-write the /etc/nginx/nginx.conf file every nixos-rebuild build with the contents: #this should be replaced.

Otherwise, what nixos tends to do is symlink the /etc//configfile -> to somewhere in /nix which is read-only for you, so it’s up to you to erase the symlink and put a real file there. One could automate this with a systemd service that runs on startup.

Another way to do this is to hack up the systemd service, assuming the service will accept the location of the config file as a cmd line argument. This is non-standard and can be fiddly in nix. I don’t know of a better way.

2. 2

On Guix System it is recommended to manage anything in etc via a “service” which will be written/configured in Scheme and has deploy and rollback semantics. For config that lives in $HOME there is guix home and guix home services, which parallel the system for /etc and system services, and even work on other operating systems.

1. 2

In your desktop system, in many cases that blob, the state one cares about would probably be in $HOME. What about that?

I use https://github.com/nix-community/home-manager to manage $HOME. I use that to manage my git config and bashrc. I also use it to declare which directories should survive a reboot. E.g. I persist ~/.steam and ~/.thunderbird, “Documents/”, a few others. But everything else, e.g. ~/.vim (which I only use in an ad-hoc manner), is wiped.

Even that leaves some blob-like state: I persist “.config/dconf”. Ideally that could be managed declaratively, but I haven’t seen a workable solution.

Let’s take /etc. That’s the configuration. You in many situations can copy that to a new system and it’s fresh and the way you want. Various package managers also can output a list of which packages are installed in a format that can be read so you usually have /usr covered as well.

That works fine for building new machines, but a typical Linux machine is built far less frequently than it’s updated. For example, I’ve managed machines with packages/Puppet/Ansible in the past, and occasionally run into situations where the machine state according to packages/Puppet/Ansible no longer matches the actual machine state:

• postinst scripts that worked well during install, but get updated such that upgrades work and installs are broken.
• cases where apt-get install $x followed by apt-get purge$x leaves live config (e.g. files in /etc/pam.d)
• cases where the underlying packages are changed in ways incompatible with the Puppet config: after all, the underlying packages typically don’t attempt to QA against Puppet config.

The result is that even just covering /etc and /usr, machines are brittle, and occasionally need to be rebuilt to have confidence.

Talking about upgrades, removals, etc. not leaving stuff behind? I guess that makes quick hacks hard or impossible? Don’t get me wrong I’d actually consider that a good thing.

Yes, it does make quick hacks hard/impossible. It is possible to do some quick hacks on the box (systemctl stop foo, for example), and nixpkgs is designed so that various parts can be overridden if needed.

When we climb the ladder of abstraction, and lose access to easily change the inner workings of lower levels, it looks like (and is!) restrictive. In the same way I wouldn’t modify a binary in a hex editor to perform deployments, nor would I make live changes to a Docker image, I aim to not SSH to a machine to mutate it either. I prefer my interactions with lower-level abstractions to be mediated via tooling that applies checks-and-balances.

Anyways, thank you again for your response. I guess at this point it might make sense if I took a closer look at it myself.

I don’t make recommendations without understanding requirements, but NixOS/Guix is at least a novel approach to distributions, which might be interesting to OS folks.

NixOS/Guix might have come too late for industry: containers also aim to manage system complexity, and do a good job of it. I think NixOS/Guix offers good solutions for low-medium scale, and as a way to build container images.

I just am curious about practical experiences, because I completely understand that declaratively describing a system tends to look very nice on paper, but in many situations (also because of badly designed software) is more like simply writing a setup shell script and maybe running it each boot, just that shell scripts tend to be more flexible for better and for worse.

I only use NixOS/Guix for my personal infra, and manage all those machines in a declarative manner (other than out-of-scope things such as databases like ~/.config/dconf and postgres).

That’s why lately I’ve been thinking about where we handle big blobs that we sometimes want to modify in a predictive manner and had to think about database schemas and migrations.

Yes, DB schema migrations is an interesting case where a declarative approach would be nice to have: it’s much easier to reason about a single SQL DDL than a sequence of updates.

A similar problem I have is the desire for declarative disk partitions: ideally I could declare my partition scheme, and apply a diff-patch of mutations to make the declaration reality. It would only proceed if it was safe and preserved the underlying files. It’d likely only be possible under particular constraints (lvm/btrfs/zfs ?). Even then that’s hard to get right!

Thanks again, have a nice day. :)

You too!

1. 34

I don’t really agree with a lot of the claims in the article (and I say this as someone who was very actively involved with XMPP when it was going through the IETF process and who wrote two clients and continued to use it actively until 2014 or so):

Truly Decentralized and Federated (meaning people from different servers can talk to each other while no central authority can have influence on another server unlike Matrix)

This is true. It also means that you need to do server reputation things if your server is public and you don’t want spam (well, it did for a while - now no one uses XMPP so no one bothers spamming the network). XMPP, unlike email, validates that a message really comes from the originating domain, but that doesn’t stop spammers from registering millions of domains and sending spam from any of them. Google turned off federation because of spam and the core problems remain unsolved.

End-To-End Encryption (unlike Telegram, unless you’re using secret chats)

This is completely untrue for the core protocol. End-to-end encryption is (as is typical in the XMPP world) multiple, incompatible extensions to the core protocol, and most clients don’t support any of them. Looking at the list of clients, almost none of them support the end-to-end encryption XEP that the article recommends. I’d not looked at XEP-0384 before, but a few things spring to mind:

• It’s not encrypting any metadata (i.e. the stuff that the NSA thinks is the most valuable bit to intercept), this is visible to the operators of both party’s servers.
• You can’t encrypt presence stanzas (so anything in your status message is plaintext) without breaking the core protocol.
• Most info-query stanzas will need to be plain-text as well, so this only affects direct messages, but some client-to-client communication is via pub-sub. This is not necessarily encrypted and clients may or may not expose which things are and aren’t encrypted to the user.
• The bootstrapping thing involves asking people to trust new fingerprints that exist. This is a security-usability disaster: users will click ‘yes’. Signal does a good job of ensuring that fingerprints don’t change across devices and manages key exchange between clients so that all clients can decrypt a message encrypted with a key assigned to a stable identity. OMEMO requires a wrapped key for every client.
• The only protection against MITM attacks is the user noticing that a fingerprint has changed. If you don’t validate fingerprints out-of-band (again, Signal gives you a nice mechanism for doing this with a QR code that you can scan on the other person’s phone if you see them in person) then a malicious server can just advertise a new fingerprint once and now you will encrypt all messages with a key that it can decrypt.
• There’s no revocation story in the case of the above. If a malicious fingerprint is added, you can remove it from the advertised set, but there’s no guarantee that clients will stop sending things encrypted with it.
• The XEP says that forward secrecy is a requirement and then doesn’t mention it again at all.
• There’s no sequence counter or equivalent so a server can drop messages without your being aware (or can reorder them, or can send the same message twice - no protection against replay attacks, so if you can make someone send a ‘yes it’s fine’ message once then you can send it in response to a request to a different question).
• There’s no padding, so message length (which provides a lot of information) is available.

This is without digging into the protocol. I’d love to read @soatok’s take on it. From a quick skim, my view is that it’s probably fine if your threat model is bored teenagers.

They recommend looking for servers that support HTTP upload, but this means any file you transfer is stored in plain text on the server.

Cross-Platform Applications (Desktop, Web, and Mobile)

True, with the caveat that they have different feature sets. For example, I tried using XMPP again a couple of years ago and needed to have two clients installed on Android because one could send images to someone using a particular iOS client and the other supported persistent messaging. This may be better now.

Multi-Device Synchronization (available on some servers)

This, at least, is fairly mature. There are some interesting interactions between it and the security guarantees claimed by OMEMO.

Voice and Video Calling (available on most servers)

Servers are the easy part (mostly they do STUN or fall back to relaying if they need to). There are multiple incompatible standards for voice and video calling on top of XMPP. The most widely supported is Jingle which is, in truly fractal fashion, a family of incompatible standards for establishing streams between clients and negotiating a CODEC that both support. From their article, it sounds as if clients can now do encrypted Jingle sessions. This didn’t work at all last time I tried, but maybe clients have improved since then.

1. 8

Strongly agree – claiming that XMPP is secure and/or private without mentioning all the caveats is surprising! There’s also this article from infosec-handbook.eu outlining some of the downsides: XMPP: Admin-in-the-middle

The state of XMPP security is a strong argument against decentralization in messengers, in my opinion.

1. 7

Spam in XMPP is largely a solved problem today. Operators of open relays, servers where anyone can create an account, police themselves and each other. Anyone running a server that originates spam without dealing with it gets booted off the open federation eventually.

Another part of the solution is ensuring smaller server operators don’t act as open relays, but instead use invites (like Lobste.rs itself). Snikket is a great example of that.

but that doesn’t stop spammers from registering millions of domains and sending spam from any of them.

Bold claim. Citation needed. Where do you register millions of domains cheaply enough for the economics of spam to work out?

Domains tend to be relatively expensive and are easy to block, just like the IP addresses running any such servers. All I hear from server operators is that spammers slowly register lots of normal accounts on public servers with open registration, which are then used once for spam campaigns. They tend to be deleted by proactive operators, if not before, at least after they are used for spam.

Google turned off federation because of spam and the core problems remain unsolved.

That’s what they claim. Does it really seem plausible that Google could not manage spam? It’s not like they have any experience from another federated communications network… Easier for me to believe that there wasn’t much in the way of promotion to be gained from doing anything more with GTalk, so they shut it down and blamed whatever they couldn’t be bothered dealing with at the time.

1. 3

Your reasoning about most clients not supporting OMEMO is invalid, because no one cares about most clients: it’s all about market share. Most XMPP clients probably don’t support images, but that doesn’t matter.

For replays, this may be dealt with by the double ratchet algorithm, since the keys change fairly often. Your unknown replay would also have to make sense in an unknown conversation.

Forward secrecy could be done with the double ratchet algorithm too.

Overall OMEMO should be very similar to Signal’s protocol, which means that it’s quite likely the features and flaws of one are in the other.

Conversations on Android also offers showing and scanning QR codes for validation.

As for HTTP upload, that’s maybe another XEP, but there is encrypted upload with an AES key and a link using the aesgcm:// scheme (as you can guess, it encodes where to retrieve the file, plus the key).
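The aesgcm:// links mentioned above can be pulled apart mechanically. A sketch with the standard library (the fragment layout of a 12-byte IV followed by a 32-byte AES-256 key is an assumption based on what clients like Conversations emit, not something stated here):

```python
from urllib.parse import urlsplit

def parse_aesgcm_url(url):
    # An aesgcm:// link is an HTTPS URL with the scheme swapped and the
    # fragment carrying hex(IV || key). Assumed layout: 12-byte IV
    # followed by a 32-byte AES-256-GCM key.
    parts = urlsplit(url)
    assert parts.scheme == "aesgcm"
    blob = bytes.fromhex(parts.fragment)
    iv, key = blob[:-32], blob[-32:]
    https_url = parts._replace(scheme="https", fragment="").geturl()
    return https_url, iv, key
```

A client would then fetch the HTTPS URL and decrypt the body with AES-GCM using the recovered IV and key.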

I concur that bootstrapping is often painful. I’m not sure it’s possible to do much better without a centralized system however.

Finally, self-hosting leads to leaking quite a lot of metadata, because your network activity is not hidden in large amounts of network activity coming from others. I’m not sure that there’s really much more that is available by reading the XMPP metadata. Battery saving on mobile means the device needs to tell the server that it doesn’t care about status messages and presence from others, but who cares if that’s unencrypted to the server (on the wire, there’s TLS), since a) it’s meant for the server, and b) even if it were meant for clients instead, you could easily spot the change in network traffic frequency. I mean, I’m not sure there’s a lot more that is accessible that way (not even mentioning that if you’re privacy-minded, you avoid stuff like typing notifications, and if you don’t, traffic patterns probably leak that anyway). And I’m fairly sure that’s the same with Signal for many of these.

1. 3

now no one uses XMPP so no one bothers spamming the network

I guess you’ve been away for a while :) there is definitely spam, and we have several community groups working hard to combat it (and trying to avoid the mistakes of email: not doing server/IP reputation and blocking and all that)

1. 3
Cross-Platform Applications (Desktop, Web, and Mobile)


True, with the caveat that they have different feature sets. For example, I tried using XMPP again a couple of years ago and needed to have two clients installed on Android because one could send images to someone using a particular iOS client and the other supported persistent messaging. This may be better now.

Or they’ve also calcified (see: Pidgin). Last time I tried XMPP a few years ago, Conversations on Android was the only tolerable one, and Gajim was janky as hell normally, let alone on Windows.

1. 3

True, with the caveat that they have different feature sets. For example, I tried using XMPP again a couple of years ago and needed to have two clients installed on Android because one could send images to someone using a particular iOS client and the other supported persistent messaging. This may be better now.

This was the reason I couldn’t get on with XMPP. When I tried it a few years ago, you really needed quite a lot of extensions to make a good replacement for something like WhatsApp, but all of the different servers and clients supported different subsets of the features.

1. 3

I don’t know enough about all the details of XMPP to pass technical judgement, but the main problems never were the technical decisions like XML or not.

XMPP had a chance 10-15 years ago, but whether because of poor messaging (pun not intended) or not enough guided activism, the XEP thing completely backfired, and no two parties really had a proper interaction with all parts working. XMPP wanted to do too much and be too flexible. Even people who wanted it to succeed, ran their own server, and championed its use in the companies they worked for… it was simply a big mess. And then the mobile disaster with undelivered messages to several clients (originally a feature), apps using up too much battery, etc.

Jitsi also came a few years too late, sadly, and wasn’t exactly user friendly either at the start. (Good people though, they really tried).

1. 5

I don’t know enough about all the details of XMPP to pass technical judgement, but the main problems never were the technical decisions like XML or not.

XML was a problem early on because it made the protocol very verbose. Back when I started working on XMPP, I had a £10/month plan for my phone that came with 40 MB of data per month. A few extra bytes per message added up a lot. A plain text ‘hi’ in XMPP was well over a hundred bytes, with proprietary messengers it was closer to 10-20 bytes. That much protocol overhead is completely irrelevant now that phone plans measure their data allowances in GB and that folks send images in messages (though the requirement to base64-encode images if you’re using in-band bytestreams and not Jingle still matters) but back then it was incredibly important.
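To put a rough number on that overhead, here is a minimal, hypothetical stanza for a two-character message; the addresses are made up, and real stanzas typically carry even more attributes (ids, extensions), so this is a lower bound:

```python
# A bare-bones XMPP message stanza: nearly a hundred bytes of XML framing
# around a 2-byte payload.
stanza = (
    '<message from="alice@example.com/phone" to="bob@example.org" type="chat">'
    "<body>hi</body></message>"
)
print(len(stanza))  # just under 100 bytes
```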

XMPP was also difficult to integrate with push notifications. It was built on the assumption that you’d keep the connection open, whereas modern push notifications expect a single entity in the phone to poll a global notification source periodically and then prod other apps to make shorter-lived connections. XMPP requires a full roster sync on each connection, so will send a couple of megs of data if you’ve got a moderately large contact list (first download and sync the roster, then get a presence stanza back from everyone once you’re connected). The vcard-based avatar mechanism meant that every presence stanza contained the base64-encoded hash of the current avatar, even if the client didn’t care, which made this worse.

A lot of these problems could have been solved by moving to a PubSub-based mechanism, but PubSub and Personal Eventing over PubSub (PEP) weren’t standardised for years and were incredibly complex (much more complex than the core spec) and so took even longer to get consistent implementations.

The main lessons I learned from XMPP were:

• Federation is not a goal. Preventing an untrusted admin from being able to intercept or modify my messages is a goal; federation is merely one technique for limiting that.
• The client and server must have a single reference implementation that supports anything that is even close to standards track, ideally two. If you want to propose a new extension then you must implement it at least once.
• Most users don’t know the difference between a client, a protocol, and a service. They will conflate them, they don’t care about XMPP, they care about Psi or Pidgin - if the experience isn’t good with whatever client you recommend that’s the end.
1. 2

XMPP requires a full roster sync on each connection, so will send a couple of megs of data if you’ve got a moderately large contact list (first download and sync the roster, then get a presence stanza back from everyone once you’re connected).

This is not accurate. Roster versioning, which means that only roster deltas (which are seldom) are transferred, is widely used and is also specified in RFC 6121. (It’s not mandatory to implement, but given that it’s easy to implement, I am not aware of any mobile client that doesn’t use it.)
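The versioned roster request itself is tiny. A sketch of the stanza a client would send (the version token "ver14" and the id are made up; a server that supports versioning answers with deltas, or an empty result if nothing changed):

```python
import xml.etree.ElementTree as ET

# Build an RFC 6121 roster get with the client's cached roster version attached.
iq = ET.Element("iq", {"type": "get", "id": "r1"})
ET.SubElement(iq, "query", {"xmlns": "jabber:iq:roster", "ver": "ver14"})
stanza = ET.tostring(iq, encoding="unicode")
print(stanza)
```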

1. 1

Also important to remember that with smacks people are rarely fully disconnected and doing a resync.

Also, the roster itself is fully optional. I consider it one of XMPP’s selling points and would not use it for IM without one, but nothing prevents you from doing so.

1. 1

Correct.

I want to add that it may be a good idea to avoid XMPP jargon, to make the text more accessible to a wider audience. Here ‘smacks’ stands for XEP-0198: Stream Management.

2. 2

XMPP had a chance, 10-15 years ago, but either because of poor messaging (pun not intended) or not enough guided activism the XEP thing completely backfired and no two parties really had a proper interaction with all parts working. XMPP wanted to do too much and be too flexible.

I’d argue there is at least one other reason. XMPP on smartphones was really bad for a very long time, partly due to limitations of those platforms. This only got better later. For this reason, having proper messaging used to require spending money.

Nowadays you “only” need to pay a fee to put stuff into the app store and, in the case of iOS development, buy an overpriced piece of hardware to develop on. Oh, and of course deal with a horrible experience there and be at risk of your app being banned from the store whenever they feel like it. But I’m drifting off. In short: doing what Conversations does used to be harder or impossible on both Android and iOS until certain APIs were added.

I think that set it back quite a bit, right when it was starting to do okay on the desktop.

I agree with the rest though.

3. 2

I saw a lot of those same issues in the article. Most people don’t realize (myself included until a few weeks ago) that when you stand up Matrix, it still uses matrix.org’s keyserver. I know a few admins who are considering standing up their own keyservers and what that would entail.

And the encryption thing too. I remember OTR back in the day (which was terrible) and now we have OMEMO (which is ….. still terrible).

This is a great reply. You really detailed a lot of problems with the article and also provided a lot of information about XMPP. Thanks for this.

1. 2

It’s not encrypting any metadata (i.e. the stuff that the NSA thinks is the most valuable bit to intercept), this is visible to the operators of both party’s servers. You can’t encrypt presence stanzas (so anything in your status message is plaintext) without breaking the core protocol.

Do you know if this situation is any better on Matrix? Completely honest question (I use both and run servers for both). Naively it seems to me that at least some important metadata needs to be unencrypted in order to route messages, but maybe they’re doing something clever?

1. 3

I haven’t looked at Matrix, but it’s typically a problem with any federated system: at minimum, the envelope that tells you which server a message needs to be routed to has to be public. Signal avoids this by not having federation, and by using their sealed-sender mechanism to prevent the single centralised component from knowing who the sender of a message is.
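The routing constraint can be shown with a toy example: even when the payload is an opaque encrypted blob, a federated server has to read the destination on the envelope to pick the next hop. All names below are made up:

```python
# Toy federated router: the "to" address is necessarily visible to the
# server, regardless of whether the payload is end-to-end encrypted.
def route(stanza):
    next_hops = {
        "example.org": "s2s.example.org",
        "example.com": "s2s.example.com",
    }
    domain = stanza["to"].split("@", 1)[1]  # plaintext metadata
    return next_hops[domain]

hop = route({"to": "bob@example.org", "payload": b"<opaque encrypted blob>"})
```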

1. 1

Thanks.

2. 1

There is a bit of metadata leakage in Matrix because of federation. But it’s something the team is working to improve.

3. 2

Fellow active XMPP developer here.

I am sure you know that some of your points, like metadata encryption, are a deliberate design tradeoff. Systems that provide full metadata encryption have other drawbacks. Other “issues” you mention are generic and apply to most (all?) cryptographic systems. I am not sure why XEP-0384 needs to mention forward secrecy again, given that forward secrecy is provided by the building blocks the XEP uses and is discussed there, i.e., https://www.signal.org/docs/specifications/x3dh/. Some of your points are also outdated and no longer correct. For example, since the newest version of XEP-0384 uses XEP-0420, there is now padding to disguise the actual message length (XEP-0420 borrows this again from XEP-0373: OpenPGP for XMPP).

From a quick skim, my view is that it’s probably fine if your threat model is bored teenagers.

That makes it sound like your threat model shouldn’t be bored teenagers. But I believe that we should also raise the floor for encryption so that everyone is able to use a sufficiently secured connection. Of course, this does not mean that raising the ceiling shouldn’t be researched and tried also. But we, that is, the XMPP community of volunteers and unpaid spare-time developers, don’t have the resources to accomplish everything in one stroke. And, as I said before, if you need full metadata encryption, e.g., because you are a journalist under a repressive regime, then the currently deployed encryption solutions in XMPP are probably not what you want to use. But for my friends, my family, and me, it’s perfectly fine.

They recommend looking for servers that support HTTP upload, but this means any file you transfer is stored in plain text on the server.

That depends on the server configuration, doesn’t it? I imagine at least some servers use disk or filesystem-level encryption for user-data storage.

For example, I tried using XMPP again a couple of years ago and needed to have two clients installed on Android because one could send images to someone using a particular iOS client and the other supported persistent messaging. This may be better now.

It got better. But yes, this is the price we pay for the modularity of XMPP due to its extensibility. I also believe it isn’t possible to have it any other way. Unlike other competitors, most XMPP developers are not “controlled” by a central entity, so they are free to implement what they believe is best for their project. But there is also a strong incentive to implement the extensions that the leading implementations support, for compatibility. So there are some checks and balances in the system.

1. 4

There are many similarities between the two, though I don’t have any experience with Nix to comment further here.

I don’t understand why one would use Guix without trying Nix first.

1. 9

What I gathered is that the author has a preference for Scheme and thus also for Guix, which I think is a sufficient argument.

However, I’d be very interested in reading an in-depth comparison between the two projects, as they target the same niche and are quite related.

1. 3

This is basically my position. I’ve used Nix a couple times but Guix is preferable.

2. 7

Early on Nix had the opposite problem. You would ask it to install firefox, and completely unprompted it would install the adobe flash plugin to go with it. They told me if I wanted firefox without that awful shit to install a separate “firefox-no-plugins” package or something ridiculous like that.

It hasn’t done that for a while, but it’s taken like a decade to recover from the lost trust. I couldn’t handle the idea of running an OS managed by people capable of making such a spectacularly bad decision.

1. 2

Doesn’t that depend on your affinities? If you like lisp languages then Guix seems like a logical choice. I guess it also depends on what software you depend on in your day-to-day activities.

1. 5

There is more to consider here:

Guix is a GNU project, with the unique lenses and preferences that come along with that. It is also a much smaller community than Nix, which means fewer packages, fewer eyes on the software, and, I would argue, also less diversity of thought.

I personally prefer Scheme to Nix’s DSL-ish language, but as a project I think Nix is in a much better position to deliver a reasonable system of such levels of ambition.

It also frustrates me that Guix tries to act like it’s not just a fork of Nix, when in reality it is, and it would be better to embrace this and try to collaborate and follow Nix more closely.

Unfortunately the GNU dogma probably plays a role in preventing that.

1. 4

It is also a much smaller community than Nix, which means fewer packages

If you convert Nix packages to Guix packages, you can get the best of both worlds in Guix. But that’s admittedly not a very straightforward process, and guix-import is being/has been removed due to bugginess.

2. 2

I’m aware of both, but I tried Guix first because people I know use it, I like the idea of using a complete programming language (even if I dislike parens), and the importers made getting started easy. Also guix is just an apt install away for me.

1. 36

Personally, I think the Debian model of having a free core but allowing a non-free area is better for building a community and thus a distribution. Actively restricting conversation in official channels about non-free software and hardware(!) is user hostile. Being a purist may make you feel self-righteous, but very few people in this world can afford such luxuries and will just move on.

1. 9

Honestly, the guix model is very similar to Debian. There is no nonfree allowed in “official” channels, but everyone in the “official” channels knows about #nonguix which is the equivalent of Debian nonfree.

1. 18

The culture is very different. People on the Guix subreddit get attacked when they mention nonguix. I was curious about Guix, so I googled how to install proprietary Nvidia drivers, which I need for work. Not only are they not supported, Guix users get angry when people try to help others install them on even non-official forums like Reddit.

1. 4

On a subreddit? That’s just weirdly hypocritical…

I mostly hang out in the IRC

1. 3

Meanwhile on Arch

pacman -S nvidia

2. 1

ehhhh. The GNU community is rather hostile to non-free software overall. I mean, what do you expect: it’s an organization dedicated to a license (the GPL) that intentionally makes it hard to combine software with non-free software.

1. 2

Is there a reliable way to bridge Matrix and XMPP so that a matrix user can talk to an XMPP user and vice versa? I see that there will be downsides in a way as only the common subset of both protocols would be supportable, but it would still be nice.

1. 4

There’s this: https://github.com/matrix-org/matrix-bifrost

I got it running but never got it to connect correctly with my Matrix server and kinda gave up. I’m not sure if it works.

1. 3

It has bugs, and for obvious reasons New Vector isn’t putting much into fixing them, but for some use cases it works. The instance on aria-net is the most stable and has many bug fixes that are not in upstream.

1. 3

Thank you! Looks intimidating. I’m not sure I’d really want to run this on my small XMPP server…

1. 2

No reason you need to run anything matrix related. Just connect to addresses that go via the aria-net or matrix.org instance. Running it yourself gains you nothing since you’ll just be using it to send messages to matrix.org users anyway.

1. 1

You mean, as in “it just works”? Looking it up in DNS I am entirely surprised indeed:

$ dig _xmpp-server._tcp.matrix.org SRV

; <<>> DiG 9.16.15-Debian <<>> _xmpp-server._tcp.matrix.org SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6254
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;_xmpp-server._tcp.matrix.org. IN SRV

;; ANSWER SECTION:
_xmpp-server._tcp.matrix.org. 300 IN SRV 0 5 5269 lethe.matrix.org.

;; Query time: 48 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Fr Nov 19 17:55:31 CET 2021
;; MSG SIZE rcvd: 93

I have not tested whether there is actually something listening on port 5269 on lethe.matrix.org, but if there is, it indeed just ought to work. That’s… interesting. So I can assume matrix.org Matrix users will just be reachable via XMPP without further setup? That’s kind of cool. I have no Matrix user in my XMPP roster yet, so I was unaware of this and just assumed the two universes were entirely incompatible with one another. Thanks for rectifying this. I will remember that the next time I talk to someone about XMPP vs. Matrix.

1. 1

You are likely to have a better experience with this one, though: https://aria-net.org/SitePages/Portal/Bridges.aspx

1. 1

Thanks!

1. 3

What’s centralized about Matrix compared to XMPP?

1. 8

In principle, nothing. In practice:

• 35% of all users on the network are on a single homeserver (matrix.org).
• Most people who are not on the matrix.org homeserver are still using their identity server (maps Matrix IDs to other IDs like email addresses or phone numbers).
• Nearly everyone is using the matrix.org integration server (provides room widgets like stickers and video calls).

Beyond that, the Matrix specification is largely “whatever Synapse and Element do”, so there are effectively no compatible third-party servers, and third-party clients vary in their support, with relatively few providing end-to-end encryption and none that I know of supporting “spaces”.
Obviously, XMPP has a lot of compatibility issues too (it can be a crapshoot whether your client and your server both support the XEPs needed for the features you want). But the cause is different.

1. 5

I think that’s fairly ok. Everything is open source, and the system actually managed to take off at bigger installations and corporations. Move fast and break things is ok for something that is still growing this much. It’s kinda the same with rustc and the calls for a spec.

I’d say most people don’t actually want federation; at least my workplace doesn’t. They either use the big system where everyone is, or they want their private space formed for a specific organization. Although Element really does lack a multi-account feature to actually be able to use it this way.

1. 4

most people don’t actually want federation

To be honest, I don’t think too many really want federation and the problems it brings with an open world. I suspect people want more single sign-on and a single client, which can be achieved via SSO/APIs.

1. 3

I’d say most people don’t actually want federation

Maybe not, but most people don’t want a single point of failure.

1. 2

shrug It sounds like you don’t need or want decentralized chat, just on-premises hosted chat. But a lot of people do, I think.

1. 6

First of all, I want something that works for everyone and isn’t WhatsApp. I already have over 5 messengers installed, so I’d much rather have something that actually has more users than a cargo cult of nerds. Then I’d like to host some instances for myself and others. But I can already find solutions for the second part among various niche software solutions, which won’t actually work for anyone other than me and the other people here on lobste.rs. If convenience, features, and ease of use weren’t a thing, I wouldn’t have so many contacts on Telegram, which is behind WhatsApp and Threema in regards to E2E.
But it has much more in the way of (contact-related) privacy, notification, and group settings; has anonymous surveys and an official bot API; doesn’t require you to give out a phone number; can handle more file types and sizes; and has better preview integration, bigger groups, more admin features, and a much cleaner desktop experience.

So yes please, first break up the market and deliver the required features; then you can concentrate on a specification set in stone. For now there are already some very useful 3rd-party clients for Matrix.

Edit: And all of these easy solutions require someone to pay for these space and bandwidth monstrosities that are modern chat solutions, with files, photos, full-text search, GIFs, (animated) user stickers, push notifications, voice messages, and possibly video chat. So I won’t hold any grudge against Matrix if they first try to get established in the market and handle their clients’ needs, which surely isn’t federation or a full-fledged specification for the army or government instances. I don’t even know how they want to make money to keep all of this running, while all the people are used to paying “nothing” in WhatsApp, Signal, Telegram & co.

We’ve got a Matrix instance at the institution. I don’t want a “job” for my day-to-day communication, one that will ping me at any point in time when it goes down. Because then it’s bound to you personally and not your job, as your family/friends/.. can’t connect anymore.

1. 5

I completely agree. Federation was pushed as a goal with XMPP, without stepping back and understanding the problem that it was trying to solve. I want to be able to run a chat app and talk to other people without some other entity being able to:

• Read my messages.
• Delete my account.
• Tamper with my messages.

Running my own XMPP server gives me this guarantee. Someone else running their own server that is federated with mine gives them this guarantee.

If the person using the server is not running the server, then some of this breaks down: if their account is tied to a domain they don’t own, then it’s possible for the server admin to unilaterally kick them off. The server admin can always delete their messages and, without end-to-end encryption, can read or tamper with them. With XMPP’s end-to-end encryption, the admin can still drop messages, record all metadata for their conversations, and launch replay attacks (send the same message twice).

Signal gives me all of these guarantees, plus the really important one: a single app that I can tell my mother to install and that works for her out of the box with no configuration. Threema has some questionable security claims and spews FUD; Telegram is a semi-proprietary version of Signal without some of the newer security features. As a colleague of mine said a couple of years ago when I did a survey of open source messaging solutions, the correct answer is ‘Just use Signal’. It’s easy to use; provides user-friendly features like reactions, stickers (I don’t actually know what these are, but apparently people care about them), in-band picture messaging, and encrypted voice / video chat; and has mobile and desktop apps. Signal has a single entity running it, but they’re a registered charity, they’re well funded by donations (and the cost of running the service is really small per user at scale: WhatsApp was charging $1/year and was very profitable), and even if they were evil, the protocol is designed so that they don’t even see most of the metadata, let alone the content.

2. 2

35% of all users on the network are on a single homeserver (matrix.org).

Is it really only 35%? Basically everyone I talked to who tried setting up their own home server gave up on it, so I wonder if the reality is more like … 35% of accounts created, but not 35% of accounts in active use? Of the matrix users I see in our bridged Libera channel, well over half are on matrix.org.

1. 3

Well, New Vector runs a hosted homeserver thing now, so some people are “not on matrix.org” but are still on New Vector infrastructure, of course.

1. 3

Basically everyone I talked to who tried setting up their own home server gave up on it

That sounds really weird to me. I set up my own home server ages ago, back when it was harder. Now there are ansible playbooks https://github.com/spantaleev/matrix-docker-ansible-deploy/ and tutorials https://matrix.org/docs/guides/free-small-matrix-server/

1. 3

Anecdotally, I would expect it to be higher. I read an article last week that gave the 35% figure, which surprised me at the time, but I don’t remember their methodology.

2. 1

The identity server also serves out keys, right?

1. 2

I do not think so, but I could be wrong. I think your own homeserver does key backup.

1. 2

I realize that bringing this up may bog us down in some frustrating discourse about political correctness, but I do think that a serious barrier to bringing back finger is its name. Juvenile jokes about fingering someone are inevitable (“oh, when I said I want to finger her, I just meant the social network!”), and may contribute to a hostile environment for people with vulvas at a point in history where we should really know better.

If people are serious about bringing back the finger protocol, as some have become serious about resurrecting gopher, can we prioritize the “name” alias, or rewrite it in Rust and call it something clever and self-referential like “digits” that less closely resembles a sex act in English?

1. 11

or rewrite it in Rust and call it something clever and self-referential like “digits” that less closely resembles a sex act in English?

Or we could just be a little less jumpy about things in general. Especially as “finger” also has relevant and entirely non-sexual etymology as well:

The term “finger” has a definition of “to snitch” or “to identify”

FWIW I don’t like the phrase political correctness, as it’s been diluted beyond useful meaning. I’d argue that this isn’t a case of it, either: properly, it refers to whether a fact is politically safe to express or act upon. “Is Lysenkoism correct, Comrade?” “It doesn’t matter; it’s politically correct.”

So I think the real issue here is of what constitutes hypersensitivity to sexual terms - or even terms that could be interpreted as sexual; I’d be willing to bet that the author(s) of finger intended it as a double entendre.

1. 3

I don’t think this is about jumpiness as much as it is caring about people who have an experience different than your own.

There’s no question there’s a group of people to whom language around “fingering” causes discomfort and/or painful memories. I’m not sure what the size of that group is (maybe it’s small!), but it’s also unquestionably disproportionately women.

Since we’re trying to make software a more welcoming place for underrepresented groups, like women, it seems to me that @skyfaller is asking if this is a conversation we should have.

Meanwhile, it seems to me that you’re saying (a) you can’t imagine being in that group of people and that group of people should get over it, and moreover (b) the conversation is illegitimate.

The problem with the internet is how many people are now reachable. Is it hypersensitivity if you cause many thousands of people distress, even if the denominator is much much larger? If choosing your words empathetically is akin to censorship to you (Lysenkoism, comrade), I don’t know what to say. (edit: I’m sorry, I misread; I see you’re saying it’s not this. If anything I guess I just take issue with calling it hypersensitivity as opposed to discussing how much we should be sensitive.)

1. 9

The problem with the internet is how many people are now reachable. Is it hypersensitivity if you cause many thousands of people distress, even if the denominator is much much larger?

It’s hypersensitivity to be caused distress by the word finger. I’m sorry, but it’s silly to pretend otherwise.

1. 2

Another “I can’t imagine this bothering me” vote.

1. 3

But this reasoning is circular without some objectivity. You must be making some value judgments yourself without consulting an opinion panel, or “having a discussion”.

2. 4

If people get offended by something, I most certainly don’t care.

1. 3

I will first say that I love your thoughtfulness and your willingness to reconsider in the middle of a discussion. That’s what this should all be about. Kudos.

That being said, my inner Izzit jumped at this:

I’m not sure what the size of that group is (maybe it’s small!), but it’s also unquestionably disproportionately women.

Is it though? Can ‘men’ (whatever that means) not get fingered?

It does not invalidate your point that we need to be thoughtful and conscious in our language, but even when arguing for thoughtfulness, we can still be blind to our own bias (or ‘blind to our own view of reality shaped by our own lived experiences’).

All that to say - let’s have these discussions, but go out of our way to have them with people actually affected, whose version of reality looks nothing like ours - with humility.

Side story - probably unrelated: I feel violated when anyone on my team uses the word “bang”. I honestly feel enraged and it takes everything in my power to not throw my computer across the room anytime I’m invited to “bang this out with someone,” or when a team member relates how they solved a problem by staying up late and “banging it out.”

I have not said anything to these people. I probably never will. For reasons that you may or may not be able to imagine, based on your own lived experiences.

I honestly have no idea if they are aware of what it sounds like. I know for a fact they are not thinking of me (intending to get a reaction out of me, or how it would make me - or anyone else - feel).

1. 3

All that to say - let’s have these discussions, but go out of our way to have them with people actually affected, whose version of reality looks nothing like ours - with humility.

This should be printed on a shirt. You make an excellent point about my own biases and assumptions even in an attempt to try and be thoughtful about it in my personal vacuum. :+1:

2. 3

If anything I guess I just take issue with calling it hypersensitivity as opposed to discussing how much we should be sensitive.

Yes, this is the conversation I think we need to have: what is a desirable level of sensitivity, and when does it tip over to being excessive? The answer will be different for different people in different communities, but it should be an ongoing conversation that changes with the times, and doesn’t assume that because something has always been accepted that it should continue to be acceptable (and perhaps allows for language to move the other direction as well, e.g. if slurs like “queer” are reclaimed as words one can take pride in).

Elsewhere in this thread, @david_chisnall lists other utilities that also lend themselves to double entendres:

In contrast, finger always led to people making fingering jokes (mind you, with touch, unzip, mount, fsck, and so on, the UNIX command line doesn’t really provide a shortage of input for a dirty mind).

I’ve seen t-shirts etc. that use these utility names to refer to sexual acts, but I will admit that for whatever reason these others did not set off alarm bells for me in quite the same way as finger. If I were god emperor of command line utilities, I might rename finger and leave the rest alone. Are there people who find one of these other utility names more troubling, and find finger to be acceptable?

How do we decide when a name needs to be changed, or when the change is worth the effort / inconvenience? Is there a meaningful difference between many people experiencing mild discomfort and a few people experiencing extreme distress? How many people have to experience an issue, and how severe must that issue be, before we seriously discuss changing a name? How much should we care about how the worst people in a community behave vs. more common behavior? There will always be someone who finds something objectionable about anything, so we cannot take action on every objection, but it also doesn’t seem like we should ignore all objections.

How do we weigh the current community vs. an aspirational community we hope to have? I think one important piece of context is that programming has been something of a boy’s club for decades, and if the profession wants to be more inclusive and welcoming, it perhaps should err on the side of being more sensitive. The absence of objections could be a sort of survivorship bias.

I also think it’s worth taking a stand in favor of caring about people’s feelings, instead of making it cool to be cruel. Many people have an attitude of “fsck your feelings” until it’s their feelings that are hurt, and then they demand blood. We shouldn’t be ruled by people who refuse to consider any feelings they don’t share, any more than we should be ruled by puritans who object to everything.

2. 2

That angle is certainly one that’s been brought up time and again. And then you get questions about whether terms like domain driven design might also be problematic. I’m not sure where the line should be, and I don’t think I have the standing to weigh in very much.

Sometimes, it’s easy for me to see. For example, it is super clear to me that we should not refer to hacked-up solutions as (racial-slur)-rigged.

I do also tend to think that if people are saying that “finger” and “domain driven design” make it challenging for them to participate in the community, IMO we should just believe them and swap out the terms for different ones without the same issue.

But ideally, I’d like to have a heuristic for identifying these terms before they actually make a person feel excluded, and I have a hard time identifying such a thing in the last two cases. I don’t like the idea of waiting until someone is uncomfortable participating because I repeated a term of art.

I know that I just need to accept that discomfort as a privilege (it sure beats being afraid to participate!) but I still want to spot a way to be better.

1. 6

I was really confused by the domain-driven design thing. First I clicked on the link to see what ‘domain’ meant that caused problems for folks, only to discover that it was the TLA, DDD, that was the problem. The person in the Twitter thread was claiming that this was a bra size, which confused me because I was under the impression that it was not. So I then ended up reading about bra sizes on Wikipedia, and it turns out that we were both right: it isn’t one in my locale. DDD in US sizes is called E in the UK (and is more commonly called F in the USA). So we have a thing whose name is completely fine, whose initialism happens to be the same as a locale-specific bra size (but only in parts of the US), and this is a problem because porn apparently uses DDD to mean ‘big breasts’ (ignoring the fact that this is a ratio measurement and has nothing to do with absolute size, only the relative size of the cup and the band). So the chain from the thing to the offensive thing is four links long. Given a chain that long, I suspect I could link any word in any alphabet to a term that some people would find offensive.

In contrast, finger always led to people making fingering jokes (mind you, with touch, unzip, mount, fsck, and so on, the UNIX command line doesn’t really provide a shortage of input for a dirty mind).

1. 4

Given a chain that long, I suspect I could link any word in any alphabet to a term that some people would find offensive.

I don’t know the details of this specific case, but if someone is taking offense at the result of a chain that long, it’s reasonable to infer that they’re choosing to take offense.

1. 1

I don’t know the details of this specific case, but if someone is taking offense at the result of a chain that long, it’s reasonable to infer that they’re choosing to take offense.

This is very close to getting political, but it’s not about choosing to take offence. Imagine being a non-dude, coming into a computer club where all the young-ish dudes are talking about fingering each other and others (you can probably imagine the jokes). None of it is probably intended to offend anyone, and none of the comments or jokes will offend you… but being constantly surrounded by it will eventually just wear you down.

TL;DR it’s not necessarily offensive, it’s inconsiderate and it’s just a part of everything else that is inconsiderate out on the internet.

1. 3

This is very close to getting political, but it’s not about choosing to take offence.

Sure; I was referring to taking offense at Domain Driven Design because its acronym is also a bra size in some countries.

I think a reasonable person could object to a utility with a name that’s a double entendre. I think at the point where you’re offended by Domain Driven Design, though, you’re looking for something to be offended by.

1. 1

Sure; I was referring to taking offense at Domain Driven Design because its acronym is also a bra size in some countries.

1. 2

No offense taken ;)

Seriously, lobste.rs has to be about the only place on the public Internet I feel comfortable actually discussing topics like these, in part because people tend to assume good intent and act politely ❤️

2. 1

It’s very common IME for American neo-puritans to assume that their own parochial, contingent cultural anxieties do and should apply unconditionally to the rest of the world. A form of cultural imperialism, I would say.

I’m very much on board with attempting to empathize with people with lived experiences different from the majority demographic in a community, but does that mean that a single person from a morally-privileged group gets an unconditional veto over anything they object to?

3. 2

But ideally, I’d like to have a heuristic to identify these terms before they actually make a person feel excluded

I don’t think that’s possible, since many of these are based on intentional mis-readings, or terms whose underlying language has dramatically shifted, or taking things out of context etc.

The only thing one can do is either make a change, or stay silent in the face of a request for one. Making that call is hard and will depend on how many spoons you have when it comes in, honestly.

1. 2

many of these are based on intentional mis-readings

I think I’m missing a reference here, because I am not aware of a situation where someone intentionally misread something and then felt excluded on that basis, let alone many of them. But I’m specifically referring to situations where someone wants to participate and feels uncomfortable doing so because of the language being used by others. Bad faith misreadings are entirely different.

Making that call is hard and will depend on how many spoons you have

Is that an auto-correct error? If not, can you tell me what “how many spoons” means in this context or point me to an explanation? I’m a native US English speaker, and that is new to me.

I recognize that I’m engaging in wishful thinking. It doesn’t seem like a bad wish, though, to want to be smart enough to identify exclusionary things before they make someone feel excluded.

1. 3
1. 1

Thank you. That’s much less crass than the idiom I usually use to reference that same thing. I plan to adopt that.

1. 1

I have taken to using the term “koalas” after Ze Frank’s “koalas in the rain” song.

2. 1

Sometimes, it’s easy for me to see. For example, it is super clear to me that we should not refer to hacked-up solutions as (racial-slur)-rigged.

I had to Google that one - the only “rigged” phrase I knew already was jury-rigged, and I was wondering how anyone could find it offensive.

Agreed that it’s pretty clear that the extreme cases like those are easy; it’s drawing a line between accidental exclusion and hysteria that’s difficult.

Ironically, the term hysteria itself used to be deeply misogynistic; the root word is “hystera”, that is, “uterus”.

1. 2

I’m sorry to have made you google that but simultaneously glad that it wasn’t already in your vocabulary. I learned it from my supervisor at my first job in high school. When he was informed by more senior management that it wasn’t acceptable, he replaced it with the term “afro engineering”. IIRC he was given his walking papers shortly thereafter for a completely unrelated reason. The state of that part of the world at the time was that as long as you weren’t literally using the n-word, you weren’t being racist.

1. 1

Probably just because I’m Australian. We have our own collection of similarly vile racial terms, just aimed at Aborigines :(

2. 1

The term “jerry-rigged” (or “jerry-built”) is the only other such form I know of, which I used to think was connected to the slang (slur?) term for Germans, but apparently it predates Germans being referred to as “Jerry” (circa WW1).

1. 1

Wikipedia has the etymology of this one: it’s a nautical term and has nothing to do with any group referred to as Jerry - the “jerry” variant is a more modern corruption. I learned something, actually - I thought a jury rig was a Napoleonic-era naval term, but apparently it’s a couple of hundred years older.

1. 7

The real tragedy is that this choice is still such a big deal, because calling code in one language from another has only slightly improved in decades.

1. 24

I’m sympathetic to the goal of making reasoning about software defects more insightful to management, but I feel that ‘technical debt’ as a concept is very problematic. Software defects don’t behave in any way like debt.

Debt has a predictable cost. Software defects can have zero costs for decades, until a single small error or design oversight creates millions in liabilities.

Debt can be balanced against assets. ‘Good’ software (if it exists!) doesn’t cancel out ‘Bad’ software; in fact, it often amplifies the effects of bad software. Faulty retry logic on top of a great TCP/IP stack can turn into a very damaging DoS attack.

Additive metrics like microdefects or bugs per line of code might be useful for internal QA processes, but especially when talking to people with a financial background, I’d avoid them, and words like ‘debt’, like the plague. They need to understand software used by their organization as a collection of potential liabilities.

1. 11

Debt has a predictable cost. Software defects can have zero costs for decades, until a single small error or design oversight creates millions in liabilities.

I think you’ve nailed the key flaw with the “technical debt” metaphor here. It strongly supports the “microdefect” concept, which is explicitly an analogy to microCOVID (the piece doesn’t mention that microCOVID is itself named for the micromort). The analogy works really well for exactly your point: these issues cost almost nothing until a sudden, potentially catastrophic failure. Maybe “microcrash” or “microoutage” would be a clearer term; I’ve seen “defect” used for pretty harmless issues like UI typos.

The piece is a bit confusing in that it leans on the phrase ‘technical debt’ while trying to supplant it; it would be stronger if it used the phrase only once or twice, to argue its limitations.

We’ve seen papers doing large-scale analyses of bugfixes on GitHub. It feels like that kind of analysis could provide some empirical justification for assigning values to different microdefects.

1. 1

I’m very surprised by the microcovid.org website not mentioning their inspiration from the micromort.

1. 1

It’s quite possible they invented the term “microCOVID” independently. “micro-” is a well-known prefix in science.

2. 1

One thing I think focusing on defects fails to capture is the way “tech debt” can slow down development, even if it’s not actually resulting in more defects. If a developer wastes a few days flailing because they didn’t understand something crucial about a system (e.g. because it was undocumented), then that’s a cost even if it doesn’t result in them shipping bugs.

Tangentially relatedly, the defect model also implicitly assumes a particular behavior of the system is either a bug or not a bug. Often things are either subjective or at least a question of degree; performance problems often fall into this category, as do UX issues. But I think things which cause maintenance problems (lack of docs, code that is structured in a way that is hard to reason about, etc) often work similarly, even if they don’t directly manifest in the runtime behavior of the system.

1. 1

Microcovids and micromorts at least work out in the aggregate; the catastrophic failure happens to the individual, i.e. there’s no joy in knowing the chance of death is one in a million if you happen to be that fatality.

Knowing the number of code defects might give us a handle on the likelihood of one having an impact, but not on the size of its impact.

2. 3

Actually, upon re-reading, it seems the author defines technical debt purely in terms of code beautification. In that case the additive logic probably holds up well enough. But since beautiful code isn’t a customer-visible ‘defect’, I don’t understand how monetary value could be attached to it.

1. 3

I usually see “tech debt” used to describe following the “no design” line on https://www.sandimetz.com/s/012-designStaminaGraph.gif past the crossing point. The idea is that the longer you keep on this part of the curve, the harder it becomes to create or implement any design, and the slower maintaining the code becomes.

1. 1

I think this is the key:

For example, your code might violate naming conventions. This makes the code slightly harder to read and understand which increases the risk to introduce bugs or miss them during a code review.

Tech debt so often leads to defects, they become interchangeable.

1. 1

To me, this sounds like a case of the streetlight effect. Violated naming conventions are a lot easier to find than actual defects, so we pretend fixing one helps with the other.

2. 3

I think it’s even simpler than that: All software is a liability. The more you have of it and the more critical it is to your business, the bigger the liability. As you say, it might be many years before a catastrophic error occurs that causes actual monetary damage, but a sensible management should have amortized that cost over all the preceding years.
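The amortization idea can be put in back-of-the-envelope terms: treat the rare catastrophe as an expected annual cost and budget for it every year. A minimal sketch, with entirely hypothetical numbers:

```python
# Back-of-the-envelope amortization of a rare catastrophic software
# failure. All numbers are hypothetical.

p_catastrophe_per_year = 0.02       # say, a 2% chance of a major incident each year
damage_if_it_happens = 5_000_000    # cost of that incident, in dollars

# Expected annual liability: what a sensible management would budget
# (or reserve) every year, even in years when nothing goes wrong.
expected_annual_liability = p_catastrophe_per_year * damage_if_it_happens
print(expected_annual_liability)  # roughly 100,000 per year
```

On this view the quiet years aren’t free; they’re years in which the reserve should have been accumulating.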

1. 1

I think it was Dijkstra who said something like “If you want to count lines of code, at least put them on the right side of the balance sheet.”

2. 2

Debt has a predictable cost

Only within certain bounds. Interest rates fluctuate and the interest rate that you can actually get on any given loan depends on the amount of debt that you’re already carrying. That feels like quite a good analogy for technical debt:

• It has a certain cost now.
• That cost may unexpectedly jump to a significantly higher cost as a result of factors outside your control.
• The more of it you have, the more expensive the next bit is.
1. 1

especially when talking to people with a financial background, I’d avoid them, and words like ‘debt’, like the plague

Interesting, because Ward Cunningham invented the term when he worked as a consultant for people with a financial background, to explain why code needs to be cleaned up. He explicitly chose a term they knew.

1. 1

And he didn’t choose very wisely. Or maybe it worked at the time if it got people to listen to him.