Threads for emery

  1. 1

    I use the fish shell and my prompt function plays an LCARS beep selected by the exit code of the previous command. Perceived productivity is way up.

    1. 1

      While I would never use this (macOS is the best Linux DE), I think it’s really fucking rad that someone put this much effort into making something so detailed. I think it’s time to watch a few episodes of TNG again!

      Oohh, please share.

      1. 2

        The sounds I got from Trek Core. I would prefer to synthesize beeps rather than use samples, but I don’t have the expertise. I’ve looked a bit at ChucK and SuperCollider, but synthesis and routing details aside, it would take a real artist to recreate something like TNG sound effects.

        In fish the prompt comes from a shell function rather than an environment variable, so it’s possible to call an arbitrary command like ~/bin/beep $status every time the prompt is regenerated. I keep an instance of mpv running idle and send it commands over a socket to enqueue the beep files. It could be done with a script, but I use Syndicate to abstract the beep command from sending commands to mpv, so that I could send beep requests over the network, though I haven’t tested that. Some rainy day I will test and publish it.
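        A minimal sketch of such a beep helper in Python (the socket path, file names, and status-to-sound mapping are my assumptions, not the commenter’s actual setup): it forwards the previous command’s exit status to an idle mpv instance over mpv’s JSON IPC socket, assuming mpv was started with mpv --idle --input-ipc-server=/tmp/mpv-beep.

```python
import json
import socket
import sys

SOCKET_PATH = "/tmp/mpv-beep"   # assumed; must match --input-ipc-server
SOUNDS = {0: "ok.wav"}          # hypothetical mapping: status -> sample file

def beep(status: int) -> None:
    """Enqueue the sample for this exit status on the idle mpv instance."""
    sample = SOUNDS.get(status, "error.wav")
    # mpv's JSON IPC: one JSON object per line; "append-play" enqueues
    # the file and starts playback if nothing is playing.
    cmd = {"command": ["loadfile", sample, "append-play"]}
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(SOCKET_PATH)
        s.sendall(json.dumps(cmd).encode() + b"\n")

if __name__ == "__main__" and len(sys.argv) > 1:
    beep(int(sys.argv[1]))
```

        In fish, the wiring is then just a line in the prompt function, something like ~/bin/beep $status.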

    1. 2

      “Making inefficient markets more efficient with software” was a good idea, and companies like Uber, Doordash, and Etsy captured a lot of the value.

      This article isn’t about osdev or unix, it’s trashy propaganda written by someone too stupid to realize they are writing propaganda.

      1. 7

        “Making inefficient markets more efficient with software” was a good idea, and companies like Uber, Doordash, and Etsy captured a lot of the value.

        None of this is what those companies do, nor is it why they make money. They used investor money to artificially lower the cost of a service to consumers, in order to lock customers and vendors out of their old relationships and into a walled garden market the company can control and tax.

        1. 1

          I disagree that it’s trashy or propaganda. There is definitely value in these thoughts.

          I do agree that the article is not about osdev or any technical aspect of unix. Perhaps it should be tagged culture?

          1. 0

            There is definitely value in these thoughts.

            Define value, because in this context it means exploiting people while failing to make a profit.

        1. 32

          For some reason the SFC seems to ignore the reason the rules are in place. The app store was full of FOSS apps with similar icons which were just compiled free software, but cost money (and were usually badly packaged and not updated). While technically legal, it’s basically a scam, since the publishers had nothing to do with the developers. The new rule is not great, but the previous situation was not better.

          See Tantacrul raising this issue some time ago about Audacity: https://mobile.twitter.com/Tantacrul/status/1520135740159664128

          1. 12

            Sometimes it doesn’t matter what reason a rule is in place if that rule categorically punishes people who are trying to do the right thing.

            1. 22

              That’s very black and white. There is no solution here that will satisfy all developers, prevent all scams, and provide proper choice to the users. Someone will end up unhappy however MS decides to play this. Punishing some people trying to do the right thing will happen, and we can only hope it’s minimised.

              The reason for the change matters, otherwise we’d just complain regardless of what the rule says.

              1. 4

                It is black and white: innocent people should not be punished because it improves the image of the marketplace. There are plenty of cases where society accepts a suboptimal solution and we just deal with it.

                1. 9

                  I assume most FOSS developers would rather see paid versions of their software that they don’t profit from kept off the store than profit from them themselves. This change probably isn’t even net harmful.

                  1. 5

                    If they have a problem with that, why would they release their code under a FOSS license?

                    1. 6

                      If you released a piece of software under a FOSS license and someone bundled a 4 year old version of it with malware and was selling it for $5 on an App Store, would you just shrug your shoulders and say “whelp I guess I asked for it with that FOSS license, nothing to be done about it?”

                      1. 2

                        I’ve contributed code to BSD licensed projects. My code has been packaged and sold to big companies - other people are making money on my contributions. Those companies often don’t contribute back… in some relatively rare cases they upstream changes, but mostly it’s one way traffic.

                        This does not bother me. If it did, I’d have not given my work to a BSD-licensed project. Easy as.

                        If you apply a license to your project that lets someone foo, then they foo, I don’t think you have much room to complain.

                        1. 1

                          I guess I was also assuming there were trademarks at play here but maybe not. What you’re saying is fair.

                        2. 1

                          Malware would be a problem, but I don’t see why you need a special policy for FOSS programs. Don’t they already take down most malware?

                        3. 2

                          You’re right, but I think this line of thinking is a distraction when we’re talking about a curated marketplace. Yes, anybody can legally distribute their own binaries or forks of FOSS but MS is under no obligation to accept them. The question is how MS can best guide its users toward installing the FOSS that they really wanted (or more importantly, guiding them away from installing the version they didn’t want).

                          Much like a Linux distribution it would be best if humans kept track of the community to promote the version that the user probably wants, including the nuance of providing an alternative fork if a legitimate contender arises. Unlike a Linux distribution there is actual cashflow up for grabs which is going to make the arguments all the spicier.

                          Imagine a scenario where GIMP (random example) decided to charge $100 and used (updated) MS policy to be the only option on the store. Meanwhile, unsophisticated users are saying “heck no”, googling and getting tricked into malware-infested versions via ad links. Then other trustworthy but unaffiliated developers are offering to provide free builds and berating MS for not letting them in. I could understand it if MS intentionally banned FOSS entirely just to avoid this kind of drama.

                          1. 1

                            I can put out a bowl of candy in front of my doorstep, with the assumption that people will act in good faith and just take a piece or two. Somebody might come along and take all the candy. It might be a bit silly of me to get mad at this person for doing this, since it’s perhaps no less than I can expect, just leaving out a bowl of candy unattended. And certainly, leaving an armed guard in front of the candy to make sure that no one takes more than one piece wouldn’t at all be in the spirit of giving and sharing. But I think it’s still completely reasonable to prefer that people just take one piece each.

                1. 3

                  No talk about encryption in 2022? 😔

                  1. 10

                    What encryption do you have in mind? IRC already supports SSL, and end-to-end encryption would not be feasible with their goal of minor backwards-compatible improvements.

                    1. 8

                      I don’t get why encryption is always brought up when talking about IRC. What’s the threat model here? I mean, I’m all for E2EE (I know you didn’t say that specifically) but that makes more sense for actually private conversations—IRC is for public discourse and must be used as such.

                      1. 3

                        For one-to-one, end-to-end encrypted chat, OTR works well enough. It isn’t universally supported; I use it though. For secure group chat, if you trust someone to host, it’s pretty straightforward to set up a private server with TLS.

                      2. 2

                        No sense in talking about encryption in 2022 if we aren’t better at hiding metadata. As far as I know dark IRC servers on Tor hidden services are still state-of-the-art for private groupchat.

                        1. 8

                          I’m only on GitHub to make PRs. An important step for me was to consciously avoid the dopamine triggers by removing all my stars and follows, and making my profile page as boring and inconsequential as possible. I find it’s easier to ignore the social scoring by committing to not reciprocate.

                          1. 4

                            An important step for me was to consciously avoid the dopamine triggers by removing all my stars and follows

                            I’m genuinely curious, what’s your reason for doing that? To me, those things are the most direct indicators possible that people give a shit about what you’re doing and about you personally. That’s kind of what it’s all about for me; having people use and care about things I create is my primary motivation for programming (outside of work, which I do primarily for money).

                            1. 15

                              To me, those things are the most direct indicators possible that people give a shit about what you’re doing and about you personally.

                              Clicking a button doesn’t exactly signal “giving a shit” to me… it requires no effort. What signals (to me) that people give a shit is seeing patches come in, bug reports, random folks on IRC or whatever saying “hi” and thanking me, seeing distros picking up my stuff, and so on. Giving fake internet points doesn’t necessarily mean that anyone gives a shit, at best they’re probably bored or mildly interested enough to click a button and move on to the next shiny thing in their feed.

                              1. 4

                                Every pull request or email patch I’ve received is a thousand times more meaningful than any star or follow on Github. Those are just pointless.

                                Originally stars were for bookmarking, but they’ve degenerated into meaningless 👍🏻/+1 noise.

                                1. 3

                                  Exactly - a “star” can simply mean “hey this looks cool”. I’m sure a majority of people who star a project never even tried to use the project. It’s just ego inflation. More important is that people actually use your stuff, in places where it matters. If some project is technically cool but unusable, it could still acquire many many stars.

                                  1. 3

                                    I usually star projects so I can find them later.

                                2. 7

                                  I’m genuinely curious, what’s your reason for doing that? To me, those things are the most direct indicators possible that people give a shit about what you’re doing and about you personally

                                  My bar for that rests at the point that someone gives me their personal feedback on my work in a way that lets me know they have actually read, studied, or used it. That is giving a shit. Competing with the whole world and collecting a few imaginary stars, stickers, or points does not say anything about your work, unless you happen to be a marketeer.

                                3. 1

                                  I set mine to private; everyone can do the same. Eventually all my code will be self-hosted and accessible via an RSS feed and HTTPS.

                                1. 1

                                  Hey fellow full-stack devs, designers and Internet users, what’s your opinion on this?

                                  1. 10

                                    The question was posted to StackExchange 3 years ago, and I have not seen emoji turn up in UX in any significant amount since then. So I don’t think the wider UX community thinks it’s a good idea.

                                    1. 3

                                      If it’s text mixed with emoji, it looks bad but is a fair warning about what kind of people you are dealing with. If it’s a link or button that only contains emoji, it’s not going to render at all for some people.

                                      1. 1

                                        In what context would it not render at all? The only case I can think of is an embedded platform too resource-constrained to have an emoji font, which seems kind of niche. (Not to mention the original post was about a native iOS app.)

                                        1. 1

                                          A native app using emoji for labels or buttons seems like only a potential accessibility problem. As for websites, no, emoji fonts are not ubiquitous. If browsers aren’t required to bundle them, then the only safe assumption is that they aren’t there.

                                          1. 1

                                            Apple’s text-to-speech reads out descriptions of emoji (just have Siri read you your text messages sometime.) I’m sure other assistive tech does too.

                                            Platforms aren’t required to have fonts covering the rest of Unicode either, but if you follow that line of reasoning, you can’t use math symbols or any non-Roman alphabet in web pages. ¯\_(ツ)_/¯

                                            The one environment I just ran across that doesn’t support emoji is the full-screen console mode of a Raspberry Pi. I have a server I wrote that uses some emoji in its output, and I noticed they don’t show up when I run it on my Pi outside a windowing environment. I’m not taking them out though :)

                                      2. 1

                                        The original post was about a native (iOS) app. By “full-stack” I think you mean web, which is a different context with different issues, like visual variance between platforms.

                                      1. 2

                                        Wait, if you can’t compare string values in Dhall what do you do instead?

                                        1. 3

                                          If you need to work with a string that can have a finite set of possible values, then use an enum type. If the string can be an arbitrary value, then you probably ought not compare it.
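                                          The same idea sketched in Python rather than Dhall (Dhall itself would use a union type such as < Dev | Prod >; the names below are invented for illustration): the finite set becomes a type, so dispatch replaces string equality and arbitrary strings are rejected at the boundary.

```python
from enum import Enum

class Environment(Enum):
    """The finite set of allowed values, as a type rather than bare strings."""
    DEV = "dev"
    PROD = "prod"

def bucket_name(env: Environment) -> str:
    # Exhaustive dispatch on enum members; no free-form string comparison.
    return {Environment.DEV: "my-app-dev", Environment.PROD: "my-app-prod"}[env]

# Parsing happens once, at the edge; invalid input fails loudly there
# instead of silently comparing unequal somewhere deep in the config.
env = Environment("prod")
assert bucket_name(env) == "my-app-prod"
```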

                                          1. 1

                                            Presumably you have to convert them to the same string type first? (e.g. if one is raw bytes and the other is decoded, or one is a list of 32-bit chars and the other is a vector of UTF-8 bytes)

                                          1. 12

                                            Nim is case-insensitive with an exception for the leading character, and that makes some people angry https://lobste.rs/s/uyv16t/nim_v2_get_rid_style_insensitivity

                                            1. 7

                                              Note: This is copied from my comment on the orange site.

                                              I appreciate this article because this is an important distinction to make. In fact, it is so important that I am willing to rewrite code in order to know the names and contact information of all of the people that my dependencies depend on, as well as having some sort of professional relationship with them.

                                              For example, in a project I am working on, I need a database, a way to talk over the Internet, and cryptography. Obviously, I know what database to use: SQLite (D. Richard Hipp). Obviously, I know what dependency to use for talking over the Internet: curl (Daniel Stenberg).

                                              Cryptography is harder, but I finally settled on BearSSL (Thomas Pornin). BearSSL does not give me everything I want, however; since I want OPAQUE (a way for clients to not give their password to a potentially malicious server), I need that. BearSSL also does not give me a “KSF,” or key-stretching function, which OPAQUE requires, though I can use Argon2i for that.

                                              The reference implementation for Argon2i unfortunately seems dead, even though I know the names of the people who made it. I don’t know if they will respond if I contact them, while I do know that D. Richard Hipp, Daniel Stenberg, and Thomas Pornin will respond. So in order to make sure I always have a point of contact with all of my dependencies, I am going to write Argon2i, BLAKE2b (needed by Argon2i), and OPAQUE myself.

                                              Bad idea? Yeah, don’t roll your own crypto. But I am studying hard, and I intend to get my crypto audited before publishing.

                                              The end result, however, is very worth it: my dependencies will be well-known, and I know each of the authors personally, albeit through email.

                                              And down the road, if I manage to make some money, I can kick some of it back to them in exchange for their previous help. In turn, they’ll be happy to continue the relationship.

                                              That’s how Open Source works at its best: it depends on relationships, and on giving back to those relationships. I think that that is what this article is trying to say, and I whole-heartedly agree.

                                              1. 7

                                                What you’ve written resonates with me, but it seems to push a lot of people to the side. People with families may not have the time to learn the algorithmic details of, say, BLAKE2b. Others with learning difficulties or large knowledge gaps in the thing they want to recreate may never be able to achieve an implementation in a reasonable amount of time.

                                                And then there’s the other side: maybe these FOSS authors really don’t want to know you. Maybe the truth is, they write software because they like it, and feel a sense of real achievement from being helpful to humanity, but want nothing more. The semi-recent events of authors being ridiculed for “lack of support” are a similar example of “don’t bother me”.

                                                Trust though is huge and that’s why we have digital signatures for software packages, source code tarballs, and whatever else we want to maybe trust. I think the motto “don’t trust, verify” is heavily underrated, and should be applied more diligently to our dependencies. I believe this tool is helping with that.

                                                The context will determine the degree of trust you require.

                                                1. 2

                                                  You bring up good points. And yet…

                                                  As it so happens, I have a wife. I have very little time, and I have knowledge gaps in cryptography. I am not going to be able to do this in a reasonable amount of time. Yet I am still doing it because I feel strongly about doing my software right. Not many people want to. I do want to, and I am betting that once I do publish something, people would want to use it.

                                                  As I mentioned on the orange site, I already have an email relationship with two of the three programmers. All three of them make their money off of the particular software I am going to use as dependencies (well, one makes their money off of consulting in the same area as the software he made), so if all else fails, I can throw money at them in the same way that others already do.

                                                  That’s why my three requirements for dependencies included some way to pay for something. First, as TFA says, that’s important for Open Source, and second, it will probably be important to my business later.

                                                  And as for trust, well, if I have paid them, part of the product I will pay them for is to have them declare that their software is fit for my purposes. (Basically, to nullify the “no warranty” clause of licenses, so I would have some sort of warranty.) While that doesn’t take care of the problem, there are laws that govern the use of warranties, so I would have some recourse. And when they take money from me, they would probably have more incentive to keep me happy, i.e., they would have incentive to be trustworthy.

                                                  Still not enough for complete trust, but hey, it’s only three dependencies. That makes it easier to audit them personally.

                                                  1. 4

                                                    Yet I am still doing it because I feel strongly about doing my software right.

                                                    Devil’s advocate: And so any other way is doing software wrong?

                                                    I already have an email relationship with two of the three programmers.

                                                    So in this particular case it’s possible / they accept relationships, but this isn’t true of all scenarios…

                                                    The trust part yeah, there are a lot of ways to approach it. I wasn’t dismissing anything you said, just augmenting it with my own comment 🙂

                                                    Thank you for the insightful conversation! Makes you think.

                                                    1. 2

                                                      Devil’s advocate: And so any other way is doing software wrong?

                                                      If you’re paid for it, yes. If not, no. But I do want to be paid for mine eventually. Like the three fellows I will be depending on.

                                                      So in this particular case it’s possible / they accept relationships, but this isn’t true of all scenarios…

                                                      Correct. I had to do the leg work first before deciding on them. And if they decide to cut me off, I’m going to have to replace their software with something else or write it myself.

                                                      Thank you for the insightful conversation! Makes you think.

                                                      You’re welcome. And thank you. :)

                                                      1. 1

                                                        If you’re paid for it, yes

                                                        This just in: getting paid is bad. I think respecting licenses and creating software off the works of others is how we advance as a society (and indeed, is how countless other creative pursuits have evolved). My cynical take here is that purists in this respect don’t usually get very far.

                                                  2. 2

                                                    People with families may not have the time to learn the algorithmic details of, say, BLAKE2b.

                                                    Implementing BLAKE2b is only a couple hours of work using the documentation provided, and it comes in at less than 200 SLOC. Packages like that are the easiest to write and maintain.
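                                                    One cheap way to keep such a from-scratch implementation honest is differential testing against a trusted one; a sketch using Python’s hashlib (which ships BLAKE2b) as the reference:

```python
import hashlib
import os

def check_against_reference(candidate_hash):
    """Differentially test a from-scratch BLAKE2b-512 against hashlib's,
    across lengths that straddle the 128-byte block boundary."""
    for size in (0, 1, 63, 64, 65, 127, 128, 129, 1000):
        msg = os.urandom(size)
        expected = hashlib.blake2b(msg).digest()
        assert candidate_hash(msg) == expected, f"mismatch at length {size}"

# Sanity check: the reference trivially agrees with itself. A real run
# would pass your own implementation's hash function instead.
check_against_reference(lambda m: hashlib.blake2b(m).digest())
```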

                                                    1. 2

                                                      is only a couple hours of work using the documentation provided

                                                      You missed “for me” somewhere in there

                                                      1. -1

                                                        What you’ve written resonates with me, but it seems to push a lot of people to the side.

                                                        If you aren’t willing to do the work then your opinion can be pushed to the side.

                                                        1. 1

                                                          What if you’re willing but unable?…

                                                1. 15

                                                  I’d prefer the Linux community instead pushed Discord users to Matrix or back onto IRC or XMPP. I understand trying to support where the masses are, but at what cost when user privacy is at stake? There’s nothing open about Discord, and they’ve actively shut down every project that tried to build a CLI or alternative client, which is against the spirit. The games played can be closed source because they don’t contain all of your private communications; that sort of stuff should be fully open (and ideally end-to-end encrypted).

                                                  Otherwise, I just agree with a sibling that users should keep it in the browser sandbox instead of Electron. It’s a shame Mozilla shut down SSB, its way to have PWAs, before almost anyone knew about it. Users should be pushing back on communities standardizing on Discord as well.

                                                  1. 10

                                                    That’s a nice sentiment, but it does nothing to help combat the network effect that is why Discord is so irreplaceable.

                                                    1. 9

                                                      Lucky for me, the communities I participate in aren’t limited to Discord, and if they were, I just wouldn’t join (and I have raised a complaint with ones I’ve considered joining).

                                                      1. 5

                                                        That’s what people said about Windows, and AIM, and ICQ, and MSN, and …

                                                        If a product’s main value is its network effect, then it is extraordinarily vulnerable to disruption by a genuinely more valuable competitor. This is why Discord is so hostile to CLI / alternative clients.

                                                        1. 1

                                                          …nor the practical reasons why people use it.

                                                          1. 2

                                                            What are those practical reasons, aside from the network effect?

                                                            1. 3

                                                              The fact that, compared to XMPP or IRC (as mentioned in the parent comment), features like rich text, media uploads, calls, push notifications, message carbons, etc. are there (unlike IRC) and work consistently (unlike XMPP).

                                                              1. 2

                                                                And compared to modern FOSS software like Matrix or Zulip?

                                                                1. 1

                                                                  Not to mention, what’s mentioned seems mostly like progressive enhancements that certain IRC clients do support. Or they are easy to supplement with other dedicated FOSS services. Mumble still works, as do Jitsi and Jami.

                                                                  1. 5

                                                                    Not to mention, what’s mentioned seems mostly like progressive enhancements that certain IRC clients do support.

                                                                    I’m a bit jaded after many of the people I knew who worked on IRCv3 gave up after the benefits failed to materialize in any tangible way for users.

                                                                  2. 1

                                                                    Matrix has many of its own issues (clients with confusing UX, servers that consume a lot of resources, a seeming general focus on features over polish). Zulip I actually respect, but the workflow is extremely polarizing.

                                                          2. 3

                                                              Matrix, IRC, and XMPP are protocols, not platforms. While I like some of these protocols, you can’t “push users to IRC”; you need to “push them to libera.chat and Thunderbird” or whatever actual platform you are going to promote to them.

                                                              The benefit of an open protocol is that we don’t all have to push the exact same platform so long as they are compatible, but “use XMPP” is like saying “use SMTP”: OK, but how and where do I use this thing? Discord could federate any time it wants, and its users wouldn’t even have to leave to “use XMPP”; so while that specific thing is unlikely, I do think when doing advocacy the advocate has to pick product/platform winners and not just vague tech.

                                                            1. 0

                                                                I don’t think this is the “Linux community”, just people who happen to run Linux but don’t have a stake in the game, or are too dumb to realize that they do.

                                                            1. 23

                                                              Interesting, thanks for writing.

                                                              The problem you run into with Ansible (as an example of a stateless solution) is that removing resources from your config doesn’t ensure they’re removed from the cloud without some care. So say I create a VPC in the YAML, run Ansible and it gets built, then I remove it from my YAML and run Ansible again, the VPC will continue to exist indefinitely without adding some other steps.

                                                              By contrast, Terraform with state typically ensures that the resources will be cleaned up once removed from the HCL.

                                                              In theory you’ll end up with much more cruft over time the stateless way. Whether or not that is more painful than working with terraform state is a compelling question. I think it depends on the team size and apply process.
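                                                                  The bookkeeping difference can be sketched with a toy model (plain Python; nothing here is real Terraform or Ansible behavior, just the shape of the problem):

```python
def stateless_apply(desired, cloud):
    """Ansible-style: ensure declared resources exist; never deletes.
    Anything in `cloud` but absent from `desired` is left untouched."""
    for r in desired:
        cloud.add(r)            # idempotent create

def stateful_apply(desired, cloud, state):
    """Terraform-style: `state` records what *we* created, so resources
    removed from the config can be destroyed on the next apply."""
    for r in desired:
        cloud.add(r)
    for r in state - set(desired):
        cloud.discard(r)        # removed from config -> destroyed
    state.clear()
    state.update(desired)

cloud, state = set(), set()
stateful_apply({"vpc-a", "vpc-b"}, cloud, state)
stateful_apply({"vpc-a"}, cloud, state)     # vpc-b is cleaned up
assert cloud == {"vpc-a"}

cloud2 = set()
stateless_apply({"vpc-a", "vpc-b"}, cloud2)
stateless_apply({"vpc-a"}, cloud2)          # vpc-b lingers as cruft
assert cloud2 == {"vpc-a", "vpc-b"}
```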

                                                              1. 4

                                                                This is exactly correct. When Terraform was very early on (I think 0.3 or some such? well before data resources), I initially did an on-prem to AWS migration solely using Terraform.

                                                                Unfortunately, it wasn’t quite up to snuff so after a few months I rewrote it all in Ansible, which ended up being far, far more verbose and had all the problems you listed. From an operator pov it ‘felt good,’ though.

                                                                Had I to do it again, I would likely use Ansible to ‘bootstrap’ all the Terraform stuff (s3 buckets and iam users/roles/keys) and do the rest with TF. Shooting for 100% Terraform (or really 100% only one tool) is usually the wrong path.

                                                                1. 4

                                                                  At a previous company we had a bootstrap terraform repo that would setup the basics for the rest of the automation, like backend buckets and some roles. The bootstrap terraform used local state, committed to the repository. It worked well enough.

                                                                  1. 2

My approach is generally to use a bootstrap Terraform stack which creates the buckets and so on that Terraform needs, then change the bootstrap stack to use the bucket it created as its state backend. Having some backups is useful (as with all statefiles) but it’s generally an easy and safe approach.
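As a sketch (the bucket name and region are placeholders), the two phases look something like this: bootstrap with the default local backend, then uncomment the `s3` backend and run `terraform init -migrate-state`:

```hcl
# Phase 1: default local backend; the bootstrap stack creates the
# bucket that will later hold its own state.
resource "aws_s3_bucket" "tfstate" {
  bucket = "example-terraform-state" # placeholder name
}

# Phase 2: once the bucket exists, switch the backend to it and run
# `terraform init -migrate-state` to move the statefile over.
# terraform {
#   backend "s3" {
#     bucket = "example-terraform-state"
#     key    = "bootstrap/terraform.tfstate"
#     region = "eu-west-1" # placeholder region
#   }
# }
```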

                                                                  2. 2

That’s thought provoking. I wonder if it would be reasonable to run Ansible (or some other stateless tool) in a mode where it went looking for things that exist in the cloud but not in the config and removed them. The flaw there would be that no Ansible config sets up an entire system from first principles, but assumes there are some tools already in place.

Maybe git or other source control could be used to take on the state burden and detect removals.

                                                                    1. 7

                                                                      The downside of that idea is that it is extremely common to have multiple Terraform workspaces managing resources at the same provider. If you did “destroy all resources that aren’t in your configuration” you’d end up removing, say, all the QA infrastructure every time someone did an update to production if the two are managed separately.

                                                                      1. 2

terraform et al are great in theory. it’s in practice where they often fall apart. it’s one of those domains where sanity must be enforced through disciplined conventions, which is hard.

                                                                        1. 1

If such a culture existed, then ansible, terraform et al would probably never have sprung into existence. This usage is very much motivated by a mindset of throwing a flashy tool at a problem, rather than understanding the fundamental problem and how to control it.

For example, using yet another AWS service out of their line-up of hundreds is a choice that is rarely questioned. Then that service has its own challenges… Ok, AWS offers yet another service to ‘solve’ them, and so on. There is no time for discipline in this reality… Just keep adding stuff and hiring engineers to cope with the system until the next company mass layoff.

                                                                      2. 2

                                                                        This is a classical problem with stateless tools, going as far back as make.

                                                                        1. 2

                                                                          This is also a thing in Puppet (another stateless system). The answer is that you have to put the absence of the resource in the config (in Puppet terms, ensure => absent). Then when the config has been applied everywhere it needs to be, you can delete that “tombstone” resource. For some resources there is also a purge attribute that means to delete all unmanaged resources found in a scope (like a directory).
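For illustration, the two Puppet mechanisms might look like this (the file paths are hypothetical):

```puppet
# Tombstone: keep the resource declared with ensure => absent until
# every node has applied it; only then can the block be deleted.
file { '/etc/app/old-feature.conf':
  ensure => absent,
}

# Purge: delete anything in the directory that Puppet doesn't manage.
file { '/etc/app/conf.d':
  ensure  => directory,
  recurse => true,
  purge   => true,
}
```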

                                                                        1. 1

                                                                          I can’t help but notice there are no images of the thing doing anything that would keep it out of the landfill in a few months.

                                                                          1. 15

                                                                            There are a bunch of good reasons for this, in no particular order:

When you’re writing a program, unless you have a 100% accurate specification and formally verify your code, you will have bugs. You also have a finite amount of cognitive capacity that your brain can devote to avoiding bugs, so it’s a good idea to prioritise certain categories. Generally, bugs impact one or more of three categories, in (for most use cases) descending order of importance:

                                                                            1. Integrity
                                                                            2. Confidentiality
                                                                            3. Availability

                                                                            A bug that affects integrity is the worst kind because its effects can be felt back in time (corrupting state that you thought was safe). This is why Raskin’s first law (a program may not harm a user’s data or, through inaction, allow a human’s data to come to harm) is his first law. Whatever you do, you should avoid things that can cause data loss. This is why memory safety bugs are so bad: they place the entire program in an undefined state where any subsequent instruction that the CPU executes may corrupt the user’s data. Things like SQL injection fall into a similar category: they allow malicious or buggy inputs to corrupt state.

                                                                            Confidentiality may be almost as important in a lot of cases, but often the data that a program is operating on is of value only to the user and so leaking it doesn’t matter nearly as much as damaging it. In some defence applications the converse is true and it’s better to destroy the data than allow it to be leaked.

                                                                            Availability generally comes in last. The only exceptions tend to be safety-critical systems (if your car’s brakes fail to respond for 5 seconds, that’s much worse than your engine management system corrupting the mileage logs or leaking your position via a mobile channel, for example). For most desktop software, it’s a very distant third. If a program crashes without losing any data, and restarts quickly, I lose a few seconds of time but nothing else. macOS is designed so that the entire OS can crash without annoying the user too much. Almost every application supports sudden termination: it persists data to disk in the background and so the kernel can kill it if it runs out of memory. If the kernel panics then it typically takes a minute or two to reboot and come back to the original state.

                                                                            All of this means that a bug from not properly handling out-of-memory conditions is likely to have very low impact on the user. In contrast, it requires a huge amount of effort to get right. Everything that transitively allocates an object must handle failure. This is a huge burden on the programmer and if you get it wrong in one path then you may still see crashes from memory exhaustion.

Next, there’s the question of what you do if memory is exhausted. As programs become more complicated, the subset of their behaviour that doesn’t require allocation becomes proportionally smaller. C++, for example, can throw an exception if operator new fails[1], but what do you do in those catch blocks? Any subsequent memory allocation is likely to fail, and so even communicating with the user in a GUI application may not be possible. The best you can do is write unsaved data to disk, but if you’re respecting Raskin’s first law then you did that as soon as possible and so doing it on memory exhaustion is not a great idea. Most embedded / kernel code works around this by pre-allocating things at the start of some operation so that it has a clear failure point and can abort the operation if allocation fails. That’s much harder to do in general-purpose code.

Closely related, the failure is (on modern operating systems that are not Windows) not reported at allocation time. Overcommit is a very important tactic for maximising the use of memory (memory that you’ve paid for but are not using is wasted). This means that malloc / new / whatever is not the point where you receive the out-of-memory notification. You receive it when you try to write to the memory, the OS takes a copy-on-write fault, and cannot allocate physical memory. This means that any store instruction may be the thing to trigger memory exhaustion (it often isn’t that bad, but on systems that do deduplication, it is exactly that bad). If you thought getting exception handling right for anything that calls new was hard, imagine how much harder it is if any store to memory needs correct exception handling.

                                                                            Finally, and perhaps most importantly, there’s the question of where to build in reliability in a system. I think that the most important lesson from Erlang is that failure should be handled at the largest scale possible. Running out of memory is one possible cause of a program crashing. If you correctly handle it in every possible case, you probably still have other things that can cause the program to crash. In the best case, with formally verified code from a correct and complete specification, hardware failures can cause crashing. If you really want reliable systems then you should work on the assumption that the program can crash. Again, macOS does this well and provides very fast recovery paths. If a background app crashes on macOS, the window server keeps a copy of the window contents, the kernel restarts the app, which reconnects and reclaims the windows and draws back into them. The user probably doesn’t notice. In a server system, if you have multiple fault-tolerant replicas then you handle memory exhaustion (as long as it’s not triggered by allowing an attacker to allocate unbounded amounts of memory) in the same way that you handle any other failure: kill a replica and restart. The same mechanism protects you against large numbers of bug categories, including a blown fuse in the datacenter.

                                                                            All other things being equal, I would like programs to handle out of memory conditions gracefully but all other things are not equal and I’d much rather that they provided me with strong data integrity, data confidentiality, and could recover quickly from crashes.

                                                                            [1] Which, on every non-Windows platform, requires heap allocation. The Itanium ABI spec requires that the C++ runtime maintain a small pool of buffers that can be used but this has two additional problems. First, on a system that does overcommit, there’s no guarantee that the first use of those buffers won’t cause CoW faults and a SIGSEGV anyway. Second, there’s a finite pool of them and so in a multithreaded program some of the threads may be blocked waiting for others to complete error handling, and this may cause deadlock.

                                                                            1. 2

C++, for example, can throw an exception if operator new fails[1], but what do you do in those catch blocks? Any subsequent memory allocation is likely to fail, and so even communicating with the user in a GUI application may not be possible

This may or may not be the case depending on what you were doing inside the try. In the example of a particularly large allocation for a single operation, it’d be pretty straightforward to inform the user and abort the operation. For the case of the GUI needing (but not being able) to allocate, I’d suggest that good design would have all allocation needed for user interaction done early (during application startup) so this doesn’t present as a problem, even if it’s only for critical interactions.

                                                                              All other things being equal, I would like programs to handle out of memory conditions gracefully but all other things are not equal and I’d much rather that they provided me with strong data integrity, data confidentiality, and could recover quickly from crashes.

                                                                              Agreed, but it bothers me that the OS itself (and certain libraries, and certain languages) put blocks in the way to ever handling the conditions gracefully.

                                                                              Thanks for your comments.

                                                                              1. 1

                                                                                This may or not be the case depending on what you were doing inside the try. In the example of a particularly large allocation for a single operation, it’d be pretty straightforward to inform the user and abort the operation.

                                                                                That’s definitely true but most code outside of embedded systems has a lot of small allocations. If one of these fails then you need to backtrack a lot. This is really hard to do.

                                                                                Agreed, but it bothers me that the OS itself (and certain libraries, and certain languages) put blocks in the way to ever handling the conditions gracefully.

                                                                                Apparently there’s been a lot of discussion about this topic in WG21. In the embedded space (including kernels), gracefully handling allocation failure is critical, but these environments typically disable exceptions and so can’t use the C++ standard interfaces anyway. Outside of the embedded space, there are no non-trivial C++ applications that handle allocation failure correctly in all cases, in spite of the fact that the standard was explicitly designed to make it possible.

                                                                                Note that Windows was designed from the NT kernel on up to enable precisely this. NT has a policy of not making promises it can’t keep. When you ask the kernel for committed memory, it increments a count for your process representing ‘commit charge’. The total commit charge of all processes (and bits of the kernel) must add up to less than the available memory + swap. Requests to commit memory will fail if this limit is exceeded. Even stack allocations will probe and will throw exceptions on stack overrun. SEH doesn’t require any heap allocations and so can report out-of-memory conditions (it does require stack allocations, so I’m not quite sure what it does for those - I think there’s always one spare page for each stack) and all of the higher-level Windows APIs support graceful handling of allocation errors.

                                                                                With all of that in mind, have you seen evidence that Windows applications are more reliable or less likely to lose user data than their macOS counterparts?

                                                                                1. 1

                                                                                  Outside of the embedded space, there are no non-trivial C++ applications that handle allocation failure correctly in all cases

                                                                                  I’ve written at least one that is supposed to do so, though it depends on your definition of “trivial” I guess. But anyway, “applications don’t do it” was one of the laments.

                                                                                  With all of that in mind, have you seen evidence that Windows applications are more reliable or less likely to lose user data than their macOS counterparts?

                                                                                  That’s a bit of a straw-man, though, isn’t it? Nobody’s claimed that properly handling allocation failure at the OS level will by itself make applications more reliable.

I understand that people don’t think the problem is worth solving (that was somewhat the point of the article) - I think it’s subjective though. Arguments that availability is less important than integrity for example aren’t news, and aren’t enough to change my mind (I’ll point out that the importance of availability doesn’t diminish to zero just because there are higher-priority concerns). Other things that are being brought up are just echoing things already expressed by the article itself - the growing complexity of applications, the difficulty of handling allocation failure correctly; I agree the problem is hard, but I lament that OS behaviour, library design choices and language design choices only serve to make it harder, and for instance that programming languages aren’t trying to tackle the problem better.

                                                                                  But, if you disagree, I’m not trying to convince you.

                                                                                  1. 1

                                                                                    I’ve written at least one that is supposed to do so

I took a very quick (less than one minute) skim of the code and I found this line, where you use a throwing variant of operator new, in a way that is not exception safe. On at least one of the call chains that reach it, you will hit an exception-handling block that won’t handle that failure and so will propagate it outwards.

                                                                                    It might be that you correctly handle allocation failure but a quick skim of the code suggests that you don’t. The only code that I’ve ever seen that does handle it correctly outside of the embedded space was written in Ada.

                                                                                    With all of that in mind, have you seen evidence that Windows applications are more reliable or less likely to lose user data than their macOS counterparts?

                                                                                    That’s a bit of a straw-man, though, isn’t it? Nobody’s claimed that properly handling allocation failure at the OS level will by itself make applications more reliable.

                                                                                    No, I’m claiming the exact opposite: that making people think about and handle allocation failure increases cognitive load and makes them more likely to introduce other bugs.

                                                                                    1. 1

                                                                                      I took a very quick (less than one minute) skim of the code and I found this line,

                                                                                      That’s in a utility that was just added to the code base, is still a work in progress, and the “new” happens on the setup path where termination on failure is appropriate (though, yes, it would be better to output an appropriate response rather than let it propagate right through and terminate via “unhandled exception”). The daemon itself - the main program in the repository - is, as I said, supposed to be resilient to allocation failure; If you want to skim anything to check what I’ve said, you should skim that.

                                                                                      No, I’m claiming the exact opposite: that making people think about and handle allocation failure increases cognitive load and makes them more likely to introduce other bugs.

                                                                                      Well, if you are making a claim, you should provide the evidence yourself, rather than asking whether I’ve seen any. I don’t think, though, that you can draw such a conclusion, even if there is evidence that Windows programs are generally more buggy than macOS equivalents (and that might be the case). There may be explanations other than “the windows developers are trying to handle allocation failure and introducing bugs as a result”. In any case, I still feel that this is missing the point.

                                                                                      (Sorry, that’s more inflammatory than I intended: what I meant was, you’re missing the thrust of the article. I’m really not interested in an argument about whether handling allocation failures is harder than not doing so; that is undeniably true. Does it lead to more bugs? With all other things being equal, it quite possibly does, but “how much so” is unanswered, and I still think there is a potential benefit; I also believe that the cost could be reduced if language design tried to address the problem).

                                                                                      1. 2

                                                                                        No, I’m claiming the exact opposite: that making people think about and handle allocation failure increases cognitive load and makes them more likely to introduce other bugs.

                                                                                        Well, if you are making a claim, you should provide the evidence yourself, rather than asking whether I’ve seen any.

                                                                                        The evidence that I see is that every platform that has designed APIs to require handling of OOM conditions (Symbian, Windows, classic MacOS, Win16) has had a worse user experience than ones that have tried to handle this at a system level (macOS, iOS, Android) and systems such as Erlang that don’t try to handle it locally are the ones that have the best uptime for large-scale systems.

                                                                                        You are making a claim that handling memory failures gracefully will improve something. Given that the experience of the last 30 years is that not doing so improves usability, system resilience, and data integrity, you need to provide some very strong evidence to back up that claim.

                                                                                        1. 1

                                                                                          You are making a claim that handling memory failures gracefully will improve something

                                                                                          Of course it will improve something - it will improve the behaviour of applications that encounter memory allocation failures. I feel like that’s a worthwhile goal. That’s the extent of my “claim”. It doesn’t need proving because it’s not really a claim. It’s a subjective opinion.

                                                                                          If all you want to do is say “you’re wrong”, you’ve done that. In all politeness, I don’t care what you think. You made some good points (as well as some that I flat-out disagree with, and some that are anecdotal or at least subjective) but that’s not changing my opinion. If you don’t want to discuss the ideas I was actually trying to raise, let’s leave it.

                                                                              2. 2

Yes, it is better that programs crash rather than continue to run in a degraded state, but a program crashing is still a bad thing. This reads like an argument that quality is low because of all the quality that is being delivered, or that memory leaks aren’t worth fixing.

                                                                                1. 2

That’s an argument that you can’t have programs that don’t crash through correctness alone. I.e., you can’t just be really really careful and write code that won’t crash. It’s basically impossible. What you can do is handle what is doable and architect for redundancy, fast recovery, and minimization of damage.

                                                                                  1. 2

                                                                                    Data corruption, wrong results, and other Undefined Behavior are usually worse than crashing.

                                                                                    And I’m sorry to go into Grandpa Mode, but it’s easy to complain about quality when you haven’t had to try to handle and test every conceivable allocation failure (see my very long comment here for details.)

                                                                                1. 2

                                                                                  I’d blame the operating system. Genode or unikernel frameworks like Solo5 have an explicit memory limit and programs are designed with this taken into account. With memory constraints you have a dial available on every program to adjust things like caching behavior (unless it’s a port or a quick hack).

                                                                                  1. 5

                                                                                    I heard there was a project to port Qubes to seL4 as hypervisor. This was years ago, and I haven’t heard about it again.

I am hopeful it will pick up, at which point the Qubes architecture will possibly be sound. Right now, unfortunately, they use Xen, whose hypervisor runs with full privileges and is far too large to be trusted.

                                                                                    edit:

                                                                                    Found the effort, makatea.

                                                                                    The requirements document suggests, near the end, that this effort is funded. I am very hopeful for this project.

                                                                                    I quote:

                                                                                    This effort is co-funded by a grant from NLnet Foundation. Neutrality’s time is co-funded by the Innosuisse – Swiss Innovation Agency and the European Union, under the Eurostars2 programme as Project E!115764.

                                                                                    1. 5

                                                                                      There was work to port Qubes to Genode and the NOVA hypervisor, but not seL4. Last I heard the seL4 virtualization is pretty buggy (yes, seL4 does crash).

                                                                                      1. 4

                                                                                        and is far too large to be trusted

                                                                                        Trusted to protect against who/what?

                                                                                        If I was looking for an OS that would keep my computer relatively safe against 0days in my web browser, Qubes would be an attractive option: my threat model would essentially be “let me keep my bank activity over there, my work activity over here, and any risky browsing contained far away”. Could someone build an exploit chain from Firefox through the OS into Xen? Sure, but I’m not betting on it: that’s a ton of effort that would be best spent exploiting cloud computing users instead of little ’ol me.

                                                                                        If I was looking for an OS that would keep me safe against an advanced threat actor with state-level funding? I probably would decide to do less interesting things with my life and stop using a computer to do those things.

                                                                                        Sweeping claims about what can/cannot be trusted are disappointing when they aren’t tied to a threat model. Qubes, however, disappoints me as well because they don’t publish any explicit threat model that I could easily find via searching.

                                                                                      1. 3

                                                                                        Great article… In other words, Nix is a leaky abstraction and it bottoms out at shell scripts, e.g. a wrapper to munge the linker command line to set RPATH to /nix/store.

                                                                                        This doesn’t appear in your code / package definitions and is hard to debug when it goes wrong.

                                                                                        Nix was designed and developed before Linux containers, so it had to use the /nix/store mechanism to achieve its goals (paths should identify immutable content, not mutable “places”).

                                                                                        So I wish we had a Nix like system based on containers … (and some kind of OverlayFS and /nix/store hybrid) Related: https://lobste.rs/s/psfsfo/curse_nixos#c_ezjimo

                                                                                        https://lobste.rs/s/ui7wc4/nix_idea_whose_time_has_come#c_5j3zmc

                                                                                        1. 9

                                                                                          So I wish we had a Nix like system based on containers … (and some kind of OverlayFS and /nix/store hybrid)

I don’t, I like running nix on macos. How would your “solution” of running linux binaries and filesystems on macos work exactly? This whole blog post amounted to using the wrong lld; I don’t see how “let’s use containers” is a good fix for the problem at hand.

                                                                                          1. 4

                                                                                            this is the proper sentiment

                                                                                            1. 1

                                                                                              See this reply … I have a specific problem of running lightweight containers locally and remotely, and transferring / storing them, and am not trying to do anything with OS X:

                                                                                              https://lobste.rs/s/vadunt/rpath_why_lld_doesn_t_work_on_nixos#c_pb8cpo

                                                                                              The background is that Oil contributors already tried to put Oil’s dev env in shell.nix, and run it on CI.

                                                                                              However my own shell scripts that invoke Docker/podman end up working better across Github Actions, sourcehut, local dev, etc. Docker definitely has design bugs, but it does solve the problem … I like that it is being “refactored away” and orthogonalized

                                                                                              1. 4

                                                                                                However my own shell scripts that invoke Docker/podman end up working better across Github Actions, sourcehut, local dev, etc. Docker definitely has design bugs, but it does solve the problem … I like that it is being “refactored away” and orthogonalized

                                                                                                It’s not though: you’re not solving the problem for “not linux” if you put everything in docker/containers. nix the package manager is just files and symlinks at the end of the day. It runs on more than just linux, so any linux-only “solution” is just that, not a solution that covers the same problem space as nix, which includes freebsd, macos, and linux.

                                                                                                The background is that Oil contributors already tried to put Oil’s dev env in shell.nix, and run it on CI.

                                                                                                And you can create containers from nix package derivations, and your linked comment doesn’t really explain your problem. I’ve used nix to create “lightweight” containers; the way nix stores files makes it rather nice as it avoids all that layering rubbish. But without a clear understanding of what exactly you’re talking about, it really seems to be unrelated to this post entirely. How do you test Oil on macos/freebsd via CI? Even with those operating systems and nix, it’s its own world, so you’d still have to test in and out of docker. Or am I misunderstanding? I’m still unclear what problem you are trying to solve and how it relates to rpath in a linker on nix and how containers solve it.

                                                                                                1. 1

                                                                                                  Yeah it’s true, if you have an OS X or BSD requirement then what I’m thinking of isn’t going to help, or at least it has all the same problems that Docker does on OS X (I think it runs in a VM).

                                                                                                  The history of the problem is long, it’s all on https://oilshell.zulipchat.com/ and there is a shell.nix here

                                                                                                  https://github.com/oilshell/oil/blob/master/shell.nix

                                                                                                  which is not widely used. Instead the CI now uses 5 parallel jobs with 5 Dockerfiles, which I want to refactor into something more fine-grained.

                                                                                                  https://github.com/oilshell/oil/tree/master/soil

                                                                                                  I would have liked to have used Nix but it didn’t solve the problem well … It apparently doesn’t solve the “it works on my machine” problem. Apparently Nix Flakes solves that better? That is, whether it’s isolated/hermetic apparently depends on the build configuration. In contrast, Bazel (with all its limitations) does pretty much solve that problem, independent of what your build configuration is.

                                                                                                  I think the setup also depended on some kind of experimental Nix support for Travis CI (and cachix?), and then Travis CI went down, etc.

                                                                                                  I would be happy if some contributor told me this is all wrong and just fixed everything. But I don’t think that will happen because there are real problems it doesn’t address. Maybe Nix flakes will do it but in the meantime I think containers solved the problem more effectively.

                                                                                                  It’s better to get into the details on Zulip, but shells are extremely portable so about 5% of the build needs to run on OS X, and that can be done with tarballs and whatnot, because it has few dependencies. The other 95% is a huge build and test matrix that involves shells that don’t exist on OS X like busybox ash, and many many tools for quality and metaprogramming.

                                                                                                  Shells are also very low level, so we ran into the issue where Nix can’t sandbox libc. The libc on OS X pokes through, and that’s kind of fundamental as far as I can tell.


                                                                                                  This problem is sufficiently like most of the other problems I’ve had in the past that I’m willing to spend some time on it … e.g. I’m interested in distributed systems and 99% of those run Linux kernels everywhere. If you’re writing OS X software or simply like using OS X a lot then it won’t be interesting. (I sometimes use OS X, but use a Linux VM for development.)

                                                                                            2. 2

                                                                                              paths should identify immutable content, not mutable “places”

                                                                                              Dynamic linking to immutable content isn’t actually dynamic linking; it’s more like “deferred linking”. Optimize the deferred part away and you are back at static linking. The next optimization is to dedup libraries by statically linking multiple programs into multicall binaries.
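The multicall idea is the busybox trick: one binary, with behavior selected by the name it was invoked under (symlinks provide the names). A minimal shell sketch of the dispatch; the applet names here are made up:

```shell
# Dispatch on the invocation name, busybox-style. A real multicall
# binary would switch on argv[0] in C; here a function takes the name.
dispatch() {
  case "$(basename "$1")" in
    hello) echo "hello from the hello applet" ;;
    bye)   echo "goodbye from the bye applet" ;;
    *)     echo "unknown applet: $1" >&2; return 1 ;;
  esac
}

dispatch /usr/local/bin/hello   # prints: hello from the hello applet
```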

                                                                                              1. 2

                                                                                                Do containers allow any kind of fs merging at the moment? I.e. similar results to overlayfs or merged symlinks dir from nix itself? I thought namespaces only allow basic mapping, so I’m curious where they could help.

                                                                                                1. 2

                                                                                                  Yeah I think you might want a hybrid of OverlayFS and bind mounts. There was an experiment mentioned here, and it was pointed out that it’s probably not a good idea to have as many layers as packages. Because you could have more than 128 packages that a binary depends on, and the kernel doesn’t like that many layers:

                                                                                                  https://lobste.rs/s/psfsfo/curse_nixos#c_muaunf
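For reference, the stock interface under discussion: OverlayFS takes an ordered, colon-separated list of read-only lower layers plus one writable upper directory. One layer per package means a lowerdir list as long as the dependency closure, which is where the kernel's stacking limit bites. A sketch (the paths are made up, and since mounting needs root the command is only printed):

```shell
# One lower layer per package; the leftmost entry is the topmost layer.
layers="/layers/app:/layers/python:/layers/zlib:/layers/glibc"

# Print the mount invocation instead of running it (running needs root).
echo mount -t overlay overlay \
  -o "lowerdir=$layers,upperdir=/upper/data,workdir=/upper/work" \
  /merged
```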


                                                                                                  Here is what I’m thinking with bind mounts. I haven’t actually done this, but I’ve done similar things, and maybe some system already works like this – I wouldn’t be surprised. Let me know if it has already been done :)

                                                                                                  Say you want to install Python 3.9 and Python 3.10 together, which Nix lets you do. (As an aside, the experiment distri also lets you do that, so it also has these RPATH issues to solve, mentioned here: https://michael.stapelberg.ch/posts/2020-05-09-distri-hermetic-packages/ )

                                                                                                  So the point is to avoid all the RPATH stuff, and have a more “stock” build. IME the RPATH hacks are not necessarily terrible for C, but it gets worse when you have dynamic modules in Python and R, which are shared libraries that depend on other shared libraries. The build systems for most languages are annoying in this respect.


                                                                                                  So you build inside a container, bind mounting both the tarball and the output /mydistro, which is analogous to /nix/store. And then do something like:

                                                                                                  ./configure --prefix=/mydistro/python && make && make install
                                                                                                  

                                                                                                  But on the host side, /mydistro/python is actually /repo/python-3.9 or /repo/python-3.10.

                                                                                                  So then at runtime, you do the same thing – bind mount /repo/python-3.9 as /mydistro/python

                                                                                                  So then on the host you can have python 3.9 and 3.10 simultaneously. This is where the package manager downloads data to.

                                                                                                  But apps themselves run inside a container, with their custom namespace. So with this scheme I think you should be able to mix and match Python versions dynamically because the built artifacts don’t have version numbers or /nix/store/HASH in their paths.

                                                                                                  I would try bubblewrap first – I just tried it and it seemed nice.

                                                                                                  So the runtime would end up as something like

                                                                                                  bwrap --ro-bind /repo/python-3.9 /mydistro/python \
                                                                                                    --ro-bind p39.py /bin/p39.py \
                                                                                                    /bin/p39.py arg1 arg2
                                                                                                  

                                                                                                  and

                                                                                                  bwrap --ro-bind /repo/python-3.10 /mydistro/python \
                                                                                                    --ro-bind p310.py /bin/p310.py \
                                                                                                    /bin/p310.py arg1 arg2
                                                                                                  

                                                                                                  There are obviously a bunch of other issues. The distri experiment has some notes on related issues, but I think this removes the RPATH hacks while still letting you have multiple versions installed.

                                                                                                  If anyone tries it let me know :) It should be doable with a 20 line shell script and bubblewrap.
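A sketch of what that wrapper could look like. Everything here is hypothetical (the /repo and /mydistro layout is the one proposed above), and the function only assembles and prints the bwrap command so its shape is visible without bubblewrap installed:

```shell
# Build the bwrap invocation for "run this script against this Python".
# A real wrapper would exec the command instead of echoing it.
make_cmd() {
  version="$1"; script="$2"; shift 2
  echo bwrap \
    --ro-bind "/repo/python-$version" /mydistro/python \
    --ro-bind "$script" "/bin/$script" \
    "/bin/$script" "$@"
}

make_cmd 3.9  p39.py  arg1 arg2
make_cmd 3.10 p310.py arg1 arg2
```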


                                                                                                  I actually have this problem because I want to use Python 3.10 pattern matching syntax to write a type checker! That was released in October and my distro doesn’t have it.

                                                                                                  Right now I just build it outside the container, which is fine. But I think having apps explicitly limited to a file system with their dependencies mounted has a lot of benefits. It is a middleground between the big blob of Docker and more precise dependencies of Nix.

                                                                                                  1. 6

                                                                                                    Yeah I think you might want a hybrid of OverlayFS and bind mounts. There was an experiment mentioned here, and it was pointed out that it’s probably not a good idea to have as many layers as packages. Because you could have more than 128 packages that a binary depends on, and the kernel doesn’t like that many layers:

                                                                                                    The problem with overlay / union filesystems is that the problem that they’re trying to solve is intrinsically very hard. If a file doesn’t exist in the top layer then you need to traverse all lower layers to try to find it. If you can guarantee that the lower layers are immutable then you can cache this traversal and build a combined view of a directory once but if they might be mutated then you need to provide some cache invalidation scheme. You have to do the traversal in order because a file can be created in one layer, deleted in a layer above (which requires you to support some notion of whiteout: the intermediate FS layer needs to track the fact that the file was deleted), and then re-added at a layer above. You also get exciting behaviours if only part of a file is modified: if I have a 1GiB file and I modify the header, my overlay needs to either copy the whole thing to the top layer, or it needs to manage the diff. In the latter case, this gets very exciting if something in the lower layer modifies the file. There are a lot of corner cases like this that mean you have to either implement things in a very inefficient way that scales poorly, or you have surprising semantics.

                                                                                                    This is why containerd uses snapshots as the abstraction, rather than overlays. If you have an overlay / union FS, then you can implement snapshots by creating a new immutable layer and, because the layer is immutable, you won’t hit any of the painful corner cases in the union FS. If you have a CoW filesystem, then snapshots are basically free. With something like ZFS, you can write a bunch of files, snapshot the filesystem, create a mutable clone of the snapshot, write / delete more files, and snapshot the result, and so on. Each of the snapshot layers is guaranteed immutable and any file that is unmodified from the previous snapshot shares storage. This means that the top ‘layers’ just have reference-counted pointers to the data and so accesses are O(1) in terms of the number of layers.

                                                                                                    The one thing that you lose with the snapshot model is the ability to arbitrarily change the composition order. For example, if I have one layer that installs package A, one that installs package B on top, and one that installs package C on top, and I want a layer that installs packages A and C, then I can’t just combine the top and bottom layers, I need to start with the first one and install package C. Something like Nix can probably make the guarantees that would make this safe (that nothing modified in the middle layer is also modified by the application of the top layer), but that’s not possible in general.

                                                                                                    1. 1

                                                                                                      Hm yeah I have seen those weird issues with OverlayFS but not really experienced them … The reason I’m interested in it is that I believe Docker uses it by default on most Linux distros. I think it used to use block-based solutions but I’m not entirely clear why they switched.

                                                                                                      The other reason I like it is because the layers are “first class” and more amenable to shell scripting than block devices.


                                                                                                      And yes the idea behind the “vertical slices” is that they compose and don’t have ordering, like /nix/store.

                                                                                                      The idea behind the “horizontal layer” is that I don’t want to bootstrap the base image and the compiler myself :-/ I just want to do apt-get install build-essential.

                                                                                                      This is mainly for “rationalizing” the 5 containers I have in the Oil build, but I think it could be used to solve many problems I’ve had in the past.

                                                                                                      And also I think it is simple enough to do from shell; I’m not buying into a huge distro, although this could evolve into one.

                                                                                                      Basically I want to make the containers more fine-grained and composable. Each main() program should have its own lightweight container, a /bin/sh exec wrapper, and then you can compose those with shell scripts! (The continuous build is already a bunch of portable shell scripts.)

                                                                                                      Also I want more sharing, which gives you faster transfers over the network and smaller overall size.

                                                                                                      I am pretty sure this can solve my immediate problem – whether it generalizes I’m not sure, but I don’t see why not. For this project, desktop apps and OS X are out of scope.

                                                                                                      (Also I point out in another comment that I’d like to learn about the overlap between this scheme and what Flatpak already does? i.e. the build tools and runtime, and any notion of repository and network transfer. I’ve already used bubblewrap)

                                                                                                      1. 1

                                                                                                        Hm yeah I have seen those weird issues with OverlayFS but not really experienced them … The reason I’m interested in it is that I believe Docker uses it by default on most Linux distros. I think it used to use block-based solutions but I’m not entirely clear why they switched.

                                                                                                        Docker now is a wrapper around containerd and so uses the snapshot abstraction. OCI containers are defined in terms of layers that define deltas on existing layers (starting with an empty one). containerd provides caching for these layers by delegating to a snapshotting service, which can apply the deltas as an overlay layer (which it then never modifies, so avoiding all of the corner cases) or to a filesystem with CoW snapshots.

                                                                                                        The other reason I like it is because the layers are “first class” and more amenable to shell scripting than block devices.

                                                                                                        I’m not sure what this means. ZFS snapshots, for example, can be mounted in .zfs/{snapshot name} as read-only trees.

                                                                                                        Basically I want to make the containers more fine-grained and composable. Each main() program should have its own lightweight container, a /bin/sh exec wrapper, and then you can compose those with shell scripts! (The continuous build is already a bunch of portable shell scripts.)

                                                                                                        To do this really nicely, I want some of the functionality from capsh, so I can use Capsicum, not jails, and have the shell open file descriptors easily for the processes that it spawns, rather than relying on trying to shim all of this into a private view of the global namespace.

                                                                                                        1. 1

                                                                                                          I think you’re only speaking about BSD. containerd has the notion of “storage drivers”, and “overlay2” is the default storage driver on Linux. I think it changed 3-4 years ago

                                                                                                          https://docs.docker.com/storage/storagedriver/select-storage-driver/

                                                                                                          When I look at /var/lib/docker on my Ubuntu machine, it seems to confirm this – On Linux, Docker uses file level “differential” layers, not block-level snapshots. (And all this /var/lib/docker nonsense is what I’m criticizing on the blog. Docker is “anti-Unix”. Monolithic and code-centric not data-centric.)


                                                                                                          So basically I want to continue what Red Hat and others are doing and continue “refactoring away” Docker, and just use OverlayFS. From my point of view they did a good job of getting that into the kernel, so now it is reasonable to rely on it. (I think there were 2 iterations of OverlayFS – the second version fixes or mitigates the problems you noted – I agree it is hard, but I also think it is solved.)

                                                                                                          I think I wrote about it on the other thread, but I’m getting at a “remote/mobile process abstraction” with explicit data dependencies, mostly for batch processes. You need the data dependencies to be mobile. And I don’t want to introduce more concepts than necessary (according to the Perlis-Thompson principle and narrow waists), so just tarballs of files as layers, rather than block devices, seem ideal.

                                                                                                          The blocks are dependent on a specific file system, e.g. ext3 or ext4. And also I don’t think you can do anything with an upper layer without the lower layers. With the file-level abstraction you can do that.

                                                                                                          So it seems nicer not to introduce the constraint that all nodes have to be running the same file system – they merely all have to have OverlayFS, which is increasingly true.

                                                                                                          None of this is going to be built directly into Oil – it’s a layer on top. So presumably BSDs could use Docker or whatever, or maybe the remote process abstraction can be ported.

                                                                                                          Right now I’m just solving my own problem, which is very concrete, but as mentioned this is very similar to lots of problems I’ve had.

                                                                                                          Of course Kubernetes and dozens of other systems going back years have remote/mobile process abstractions, but none of them “won”, and they are all coupled to a whole lot of other stuff. I want something that is minimal and composable from the shell, and that basically leads into “distributed shell scripting”.

                                                                                                          I think all these systems were not properly FACTORED in the Unix sense. They were not narrow waists and didn’t compose with shell. They have only the most basic integration with shell.


                                                                                                          For example our CI is just 5 parallel jobs with 5 Dockerfiles now:

                                                                                                          https://github.com/oilshell/oil/tree/master/soil

                                                                                                          So logically it looks like this:

                                                                                                          run-in container1 job1 &
                                                                                                          run-in container2 job2 &
                                                                                                          run-in container3 job3 &
                                                                                                          run-in container4 job4 &
                                                                                                          run-in container5 job5 &
                                                                                                          wait
                                                                                                          

                                                                                                          I believe most CIs are like this – dumb, racy, without data dependencies, and with hard-coded schedules. So I would like to turn it into something more fine-grained, parallel, and thus faster (but also more coarse-grained than Nix). Basically by framing it in terms of shell, you get LANGUAGE-oriented composition.

                                                                                                          (And of course, as previous blog posts say, a container-based build system should be the same thing as a CI system; there shouldn’t be anything you can only run remotely.)


                                                                                                          I looked at Capsicum many years ago but haven’t seen capsh… For better or worse Oil is stuck on the lowest common denominator of POSIX, but the remote processes can be built on top, and right now that part feels Linux-only. I wasn’t really aware that people used Docker on BSD and I don’t know anything about it … (I did use NearlyFreeSpeech and their “epochs” based on BSD jails – it’s OK but not as flexible as what I want. It’s more on the admin side than the user side.)

                                                                                                          1. 1

                                                                                                            I think you’re only speaking about BSD. containerd has the notion of “storage drivers”, and “overlay2” is the default storage driver on Linux. I think it changed 3-4 years ago

                                                                                                            No, I’m talking about the abstractions that containerd uses. It can use overlay filesystems to implement a snapshot abstraction. Docker tried to do this the other way around and use snapshots to implement an overlay abstraction but this doesn’t work well and so containerd inverted it. This is in the docs.

                                                                                                            When I look at /var/lib/docker on my Ubuntu machine, it seems to confirm this – On Linux, Docker uses file level “differential” layers, not block-level snapshots

                                                                                                            Snapshots don’t have to be at the block level, they can be at the file level. There are various snapshotters in containerd that implement the same abstraction in different ways. The key point is that each layer is a delta that is applied to one specific immutable thing below.

                                                                                                            I’m not really sure what the rest of your post is talking about. You seem to be conflating abstractions and implementation.

                                                                                                            1. 1

                                                                                                              OK I think you were misunderstanding what I was talking about in the original message. What I’m proposing uses OverlayFS with immutable layers. Any mutable state is outside the container and mounted in at runtime. It’s more like an executable than a container.

                                                                                                    2. 1

                                                                                                      Adding to my own comment, if anyone has experience with Flatpak I’d be interested (since it uses bubblewrap):

                                                                                                      https://dev.to/bearlike/flatpak-vs-snaps-vs-appimage-vs-packages-linux-packaging-formats-compared-3nhl

                                                                                                      Apparently it is mostly for desktop apps? I don’t see why that would be since CLI apps and server apps should be strictly easier.

                                                                                                      I think the main difference again would be the mix of layers and slices, so you have less build configuration. And also naming them as first class on the file system and dynamically mixing and matching. What I don’t like is all the “boiling the ocean” required for packaging, e.g. RPATH but also a lot of other stuff …

                                                                                                      I have Snap on my Ubuntu desktop but I am trying to avoid it… Maybe Flatpak is better, not sure.

                                                                                                      1. 1

                                                                                                        That sounds like there’s a generic “distro python” though, which… is not necessarily true. You could definitely want environments with both python3.10 and 3.9 installed and not conflicting at the same time.

                                                                                                        1. 2

                                                                                                          The model I’m going for is that you’re not really “inside” a container … But each main() program uses a lightweight container for its dependencies. So I can’t really imagine any case where a single main() uses both Python 3.9 and 3.10.

                                                                                                          If you have a Python 3.9 script and a 3.10 script and want to pipe them together, you pipe together two DIFFERENT containers. You don’t pipe them together inside the container. It’s more like the model of iOS or Android: apps are identified by a single hash derived from their dependencies, which are layers/slices.

                                                                                                          BUT importantly they can share layers / “slices” underneath, so it’s not as wasteful as snap/flatpak and such.

                                                                                                          I’ve only looked a little at snap / flatpak, but I think they are more heavyweight; it’s like you’re inside a “machine” rather than just assembling namespaces. I imagine an exec wrapper that makes each script isolated:

                                                                                                           # this script behaves exactly like p310.py and can be piped together with other scripts
                                                                                                           # (bwrap uses --bind/--ro-bind SRC DEST rather than a --mount flag)
                                                                                                           exec bwrap --ro-bind foo foo --ro-bind p310.py /bin/p310.py -- /bin/p310.py "$@"
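A rough sketch of the piping idea, with two separately sandboxed commands connected by an ordinary pipe from the outside. The flags and commands here are illustrative, not the actual design; it falls back to plain execution when bwrap isn’t available (e.g. no user namespaces):

```shell
#!/bin/sh
# Two "containers" piped together from the outside: each side of the
# pipe runs in its own bubblewrap namespace, but stdin/stdout are
# ordinary pipes between them.
set -eu
if command -v bwrap >/dev/null 2>&1; then
    SANDBOX="bwrap --ro-bind / / --dev /dev --proc /proc --"
else
    SANDBOX=""  # unsandboxed fallback, for illustration only
fi
printf 'hello\n' | $SANDBOX tr 'a-z' 'A-Z' | $SANDBOX rev
```

The point is that isolation lives in each exec wrapper, so composition with pipes works exactly as it does for plain executables.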
                                                                                                          
                                                                                                        2. 1

                                                                                                          Your idea kinda sounds like GoboLinux Runner, but I can’t tell if it’s exactly the same, since it’s been a long time since I played with GoboLinux. It’s a very interesting take on Linux, flipping the FHS on its head just like Nix or Guix, but still keeping the actual program store fully user-accessible, and mostly manageable without special commands.

                                                                                                          1. 1

                                                                                                            Ah interesting, I heard about GoboLinux >10 years ago but it looks like they made some interesting developments.

                                                                                                            They say they are using a “custom mount table” and that is essentially what bubblewrap lets you do. You just specify a bunch of --bind / --ro-bind flags and it makes the mount() syscalls before exec-ing the program.

                                                                                                            https://github.com/containers/bubblewrap/blob/main/demos/bubblewrap-shell.sh

                                                                                                            I will look into it, thanks!
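As a sketch of that “custom mount table” idea: the program sees a fixed path (/data here), but what is actually mounted there is chosen per invocation. All paths are made up for illustration, and it falls back to reading the file directly when bwrap is unavailable:

```shell
#!/bin/sh
# Bind a freshly created scratch directory to /data inside the sandbox,
# so the sandboxed program's view of the filesystem is assembled at
# launch time from --ro-bind flags.
set -eu
scratch=$(mktemp -d)
echo 42 > "$scratch/answer"
if command -v bwrap >/dev/null 2>&1; then
    bwrap --ro-bind /usr /usr \
          --symlink usr/bin /bin \
          --symlink usr/lib /lib --symlink usr/lib64 /lib64 \
          --ro-bind "$scratch" /data \
          -- cat /data/answer
else
    cat "$scratch/answer"  # unsandboxed fallback
fi
rm -r "$scratch"
```

Each --ro-bind or --symlink flag becomes one entry in the sandbox’s private mount table before the target program is exec’d.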

                                                                                                    1. 3

                                                                                                      Are articles on society or politics no longer off-topic?

                                                                                                      1. 6

                                                                                                        Sshh, don’t poke the bear. We’ve been tiptoeing around the bear so that we could read and discuss this primarily technical article about a case study in the resilience of the internet.

                                                                                                        1. 2

                                                                                                          Perhaps the trick is to explore the solutions but avoid the problems.

                                                                                                        2. 4

                                                                                                          It’s a technical article about how RIPE observes the network during a “human-made disruption” in the area, why that is the case, and how you can compare this with your own country. (It’s more or less the bus factor of ISPs.) And I’m actually impressed by how well it holds up.

                                                                                                        1. 3

                                                                                                          The consensus among everyone I discuss this with is that xdg-open is too complicated for power users to bother to understand. If this is the default tool for opening URIs then it’s a clear indication to me that the “Linux desktop” is in a death spiral.

                                                                                                          1. 2

                                                                                                            I mean, if xdg-tools being a hot dumpster fire is an indication of a death spiral, it’s a heck of a long one – they’ve been pretty much the de facto standard for more than ten years now.

                                                                                                            At some point, around 2012 or so, I actually did the unthinkable and spent a few days learning how to use them. It’s unthinkable because the documentation is so bad (and incomplete) it’s not even funny. And I gave up on them because the whole thing was so brittle, and so prone to sudden breakage of all sorts, that it was pretty much useless. Not all of it is xdg’s fault – some of the more absurd failures are probably packaging problems (like having all MIME file types associated with Wine’s explorer.exe, or better yet, with Firefox, which unsurprisingly opens anything you throw at it and dutifully asks which application you want it opened with – and the default choice is, yep, Firefox). But the end result is so horrifyingly bad that I suspect it’s single-handedly responsible for the fact that so many people just open files from their terminals or with weird 1980s-like contraptions like nnn.

                                                                                                          1. 22

                                                                                                            There’s already these tags under the “culture” rubric:

                                                                                                            • culture - Technical communities and culture
                                                                                                            • law - Law, patents, and licensing
                                                                                                            • person - Stories about particular persons
                                                                                                            • philosophy - Philosophy
                                                                                                            1. 2

                                                                                                              Culture and Society are distinct in this case.

                                                                                                              1. 2

                                                                                                                With all due respect, so far no-one has posted any sort of example of a submission that would be on-topic if society was a tag. I’m personally against it being one, because of the contentious nature of the conceptually adjacent ones like culture, but I’m open to counter-examples.

                                                                                                                So far, crickets.