Threads for 1amzave

  1. 7

    Not quite through, but I was expecting a quicker read and was pleasantly surprised to find something so thorough! Won’t have time to finish it tonight :p

    Some thoughts; for one, “The Final Word” might be overstating things:

    • Cluster FSes highlight a uniquely different paradigm and approach (perhaps as the EVM does to other languages and runtimes);
    • BTRFS flaunts staggering flexibility over varying disk sizes, configurations, and migrations (re-compress, or convert between RAID paradigms, online, on the fly!);
    • and tbh I don’t know the first thing about bcachefs, but then it’s the new kid on the block, and I trust that the folks behind it have put their own spin on things (as anyone does when creating!).

    I point these out not to downplay ZFS’s own lovable quirks and the revolutionary impact they’ve had (notably on the lineage of these very systems), but to highlight these and future projects. It’s too soon to underscore “The Final Word”. That said, ZFS is still so worthy of our attention and appreciation! The author has clearly built a fantastic model and understanding of how the system works; I learned much more than I was ready for c:


    One particular section caught my eye:

    If you have a 5TiB pool, nothing will stop you from creating 10x 1TiB sparse zvols. Obviously, once all those zvols are half full, the underlying pool will be totally full. If the clients connected to the zvols try to write more data (which they might very well do because they think their storage is only half full) it will cause critical errors on the TrueNAS side and will most likely lead to extensive pool corruption. Even if you don’t over-commit the pool (i.e., 5TiB pool with 5x 1TiB zvols), snapshots can push you above the 100% full mark.

    I thought to myself, “ZFS wouldn’t hit me with a footgun like that, it knows data-loss is code RED”. While “extensive pool corruption” might be overzealous, it is a sticky situation. The clients in this case are the filesystems populating the ZVOLs, which are prepared to run out of space at some point, but not to have the disk fail out from under them. Snapshots do provide protection/recovery-paths from this, but also lower the “over-provisioning threshold”. This isn’t the sort of pool corruption that made me concerned; I couldn’t find any evidence of that. It would obviously still be a disruptive failure of monitoring in production, and might best be avoided precautionarily by under-provisioning, which is a shame, but then even thick provisioning has a messy relationship with snapshots in the following section.

    I’m not sure any known system addresses this short of monitoring/vertical-integration. I guess it’s great when BTRFS fills up and you can just, like, plug in a USB drive to fluidly expand available space, halt further corruption, and carry on gracefully with degraded performance. Not that BTRFS’s own relationship with free space is unblemished, but this does work!

    Probably a viable approach in ZFS too (single-device vdev?), but BTRFS really shines in its UI, responsiveness, and polish during exactly these sorts of migrations, which I’d find relieving in an emergency. ZFS has a lot of notorious pool expansion pitfalls, addressed here, which I also wouldn’t have to think about (even if just to dismiss them as inapplicable under the circumstances bc they relate to vdevs). It matters that I think ZFS can do it, and know that BTRFS can; its flexibility is reassuring. (Again, not a dig, I still go to great lengths to use ZFS everywhere; for all this I don’t run butter right now :p)


    Thinking about it more, this is probably because I’ve recreated BTRFS pools dozens of times, whereas ZFS pools are more static and recreating them is often kinda intense. It’s like BTRFS is declarative, like Nix, allowing me to erase my darlings and become comfortable with a broader range of configurations by being less attached to the specific setup I have at any given time.

    1. 5

      Live replication and sharing are both definitely missing from ZFS, though I can see how they could be added (to the FS, if not to the code). Offline deduplication is the other big omission and that’s hard to add as well.

      For cloud scenarios, I wish ZFS had stronger confidentiality and integrity properties. The encryption, last time I looked, left some fairly big side channels open and leaked a lot of metadata. Given that the core data structure is basically a Merkle tree, it’s unfortunate that ZFS doesn’t provide cryptographic integrity checks on a per-pool and per-dataset basis. For secure boot, I’d love to be able to embed my boot environment’s root hash in the loader and have the kernel just check the head of the Merkle tree.
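
      To make the wish concrete: a toy sketch (Rust with the sha2 crate; the leaf layout and hash choice are illustrative, nothing like ZFS’s actual on-disk format) of the kind of root-hash check I mean:

      ```rust
      // Toy Merkle-root check: hash the leaves, hash pairs upward until one
      // root remains, and compare against a trusted root pinned elsewhere
      // (e.g. embedded in the loader).
      use sha2::{Digest, Sha256};

      fn merkle_root(mut level: Vec<Vec<u8>>) -> Vec<u8> {
          while level.len() > 1 {
              level = level
                  .chunks(2)
                  .map(|pair| {
                      let mut h = Sha256::new();
                      h.update(&pair[0]);
                      // An odd node out is paired with itself (one common convention).
                      h.update(pair.get(1).unwrap_or(&pair[0]));
                      h.finalize().to_vec()
                  })
                  .collect();
          }
          level.pop().expect("need at least one leaf")
      }

      fn main() {
          let blocks: [&[u8]; 3] = [b"block0", b"block1", b"block2"];
          let leaves: Vec<Vec<u8>> = blocks
              .iter()
              .map(|b| Sha256::digest(b).to_vec())
              .collect();

          let trusted_root = merkle_root(leaves.clone());
          // "Secure boot" in miniature: recompute and compare to the pinned root.
          assert_eq!(merkle_root(leaves), trusted_root);
      }
      ```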

      1. 4

        Yeah, the hubris of the “last word in filesystems” self-anointment always struck me as fairly staggering. While perhaps not exceeded so quickly and dramatically, it’s a close cousin of “640K ought to be enough for anybody”. Has any broad category of non-trivial software ever been declared finished, with some flawless shining embodiment never to be improved upon again? Hell, even (comparatively speaking) laughably simple things like sorting aren’t solved problems.

        1. 4

          Yeah, the hubris of the “last word in filesystems” self-anointment always struck me as fairly staggering.

          It’s just because the name starts with Z, so it’s always alphabetically last in a list of filesystems.

          1. 3

            Be right back, implementing Öfs.

            1. 1

              Or maybe just Zzzzzzzzzzzfs!

          2. 1

            Yeah, the hubris of the “last word in filesystems” self-anointment always struck me as fairly staggering. While perhaps not exceeded so quickly and dramatically, it’s a close cousin of “640K ought to be enough for anybody”.

            Besides the name starting with Z which @jaculabilis mentioned, I suspect it was also in reference to ZFS being able to (theoretically) store 2^137 bytes worth of data, which ought to be enough for anybody.

            Because storing 2^137 bytes worth of data would necessarily require more energy than that needed to boil the oceans, according to one of the ZFS creators [1].

            [1] https://hbfs.wordpress.com/2009/02/10/to-boil-the-oceans/

          3. 1

            Hasn’t BTRFS’s RAID support been unstable for ages? Is it better now?

            1. 3

              AFAIK the RAID5/6 write hole still exists, so that’s the notorious no-go; I’ve always preferred RAID10 myself. Does kinda put a damper on that declarative freedom aspect if mirrors are the only viably stable configuration, but the spirit is still there in the workflow and utilities.

            2. 1

              Article author here, I made an account so I could reply. I appreciate the kind words, I put a lot of effort into getting this information together.

              After talking to some of the ZFS devs, I’m going to clarify that section about overprovisioning your zvols. I misunderstood some technical information and it turns out it’s not as bad as I made it out to be (but it’s still something you want to avoid).

              The claim that ZFS is the last word in file systems comes from the original developers. I added a note to that effect in the first paragraph of the article. I have more info about what (I believe) they were getting at in one of the sections towards the end of the article: https://jro.io/truenas/openzfs/#final_word

              I’m obviously a huge fan of ZFS but I’ll admit that it’s not the best choice for every application. It’s not super lean and high-performance like ext4, it doesn’t (easily) scale out like Ceph, and it doesn’t offer the flexible expansion and pool modification of UNRAID. Despite all that, it’s a great all-round filesystem for many purposes and is far more mature than something like BTRFS. (Really, I just needed a catchy title for my article :) )

              1. 1

                Which Cluster FS are you referring to? Google returns a lot of results.

                1. 1

                  The whole field! Ceph, Gluster, & co.

              1. 1

                I wonder how hard it would be to make a rootkit immune to this.

                1. 4

                  As with pretty much any such “trying to detect a compromised kernel after the fact” sort of situation, I’m pretty sure the answer is “not at all”. Looking at the implementation of this one, it looks like all you’d need to do is have your malicious module unlink its own entry from module_kset’s list of kobjects after initializing itself (all your nefarious code and such remains loaded of course).

                  1. 2

                    It’s a cat and mouse game. It’s nearly always possible to evade detection, and it’s nearly always possible to detect. Of course, there are exceptions, and sometimes a method is not feasible because of various factors (e.g. too expensive).

                    This is a good short introductory article: https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf

                    (Ken Thompson’s “Reflections on Trusting Trust”, 1984)

                  1. 6

                    Unfortunately this doesn’t appear to address what I thought would have been one of the main lessons of the recent LastPass breach (though it’s also something I’ve been whinging about for a while regarding similar designs): password metadata is also sensitive, and storing it in plaintext is a major mistake, IMO.

                    1. 4

                      This is a good post. The C++ question of what remains in the “moved from” object is a big, nasty can of worms. Sometimes it’s well-defined: e.g. a std::unique_ptr is nulled when it is moved from (at the cost of performance). Other classes are less clear on the specifics, so you have to consider it on a case-by-case basis. Ideally, of course, you never use “moved from” objects, and that is how Rust does it, enforced by the compiler. This is faster and safer. In C++, the entire thing makes my head spin, so I usually just never use “moved from” objects again, unless I know the guaranteed behaviour off the top of my head.
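
                      For contrast, a minimal sketch of the Rust side, where the compiler simply rejects any use of the moved-from binding:

                      ```rust
                      fn main() {
                          let s = String::from("hello");
                          let t = s; // ownership of the String moves to `t`

                          // println!("{s}"); // compile error: borrow of moved value: `s`
                          println!("{t}"); // only the new owner is usable
                      }
                      ```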

                      1. 2

                        It’s a difficult problem in general once you start to have structure fields. The example of the string in Rust is fine for a local variable, but presumably this means that you can’t move a string out of a field in an object in Rust, because doing so would require the field to become invalid. C++ works around this by effectively making it a sum type of string and invalid string (which, as the author points out, is deeply unsatisfying and error prone). Pony does this a lot better by effectively making = a swap operator, so you have syntax for cheaply replacing the string with an empty string or other canonical marker.

                        The other problem with C++ that the author hints at is that move isn’t really part of the type system. R-value references are part of the type system, move is a standard library idiom. This means that you can’t special case this in useful places. For example, it would be nice to allow destructive moves out of fields when implementing the destructive move of an object. A language with move in the type system could implement a consume and destructure operation that took a unique reference to an object and returned its fields in a single operation. This would let you safely move things out of all fields, as long as you could statically prove that the object was not aliased.
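
                        For what it’s worth, Rust’s by-value destructuring is close to that consume-and-destructure operation (it consumes the object itself rather than going through a unique reference, but all fields move out in one step); a quick sketch with made-up names:

                        ```rust
                        struct Pair {
                            left: String,
                            right: String,
                        }

                        fn main() {
                            let p = Pair {
                                left: String::from("l"),
                                right: String::from("r"),
                            };
                            // Consume and destructure: `p` is gone; both fields moved out at once.
                            let Pair { left, right } = p;
                            println!("{left} {right}");
                        }
                        ```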

                        1. 3

                          The example of the string in Rust is fine for a local variable, but presumably this means that you can’t move a string out of a field in an object in Rust, because doing so would require the field to become invalid. C++ works around this by effectively making it a sum type of string and invalid string (which, as the author points out, is deeply unsatisfying and error prone).

                          I’m not a rust expert by any means, but I think in rust you could achieve this pretty easily (and safely and explicitly) by having the struct field be an Option<T> and using Option::take().
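
                          Something like this sketch (struct and field names made up):

                          ```rust
                          struct Holder {
                              name: Option<String>,
                          }

                          fn main() {
                              let mut h = Holder {
                                  name: Some(String::from("field contents")),
                              };
                              // take() moves the String out and leaves None behind, so the
                              // struct stays valid and the move is visible in the types.
                              let moved_out = h.name.take();
                              assert_eq!(moved_out.as_deref(), Some("field contents"));
                              assert!(h.name.is_none());
                          }
                          ```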

                          1. 2

                            The problem then is that it’s now a union type and you need an explicit check for validity on every single field access. This is safer than C++ where you need the check but the type system doesn’t enforce it, it’s just undefined behaviour if it’s omitted, but it’s incredibly clunky for the programmer.

                            1. 2

                              Well sure, but it sounds like the alternative is “deeply unsatisfying and error prone”? If you want the Pony approach (by my understanding of your description; I’m not familiar with the language) of using some special sentinel value instead of an Option<T>, you could also use std::mem::replace(), or more concisely std::mem::take() if the type has a meaningful default value like "" for String.
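
                              A quick sketch of both (made-up names again; mem::take is just mem::replace with Default::default() as the sentinel):

                              ```rust
                              use std::mem;

                              struct Holder {
                                  name: String,
                              }

                              fn main() {
                                  let mut h = Holder { name: String::from("field contents") };

                                  // mem::take moves the String out and leaves "" behind.
                                  let moved_out = mem::take(&mut h.name);
                                  assert_eq!(moved_out, "field contents");
                                  assert_eq!(h.name, "");

                                  // mem::replace is the general form: you pick the sentinel.
                                  let old = mem::replace(&mut h.name, String::from("placeholder"));
                                  assert_eq!(old, "");
                                  assert_eq!(h.name, "placeholder");
                              }
                              ```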

                        2. 1

                          Yep a good post, bad title :-).

                          It makes a good observation: types must have specific, clearly expressed ‘capabilities’ (or traits) that make them usable or non-usable for a specific algorithmic or technical operation.

                          In a way, the C++ template system is a mechanism to express whether a given type ‘fits’ a specific ‘operation’ or algorithm. And if it did not, there was an (often difficult to read) compilation error. But still, there was a compilation error.

                          C++ allowed expressing the fitment of user types to algorithms, but did not allow expressing their ‘fitment’ to technical operations (like move, parallel access, allocation model, exception model, etc.).

                          And there is an impedance mismatch between built-in types and user-defined types in the area of how these capabilities are expressed.

                          Now, as the number and complexity of these technical operations grows, the inconsistency in how capabilities are expressed through the type system is causing a higher and higher cognitive load on developers (and, I am sure, compiler designers).

                          The problem with the language now is that it cannot evolve in a way where the complexity is constrained but backward compatibility is maintained. It seems that we have to start choosing one over the other, as our cognitive faculties just cannot keep up with the complexity caused by the desire to maintain compatibility with previously written code.

                        1. 2

                          This is for people who need to administer a handful of machines, all fairly different from each other and all Very Important. Those systems are not Cattle! They’re actually a bit more than Pets. They’re almost Family. For example: a laptop, workstation, and that personal tiny server in Sweden. They are all named after something dear.

                          They’re almost Family

                          Sounds like a good motivation to murder them personally and replace with Cattle provisioning.

                          1. 23

                            Cattle techniques aren’t really worth it until you have somewhere between 30 and 100 machines.

                            1. 12

                              Or you have 1 machine that when it dies will take you a week to recreate

                              1. 2

                                Such a machine should have backups in place, no?

                                1. 6

                                  Backups are great, until you want to upgrade to a new version of $software or $os. Then the backup needs to be applied, but are you sure each part needs to be there? Or that you didn’t miss something?

                                  Additive configuration, like we use for cattle, will work when you change something underlying, like the OS.

                                  1. 1

                                    FreeBSD and NixOS both let you keep the last old version around and reboot into it whenever you want. Others may or may not.

                              2. 4

                                disagree, cattle techniques don’t mean you can’t have extensible config management

                                although Chef isn’t easy to learn for a lot of folks, i’m glad i already know it, it’s easier to see when you have exactly as much extensibility as you need in your config management and not more than that… just like writing good software

                                1. 4

                                  Slight disagree on the idea and completely disagree on the threshold. In my experience, cattle management is extremely worth it on anything above 1. Otherwise at some point a change will be made to one of the hosts but not the other, or multiple things will change out of order. It’s basically inevitable with enough employees and time.

                                  For the idea itself, I’m finding it worth it to manage everything that way. After a disk failure I could rebuild my home server in minutes from the NixOS description, rather than trying to recover backups and figure out exactly how things were configured.

                                  1. 1

                                    I’ve embraced this strategy as well now for at least the last decade, just swapping out nix for a config management system.

                                    I keep backups of data (ex. Fileserver), but not system/program state. I could never go back, it feels wasteful of time and disk space now.

                                  2. 1

                                    Maaaaaybe. Ansible does pretty well for me with about 4 pets of various kinds. Some effort goes into making sure they are all quite similar though: all run Debian, they’re all within one version of each other, all have the same copy of my home dir on it even if they only need a few bits of it, etc. Each just has their own little config file that layers atop that general setup.

                                  3. 18

                                    Not everything has to be efficient on an industrial scale.

                                    1. 5

                                      Hard agree. And that’s what I like about this post. But I think having systems that are very easily replaceable pays off even at small scale. Like someone offering me 3 free months of hosting for my lone cloud server if I move to their platform.

                                    2. 10

                                      Mental note: Never letting you take care of my cat. :-P

                                      1. 2

                                        Good thing we’re not relatives. /s :)

                                        I’d also rather use the larger ops tools: if only because you’ve got more chances to encounter them elsewhere, and that’s knowledge you’ll be able to reuse. Pets would not work for me, but I’m sure it’ll be useful to someone else. I’ll stick to ansible playbooks for now.

                                        1. 1

                                          Yeah, even if it’s one app, I’d rather make a terraform / ansible deployment strategy because I’ll be able to recreate it when requirements inevitably start requiring redundancy or what have you.

                                      1. 23

                                        Finally, Python for workgroups.

                                        1. 3

                                          Oh, it took me too long to get that joke  😀

                                          1. 11

                                            It was even the “official” name of the 3.11 release of the Linux kernel: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Makefile?id=v3.11#n5

                                            1. 2

                                              You have to be of a certain age.

                                          1. 3

                                            It’s quite similar to the Perl rename script, packaged in Debian as prename.

                                            1. 3

                                              …likely so as not to conflict with the very similar rename command in the util-linux package (which I use pretty often, personally): https://github.com/util-linux/util-linux/blob/master/misc-utils/rename.c

                                              1. 2

                                                Your mileage may vary, but if you like the original, you will probably like another Perl script with the same name by Aristotle Pagaltzis even more.

                                              1. 9

                                                The thread on LKML about this work really doesn’t portray the Linux community in a good light. With a dozen or so new kernels being written in Rust, I wouldn’t be surprised if this team gives up dealing with Linus and goes to work on adding good Linux ABI compatibility to something else.

                                                1. 26

                                                  I dunno, Linus’ arguments make a lot of sense to me. It sounds like he’s trying to hammer some realism into the idealists. The easter bunny and santa claus comment was a bit much, but otherwise he sounds quite reasonable.

                                                  1. 19

                                                    The disagreement is over whether “panic and stop” is appropriate for the kernel, and here I think Linus is just wrong. Debugging can be done by panic handlers; there is just no need to continue.

                                                    Pierre Krieger said it much better, so I will quote:

                                                    Part of the reasons why I wrote a kernel is to confirm by experience (as I couldn’t be sure before) that “panic and stop” is a completely valid way to handle Rust panics even in the kernel, and “warn and continue” is extremely harmful. I’m just so so tired of the defensive programming ideology: “we can’t prevent logic errors therefore the program must be able to continue even a logic error happens”. That’s why my Linux logs are full of stupid warnings that everyone ignores and that everything is buggy.

                                                    One argument I can accept is that this should be a separate discussion, and Rust patch should follow Linux rule as it stands, however stupid it may be.
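
                                                    For illustration, a userspace toy of the “panic and stop, but let a handler do the debugging work” idea, with std::panic::set_hook standing in for a kernel panic handler:

                                                    ```rust
                                                    use std::panic;

                                                    fn main() {
                                                        // The hook runs on panic before the process dies, so it can
                                                        // dump whatever state you need for post-mortem debugging.
                                                        panic::set_hook(Box::new(|info| {
                                                            eprintln!("crash dump: {info}");
                                                            // A kernel-style handler would write telemetry and halt here.
                                                        }));

                                                        // A logic error: "warn and continue" would make this a log line
                                                        // everyone ignores; "panic and stop" halts right at the bug.
                                                        let config: Option<u32> = None;
                                                        let _value = config.expect("invariant violated: config must be set");
                                                    }
                                                    ```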

                                                    1. 7

                                                      I think the disagreement is more about “should we have APIs that hide the kernel context from the programmer” (e.g. “am I in a critical region”).

                                                      This message made some sense to me: https://lkml.org/lkml/2022/9/19/840

                                                      Linus’ writing style has always been kind of hyperbolic/polemic and I don’t anticipate that changing :( But then again I’m amazed that Rust-in-Linux happened at all, so maybe I should allow for the possibility that Linus will surprise me.

                                                      1. 1

                                                      This is exactly what I still don’t understand in this discussion. Is there something about stack unwinding and catching the panic that is fundamentally problematic in, e.g., a driver?

                                                        It actually seems like it would be so much better. It recovers some of the resiliency of a microkernel without giving up the performance benefits of a monolithic kernel.

                                                        What if, on an irrecoverable error, the graphics driver just panicked, caught the panic at some near-top-level entry point, reset to some known good state and continued? Seems like such an improvement.

                                                        1. 5

                                                          I don’t believe the Linux kernel has a stack unwinder. I had an intern add one to the FreeBSD kernel a few years ago, but never upstreamed it (*NIX kernel programmers generally don’t want it). Kernel stack traces are generated by following frame-pointer chains and are best-effort debugging things, not required for correctness. The Windows kernel has full SEH support and uses it for all sorts of things (for example, if you try to access userspace memory and it faults, you get an exception, whereas in Linux or FreeBSD you use a copy-in or copy-out function to do the access and check the result).

                                                          The risk with stack unwinding in a context like this is that the stack unwinder trusts the contents of the stack. If you’re hitting a bug because of stack corruption then the stack unwinder can propagate that corruption elsewhere.

                                                          1. 1

                                                            With the objtool/ORC stuff that went into Linux as part of the live-patching work a while back it does actually have a (reliable) stack unwinder: https://lwn.net/Articles/728339/

                                                            1. 2

                                                              That’s fascinating. I’m not sure how it actually works for unwinding (rather than walking) the stack: It seems to discard the information about the location of registers other than the stack pointer, so I don’t see how it can restore callee-save registers that are spilled to the stack. This is necessary if you want to resume execution (unless you have a setjmp-like mechanism at the catch site, which adds a lot of overhead).

                                                              1. 2

                                                                Ah, a terminological misunderstanding then I think – I hadn’t realized you meant “unwinding” specifically as something sophisticated enough to allow resuming execution after popping some number of frames off the stack; I had assumed you just meant traversal of the active frames on the stack, and I think that’s how the linked article used the term as well (though re-reading your comment now I realize it makes more sense in the way you meant it).

                                                                Since AFAIK it’s just to guarantee accurate stack backtraces for determining livepatch safety I don’t think the objtool/ORC functionality in the Linux kernel supports unwinding in your sense – I don’t know of anything in Linux that would make use of it, aside from maybe userspace memory accesses (though those use a separate ‘extable’ mechanism for explicitly-marked points in the code that might generate exceptions, e.g. this).

                                                                1. 2

                                                                  If I understand the userspace access things correctly, they look like the same mechanism as FreeBSD (no stack unwinding, just quick resumption to an error handler if you fault on the access).

                                                                  I was quite surprised that the ORC[1] is bigger than DWARF. Usually DWARF debug info can get away with being large because it’s stored in the binary in separate pages from the code and so doesn’t consume any physical memory unless used. I guess speed does matter for things like DTrace / SystemTap probes, where you want to do a full stack trace quickly, but in the kernel you can’t easily lazily load the code.

                                                                  The NT kernel has some really nice properties here. Almost all of the kernel’s memory (including the kernel’s code) is pageable. This means that the kernel’s unwind metadata can be swapped out if not in use, except for the small bits needed for the page-fault logic. In Windows, the metadata for paged-out pages is stored in PTEs and so you can even page out page-table pages, but you can then potentially need to page in every page in a page-table walk to handle a userspace fault. That extreme case probably mattered a lot more when 16 MiB of RAM was a lot for a workstation than it does now, but being able to page out rarely-used bits of kernel is quite useful.

                                                                  In addition, the NT kernel has a complete SEH unwinder and so can easily throw exceptions. The SEH exception model is a lot nicer than the Itanium model for in-kernel use. The Itanium C++ ABI allocates exceptions and unwind state on the heap and then does a stack walk, popping frames off to get to handlers. The SEH model allocates them on the stack and then runs each cleanup frame, in turn, on the top of the stack then, at catch, runs some code on top of the stack before popping off all of the remaining frames[2]. This lets you use exceptions to handle out-of-memory conditions (though not out-of-stack-space conditions) reliably.

                                                                  [1] Such a confusing acronym in this context, given that the modern LLVM JIT is also called ORC.

                                                                  [2] There are some comments in the SEH code that suggest that it’s flexible enough to support the complete set of Common Lisp exception models, though I don’t know if anyone has ever taken advantage of this. The Itanium ABI can’t support resumable exceptions and needs some hoop jumping for restartable ones.

                                                          2. 4

                                                            What you are missing is that stack unwinding requires destructors, for example to unlock locks you locked. It does work fine for Rust kernels, but not for Linux.

                                                        2. 7

                                                          Does the kernel have unprotected memory and just roll with things like null pointer dereferences reading garbage data?

                                                          For errors that are expected Rust uses Result, and in that case it’s easy to sprinkle the code with result.or(whoopsie_fallback) that does not panic.
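
                                                          A tiny sketch of that style (read_sensor and the fallback values are made up):

                                                          ```rust
                                                          // Expected failures travel as Results; recovery is explicit.
                                                          fn read_sensor() -> Result<u32, &'static str> {
                                                              Err("bus timeout")
                                                          }

                                                          fn main() {
                                                              // Substitute a plain fallback value instead of panicking...
                                                              let reading = read_sensor().unwrap_or(0);
                                                              println!("sensor reading (0 = fallback): {reading}");

                                                              // ...or substitute a whole fallback Result, which is what
                                                              // result.or(whoopsie_fallback) above does.
                                                              let reading = read_sensor().or(Ok::<u32, &'static str>(42));
                                                              assert_eq!(reading, Ok(42));
                                                          }
                                                          ```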

                                                          1. 4

                                                            As far as I understand, yeah, sometimes the kernel would prefer to roll with corrupted memory as far as possible:

                                                            So BUG_ON() is basically ALWAYS 100% the wrong thing to do. The argument that “there could be memory corruption” is [not applicable in this context]. See above why.

                                                            (from docs and linked mail).

                                                            Null dereferences in particular, though, usually do what BUG_ON essentially does.

                                                            And things like out-of-bounds accesses seem to end in a null dereference:

                                                            https://github.com/torvalds/linux/blob/45b588d6e5cc172704bac0c998ce54873b149b22/lib/flex_array.c#L268-L269

                                                            Though, notably, out-of-bounds access doesn’t immediately crash the thing.

                                                            1. 8

                                                              As far as I understand, yeah, sometimes the kernel would prefer to roll with corrupted memory as far as possible:

                                                              That’s what I got from the thread and I don’t understand the attitude at all. Once you’ve detected memory corruption then there is nothing that a kernel can do safely and anything that it does risks propagating the corruption to persistent storage and destroying the user’s data.

                                                              Linus is also wrong that there’s nothing outside of a kernel that can handle this kind of failure. Modern hardware lets you make it very difficult to accidentally modify the kernel page tables. As I recall, XNU removes all of the pages containing kernel code from the direct map and protects the kernel’s page tables from modification, so that unrecoverable errors can take an interrupt vector to some immutable code that can then write crash dumps or telemetry and reboot. Windows does this from the Secure Kernel, which is effectively a separate VM that has access to all of the main VM’s memory but which is protected from it. On Android, Hafnium provides this kind of abstraction.

                                                              I read that entire thread as Linus asserting that the way that Linux does things is the only way that kernel programming can possibly work, ignoring the fact that other kernels use different idioms that are significantly better.

                                                              1. 5

                                                                Reading this thread is a little difficult because the discussion is evenly spread between the patch set being proposed, some hypothetical plans for further patch sets, and some existing bad blood between the Linux and Rust community.

                                                                The “roll with corrupted memory as far as possible” part is probably a case of the “bad blood” part. Linux is way more permissive with this than it ought to be but this is probably about something else.

                                                                The initial Rust support patch set failed very eagerly and panicked, including on cases where it really is legit not to panic, like when failing to allocate some memory in a driver initialization code. Obviously, the Linux idiom there isn’t “go on with whatever junk pointer kmalloc gives you there” – you (hopefully – and this is why we should really root for memory safety, because “hopefully” shouldn’t be a part of this!) bail out, that driver’s initialization fails but kernel execution obviously continues, as it probably does on just about every general-purpose kernel out there.

                                                                The patchset’s authors clarified immediately that the eager panics were just an artefact of the early development status – an alloc implementation (and some bits of std) that follows safe kernel idioms was needed, but it was a ton of work, so it was scheduled for later, as it really wasn’t relevant for a first proof of concept – which was actually a very sane approach.

                                                                However, that didn’t stop seemingly half the Rustaceans on Twitter from taking out their pitchforks, insisting that you should absolutely fail hard if memory allocation fails because what else are you going to do, and ranting about how Linux is unsafe and riddled with security bugs because it’s written by obsolete monkeys from the nineties whose approach to memory allocation failures is “well, what could go wrong?”. Which is really not the case, and it really does ignore how much work went into bolting the limited memory safety guarantees that Linux offers onto as many systems as it does, while continuing to run critical applications.

                                                                So when someone mentions Rust’s safety guarantees, even in hypothetical cases, there’s a knee-jerk reaction for some folks on the LKML to feel like this is gonna be one of those cases of someone shitting on their work.

                                                                I don’t want to defend it, it’s absolutely the wrong thing to do, and I think experienced developers like Linus should realize there’s a difference between programmers actually trying to use Rust for real-world problems (like Linux) and Rust advocates for whom everything falls under either “Rust excels at this” or “this is an irrelevant niche case”. This is not a low-effort patch; lots of thinking went into it, and there’s bound to be some impedance mismatch between a safe language that tries to offer compile-time guarantees and a kernel historically built on overcoming compiler permissiveness through idioms and well-chosen runtime tradeoffs. I don’t think the Linux kernel folks are dealing with this the way they ought to be dealing with it, I just want to offer an interpretation key :-D.

                                                            2. 1

                                                              No expert here, but I imagine the Linux kernel has methods of handling expected errors & null checks.

                                                            3. 6

                                                              In an ideal world we could have panic and stop in the kernel. But what the kernel does now is what people expect. It’s very hard to make such a sweeping change.

                                                              1. 6

                                                                Sorry, this is a tangent, but your phrasing took me back to one of my favorite webcomics, A Miracle of Science, where mad scientists suffer from a “memetic disease” that causes them to e.g. monologue and explain their plans (and other cliches), but also allows them to make impossible scientific breakthroughs.

                                                                One sign that someone may be suffering from Science Related Memetic Disorder is the phrase “in a perfect world”. It’s never clearly stated exactly why mad scientists tend to say this, but I’d speculate it’s because in their pursuit of their utopian visions, they make compromises (ethical, ugly hacks to technology, etc.), that they wouldn’t have to make in “a perfect world”, and this annoys them. Perhaps it drives them to take over the world and make things “perfect”.

                                                                So I have to ask… are you a mad scientist?

                                                                1. 2

                                                                  I aspire to be? bwahahaa

                                                                  1. 2

                                                                    Hah, thanks for introducing me to that comic! I ended up archive-bingeing it.

                                                                  2. 2

                                                                    What modern kernels use “panic and stop”? Is it a feature of the BSDs?

                                                                    1. 8

                                                                      Every kernel except Linux.

                                                                      1. 2

                                                                        I didn’t exactly mean bsd. And I can’t name one. But verified ones? redox?

                                                                        1. 1

                                                                          I’m sorry if my question came off as curt or snide, I was asking out of genuine ignorance. I don’t know much about kernels at this level.

                                                                          I was wondering how much of an outlier the Linux kernel is - @4ad ’s comment suggests it is.

                                                                          1. 2

                                                                            No harm done

                                                                    2. 4

                                                                      I agree. I would be very worried if people writing the Linux kernel adopted the “if it compiles it works” mindset.

                                                                      1. 2

                                                                        Maybe I’m missing some context, but it looks like Linus is replying to “we don’t want to invoke undefined behavior” with “panicking is bad”, which makes it seem like irrelevant grandstanding.

                                                                        1. 2

                                                                          The part about debugging specifically makes sense in the “cultural” context of Linux, but it’s not a matter of realism. There were several attempts to get “real” in-kernel debugging support in Linux. None of them really gained much traction, because none of them really worked (as in, reliably, for enough people, and without involving ritual sacrifices), so people sort of begrudgingly settled for debugging by printf and logging unless you really can’t do it otherwise. Realistically, there are kernels that do “panic and stop” well and are very debuggable.

                                                                          Also realistically, though: Linux is not one of those kernels, and it doesn’t quite have the right architecture for it, either, so backporting one of these approaches onto it is unlikely to be practical. Linus’ arguments are correct in this context but only insofar as they apply to Linux, this isn’t a case of hammering realism into idealists. The idealists didn’t divine this thing in some programming class that only used pen, paper and algebra, they saw other operating systems doing it.

                                                                          That being said, I do think people in the Rust advocacy circles really underestimate how difficult it is to get this working well for a production kernel. Implementing panic handling and a barebones in-kernel debugger that can nonetheless usefully handle 99% of the crashes in a tiny microkernel is something you can walk third-year students through. Implementing a useful in-kernel debugger that can reliably debug failures in any context, on NUMA hardware of various architectures, even on a tiny, elegant microkernel, is a whole other story. Pointing out that there are Rust kernels that do it well (Redshirt comes to mind) isn’t very productive. I suspect most people already know it’s possible, since e.g. Solaris did it well, years ago. But the kind of work that went into that, on every level of the kernel, not just the debugging end, is mind-blowing.

                                                                          (Edit: I also suspect this is the usual Rust cultural barrier at work here. The Linux kernel community is absolutely bad at welcoming new contributors. New Rust contributors are also really bad at making themselves welcome. Entertaining the remote theoretical possibility that, unlikely though it might be, it is nonetheless in the realm of physical possibility that you may have to bend your technology around some problems, rather than bending the problems around your technology, or even, God forbid, that you might be wrong about something, can take you a very long way outside a fan bulletin board.)

                                                                          1. 1

                                                                            easter bunny and santa claus comment

                                                                            Wow, Linus really has mellowed over the years ;)

                                                                        1. 15

                                                                          I think CHERI is a big unknown here. If CHERI works, language level memory safety is less valuable, and Zig will be more attractive and Rust less.

                                                                          I am pretty optimistic about CHERI. The technology is solid, and its necessity is clear. There is just no way we will rewrite existing C and C++ code. So we will have CHERI for C and C++, and Zig will be an unintended beneficiary.

                                                                          1. 26

                                                                            For desktop and mobile applications, I’d prefer a safety solution that doesn’t require a billion or more people to throw out the hardware they already have. So whatever we do, I don’t think relying exclusively on CHERI is a good solution.

                                                                            1. 3

                                                                              People throw away their hardware, at least on average, once every decade. I’d much rather a solution that didn’t require rewriting trillions of dollars of software.

                                                                              1. 2

                                                                                People throw away their hardware, at least on average, once every decade.

                                                                                True, the software upgrade treadmill forces them to do that. But not everyone can keep up. Around 2017, I made friends with a poor person whose only current computer was a lower-end smartphone; they had a mid-2000s desktop PC in storage, and at some point also had a PowerPC iMac. It would be great if current software, meeting current security standards, were usable on such old computers. Of course, there has to be a cut-off point somewhere; programming for ancient 16-bit machines isn’t practical. I’m afraid that 3D-accelerated graphics hardware might be another hard requirement for modern GUIs; I was disappointed that GNOME 3 chose fancy visual effects over desktop Linux’s historical advantage of running well on older hardware. But let’s try not to keep introducing new hardware requirements and leaving people behind.

                                                                            2. 13

                                                                              Wouldn’t CHERI still discover these issues at runtime versus compile time? Do not get me wrong, I’m still bullish on CHERI and it would be a material improvement, but I do think finding these bugs earlier in the lifecycle is part of the benefit of safety as a language feature.

                                                                              1. 8

                                                                                That’s why I said “less valuable” instead of “nearly useless”.

                                                                                1. 2

                                                                                  Makes sense, thank you, just checking my understanding

                                                                              2. 2

                                                                                Is there a performance overhead from using CHERI?

                                                                                1. 5

                                                                                    The CheriABI paper measured a 6.8% overhead for a PostgreSQL benchmark running on FreeBSD in 2019. It mostly comes from the larger pointers (128 bits) and their effect on the cache.

                                                                                  1. 1

                                                                                    Note that those numbers were from the CHERI/MIPS prototype, which was an in-order core with a fairly small cache but disproportionately fast DRAM (cache misses cost around 30ish cycles). Setting the bounds on a stack allocation was disproportionately expensive, for example, because the CPU couldn’t do anything else that cycle, whereas a more modern system would do that in parallel with other operations and so we typically see that as being in the noise on Morello. It also had a software-managed TLB and so couldn’t speculatively execute on any paths involving cache misses.

                                                                                      The numbers that we’re getting from Morello are a bit more realistic, though with the caveat that Arm made minimal modifications to the Neoverse N1 for Morello and so couldn’t widen data paths or queues in a couple of places where the performance win would have been huge for CHERI workloads relative to the power / area that they cost.

                                                                                  2. 3

                                                                                      We’re starting to get data on Morello, though it’s not quite as realistic a microarchitecture as we’d like; Arm had to cut a few corners to ship it on time. Generally, most of the overhead comes from doubling pointer sizes, so it varies from almost nothing (for weird reasons, a few things get 5-10% faster) to 20% for very pointer-dense workloads. Adding temporal safety on top costs, on the four worst affected of the SPECCPU benchmarks, about 1% for two of them and closer to 20% for the others (switching from glibc’s malloc to snmalloc made one of those 30% faster on non-CHERI platforms; some of SPEC is really a tight loop around the allocator). We have some thoughts about improving performance here.

                                                                                      It’s worth noting that any microarchitecture tends to be tuned for specific workloads. One designed for CHERI would see different curves, because some things would be sized where they hit big wins for CHERI but diminishing returns for everything else. The folks working on Rust are guessing that Rust would be about 10% faster with CHERI. I believe WASM will see a similar speedup, and MSWasm could be 50% or more faster than software enforcement.

                                                                                    1. 1

                                                                                      for weird reasons, a few things get 5-10% faster

                                                                                      If you happen to have any easily-explained concrete examples, I’d be curious to hear about these weird reasons…

                                                                                      1. 5

                                                                                        I don’t know if anyone has done root-cause analysis on them yet, but typically it’s things like the larger pointers reduce cache aliasing. I’ve seen one of the SPEC benchmarks get faster (I probably can’t share how much) when you enable MTE on one vendor’s core because they disable a prefetcher with MTE and that prefetcher happens to hit a pathological case in that one benchmark and slow things down.

                                                                                        It’s one of the annoying things you hit working on hardware security features. Modern CPUs are so complex that changing anything is likely to have a performance change of up to around 10% for any given workload, so when you expect your overhead to be around 5% on average you’re going to see a bunch of things that are faster, slower, or about the same. Some things have big differences for truly weird reasons. I’ve seen one thing go a lot faster because a change made the read-only data segment slightly bigger, which made two branch instructions on a hot path land in slightly different places and no longer alias in the branch predictor.

                                                                                          My favourite weird performance story was from some Apple folks. Apparently they got samples of a newer iPhone chip (this is probably about 10 years ago now), which was meant to be a lot faster, and they found that a core part of iOS ran much, much slower. It turned out that, with the old core, it was always mispredicting a branch, which was issuing a load, and then being cancelled after 10 cycles or so. In the newer core, the branch was correctly predicted and so the load wasn’t issued. The non-speculative branch needed that load a couple of hundred cycles later and ended up stalling for 100-200 cycles waiting for memory. The cost of the memory wait was over an order of magnitude higher than the cost of the branch misprediction. They were able to add an explicit prefetch to regain performance (and get the expected benefit from the better core), but it’s a nice example of how improving one bit of the hardware can cause a huge systemic regression in performance.

                                                                                        1. 1

                                                                                          Interesting, thanks – reminds me of some of the effects described in this paper (performance impacts of environment size and link order).

                                                                                          Once doing some benchmarking for a research project circa 2015 or so I found a MySQL workload that somehow got consistently somewhat faster when I attached strace to it, though I unfortunately never figured out exactly why or how it happened…

                                                                                          1. 2

                                                                                            There was another similar paper at ASPLOS a few years later where they compiled with function and data sections and randomised the order of a bunch of benchmarks. They found that this gave a 20% perf delta and that a lot of papers about compiler optimisations were seeing a speed up simply as a result of this effect. Apparently the same team later produced a tool for properly evaluating optimisations that would do this randomisation and apply some statistics to see if your speed up is actually statistically significant.

                                                                                1. 7

                                                                                  The git log of bash is still mostly useless :(

                                                                                  1. 8

                                                                                    A lot of GNU projects built their processes when revision control systems were not widespread and when moves between revision control systems lost a lot of data. They generally prefer to maintain a ChangeLog file, rather than git history. In the distant past, I had an exciting pile of XSLT that would extract my svn commit message and the list of files that were changed from svn’s XML export, convert it into the GNU ChangeLog file format, and then add a new commit with the ChangeLog entry for the previous commit.

                                                                                    1. 3

                                                                                      Sure, but at this point most other high-profile GNU projects (glibc, emacs, coreutils, grep, gcc, binutils, gnulib, …) have adopted a more modern development style of fine-grained commits with descriptive commit messages. The bash repo remains something of an anachronism, unfortunately.

                                                                                      1. 2

                                                                                        Honestly, Changelogs should be for user-relevant changes, which commits might not necessarily be

                                                                                    1. 8

                                                                                      As a satisfied git user, this seems fair. The tldr is: fossil uses substantially the same data structure as git but makes completely opposite choices because it’s optimized for smaller development teams.

                                                                                      I’d love to hear from someone who has used both extensively and prefers fossil. What’s it like? What’s lost that git users rely on? What huge wins does fossil open up?

                                                                                      1. 8

                                                                                        I actually go so far as to just use Fossil for most of my websites. The wiki in markdown mode works great for my sites.

                                                                                        What’s lost that git users rely on?

                                                                                        The big thing is, git allows mutability; in fact, all of git is basically a big malleable ball of mud. Fossil is much more immutable. If you think your VCS should allow mutable history, Git is for you. If you think your commits are immutable history, you should take a serious look at Fossil. I’m not sure there is a “right” answer here; it just depends on your perspective. Personally I’m an immutable person, which does mean you get the occasional commit message that looks like: ‘oops, fix typo’. But you can be fairly confident Fossil’s history is what actually happened and not a cleaned-up version of reality.

                                                                                        1. 11

                                                                                          For what it’s worth, the commit graph in git is immutable. The only mutable things are the references. What’s different is that in git you can move the references however you please, with no restriction that they follow the edges of the commit graph.

                                                                                          In fossil, you can only converge the timelines. In git, you can jump between them.
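
                                                                                          A minimal sketch of the difference (branch name and hash hypothetical; note that git branch -f refuses to move the branch you currently have checked out):

                                                                                            git log --oneline -1 feature     # note the tip hash, say abc1234
                                                                                            git branch -f feature feature~3  # move the ref back along the graph; abc1234 still exists
                                                                                            git branch -f feature abc1234    # or jump it anywhere; no commit object ever changed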

                                                                                          1. 7

                                                                                            A big realization for me was that the git graph itself is versioned in the reflog. Using git reflog to restore old references is hugely valuable and I’m way more confident using git because of it. But to be fair, those commits are only stored on my local machine, will eventually be gc-ed and won’t be pushed to a remote. Fossil would track them everywhere.
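
                                                                                            Roughly (a sketch; the entries are whatever your reflog actually recorded):

                                                                                              git reflog                  # shows e.g. “abc1234 HEAD@{3}: commit: try idea”
                                                                                              git reset --hard HEAD@{3}   # put the branch and working tree back to that state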

                                                                                          2. 6

                                                                                            Could you elaborate a bit on the value of immutability to you? I fall into the mutability camp because I find the additional context is rarely worth the cost when it takes so much effort to untangle. I’d rather use the mutability to create untangled context and have that be the only artifact that remains. I’m not disagreeing that the immutability has value to you; I’m just seeking understanding, since I experience things differently.

                                                                                            1. 5

                                                                                              I generally have two diametrically opposed views on immutability:

                                                                                              • In other people’s repos, I want immutability with no exceptions. If I clone your repo, nothing you do should be able to modify the history that I think I have, so I can always find a common ancestor between your version and my fork and merge back or pull in new changes.
                                                                                              • In my own repo, I want the flexibility to change whatever I want, commit WIP things, move fixes back to the commit that introduced the bug, and so on.

                                                                                              I think GitHub’s branch protection rules are a good way of reconciling this. The main branch enforces immutable history, as do release branches. Ideally tags would also be immutable. Anything else is the wild west: if you work from those branches then don’t expect them to still exist upstream next time you pull and you may need to resolve conflicts later. I’d like a UI that made this distinction a lot more visible.
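
                                                                                              If you host the shared repo yourself rather than on GitHub, a rough sketch of enforcing the immutable half with plain git config on the server (not a full branch-protection replacement, since it applies to every branch):

                                                                                                # inside the bare repo on the server
                                                                                                git config receive.denyNonFastForwards true   # reject pushes that rewrite history
                                                                                                git config receive.denyDeletes true           # reject branch/tag deletion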

                                                                                              1. 2

                                                                                                This is definitely a reasonable perspective.

                                                                                                Question: what is the point of committing WIP stuff?

                                                                                                1. 3

                                                                                                  Locally, so that I have an undo button that works across files, so once something is working and I want to clean it up, I can always see what I’ve changed and broken during cleanup.

                                                                                                  Publicly, so that I can get feedback (from humans or CI), incorporate it, and then clean up the result so that, when it’s merged, everything in the history is expected to be working. This means that other people can bisect to find things that broke and not have to work out if a particular version is expectedly or unexpectedly broken. It also means that people merging don’t see conflicts in places where I made a change in a file, discovered I didn’t need it, reverted it, and they did a real change in that location.
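
                                                                                                  As a sketch, the local loop looks something like this (branch names hypothetical):

                                                                                                    git commit -am "WIP: half-working idea"   # cheap cross-file checkpoint
                                                                                                    # ...keep hacking; during cleanup:
                                                                                                    git diff HEAD                             # everything changed since the checkpoint
                                                                                                    git rebase -i main                        # later, squash/reword the WIPs into clean commits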

                                                                                                  1. 1

                                                                                                    Thanks!

                                                                                                    For me and us @ $WORK, the undo button is either ZFS snapshots (exposed as ~/.snapshots) or $EDITOR’s undo functionality.

                                                                                                    For human feedback, we have a shared user machine we work from, or we use other more general-purpose tools, like desktop screen sharing (typically tmux, or guacamole).

                                                                                                    For CI feedback, since our CI/CD jobs are just nomad batch jobs, it’s just a nomad run project-dev.nomad command away.

                                                                                                    I.e. we prefer general tools we have to use anyway to solve these problems, instead of specific tools.
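
                                                                                                    Spelled out, the undo flow is roughly (dataset name hypothetical; ~/.snapshots is our convention, stock ZFS exposes .zfs/snapshot):

                                                                                                      zfs snapshot tank/home@before-refactor    # cheap, instant checkpoint
                                                                                                      # ...hack away...
                                                                                                      ls ~/.snapshots/before-refactor/project   # browse the old state read-only
                                                                                                      zfs rollback tank/home@before-refactor    # or roll the whole dataset back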

                                                                                                    1. 3

                                                                                                      For me and us @ $WORK, the undo button is either ZFS snapshots (exposed as ~/.snapshots) or $EDITOR’s undo functionality.

                                                                                                      That requires me to either:

                                                                                                      • Have a per-source-tree ZFS dataset (possible with delegated administration, but a change from just having one for my home directory and one for my builds, which doesn’t get backed up and has sync turned off)
                                                                                                      • Track metadata externally about which of my snapshots corresponds to a checkpoint of which repo.

                                                                                                      In contrast, git already does this via the same mechanism that I use for other things. Vim has persistent undo, which is great for individual files, but when a change spans a dozen files, trying to use vim’s undo to go back to (and compare against) the state from the middle of yesterday afternoon that worked is hard.

                                                                                                      For human feedback, we have a shared user machine we work from, or we use other more general-purpose tools, like desktop screen sharing (typically tmux, or guacamole).

                                                                                                      That requires everyone you collaborate with to be in the same company as you (or it introduces some exciting security problems for your admin team to have to care about), and it requires your code review to be synchronous. The first is not true for me; the second would be problematic given that I work with people distributed across many time zones. Again, GitHub’s code review doesn’t have those problems.

                                                                                                      For CI feedback, since our CI/CD jobs are just nomad batch jobs, it’s just a nomad run project-dev.nomad command away.

                                                                                                      That’s fine if everyone running your CI has deploy permissions on all of the infrastructure where you do testing.

                                                                                                      I.e. we prefer general tools we have to use anyway to solve these problems, instead of specific tools.

                                                                                                      The tools that you use have a lot of constraints that would prevent me from using them in most of the places where I use git.

                                                                                                      1. 1

                                                                                                        Around CI/CD: for local testing with nomad it can be as simple as downloading the nomad binary, running nomad agent -dev, then nomad run <blah.nomad>, and you can be off to the races, running CI locally.
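
                                                                                                        Spelled out as a throwaway local session (job name hypothetical):

                                                                                                          nomad agent -dev &            # single-node dev scheduler, in-memory state
                                                                                                          nomad run project-dev.nomad   # the same job file the commit hook submits
                                                                                                          nomad status project-dev      # watch the batch job run to completion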

                                                                                                        We don’t do that because @ $WORK our developers are all in-house, so sharing resources is a non-issue.

                                                                                                        Just to be clear, I’m not trying to convert you, just for those following along at home.

                                                                                                        Also, within Fossil’s commits, you can totally hide stuff from the timeline, similar to git rebase, using amend.

                                                                                                        1. 1

                                                                                                          Thanks for the exchange! It’s interesting seeing the different trade-offs on workflow.

                                                                                                          For those following along, another way, entirely within fossil, to do undo across large amounts of code change is to generate a sqlite patch file instead of a commit. It’s easy enough: fossil patch create <blah.patch>, and to undo: fossil patch apply <blah.patch>. By default, the patch file includes all uncommitted changes in the repo.

                                                                                                      2. 1

                                                                                                        an undo button that works across files, so once something is working and I want to clean it up, I can always see what I’ve changed and broken during cleanup.

                                                                                                        The staging area is underappreciated for this problem. Often when I hit a minor milestone (the tests pass!) I’ll toss everything into staged and then try to make it pretty in unstaged. With a good git UI it’s easy to look at the unstaged hunks in isolation and blow them away if I mess up. Good code gets promoted to the staging area and eventually I get a clean commit.
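
                                                                                                        Sketched out with plain git (any good git UI wraps the same plumbing):

                                                                                                          git add -u         # tests pass: park the working state in the index
                                                                                                          # ...clean up in the working tree...
                                                                                                          git diff           # unstaged hunks are the cleanup only, reviewable in isolation
                                                                                                          git restore -p .   # interactively blow away a cleanup hunk that went wrong
                                                                                                          git add -p         # promote the good hunks into the staged clean commit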

                                                                                                        …and then I end up with lots of messy commits anyway to accommodate the “humans or CI” cases. :)

                                                                                                      3. 1

                                                                                                        In short, for backing it up or for pushing out multiple commits to create a design sketch.

                                                                                                        1. 1

                                                                                                          The Fossil “Rebase Considered Harmful” document provides a lot of reasons for committing WIP: bisect works better, cherry-picks work better, backouts work better. Read the doc for more: https://www2.fossil-scm.org/home/doc/trunk/www/rebaseharm.md

                                                                                                          Rebase is a hack to work around a weakness in git that doesn’t exist in fossil.

                                                                                                          The Git documentation acknowledges this fact (in so many words) and justifies it by saying “rebasing makes for a cleaner history.” I read that sentence as a tacit admission that the Git history display capabilities are weak and need active assistance from the user to keep things manageable. Surely a better approach is to record the complete ancestry of every check-in but then fix the tool to show a “clean” history in those instances where a simplified display is desirable and edifying, but retain the option to show the real, complete, messy history for cases where detail and accuracy are more important.

                                                                                                          So, another way of thinking about rebase is that it is a kind of merge that intentionally forgets some details in order to not overwhelm the weak history display mechanisms available in Git. Wouldn’t it be better, less error-prone, and easier on users to enhance the history display mechanisms in Git so that rebasing for a clean, linear history became unnecessary?

                                                                                                      4. 5

                                                                                                        Sometimes it’s policy/law. When software starts mucking about with physical things that can kill/maim/demolish lives, stuff has to be kept track of. Think airplane fly-by-wire systems, etc. Fossil is good for these sorts of things; git could be, with lots of policy around using it. Some industries would never allow a git rebase, for any reason.

                                                                                                        • The perspective of Fossil is: Think before you commit. It’s called a commit for a reason.
                                                                                                        • The perspective of Git is, commit every nanosecond and maybe fix the history later.

                                                                                                        Of course history being immutable can be annoying sometimes, but history is messy in every version of history you look at, except perhaps kindergarten level history books :P I’m not a kindergartener anymore, I can handle the real history.

                                                                                                        For me at $WORK, it’s policy.

                                                                                                        1. 6

                                                                                                          Ah, a policy reason certainly makes sense. I work in FinTech, where policy and/or law has not quite caught up with software enough to burden us with a more deeply ingrained moral responsibility to ensure accountability for the software we write.

                                                                                                          • The perspective of Fossil is: Think before you commit. It’s called a commit for a reason.
                                                                                                          • The perspective of Git is, commit every nanosecond and maybe fix the history later.

                                                                                                          Of course history being immutable can be annoying sometimes, but history is messy in every version of history you look at, except perhaps kindergarten level history books :P I’m not a kindergartener anymore, I can handle the real history.

                                                                                                          This is condescending. Human beings make mistakes and storing those mistakes in all cases isn’t always a valuable artifact. Imagine a text editor that disallows text deletions. “You should have thought harder before typing it.” Come on, dude.

                                                                                                          1. 7

                                                                                                            Imagine a text editor that disallows text deletions.

                                                                                                            We call that Blockchain Notepad.

                                                                                                            1. 4

                                                                                                              Thanks, I hate it!

                                                                                                            2. 3

                                                                                                              I had a :P in there :)

                                                                                                              I apologize, but come on, it’s not like your comparison is remotely fair either :)

                                                                                                              Human beings make mistakes and storing those mistakes in all cases isn’t always a valuable artifact.

                                                                                                              I agree. Fossil (and git) both have ways to fix mistakes that are worth cleaning up. Git just has more of them. See “2.7 What you should have done vs. What you actually did” in the OP’s link.

                                                                                                              1. 1

                                                                                                                One thing I’m seeing about fossil that illuminates things a bit: it looks like it might be possible to limit which sets of changes you see by querying the change sets. This seems useful, not to reproduce something git-like, but to limit the view to the changes that are considered more valuable and final than anything more transient. If that’s the case, I can see the “mess” of fossil being less of a problem, though with the added cost of now needing to be comfortable with querying the changes.

                                                                                                            3. 5

                                                                                                              I’d much rather have cleaned up history than try to bisect around half-finished commits.

                                                                                                              1. 3

                                                                                                                From a Fossil perspective, half-finished commits belong locally or in .patch files to move around (which in Fossil land are sqlite3 files, not diffs). They don’t belong in commits.

                                                                                                                To be clear, I agree with you: bisecting half-finished commits is terrible. Fossil just has a different perspective and workflow than Git when it comes to this stuff.

                                                                                                                1. 2

                                                                                                                  I imagine that the way this would get handled in fossil land is people making local half-commits, then redrafting the changes cleanly on another branch and using those as the official commits to release.

                                                                                                                2. 3

                                                                                                                  There’s a series of steps that any change takes as it passes through decreasingly mutable tiers of storage, so to speak:

                                                                                                                  • typing moves things from the programmer’s brain to an editor buffer
                                                                                                                  • saving moves things from an editor buffer to a file
                                                                                                                  • committing moves things from a file to a (local) repo’s history
                                                                                                                  • pushing moves things from a local repo to a (possibly) public one

                                                                                                                  The question is at what level a given change becomes “permanent”. With git it’s really only when you’ve published your history, whereas it sounds like fossil’s approach doesn’t really allow the distinction between the last two and hence that happens on every (local) commit.

                                                                                                                  You could move the point-of-no-undo even earlier and install editor hooks to auto-commit on every save, or save and commit on every keystroke, but I think most people would agree that that would produce an unintelligible, useless mess of history, even if it is “a more accurate record of reality” – so even in the fossil approach you’re still looking at a somewhat massaged, curated view of the development history. I think git’s model just makes that curation easier, by allowing you to create “draft” commits and modify them later.

                                                                                                                  1. 2

                                                                                                                    Fossil’s perspective would be, once it’s committed it is immutable, but you can do reporting on it and make it spit out whatever you want. i.e. Fossil really is just a fancy UI and some tooling around a SQLite database. There is basically no end to what one can do when your entire code tree is living in a SQL database.

                                                                                                                    i.e. You don’t change the history, you change the report of history to show the version of reality that is interesting to you today.

                                                                                                                    Fossil even includes an API for it: https://www2.fossil-scm.org/home/doc/trunk/www/json-api/api-query.md. Not to mention the built-in querying available, for instance, in the timeline view.

                                                                                                                  2. 3

                                                                                                                    While I disagree with the conclusion, I appreciate you taking the time to explain this way of looking at it. The legality angle seems reasonable (and, ofc, if you have no choice, you have no choice), but digging further I have some questions for you…

                                                                                                                    1. By this line of reasoning, why is the fossil commit the unit of “real history”? Why not every keystroke? I am not just being facetious. Indeed, why not screen record every editing session?
                                                                                                                    2. Given that the fossil commit has been deemed the unit of history, doesn’t this just encourage everyone to big-batch their commits? Indeed, perhaps even use some other mechanism to save ephemeral work while I spend hours, or even days, waiting for my “official work” to be done so that I can create clean history?

                                                                                                                    I’m not a kindergartener anymore, I can handle the real history.

                                                                                                                    This strikes me as an almost Orwellian reversal, since I would say: “You (coworker) are not a kindergartner anymore. Don’t make me read your 50 garbage commits like ‘checkin’, ‘try it out’, ‘go back’, etc., when the amount of changes you have merits 3 clean commits. Have the basic professionalism to spend 5-10 minutes to organize and communicate clearly the work you have actually done to your current and future coworkers.” I am no more interested in this “true history” than I am in the 5 intermediate drafts of the email memo you just sent out.

                                                                                                                    1. 2

                                                                                                                      Don’t make me read your 50 garbage commits …

                                                                                                                      It sounds like we are no longer discussing Fossil, but a way of using Git where you do not use rebase.

                                                                                                                      Here’s what the Fossil document says:

                                                                                                                      Git puts a lot of emphasis on maintaining a “clean” check-in history. Extraneous and experimental branches by individual developers often never make it into the main repository. Branches may be rebased before being pushed to make it appear as if development had been linear, or “squashed” to make it appear that multiple commits were made as a single commit. There are other history rewriting mechanisms in Git as well. Git strives to record what the development of a project should have looked like had there been no mistakes.

                                                                                                                      Fossil, in contrast, puts more emphasis on recording exactly what happened, including all of the messy errors, dead-ends, experimental branches, and so forth. One might argue that this makes the history of a Fossil project “messy,” but another point of view is that this makes the history “accurate.” In actual practice, the superior reporting tools available in Fossil mean that this incidental mess is not a factor.

                                                                                                                      Like Git, Fossil has an amend command for modifying prior commits, but unlike in Git, this works not by replacing data in the repository, but by adding a correction record to the repository that affects how later Fossil operations present the corrected data. The old information is still there in the repository, it is just overridden from the amendment point forward.

                                                                                                                      My reading is that Fossil permits you to view a “clean” history of changes due to its “superior reporting tools” and the “correction records” added by the amend command. But unlike Git, the original commit history is still recorded if you need to see it.

                                                                                                                      1. 1

                                                                                                                        My reading is that Fossil permits you to view a “clean” history of changes due to its “superior reporting tools” and the “correction records” added by the amend command. But unlike Git, the original commit history is still recorded if you need to see it.

                                                                                                                        Ok, that is interesting… I had been assuming that they were dismissing the value of clean history, but it seems they are not; instead they’re solving the same problem at a different level in the stack.

                                                                                                                      2. 1

                                                                                                                        By this line of reasoning, why is the fossil commit the unit of “real history”? Why not every keystroke? I am not just being facetious. Indeed, why not screen record every editing session?

                                                                                                                        That’s what Teleport is for. Other tools obviously also do this.

                                                                                                                        More generally, stuff in the commit tree will eventually make it to production and run against real data and possibly hurt people. The stuff that can hurt people needs to be tracked. The ephemeral stuff in between doesn’t much matter. If I was purposefully negligent in my code, no amount of ephemeral work would prove it, there would be some other mechanism in place to prove that (my emails to a co-conspirator maybe, recording me with that evil spy, etc).

                                                                                                                        Given that the fossil commit has been deemed the unit of history, doesn’t this just encourage everyone to big-batch their commits? Indeed, perhaps even use some other mechanism to save ephemeral work while I spend hours, or even days, waiting for my “official work” to be done so that I can create clean history?

                                                                                                                        Why do you need to commit ephemeral work? What is the point?

                                                                                                                        Have the basic professionalism to spend 5-10 minutes to organize and communicate clearly the work you have actually done to your current and future coworkers.”

                                                                                                                        LOL, fair point :) But that goes back to the previous comments: what is the purpose of committing ephemeral work? From my perspective there are 2 general reasons:

                                                                                                                        • Show some pointy haired boss you did something today
                                                                                                                        • Share some code in progress with another person to help solve a problem, code review, etc.

                                                                                                                        The 1st: no amount of code commits will solve this; it’s either trust me, or come sit next to me and watch me do stuff (or screen record, video record my office, etc). If my boss doesn’t trust me to be useful to the organization, I’m at the wrong organization.

                                                                                                                        The 2nd is easily solved in a myriad of ways, from desktop/screen sharing, to code collab tools, to sharing Fossil patch files around.

                                                                                                                        I truly don’t understand the point of committing half-done work like Git proponents seem to think is an amazing idea. A commit needs to be USEFUL to be committed. Perhaps it’s part of a larger body of work, which is very common, but then it’s not ephemeral: you are doing a cleanup so you can then implement $FEATURE, and that cleanup can happily be its own commit, etc.

                                                                                                                        But committing every nanosecond or on every save is just idiotic from my point of view. If you want that sort of thing, just run against a versioned file system. You can do this with ZFS snapshots if you don’t want to run a versioned file system. Git is not a good backup tool.

                                                                                                                        1. 4

                                                                                                                          I think this is fundamentally a workflow difference.

                                                                                                                          Proponents of git, myself included, use committing for many purposes, including these prominent ones:

                                                                                                                          1. A way to save partially complete work so you don’t lose it, or can go back to that point in time in the midst of experimentation.
                                                                                                                          2. A way to share something with a co-worker that will not be part of permanent history or ever merged.

                                                                                                                          The 2nd, is easily solved in a myriad of ways, from desktop/screen sharing, to code collab tools to sharing Fossil patch files around.

                                                                                                                          Yes, I suppose there are other ways to solve the sharing problem. But since we do everything in git anyway and will have a PR in Github anyway, it is very convenient to just commit to share, rather than introduce a new mechanism for sharing.

                                                                                                                          I truly don’t understand the point of committing half-done work like Git proponents seem to think is an amazing idea. A commit needs to be USEFUL to be committed.

                                                                                                                          Sharing code to discuss, and backing up milestones of incomplete, experimental work, are both very useful to me.

                                                                                                                          1. 1

                                                                                                                            I think the disconnect is probably in what we consider “ephemeral.” You seem to think that we’re “idiotically [. . .] committing every nanosecond” (which, seriously, stop phrasing it like this because you’re being an asshole), but in most of my own use cases it’s simply a matter of wanting to preserve the current state of my work until I’ve reached a point where I’m ready to construct what I view as a salient description of the changes. In many cases this means making commits that roughly match the structure I’m after - a sort of design sketch - and maybe these don’t include all of the test changes yet, and I haven’t fully written out a commit message because I haven’t uncovered all the wrinkles that need ironing as I continue the refactor, and I find something later that makes more sense as a commit in the middle because it relates directly to that other block of work, and and and…

                                                                                                                            An example: when I reach the end of the day, I may want to stash what I’m working on or - depending on the amount of work I’ve put in - I may want to push up a WIP commit so that if something happens to my workstation I don’t lose that work (this is always a real concern for reasons I won’t go into). Maybe that WIP commit doesn’t have tests passing in it, and my team and I try to follow the practice of ensuring that every commit makes a green build, so I don’t want that to be the final version of the commit that eventually makes it into the main branch. The next day I come in, reset my WIP commit, add the test changes I was missing, and make the actual commit I want to eventually see pushed up to the main branch.
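
                                                                                                                            As a sketch (branch name and messages hypothetical):

                                                                                                                              git commit -am "WIP: tests red, do not merge"   # end-of-day checkpoint
                                                                                                                              git push origin my-branch                       # survives a dead workstation
                                                                                                                              # next morning:
                                                                                                                              git reset --soft HEAD~1                         # un-commit, keep all the changes
                                                                                                                              git commit -am "Refactor X; tests green"        # the commit that should land
                                                                                                                              git push --force-with-lease origin my-branch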

                                                                                                                            I don’t know of anybody who thinks saving things in WIPs for untangling later is - as you say - “an amazing idea,” but it’s a natural extension of our existing workflow.

                                                                                                                  3. 6

                                                                                                                    I use both Fossil and Git at work, although we are almost done with moving all the Fossil repos to Git.

                                                                                                                    Fossil is fine, but the immutability is kind of annoying in the long term. The lack of a rebase for local work is a drag.

                                                                                                                    Its biggest problem is tooling. Nothing works with it. It doesn’t integrate with the CI system without a lot of effort, there’s no Bitbucket/Github-like system to use for PRs or code reviews, and it doesn’t integrate with the ticket system. Sure, it contains all those things, but they don’t meet the needs we (and most others, it seems) have.

                                                                                                                    On a personal note, I dislike the clone/open workflow as I’d much rather have the database in the project directory similar to the .git directory. There are other little things I don’t like, but they are mostly because I’m used to Git, despite all its flaws.

                                                                                                                    1. 3

                                                                                                                      I would argue it’s because your perspective around Fossil is wrong when it comes to immutability. Fossil’s perspective is that when you commit, it’s literally a commitment: it’s done. Be careful and think about your commits. Practically the only thing we have noticed is that occasionally we get ‘oops, fixed typo’ type commits.

                                                                                                                      I agree with you on the clone/open workflow, but it’s that way for a reason: the perspective is that you clone locally once, and you open per branch/feature you want to mess with, so a cd is all you need to switch between branches. Once I figured that out, I didn’t mind the workflow that much; I just have a ~/fossil dir that keeps all my local open projects, and otherwise I mostly ignore that directory.

                                                                                                                      I agree with the tooling problem, though I don’t think it’s quite as bad as you think. There is fossil patch for PR/code review workflows. The only special tooling fossil gives you here is the ability to copy and apply the patch to a remote SSH host. Perhaps that could be changed, but it allows you to develop your own workflow if you care about those sorts of things.

                                                                                                                      I have a totally different perspective than the entire industry around CI/CD. CI/CD is just a specialization of running batch jobs. Since we have to run batch jobs anyway, we just use our existing batch-job tooling for CI/CD. For us, that means our CI/CD integration is as simple as a commit hook that runs nomad run <reponame.nomad>. After that, our normal nomad tooling handles all of our CI/CD needs, and allows anyone to start a CI/CD run. Since it’s all in the repo, there is no magic or special tooling for people to learn. If you have to learn how production works anyway for batch jobs, there is no sense in learning a different system too.

                                                                                                                      1. 2

                                                                                                                        It’s not just about perspective. I’m firmly in the mutable history camp, because a lot of my work - the vast majority of it, really - is experimenting and research. It’s all about coming up with ideas, sharing them, seeking feedback, and iterating. Most of those will get thrown out the window after a quick proof of concept. I see no point in having those proofs of concept be part of the history. Nor will I spend considerable time and effort documenting, writing tests, and whatnot for what is a proof of concept that will get thrown out and rewritten either way, just to make the history usable. I’d rather just rearrange it once the particular branch is being finalised.

                                                                                                                        Immutable history is great when you can afford it, but a huge hindrance when you can’t.

                                                                                                                        With git, both can be accomplished with a little care: no force pushes to any branch. Done.

                                                                                                                        What one does locally is irrelevant. Even with Fossil, you will likely have had local variations before you ended up committing. The difference with git is that you can make local snapshots and rearrange them later, using the same tool. With Fossil, you would have to find some other way to store draft work which is not ready to become part of the immutable history.

                                                                                                                        I mean, there’ve been many cases over my career where I was working on a thing that became a single commit in the end, for days, sometimes even weeks. I had cases where that single commit was a single line changed, not a huge amalgamation or anything like that. But it took a lot of iteration to get there. With git, I could commit my drafts, share it with others, and then rewrite it before submitting it upstream. I made use of that history a lot. I rolled back, reverted, partially reverted, looked at things I tried before, and so on. With Fossil, I would have had to find a way to do all that, without comitting the drafts to the immutable history. It would have made no sense to commit them - they weren’t ready. Yet, I still wanted to iterate, I still wanted to easily share with colleagues.

                                                                                                                        1. 3

                                                                                                                          Clearly you and I are going to disagree, but I would argue that Fossil can handle your use-case just fine, albeit very differently than Git would handle it. You have clearly gotten very used to the Git workflow model, and that’s fine. That doesn’t mean the Fossil workflow model is wrong or bad or evil or anything; it’s just different from Git’s, because (I’d argue) it’s coming from a different perspective.

                                                                                                                          Fossil does have ways to store draft work and ship it around; I mentioned two of them in the comment you are replying to, but you either didn’t see them or just chose to ignore them. fossil patch is actually pretty cool, as the patch file is just a sqlite3 file. Easy to ship/move/play with.

                                                                                                                          1. 2

                                                                                                                            I wasn’t saying the Fossil model is bad - it isn’t. It’s just not suitable for every scenario, and I have yet to see what benefit it would have over the mutable model for me. Just because it can handle the scenarios I want doesn’t mean it’s easy, convenient or straightforward to do so. Git can do immutable workflows and mutable ones too - it just makes the latter a lot easier, while the former is possible if you put in the work.

                                                                                                                            I did not see your comments about fossil patch before; I skipped over that part of your comment, sorry. However, that’s not suitable for my workflow: I don’t need a single patch, and I could ferry one around easily, so that wouldn’t be a problem. I work with branches, and their history is important, because I often go back and revert (fully or partially), and I often look back at things I tried before. That is important history during drafting, but completely irrelevant otherwise. Git lets me do dirty things temporarily, and share the refined result. Fossil lets me ferry uncommitted changes around, but that’s so very far from having a branch history. I could build something on it, sure. But git already ships with that feature out of the box, so why would I?

                                                                                                                            I could, of course, fork the repo, and do my draft commits in the fork, and once it reaches a stage where it’s ready to be upstreamed, I can rebuild it on top of the main repo - manually? Or does Fossil have something to help me with that?

                                                                                                                            I’m sure it works in a lot of scenarios, where the desire to commit often & refine is less common than to think hard & write only when it’s already refined. It sounds terrible for quick prototyping or proofs of concept (which are a huge part of my work) within an existing project.

                                                                                                                            1. 2

                                                                                                                              Fossil really is just a fancy UI and some tooling around a SQLite database. There is basically no end to what one can do when your entire code tree is living in a SQL database. You don’t need 100k confusing git commands when you can literally type sqlite3 <blah.fossil> and do anything you want. Whether fossil will understand it for you afterwards is of course an exercise left to the SQL writer. :)
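
                                                                                                                              For instance, a hedged sketch (assuming the stock repo schema, where check-ins are rows in the event table with type 'ci' and mtime stored as a julian day):

                                                                                                                                sqlite3 project.fossil \
                                                                                                                                  "SELECT datetime(mtime), user, comment
                                                                                                                                     FROM event
                                                                                                                                    WHERE type = 'ci'
                                                                                                                                    ORDER BY mtime DESC
                                                                                                                                    LIMIT 5;"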

                                                                                                                              That is important history during drafting, but completely irrelevant otherwise.

                                                                                                                              Fossil has a different perspective here. All history is important.

                                                                                                                              I think the big difference here is, Fossil’s UI and query tools are vastly more functional than Git’s. Literally an entire SQL implementation. Meaning you can hide/ignore loads of stuff from the history, so that in practice most of this ‘irrelevant history’ can be hidden from view the vast majority of the time.

                                                                                                                              I could, of course, fork the repo, and do my draft commits in the fork, and once it reaches a stage where it’s ready to be upstreamed, I can rebuild it on top of the main repo - manually? Or does Fossil have something to help me with that?

                                                                                                                              Yes, see: https://www2.fossil-scm.org/home/doc/trunk/www/branching.wiki

                                                                                                                              No need to fork, a branch should work just fine.

                                                                                                                              1. 2

                                                                                                                                Fossil really is just a fancy UI and some tooling around a SQLite database.

                                                                                                                                Similarly, git is just a UI over an object store. You can go and muck with the files themselves; there are libraries that help you do that. Whether git will understand it for you afterwards is an exercise left for whoever mucks about in the internals. ;)

                                                                                                                                Fossil has a different perspective here. All history is important.

                                                                                                                                And that is my major gripe. I do not believe that all history is important.

                                                                                                                                Meaning you can hide/ignore loads of stuff from the history, so that in practice most of this ‘irrelevant history’ can be hidden from view the vast majority of the time.

                                                                                                                                It still takes up space, and it still takes effort to even figure out what to ignore. With git, it’s easy: it simply isn’t there.

                                                                                                                                No need to fork, a branch should work just fine.

                                                                                                                                According to the document, a branch is just a named, intentional fork, and from what I can tell the history of a branch is still immutable. Fossil maintains a single DAG for the entire repo (so the linked doc says), so if I wanted to clean things up before submitting upstream, I’d still need to rebuild the branch by hand. With git, I can rebase and rewrite history to clean it up.

                                                                                                                                1. 2

                                                                                                                                  I do not believe that all history is important.

                                                                                                                                  Seconded.

                                                                                                                                  We do not have to remember everything we do.

                                                                                                                            2. 1

                                                                                                                              Can you explain a little bit more how things like code reviews work? I don’t doubt that they can be done in fossil; it’s just that the workflow is so different from what I’m used to.

                                                                                                                              1. 2

                                                                                                                                I am by no means a Fossil expert, but I’ll give you my perspective. Fossil handles moving the code back and forth; the rest is up to you.

                                                                                                                                I work on a new feature and am ready to commit, but I want Tootie to look it over (code review). If we have a shared machine somewhere with SSH and fossil on it, I can use fossil patch push server:/path/to/checkout and push my patch to some copy for her to look at. If not, I can fossil patch create <feature.patch> and then send her the .patch file (which is just a sqlite3 DB file) via any method.

                                                                                                                                She does her review and we talk about it, either in Fossil Forums or Chat, or email, irc, xmpp, whatever. Or she can hack on the code directly and send a new .patch file back to me to play with.

                                                                                                                                Whenever we agree it’s good to go, either one of us can commit it (assuming we both have commit rights). See the fossil patch docs.

                                                                                                                      2. 1

                                                                                                                          What do you mean by “the same data structure as git”? You know Fossil is using SQLite, right? I don’t know what the current Git data structure is, but from my experience it is much more complicated to work with than a SQL database.

                                                                                                                        1. 2

                                                                                                                          From the link under heading 2.3:

                                                                                                                          The baseline data structures for Fossil and Git are the same, modulo formatting details. Both systems manage a directed acyclic graph (DAG) of Merkle tree structured check-in objects. Check-ins are identified by a cryptographic hash of the check-in contents, and each check-in refers to its parent via the parent’s hash.
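To make that concrete, here’s a toy sketch (in TypeScript, since this thread keeps circling back to JS) of the shared shape: content-addressed check-ins that each embed their parents’ hashes. The SHA-256 choice and JSON encoding are purely illustrative; the real Fossil and Git on-disk formats differ.

    // Toy model of a Merkle DAG of check-ins (illustrative only).
    import { createHash } from "node:crypto";

    interface Checkin {
      parents: string[]; // hashes of parent check-ins ([] for the root)
      tree: string;      // hash of the file-tree snapshot
      message: string;
    }

    // A check-in's ID is a hash over contents that include its parents'
    // hashes, so rewriting any ancestor changes every descendant's ID.
    // That's what makes history effectively immutable in both systems.
    function checkinId(c: Checkin): string {
      return createHash("sha256").update(JSON.stringify(c)).digest("hex");
    }

    const root: Checkin = { parents: [], tree: "t0", message: "initial" };
    const next: Checkin = { parents: [checkinId(root)], tree: "t1", message: "fix" };
    console.log(checkinId(next)); // changes if anything in root changes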

                                                                                                                          1. 1

A Merkle tree of commits containing the file contents.

                                                                                                                          2. 1

                                                                                                                            I’ve been using Fossil for my personal projects since summer 2020.

The main advantages I see compared to git are that it is fairly easy to back up or move to a new server (the repository is just a single SQLite database, not a folder in a custom key/value format), and that it is easy to give other contributors commit access (no shell access required).

Besides this, I’m also more inclined toward their philosophy of immutability, as I would otherwise spend way too much time making my commits look nice.

                                                                                                                          1. 8

                                                                                                                            This is somewhat humorous in light of the Torvalds-Tanenbaum debate.

                                                                                                                            (I know it’s not the same thing. I’m just imagining a future where Linux is a microkernel that does nothing more than present a bunch of io_uring interfaces that can be mapped into multiple processes and everything runs in userspace talking via ring buffers.)

                                                                                                                            1. 8

There’s been a fair amount of migration of functionality in both directions, I’d say. Yeah, there’s FUSE and this and stuff like DPDK, but at the same time we’ve also recently gained in-kernel TLS and an SMB server. And eBPF really blurs the lines a lot, because it’s IMO not even clear which “side” it falls on: it’s code supplied by userspace, but which executes in the kernel. It’s arguably not really microkernel-ish because it executes in the same shared address space as the rest of the (monolithic) kernel, but at the same time it also has ahead-of-time static checks that provide safety guarantees that largely negate the risks of that, so…shrug?

                                                                                                                              I feel like the microkernel-vs-monolithic kernel debate has reached similar territory as the RISC-vs-CISC debate in that it’s an eternally popular topic for armchair peanut-gallery argumentation that’s vastly disproportionate to how contentious it is in real systems design, and also tends to devolve into meta-arguments about what the “true” definition of each really is. Are there five angels perched atop that pin head or eight? I dunno, I’m not sure I’m interested enough at this point to pick a side…

                                                                                                                            1. 5

                                                                                                                              Company: Equinix Metal

                                                                                                                              Company site: https://metal.equinix.com/

                                                                                                                              Position: Senior Firmware Engineer

                                                                                                                              Location: Remote (US/UK/EU preferred, other locations potentially considered)

                                                                                                                              Description: Help develop OpenBMC for deployment on our bare-metal cloud servers! Past experience with Linux kernel/firmware/embedded development, electronics, and open-source community participation desired; see the posting for more information.

                                                                                                                              Tech stack: OpenBMC – Yocto/OpenEmbedded, Linux kernel, u-boot, C, C++, occasional bits of Rust.

                                                                                                                              Compensation: I don’t have any concrete numbers available, but it’s pretty competitive. US-based applicants may appreciate that health benefits have recently been expanded to cover travel and lodging.

                                                                                                                              Contact: email <zweiss at equinix.com>, or find me (username zevweiss) on the OpenBMC Discord server (please do get in touch if you’ve applied or have any questions!).

                                                                                                                              1. 1

Nearly impossible to detect on the running server. However, if you have something like a Pi-hole looking for DNS exfiltration attempts, this becomes much easier to detect. It does require multiple layers of protection, though; I’ll give it that.

                                                                                                                                1. 2

                                                                                                                                  Since I haven’t seen any mention of it tampering with the kernel or hooking actual syscalls (as opposed to userspace syscall wrappers), it sounds like its concealment mechanisms should be pretty simple to bypass using statically-linked executables? (A static busybox build, say.)

                                                                                                                                  1. 1

This was my take. LD_PRELOAD wouldn’t work in the statically linked context.

                                                                                                                                  2. 1

Or if you’re running in AWS, there’s also their GuardDuty alert, which I hope would pick it up: https://docs.aws.amazon.com/guardduty/latest/ug/guardduty_finding-types-ec2.html#backdoor-ec2-ccactivitybdns

                                                                                                                                    1. 1

                                                                                                                                      The grsecurity patchset includes a feature called Trusted Path Execution (TPE). It can integrate with the RTLD to completely mitigate LD_PRELOAD abuses. I’m working on implementing something similar in HardenedBSD this weekend. :-)

                                                                                                                                    1. 18

                                                                                                                                      As Github’s post notes, Atom is the thing that gave us Electron. That’s going to be with us a long time.

                                                                                                                                      1. 37

But other than that, Mrs. Lincoln, how was the play?

                                                                                                                                        1. 15

                                                                                                                                          Know what’s cooler than pissing and moaning about Electron? Taking a web codebase and getting three desktop apps for cheap. I run a 2013 MBP and good Electron apps are fine. What the hell are people complaining about? Is this some sort of purity test?

                                                                                                                                          1. 11

                                                                                                                                            For me, personally, many Electron apps are quite sluggish and far less resource efficient than native apps (Discord, Skype, etc)

                                                                                                                                            1. 1

                                                                                                                                              There’s definitely good and bad electron apps. Slack, VS Code and Spotify are very snappy and do enough stuff to justify the extra memory usage, while Discord and Signal are absolute turds.

                                                                                                                                              At the end of the day the memory usage thing is a bit of a canard. I have a 2018 laptop with 16GB of RAM, regularly run Goland, PyCharm (multiple instances), Spotify, Slack, Firefox, Chrome, VS Code, Obsidian… and still have a few GB of RAM to spare.

                                                                                                                                              1. 2

                                                                                                                                                So in other words, running a bunch of Electron apps takes gigabytes upon gigabytes of memory and you’d be screwed if you had only 8GB.

                                                                                                                                            2. 9

                                                                                                                                              Doing something for cheap that degrades the user experience is cool for managers, but not for users. If good Electron apps run fine on a 2013 laptop, think of what bad Electron apps do on a 2008 laptop.

                                                                                                                                              1. 1

                                                                                                                                                I grew up in a developing country, and even I find it hard to shed a tear for those poor people using a laptop that is almost a decade and a half old.

                                                                                                                                                1. 3

The slope of the Moore’s Law performance curve has leveled off significantly in the last ~10 years or so; the difference between a 2008 computer and a 2022 computer is a lot smaller than between a 1994 computer and a 2008 computer. If it works well enough (or perhaps, if the only reason it wouldn’t work well enough is shitty, resource-gluttonous software), why spend money and create more e-waste replacing it?

                                                                                                                                              2. 3

                                                                                                                                                Maybe desktop applications are not desirable. Maybe Web applications are an anti-pattern. It’s not a purity test, but a realization that we made an architectural wrong turn.

                                                                                                                                                1. 3

                                                                                                                                                  If you’re against desktop and web applications then are you for a new platform? How does it differ from the web beyond just being better somehow?

                                                                                                                                                  1. 1

                                                                                                                                                    Maybe the concept of “application” – the deliverable, the product, the artifact which is sold to consumers – is incorrect.

                                                                                                                                                    At a minimum, we could imagine replacing the Web with something even more amenable to destroying copyright and empowering users of Free Software. A content-addressed system is one possible direction.

                                                                                                                                                2. 2

Electron apps tend to crash my Wayland desktop if they’re not updated. They rarely are; Discord’s Electron, for instance, hasn’t been updated for ages.

                                                                                                                                                  Sure there are always ways to skirt around the issue, but we have a lot of resources yet most of them are spent on apps that run worse. Often those apps use Electron.

We shouldn’t have worse resource usage and wasted energy just because some managers think it’s cheap in the short run.

                                                                                                                                                  1. 1

                                                                                                                                                    Nobody’s forcing you to use those apps. Just take your money somewhere else and notify those managers that they are losing your business because they use electron.

                                                                                                                                                    1. 6

                                                                                                                                                      Pretty much everyone is forcing me to use Slack, though. To the point where it was actually a significant factor in finally deciding I had to buy a new computer a couple years ago, with as much RAM as I could fit in it so that Slack didn’t make the fan practically lift it off the desk. Yeah there’s a web client but it doesn’t have all the integrations and blah blah. And I’d say 90% of the companies & projects I’ve worked with & on in the last 3-4 years have required me to use it, no matter how much I don’t like it.

                                                                                                                                                      1. 6

                                                                                                                                                        I am forced to use Discord if I want the community around my games to thrive.

                                                                                                                                                        I’m locked in to these systems, that’s the whole point of making them like this.

                                                                                                                                                        Same with Slack, but for work.

Edit: Not to forget all my friends who use Discord, and good luck trying to convince them to use any alternatives.

                                                                                                                                                    2. 2

                                                                                                                                                      Yeah

                                                                                                                                                    3. 0

                                                                                                                                                      Sparks - So Tell Me Mrs. Lincoln Aside From That How Was The Play? (Official Audio) https://youtu.be/OuHGmtdJrDM?list=OLAK5uy_ntUoHXUt38rtp3L91dpdq-n7l776TF0nE

                                                                                                                                                    4. 1

node-webkit existed and was a big deal before that. Electron is not the breakthrough piece of software many assume it is.

                                                                                                                                                      1. 2

                                                                                                                                                        As did Microsoft’s HTA in 1998, but in the end, it was Electron that got mass adoption.

                                                                                                                                                    1. 6

                                                                                                                                                      Nice. For a while I used DNS TXT records as an alternative to twitter, though it fell into disuse and I recently abandoned it for a honk instance.

                                                                                                                                                      Also in the category of DNS-based hacks: iodine (IP-over-DNS tunnel).

                                                                                                                                                      1. 1

                                                                                                                                                        Hah, coincidentally, I too recently switched to a honk instance from Pleroma. Great stuff.

                                                                                                                                                      1. 16

                                                                                                                                                        In some ways, high-level languages with package systems are to blame for this. I normally code in C++ but recently needed to port some code to JS, so I used Node for development. It was breathtaking how quickly my little project piled up hundreds of dependent packages, just because I needed to do something simple like compute SHA digests or generate UUIDs. Then Node started warning me about security problems in some of those libraries. I ended up taking some time finding alternative packages with fewer dependencies.
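(In fairness, it turns out both of those particular examples are covered by Node built-ins these days, with no packages at all; a minimal sketch, assuming Node 15.6+ for crypto.randomUUID:)

    // SHA digest and UUID with only Node built-ins: zero npm packages.
    // (Assumes Node >= 15.6 for crypto.randomUUID.)
    import { createHash, randomUUID } from "node:crypto";

    const digest = createHash("sha256").update("some data").digest("hex");
    const uuid = randomUUID(); // random (version 4) UUID
    console.log(digest, uuid);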

                                                                                                                                                        On the other hand, I’m fairly sympathetic to the way modern software is “wasteful”. We’re trading CPU time and memory, which are ridiculously abundant, for programmer time, which isn’t. It’s cool to look at how tiny and efficient code can be — a Scheme interpreter in 4KB! The original Mac OS was 64KB! — but yowza, is it ever difficult to code that way.

                                                                                                                                                        There was an early Mac word processor — can’t remember the name — that got a lot of love because it was super fast. That’s because they wrote it in 68000 assembly. It was successful for some years, but failed by the early 90s because it couldn’t keep up with the feature set of Word or WordPerfect. (I know Word has long been a symbol of bloat, but trust me, Word 4 and 5 on Mac were awesome.) Adding features like style sheets or wrapping text around images took too long to implement in assembly compared to C.

                                                                                                                                                        The speed and efficiency of how we’re creating stuff now is crazy. People are creating fancy OSs with GUIs in their bedrooms with a couple of collaborators, presumably in their spare time. If you’re up to speed with current Web tech you can bring up a pretty complex web app in a matter of days.

                                                                                                                                                        1. 24

                                                                                                                                                          I don’t know, I think there’s more to it than just “these darn new languages with their package managers made dependencies too easy, in my day we had to manually download Boost uphill both ways” or whatever. The dependencies in the occasional Swift or Rust app aren’t even a tenth of the bloat on my disk.

                                                                                                                                                          It’s the whole engineering culture of “why learn a new language or new API when you can just jam an entire web browser the size of an operating system into the application, and then implement your glorified scp GUI application inside that, so that you never have to learn anything other than the one and only tool you know”. Everything’s turned into 500megs worth of nail because we’ve got an entire generation of Hammer Engineers who won’t even consider that it might be more efficient to pick up a screwdriver sometimes.

                                                                                                                                                          We’re trading CPU time and memory, which are ridiculously abundant, for programmer time, which isn’t

                                                                                                                                                          That’s the argument, but it’s not clear to me that we haven’t severely over-corrected at this point. I’ve watched teams spend weeks poking at the mile-high tower of leaky abstractions any react-native mobile app teeters atop, just to try to get the UI to do what they could have done in ten minutes if they’d bothered to learn the underlying platform API. At some point “make all the world a browser tab” became the goal in-and-of-itself, whether or not that was inefficient in every possible dimension (memory, CPU, power consumption, or developer time). It’s heretical to even question whether or not this is truly more developer-time-efficient anymore, in the majority of cases – the goal isn’t so much to be efficient with our time as it is to just avoid having to learn any new skills.

                                                                                                                                                          The industry didn’t feel this sclerotic and incurious twenty years ago.

                                                                                                                                                          1. 7

                                                                                                                                                            It’s heretical to even question whether or not this is truly more developer-time-efficient anymore

                                                                                                                                                            And even if we set that question aside and assume that it is, it’s still just shoving the costs onto others. Automakers could probably crank out new cars faster by giving up on fuel-efficiency and emissions optimizations, but should they? (Okay, left to their own devices they probably would, but thankfully we have regulations they have to meet.)

                                                                                                                                                            1. 1

                                                                                                                                                              left to their own devices they probably would, but thankfully we have regulations they have to meet.

                                                                                                                                                              Regulations. This is it.

I’ve long believed that this is very important in our industry. As earlier comments say, you can make a complex web app after work in a weekend. But then there are people, in the auto industry mentioned above, who take three sprints to set up a single screen with a table, a popup, and two forms. That’s after they’ve pulled in the entire internet’s worth of dependencies.

On the one hand, we don’t want to be gatekeeping. We want everyone to contribute. When dhh said we should stop celebrating incompetence, the majority of people around him called this gatekeeping. Yet when we see or say something like this - don’t build bloat, or something along those lines - everyone agrees.

I think the right line is somewhere in between. Let individuals do whatever the hell they want, but regulate “selling” stuff for money or advertisement eyeballs or anything similar. If an app is more than X MB (some reasonable target), it has to get certified before you can publish it. Or maybe only if it’s a popular app. Or, if a library is included in more than X apps, then that lib either gets “certified” or further apps using it are banned.

I am sure that is a huge, immensely big can of worms. There will be many problems there. But if we don’t start cleaning up the shit, it’s going to pile up.

A simple example - if controversial - is Google. When they start punishing a web app for not rendering within 1 second, everybody on the internet (that wants to be at the top of Google) starts optimizing for performance. So, it can be done. We just have to set up - and maintain - a system that deals with the problem… well, systematically.

                                                                                                                                                            2. 1

                                                                                                                                                              why learn a new language or new API when you can just jam an entire web browser the size of an operating system into the application

                                                                                                                                                              Yeah. One of the things that confuses me is why apps bundle a browser when platforms already come with browsers that can easily be embedded in apps. You can use Apple’s WKWebView class to embed a Safari-equivalent browser in an app that weighs in at under a megabyte. I know Windows has similar APIs, and I imagine Linux does too (modulo the combinatorial expansion of number-of-browsers times number-of-GUI-frameworks.)

                                                                                                                                                              I can only imagine that whoever built Electron felt that devs didn’t want to deal with having to make their code compatible with more than one browser engine, and that it was worth it to shove an entire copy of Chromium into the app to provide that convenience.

                                                                                                                                                              1. 1

                                                                                                                                                                Here’s an explanation from the Slack developer who moved Slack for Mac from WebKit to Electron. And on Windows, the only OS-provided browser engine until quite recently was either the IE engine or the abandoned EdgeHTML.

                                                                                                                                                            3. 10

                                                                                                                                                              On the other hand, I’m fairly sympathetic to the way modern software is “wasteful”. We’re trading CPU time and memory, which are ridiculously abundant, for programmer time, which isn’t.

                                                                                                                                                              The problem is that your dependencies can behave strangely, and you need to debug them.

                                                                                                                                                              Code bloat makes programs hard to debug. It costs programmer time.

                                                                                                                                                              1. 3

                                                                                                                                                                The problem is that your dependencies can behave strangely, and you need to debug them.

                                                                                                                                                                To make matters worse, developers don’t think carefully about which dependencies they’re bothering to include. For instance, if image loading is needed, many applications could get by with image read support for one format (e.g. with libpng). Too often I’ll see an application depend on something like ImageMagick which is complete overkill for that situation, and includes a ton of additional complex functionality that bloats the binary, introduces subtle bugs, and wasn’t even needed to begin with.

                                                                                                                                                              2. 10

                                                                                                                                                                On the other hand, I’m fairly sympathetic to the way modern software is “wasteful”. We’re trading CPU time and memory, which are ridiculously abundant, for programmer time, which isn’t.

                                                                                                                                                                The problem is that computational resources vs. programmer time is just one axis along which this tradeoff is made: some others include security vs. programmer time, correctness vs. programmer time, and others I’m just not thinking of right now I’m sure. It sounds like a really pragmatic argument when you’re considering your costs because we have been so thoroughly conditioned into ignoring our externalities. I don’t believe the state of contemporary software would look like it does if the industry were really in the habit of pricing in the costs incurred by others in addition to their own, although of course it would take a radically different incentive landscape to make that happen. It wouldn’t look like a code golfer’s paradise, either, because optimizing for code size and efficiency at all costs is also not a holistic accounting! It would just look like a place with some fewer amount of data breaches, some fewer amount of corrupted saves, some fewer amount of Watt-hours turned into waste heat, and, yes, some fewer amount of features in the case where their value didn’t exceed their cost.

                                                                                                                                                                1. 7

                                                                                                                                                                  We’re trading CPU time and memory, which are ridiculously abundant, for programmer time, which isn’t

But we aren’t, because modern resource-wasteful software isn’t really released quicker. Quite the contrary: there is so much development overhead that we don’t see those exciting big releases anymore, the ones with a dozen features everyone loves at first sight. New features get released in increments so microscopic that hardly any project survives 3-5 years without becoming obsolete or falling out of fashion.

What we are trading is quality for quantity. We lower the skill and knowledge barrier so much to accommodate the millions of developers who “learned how to program in one week”, and the results are predictably what this post talks about.

                                                                                                                                                                  1. 6

                                                                                                                                                                    I’m as much against bloat as everyone else (except those who make bloated software, of course—those clearly aren’t against it). However, it’s easy to forget that small software from past eras often couldn’t do much. The original Mac OS could be 64KB, but no one would want to use such a limited OS today!

                                                                                                                                                                    1. 5

                                                                                                                                                                      The original Mac OS could be 64KB, but no one would want to use such a limited OS today!

                                                                                                                                                                      Seems some people (@neauoire) do want exactly that: https://merveilles.town/@neauoire/108419973390059006

                                                                                                                                                                      1. 6

                                                                                                                                                                        I have yet to see modern software that is saving the programmer’s time.

                                                                                                                                                                        I’m here for it, I’ll be cheering when it happens.

                                                                                                                                                                        This whole thread reminds me of a little .txt file that came packaged into DawnOS.

                                                                                                                                                                        It read:

                                                                                                                                                                        Imagine that software development becomes so complex and expensive that no software is being written anymore, only apps designed in devtools. Imagine a computer, which requires 1 billion transistors to flicker the cursor on the screen. Imagine a world, where computers are driven by software written from 400 million lines of source code. Imagine a world, where the biggest 20 technology corporation totaling 2 million employees and 100 billion USD revenue groups up to introduce a new standard. And they are unable to write even a compiler within 15 years.

                                                                                                                                                                        “This is our current world.”

                                                                                                                                                                        1. 11

                                                                                                                                                                          I have yet to see modern software that is saving the programmer’s time.

                                                                                                                                                                          People love to hate Docker, but having had the “pleasure” of doing everything from full-blown install-the-whole-world-on-your-laptop dev environments to various VM applications that were supposed to “just work”… holy crap does Docker save time not only for me but for people I’m going to collaborate with.

                                                                                                                                                                          Meanwhile, programmers of 20+ years prior to your time are equally as horrified by how wasteful and disgusting all your favorite things are. This is a never-ending cycle where a lot of programmers conclude that the way things were around the time they first started (either programming, or tinkering with computers in general) was a golden age of wise programmers who respected the resources of their computers and used them efficiently, while the kids these days have no respect and will do things like use languages with garbage collectors (!) because they can’t be bothered to learn proper memory-management discipline like their elders.

                                                                                                                                                                          1. 4

I’m of the generation that started programming at the tail end of Ruby and Objective-C, and I would definitely not call that the golden age; if anything, looking back at this period now, it looks like a mid-slump.

                                                                                                                                                                          2. 4

                                                                                                                                                                            I have yet to see modern software that is saving the programmer’s time.

                                                                                                                                                                            What’s “modern”? Because I would pick a different profession if I had to write code the way people did prior to maybe the late 90s (at minimum).

                                                                                                                                                                            Edit: You can pry my modern IDEs and toolchains from my cold, dead hands :-)

                                                                                                                                                                      2. 6

Node is an especially good villain here because JavaScript has long specifically encouraged lots of small dependencies and has little to no stdlib, so you need a package for nearly everything.

                                                                                                                                                                        1. 5

                                                                                                                                                                          It’s kind of a turf war as well. A handful of early adopters created tiny libraries that should be single functions or part of a standard library. Since their notoriety depends on these libraries, they fight to keep them around. Some are even on the boards of the downstream projects and fight to keep their own library in the list of dependencies.

                                                                                                                                                                        2. 6

                                                                                                                                                                          We’re trading CPU time and memory, which are ridiculously abundant

                                                                                                                                                                          CPU time is essentially equivalent to energy, which I’d argue is not abundant, whether at the large scale of the global problem of sustainable energy production, or at the small scale of mobile device battery life.

                                                                                                                                                                          for programmer time, which isn’t.

In terms of programmer-hours available per year (which of course unit-reduces to active programmers), I’m pretty sure that resource is more abundant than it’s ever been at any point in history, and it’s only getting more so.

                                                                                                                                                                          1. 2

                                                                                                                                                                            CPU time is essentially equivalent to energy

                                                                                                                                                                            When you divide it by the CPU’s efficiency, yes. But CPU efficiency has gone through the roof over time. You can get embedded devices with the performance of some fire-breathing tower PC of the 90s, that now run on watch batteries. And the focus of Apple’s whole line of CPUs over the past decade has been power efficiency.

                                                                                                                                                                            There are a lot of programmers, yes, but most of them aren’t the very high-skilled ones required for building highly optimal code. The skills for doing web dev are not the same as for C++ or Rust, especially if you also constrain yourself to not reaching for big pre-existing libraries like Boost, or whatever towering pile of crates a Rust dev might use.

                                                                                                                                                                            (I’m an architect for a mobile database engine, and my team has always found it very difficult to find good developers to hire. It’s nothing like web dev, and even mobile app developers are mostly skilled more at putting together GUIs and calling REST APIs than they are at building lower-level model-layer abstractions.)

                                                                                                                                                                          2. 2

Hey, I don’t mean to be a smart-ass here, but I find it ironic that you start your comment blaming “high-level languages with package systems” and immediately admit that you blindly picked a library for the job, and that you could solve the problem just by “taking some time finding alternative packages with fewer dependencies”. Honestly, that doesn’t sound like a problem with either the language or the package manager.

                                                                                                                                                                            What would you expect the package manager to do here?

                                                                                                                                                                            1. 8

I think the problem here actually does lie with the language. JavaScript has such a piss-poor standard library and such dangerous semantics (which the standard library doesn’t try to remedy, either) that sooner rather than later you will have a transitive dependency on isOdd, isEven, and isNull, because even those simple operations aren’t exactly simple in JS.

Despite being made to live in a web browser, the JS standard library has very few affordances for working with things like URLs, and despite being targeted toward user interfaces, it has very few affordances for working with dates, numbers, lists, or localisation. This makes dependency graphs both deep and full of duplicated effort, since two dependencies in your program may depend on different third-party implementations of what should already be in the standard library, themselves duplicating what you already have in your operating system.
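To make the isOdd example concrete: the obvious one-liner is subtly wrong in JS, which is exactly how such packages earn their downloads. (A sketch, not an endorsement.)

    // The "obvious" version fails for negative numbers: JS's % operator
    // keeps the sign of the dividend, so -3 % 2 === -1, not 1.
    const isOddNaive = (n: number): boolean => n % 2 === 1;
    console.log(isOddNaive(-3)); // false -- wrong

    // A correct version also has to decide how to treat non-integers.
    const isOdd = (n: number): boolean =>
      Number.isInteger(n) && Math.abs(n % 2) === 1;
    console.log(isOdd(-3));  // true
    console.log(isOdd(2.5)); // false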

                                                                                                                                                                              1. 2

                                                                                                                                                                                It’s really difficult for me to counter an argument that it’s basically “I don’t like JS”. The question was never about that language, it was about “high-level languages with package systems” but your answer hyper focuses on JS and does not address languages like python for example, that is a “high-level language with a package system”, which also has an “is-odd” package (which honestly I don’t get what that has to do with anything).

                                                                                                                                                                                1. 1

                                                                                                                                                                                  The response you were replying to was very much about JS:

                                                                                                                                                                                  In some ways, high-level languages with package systems are to blame for this. I normally code in C++ but recently needed to port some code to JS, so I used Node for development. It was breathtaking how quickly my little project piled up hundreds of dependent packages, just because I needed to do something simple like compute SHA digests or generate UUIDs.

                                                                                                                                                                                  For what it’s worth, whilst Python may have an isOdd package, how often do you end up inadvertently importing it in Python as opposed to “batteries-definitely-not-included” Javascript? Fewer batteries included means more imports by default, which themselves depend on other imports, and a few steps down, you will find leftPad.

                                                                                                                                                                                  As for isOdd, npmjs.com lists 25 versions thereof, and probably as many isEven.

                                                                                                                                                                                  1. 1

                                                                                                                                                                                    and a few steps down, you will find leftPad

                                                                                                                                                                                    What? What kind of data do you have to back up a statement like this?

You don’t like JS, I get it; I don’t like it either. But the unfair criticism is what really rubs me the wrong way. We are technical people; we are supposed to make decisions based on data. This kind of comment just generates division without the slightest resemblance of a solid argument, and it does no good to a healthy discussion.

Again, none of these arguments are true for JS exclusively. Python is batteries-included, sure, but it’s one of the few. And you conveniently leave out of your quote the part where the OP admits that with a little effort the “problem” became a non-issue. And that little effort is what we get paid for; that’s our job.

                                                                                                                                                                              2. 3

                                                                                                                                                                                I’m not blaming package managers. Code reuse is a good idea, and it’s nice to have such a wealth of libraries available.

But it’s a double-edged sword, especially when you use a highly dynamic language like JS that doesn’t support dead-code stripping or build-time inlining, so you end up having to copy an entire library instead of just the bits you’re using.
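As a concrete illustration (lodash is just a familiar example here; it does ship per-method entry points in v4, and bundlers with good tree-shaking can change the math):

    // Importing the whole library: with dynamic property access and no
    // reliable dead-code elimination, all of lodash lands in the bundle
    // even if you only ever call one function.
    import _ from "lodash";
    console.log(_.isEqual({ a: 1 }, { a: 1 }));

    // Importing just the bit you use keeps the cost proportional to
    // what you actually call (uses lodash 4's per-method module paths).
    import isEqual from "lodash/isEqual";
    console.log(isEqual({ a: 1 }, { a: 1 }));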

                                                                                                                                                                              3. 1

                                                                                                                                                                                On the other hand, I’m fairly sympathetic to the way modern software is “wasteful”. We’re trading CPU time and memory, which are ridiculously abundant, for programmer time, which isn’t.

                                                                                                                                                                                We’re trading CPU and memory for the time of some programmers, but we’re also adding the time of other programmers onto the other side of the balance.

                                                                                                                                                                                1. 1

                                                                                                                                                                                  I definitely agree with your bolded point - I think that’s the main driver for this kind of thing.

                                                                                                                                                                                  Things change if there’s a reason for them to be changed. The incentives don’t really line up currently to the point where it’s worth it for programmers/companies to devote the time to optimize things that far.

                                                                                                                                                                                  That is changing a bit already, though. For example, performance and bundle size are getting seriously considered for web dev these days. Part of the reason for that is that Google penalizes slow sites in their rankings - a very direct incentive to make things faster and more optimized!

                                                                                                                                                                                1. 1

The logging code records a thread ID, but I don’t see any mention of what use (if any) the trace replay makes of it. Using it to model concurrency could make the replay more accurate for things like lock contention, but then you get into questions of how the performance differences between allocators affect that concurrency: two allocations that were concurrent under the original allocator might not have ended up that way under another.
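
To make the distinction concrete, here’s an invented shape for such a trace plus the naive, thread-ID-ignoring replay; all names are made up, and this is not the linked tool’s actual format:

```typescript
// Hypothetical trace record; field names are invented.
interface TraceEvent {
  threadId: number; // recorded, but is it used on replay?
  op: "alloc" | "free";
  size: number;     // 0 for frees
  id: number;       // ties a free back to its alloc
}

// Serial replay ignores threadId entirely, so it can't reproduce
// lock contention, but it also sidesteps the problem above, where
// a different allocator changes which allocations are concurrent.
function replaySerially(
  events: TraceEvent[],
  alloc: (size: number) => unknown,
  free: (obj: unknown) => void,
): void {
  const live = new Map<number, unknown>();
  for (const ev of events) {
    if (ev.op === "alloc") {
      live.set(ev.id, alloc(ev.size));
    } else {
      free(live.get(ev.id));
      live.delete(ev.id);
    }
  }
}
```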

                                                                                                                                                                                  1. 3

My experience is exactly the opposite. We all know that the plural of anecdote isn’t data, but still: I’ve had EXT3/4 partitions suffer all kinds of power losses and hardware failures, and they were always recoverable.

But the one time I decided to go with XFS for the root partition, a power-loss event killed it instantly. The data partition, which was EXT3, just needed a routine fsck. I haven’t used XFS since: once bitten, twice shy, you know.

                                                                                                                                                                                    I still haven’t tried BTRFS, so I can’t say anything on that subject yet.

                                                                                                                                                                                    1. 2

Out of curiosity, how long ago was this? I know XFS used to have pretty significant reliability issues, but I’ve been using it for quite a while now without any problems.

                                                                                                                                                                                      1. 1

That XFS incident was in 2008 or so, a very long time ago. None of my friends use XFS, so I had no way of knowing whether it had improved, and after that event I didn’t feel like trying it again without solid proof that it had. :)

                                                                                                                                                                                        1. 1

Ah, interesting. By around 2012 a lot of major improvements had either very recently been, or were soon to be, pushed into XFS (https://xfs.org/images/d/d1/Xfs-scalability-lca2012.pdf), including the addition of checksums on metadata. I do know it had a strong tendency to lose data on power loss in the past, but as some very anecdotal evidence: I’ve been using it for a few years now on my personal system, and it’s endured at least several dozen forced shutdowns without data loss.

                                                                                                                                                                                      2. 2

Indeed – this post and the ensuing discussion reinforce my belief in my Grand Unifying Theory of Filesystems.

                                                                                                                                                                                        1. 1

                                                                                                                                                                                          Paraphrase:

For all filesystems, there exists a user who says “$fs ate my data”.

                                                                                                                                                                                      1. 1

“any kernel version” – that’s probably not true.

                                                                                                                                                                                        1. 3

If you take a maximally literal interpretation, sure, it’s not going to work on HURD or FreeBSD or Linux 2.4, but I think it would be fair to interpret it as meaning it will work with any kernel for which that could reasonably be expected (i.e., a recent-ish Linux with CONFIG_IO_URING=y).