1. 8

    I’ve said it before and I’ll say it again: ZFS should be the default on all Linux distros. It’s in a league of its own, and makes all other existing Linux filesystems irrelevant, bizarre licensing issues be damned.

    1. 7

      I use ZFS and love it. But I disagree that ZFS should be the default as-is. It requires a fair bit of tuning, the ARC in particular for non-server workloads. ZFS does not use Linux’s buffer cache, and while the ARC size adapts, I have often seen on lower-memory machines that the ARC takes too much memory at a given point, leaving too little for the OS and applications. So most users would want to tune zfs_arc_max for their particular workload.

      I do think ZFS should be available as an option in all Linux distributions. It is simply better than the filesystems that are currently provided in the kernel. (Maybe bcachefs will be a competent alternative in the future.)
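
      As a rough illustration of the zfs_arc_max tuning mentioned above, here is a minimal Python sketch that derives a cap from total RAM and formats it as a kernel module option line. The 25% fraction and the helper names are assumptions made up for this example, not recommendations; the right value depends on the workload.

```python
import os


def total_ram_bytes() -> int:
    """Physical RAM in bytes, read via sysconf (available on Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def arc_max_option(ram_bytes: int, fraction: float = 0.25) -> str:
    """Format a zfs_arc_max cap (in bytes) as a modprobe options line.

    `fraction` is an arbitrary illustrative default, not a tuned value.
    """
    arc_max = int(ram_bytes * fraction)
    return f"options zfs zfs_arc_max={arc_max}"


if __name__ == "__main__":
    # The printed line would typically live in a file such as
    # /etc/modprobe.d/zfs.conf and take effect when the module is loaded.
    print(arc_max_option(total_ram_bytes()))
```

      On a 16 GiB machine this would cap the ARC at 4 GiB; a desktop distro could ship something like it as a default and let users override it.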

      1. 2

        I agree.

        I remember installing FreeBSD 11 once (with root on ZFS) because I needed a machine remotely accessible via SSH to handle files on an existing disk with ZFS.

        No shizzle, FreeBSD defaults, the machine had 16G of RAM, and during an hours long scp run, ARC decided to eat up all the memory, triggering the kernel into killing processes… including SSH.

        So I lost access, had to restart scp again (no resume, remember), etc. This is a huge show stopper and it should never happen.

        1. 1

          That seems like a bug that should be fixed. Don’t see any reason why that should prevent it from being the default though.

        2. 1

          That’s definitely something to consider, however, Apple has made APFS (ZFS inspired) the default on macOS, so there’s got to be a way to make it work for ZFS + Linux Desktop too. ZFS is all about making things work without you having to give it much thought. Desktop distros can pick reasonable defaults for desktop use, and ZFS could possibly make the parameter smarter somehow.

        3. 2

          I think the licensing issue is the primary problem for Linux distros.

          1. 1

            I agree on technical superiority. What about the Oracle threat given its owner pulled off that API trick? Should we take the risk of all owing Oracle’s lawyers money in some future case? Or rush to implement something different that they don’t control with most of its strengths? I think the latter makes the most sense in the long-term.

            1. 3

              Oracle is not a problem, as the ZFS license is not being violated – it is the Linux license.

              1. 1

                “Oracle is not a problem, as the ZFS license is not being violated”

                That’s a big claim to make in the event large sums of money are ever involved. Oracle threw massive numbers of lawyers at Google, with the result that APIs were suddenly a thing they could copyright. Nobody knew that before. With enough money and malicious intent, it became a thing that could affect FOSS developers or anyone building on proprietary platforms. What will they do next?

                I don’t know. Given they’re malicious, the safest thing is to not use anything they own or might have patents on. Just stay as far away from every sue-happy party in patent and copyright spaces. Oracle is a big one that seeks big damages for its targets on top of trying to rewrite the law in cases. I steer clear of their stuff. We don’t even need it, either. It’s just more convenient than alternatives.

                1. 8

                  The CDDL, an OSI-approved open source license, includes both a copyright and patent grant for all of the code released by Sun (now Oracle). Oracle have sued a lot of people for a lot of things, but they haven’t come after illumos or OpenZFS and there are definitely companies using both of those bodies of software to make real money.

                  1. 2

                    I think you’re missing the implications of how they effectively rewrote the law in the case I referenced. If they can do that, it might not matter what their agreements say if it’s their property. The risk might be low enough that it never plays out. One just can’t ever know if they depend on legal provisions with a malicious party that tries to rewrite laws in its favor with lobbyists and lawyers.

                    And sometimes succeeds unlike basically everyone doing open source and free software. Those seem to barely enforce their agreements and/or be vulnerable to patent suits in case of the permissive licenses. Plus, could the defenders even afford a trial at the current rates?

                    I bet 10 years ago you wouldn’t have guessed a mobile supplier using an open-ish platform would be fighting to avoid giving over $8 billion to an enterprise-focused database company. Yet, untrustworthy dependencies let that happen. And we got lucky that it was a rich company depending on OSS/FOSS stuff that was defending. The rulings could’ve been worse for us if it wasn’t Google.

                    1. 6

                      Seeing as Sun gave ZFS away before Oracle bought it, Oracle would need a LOT of legal wackiness to get the CDDL license revoked somehow. But for the sake of argument, let’s assume they do somehow manage to get it invalidated, went nuts, and decided to try to make everyone currently using ZFS pay bajillions of dollars for “their” tech. Laws would have to change significantly for that to happen, and with such a significant change in current law, there is basically zero chance it would be retroactive from the moment you started using ZFS, so worst case you’d have to pay from the time of the law change. That is, if you didn’t just move off of ZFS after the law changed and be out zero dollars.

                      Also, the OSS version of ZFS is now so significantly different from Oracle’s version that they are sort of kissing cousins at best. ZFS has been CDDL licensed since 2005, so there is a long history of divergence from the Oracle version. I think Oracle would have a VERY hard time getting the OSS version back under the Oracle banner(s). Even with very hypothetical significant law changes.

                      I’m in favour of things competing against ZFS, but currently nothing really does. BTRFS tries, but its stability record is pretty miserable for anything besides the simplest workloads. ZFS has been in development since 2001 and in wide production use since the mid-2000s. Maybe in another 5 or 10 years we will have a decent stable competitor to some of ZFS’s feature-sets.

                      But regardless if you are a large company with something to lose, your lawyers will be the ones advising you about using ZFS or not, and Canonical’s lawyers clearly decided there was nothing to worry about, Along with Samsung(who own Joyent, the people behind Illumos). There are also many other large companies that have bet big on Oracle having basically zero legal leg to stand on.

                      Of course the other side of the coin is the ZFS <-> Linux marriage, but that’s easy: just don’t run ZFS under Linux, or use the Canonical-shipped version and let Canonical take all the legal heat.

                      1. 2

                        Best counterpoints so far. I’ll note this part might not be as strong as you think:

                        “and Canonical’s lawyers clearly decided there was nothing to worry about, Along with Samsung(who own Joyent, the people behind Illumos)”

                        The main way companies dodge suits is to have tons of money and patents themselves to make the process expensive as hell for anyone that tries. Linux companies almost got patent sued by Microsoft. IBM, a huge patent holder, stepped up saying they’d deal with anyone that threatened it. They claimed they were putting a billion dollars into Linux. Microsoft backed off. That GPL companies aren’t getting sued made Canonical’s lawyers comfortable, but it is not an actual assurance. Samsung is another giant patent holder with big lawyers. It takes an Apple-sized company to want to sue them.

                        So, big patent holders or projects they protect are outliers. That might work to ZFS’s advantage here. Especially if IBM used it. They don’t prove what will happen with smaller companies, though.

                        1. 2

                          I agree with you in theory, but not in practice because of the CDDL (which ZFS is licensed under). This license explicitly grants a “patent peace” see: https://en.wikipedia.org/wiki/Common_Development_and_Distribution_License

                          I know most/many OSS licenses sort of wimp out on patents and ignore the problem, CDDL doesn’t. Perhaps it could have even stronger language, and there might be some wiggle room for some crazy lawyering.. I just don’t really see Oracle being THAT crazy. Oracle, being solely focused on $$$$, would have to see some serious money bags to go shake loose, I doubt they would ever bother going after anyone not the size of a Fortune 500, the money just isn’t there. Google has giant bags full of money they don’t even know what to do with, so Oracle trying to steal a few makes sense. :P

                          Oracle going after Google makes sense knowing Oracle, and it was, like you said, brand new lawyering, trying to create copyrights out of APIs. Patents are not remotely new. So some lawyer for Oracle would have to dream up some new way to screw up laws to their advantage. Possible sure, but it would be possible for any other crazy lawyer to go nuts here (wholly unrelated to ZFS or even technology), it’s not an Oracle exclusive idiocy. Trying to avoid unknown lawyering that’s not even theoretical at this point would be sort of stupid I would think… but I’m not a lawyer.

                          1. 1

                            “I know most/many OSS licenses sort of wimp out on patents and ignore the problem, CDDL doesn’t.”

                            That would be reassuring on the patent part.

                            “Possible sure, but it would be possible for any other crazy lawyer to go nuts here (wholly unrelated to ZFS or even technology), it’s not an Oracle exclusive idiocy. Trying to avoid unknown lawyering”

                            Oracle was the only one to flip software copyright on its head like this. So, I don’t think it’s an any company thing. Either way, the threat I’m defending against isn’t unknown lawyering in general: it’s unknown lawyering of a malicious company whose private property I may or may not depend on. When you frame it that way, one might wonder why anyone would depend on a malicious company at all. Avoiding that is a good pattern in general. Then, the license negates some amount of that potential malice for a great product with unknown, residual risk.

                            I agree the residual risk probably won’t affect individuals, though. An Oracle-driven risk might affect small to mid-sized businesses depending on how it plays out. Good news is swapping filesystems isn’t very hard on Linux and BSD’s. ;)

                  2. 4

                    AFAIK, it’s the GPL that’s being violated. But I’m really tired and the SFC does mention something about Oracle suing so 🤷.

                    Suing based on the use of works derived from Oracle’s CDDL sources would be a step further than the dumb Google Java lawsuit because they haven’t gone after anyone for using OpenJDK-based derivatives of Java. Oracle’s lawsuit-happy nature would, however, mean that a reimplementation of ZFS would be a bigger target because it doesn’t have the CDDL patent grant. Of course, any file system that implements one of their dumb patents could be at risk….

                    I miss Sun!

              2. 1

                What does ZFS have that is so much better than btrfs?

                I’m also not sure these types of filesystems are well suited for databases which implement their own transactions and COW, so I’m not sure I would go as far as saying they are all irrelevant.

                1. 11

                  ZFS is extremely stable and battle-tested. While that’s not a reason in itself to call it a better filesystem, it makes it an extremely safe option when what you’re looking for is something stable to keep your data consistent.

                  It is also one of the most cross-platform file systems: Linux, FreeBSD, macOS, Windows, Illumos. It has a huge amount of development behind it, and as of recently the community has come together significantly across the platforms. Being able to export your pool on FreeBSD and import it on Linux or another platform makes it a much better option if you want to avoid lock-in.

                  Additionally, the ARC

                  Problems with btrfs that make it not ready:

                  1. 0

                    If I don’t use/want to use RAID5 then I don’t see the problem with btrfs.

                    1. 3

                      I ran btrfs in production on my home server for ~3-4 years, IIRC. If you want to use btrfs as a better ext4, e.g. just for the compression and checksumming and maybe, maybe snapshotting, then you’re probably fine. If you want to do anything beyond that, I would not trust it with your data. Or at the very least, I wouldn’t trust it with your data that’s not backed up using something that has nothing to do with btrfs (i.e. is not btrfs snapshots and is not btrfs send/receive).

                      I had three distinct crashes/data corruption problems that damaged the filesystem badly enough that I had to back up and run mkfs.btrfs again. These were mostly caused by interruptions/power failures while I was making changes to the fs, for example removing a device or rebalancing or something. Honestly I’ve forgotten the exact details now, otherwise I’d say something less vague. But the bottom line is that it simply lacks polish. And mind you, this is from the filesystem that is supposed to be explicitly designed to resist this kind of corruption. I know at least the last case of corruption I had (which finally made me move to ZFS) was obviously preventable but that failure handling hadn’t been written yet and so the fs got into a state that the kernel didn’t know how to handle.

                  2. 3

                    Well, I don’t know about better, but ZFS has the distinct disadvantage of being an out-of-tree filesystem, so it can and will break depending completely on the whims of kernel development. How anyone can call this stable and safe for production use is beyond me.

                    1. 3

                      I think the biggest argument is mature implementations used by large numbers of people. That catches lots of common and uncommon problems. In reliability-focused filesystems, that the reliability is field-proven then constantly maintained is more important to me than about anything. The only reason I don’t use it is that it came from Oracle with all the legal unknowns that can bring down the line.

                      1. 3

                        When you say “Oracle”, are you referring to ZFS or btrfs? ;)

                        1. 1

                          Oh shit! I didn’t know they designed both! Glad I wasn’t using btrfs either. Thanks for the tip haha.

                      2. 2

                        On a practical level, ZFS is a lot more tested (in Solaris/Illumos, FreeBSD, and now Linux); more different people have put more terabytes of data in and out of ZFS than they seem to have for btrfs. This matters because we seem to be unable to build filesystems that don’t run into corner cases sooner or later, so the more time and data a filesystem has handled, the more corner cases have been turned up and fixed.

                        On a theoretical level, my personal view is that ZFS picked a better internal structure for how its storage is organized and managed than btrfs did (unless btrfs drastically changed things since I last looked several years ago). To put it simply, ZFS is a volume manager first and then a filesystem manager second (on top of the volumes), while btrfs is (or was) the other way around (you manage filesystems and volumes are a magical side effect). ZFS’s model does more (obvious) violence to Linux IO layering than I think btrfs’s does, but I strongly believe it is the better one and gives you cleaner end results.

                      3. 0

                        Why would I want to run ZFS on my laptop?

                        1. 1

                          Why wouldn’t you want to run it on your laptop?

                      1. 3

                        How come “(benevolent) dictator for life” is OK for pudgy but lovable Western European men, but is not ok for corporate entities from the USA?

                        1. 63

                          Shall I compare thee to a billion-dollar corporation? Thou art more lovely and more temperate,

                          1. 12

                            Because ultimately, when the community VERY LOUDLY made it known to Guido that type system theory may be a fascinating pursuit but that radically altering the Python language to sprout that kind of feature isn’t something some people are interested in, he stood aside and allowed the language governance to change to meet the needs of the community.

                            What I’m reading from this article (Not a Go fan, no skin in that game. Feels like a time warp back to K&R C circa 1988 to me.) is that there is exactly zero chance of this ever happening with Go.

                            1. 5

                              “when the community VERY LOUDLY made it known to Guido [..], he stood aside and allowed the language governance to change to meet the needs of the community.”

                              Wait, you tout it as a good thing? Being very loud rarely has anything to do with being right or useful. And making a smart person cave in to harsh pressure is not really an achievement.

                              (Mind, I’m not discussing the technical point about types itself.)

                              1. 3

                                From my perspective, despite the phrasing I chose in my previous comment, this isn’t at all about being loud.

                                It’s about defining the technical direction of a piece of technology that is used and relied upon by a vast number of people with very varied use cases.

                                Nobody is saying that Guido shouldn’t design type theory oriented extensions to Python, what people are objecting to is radically altering the syntax of the language standard in order to further those goals.

                                For many people, what we value about Python is its un-clever, un-cluttered syntax, and for a substantial number of people, the directions Guido wanted to take the language were not conducive to our usage patterns.

                                My point here is simply that Guido, as language designer, was mature/enlightened enough to put his ego aside and recognize that the directions he wanted to take things were not what a sizable numerical percentage of the community wanted, and that his energy would be best spent standing aside and letting majority rule. I see this as a very different situation from the one you paint in your comment.

                                1. 2

                                  Now, that is a much better explanation, thank you!

                                  Despite being a long-time user of Python I was never involved in the design discussion around the language, so I only see what other people translate outside of the inner circle.

                              2. 3

                                Isn’t that exactly what happened? The community said they needed module support, the go team added module support.

                                1. 11

                                  Again, not an expert - not even a Go fan - but what I’m reading is that the community came up with its own solution and the Go team chose a radically different path.

                                  So, maybe? I’m not sure. The point I was addressing was the “Go is Google’s, not the community’s”.

                                  1. 2

                                    Not in that way, though. There were a few community solutions to packaging over the years when the Go team didn’t want to tackle the problem, leading to one solution (dep) that was basically officially blessed right up until it wasn’t and another path was chosen by the core team. There was poor communication around this and expectations were not well managed. I think the go team truly wants the best solution, but seeing this play out did make some people feel like the author of this post.

                                    The bigger issue is that the priorities used to design go modules are quite different from what some folks in the community want in a dependency manager (specifically, its use of minimal version solving), which puts you in the position to use the official thing that doesn’t do what you want or try to find a community-backed project that is attempting to work in a familiar way but will surely dry up as folks adopt the official solution.

                                    FWIW I had a bunch of issues with dep, but I think they were the result of not having a centralized repository for packages like there is for other languages, and not with the tool itself. It turns out that it’s hard to build a reliable dependency manager when you need to be able to contact every host who hosts code instead of just one.
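
                                    For anyone unfamiliar with the minimal version idea mentioned above, here is a toy Python sketch of how I understand it: walk the requirement graph and, for each module, keep the highest of the minimum versions anyone asked for, never anything newer. The function name, the data layout and the module names are invented for this example; it is not the go command’s actual code.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]
Requirement = Tuple[str, Version]  # (module name, minimum version requested)


def build_list(
    root_requires: List[Requirement],
    requires: Dict[Requirement, List[Requirement]],
) -> Dict[str, Version]:
    """Keep, per module, the highest minimum version requested anywhere."""
    selected: Dict[str, Version] = {}
    pending = list(root_requires)
    visited = set()
    while pending:
        req = pending.pop()
        if req in visited:
            continue
        visited.add(req)
        name, version = req
        if name not in selected or version > selected[name]:
            selected[name] = version
        # Each specific module version brings its own requirements along.
        pending.extend(requires.get(req, []))
    return selected


if __name__ == "__main__":
    # Hypothetical graph: a wants c >= 1.2.0, b wants c >= 1.4.0.
    graph = {
        ("a", (1, 1, 0)): [("c", (1, 2, 0))],
        ("b", (1, 0, 0)): [("c", (1, 4, 0))],
    }
    print(build_list([("a", (1, 1, 0)), ("b", (1, 0, 0))], graph))
```

                                    In this made-up graph c ends up at 1.4.0 even if a newer c has been published, which is the part that surprises people used to solvers that pick the newest allowed version.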

                                2. 7

                                  Because modules.

                                  (Seriously, I read the whole post as a complaint about the modules decision it alludes to, plus the obligatory swipe at the lack of generics.)

                                  1. 5

                                    You’re probably correct.

                                    I, for one, am happy with modules. I never liked dep, and think modules is significantly better both in semantics and implementation. That being said, some of the communication was … unfortunate, although different people seem to have quite different recollections of some events, so who knows 🤷

                                    I do empathize a lot with the dep team (And Sam in particular) as it always sucks if you spend a lot of time on something, only to not have it used. But sometimes these things are hard to avoid.

                                    1. 3

                                      I don’t think I made it clear enough that my entry was an observation, not a complaint; to the extent that it’s a complaint, it’s only a complaint that the Go team is not honest about how things work in Go’s evolution. I personally am happy with the end results of the current Go module design (as a sysadmin, I particularly like the minimum necessary version approach) and the current lack of generics (I would rather have no generics than not-great generics or generics that clash with the rest of the language). There are a number of benefits that come from having a language with a distinct ownership, provided that you have the right owners, and I think that Go does.

                                      (I’m the author of the linked-to entry.)

                                    2. 3

                                      It sounds like the complaint is that Google talks a big game about how open Go’s development is, but at least in the modules case, is happy to completely ignore extensive open coordination on building a module system for Go while silently building its own module system from scratch. I don’t follow Python development that much, but I don’t think they’ve ever done anything like that.

                                      1. 0

                                        It’s really not ok for either, to be frank.

                                      1. 12

                                        I see you’re already using ublock origin. Is there a reason you’re not just using that to block JavaScript?


                                        1. 10

                                          There’s also uMatrix for people who want fine-grained control over both Javascript and all sorts of other things (including cookies). I’ve adapted to uMatrix’s interface and find it nice, but it’s probably not what you want if you want to be routinely enabling Javascript on sites.

                                          1. 2

                                            I suggest using uBO in medium mode rather than installing uMatrix alongside uBO. uMatrix’s default ruleset makes it more similar to uBO’s medium mode.

                                            Using another blocker (uMatrix included) alongside uBO can prevent uBO from using its neutered resources/scripts, which are quite useful to prevent site breakages or to work around anti-blockers.

                                            For example uBO redirects script from googletagservices.com to a local neutered version, and this prevents a lot of breakage (googletagservices.com is not blocked by EasyPrivacy due to the high likelihood of page breakage). But if uMatrix blocks the script from googletagservices.com, then uBO won’t be able to redirect to the local neutered script, because blocking has priority over redirecting in the WebExtensions API.

                                            from the author: https://www.reddit.com/r/firefox/comments/706xrr/umatrix_vs_ublock_origin_medium_blocking_mode/dn1goxr?utm_source=share&utm_medium=web2x

                                        1. 31

                                          This is on one level pretty straightforward and on another level very arcane. There are two important pieces of knowledge beyond normal Unix command line stuff needed to understand it: echo is a builtin in modern shells, and when a process writes to a closed pipe it is normally terminated with a SIGPIPE.

                                          Since echo is a builtin, the (echo red; echo green 1>&2) will all be run by one process. If this process does not start running until after echo blue has exited, it will receive SIGPIPE when it does echo red and die, and not go on to run echo green. So we have three order of execution cases:

                                          • the shell on the left side gets through both of its echos before echo blue runs (well, before it writes its output). The output you get is ‘green blue’.
                                          • the echo red happens before echo blue exits, so it doesn’t get a SIGPIPE, but echo green happens afterwards. The output you get is ‘blue green’. This is probably the usual case, especially on a multi-core system where both sides of the pipeline can run at once.
                                          • the echo blue process runs to completion before the shell on the left side gets a chance to finish echo red, so the left side shell is terminated before it writes ‘green’. The output you get is just ‘blue’.

                                          If echo wasn’t a builtin the SIGPIPE would only ever terminate the ‘echo red’ process, not the entire shell command sequence on the left side, and so you would normally always see ‘green’ in the output (unless you’ve set the option in the shell to terminate command sequences if one of them has an error). You might still theoretically see it before the ‘blue’, depending on scheduling.

                                          (I believe that one thing that makes ‘blue green’ a more likely output is that shells usually start the processes in pipelines from left to right, so the echo red; echo green 1>&2 will normally become ready to run ever so slightly before the echo blue.)
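
                                          If you want to see which of the three cases your own system favours, a small Python sketch like the following can run the pipeline repeatedly and tally the outputs. The pipeline string is reconstructed from the description above, and the function names and run count are arbitrary choices for this example; stderr is merged into stdout so the relative order of ‘green’ and ‘blue’ is captured.

```python
import subprocess
from collections import Counter

# The pipeline under discussion, reconstructed from the description above.
PIPELINE = "(echo red; echo green 1>&2) | echo blue"


def run_once(shell: str = "sh") -> str:
    """Run the pipeline once, merging stderr into stdout to capture ordering."""
    result = subprocess.run(
        [shell, "-c", PIPELINE],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


def tally(runs: int = 200, shell: str = "sh") -> Counter:
    """Count how often each distinct output appears over many runs."""
    return Counter(run_once(shell) for _ in range(runs))


if __name__ == "__main__":
    for output, count in tally().most_common():
        print(f"{count:4d}  {output!r}")
```

                                          The split between the cases will vary with the shell, the core count and the scheduler, so treat any numbers it prints as specific to the machine it ran on.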

                                          1. 4

                                            Also, for some extra raciness… isn’t stderr usually unbuffered, while stdout is usually line buffered?

                                            1. 3

                                              At least when going to a pipe, stdout seems to be buffered in a 64k buffer.

                                              This prints 16384 lines for me and then hangs:

                                              (i=1; while true; do echo red; echo $i 1>&2; ((i+=1)); done) | sleep 10

                                              I would assume that “red\n” has 4 chars. 16384*4=65536.
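
                                              One way to check that arithmetic without a shell or stdio in the picture is to fill a pipe directly and count how many 4-byte lines fit before a write would block. This is a minimal Python sketch; the function name is made up for the example, and the result depends on the kernel’s pipe capacity rather than on anything in the shell.

```python
import os


def lines_until_full(line: bytes = b"red\n") -> int:
    """Write `line` into a pipe nobody reads from; return how many fit."""
    read_fd, write_fd = os.pipe()
    # Non-blocking, so a full pipe raises BlockingIOError instead of hanging.
    os.set_blocking(write_fd, False)
    count = 0
    try:
        while True:
            try:
                os.write(write_fd, line)
            except BlockingIOError:
                return count
            count += 1
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    n = lines_until_full()
    print(f"{n} lines, {n * 4} bytes")
```

                                              If the count matches what the shell loop printed before hanging, the limit is coming from the pipe itself and not from any buffering done by echo.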

                                              1. 2

                                                I believe that this is the kernel’s pipe buffer limit in action, not the shell’s. In order to limit how much kernel memory you’re using, the kernel only allows so much data to be written to a pipe before it’s read by the other side. As far as I know, what POSIX guarantees is a minimum for atomic writes (PIPE_BUF, at least 512 bytes; it is 4 KB on Linux), and systems are free to buffer more than that if they feel like it. A larger kernel buffer size has the advantage that it makes some programs work more reliably.

                                                (The simple way to write a program that both writes data to a subprocess and then reads back the subprocess’s output is to write all the data first then read all the output back afterwards. Unfortunately this is prone to deadlocks, where the sub-process writes out enough to block until you read while you’re writing enough more input to it to block until it reads. The larger the kernel’s pipe buffer size, the more writing you need on both sides to have this happen.)
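
                                                The usual way around that deadlock is to read and write at the same time instead of in two phases. Here is a minimal Python sketch using subprocess; the child program is a hypothetical stand-in that upper-cases its input as it streams, so it starts producing output long before the parent has finished writing.

```python
import subprocess
import sys

# Hypothetical child for the demo: upper-cases stdin line by line as it streams.
CHILD = (
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line.upper())\n"
)


def roundtrip(data: bytes) -> bytes:
    """Send `data` to the child and collect its output without deadlocking."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() interleaves writing and reading, so neither pipe can
    # fill up while the other side is blocked waiting on us.
    output, _ = proc.communicate(data)
    return output


if __name__ == "__main__":
    payload = b"red\n" * 100_000  # far more than a typical pipe buffer holds
    print(len(roundtrip(payload)))
```

                                                Writing the whole payload to proc.stdin first and only then reading proc.stdout is the two-phase pattern described above, and with a payload this size it would be at risk of hanging.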

                                              2. 2

                                                I think that buffering doesn’t matter here. Although the two echos on the left side are being run by the same shell process, the shell is careful to make them behave as if they were separate processes so as part of that it will flush any standard output buffering as part of finishing echo red.

                                                (There have been bugs here in Bash in peculiar situations, but as far as I know they’re all gone now.)

                                              3. 2

                                                On a statistical level, you could assume it’s a Poissonian distribution. Now calculating the expected value (what it should be) would be a lot of effort, so instead you can just run that many times in an actual shell to get enough data to say that a vast majority of the time (~ 3 standard deviations) the answer is “green\nblue\n” and that it is what the output should be. Most of the time.

                                              1. 4

                                                I think that the ZFS-on-Linux crowd has had over a decade to design and implement a ZFS module which works within Linux’s VFS framework, and that this latest debacle has confirmed that the ZFS maintainers are not interested in working within the upstream Linux community any more than is necessary to keep their self-isolated project afloat.

                                                1. 5

                                                  ZFS’s end-to-end data integrity guarantees don’t work if ZFS is layered with a VFS. https://wiki.illumos.org/download/attachments/1146951/zfs_last.pdf

                                                  1. 1

                                                    I guess it might be possible to extend the VFS so that, for data integrity, the storage layer could upcall into the FS to ask it to figure out the correct data associated with a set of blocks (e.g. in a mirror: which of these blocks matches the checksums, or in raid-z: here are 4 related blocks, please recover the 3 blocks of data from them).

                                                    Of course that would be a huge reimplementation effort, and it will probably show some cracks near the edges, but it probably would have been possible - except that it would tie CDDL code even closer to GPL interfaces in Linux.

                                                  2. 2

                                                    ZFS on Linux does broadly work at the VFS layer, much as BTRFS does (let us ignore zvols here, which operate through a different interface). This is not surprising, since anything that presents one or more filesystems to other systems inside the Linux kernel (including eg to the NFS server code) must do so at this level. ZFS is not a conventionally structured Linux filesystem, but then neither is BTRFS.

                                                    ZFS cannot be split into a ‘redundancy layer’ (that would look like current software RAID or LVM) and ‘filesystem layer’ that talk to each other only through Linux kernel interfaces without changing the disk format or losing the ability to recover from degraded data that does not produce read errors. This is because ZFS’s checksums are external to the data being read, not internal; the checksum for a data block is in the file indirect block that points to it, the checksum for a file indirect block is in the file inode, and so on up the tree. This means that a hypothetical redundancy layer can’t verify the checksum of a block as it reads it without additional private information (ie, the expected checksum), and since it can’t do that it can’t reconstruct a damaged block because it doesn’t know it’s damaged. Without private interfaces to either pass the checksum through or tell the redundancy layer to try again differently, any checksum failure would be unrecoverable.
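                                                    To make the point about external checksums concrete, here is a minimal Python sketch. It is not ZFS code: the `Mirror` class, the dict-based "block pointer", and the use of SHA-256 are all assumptions made for illustration. The mirror layer stores copies but has no idea which one is right; only the reader holding the parent's pointer (address plus expected checksum) can detect a silently corrupted copy and repair it.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum kept in the parent pointer, not next to the data."""
    return hashlib.sha256(data).hexdigest()


class Mirror:
    """Hypothetical redundancy layer: stores N copies, knows no checksums."""

    def __init__(self, copies: int = 2):
        self.disks = [dict() for _ in range(copies)]

    def write(self, addr: int, data: bytes) -> None:
        for disk in self.disks:
            disk[addr] = data


def write_block(mirror: Mirror, addr: int, data: bytes) -> dict:
    """Write a block and return the pointer its parent block would store."""
    mirror.write(addr, data)
    return {"addr": addr, "checksum": checksum(data)}


def read_block(mirror: Mirror, pointer: dict) -> bytes:
    """Try each copy until one matches the checksum held by the parent."""
    addr = pointer["addr"]
    good = None
    for disk in mirror.disks:
        if checksum(disk[addr]) == pointer["checksum"]:
            good = disk[addr]
            break
    if good is None:
        raise IOError("every copy fails the checksum; block is unrecoverable")
    # Self-heal: overwrite any copy that differs from the verified one.
    for disk in mirror.disks:
        if disk[addr] != good:
            disk[addr] = good
    return good


mirror = Mirror()
ptr = write_block(mirror, 7, b"hello")
mirror.disks[0][7] = b"hellp"      # silent corruption: no read error is raised
recovered = read_block(mirror, ptr)
```

                                                    Without `ptr`, the mirror would happily return the corrupted first copy, which is the situation a generic redundancy layer is in.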

                                                    (There is also the small issue that it would require a relatively complete code rewrite. ZFS is internally divided into a dataset layer and a disk IO layer, but the two do not communicate with each other through anything like Linux kernel block APIs.)

                                                  1. 22

                                                    To start, the ZFS filesystem combines the typical filesystem with a volume manager. It includes protection against corruption, snapshots and copy-on-write clones, as well as volume manager.

                                                    It continues to baffle me how “mainstream” filesystems like ext4 forgo checksumming of the data they contain. You’d think that combatting bitrot would be a priority for a filesystem.

                                                    Ever wondered where did vi come from? The TCP/IP stack? Your beloved macOS from Apple? All this is coming from the FreeBSD project.

                                                    Technically, vi and the BSD implementations of the TCP/IP stack can be attributed to 4.xBSD at UCB; FreeBSD is not the origin of either.

                                                    1. 10

                                                      It continues to baffle me how “mainstream” filesystems like ext4 forgo checksumming of the data they contain. You’d think that combatting bitrot would be a priority for a filesystem.

                                                      At least ext4 supports metadata checksums:


                                                      At any rate, Ted Ts’o (the ext[34] maintainer) said as far back as 2009 that ext4 was meant to be transitional technology:

                                                      Despite the fact that Ext4 adds a number of compelling features to the filesystem, Ts’o doesn’t see it as a major step forward. He dismisses it as a rehash of outdated “1970s technology” and describes it as a conservative short-term solution. He believes that the way forward is Oracle’s open source Btrfs filesystem, which is designed to deliver significant improvements in scalability, reliability, and ease of management.


                                                      Of course, the real failing here is not ext4, but that btrfs hasn’t been able to move to production use in more than ten years (at least according to some people).

                                                      That said, ZFS works fine on Linux as well and some distributions (e.g. NixOS) support ZFS on root out-of-the-box.

                                                      1. 3

                                                        Of course, the real failing here is not ext4, but that btrfs hasn’t been able to move to production use in more than ten years (at least according to some people).

                                                        I think it’s good to contrast “some people’s” opinion with the one from Facebook:

                                                        it’s safe to say every request you make to Facebook.com is processed by 1 or more machines with a btrfs filesystem.

                                                        Facebook’s open-source site:

                                                        Btrfs has played a role in increasing efficiency and resource utilization in Facebook’s data centers in a number of different applications. Recently, Btrfs helped eliminate priority inversions caused by the journaling behavior of the previous filesystem, when used for I/O control with cgroup2 (described below). Btrfs is the only filesystem implementation that currently works with resource isolation, and it’s now deployed on millions of servers, driving significant efficiency gains.

                                                        But Facebook employs the btrfs project lead.

                                                        There is also the fact that Google is now using BTRFS on Chromebooks with Crostini.

                                                        As for opinions I’ve seen one that claims that “ZFS is more mature than btrfs ON SOLARIS. It is mostly ok on FreeBSD (with various caveats) and I wouldn’t recommend it on Linux.”.

                                                        1. 2

                                                          I wouldn’t recommend it on Linux.

                                                          I’d still say that ZFS is more usable than lvm & linux-softraid. If only due to the more sane administration tooling :)

                                                      2. 9

                                                        Ext4, like most evolutions of existing filesystems, is strongly constrained by what the structure of on-disk data and the existing code allows it to do. Generally there is no space for on-disk checksums, especially for data; sometimes you can smuggle some metadata checksums into unused fields in things like inodes. Filesystems designed from the ground up for checksums build space for checksums into their on-disk data structures and also design their code’s data processing pipelines so there are natural central places to calculate and check checksums. The existing structure of the code matters too because when you’re evolving a filesystem, the last thing you want to do is to totally rewrite and restructure that existing battle-tested code with decade(s) of experience embedded into it; if you’re going to do that, you might as well start from scratch with an entirely new filesystem.

                                                        In short: that ext4 doesn’t have checksums isn’t surprising; it’s a natural result of ext4 being a backwards compatible evolution of ext3, which was an evolution of ext2, and so on.

                                                        1. 4

                                                          It continues to baffle me how “mainstream” filesystems like ext4 forgo checksumming of the data they contain. You’d think that combatting bitrot would be a priority for a filesystem.

                                                          Ext4 doesn’t aim to be that type of filesystem. For desktop use by the average user, this is fairly okay, since actual bitrot in data the user cares about is rare (most bitrot occurs either in system files, in empty space, or in media files where a single corrupt frame barely matters).

                                                          If you want to check out a more modern alternative, there is bcachefs. I’ve been using it on my laptop for a while (until I stopped but now I’m back on it) and it’s been basically rock solid. The developer is also working on erasure coding and replication in a more solid way than btrfs currently has.

                                                        1. 1

                                                          I’m glad to have received a link to my own article, but I do disagree somewhat with what is said in this one.

                                                          The specific example of cron/NFS is in fact a hard dependency: cron runs reboot tasks when it starts, and if they need NFS mounts, then those mounts should be a hard requirement of cron, “ordering” is not sufficient.

                                                          The implied issue is that cron doesn’t need the NFS mounts once it’s run those tasks, so the dependency “becomes false” at that point. If I understand the argument correctly, it is: seeing as “the system as a whole wants both”, you could use a startup ordering to avoid leaving a lingering dependency once the @reboot jobs have run while still ensuring that NFS mounts are available before cron starts. This is true, but it would be fragile and racy. For instance, there would be nothing to prevent the NFS mounts, even with the co-operation of the service manager, being unmounted just after crond begins execution, but before it has even started (or possibly when it is midway through) running the @reboot tasks.

                                                          In my eyes there are two ways to solve it properly: separate cron boot tasks from regular cron so that you can run them separately (that would mean changing cron or using some other tool), or have the cron boot tasks work by starting short-running services (which can then list NFS mounts as a dependency). The latter requires that non-privileged users be allowed to start services though, and that’s opening a can of worms. I feel that ultimately the example just illustrates the problems inherent in cron’s @reboot mechanism.

                                                          (Not to mention that there’s a pre-existing problem: for cron, “reboot” just means “cron started”. If you start and stop cron, those reboot tasks will all run again…)

                                                          1. 2

                                                            Belatedly (as the author of the linked-to article): In our environment, NFS mounts are a hard dependency of those specific @reboot cron jobs, but not of cron in general. In fact we specifically want cron to run even if NFS mounts are not there, because one of the system cron jobs is an NFS mount updater that will, as a side effect, keep retrying NFS mounts that didn’t work the first time. Unfortunately there is no good way to express this in current init systems that I know about and @reboot cron jobs are the best way we could come up with to allow users to start their own services on boot without having to involve sysadmins to add, modify, and remove them.

                                                            (With sufficient determination we could write our own service for this which people could register with and modify, and in that service we could get all of the dependencies right. But we’re reluctant to write local software, such a service would clearly be security sensitive, and @reboot works well enough in our environment.)

                                                            1. 1

                                                              But it’s not a dependency of cron, it’s a dependency of these particular tasks. Cron the package that contains the service definition has no idea about what you put into your crontab.

                                                              Yes, it’s a problem in cron. This is why there’s movement towards just dropping cron in favor of integrating task scheduling into service managers. Apple launchd was probably first, systemd of course has timers too, and the “most Unix” solution is in Void Linux (and runit-faster for FreeBSD now): snooze as runit services. In all these cases, each scheduled task can have its own dependencies.

                                                              (Of course the boot tasks then are just short-running services as you described.)

                                                              1. 1

                                                                But it’s not a dependency of cron, it’s a dependency of these particular tasks

                                                                Agreed, but if you’re the sysadmin and know that cron jobs are using some service/unit, then you’d better make sure that the cron service is configured with an appropriate dependency. At least, that’s how I view it. Without knowing more about the particular system in question, I’m not sure we can say much more about how it should be configured - I agree that cron isn’t perfect, particularly for “on boot” tasks, but at least it’s a secure way of allowing unprivileged users to set up their own time-based tasks. (I guess it’s an open question whether that should really be allowed anyway).

                                                              2. 1

                                                                I was also confused by that, but from the discussion in the comments, I think the reason they don’t want it to be a hard dependency is that, in their setup, some machines typically have NFS configured and some don’t. In the case where the machine would start NFS anyway, they want an ordering dependency so it starts before cron. But if NFS wasn’t otherwise configured to start on that machine, then cron should start up without trying to start NFS.

                                                                1. 1

                                                                  Yes, that accords with the comments below the post:

                                                                  On machines with NFS mounts and Apache running, we want Apache to start after the NFS mounts; however, we don’t want either NFS mounts or Apache to start the other unless they’re explicitly enabled. If we don’t want to have to customize dependencies on a per-machine basis, this must be a before/after relationship because neither service implies the other

                                                                  The problem is “don’t want to have to customize dependencies” is essentially saying “we are ok with the dependencies being incomplete on some machines if it means we can have the same dependency configurations on all machines”. That seems like the wrong approach to me; you should just bite the bullet and configure your machines correctly; you’ve already got to explicitly enable the services you do want on each machine, anyway.
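                                                                  For readers unfamiliar with the distinction being argued over, here is a toy Python model of it. This is not how any real service manager is implemented; the `start_order` function and the unit names `apache` and `nfs-mounts` are hypothetical. An ordering ("after") relationship only constrains units that are going to start anyway, while a requirement pulls the other unit in.

```python
from graphlib import TopologicalSorter


def start_order(enabled, after, requires):
    """Return the order in which units would start in this toy model.

    enabled:  units the admin explicitly enabled on this machine
    after:    {unit: units it must start after, if they start at all}
    requires: {unit: units that are pulled in whenever it starts}
    """
    # Requirement dependencies add units to the set being started.
    wanted, todo = set(), list(enabled)
    while todo:
        unit = todo.pop()
        if unit not in wanted:
            wanted.add(unit)
            todo.extend(requires.get(unit, ()))

    # Ordering dependencies only apply between units already being started.
    graph = {u: {d for d in after.get(u, ()) if d in wanted} for u in wanted}
    return list(TopologicalSorter(graph).static_order())


ordering_only = {"apache": {"nfs-mounts"}}

# Machine with only Apache enabled: ordering does not start NFS mounts.
web_only = start_order({"apache"}, ordering_only, {})

# Machine with both enabled: the same global rule orders them correctly.
both = start_order({"apache", "nfs-mounts"}, ordering_only, {})

# With a hard requirement, enabling Apache alone also starts NFS mounts.
hard = start_order({"apache"}, ordering_only, {"apache": {"nfs-mounts"}})
```

                                                                  In this model the single `ordering_only` rule can be shipped to every machine unchanged, which is the property the quoted comment is after; the hard requirement gives a complete dependency graph but has to be added only on machines that actually use NFS.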

                                                                  1. 1

                                                                    This gets into the philosophy of fleet management. As someone who manages a relatively small fleet by current standards (we only have a hundred machines or so), my view is that the less you have to do and remember for specific combinations of configurations, the better; as much as possible you want to be able to treat individual configuration options as independent and avoid potential combinatorial explosions of combinations. So it’s much better to be able to set a before/after relationship once, globally, than to remember that a machine with both NFS mounts and Apache needs a special additional customization. Pragmatically you’re much more likely to forget that such special cases exist and thus set up machines with things missing (and such missing cases may be hard to debug, since they can work most of the time or have only tiny symptoms when things go wrong).

                                                                    (One way to think of it is that it is a building blocks approach versus a custom building approach. Building blocks is easier.)

                                                              1. 4

                                                                I actually had to laugh when I saw this title. Was ed ever a “good” editor?

                                                                1. 16

                                                                  Yes, absolutely. Ed was a good editor in the 1970s, when 300 baud links were reasonably fast things and teletypes with printed output were common, and even significantly into the 1980s, when you might still be dealing with 300 baud serial links (1200 baud if you were lucky), heavily overloaded systems, very dumb CRT-based terminals, or some combination of all three. About its only significant limitation in those situations is that it’s a single file at a time editor, which makes some things more awkward. Using any visual editor in those situations is a frustrating exercise in patience (or outright impossible on hardcopy teletypes or sufficiently dumb terminals).

                                                                  (I’m the author of the linked-to article.)

                                                                  1. 2

                                                                    The ancestors of ed, such as CTSS QED and its descendants on Multics (QEDX, Ted), all worked on multiple buffers, but those (and other) features were not carried forward to the simpler UNIX ed.

                                                                    1. 1

                                                                      Given that Ken Thompson was fully familiar with QED et al (cf), the omission of multiple buffer support in ed is clearly deliberate. I wonder if Thompson felt forced to do it by resource constraints on Unix or if he just decided that it wasn’t important enough and omitting it simplified the experience in ways he wanted.

                                                                      (By the time of V7 I suspect that resource constraints weren’t a big enough issue, and the Bell Labs people certainly were willing to rewrite programs if they should be improved. And I believe that people did versions of QED for early Unix, too.)

                                                                    2. 1

                                                                      I don’t think I’ve ever actually used ed. I grew up in the 90s, so by then it was pico/nano. Is edlin from the MS-DOS days similar? If so, I did use edlin quite a bit, but would not be able to tell you anything about how to use it today.

                                                                      1. 2

                                                                        I never used edlin, but I’d guess that editor has more in common with CP/M ED than UNIX ed.

                                                                        1. 2

                                                                          Based on the edlin Wikipedia page, edlin is kind of a relatively simplified take on ed (although that’s not necessarily its historical origins). Full ed is fairly powerful, with a lot of text manipulation commands that don’t require you to retype lines to modify them and a bunch of ways of specifying what lines you want to operate on (including ‘lines that match this regular expression’). Interested parties can see the GNU ed manual. It’s probably simplest to start with the commands list if you want to see what ed is capable of.

                                                                          (GNU ed is a moderate superset of V7 ed in both commands and its regular expression support.)

                                                                        1. 1

                                                                          I’d have to say no - it was clearly inferior to the QED descendants QEDX (and its descendant, Ted), which preceded it.

                                                                          In defense of ‘ed’, this simplification was a design decision, according to Ritchie.

                                                                          When using Multics, I often find myself reaching for QEDX, especially for quick editing tasks.

                                                                        1. 3

                                                                          Wrap on integer overflow is a good idea, because it will get rid of one source of undefined behavior in C.

                                                                          Undefined behavior is evil. One evil it causes is that it makes code optimization-unstable. That is, something can work in a debug build but not in a release build, which is very undesirable. The article does not address this point at all.

                                                                          1. 1

                                                                            The article does not address this point at all.

                                                                            To remove all undefined behaviour in C would severely impact the performance of C programs.

                                                                            The post does suggest that trap-on-overflow is a superior alternative to wrap-on-overflow, and of course you could apply it to both debug and release builds if this optimisation instability concerns you. Even if you apply it just to your debug build, you’ll at least avoid the possibility that something works in the debug build but not in the release.
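                                                                            As a rough illustration of the two policies being compared, here is a Python sketch that simulates 32-bit signed addition (Python's own integers never overflow, so the 32-bit width is imposed by hand). The function names `add_wrap` and `add_trap` are made up for this example and are not taken from the post.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def add_wrap(a: int, b: int) -> int:
    """32-bit signed add with two's-complement wraparound."""
    return (a + b + 2**31) % 2**32 - 2**31


def add_trap(a: int, b: int) -> int:
    """32-bit signed add that raises instead of returning a wrapped value."""
    result = a + b
    if not INT_MIN <= result <= INT_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a 32-bit int")
    return result


wrapped = add_wrap(INT_MAX, 1)   # silently becomes INT_MIN

try:
    add_trap(INT_MAX, 1)
    trapped = False
except OverflowError:
    trapped = True               # the bug is reported at the point it happens
```

                                                                            Both policies give the same answer in debug and release builds, which addresses the stability complaint; the difference is that wrapping lets a probably-unintended value flow onwards, while trapping stops the program where the overflow occurred.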

                                                                            1. 2

                                                                              Note that Rust wraps on overflow and as far as I can tell this does not impact performance of Rust programs.

                                                                              1. 1

                                                                                This is essentially the same as one of the arguments I addressed in the post. In certain types of programs, particularly those in lower-level languages (like C and Rust) where the code is written directly by a human programmer, there probably are not going to be many cases where the optimisations enabled by having overflow be undefined will kick in. However, if the program makes heavy use of templates or generated code, or is produced by transpiling a higher-level language with a naive transpiler, then there could be (I’ll concede this is theoretical in that I can’t really give a concrete example). The mechanism by which the optimisation works is well understood and it isn’t too difficult to produce an artificial example where the optimisation grants a significant speed-up in the generated code.
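                                                                                One commonly cited artificial case (my choice of example, not necessarily the one the post has in mind) is the comparison `x + 1 > x` on a signed int. The Python sketch below simulates 32-bit wrapping to show why a compiler bound to wrapping semantics must actually perform the comparison, whereas one allowed to assume overflow never happens can replace it with a constant.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def add_wrap(a: int, b: int) -> int:
    """32-bit signed add with two's-complement wraparound (simulated)."""
    return (a + b + 2**31) % 2**32 - 2**31


def next_is_greater_wrapping(x: int) -> bool:
    """What `x + 1 > x` evaluates to when overflow wraps."""
    return add_wrap(x, 1) > x


def next_is_greater_assuming_no_overflow(x: int) -> bool:
    """What an optimiser may emit when overflow is undefined: a constant."""
    return True


ordinary = next_is_greater_wrapping(41)        # True, same as the constant
boundary = next_is_greater_wrapping(INT_MAX)   # False: INT_MIN > INT_MAX fails
```

                                                                                The two versions agree for every input except `INT_MAX`, and that single input is exactly the one the undefined-behaviour rule lets the optimiser ignore.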

                                                                                Also, in the case of Rust programs, you can’t really reliably assess the impact of wrapping on overflow unless there is an option to make overflow undefined behaviour. Is there such an option in Rust?

                                                                                1. 2

                                                                                  No, there is no such option, and there never will be. Rust abhors undefined behaviors. Performance impact assessment I had in mind was comparison with C++ code.

                                                                                  On the other hand, rustc is built on LLVM so it is rather trivial to implement: rustc calls LLVMBuildAdd in exactly one place. One can replace it with LLVMBuildNSWAdd (NSW stands for No Signed Wrap).

                                                                              2. 0

                                                                                To remove all undefined behaviour in C would severely impact the performance of C programs.

                                                                                This cannot be entirely true. As a reductio ad absurdum, it would be possible in principle to laboriously define all the things that compilers currently do with undefined behaviour and make that the new definition of the behaviour, and there would then be zero performance impact.

                                                                                C compiler writers might argue that removing all undefined behaviour without compromising performance would be prohibitively expensive, but I’m not entirely convinced; there are carefully-optimized microbenchmarks on which the naive way of defining currently-undefined behaviour produces a noticeable performance degradation, but I don’t think it’s been shown that that generalises to realistic programs or to a compiler that was willing to put a bit more effort in.

                                                                                1. 2

                                                                                  This cannot be entirely true. As a reductio ad absurdum, it would be possible in principle to laboriously define all the things that compilers currently do with undefined behaviour and make that the new definition of the behaviour, and there would then be zero performance impact

                                                                                  Clearly that would be absurd, and it’s certainly not what I meant by “remove all undefined behaviour”. Your “possible in principle” suggestion is practically speaking completely impossible, and what I said was true if you don’t take such a ridiculously liberal interpretation of it. Let’s not play word games here.

                                                                                  C compiler writers might argue that removing all undefined behaviour without compromising performance would be prohibitively expensive

                                                                                  They might, but that’s not what I argued in the post.

                                                                                  I don’t think it’s been shown that that generalises to realistic programs or to a compiler that was willing to put a bit more effort in.

                                                                                  There’s no strong evidence that it does, nor that it wouldn’t ever do so.

                                                                                  1. 1

                                                                                    Well then, what did you mean? You said “To remove all undefined behaviour in C would severely impact the performance of C programs.” I don’t think that’s been demonstrated. I’m not trying to play word games, I’m trying to understand what you meant.

                                                                                    1. 1

                                                                                      I meant “remove all undefined behaviour” in the sense and context of this discussion, in particular related to what @sanxiyn above says:

                                                                                      Undefined behavior is evil. One evil it causes is that it makes code optimization-unstable. That is, something can work in a debug build but not in a release build, which is very undesirable

                                                                                      To avoid that problem, you need to define specific behaviours for cases that are currently specified as undefined behaviour (not ranges of possible behaviour that could change depending on optimisation level). To do so would significantly affect performance, as I said.

                                                                                      1. 1

                                                                                        I could believe that removing all sources of differing behaviour between debug and release builds would significantly affect performance (though even then, I’d want to see the claim demonstrated). But even defining undefined behaviour to have the current range of behaviour would be a significant improvement, as it would “stop the bleeding”: one of the insidious aspects of undefined behaviour is that the range of possible impacts keeps expanding with new compiler versions.

                                                                                        1. 2

                                                                                          I could believe that removing all sources of differing behaviour between debug and release builds would significantly affect performance

                                                                                          It’s not just about removing the sources of differing behaviour - but doing so with sensibly-defined semantics.

                                                                                          though even then, I’d want to see the claim demonstrated

                                                                                          A demonstration can only show the result of applying one set of chosen semantics to some particular finite set of programs. What I can do is point out that C has pointer arithmetic and this is one source of undefined behaviour; what happens if I take a pointer to some variable and add some arbitrary amount, then store a value through it? What if doing so happens to overwrite part of the machine code that makes up the program? Do you really suppose it is possible to practically define what the behaviour should be in this case, such that the observable behaviour will always be the same when the program is compiled with slightly different optimisation options - which might result in the problematic store being to a different part of code? To fully constrain the behaviour, you’d need pointer bounds checking or similar - and that would certainly have a performance cost.

                                                                                          But even defining undefined behaviour to have the current range of behaviour would be a significant improvement, as it would “stop the bleeding”

                                                                                          As I’ve tried to point out with the example above, the current range of undefined behaviour is already unconstrained. But for some particular cases of undefined behaviour, I agree that it would be better to have more restricted semantics. For integer overflow in particular, I think it could reasonably be specified that, for example, the result becomes unstable (e.g. behaves inconsistently in comparisons) but the behaviour is otherwise defined. Note that even this would impede some potential optimisations. (And I still advocate trap on overflow as the preferred implementation.)

                                                                                  2. 2

                                                                                    I suspect that one issue is that compilers may manifest different runtime behaviour for undefined behaviour, depending on what specific code the compiler decided to generate for a particular source sequence. In theory you could document this with sufficient effort, but the documentation would not necessarily be useful; it would wind up saying ‘the generated code may do any of the following depending on factors beyond your useful control’.

                                                                                    (A canonical case of ‘your results depend on how the compiler generates code’, although I don’t know if it depends on undefined behaviour, is x86 floating point calculations, where your calculations may be performed with extra precision depending on whether the compiler kept everything in 80-bit FPU registers, spilled some to memory (clipping to 64 bits or less), or used SSE (which is always 64-bit max).)

                                                                                    1. 1

                                                                                      It’s not only possible: I’ve seen formal semantics that define various undefined behaviors just like you said. People writing C compilers could definitely do it if they wanted to.

                                                                                1. 3

                                                                                  The problem turns out to be some obscure FUSE mounts that the author had lying around in a broken state, which subsequently broke the kernel namespace system. Meanwhile, I have been running systemd on every computer I’ve owned in many years and have never had a problem with it.

                                                                                  Does this not seem a bit melodramatic?

                                                                                  1. 9

                                                                                    From the twitter thread:

                                                                                    Systemd does not of course log any sort of failure message when it gives up on setting up the DynamicUser private namespace; it just goes ahead and silently runs the service in the regular filesystem, even though it knows that is guaranteed to fail.

                                                                                    It sounds like the system had an opportunity to point out an anomaly that would guide the operator in the right direction, but instead decided to power through anyways.

                                                                                    1. 8

                                                                                      A lot like continuing to run in a degraded state is a plague that affects distributed systems. Everybody thinks it’s a good idea “some service is surely better than no service” until it happens to them.

                                                                                      1. 3

                                                                                        At $work we prefer degraded mode for critical systems. If they go down we make no money, while if they kind of sludge on we make less but still some money while we firefight whatever went wrong this time.

                                                                                        1. 8

                                                                                          My belief is that inevitably you could be making $100 per day, would notice if you made $0, but are instead making $10 and won’t notice this for six months. So be careful.

                                                                                          1. 4

                                                                                            We have monitoring and alerting around how much money is coming in, that we compare with historical data and predictions. It’s actually a very reliable canary for when things go wrong, and for when they are right again, on the scale of seconds to a few days. But you are right that things getting a little suckier slowly over a long time would only show up as real growth not being in line with predictions.

                                                                                        2. 2

                                                                                          I tend to agree that hard failures are nicer in general (especially to make sure things work), but I’ve also been in scenarios where buggy logging code has caused an entire service to go down, which… well that sucked.

                                                                                          There is a justification for partial service functionality in some cases (especially when uptime is important), but like with many things I think that judgement calls in that are usually so wrong that I prefer hard failures in almost all cases.

                                                                                          1. 1

                                                                                            Running distributed software on snowflake servers is the plague to point out.

                                                                                            1. 1

                                                                                              Everybody thinks it’s a good idea “some service is surely better than no service” until it happens to them.

                                                                                              So if the server is over capacity, kill it and don’t serve anyone?

                                                                                              Router can’t open and forward a port, so cut all traffic?

                                                                                              I guess that sounds a little too hyperbolic.

                                                                                              But there’s a continuum there. At $work, I’ve got a project that tries to keep going even if something is wrong. Honestly, I’m not sure I like how all the errors are handled. But then again, the software is supposed to operate rather autonomously after initial configuration. Remote configuration is a part of the service; if something breaks, it’d be really nice if the remote access and logs and all were still reachable. And you certainly don’t want to give up over a problem that may turn out to be temporary or something that could be routed around… reliability is paramount.

                                                                                              1. 2

                                                                                                And you certainly don’t want to give up over a problem that may turn out to be temporary

                                                                                                I think that’s close to the core of the problem. Temporary problems recur, worsen, etc. I’m not saying it’s always wrong to retry, but I think one should have some idea of why the root problem will disappear before retrying. Computers are pretty deterministic. Transient errors indicate incomplete understanding. But people think a try-catch in a loop is “defensive”. :(

                                                                                          2. 4

                                                                                            So you never had legacy systems (or configurations) to support? I read Chris’ blog regularly, and he works at a university on a heterogeneous network (some Linux, some other Unix systems) that has been running Unix for a long time. I think he started working there before systemd was even created.

                                                                                            1. 3

                                                                                              Why do you say that the FUSE mounts were broken? As far as we can see they were just set up in an uncommon way: https://twitter.com/thatcks/status/1027259924835954689

                                                                                              1. 3

                                                                                                It does look brittle that broken FUSE mounts prevent ntpd from running. IMO the most annoying part is the debuggability of the issue.

                                                                                                1. 2

                                                                                                  Yes, it seems melodramatic, even to my anti-systemd ears. It’s a documentation and error reporting problem, not a technical problem, IMO. Olivier Lacan gave a great talk last year about good errors and bad errors (https://olivierlacan.com/talks/human-errors/). I think it’s high time we start thinking about how to improve error reporting in software everywhere – and maybe one day human-centric error reporting will be as ubiquitous as unit testing is today.

                                                                                                  1. 2

                                                                                                    In my view (as the original post’s author) there are two problems in view. That systemd doesn’t report useful errors (or even notice errors) when it encounters internal failures is the lesser issue; the greater issue is that it’s guaranteed to fail to restart some services under certain circumstances due to internal implementation decisions. Fixing systemd to log good errors would not cause timesyncd to be restartable, which is the real goal. It would at least make the overall system more debuggable, though, especially if it provided enough detail.

                                                                                                    The optimistic take on ‘add a focus on error reporting’ is that considering how to report errors would also lead to a greater consideration of what errors can actually happen, how likely they are, and perhaps what can be done about them by the program itself. Thinking about errors makes you actively confront them, in much the same way that writing documentation about your program or system can confront you with its awkward bits and get you to do something about them.

                                                                                                1. 6

                                                                                                  The challenging open issue I see with this sort of idea is the question of what links people will see by default when they view a page. If they see only the links the author put in, then we will generally have a situation no different than today (as most people will never change that default). If they see some additional set of links by default, then there will be ferocious competition from spammers to get their links included in that set and in general large arguments about what links will be included in it.

                                                                                                  For better or worse, HTML and browser technology today has a simple, distributed, scalable, and clearly fair answer to the question of ‘what links appear in a page by default’, and it’s one that keeps browsers and other central parties out of disputes (and generally out of the game of influencing the answer).

                                                                                                  (I admit that these days I look at all new protocols through the lens of ‘how can they be abused by spammers and other bad people’, but partly this is because we know spammers and other bad people are out there and will actively attempt to abuse anything they can.)

                                                                                                  1. 8

                                                                                                    The challenging open issue I see with this sort of idea is the question of what links people will see by default when they view a page.

                                                                                                    This is something we were concerned with at Xanadu during the development of XanaSpace (since we had all links in the form of loadable ODLs). The conclusion we came to was that a document author could recommend a particular set of links to go with their document, and that furthermore, people would produce and share collections of links (which, since they are not part of the content & are bidirectional, combine with transclusion to add additional context even to documents where the author is unaware of them) in sort of the same way as kottke.org or BoingBoing curates collections of other people’s web pages. Any resident links (including formatting links) would be applied when relevant (i.e., when the original source of any transcluded content overlapped with something mentioned in a link), and a person’s personal link collection would be private until shared.

                                                                                                    This sort of mirrors the fediverse / SSB model of requiring intentional hops between independent communities. Taking advantage of default link sets for spamming purposes only makes sense when the landscape is flat – where everybody sees everything unless they take countermeasures, and thus anything, no matter the quality, scales up indefinitely with no further human input. If, on the other hand, things only spread when they are actively shared by individuals from across different communities (each acting as curator for the sake of their community), the impact of these problems becomes small and it ceases to be worth the effort for bad actors.

                                                                                                    Ultimately, the links that appear on the page should be controllable by the person viewing the page, just like the formatting of the page should be under their control. In the case of links, that probably means trusting your friends and a handful of professional curators to have good taste & not scam you.

                                                                                                    1. 3

                                                                                                      This is really interesting. I did a reasonable chunk of reading around Xanadu in my research for this project, but ended up spending more time looking at ‘open hypermedia’ stuff and didn’t have the time to dive into a lot of the nitty gritty details as much as I would have liked. The details you’ve mentioned, for instance, seem to have totally passed me by.

                                                                                                      Curiously, although these kinds of ecosystem considerations weren’t really the main focus of my work, I think I came to somewhat similar (though less developed) conclusions in the assumptions I made for my prototyping work. If you wouldn’t mind, do you think you’d be able to point me towards a source for the design discussions you mentioned? I’d love to read more about this.

                                                                                                      1. 7

                                                                                                        Most Xanadu documentation is purely internal to the project, & the stuff that gets released is usually pretty non-technical. The discussions around how ODLs would be distributed were never made public at all, as far as I can tell (and they are part of a now-abandoned subproject). However, I did a technical overview of all of the Xanadu stuff I was privy to that wasn’t under trade secret (mirrored here), and this is probably the most complete & accessible public documentation on the project.

                                                                                                        ODL distribution isn’t covered in detail here, and XanaSpace never got to the point where it was seriously discussed in a systematic way, though there were some ideas thrown around, which I should document.

                                                                                                        Specifically: there was the concept of a ‘xanaful’ – a tarball containing all of the files (EDLs, ODLs, links, and sourcedocs) necessary for constructing a constellation of related documents. Paths in the tar format are just strings prefixing the content blob, so we were planning to use the full permanent address of each piece of content as its path, and check all resident xanaful tarballs for those addresses before fetching them from elsewhere. The idea is that a xanaful would be a convenient way for people to share not-yet-public documents, distribute stuff on physical media to be used where network access is limited, send bookmarks to friends in big chunks, distribute private ODLs (which contain formatting – and therefore theming – links in addition to inter-content links), and get people who are a little skeptical to try a Xanadu viewer out. Sending a xanaful out-of-band (for instance, by email) would be one method of ODL distribution.
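
                                                                                                        A minimal sketch of that lookup idea, using Python's standard `tarfile` module: content is stored in a tarball under its permanent address, and resident tarballs are checked before anything is fetched from elsewhere. The function names, the example address format, and the `fetch_remote` callback are all hypothetical; this is not XanaSpace code, only an illustration of the scheme as described.

```python
import io
import tarfile


def build_xanaful(items):
    """Pack {permanent_address: content_bytes} into an in-memory tarball.

    Each member's path is simply the content's address string.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for address, payload in items.items():
            info = tarfile.TarInfo(name=address)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def lookup(resident_xanafuls, address, fetch_remote):
    """Return content for an address, preferring resident tarballs.

    fetch_remote is a hypothetical fallback: a callable taking the address.
    """
    for blob in resident_xanafuls:
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            try:
                member = tar.getmember(address)
            except KeyError:
                continue  # not in this tarball, try the next one
            return tar.extractfile(member).read()
    return fetch_remote(address)
```

                                                                                                        Used this way, a tarball received by email could satisfy a request for an address without any network access, and only addresses missing from every resident tarball would reach the fallback.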

                                                                                                        I wanted & was pushing for more of a peer-to-peer system. Specifically, I wanted individual XanaSpace applications to serve up the public parts of their caches to peers (to limit strain on hosts), and I wanted inter-peer communication to support a kind of friends network for sharing EDLs and ODLs directly with particular people. I was thinking of using gopher for this, since gopher is awfully simple to implement. (This is the system I wanted to use for transcopyright’s encryption oracle: the author’s machine or some trusted proxy would take requests for the OTP segment, check against a whitelist of authorized users, and distribute the OTP segment or a zeroed-out dummy.) There are some legal issues with this (which IPFS and SSB are discussing as well) and Ted wasn’t really comfortable with jumping into full distributed computing; also, this cut out any potential profit, and Ted still thinks of Xanadu as potentially profitable. As a result, none of these particular ideas got taken very seriously or had serious development work attached to them.

                                                                                                        Post-XanaSpace, our translit viewers have ditched the ODL entirely in favor of sticking link addresses in the EDL. (Our code always supported intermixing the two, but there was a conceptual division that I thought was useful.) It makes things easier for newcomers to understand but I think it does so at the cost of some clarity: now, people can still pretend that the author owns all the links in their document, and this only becomes clearly untrue when two authors link transclusions of overlapping segments from two resident documents. The web-based translit viewers only support one resident document at a time, so this never happens. (Having many resident documents at once is a vital feature & so we shouldn’t expect later implementations to keep this trend except accidentally.)

                                                                                                  1. 1


                                                                                                    1. 5

                                                                                                      According to the guy behind handmade hero:

                                                                                                      The fact that we currently have hardware vendors shipping both hardware and drivers (with USB and GPUs being two major examples), rather than just shipping hardware with a defined/documented interface, a la x64, or the computers of the 80s, is a very large contributor to the fact that we have basically 3 consumer-usable OSes, and each one is well over 15 million lines of code. These large codebases are a big part of the reason that using software today can be rather unpleasant.

                                                                                                      He proposes that if hardware vendors switched from hardware+drivers to hardware that was well-documented in how it was controlled, so that most programmers could program it by feeding memory to/from it (which he considers an ISA of sorts), we’d be able to eliminate the need for drivers as such, and be able to go back to the idea of a much simpler OS.

                                                                                                      I haven’t watched the whole thing yet, but those are the highlights.

                                                                                                      1. 7

                                                                                                        Oh I would so, so, so, love that to happen…..

                                                                                                        …but as a guy whose day job is at that very interface I will point this out.

                                                                                                        The very reason for the existence of microcomputers is to soak up all the stuff that is “too hard to do in hardware”.

                                                                                                        Seriously, go back to the original motivations for the first intel micros.

                                                                                                        And as CPUs have become faster, more and more things get “winmodemed”.

                                                                                                        Remember ye olde modems? Nice well defined rs-232 interface and standardized AT command set?

                                                                                                        All gone.

                                                                                                        What happen?

                                                                                                        Well, partly instead of having a separate fairly grunty/costly CPU inside the modem and a serial port… you could just have enough hardware to spit the i/q’s at the PC and let the PC do the work, and shift the AT command set interpreter into the driver. Result: cheaper, better modems and a huge pain in the ass for open source.

                                                                                                        All the h/w manufacturers regard their software drivers as an encryption layer on top of their “secret sauce”, their competitive advantage.

                                                                                                        At least that’s what the bean counters believe.

                                                                                                        Their engineers know that the software drivers are a layer of kludge to make the catastrophe that is their hardware design limp along enough to be saleable.

                                                                                                        But to bring their h/w up to a standard interface level would require doing some hard (and very costly) work at the h/w level.

                                                                                                        Good luck convincing the bean counters about that one.

                                                                                                        Of course, WinTel regard the current mess as a competitive advantage. It massively raises the barriers to entry to the marketplace. So don’t hold your breath hoping WinTel will clean it up. They created this mess for Good (or Bad, depending on your view) reasons of their own.

                                                                                                        1. 1

                                                                                                          All the h/w manufacturers regard their software drivers as an encryption layer on top of their “secret sauce”, their competitive advantage.

                                                                                                          I thought the NDAs and obfuscations were about preventing patent suits as much as competitive advantage. The hardware expert that taught me the basics of cat and mouse games in that field said there are patents on about everything you can think of in implementation techniques. The more modern and cutting edge, the more dense the patent minefield. Keeping the internals secret means they have to get a company like ChipWorks (now TechInsights) to tear it down before filing those patent suits. Their homepage prominently advertises the I.P.-related benefits of their service.

                                                                                                          1. 2

                                                                                                            That too definitely! Sadly, all this comes at a huge cost to the end user. :-(

                                                                                                        2. 1

                                                                                                          The obvious pragmatic problem with this model is that hardware vendors sell the most hardware (and sell it faster) when people can immediately use their hardware, not when they must wait for interested parties to write device drivers for it. If the hardware vendor has to write and ship their own device drivers anyway, writing and shipping documentation is an extra cost.

                                                                                                          (There are also interesting questions about who gets to pay the cost of writing device drivers, since there is a cost involved here. This is frequently going to be ‘whoever derives the most benefit from having the device driver exist’, which is often going to be the hardware maker, since the extra benefit to major OSes is often small.)

                                                                                                      1. 1

                                                                                                        If you modify one byte in a file with the default 128KB record size, it causes the whole 128KB to be read in, one byte to be changed, and a new 128KB block to be written out.

                                                                                                        Recordsize “enforces the size of the largest block written”1 (emphasis mine), not that all blocks are that size.

                                                                                                        1. 2

                                                                                                          ZFS recordsize is extremely confusing. That description is correct at a filesystem level, but not at a file level; for a single file (or zvol), there is only one logical block size, and if the file has multiple blocks, that logical block size will be the filesystem recordsize. So on a normal filesystem, if you have a file larger than 128 KB, all its logical blocks will be 128 KB, and if you modify one byte, you do indeed rewrite 128 KB. I had to go through an entire process of writing test files under various circumstances and dumping them with zdb to understand what was going on.

                                                                                                          (Compression may create different physical block sizes. One potentially surprising form of this is compressing the implicit zero bytes at the end of files that are not exact multiples of 128 KB. So with compression, your 160 KB incompressible file will come out close to 160 KB allocated on disk, instead of 256 KB.)
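
                                                                                                          The arithmetic behind the 160 KB example can be sketched in a few lines of Python. This is a simplified model of the behaviour described in this comment, not of ZFS internals: the function name is made up, and files that fit in a single record are treated as one block of exactly the file's size, ignoring sector rounding.

```python
import math

KB = 1024


def logical_footprint(file_size, recordsize=128 * KB):
    """Return (block_count, logical_bytes) under the model described above.

    Assumption: a file larger than one record uses recordsize-sized blocks
    throughout, so its last block is padded with implicit zero bytes.
    """
    if file_size <= recordsize:
        return 1, file_size  # simplification: one block sized to the file
    blocks = math.ceil(file_size / recordsize)
    return blocks, blocks * recordsize


blocks, logical = logical_footprint(160 * KB)
padding = logical - 160 * KB  # implicit trailing zeros in the last block
```

                                                                                                          For a 160 KB file this gives 2 blocks, 256 KB of logical space, and 96 KB of trailing zeros, which is the part compression can squeeze out.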

                                                                                                        1. 3

                                                                                                          Secondly, I think that compression works more efficiently when given larger blocks to compress.

                                                                                                          It can. In some database situations it’s suggested to use a very large blocksize, like 1meg, to maximize compression.

                                                                                                          I also wonder if the author’s problem is some sort of alignment? Their 4k image was not aligned to 4k blocks and that caused extra blocks to be written? I have no idea if that is possible in this situation.

                                                                                                          1. 3

                                                                                                            My understanding is that ZFS compresses each block separately, since it has to be able to do random IO to them. With small block sizes on some drives, this can yield surprisingly low space savings from compression, because ZFS can only write whole disk-level blocks. If your disks are using 4K physical blocks (i.e. they’re ‘Advanced Format’ drives) and you’re using, say, 8K logical blocks, then in order to save any space from compression you need at least a 50% compression ratio, so that your original 8K block can be written in a single 4K physical block; if it doesn’t compress that much, it has to be written using two 4K blocks and you save no space. If you’re using 128 KB logical blocks you can win from much smaller compression ratios, because you don’t have to shrink as much in order to need fewer physical blocks.
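
                                                                                                            To make the rounding effect concrete, here is a small Python sketch of the arithmetic. It is a simplified model under stated assumptions (each logical block is compressed on its own and then rounded up to whole 4K physical blocks); the function name is hypothetical, and real pools add details such as RAID-Z parity and padding that are not modelled.

```python
import math


def allocated_bytes(logical_block, ratio, sector=4096):
    """Bytes allocated for one logical block after compression.

    ratio is compressed size / original size (0.7 means it shrinks to 70%).
    The compressed data is rounded up to whole physical sectors, and never
    takes more sectors than the uncompressed block would.
    """
    compressed = math.ceil(logical_block * ratio)
    sectors = math.ceil(compressed / sector)
    return min(sectors, logical_block // sector) * sector


small_70 = allocated_bytes(8 * 1024, 0.7)    # 8K block shrunk to 70%
small_50 = allocated_bytes(8 * 1024, 0.5)    # 8K block shrunk to 50%
large_70 = allocated_bytes(128 * 1024, 0.7)  # 128K block shrunk to 70%
```

                                                                                                            In this model the 8K block at 70% still occupies two 4K sectors (no saving), the 8K block at 50% fits in one, and the 128K block at 70% drops from 32 sectors to 23.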

                                                                                                            (SSDs are somewhat inconsistent in whether they report having 512 byte physical blocks or 4K physical blocks. It’s all smoke and mirrors on SSDs anyway, so the whole mess is inconvenient for ZFS.)

                                                                                                            1. 1

                                                                                                              Like anything: it depends. You probably won’t notice much difference between 4k blocks and 8k blocks for storing video files. But you might notice large differences when storing logs. I think what you’re saying is right, I’m just drawing more attention to the “it depends on what you’re storing” aspect of it.

                                                                                                              EDIT: Another place where this matters is Compressed ARC. The better the compression you get there, the more data you can jam into RAM. Ramajama

                                                                                                          1. 16


                                                                                                            • In 2004 Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, lack of interest in HTML, and apparent disregard for the needs of real-world web developers and created WHATWG as a way to get control over the web standards
                                                                                                            • they threw away a whole stack of powerful web technologies (XHTML, XSLT…) whose purpose was to make the web both machine readable and useful to humans
                                                                                                            • they invented Live Standards that are a sort of ex-post standards: always evolving documents, unstable by design, designed by their hands-on committee, that no one else can really implement fully, to establish a dynamic oligopoly
                                                                                                            • in 2017, Google and Microsoft joined the WHATWG to form a Steering Group for “improving web standards”
                                                                                                            • meanwhile the W3C realized that their core business is not to help lobbies spread broken DRM technologies, and started working on a new version of the DOM API.
                                                                                                            • in 2018, after months of political negotiations, they proposed to move the working draft to recommendation
                                                                                                            • in 2018, Google, Microsoft, Apple and Mozilla felt offended by this lack of lip service.

                                                                                                            It’s worth noticing that both these groups have their center in the USA but their decisions affect the whole world.

                                                                                                            So we could further summarize that we have two groups, one controlled by USA lobbies and the other controlled by the most powerful companies in the world, fighting for the control of the most important infrastructure of the planet.

                                                                                                            Under Trump’s Presidency.

                                                                                                            Take this, science fiction! :-D

                                                                                                            1. 27

                                                                                                              This is somewhat disingenuous. A web browser’s HTML parser needs to be compatible with the existing web, but W3C’s HTML4 specification couldn’t be used to build a web-compatible HTML parser, so reverse engineering was required for an independent implementation. With WHATWG’s HTML5 specification, for the first time in history, web-compatible HTML parsing was specified, with its adoption agency algorithm and all. This was a great achievement in standards writing.

                                                                                                              Servo is a beneficiary of this work. Servo’s HTML parser was written directly from the specification without any reverse engineering, and it worked! Contrary to your implication, WHATWG lowered the barrier to entry for independent implementations of the web. Servo is struggling with CSS because CSS is still ill-specified in the manner of HTML4. For example, the only reasonable specification of table layout is an unofficial draft: https://dbaron.org/css/intrinsic/ For a laugh, count the number of times “does not specify” appears in CSS2’s table chapter.

                                                                                                              1. 4

                                                                                                                You say backwards compatibility is necessary, and yet Google managed to get all major sites to adopt AMP in a matter of months. AMP has stricter validation rules than even XHTML.

                                                                                                                XHTML could have easily been successful, if it hadn’t been torpedoed by the WHATWG.

                                                                                                                1. 15

                                                                                                                  That has nothing to do with the AMP technology, but with Google providing a CDN and preloading (i.e., IMHO, abusing their market position).

                                                                                                                2. 2

                                                                                                                  Disingenuous? Me? Really? :-D

                                                                                                                  Who was in the working group that wrote CSS2 specification?

                                                                                                                  I bet a coffee that each of those “does not specify” was the outcome of a political compromise.

                                                                                                                  But again, beyond the technical stuff, don’t you see a huge geopolitical issue?

                                                                                                                3. 15

                                                                                                                  This is an interesting interpretation, but I’d call it incorrect.

                                                                                                                  • the reason to create whatwg wasn’t about control
                                                                                                                  • XHTML had little traction, because of developers
                                                                                                                  • html5 (a whatwg standard fwiw) was the first meaningful HTML spec because it actually finally explained how to parse it
                                                                                                                  • w3c didn’t “start working on a new DOM”. They copy/backport changes from whatwg, hoping to provide stable releases of living standards
                                                                                                                  • this has nothing to do with DRM (or EME). These are completely different people!
                                                                                                                  • this isn’t about lobby groups, nor is it about influencing politics in the US or anywhere.

                                                                                                                  I’m not speaking on behalf of my function in the w3c working group I’m in, nor for Mozilla. But those positions provided me with the understanding and background information to post this comment.

                                                                                                                  1. 8

                                                                                                                    XHTML had little traction, because of developers

                                                                                                                    I remember that in the early 2000s everyone started to write <br/> instead of <br> and it was considered cool and modern. There were 80x15 badges everywhere saying the website was in XHTML. My Motorola C380 phone supported WAP and some XHTML websites, but not regular HTML, in its built-in browser. So I had the impression that XHTML was very popular.

                                                                                                                    1. 6

                                                                                                                      xhtml made testing much easier. For me it changed many tests from using regexps (qr#<title>foo</title>#) to using any old XML parser and XPath.
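As a rough illustration of that style of test (a sketch in Python rather than the Perl the regexp above comes from): well-formed XHTML can be loaded with a stock XML parser and queried by path. The sample page and the `page_title` helper are made up for the example, and `xml.etree.ElementTree` only supports a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so path queries need a prefix mapping.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>foo</title></head>
  <body><p>Hello<br/>world</p></body>
</html>"""


def page_title(xhtml: str):
    """Return the text of <title>, or None if the page has no title element."""
    root = ET.fromstring(xhtml)  # raises ET.ParseError if not well-formed
    return root.findtext("./x:head/x:title", default=None, namespaces=XHTML_NS)


# The test becomes a structural assertion instead of a regexp match.
assert page_title(SAMPLE_PAGE) == "foo"
```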

                                                                                                                      1. 3

                                                                                                                        Agreed. Worth noting that, after the html5 parsing algorithm was fully specified and libraries like html5lib became available, it became possible to apply exactly the same approach with html5 parsers outputting a DOM structure and then querying it with xpath expressions.

                                                                                                                  2. 4

                                                                                                                    XHTML was fairly clearly a mistake and unworkable in the real world, as shown by how many nominally XHTML sites weren’t, and didn’t validate as XHTML if you forced them to be treated as such. In an ideal world where everyone used tools that always created 100% correct XHTML, maybe it would have worked out, but in this one it didn’t; there are too many people generating too much content in too many sloppy ways for draconian error handling to work well. The whole situation was not helped by the content-type issue, where if you served your ‘XHTML’ as anything other than application/xhtml+xml it wasn’t interpreted as XHTML by browsers (instead it was HTML tag soup). One result was that you could have non-validating ‘XHTML’ that still displayed in browsers because they weren’t interpreting it as XHTML and thus weren’t using strict error handling.

                                                                                                                    (This fact is vividly illustrated through syndication feeds and syndication feed handlers. In theory all syndication feed formats are strict and one of them is strongly XML based, so all syndication feeds should validate and you should be able to consume them with a strictly validating parser. In practice plenty of syndication feeds do not validate and anyone who wants to write a widely usable syndication feed parser that people will like cannot insist on strict error handling.)
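A small sketch of the difference between draconian and forgiving error handling, using only Python's standard library. Note the assumptions: `html.parser` is a lenient tokenizer, not an implementation of the HTML5 parsing algorithm, and `strict_parse_ok` and `lenient_text` are illustrative names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text content and never rejects malformed markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strict_parse_ok(markup: str) -> bool:
    """XML-style handling: any well-formedness error rejects the whole page."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def lenient_text(markup: str) -> str:
    """Tag-soup-style handling: extract whatever text is there."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


tag_soup = "<p>Unclosed <b>bold<br> text</p>"
print(strict_parse_ok(tag_soup))  # False: the strict parser shows the user nothing
print(lenient_text(tag_soup))     # Unclosed bold text
```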

                                                                                                                    1. 2

                                                                                                                      there are too many people generating too much content in too many sloppy ways for draconian error handling to work well.

                                                                                                                      I do remember this argument was pretty popular back then, but I have never understood why.

                                                                                                                      I had no issue generating XHTML Strict pages from user content. This real world company had a couple hundred customers with quite varied needs (from e-commerce to online magazines and institutional web sites) and thousands of daily visitors.

                                                                                                                      We used XHTML and CSS to distribute highly accessible content, and we had pretty good results with a prototype based on XSL-FO.

                                                                                                                      To me back then the appeal to real world issues seemed specious. We literally had no issue. The issues I remember were all from IE.

                                                                                                                      You are right that a lot of mediocre software was unable to produce proper XHTML. But is this an argument?

                                                                                                                      Do not fix the software, let’s break the specifications!

                                                                                                                      It seems a little childish!

                                                                                                                      XHTML was not perfect, but it was the right direction.

                                                                                                                      Look at what we have now instead: unparsable contents, hundreds of incompatible javascript frameworks, subtle bugs, bootstrap everywhere (aka much less creativity) and so on.

                                                                                                                      Who gains most from this unstructured complexity?

                                                                                                                      The same people who now propose the final lock-in solution: WebAssembly.

                                                                                                                      Seeing linux running inside the browser is not funny anymore.

                                                                                                                      Going after incompetent developers was not democratization of the web, it was technological populism.

                                                                                                                      1. 2

                                                                                                                        What is possible does not matter; what matters is what actually happens in the real world. With XHTML, the answer is clear. Quite a lot of people spent years pushing XHTML as the way of the future on the web, enough people listened to them to generate a fair amount of ‘XHTML’, and almost none of it was valid and most of it was not being served as XHTML (which conveniently hid this invalidity).

                                                                                                                        Pragmatically, you can still write XHTML today. What you can’t do is force other people to write XHTML. The collective browser world has decided that one of the ways that people can’t force XHTML is by freezing the development of all other HTML standards, so XHTML is the only way forward and desirable new features appear only in XHTML. The philosophical reason for this decision is pretty clear; browsers ultimately serve users, and in the real world users are clearly not well served by a focus on fully valid XHTML only.

                                                                                                                        (Users don’t care about validation, they care about seeing web pages, because seeing web pages is their goal. Preventing them from seeing web pages is not serving them well, and draconian XHTML error handling was thus always an unstable situation.)

                                                                                                                        That the W3C has stopped developing XHTML and related standards is simply acknowledging this reality. There always have been and always will be a great deal of tag soup web pages and far fewer pages that validate, especially reliably (in XHTML or anything else). Handling these tag soup web pages is the reality of the web.

                                                                                                                        (HTML5 is a step forward for handling tag soup because for the first time it standardizes how to handle errors, so that browsers will theoretically be consistent in the face of them. XHTML could never be this step forward because its entire premise was that invalid web pages wouldn’t exist and if they did exist, browsers would refuse to show them.)

                                                                                                                        1. 0

                                                                                                                          Users don’t care about validation, they care about seeing web pages, because seeing web pages is their goal.

                                                                                                                          Users do not care about the quality of concrete because having a home is their goal.
                                                                                                                          There will always be incompetent architects, thus let them work their way so that people get what they want.

                                                                                                                          Users do not care about car safety because what they want is to move from point A to point B.
                                                                                                                          There will always be incompetent manufacturers, thus let them work their way so that people get what they want.

                                                                                                                          That’s not how engineering (should) work.

                                                                                                                          Was XHTML flawless? No.
                                                                                                                          Was it properly understood by the average web developers that most companies like to hire? No.

                                                                                                                          Was it possible to improve it? Yes. Was it better than the current javascript driven mess? Yes!

                                                                                                                          The collective browser world has decided…

                                                                                                                          Collective browser world? ROTFL!

                                                                                                                          There’s a huge number of browsers’ implementors that nobody consulted.

                                                                                                                          Among others, in 2004, the most widely used browser, IE, did not join WHATWG.

                                                                                                                          Why did WHATWG not use the IE design if the goal was to liberate developers from the burden of well designed tools?

                                                                                                                          Why have we faced incompatibilities between browsers for years?

                                                                                                                          WHATWG was turned into one of the weapons in a commercial war for the control of the web.

                                                                                                                          Microsoft lost that war.

                                                                                                                          As always, the winners write the history that everybody knows and celebrates.

                                                                                                                          But whoever is old enough to remember the facts can see the hypocrisy of these manoeuvres pretty well.

                                                                                                                          There was no technical reason to throw away XHTML. The reasons were political and economical.

                                                                                                                          How can you sell Ads if a tool can easily remove them from the XHTML code? How can you sell API access to data, if a program can easily consume the same XHTML that users consume? How can you lock users, if they can consume the web without a browser? Or with a custom one?

                                                                                                                          The WHATWG did not serve users’ interests, whatever Mozilla’s intentions were in 2004.

                                                                                                                          They served some businesses at the expense of the users and of all the high quality web companies that didn’t have many issues with XHTML.

                                                                                                                          Back then it was possible to disable Javascript without losing access to the web’s functionality.

                                                                                                                          Try it now.

                                                                                                                          Back then people were exploring the concept of the semantic web with the passion people now bring to the latest JS framework.

                                                                                                                          I remember experiments with web readers for blind people that could never work with the modern js polluted web.

                                                                                                                          You are right, W3C abandoned its leadership in the engineering of the web back then.

                                                                                                                          But you can’t sell to a web developer bullshit about HTML5.

                                                                                                                          Beyond a few new elements and a slightly more structured page (that could have been done in XHTML too) all its exciting innovations were… more Javascript.

                                                                                                                          Users did not gain anything good from this, just less control over contents, more ads, and a huge security hole worldwide.

                                                                                                                          Because, you know, when you run a javascript in Spain that was served to you from a server in the USA, who is responsible for such javascript running on your computer? Under which law?

                                                                                                                          Do you really think that such legal issues were not taken into account by the browser vendors that fueled this involution of the web?

                                                                                                                          I cannot believe they were so incompetent.

                                                                                                                          They knew what they were doing, and did it on purpose.

                                                                                                                          Not to serve their users. To use those who trusted them.

                                                                                                                    2. 0

                                                                                                                      The mention of Trump is pure trolling—as you yourself point out, the dispute predates Trump.

                                                                                                                      1. 6

                                                                                                                        I think it’s more about all of this sounding like a science fiction plot than just taking a jab at the Trump presidency; just a few years ago nobody would have predicted that would have happened. So, no, not pure trolling.

                                                                                                                        1. 2

                                                                                                                          Fair enough. I’m sorry for the accusation.

                                                                                                                          Since the author is critical of Apple/Google/Mozilla here, I took it as a sort of guilt by association attack on them (I don’t mind jabs at Trump), but I see that it probably wasn’t that.

                                                                                                                          1. 2

                                                                                                                            No problem.

                                                                                                                            I didn’t see that possible interpretation or I wouldn’t have written that line. Sorry.

                                                                                                                        2. 3

                                                                                                                          After 20 years of Berlusconi and with our current impasse with the Government, no Italian could ever troll an American about his current President.

                                                                                                                          It was not my intention in any way.

                                                                                                                          As @olivier said, I was pointing to this surreal situation from an international perspective.

                                                                                                                          The USA controls most of the internet: most root DNS, the most powerful web companies, the standards of the web and so on.

                                                                                                                          Whatever effect Cambridge Analytica had on the election of Trump, it has shown the world that the internet is a common infrastructure that we have to control and protect together. Just like we should control the production of oxygen and global warming.

                                                                                                                          If Cambridge Analytica was able to manipulate USA elections (by manipulating Americans), what could Facebook itself do in Italy? Or in Germany?
                                                                                                                          Or what could Google do in France?

                                                                                                                          The Internet was a DARPA project. We can see it is a military success beyond any expectation.

                                                                                                                          I tried to summarize the debacle between W3C and WHATWG with a bit of irony because, in itself, it shows a pretty scary aspect of this infrastructure.

                                                                                                                          The fact that a group of companies dares to challenge W3C (which, at least in theory, is an international organisation) is evidence that they do not feel the need to pretend they are working for everybody.

                                                                                                                          They have too much power to care.

                                                                                                                          1. 4

                                                                                                                            The last point is the crux of the issue: are technologists willing to do the leg work of decentralizing power?

                                                                                                                            Because regular people won’t do this. They don’t care. Thus, they should have less say in the issue, though still some, as they are deeply affected by it too.

                                                                                                                            1. 0

                                                                                                                              No. Most won’t.

                                                                                                                              Technologists are a wide category that etymologically includes everyone who feels entitled to speak about how to do things.

                                                                                                                              So we have technologists that mislead people to invest in the “blockchain revolution”, technologists that mislead politicians to allow barely tested AI to kill people on the roads, technologists teaching in the Universities that neural networks computations cannot be explained and thus must be trusted as superhuman oracles… and technologists that classify as troll any criticism of mainstream wisdom.

                                                                                                                              My hope is in hackers: all over the world they have a better understanding of their political role.

                                                                                                                            2. 2

                                                                                                                              If anyone wonders about Berlusconi, Cracked has a great article on him that had me calling Trump a pale imitation of Berlusconi and his exploits. Well, until Trump got into the US Presidency, which is a bigger achievement than Berlusconi’s. He did that somewhat by accident, though. Can’t last 20 years either. I still think Berlusconi has him beat as the biggest scumbag of that type.

                                                                                                                              1. 2

                                                                                                                                Yeah, the article is funny, but Berlusconi was not. Not for Italians.

                                                                                                                                His problems with women did not impress us much, except when it became clear most of them were underage.

                                                                                                                                But the damage he did to our laws and (worse) to our public ethics will last for decades.
                                                                                                                                He did not just change the law to help himself: he destroyed most legal tools to fight organized crime, bribes, and corruption.
                                                                                                                                Worse, he helped a whole generation of younger people like him to be bold about their smartness with law workarounds.

                                                                                                                                I pray for the US and the whole world that Trump is not like him.

                                                                                                                        1. 3

                                                                                                                          An interesting link. A bit light on the sourcing and heavy on the opinion, but hey, it’s Medium.

                                                                                                                          It kind of glossed over why Canon became dominant in the later ages of film - the story goes that Nikon introduced AF (everyone was working on this, it was a Big Thing) and asked pros whether they liked it or not. Pros being pros, they said it was too slow for pro (sports) work, and Nikon said, fair enough. Canon on the other hand, decided to fix the AF problem, which they did, captured a huge share of the market, and left Nikon scrambling to catch up.

                                                                                                                          I have no sources for this, just memories and anecdotes picked up on the internet. Sorry.

                                                                                                                          Another interesting thing is that Nikon only lately, with the introduction of electronically controlled aperture mechanisms in the lens, has come to parity with the Canon EF mount - just electronic contacts, no screws or levers.

                                                                                                                          1. 4

                                                                                                                            My understanding is that part of Canon’s edge in the 1990s ultimately came from the fact that they were willing to move to a new mount in order to enable better AF and other things (FD to EOS/EF in 1987). Arguably Canon could afford to do this because they were the underdogs; moving to a new mount and thus instantly orphaning all of their existing customers wasn’t the huge deal it would have been for Nikon, because Canon didn’t have as many customers.

                                                                                                                            (In the context of this Medium story, I think this is an important difference. It shifts the lessons from ‘Nikon not grasping things’ to ‘underdogs can afford to do disruptive things that leaders can’t’.)

                                                                                                                            1. 1

                                                                                                                              Exactly right. It is very much like Apple switching to OS X from System 9—they had little to lose.

                                                                                                                              I tried to make this point so sorry it didn’t come across.

                                                                                                                              1. 1

                                                                                                                                Canon did have a lot of customers. The Canon FD mount cameras were very popular and successful, and they caused Nikon a lot of pain as they captured the “lower end” of the amateur market.

                                                                                                                                Ultimately the move to EF was a good choice, but it caused a lot of bad blood. Pros could afford to move eventually, but a lot of other shooters who had “invested” in the FD system were pissed. Part of it was of course because FD lenses and bodies instantly lost potential resale value, but I also think a large part of it was the feeling of being “cheated” by the company they had chosen. Fanboyism is by no means only confined to the internet age.

                                                                                                                                1. 1

                                                                                                                                  I think there’s more to it than fanboyism. Interchangeable lens cameras are a system, and when you buy into a system one of the things you’re taking a bet on is the continued life and development of the system. When Canon changed their lens mount, they killed FD as a system. Your existing FD gear would still work, but you weren’t getting any improved future cameras or any improved future lenses. People are naturally angry about a system being ended on them this way; it would be somewhat like Apple declaring out of the blue one day that they were stopping all future development of MacOS hardware and software, and what you had now was all you’d ever have. The drop in the resale value of your current hardware would be the least of your worries.

                                                                                                                                  (Not entirely like it, because 1980s cameras didn’t have security vulnerabilities.)

                                                                                                                                  1. 1

                                                                                                                                    That’s a good point… however, it was clear to everyone that AF was the future, and whether you shot Nikon or Canon you would have to update all your gear to take advantage of the new features. Granted, many people were perfectly happy to keep on using older manual focus Nikkor glass on AF Nikons, and that path was denied to Canon shooters.

                                                                                                                                    1. 1

                                                                                                                                      One difference between a gradual shift (in any sort of system) and an abrupt, all at once shift is that in the latter, you’re dependent on the new system having everything you need right away. If it doesn’t, either you have to do without, delay your shift (and do without the new system’s benefits), or use two systems for a while (in this case, carrying two cameras, using two sets of film, etc).

                                                                                                                                      I don’t know how many Canon FD lenses were available in 1987 at the time that Canon announced EOS/EF, but I suspect that EOS didn’t launch with equivalents of all of them available. Although according to this source (and also) Canon does seem to have had an impressively large number of lenses available that year. Interestingly, it took until 1989 for them to put out an 85 mm prime, despite that being a common portrait focal length.

                                                                                                                                      1. 1

                                                                                                                                        One of the things that probably irked Canon FD shooters was that there was no way to adapt the FD lenses to EF without an adapter with optical elements.

                                                                                                                                        Canon did make such an adapter but it was special order only, and supposedly offered to owners of FD superteles so they could continue using these lenses on the new EOS system.

                                                                                                                                        The FD system was pretty complete.

                                                                                                                              2. 2

                                                                                                                                Hello... ouch, light on sourcing :-) I included photos of magazine articles, ads, and brochures as well as my own lens ;-) What am I missing?

                                                                                                                                I tried not to gloss over this but clearly it didn’t sink in. Nikon was extremely focused on professionals and took a very conservative approach for this reason. They tended to view AF as a consumer feature and kept a very clean separation between consumer and pro cameras over concerns of both cannibalization and making pro cameras seem too gimmicky (even to the degree of worrying about having a backup mechanical shutter in the F3).

                                                                                                                                There are many magazine articles at the time debating autofocus and critical of the speed. The F3AF is horrendously slow and because it required a special finder and body and had only two slow lenses it was much more of an experiment. But the speed made it seem self-fulfilling.

                                                                                                                                It is worth noting that Nikon’s AF lens line came out the same year as the EF mount. It was also the same year as the Nikon F4, which was a pro camera with autofocus. The F4 was a big improvement over the F3 even without autofocus, which led to a slower ramp-up of AF with pros. I posted on FB a photo of the full range of AF lenses introduced at the time. Also worth noting is that shortly after the intro the AF lenses were all revised again to add the “D” designation for distance information from lens to camera, something already in the EF series.

                                                                                                                                Nikon was loath to introduce electronic aperture since new lenses would not work at all with older cameras. It is only with the E and G lenses that the break has finally been made. Again they started with consumer lenses, interestingly enough.

                                                                                                                                1. 2

                                                                                                                                  I do appreciate the article, the effort put into it, and the discussion it has engendered. Thanks!

                                                                                                                                  I felt you got the main points across, but it jumped a bit from film to digital to film again during its course.

                                                                                                                                  One thing worth mentioning, which I noticed leafing through old PopPhoto issues from the late 70s, is that Nikon could run ads both for the pro F3 and the consumer level Nikon bodies, and in a very aspirational way - Nikon cameras were affordable and easy to use, and when you stepped up to the “big leagues”, the lenses all worked! (This was pre Canon EF, of course.) The pro gear caused a “halo” effect. This is how Canon’s white glass works now.

                                                                                                                                  1. 1

                                                                                                                                    According to this source, E lenses actually started with pro lenses, first with the 2008 tilt/shift PC (perspective control) refresh and moving on to some of the long telephotos. The G lenses do seem to have started with consumer lenses (some sources say the 2000 era 70-300 f/4-5.6).

                                                                                                                                    The recent AF-P lenses have all been consumer lenses (and with strikingly low backward compatibility for Nikon). The cynical assumption is that removing the on-lens VR switch is largely about making them cheaper to manufacture (I say as an annoyed D7100 person).

                                                                                                                                    1. 1

                                                                                                                                      I have personally handled a 70-300mm f/4-5.6 G lens that was also screw-drive AF.

                                                                                                                                      1. 1

                                                                                                                                        That was a consumer lens (or no pro would ever bother). The first 300/2.8 was the screw lens. The second one, AF-I, had a lens motor but was just slow.

                                                                                                                                        1. 1

                                                                                                                                          Oh, indeed. My point (carelessly made) was that not all G spec lenses were AF-S/AF-I.

                                                                                                                                          I used to be able to keep track of Nikon’s lens compatibility but with the recent E and P spec lenses I’ve basically given up…

                                                                                                                                1. 2

                                                                                                                                  Is this likely to also apply to Firefox Beta? Or does it just seem to be a Nightly thing?

                                                                                                                                  1. 4

                                                                                                                                    This specific study is Nightly only. In general I’d expect and hope that similar studies would only be run against Nightly for various reasons. However, as far as I can tell the current Mozilla privacy policy makes no difference between Nightly and Beta, and therefore Mozilla could consider themselves to have permission to run these opt-out studies on Beta, just as they have with Nightly. Mozilla may clarify this in the future.

                                                                                                                                    On a pragmatic level, doing something like this with Beta would probably be much more visible and produce many more annoyed people and bad publicity, so I think Mozilla probably has good reasons to avoid it unless they have a study that they consider really important. Beta also has enough visibility that news about such an opt-out study would probably be widely distributed and well known, so you’d hear of it enough in advance to do something.

                                                                                                                                    (I’m the author of the article.)

                                                                                                                                    1. 2

                                                                                                                                      Thanks for this. I’ve been running Nightly on my mobile, and I’ve switched to Beta. (Newer Firefox versions than Stable on Android seem quite a bit faster; that’s why I care.)

                                                                                                                                  1. 59

                                                                                                                                    This is why we can’t have good software. This program could literally have been an empty file, a nothing at all, a name capturing the essence perfectly.

                                                                                                                                    I’m not sure I could disagree more strongly. An empty file only has the true behavior because of a bunch of incredibly non-obvious specific Unix behaviors. It would be equally reasonable for execution of this file to fail (like false) since there’s no hashbang or distinguishable executable format to decide how to handle it. At a somewhat higher level of non-obviousness, it’s really weird that true need be a command at all (and indeed, in almost all shells, it’s not: true is a builtin nearly everywhere).

                                                                                                                                    true being implementable in Unix as an empty file isn’t elegant—it’s coincidental and implicit.

                                                                                                                                    1. 15

                                                                                                                                      I mean, it’s POSIX specified behavior that any file that is executed that isn’t a loadable binary is passed to /bin/sh (“#!” as the first two bytes results in “implementation-defined” behavior), and it’s POSIX specified behavior that absent anything else, a shell script exits true.

                                                                                                                                      It’s no more coincidental and implicit than “read(100)” advances the file pointer 100 bytes, or any other piece of standard behavior. Sure, it’s Unix(-like)-specific, but, well, it’s on a Unix(-like) operating system. :)
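
                                                                                                                                      A minimal sketch of the empty-file behavior, assuming a POSIX-like shell, that mktemp -d is available, and that the temporary directory is not mounted noexec (empty_true is a made-up file name, used only for illustration):

                                                                                                                                      ```shell
                                                                                                                                      # Create a zero-byte executable file in a scratch directory.
                                                                                                                                      # "empty_true" is a hypothetical name, not a real utility.
                                                                                                                                      tmpdir=$(mktemp -d)
                                                                                                                                      : > "$tmpdir/empty_true"
                                                                                                                                      chmod +x "$tmpdir/empty_true"

                                                                                                                                      # The kernel cannot exec it as a binary, so the invoking shell
                                                                                                                                      # runs it as a shell script with no commands in it.
                                                                                                                                      "$tmpdir/empty_true"
                                                                                                                                      empty_status=$?

                                                                                                                                      echo "exit status: $empty_status"
                                                                                                                                      rm -rf "$tmpdir"
                                                                                                                                      ```

                                                                                                                                      The fallback here is done by the invoking shell, so calling the same file through a bare execve() from another program would not necessarily behave the same way.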

                                                                                                                                      1. 25

                                                                                                                                        It’s precisely specified, yes, but it’s totally coincidental that the specification says what it does. A perfectly-reasonable and nearly-equivalent specification in an alternate universe where Thompson and Ritchie sneezed five seconds earlier while deciding how executables should be handled would have precisely the opposite behavior.

                                                                                                                                        On the other hand, if read(100) did anything other than read 100 bytes, that would be extremely surprising and would not have come about from an errant sneeze.

                                                                                                                                        1. 35

                                                                                                                                          Black Mirror Episode: The year is 2100 and the world is ravaged by global warming. The extra energy aggregated over decades because non-executables went through /bin/sh pushed the environment past the tipping point where the feedback loops turned on. A time machine is invented, and one brave soul goes back in time with a feather, finds Thompson and makes him sneeze, saving humanity from the brink of extinction. But then he finds himself back in 2100 with the world still ravaged, and learns that it was fruitless because of npm and left-pad.

                                                                                                                                          1. 12

                                                                                                                                            it’s totally coincidental that the specification says what it does.

                                                                                                                                            This is true of literally all software specifications, in my experience.

                                                                                                                                            1. 8

                                                                                                                                              Surely we can agree that it is far more coincidental that an empty executable returns success immediately than that e.g. read(100) reads 100 bytes?

                                                                                                                                              1. 7

                                                                                                                                                Why isn’t 100 an octal (or a hex or binary) constant? Why is it bytes instead of machine words? Why is read bound to a file descriptor instead of having a record size from an ioctl, and then reading in 100 records?

                                                                                                                                                Just some examples. :)

                                                                                                                                                1. 5

                                                                                                                                                  Obviously, minor variations are possible. However, in no reasonable (or even moderately unreasonable) world, would read(100) write 100 bytes.

                                                                                                                                                  1. 12

                                                                                                                                                    Pass a mmap’ed pointer to read, and it shall write. :)

                                                                                                                                            2. 12

                                                                                                                                              The current (POSIX) specification is the product of historical evolution caused in part by /bin/true itself. You see, in V7 Unix, the kernel did not execute an empty file (or shell scripts); it executed only real binaries. It was up to the shell to run shell scripts, including empty ones. Through a series of generalizations (starting in 4BSD with the introduction of csh), this led to the creation of #! and kernel support for it, and then POSIX requiring that the empty file trick be broadly supported.

                                                                                                                                              This historical evolution could have gone another way, but the current status is not the way it is because people rolled out of bed one day and made a decision; it is because a series of choices turned out to be useful enough to be widely supported, eventually in POSIX, and some choices to the contrary wound up being discarded.

                                                                                                                                              (There was a time when kernel support for #! was a dividing line between BSD and System V Unix. The lack of it in the latter meant that, for example, you could not make a shell script be someone’s login shell; it had to be a real executable.)

                                                                                                                                              1. 10

                                                                                                                                                The opposite isn’t reasonable though. That would mean every shell script would have to explicitly exit 0 or it will fail.

                                                                                                                                                Every. Shell. Script.

                                                                                                                                                And aside from annoying everyone, that wouldn’t even change anything. It would just make the implementation of true be exit 0, instead of the implementation of false be exit 1.

                                                                                                                                                And read(100) does do something besides read 100 bytes. It reads up to 100 bytes, and isn’t guaranteed to read the full 100 bytes. You must check the return value and use only the amount of bytes read.
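
                                                                                                                                                The short-read case can be seen from the shell, too. A rough sketch, assuming dd issues one read() per input block when given count=1 (and no iflag=fullblock); short_count is just an illustrative variable name:

                                                                                                                                                ```shell
                                                                                                                                                # Ask for one 100-byte block while only 3 bytes are available on the pipe.
                                                                                                                                                # dd reads up to 100 bytes once and passes on whatever it received.
                                                                                                                                                short_count=$(printf 'abc' | dd bs=100 count=1 2>/dev/null | wc -c | tr -d ' ')

                                                                                                                                                echo "asked for 100 bytes, got $short_count"
                                                                                                                                                ```

                                                                                                                                                A C caller has the same obligation: loop on read() or otherwise act on its return value rather than assuming the buffer was filled.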

                                                                                                                                                1. 7

                                                                                                                                                  It’s not obvious to me that an empty file should count as a valid shell script. It makes code generation marginally easier, I suppose. But I also find something intuitive to the idea that a program should be one or more statements/expressions (or functions if you need main), not zero or more.

                                                                                                                                                  1. 3

                                                                                                                                                    So if you run an empty file with sh, you would prefer it exits failure. And when you run an empty file with python, ruby, perl, et al., also failures?

                                                                                                                                                    Why should a program have one or more statements / expressions? A function need not have one or more statements / expressions. Isn’t top level code in a script just a de facto main function?

                                                                                                                                                    It’s intuitive to me that a script, as a sequence of statements to run sequentially, could have zero length. A program with an entry point needs to have at least a main function, which can be empty. But a script is a program where the entry point is the top of the file. It “has a main function” if the file exists.

                                                                                                                                                    1. 3

                                                                                                                                                      I think whatever the answer is, it makes equal sense for Perl, Python, Ruby, shell, any language that doesn’t require main().

                                                                                                                                                      In my opinion, your last argument begs the question. If an empty program is considered valid, then existing is equivalent to having an empty main. If not, then it isn’t.

                                                                                                                                                      In any case, I don’t mean to claim that it’s obvious or I’m certain that an empty program should be an error, just that it seems like a live option.

                                                                                                                                                    2. 2

                                                                                                                                                      Exactly. It sounds like arbitrary hackery common in UNIX development. Just imagine writing a semi-formal spec that defines a program as “zero characters” which you pass onto peer review. They’d say it was an empty file, not a program.

                                                                                                                                                      1. 2

                                                                                                                                                        I guess true shouldn’t be considered a program. It is definitely tied to the shell it runs in, as you wouldn’t call execv("true", {"/bin/true", NULL}) to exit a program correctly, for example. true has no use outside of the shell, so it makes sense to have it use the shell’s features. That is why now it tends to be a builtin. But having it as a builtin is not specified by POSIX. Executing a file, on the other hand, is, and the spec says the default exit code is 0 or “true”. By executing an empty file, you’re then asking the shell to do nothing, and then return true. So I guess it is perfectly fine for true to just be an empty file. Now I do agree that such a simple behavior has (like often with unix) way too many ways to be executed, and people are gonna fight about it for quite some time! What about these?

                                                                                                                                                        alias true=(exit)
                                                                                                                                                        alias true='/bin/sh /dev/null'
                                                                                                                                                        alias true='sh -c "exit $(expr `false;echo $? - $?`)"'

                                                                                                                                                        The one true true !
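
For what it's worth, the empty-file behavior is easy to check. Here is a minimal sketch in Python, assuming a POSIX system with /bin/sh; the temporary file is a hypothetical stand-in for a zero-byte true, not how any real system ships it:

```python
import os
import subprocess
import tempfile

# Create an empty file to stand in for a hypothetical zero-byte `true`.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    # Hand the empty file to the shell as a script: there is nothing to run,
    # so the shell finishes with exit status 0.
    empty_status = subprocess.run(["/bin/sh", path]).returncode
    # Same idea as the second alias above: /dev/null reads as an empty script.
    devnull_status = subprocess.run(["/bin/sh", "/dev/null"]).returncode
finally:
    os.unlink(path)

print(empty_status, devnull_status)
```

Both statuses should come out as 0, which is all that "true" means to the shell.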

                                                                                                                                                        1. 1

                                                                                                                                                          It depends upon the system. There is IEFBR14, a program IBM produced to help make files in JCL which is similar to /bin/true. So there could be uses for such a program.

                                                                                                                                                          It also has the distinction of being a program that was one instruction long and still had a bug in it.

                                                                                                                                                          1. 1

                                                                                                                                                            “That is why now it tends to be a builtin.”

                                                                                                                                                            Makes sense. If tied to the shell and unusual, I’d probably put something like this into the interpreter of the shell as an extra condition or for error handling. Part of parsing would identify an empty program. Then, either drop or log it. This is how such things are almost always handled.

                                                                                                                                                      2. 1

                                                                                                                                                        That would mean every shell script would have to explicitly exit 0 or it will fail.

                                                                                                                                                        I don’t see how that follows.

                                                                                                                                                        Once the file is actually passed to the shell, it is free to interpret it as it wishes. No reasonable shell language would force users to specify successful exit. But what the shell does is not in question here; it’s what the OS does with an empty or unroutable executable, for which I am contending there is not an obvious behavior. (In fact, I think the behavior of running it unconditionally with the shell is counterintuitive.)

                                                                                                                                                        And read(100) does do something besides read 100 bytes.

                                                                                                                                                        You’re being pedantic. Obviously, under some circumstances it will set error codes, as well. It very clearly reads some amount of data, subject to the limitations and exceptions of the system; zero knowledge of Unix is required to intuit that behavior.

                                                                                                                                                        1. 7

                                                                                                                                                          I don’t see how that follows.

                                                                                                                                                          You claim the exact opposite behavior would have been equally reasonable. That is, the opposite of an empty shell script exiting true. The precise opposite would be an empty shell script—i.e. a script without an explicit exit—exiting false. This would affect all shell scripts.

                                                                                                                                                          Unless you meant the opposite of executing a file not loadable as an executable binary by passing it to /bin/sh, in which case I really would like to know what the “precise opposite” of passing a file to /bin/sh would be.

                                                                                                                                                          You’re being pedantic. Obviously, under some circumstances it will set error codes, as well. It very clearly reads some amount of data, subject to the limitations and exceptions of the system; zero knowledge of Unix is required to intuit that behavior.

                                                                                                                                                          No. Many people assume read will fill the buffer size they provide unless they are reading the trailing bytes of the file. However, read is allowed to return any number of bytes within the buffer size at any time.

                                                                                                                                                          It also has multiple result codes that are not errors. Many people assume when read returns -1 that means error. Did you omit that detail for brevity, or was it not obvious to you?
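
Both points can be reproduced with a pipe. This is a rough sketch in Python, assuming a POSIX system; read_exact is a hypothetical helper name, and Python raises an exception where C's read() would return -1 with errno set to EAGAIN:

```python
import os


def read_exact(fd, n):
    """Keep calling read() until n bytes arrive or EOF is hit."""
    parts = []
    while n > 0:
        data = os.read(fd, n)
        if not data:                # EOF before n bytes were available
            break
        parts.append(data)
        n -= len(data)
    return b"".join(parts)


# A pipe makes short reads easy to reproduce.
r, w = os.pipe()
os.write(w, b"0123456789")          # only 10 bytes are available

chunk = os.read(r, 100)             # ask for up to 100 bytes
short_read_len = len(chunk)         # fewer than requested, and not at EOF

# With the descriptor non-blocking, "no data yet" is reported as EAGAIN.
# That is a -1 from read() in C, but it is not a failure of the stream.
os.set_blocking(r, False)
try:
    os.read(r, 100)
    would_block = False
except BlockingIOError:
    would_block = True

os.set_blocking(r, True)
os.close(w)                         # writer gone: read() now reports EOF
at_eof = os.read(r, 100) == b""
os.close(r)

# The loop is what callers need if they really want a fixed number of bytes.
r2, w2 = os.pipe()
os.write(w2, b"abc")
os.write(w2, b"def")
os.close(w2)
whole = read_exact(r2, 6)
os.close(r2)

print(short_read_len, would_block, at_eof, whole)
```

Code that assumes one read() call fills the buffer works on regular files most of the time, which is probably why the assumption is so common.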

                                                                                                                                                          1. 6

                                                                                                                                                            If a file is marked executable, I think it’s quite intuitive that the system attempt to execute. If it’s not a native executable, the next obvious alternative would be to interpret it, using the default system interpreter.
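
A small experiment shows where that fallback seems to live. This is a sketch in Python, assuming a POSIX system with /bin/sh and a temp directory that allows execution; as far as I know, the kernel itself refuses the empty file, and it is the caller (a shell, or execvp in libc) that retries with the shell:

```python
import errno
import os
import stat
import subprocess
import tempfile

# Hypothetical empty file, marked executable.
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, stat.S_IRWXU)

try:
    try:
        # subprocess calls execve() directly, with no shell in between.
        subprocess.run([path])
        direct_errno = 0
    except OSError as exc:
        # Typically ENOEXEC ("Exec format error"); EACCES if the temp
        # directory is mounted noexec.
        direct_errno = exc.errno
    # The fallback step, done by hand: pass the file to /bin/sh as a script.
    fallback_status = subprocess.run(["/bin/sh", path]).returncode
finally:
    os.unlink(path)

print(errno.errorcode.get(direct_errno), fallback_status)
```

So "interpret it with the default interpreter" is a second attempt made after the native exec fails, not something the first exec does on its own.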

                                                                                                                                                        2. 3

                                                                                                                                                          Saying the behavior is totally (or even partially) coincidental is a bit strong. You’re ignoring the fundamental design constraints around shell language and giving the original designers more credit than they deserve.

                                                                                                                                                          Consider this experiment: you pick 100 random people (who have no previous experience to computer languages) and ask them to design a shell language for POSIX. How would all of these languages compare?

                                                                                                                                                          If the design constraints I’m talking about didn’t exist, then it would indeed be random and one would expect only ~50% of the experimental shell languages to have a zero exit status for an empty program.

                                                                                                                                                          I strongly doubt that is what you would see. I think you would see the vast majority of those languages specifying that an empty program have zero exit status. In that case, it can’t be random and there must be something intentional or fundamental driving that decision.

                                                                                                                                                          1. 7

                                                                                                                                                            I don’t care about how the shell handles an empty file. (Returning successful in that case is basically reasonable, but not in my opinion altogether obvious.) I’m stating that the operating system handling empty executables by passing them to the shell is essentially arbitrary.

                                                                                                                                                            1. 4

                                                                                                                                                              The reason for the existence of human intelligence isn’t obvious either but that doesn’t make it random. A hostile environment naturally provides a strong incentive for an organism to evolve intelligence.

                                                                                                                                                              As far as the operating system executing non-binaries with “/bin/sh” being arbitrary, fair enough. Though I would argue that once the concepts of the shebang line and an interpreter exist, it’s not far off to imagine the concept of a “default interpreter.” Do you think the concept of a default is arbitrary?

                                                                                                                                                          2. 1

                                                                                                                                                            It’s precisely specified, yes, but it’s totally coincidental that the specification says what it does.

                                                                                                                                                            laughs That’s really taking an axe to the sum basis of knowledge, isn’t it?

                                                                                                                                                        3. 2

                                                                                                                                                          Yes, an empty file signifying true violates the principle of least astonishment. However, if there were a way to have metadata comments about the file describing what it does, how it works, and what version it is without having any of that in the file, we’d have the best of both worlds.

                                                                                                                                                          1. 2

                                                                                                                                                            true being implementable in Unix as an empty file isn’t elegant—it’s coincidental and implicit.

                                                                                                                                                            But isn’t this in some sense exactly living up to the “unix philosophy”?

                                                                                                                                                            1. 3


                                                                                                                                                            2. 1

                                                                                                                                                              Why is it weird that true need be a command at all?

                                                                                                                                                              1. 0

                                                                                                                                                                To me, the issue is whether it is prone to error. If it is not, it is culture building because it is part of the lore.

                                                                                                                                                              1. 7

                                                                                                                                                                There are a number of issues with these ideas, but there are two I want to draw attention to specifically.

                                                                                                                                                                All byte spans are available to any user with a proper address. However, they may be encrypted, and access control can be performed via the distribution of keys for decrypting the content at particular permanent addresses.

                                                                                                                                                                While perpetually tempting, security through encryption keys has the major drawback that it is non-revocable (you can’t remove access once it’s been granted). As a result, over time it inevitably fails open; the keys leak and more and more people have access until everyone does. This is a major drawback of any security system based only on knowledge of some secret; we’ve seen it with NFS filehandles and we’ve seen it with capabilities, among others. Useful security/access control systems must cope with secrets leaking and people changing their minds about who is allowed access. Otherwise you should leave all access control out and admit honestly that all content is (eventually) public, instead of tacitly misleading people.
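
To make the "fails open" point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the dictionary store, the use of a SHA-256 hash as the permanent address, and the XOR keystream, which is not a real cipher and should not be used as one:

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)


# Hypothetical permanent store: address -> ciphertext, readable by anyone.
store = {}


def publish(key: bytes, plaintext: bytes) -> str:
    ciphertext = keystream_xor(key, plaintext)
    address = hashlib.sha256(ciphertext).hexdigest()
    store[address] = ciphertext
    return address


key_v1 = b"key handed to Alice and Bob"
addr_v1 = publish(key_v1, b"members-only notes")

# "Revoking" Bob can only mean re-encrypting under a new key, which
# lands at a new address...
key_v2 = b"key handed to Alice only"
addr_v2 = publish(key_v2, b"members-only notes")

# ...while the old ciphertext stays at its permanent address, and the
# key Bob already holds still opens it.
bob_reads = keystream_xor(key_v1, store[addr_v1])
print(bob_reads)
```

Nothing the publisher does after the fact changes what bob_reads contains, which is the sense in which access by key distribution cannot be taken back.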

                                                                                                                                                                […] Any application that has downloaded a piece of content serves that content to peers.

                                                                                                                                                                People will object to this, quite strongly and rightfully so. Part of the freedom of your machine belonging to you is the ability to choose what it does and does not do. Simply because you have looked at a piece of content does not mean that you want to use your resources to provide that content to other people.

                                                                                                                                                                1. 1

                                                                                                                                                                  Any application that has downloaded a piece of content serves that content to peers.

                                                                                                                                                                  The other issue with this is: what if the content is illegal (classified government information, child abuse, leaked personal health records, etc.)? There are some frameworks like Zeronet where you can choose to stop serving that content, and others like FreeNet where you don’t even know if you’re serving that content. (These come with a speed vs. anonymity trade-off, of course.)

                                                                                                                                                                  I do agree with the idea that any content you fetch, you should re-serve by default, maybe with some type of blockchain voting system to pass information along to all the peers if some of the content might be questionable, giving the user a chance to delete it.

                                                                                                                                                                  1. 2

                                                                                                                                                                    Author of the original post here. My prototype uses IPFS, which uses (or plans to support, at least) distributed optional blocklists of particular hashes. This would be my model for blocking content. Anybody who doesn’t block what they’re asked to block becomes liable for hosting it.
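
Roughly, the model I have in mind looks like the sketch below. It is plain Python with hypothetical names (Node, subscribe, serve) and hex SHA-256 digests; it is not the IPFS API, only the shape of an opt-in blocklist over content hashes:

```python
import hashlib


class Node:
    """Hypothetical peer: caches content by hash and honors a blocklist."""

    def __init__(self):
        self.cache = {}          # content hash -> bytes
        self.blocklist = set()   # hashes this node has agreed not to serve

    def fetch(self, data: bytes) -> str:
        # Fetching content caches it under its hash.
        digest = hashlib.sha256(data).hexdigest()
        self.cache[digest] = data
        return digest

    def subscribe(self, published_hashes):
        # Opt in to a list of hashes that someone else publishes.
        self.blocklist |= set(published_hashes)

    def serve(self, digest: str):
        if digest in self.blocklist:
            return None          # refuse to re-serve blocked content
        return self.cache.get(digest)


node = Node()
ok = node.fetch(b"cat picture")
flagged = node.fetch(b"something a blocklist maintainer flagged")
node.subscribe([flagged])

print(node.serve(ok), node.serve(flagged))
```

Because addresses are hashes of the content, a blocklist entry names exactly one piece of content and can be shared without sharing the content itself.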

                                                                                                                                                                1. 2

                                                                                                                                                                  As someone who is just starting to dive deep into operating systems, especially Unix, I’m grateful for all the writing you’ve done about the Oil project.

                                                                                                                                                                  Oil is taking shell seriously as a programming language, rather than treating it as a text-based UI that can be abused to write programs.

                                                                                                                                                                  One question in response to this statement is at what point does the shell language become just another programming language with an operating system interface. This question seems especially important when the Oil shell language targets users who are writing hundreds of lines of shell script. If someone is writing an entire program in shell script, what is the advantage of using shell script over a programming language? You seem to anticipate this question by comparing the Oil shell language to Ruby and Python:

                                                                                                                                                                  …Python and Ruby aren’t good shell replacements in general. Shell is a domain-specific language for dealing with concurrent processes and the file system. But Python and Ruby have too much abstraction over these concepts, sometimes in the name of portability (e.g. to Windows). They hide what’s really going on.

                                                                                                                                                                  So maybe these are good reasons (not sure if they are or aren’t) why Ruby and Python scripts aren’t clearly better than shell scripts. You also provide a mix of reasons why shell is better than Perl. For example: “Perl has been around for more than 30 years, and hasn’t replaced shell. It hasn’t replaced sed and awk either.”.

                                                                                                                                                                  But again, it doesn’t seem to clearly answer why the domain language for manually interacting with the operating system should be the same language used to write complex scripts that interact with the operating system. Making a language that is capable of both should provide a clear advantage to the user. But it’s not clear that there is an advantage. Why wouldn’t it be better to provide two languages: one that is optimized for simple use cases and another that is optimized for complex use cases? And why wouldn’t the language for complex use cases be C or Rust?

                                                                                                                                                                  1. 3

                                                                                                                                                                    My view is that the most important division between a shell language and a programming language is what each is optimized for in terms of syntax (and semantics). A shell language is optimized for running external programs, while a programming language is generally optimized for evaluating expressions. This leads directly to a number of things, like what unquoted words mean in the most straightforward context; in a fluid programming language, you want them to stand for variables, while in a shell language they’re string arguments to programs.

                                                                                                                                                                    With sufficient work you could probably come up with a language that made these decisions on a contextual basis (so that ‘a = …’ triggered expression context, while ‘a b c d’ triggered program context or something like that), but existing programming languages aren’t structured that way and there are still somewhat thorny issues (for example, how you handle if).
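
A toy version of that contextual split might look like the following. It is a sketch in Python with a hypothetical classify function; it treats only the exact form 'name = ...' as expression context and ignores the thorny cases entirely:

```python
import shlex


def classify(line: str):
    """Return ("assign", name, expr) or ("command", argv) for one line."""
    head, sep, tail = line.partition(" = ")
    if sep and head.strip().isidentifier():
        # 'a = ...' switches to expression context: bare words are variables.
        return ("assign", head.strip(), tail.strip())
    # Anything else is program context: bare words are string arguments.
    return ("command", shlex.split(line))


env = {"b": 2, "c": 3}

kind, name, expr = classify("a = b + c")
# Toy evaluator for the expression side; not safe for untrusted input.
value = eval(expr, {"__builtins__": {}}, env)

cmd = classify("a b c d")

print(kind, name, value, cmd)
```

Even this tiny rule has an ambiguity: 'echo = x' is classified as an assignment, although a shell user might have meant to run echo with two arguments.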

                                                                                                                                                                    Shell languages tend to wind up closely related to shells (if not the same) because shells are also obviously focused on running external programs over evaluating expressions. And IMHO shells grow language features partly because people wind up wanting to do more complex things both interactively and in their dotfiles.

                                                                                                                                                                    (In this model Perl is mostly a programming language, not a shell language.)

                                                                                                                                                                    1. 1

                                                                                                                                                                      Thanks, glad you like the blog.

                                                                                                                                                                      So maybe these are good reasons (not sure if they are or aren’t) why Ruby and Python scripts aren’t clearly better than shell scripts.

                                                                                                                                                                      Well, if you know Python, I would suggest reading the linked article about replacing shell with Python and see if you come to the same conclusion. I think among people who know both bash and Python (not just Python), the idea that bash is better for scripting the OS is universal. Consider that every Linux distro uses a ton of shell/bash, and not much Python (below a certain level of the package dependency graph).

                                                                                                                                                                      The main issue is that people don’t want to learn bash, which I don’t blame them for. I don’t want to learn (any more) Perl, because Python does everything that Perl does, and Perl looks ugly. However, Python doesn’t do everything that bash does.
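
To make that concrete, here is roughly what the shell one-liner `echo a b c | wc -w` turns into with Python's subprocess module. It is a sketch, assuming echo and wc are available on PATH:

```python
import subprocess

# Shell equivalent: echo a b c | wc -w
producer = subprocess.Popen(["echo", "a", "b", "c"], stdout=subprocess.PIPE)
consumer = subprocess.Popen(
    ["wc", "-w"], stdin=producer.stdout, stdout=subprocess.PIPE
)
# Drop our copy of the pipe's read end so echo can get SIGPIPE
# if wc exits first.
producer.stdout.close()
output, _ = consumer.communicate()
producer.wait()

word_count = int(output.decode().strip())
print(word_count)
```

It works, but the pipe, the file descriptors and the wait calls are all spelled out by hand, which is the part the shell syntax handles for you.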

                                                                                                                                                                      But again, it doesn’t seem to clearly answer why the domain language for manually interacting with the operating system should be the same language used to write complex scripts that interact with the operating system.

                                                                                                                                                                      There’s an easy answer to that: because bash is already both languages, and OSH / Oil aim to replace bash.

                                                                                                                                                                      Also, the idea of a REPL is old and not limited to shell. It’s nice to build your programs from snippets that you’ve already tested. Moving them to another language wouldn’t really make sense.