Threads for Birds

  1. 8
    <Mara> Isn't that because it's written in C and C is inherently unsafe even though hordes of "experts" decry otherwise?
    <Cadey> Don't say that, you'll incite the horde.
    

    This comment about removing sudo seemed a bit misplaced, considering systemd and the Linux kernel are also written in C. I get the practical issues, but this isn’t the only reason (or probably the main reason) to remove sudo.

    Another reason is that it might be overcomplicated for what most people need it for, which exacerbates any issues with C. I would say setuid is also risky even for safe languages, as you need a deep understanding of lots of potential footguns like TOCTOU races and file ownership. Minimizing the attack surface for privileged code is generally a good idea in any language.
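
    To make the TOCTOU point concrete, here’s a minimal C sketch of the classic access()/open() race and the open()-then-fstat() pattern that avoids it. The path and contents are illustrative, not from any real program:

    ```c
    /* Sketch of the access()/open() TOCTOU footgun in privileged code,
       plus the safer open-then-fstat pattern. Path is illustrative. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        const char *path = "/tmp/toctou-demo.txt";
        FILE *f = fopen(path, "w");
        if (!f) return 1;
        fputs("hello\n", f);
        fclose(f);

        /* UNSAFE: between access() and open(), an attacker can swap the
           path for a symlink to a file the real uid may not read. */
        if (access(path, R_OK) == 0) {
            int fd = open(path, O_RDONLY);  /* may open a different file */
            if (fd >= 0) close(fd);
        }

        /* SAFER: open first, then interrogate the fd you actually hold. */
        int fd = open(path, O_RDONLY | O_NOFOLLOW);
        struct stat st;
        if (fd >= 0 && fstat(fd, &st) == 0 && S_ISREG(st.st_mode)) {
            printf("opened regular file, size %lld\n", (long long)st.st_size);
            close(fd);
        }
        return 0;
    }
    ```

    The key difference is that fstat() inspects the file descriptor you already hold, so there is no window for the path to be swapped underneath you.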

    Thanks for posting/writing this.

    1. 3

      FWIW, systemd and the Linux kernel are at least trying to add Rust support to incrementally bring memory safety to important parts. I know of no such plans for sudo.

      1. 9

        This is rearranging deck chairs on the Titanic; we need to remove the ability to run unsafe code, not add new ways to hide unsafety. A memory space is like a water balloon, not a sponge; any violation of invariants can lead to complete compromise.

        1. 4

          There’s please, which is a Rust alternative to sudo with some additional features. It did have some nasty CVEs a while ago, but the author requested this audit, which shows the right stuff to me (and they’re all fixed now).

          1. 6

            If the Rust fandom spent as much time writing new kernels and ecosystems as they do shitposting about C and trying to inject Rust elsewhere, we could’ve replaced Linux by now.

            1. 5

              I don’t think this is actually true, and in any case I don’t think replacing Linux is in and of itself a solution to the actual problem of memory-unsafe C software resulting in exploitable vulnerabilities. If you had an OS written in Rust (and that still had some concept of a root user), but the sudo binary was still written in C and had an exploitable memory-safety vulnerability that allowed an attacker to get root privileges, that would still be a security problem. And contrariwise, if you leave Linux the way it is but rewrite sudo in a memory-safe language, this greatly mitigates the memory-unsafety problem on existing systems without going to the immense trouble of replacing Linux.

              1. 4

                Right, but it’s hard to escape the feeling that “it’s written in C” is an odd complaint to have about a single program sitting somewhere near the top of the userspace stack, in a post where virtually every single program that’s mentioned directly or indirectly – Linux (including Wireguard, which Tailscale is built on top of), ssh, auditd, coreutils – is written in C. Basically the only one that isn’t written in C is Nix itself, and even that one is C++.

                If – and I have no qualms with that claim – being written in C makes something inherently unsafe, then it’s probably worth contemplating the idea that this whole setup, with or without sudo, is inherently unsafe, from top to bottom. And replacing sudo, of all things in it, with something written in Rust (or any other memory-safe language, from Java to PHP and from Python to Visual Basic) is very much rearranging chairs on the Titanic’s deck.

                1. 2

                  Agreed (and it’s worth pointing out that writing a program in Rust doesn’t in and of itself guarantee that the program is free from exploitable security vulnerabilities - it just greatly mitigates an important class of such vulnerabilities). I do think it would be good if sudo were not written in C(/C++), but I think the same thing about every other program in the stack including Nix and the Linux kernel itself.

                  In any case, removing access to the sudo binary itself for users that don’t need to use it certainly couldn’t hurt, regardless of what programming language it’s written in. I wonder how this is implemented; i.e. what the NixOS options security.sudo.enable or security.sudo.execWheelOnly are actually doing.

                2. 1

                  Heh, and deploying exploit mitigation techniques into the OS for running existing software doesn’t require rewriting anything at all.

          1. 6

            ‘No end in sight’ seems wrong to me. I’ve been using Linux as a daily driver desktop for about 3 years now, and in that time Wayland has gone from completely nonviable for consumers to the default protocol for my distribution. We also now have Nvidia acceleration on Wayland and XWayland. The only major thing left I believe is screen sharing, and that gap is closing quickly.

            1. 4

              So like Julia but … .

              Please fill in the …

              1. 3

                I think they’re trying to build ‘Pandas in Jupyter notebooks, but based on Go instead of Python’.

                • The bottom of the README links to related projects PandasGo+ and NumGo+
                • Wang Fenjin and others have built a Go+ Jupyter kernel
                • Go+ is a script-like version of Go, and (a script-like language / less stuff to type) is definitely one of the things you want when interactively analysing data. Some examples, gleaned from the readme and the tutorials folder:
                  • You can put a shebang line in your Go+ script
                  • Type annotations can be omitted in various situations
                  • List comprehensions, to turn for-loops into expressions
                  • but your Go+ code can import Go modules …
                  • … and your Go+ code can be used as a Go module by Go code.
                  • There’s probably more stuff that makes it easier to type than Go; but unfortunately I don’t know Go, so I can’t tell what the examples are improving on. Sorry!

                No sign of learning from R or Julia, alas. Pandas and Jupyter were a big step forward for the Python data analysis ecosystem; but they are still a big step backward when you come from R or Julia data frames, or Rstudio + Rmarkdown notebooks (sorry, I don’t know the Julia equivalents here – VSCode + Weave.jl?).

                1. 3

                  Not sure if you know this, but Julia is a supported language for Jupyter notebooks (that’s where the ‘Ju’ comes from), so the Julia equivalents would be Jupyter notebooks as well!

                  1. 1

                    Coooool, I didn’t know! Thanks!

              1. 3

                I’m not entirely convinced a new model is needed. We already have memory-mapped files in all the major operating systems. And file pages can already be as small as 4KiB, which is tiny compared to common file sizes these days. Perhaps it would make sense to have even smaller pages for something like Optane, but do we really need to rethink everything? What would we gain?
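
                For concreteness, the existing memory-mapped-file model that comment refers to, in a minimal POSIX C sketch (the filename is made up):

                ```c
                /* Map a file into the address space and modify it through
                   plain pointer writes -- the existing mmap model. */
                #include <fcntl.h>
                #include <stdio.h>
                #include <string.h>
                #include <sys/mman.h>
                #include <unistd.h>

                int main(void) {
                    int fd = open("mmap-demo.bin", O_RDWR | O_CREAT | O_TRUNC, 0644);
                    if (fd < 0) return 1;
                    if (ftruncate(fd, 4096) != 0) return 1;   /* one 4 KiB page */

                    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, 0);
                    if (p == MAP_FAILED) return 1;

                    strcpy(p, "hello from the page cache");   /* ordinary store */
                    msync(p, 4096, MS_SYNC);                  /* force write-back */

                    printf("%s\n", p);
                    munmap(p, 4096);
                    close(fd);
                    return 0;
                }
                ```

                Stores go to the page cache and the kernel writes pages back to the file, which is exactly the block-device plumbing the persistent-memory proposal wants to bypass.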

                1. 4

                  What we’d gain is eliminating 50+ years of technical debt.

                  I recommend the Twizzler presentation mentioned a few comments down. It explains some of the concepts much better than I can. These people have really dug into the technical implications far deeper than me.

                  The thing is this: persistent memory blows apart the computing model that has prevailed for some 60+ years now. This is not the Von Neumann model or anything like that; it’s much simpler.

                  There have been, in virtually all computers since about the late 1950s, a minimum of two types of storage:

                  • primary storage, which the processor can access directly – it’s on the CPUs’ memory bus. Small, fast, volatile.
                  • secondary storage, which is big, slow, and persistent. It is not on the memory bus and not in the memory map. It is held in blocks, and the processor must send a message to the disk controller, ask for a particular block, and wait for it to be loaded from 2y store and placed into 1y store.

                  The processor can only work on data in 1y store, so everything must be fetched from 2y store, worked on, and put back.

                  This is profoundly limiting. It’s slow. It doesn’t matter how fast the storage is, it’s slow.

                  PMEM changes that. You have only RAM, but some of your RAM keeps its contents when the power is off.

                  Files are legacy baggage. When all your data is in RAM all the time, you don’t need files. Files are what filesystems hold; filesystems are an abstraction method for indexing blocks of secondary storage. With no secondary storage, you don’t need filesystems any more.

                  1. 7

                    I feel like there are a bunch of things conflated here:

                    Filesystems and file abstractions provide a global per-device namespace. That is not a great abstraction today, where you often want a truly global namespace (i.e. one shared between all of your devices) or something a lot more restrictive. I’d love to see more of the historical capability systems research resurrected here: for typical mobile-device UI abstractions, you really want a capability-based filesystem. Persistent memory doesn’t solve any of the problems of naming and access. It makes some of them more complicated: If you have a file on a server somewhere, it’s quite easy to expose remote read and write operations, it’s very hard to expose a remote mmap - trying to run a cache coherency protocol over the Internet does not lead to good programming models.

                    Persistence is an attribute of files but in a very complicated way. On *NIX, the canonical way of doing an atomic operation on a file is to copy the file, make your changes, and then move the new copy over the top of the old one. This isn’t great and it would be really nice if you could have transactional updates over ranges of files (annoyingly, ZFS actually implements all of the machinery for this, it just doesn’t expose it at the ZPL). With persistent memory, atomicity is hard. On current implementations, atomic operations with respect to CPU cache coherency and atomic operations with respect to committing data to persistent storage are completely different things. Getting any kind of decent performance out of something that directly uses persistent memory and is resilient in the presence of failure is an open research problem.
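
                    The copy-then-rename idiom described above, sketched in C (filenames illustrative). rename() replaces the target atomically on POSIX filesystems, so readers observe either the old contents or the new, never a mix:

                    ```c
                    /* Crash-safe file update via write-temp / fsync / rename. */
                    #include <fcntl.h>
                    #include <stdio.h>
                    #include <string.h>
                    #include <unistd.h>

                    static int atomic_write(const char *path, const char *data) {
                        char tmp[256];
                        snprintf(tmp, sizeof tmp, "%s.tmp", path);

                        int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
                        if (fd < 0) return -1;
                        if (write(fd, data, strlen(data)) < 0) { close(fd); return -1; }
                        if (fsync(fd) != 0) { close(fd); return -1; }  /* data first */
                        close(fd);

                        return rename(tmp, path);  /* atomic replacement on POSIX */
                    }

                    int main(void) {
                        atomic_write("config.txt", "old\n");
                        atomic_write("config.txt", "new\n");

                        char buf[16] = {0};
                        FILE *f = fopen("config.txt", "r");
                        if (!f) return 1;
                        fgets(buf, sizeof buf, f);
                        fclose(f);
                        printf("contents: %s", buf);
                        return 0;
                    }
                    ```

                    Note the fsync before rename: committing the data before committing the name change is the ordering that makes the update recoverable, and it is exactly this kind of explicit ordering that persistent memory makes hard to express.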

                    Really using persistent memory in this way also requires memory safety. As one of The Machine’s developers told me when we were discussing CHERI: with persistent memory, your memory-safety bugs last forever. You’ve now turned your filesystem abstractions into a concurrent GC problem.

                    1. 1

                      Excellent points; thank you.

                      May I ask, are you the same David Chisnall of “C is not a low-level language” paper? That is probably my single most-commonly cited paper. My compliments on it.

                      Your points are entirely valid, and that is why I have been emphasizing the “just for fun” angle of it. I do not have answers to some of these hard questions, but I think that at first, what is needed is some kind of proof of concept. Something that demonstrates the core point: that we can have a complex, rich, capable environment that is able to do real, interesting work, which in some ways exceeds the traditional *nix model for a programmer, which runs entirely in a hybrid DRAM/PMEM system, on existing hardware that can be built today.

                      Once this point has been made by demonstration, then perhaps it will be possible to tackle much more sophisticated systems, which provide reliability, redundancy, resiliency, and all that nice stuff that enterprises will pay lots of money for.

                      There is a common accusation, not entirely unjust, that the FOSS community is very good at imitating and incrementally improving existing implementations, but not so good at creating wholly new things. I am not here to fight that battle. What I was trying to come up with was a proposal to use some existing open technology – things that are already FOSS, already out there, and not new and untested and immature, but solid, time-proven tools that have survived despite decades in obscurity – and assemble them into something that can be used to explore new and largely uncharted territory.

                      ISTM, based on really very little evidence at all, that HPE got carried away with the potential of something that came out of their labs. It takes decades to go from a new type of component to large-scale highly-integrated mass production. Techies know that; marketing people do not. We may not have competitive memristor storage until the 2030s at the earliest, and HPE wanted to start building enterprise solutions out of it. Too much, too young.

                      Linux didn’t spring fully-formed from Torvalds’ brow ready to defeat AIX, HP-UX and Solaris in battle. It needed decades to grow up.

                      The Machine didn’t get decades.

                      Smalltalk has already had decades.

                      1. 1

                        Reply notifications are working again, so I just saw this!:

                        May I ask, are you the same David Chisnall of “C is not a low-level language” paper? That is probably my single most-commonly cited paper. My compliments on it.

                        That’s me, thanks! I’m currently working on a language that aims to address a lot of my criticisms of the C abstract machine.

                        Something that demonstrates the core point: that we can have a complex, rich, capable environment that is able to do real, interesting work, which in some ways exceeds the traditional *nix model for a programmer, which runs entirely in a hybrid DRAM/PMEM system, on existing hardware that can be built today.

                        I do agree with the ‘make it work, make it correct, make it fast’ model, but I suspect that you’ll find with a lot of these things that the step from ‘make it work’ to ‘make it correct’ is really hard. A lot of academic OS work fails to make it from research to production because they focus on making something that works for some common cases and miss the bits that are really important in deployment. For persistent memory systems, how you handle failure is probably the most important thing.

                        With a file abstraction, there’s an explicit ‘write state for recovery’ step and a clear distinction in the abstract machine between volatile and non-volatile storage. I can quite easily do two-phase commit to a POSIX filesystem (unless my disk is lying about sync) and end up with something that leaves my program in a recoverable state if the power goes out at any point. I may lose uncommitted data, but I don’t lose committed data. Doing the same thing with a single-level store is much harder because caches are (as their name suggests) hidden. Data that’s written back to persistent memory is safe, data in caches isn’t. I have to ensure that, independent of the order that things are evicted from cache, my persistent storage is in a consistent state. This is made much harder on current systems by the fact that atomic with respect to other cores is done via the cache coherency protocol, whereas atomic with respect to main memory (persistent or otherwise) is done via cache evictions and so guaranteeing that you have a consistent view of your data structures with respect to both other cores and persistent storage is incredibly hard.
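
                        A portable sketch of that ordering discipline. On real persistent memory you’d use cache-line write-back and fence instructions (clwb/sfence); here mmap/msync stands in for the flush, and the file and field names are made up. The point is persisting the payload before persisting the commit flag, so a crash in between never exposes a half-written record:

                        ```c
                        /* Ordering discipline for a persistent record: persist the
                           payload, then persist the 'valid' flag. msync stands in
                           for clwb + sfence on real pmem hardware. */
                        #include <fcntl.h>
                        #include <stdio.h>
                        #include <string.h>
                        #include <sys/mman.h>
                        #include <unistd.h>

                        struct record {
                            char payload[56];
                            int  valid;   /* set only after payload persists */
                        };

                        int main(void) {
                            int fd = open("pmem-demo.bin",
                                          O_RDWR | O_CREAT | O_TRUNC, 0644);
                            if (fd < 0) return 1;
                            if (ftruncate(fd, sizeof(struct record)) != 0) return 1;

                            struct record *r = mmap(NULL, sizeof *r,
                                                    PROT_READ | PROT_WRITE,
                                                    MAP_SHARED, fd, 0);
                            if (r == MAP_FAILED) return 1;

                            strcpy(r->payload, "committed state");
                            msync(r, sizeof *r, MS_SYNC);  /* 1: payload is durable */

                            r->valid = 1;
                            msync(r, sizeof *r, MS_SYNC);  /* 2: flag persisted after */

                            printf("valid=%d payload=%s\n", r->valid, r->payload);
                            munmap(r, sizeof *r);
                            close(fd);
                            return 0;
                        }
                        ```

                        Recovery code would treat any record with valid == 0 as uncommitted and discard it; the hard part on real hardware is that nothing forces cache lines to reach pmem in this order unless you flush and fence explicitly.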

                        The only systems that I’ve seen do this successfully segregated persistent and volatile memory and provided managed abstractions for interacting with it. I particularly like the FaRM project from some folks downstairs.

                        There is a common accusation, not entirely unjust, that the FOSS community is very good at imitating and incrementally improving existing implementations, but not so good at creating wholly new things.

                        I think there’s some truth to that accusation, though I’m biased from having tried to do something very different in an open source project. It’s difficult to get traction for anything different because you start from a position of unfamiliarity when trying to explain to people what the benefits are. Unless it’s solving a problem that they’ve hit repeatedly, it’s hard to get the message across. This is true everywhere, but in projects that depend on volunteers it is particularly problematic.

                        ISTM, based on really very little evidence at all, that HPE got carried away with the potential of something that came out of their labs. It takes decades to go from a new type of component to large-scale highly-integrated mass production. Techies know that; marketing people do not. We may not have competitive memristor storage until the 2030s at the earliest, and HPE wanted to start building enterprise solutions out of it. Too much, too young.

                        That’s not an entirely unfair characterisation. The Machine didn’t depend on memristors though; it was intended to work with the kind of single-level store that you can build today and be ready to adopt memristor-based memory when it became available. It suffered a bit from the same thing that a lot of novel OS projects do: they wanted to build a Linux compat layer to make migration easy, but once they had a Linux compat layer it was just a slow way of running Linux software. One of my colleagues likes to point out that a POSIX compatibility layer tends to be the last piece of native software written for any interesting OS.

                    2. 4

                      I think files are more than just an abstraction over block storage; they’re an abstraction over any storage. They’re a crucial part of the UX as well. Consider directories… Directories are not necessary for file systems to operate (it could just all be flat files) but they exist purely for usability and organisation. I think even in the era of PMEM users will demand some way to organise information, and it’ll probably end up looking like files and directories.

                      1. 2

                        Most mobile operating systems don’t expose files and directories and they are extremely popular.

                        1. 3

                          True, but those operating systems still expose filesystems to developers. Users don’t necessarily need to be end users. iOS and Android also do expose files and directories to end users now, although I know iOS didn’t for a long time.

                          1. 3

                            iOS also provides Core Data, which would be a better interface in the PMEM world anyway.

                            1. 2

                              True, but those operating systems still expose filesystems to developers.

                              Not all of them do, no.

                              NewtonOS didn’t. PalmOS didn’t. The reason being that they didn’t have filesystems.

                              iOS is just UNIX. iOS and Android devices are tiny Unix machines in your pocket. They have all the complexity of a desktop workstation – millions of lines of code in a dozen languages, multiuser support, all that – it’s just hidden.

                              I’m proposing not just hiding it. I am proposing throwing the whole lot away and putting something genuinely simple in its place. Not hidden complexity: eliminating the complexity.

                            2. 2

                              They tried. Really hard. But in the end, even Apple had to give up and provide the Files app.

                              Files are an extremely useful abstraction, which is why they were invented in the first place. And why they get reinvented every time someone tries to get rid of them.

                              1. 4

                                Files (as a UX and data interchange abstraction) are not the same thing as a filesystem. You don’t need a filesystem to provide a document abstraction. Smalltalk-80 had none. (It didn’t have documents itself, but I was on a team that added documents and other applications to it.) And filesystems tend to lack stuff you want for documents, like metadata and smart links and robust support for updating them safely.

                                1. 1

                                  I’m pretty sure the vast majority of iOS users don’t know Files exist.

                                  I do, but I almost never use it.

                                2. 1

                                  And extremely limiting.

                          1. 2

                            Interesting.

                            Prometheus documentation suggests that exporters should perform the work to gather metrics on every scrape, so the way the query exporter works would be in line with ‘best practice’. However there is a caveat that if the metrics are expensive to gather, it should perform the work and cache the results, only presenting the cache on scrape. It appears this would likely fall into the ‘expensive’ camp, although that might not be obvious if you’re testing on smaller databases.

                            1. 3

                              I’ve seen that in the docs and I think it’s a bad idea - you should never create an exporter that would, in itself, with the default Prometheus setup, hug your application to death. Or if you really want to follow it, make sure you’re caching internally so as to avoid this.

                              1. 2

                                Yes, you’re right. It doesn’t help that it’s all closed source as well.