Threads for cyberia

  1. 4

    Is there anything Flatpak does better than Nix? After Debian, then Ubuntu, then Arch and now NixOS, I just can’t imagine moving to something which isn’t at least as strict, flexible and easy (really!) as Nix.

    1. 2

      Potentially not, in terms of features? But I think it’s vastly more accessible to the type of user who doesn’t want to know too much about what’s going on under the hood of their OS.

      That’s nothing against Nix, I just don’t think it’s for a non-technical audience right now.

      1. 2

        That’s fair. I should’ve specified that I was thinking more about the packaging side of things. Hence the mention of “easy”, because I don’t remember the last time I learned a language which is so easy to read. It is sometimes hard to write, but once you have something that works it’s probably going to be much shorter and simpler than the equivalent in any other major packaging system.

      2. 2

        Documentation, probably. Last time I checked, the getting started page on Nix told you to use nix-env -i to add packages, which is apparently a really bad idea and something you should never do. That’s when I gave up on it.

        1. 2

          Nix documentation is sadly lacking. Honestly not too surprising given that it comes from an academic background, but I suspect a small handful of dedicated writers could turn that around in a few months. Not saying it’ll happen anytime soon, but I really hope it gets some traction (and some refactors to remove cruft like with).

      1. 2

        This is a really nice proposal for a solution, hidden behind a title that made me roll my eyes and think “oh god, not another one of these posts”.

        1. 121

          I used to give the same advice, but I completely changed my opinion over the past 10 years or so. I eventually put in the time and learned shell scripting. These days my recommendation is:

          1. Learn to use the shell. It’s a capable language that can take you very far.
          2. Use ShellCheck to automatically take care of most of the issues outlined in the article.
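
          As a small illustration of point 2 (the file and snippet are made up; ShellCheck’s exact wording may differ by version), an unquoted expansion like this gets flagged automatically:

          #!/usr/bin/env bash
          # backup.sh -- made-up example
          src=$1
          cp $src /tmp/backup/    # unquoted: breaks on paths containing spaces

          # Running "shellcheck backup.sh" flags the unquoted $src (SC2086, "Double quote to
          # prevent globbing and word splitting"); the fix is: cp "$src" /tmp/backup/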

          I really don’t want to figure out every project’s nodejs/python/ruby/make/procfile abomination of a runner script anymore. Just like wielding regular expressions, knowing shell scripting is a fundamental skill that keeps paying dividends over my entire career.

          1. 60

            Bingo.

            My advice is:

            • Always use #!/usr/bin/env bash at the beginning of your scripts (change if you need something else, don’t rely on a particular path to bash though).
            • Always add set -euo pipefail after that.
            • Always run shellcheck.
            • Always run shfmt.
            • Always pay attention to what version of bash you need to support, and don’t go crazy with “new” features unless you can get teammates to upgrade (this is particularly annoying because Apple ships an older version of bash without things like associative arrays).
            • Always use the local storage qualifier when declaring variables in a function.
            • As much as possible, declare things in functions and then at the end of your script kick them all off.
            • Don’t use bash for heavy-duty hierarchical data munging…at that point consider switching languages.
            • Don’t assume that a bashism is more-broadly acceptable. If you need to support vanilla sh, then do the work.
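
            A minimal skeleton following most of the advice above (the function names and the greeting are placeholders, not a prescription):

            #!/usr/bin/env bash
            set -euo pipefail

            usage() {
              echo "usage: $0 <name>" >&2
              exit 1
            }

            greet() {
              local name=$1               # 'local' keeps the variable out of the global scope
              printf 'Hello, %s\n' "$name"
            }

            main() {
              [[ $# -eq 1 ]] || usage
              greet "$1"
            }

            main "$@"                     # everything lives in functions; this line kicks them off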

            While some people like the author will cry and piss and moan about how hard bash is to write, it’s really not that bad if you take those steps (which to be fair I wish were more common knowledge).

            To the point some folks here have already raised, I’d be okay giving up shell scripting. Unfortunately, in order to do so, a replacement would:

            • Have to have relatively reasonable syntax
            • Be easily available across all nix-likes
            • Be guaranteed to run without additional bullshit (installing deps, configuring stuff, phoning home)
            • Be usable with only a single file
            • Be optimized for the use case of bodging together other programs and system commands with conditional logic and first-class support for command-line arguments, file descriptors, signals, exit codes, and other nixisms.
            • Be free
            • Not have long compile times

            There are basically no programming languages that meet those criteria other than the existing shell languages.

            Shell scripting is not the best tool for any given job, but across every job it’ll let you make progress.

            (Also, it’s kinda rich having a Python developer tell us to abandon usage of a tool that has been steadily providing the same, albeit imperfect, level of service for decades. The 2 to 3 switch is still a garbage fire in some places, and Python is probably the best single justification for docker that exists.)

            1. 26

              While some people like the author will cry and piss and moan about how hard bash is to write, it’s really not that bad if you take those steps (which to be fair I wish were more common knowledge).

              I think “nine steps” including “always use two third-party tools” and “don’t use any QoL features like associative arrays” does, in fact, make bash hard to write. Maybe Itamar isn’t just “crying and pissing and moaning”, but actually has experience with bash and still thinks it has problems?

              1. 2

                To use any language effectively there are some bits of tribal knowledge…babel/jest/webpack in JS, tokio or whatever in Rust, black and virtualenv in Python, credo and dialyzer in Elixir, and so on and so forth.

                Bash has many well-known issues, but maybe clickbait articles by prolific self-promoters that don’t offer a path forward also have problems?

                1. 15

                  If your problem with the article is that it’s clickbait by a self-promoter, say that in your post. Don’t use it as a “gotcha!” to me.

                  1. 2

                    I think there’s merit here in exploring the criticism, though there’s room for softening the tone. Every language has some form of “required” tooling that’s communicated through community consensus. What makes Bash worse than other languages that also require lots of tools?

                    There are a number of factors at play here, and I can see where @friendlysock’s frustration comes from. Languages exist on a spectrum between lots of tooling and little tooling. I think something like SML is on the “little tooling” end, where compilation alone is enough to add high assurance to the codebase. Languages like C are on the low-assurance part of this spectrum, where copious use of noisy compiler warnings, analyzers, and sanitizers guides development. Most languages live somewhere on this spectrum. What makes Bash’s particular compromises deleterious or not?

                    Something to keep in mind is that (in my experience) the Lobsters userbase seems to strongly prefer low-tooling languages like Rust over high-tooling languages like Go, so that may be biasing the discussion and reactions thereof. I think it’s a good path to explore though because I suspect that enumerating the tradeoffs of high-tooling or low-tooling approaches can illuminate problem domains where one fits better than the other.

                    1. 2

                      I felt that I sufficiently commented about the article’s thesis on its own merits, and that bringing up the author’s posting history was inside baseball not terribly relevant. When you brought up motive, it became relevant. Happy to continue in DMs if you want.

                    2. 6

                      You’re really quite hostile. This is all over scripting languages? Or are you passive aggressively bringing up old beef?

                  2. 9

                    Integrating shellcheck and shfmt into my dev process enabled my shell programs to grow probably larger than they should be. One codebase, in particular, is nearing probably like 3,000 SLOC of Bash 5 and I’m only now thinking about how v2.0 should probably be written in something more testable and reuse some existing libraries instead of reimplementing things myself (e.g., this basically has a half-complete shell+curl implementation of the Apache Knox API). The chief maintenance problem is that so few people know shell well that when I write “good” shell like I’ve learned over the years (and shellcheck --enable=all has taught me A TON), I have trouble finding coworkers to help out or to take it over. The rewrite will have to happen before I leave, whenever that may be.

                    1. 11

                      I’d be interested in what happens when you run your 3000 lines of Bash 5 under https://www.oilshell.org/ . Oil is the most bash compatible shell – by a mile – and has run thousands of lines of unmodified shell scripts for over 4 years now (e.g. http://www.oilshell.org/blog/2018/01/15.html)

                      I’ve also made tons of changes in response to use cases just like yours, e.g. https://github.com/oilshell/oil/wiki/The-Biggest-Shell-Programs-in-the-World


                      Right now your use case is the most compelling one for Oil, although there will be wider appeal in the future. The big caveat now is that it needs to be faster, so I’m actively working on the C++ translation (oil-native passed 156 new tests yesterday).

                      I would imagine your 3000 lines of bash would be at least 10K lines of Python, and take 6-18 months to rewrite, depending on how much fidelity you need.

                      (FWIW I actually wrote 10K-15K lines of shell as 30K-40K lines of Python early in my career – it took nearly 3 years LOL.)

                      So if you don’t have 1 year to burn on a rewrite, Oil should be a compelling option. It’s designed as a “gradual upgrade” from bash. Just running osh myscript.sh will work, or you can change the shebang line, run tests if you have them, etc.

                      There is an #oil-help channel on Zulip, linked from the home page.

                      1. 2

                        Thanks for this nudge. I’ve been following the development of Oil for years but never really had a strong push to try it out. I’ll give it a shot. I’m happy to see that there are oil packages in Alpine testing: we’re deploying the app inside Alpine containers.

                        Turns out that I was very wrong about the size of the app. It’s only about 600 SLOC of shell :-/ feels a lot larger when you’re working on it!

                        One thing in my initial quick pass: we’re reliant on bats for testing. bats seemingly only uses bash. Have you found a way to make bats use Oil instead?

                        1. 1

                          OK great looks like Alpine does have the latest version: https://repology.org/project/oil-shell/versions

                          I wouldn’t expect this to be a pain-free experience; however, I would say it should definitely be less effort than rewriting your whole program in another language!

                          I have known about bats for a long time, and I think I ran into an obstacle but don’t remember what it was. It’s possible that the obstacle has been removed (e.g. maybe it was extended globs, which we now support)

                          https://github.com/oilshell/oil/issues/297

                          In any case, if you have time, I would appreciate running your test suite with OSH and letting me know what happens (on Github or Zulip).

                          One tricky issue is that shebang lines are often #!/bin/bash, which you can change to be #!/usr/bin/env osh. However one shortcut I added was OSH_HIJACK_SHEBANG=osh
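
                          Concretely, trying an existing script under OSH can be as small as this (the script name is a placeholder; see the linked wiki page for the details of the shebang shortcut):

                          # run it under OSH without modifying anything
                          osh ./myscript.sh

                          # or change the script's shebang from #!/bin/bash to
                          #!/usr/bin/env osh

                          # or use the shortcut mentioned above
                          export OSH_HIJACK_SHEBANG=osh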

                          https://github.com/oilshell/oil/wiki/How-To-Test-OSH

                        2. 1

                          Moving away from Python? Now it has my interest… in the past I skipped it, knowing it’d probably take perf hits and have some complicated setup that isn’t a static binary.

                          1. 2

                            Yes, that has always been the plan, mentioned in the very first post on the blog. But it took a while to figure out the best approach, and that approach still takes time.

                            Some FAQs on the status here: http://www.oilshell.org/blog/2021/12/backlog-project.html

                            Python is an issue for speed, but it’s not an issue for setup.

                            You can just run ./configure && make && make install and it will work without Python.

                            Oil does NOT depend on Python; it just reuses some of its code. That has been true for nearly 5 years now – actually since the very first Oil 0.0.0 release. Somehow people still have this idea it’s going to be hard to install, when that’s never been the case. It’s also available on several distros like Nix.

                            1. 1

                              What is the status of Oil on Windows (apologies if it’s in the docs somewhere, couldn’t find any mentioning of this). A shell that’s written in pure C++ and has Windows as a first class citizen could be appealing (e.g. for cross-platform build recipes).

                              1. 1

                                It only works on WSL at the moment … I hope it will be like bash, and somebody will contribute the native Windows port :-) The code is much more modular than bash and all the Unix syscalls are confined to a file or two.

                                I don’t even know how to use the Windows syscalls – they are quite different from Unix! I’m not sure how you even do fork() on Windows. (I think Cygwin has emulation, but I’m not sure there is a way to do it without Cygwin.)

                                https://github.com/oilshell/oil/wiki/Oil-Deployments

                      2. 4

                        To the point some folks here have already raised, I’d be okay giving up shell scripting. Unfortunately, in order to do so, a replacement would: […] There are basically no programming languages that meet those criteria other than the existing shell languages.

                        I believe Tcl fits those requirements. It’s what I usually use for medium-sized scripts. Being based on text, it interfaces well with system commands, but does not have most of bash’s quirks (argument expansion is a big one), and can handle structured data with ease.

                        1. 4

                          Always use #!/usr/bin/env bash at the beginning of your scripts (change if you need something else, don’t rely on a particular path to bash though).

                          I don’t do this. Because all my scripts are POSIX shell (or at least as POSIX complaint as I can make them). My shebang is always #!/bin/sh - is it reasonable to assume this path?

                          1. 4

                            you will miss out on very useful things like set -o pipefail, and in general you can suffer from plenty of subtle differences between shells and shell versions. sticking to bash is also my preference for this reason.

                            note that the /usr/bin/env is important to run bash from wherever it is installed, e.g. the homebrew version on osx instead of the ancient one in /bin (which doesn’t support arrays iirc and acts weirdly when it comes across shell scripts using them)
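
                            a quick illustration of what pipefail buys you (bash; most plain /bin/sh implementations don’t have it):

                            false | cat; echo "exit: $?"    # exit: 0 -- the failure of 'false' is swallowed
                            set -o pipefail
                            false | cat; echo "exit: $?"    # exit: 1 -- the pipeline now reports the failure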

                            1. 4

                              My shebang is always #!/bin/sh - is it reasonable to assume this path?

                              Reasonable is very arbitrary at this point. That path is explicitly not mandated by POSIX, so if you want to be portable to any POSIX-compliant system you can’t just assume that it will exist. Instead, POSIX says that you can’t rely on any particular path, and that scripts should be modified according to the system’s standard paths at installation time.

                              I’d argue that these days POSIX sh isn’t any more portable than bash in any statistically significant sense though.

                              1. 2

                                Alpine doesn’t have Bash, just a busybox shell. The annoying thing is if the shebang line fails because there is no bash, the error message is terribly inscrutable. I wasted too much time on it.

                                1. 2

                                  nixos has /bin/sh and /usr/bin/env, but not /usr/bin/bash. In fact, those are the only two files in those folders.

                                2. 3

                                  https://mkws.sh/pp.html hardcodes #!/bin/sh. POSIX definitely doesn’t say anything about sh’s location, but I really doubt you won’t find a sh at /bin/sh on any UNIX system. Can anybody name one?

                                3. 2

                                  I would add, prefer POSIX over bash.

                                4. 18

                                  I checked, and shellcheck (at least the version on my computer) only catches issue #5 of the 5 I list.

                                  1. 14

                                    That’s because the other ones are options and not errors. Yes, typically they are good hygiene but set -e, for example, is not an unalloyed good, and at least some experts argue against using it.

                                    1. 3

                                      Not for lack of trying: https://github.com/koalaman/shellcheck/search?q=set+-e&type=issues

                                      There are tons of pedants holding us back IMO. Yes, “set -e” and other options aren’t perfect, but if you even know what those situations are, you aren’t the target audience of the default settings.

                                    2. 17

                                      I eventually put in the time

                                      Yup, that’s how you do it. It’s a good idea to put in the time to understand shell scripting. Most of the common misconceptions come out of misunderstanding. The shell is neither fragile (it’s been in use for decades, so it’s very stable) nor ugly (I came from JavaScript to learning shell script, and it seemed ugly indeed at first; now I find it very elegant). Keeping things small and simple is the way to do it. When things get complex, create another script; that’s the UNIX way.

                                      It’s the best tool for automating OS tasks. That’s what it was made for.

                                      +1 to using ShellCheck, I usually run it locally as

                                      shellcheck -s sh
                                      

                                      for POSIX compliance.

                                      I even went as far as generating my static sites with it: https://mkws.sh/. You’re using the shell daily for displaying data in the terminal; it’s a great tool for that, so why not use the same tool for displaying data publicly?

                                      1. 6

                                        No, it really is ugly. But I’m not sure why that matters

                                        1. 13

                                          I believe arguing if beauty is subjective or not is off topic. 😛

                                      2. 16

                                        I went the opposite direction - I was a shell evangelist during the time that I was learning it, but once I started pushing its limits (e.g. CSV parsing), and seeing how easy it was for other members of my team to write bugs, we immediately switched to Python for writing dev tooling.

                                        There was a small learning curve at first, in terms of teaching idiomatic Python to the rest of the team, but after that we had far fewer bugs (of the type mentioned in the article), much more informative failures, and much more confidence that the scripts were doing things correctly.

                                        I didn’t want to have to deal with package management, so we had a policy of only using the Python stdlib. The only place that caused us minor pain was when we had to interact with AWS services, and the solution we ended up using was just to execute the aws CLI as a subprocess and ask for JSON output. Fine!

                                        1. 15

                                          I tend to take what is, perhaps, a middle road. I write Python or Go for anything that needs to do “real” work, e.g. process data in some well-known format. But then I tie things together with shell scripts. So, for example, if I need to run a program, run another program, and then collect and combine the outputs of the two somehow, there’s a Python script that does the combining, and a shell script that runs all three programs and feeds them their inputs.

                                          I also use shell scripts to automate common dev tasks, but most of these are literally one-ish line, so I don’t think that counts.

                                          1. 2

                                            This makes sense to me

                                          2. 8

                                            we immediately switched to Python for writing dev tooling.

                                            FWIW when shell runs out of steam for me, I call Python scripts from shell. I would say MOST of my shell scripts call a Python script I wrote.

                                            I don’t understand the “switching” mentality – Shell is designed to be extended with other languages. “Unix philosophy” and all that.

                                             I guess I need to do a blog post about this? (Ah, I remember I have a draft and came up with a title, “The Worst Amounts of Shell Are 0% or 100%”: https://oilshell.zulipchat.com/#narrow/stream/266575-blog-ideas/topic/The.20Worst.20Amount.20of.20Shell.20is.200.25.20or.20100.25 (requires login).)

                                            (Although I will agree that it’s annoying that shell has impoverished flag parsing … So I actually write all the flag parsers in Python, and use the “task file” pattern in shell.)

                                            1. 2

                                              What is the “task file” pattern?

                                              1. 5

                                                It’s basically a shell script (or set of scripts) you put in your repo to automate common things like building, testing, deployment, metrics, etc.

                                                Each shell function corresponds to a task.
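
                                                A tiny hypothetical sketch of the shape (task names and commands are placeholders):

                                                #!/usr/bin/env bash
                                                # run.sh -- each function is a task, invoked as ./run.sh <task> [args...]
                                                set -euo pipefail

                                                build() {
                                                  echo "building..."         # e.g. a compiler invocation
                                                }

                                                check() {
                                                  echo "running tests..."    # e.g. a test runner
                                                }

                                                deploy() {
                                                  local env=${1:-staging}
                                                  echo "deploying to $env"
                                                }

                                                "$@"    # dispatch: ./run.sh build, ./run.sh deploy prod, ...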

                                                I sketched it in this post, calling it “semi-automation”:

                                                http://www.oilshell.org/blog/2020/02/good-parts-sketch.html

                                                and just added a link to:

                                                https://lobste.rs/s/lob0rw/replacing_make_with_shell_script_for

                                                (many code examples from others in that post, also almost every shell script in https://github.com/oilshell/oil is essentially that pattern)

                                                There are a lot of names for it, but many people seem to have converged on the same idea.

                                                I don’t have a link handy now, but GitHub had a standard like this in the early days. All their repos would have a uniform shell interface so that you could get started hacking on them quickly.

                                          3. 5

                                            You should investigate just (the command runner) for task running. It’s simple like make, but with none of make’s pitfalls for task running.

                                          1. 6

                                            It strikes me that if the FOSS brigade (and I count myself among them) zoomed out a little and focussed on the political scene that causes user freedoms to be infringed - i.e. vested interests of capital, copyright as enforcement of artificial scarcity - they might be more successful than they are by talking about user freedom in the abstract. It would perhaps push the argument outside their own niche and make it relevant to the people they dismiss as the “standard consumers of the world”.

                                            1. 8

                                              Completely agreed.

                                               Things like TPMs are a bit unfortunate because they have a lot of good uses. Secure boot is great for ensuring that I don’t have kernel-level malware if I can load my own signing keys. It’s then a building block that can prevent the TPM from disclosing my disk encryption key to anything other than an OS image that I trust and even to prevent it from disclosing my home directory’s encryption key to anything other than an OS image that I trust and that presents a PIN that I know. Combined with trusted I/O paths, it can even be used to prevent my home directory being unlocked unless I use a biometric sensor. For remote use, it can be used to store a private key for use by WebAuthn, which can use a derived key based on a TPM secret and the host so that the private key that’s used to sign into site X is completely distinct from the key I use to sign into site Y.

                                               The same technology can also be used to prevent me from playing Netflix streams unless my computer provides a remote attestation that guarantees that it will enforce rights limits that extend well beyond fair use.

                                               If you care about freedom, do not try to ban the technology. If you do then you’re easy to dismiss because the legitimate uses have huge benefits. The correct strategy is to advocate for laws that limit vigilante action by copyright holders that infringes on fair use rights and on the doctrine of first sale. I would advocate for laws that consider anything that is protected by DRM to be subject to trade secret law, not to copyright (and DRM is very useful for trade secrets because you want to store them only on devices that your company trusts and has some degree of control over). If you want copyright protections then you have to distribute your work in a way that respects the doctrine of first sale and fair use. If you instead distribute it with DRM then that’s fine: you get trade secret protections, and as soon as someone bypasses the DRM your work is in the public domain.

                                              1. 1

                                                I don’t think the author was advocating for banning TPMs. The thrust of the article seems to be that Windows 11 (and compatible hardware) is more harmful than Windows 10, so it is even more important to avoid it to preserve a market for alternatives. Small changes in market share can have small immediate effects, but building a movement to pass copyright reform is much more difficult.

                                                If we’re dreaming big, I would advocate for a complete overhaul of the copyright system and a ban on DRM.

                                                1. 1

                                                  The problem with this thrust is that, as a user, the majority of the TPM-backed features in Windows 11 are useful to me. Drive encryption with a protected key that has rate-limiting on the unlock attempts from a memorable PIN is useful. Being able to log into a load of web-based things without needing a password is useful. Being protected from boot-sector malware is useful. I’d love to have all of those be baseline features in any computer / OS.

                                                  I don’t like DRM (though I’ll note that yesterday I tried to stream Netflix to an AirPlay-enabled display from my 2013 MacBook Pro and it didn’t work, in spite of my Mac not having a TPM or Apple’s equivalent) but using DRM as the argument for not wanting trusted computing is conflating two arguments.

                                                  1. 1

                                                    It’s also convenient to use the most popular OS, which will be increasingly pre-installed on new computers. That doesn’t change the argument from the other side. Different people will have different calculations based on how much they value convenience or useful features vs. avoiding the future where PCs are as locked down as a phone or console.

                                                    using DRM as the argument for not wanting trusted computing is conflating two arguments.

                                                    I don’t think it’s a conflation if investment in trust-us computing increases the reach of DRM.

                                            1. 6

                                              Nice diagrams

                                              1. 6
                                                1. 4

                                                  Thank you for the kind words! :) I worked pretty hard on making them look consistent and clean.

                                                1. 1

                                                  Some of my colleagues will be talking about VPP at this! Can’t wait to watch them.

                                                  1. 1

                                                    VPP?

                                                    1. 1

                                                      It’s a high performance network packet processor. At $JOB we are using it as a programmable data plane.

                                                  1. 13

                                                    Deno is an impressive project. But importing URLs rather than some abstraction (package names, “@vendorname/packagename”, reverse DNS notation, anything) is a no-go for me. I 100% do not want a strict association between the logical identifier of a package and a concrete piece of internet infrastructure with a DNS resolver and an HTTP server. No thank you. I hate that this seems to be a trend now, with both Deno and Go doing it.

                                                    1. 8

                                                       package names, “@vendorname/packagename” and reverse DNS notation, whether in systems like Maven or NPM, are just abstractions for DNS resolvers and HTTP servers, but with extra roundtrips and complexity. The idea is to get rid of all those abstractions and provide a simple convention: whatever the URL is, it should never change its contents, so the toolchain can cache it.

                                                      Any http server with static content can act as an origin for getting your dependencies. It could be deno.land/x, raw.githubusercontent.com, esm.sh, your own nginx instance, any other thing, or all of those options combined.

                                                      1. 21

                                                        Package identifiers are useful abstractions. With the abstraction in place, the package can be provided by a system package manager, or the author can change their web host and no source code (only the name -> URL mapping) needs to be changed. As an author I don’t want to promise that a piece of software will always, forevermore, be hosted on some obscure git host, or to promise that I will always keep a particular web server alive at a particular domain with a particular directory structure, I want the freedom to move to a different code hosting solution in the future, but if every user has the URL in every one of their source files I can’t do that. As a result, nobody wants to take the risk to use anything other than GitHub as their code host.

                                                        With a system which uses reverse DNS notation, I can start using a library com.randomcorp.SomePackage, then later, when the vendor stops providing the package (under that name or at all) for some reason, the code will keep working as long as I have the packages with identifier com.randomcorp.SomePackage stored somewhere. With a system which uses URLs, my software will fail to build as soon as randomcorp goes out of business, changes anything about their infrastructure which affects paths, stops supporting the library, or anything else which happens to affect the physical infrastructure my code has a dependency on.

                                                        The abstraction does add “complexity” (all abstractions do), but it’s an extremely useful abstraction which we should absolutely not abandon. Source code shouldn’t unnecessarily contain build-time dependencies on random pieces of physical Internet infrastructure.

                                                        That’s my view of things anyways.

                                                        1. 8

                                                          As an author I don’t want to promise that a piece of software will always, forevermore, be hosted on some obscure git host, or to promise that I will always keep a particular web server alive at a particular domain with a particular directory structure. I want the freedom to move to a different code hosting solution in the future.

                                                           The same applies to Maven and npm: repositories are coded into the project (or the default repository is defined by the package manager itself). If a host dies and you need to use a new one, you’ll need to change something.

                                                          What happens if npm or jcenter.bintray.com stops responding? Everyone will have to change their projects to point at the new repository to get their packages.

                                                          but if every user has the URL in every one of their source files I can’t do that. As a result, nobody wants to take the risk to use anything other than GitHub as their code host.

                                                          In Deno you can use an import map (And I encourage everyone to do so): https://deno.land/manual/linking_to_external_code/import_maps so all the hosts are in a single place, just one file to look at when a host dies, just like npm’s .npmrc.
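
                                                           For illustration, an import map is just a small JSON file mapping bare names (or prefixes) to URLs, so the hosts live in one place (the module names and versions here are made up):

                                                           {
                                                             "imports": {
                                                               "fmt/": "https://deno.land/std@0.120.0/fmt/",
                                                               "my_lib": "https://mirror.example.com/my_lib@1.0.0/mod.ts"
                                                             }
                                                           }

                                                           Running with something like deno run --import-map=import_map.json main.ts then resolves those bare specifiers through that single file.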

                                                          There are lockfiles, too: https://deno.land/manual@v1.18.0/linking_to_external_code/integrity_checking#caching-and-lock-files.

                                                          And another note: It’s somewhat typical for companies to have an internal repository that works as a proxy for npm/maven/etc and caches all the packages in case some random host dies, that way the company release pipeline isn’t affected. Depending on the package manager and ecosystem, you’ll need very specific software for implementing this (Verdaccio for npm, for example). But with Deno, literally any off-the-shelf HTTP caching proxy will work, something way more common for systems people.

                                                          Source code shouldn’t unnecessarily contain build-time dependencies on random pieces of physical Internet infrastructure.

                                                          That’s right, but there are only two ways to make builds from source code without needing random pieces of physical internet infrastructure, and these apply for all package management solutions:

                                                          • You have no dependencies at all
                                                          • All your dependencies are included in the repository

                                                          The rest of solutions are just variations of dependency caching.

                                                        2. 5

                                                          Although this design is simpler, it has a security vulnerability which seems unsolvable.

                                                          The scenario:

                                                          1. A domain expires that was hosting a popular package
                                                          2. A malicious actor buys the domain and hosts a malicious version of the package on it
                                                          3. People who have never downloaded the package before, and therefore can’t possibly have a hash/checksum of it, read blog posts/tutorials/StackOverflow answers telling them to install the popular package; they do, and get compromised.

                                                          It’s possible to prevent this with an extra layer (e.g. an index which stores hashes/checksums), but I can’t see how the “URL only” approach could even theoretically prevent this.

                                                          1. 2

                                                            I think the weak link there is people blindly copy-pasting code from StackOverflow. That opens the door to a myriad of security issues, not only for Deno’s approach.

                                                             There are plenty of packages in npm with names very similar to legit, popular packages, where maybe just a letter, an underscore, or a number differs. Enough for many people to install the wrong package, a malicious one, and get compromised.

                                                             Same applies to domain names. Maybe someone buys den0.land and just writes it as DEN0.LAND in a forum online, because domains are case-insensitive anyway and the zero can hide better.

                                                            Someone could copy some random Maven host from StackOverflow and get the backdoored version of all their existing packages in their next gradle build.

                                                             Sure, in that sense, Deno is more vulnerable because of the decentralisation of package-hosting domains. It’s easier for everyone to know that the “good” domain is github.com or deno.land. If any host could be a good host, any host could mislead and become a bad one, too.

                                                             For npm, we depend entirely on the fact that the domain won’t get taken over by a malicious actor without anyone noticing. I think people will end up organically doing the same and getting their dependencies mostly from well-known domains like github.com or deno.land, but I think it’s important to have the option to not follow this rule strictly and have some freedom.

                                                            EDIT:

                                                            Apart from the “depending on very well-known and trusted centralised services” strategy, something more could be done to address the issue. Maybe there’s something about that in Deno 2 roadmap when it gets published. But fighting against StackOverflow blind copy-pastes is hard.

                                                            1. 1

                                                              What about people checking out an old codebase on a brand new computer where that package had never been installed before?

                                                              I dunno, this just feels wrong in so many ways, and there are lots of subtle issues with it. Why not stick to something that’s been proven to work, for many language ecosystems?

                                                              1. 1

                                                                What about people checking out an old codebase on a brand new computer where that package had never been installed before?

                                                                That’s easily solved with a lockfile, just like npm does: https://deno.land/manual/linking_to_external_code/integrity_checking

                                                                Why not stick to something that’s been proven to work, for many language ecosystems?

                                                                 Well, the centralized model has many issues. Currently, every piece of software that runs on node, to be distributed, has to be hosted by a private company owned by Microsoft. That’s a single point of failure, and a whole open source ecosystem relying on a private company and a private implementation of the registry.

                                                                Also, do you remember all the issues with youtube_dl on GitHub? Imagine something similar in npm.

                                                                Related to the topic: https://www.youtube.com/watch?v=MO8hZlgK5zc

                                                                1. 3

                                                                  Good points those. I hadn’t considered that!

                                                                  The single point of failure is not necessarily inherent to the centralized model though. In CHICKEN, we support multiple locations for downloading eggs. In Emacs, the package repository also allows for multiple sources. And of course APT, yum and Nix allow for multiple package sources as well. If a source goes rogue, all you have to do is remove it from your sources list and switch to a trustworthy source which mirrored the packages you’re using.

                                                            2. 2

                                                              Seems like you might want to adopt a “good practice” of only using URL imports from somehow-trusted sources, e.g. npmjs.com or unpkg or whatever.

                                                              Could have a lint rule for this with sensible, community-selected defaults as well.

                                                              1. 2

                                                                 If the code that contains the URL also includes a hash of the content, then, assuming the hash isn’t broken, it avoids this problem.

                                                                i.e.:

                                                                import mypackage.org/code/mypackage-v1.1#HASH-GOES-HERE
                                                                

                                                                You get the URL and the hash, problem solved. You either get the code as proved by the hash or you don’t get the code.

                                                                The downside is, you then lose auto-upgrades to some new latest version, but that is usually a bad idea anyway.

                                                                 Regardless, I’m in favour of 1 level of indirection, so in X years when Github goes out of business (because we all moved on to new source control tech), people can still run code without having to hack up DNS and websites and everything else just to deliver some code to the runtime.

                                                                1. 1

                                                                  This is a cool idea, although I’ve never heard of a package management design that does this!

                                                                  1. 1

                                                                     Agreed, I don’t know of anyone that does this either. In HTTPS land, we have SRI that is in the same ballpark, though I imagine the # of sites that use SRI can be counted on 1 hand :(

                                                                2. 1

                                                                   It’s sort of unlikely to happen often in Go because most users use github.com as their URL. It could affect people using their own custom domain though. In that case, it would only affect new users. Existing users would continue to get the hash-locked versions they saved before from the Go module server, but new users might be affected. The Go module server does, I think, have some security checks built into it, so I think if someone noticed the squatting, they could protect new users by blacklisting the URL in the Go module server. (https://sum.golang.org says to email security@golang.org if it comes up.)

                                                                  So, it could happen, but people lose their usernames/passwords on NPM and PyPI too, so it doesn’t strike me as a qualitatively more dangerous situation.

                                                                3. 2

                                                                   Whatever the URL is, it should never change its contents, so the toolchain can cache it.

                                                                  Does Deno do anything to help with that or enforce it?

                                                                  In HTML you can apparently use subresource integrity:

                                                                  https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity

                                                                  <script src="https://example.com/example-framework.js"
                                                                          integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"
                                                                          crossorigin="anonymous"></script>
                                                                  

                                                                  It would make sense for Deno to have something like that, if it doesn’t already.

                                                                  1. 2

                                                                    Deno does have something like that. Here’s the docs for it: https://deno.land/manual@v1.18.0/linking_to_external_code/integrity_checking.

                                                                    1. 2

                                                                      OK nice, I think that mitigates most of the downsides … including the problem where someone takes over the domain. If they publish the same file it’s OK :)

                                                                4. 2

                                                                   While I agree with your concern in theory, in the Go community this problem is greatly mitigated by two factors:

                                                                  • Most personal projects tend to just use their GitHub/GitLab/… URLs.
                                                                  • The Go module proxy’s cache is immutable, meaning that published versions cannot be changed retrospectively.

                                                                  These two factors combined achieve the same level of security as a centralized registry. It is possible that Deno’s community will evolve in the same direction.

                                                                  1. 1

                                                                    change your hosts file

                                                                  1. 19

                                                                    Are these rants ever going to end?

                                                                    1. 1

                                                                      I did something similar once! Nice to see somebody actually write about the process rather than getting bored after making it vaguely work like I did 😁

                                                                      I wanted to start implementing filters, but realised that was beyond my DSP knowledge, so started reading a DSP textbook and quickly got overwhelmed.

                                                                      https://gitlab.com/mso42/harmony

                                                                      1. 2

                                                                        Nice project! I’ve had something similar living in the back of my head for a while. Shame it doesn’t work on X11, but X11 being what it is, well… I don’t blame the author.

                                                                        1. 3

                                                                          Now that I think of it, it shouldn’t be too hard to hack up an X11 version with xdotool and an i3 mode….

                                                                          1. 1

                                                                            nomouse might help (though it doesn’t support click…)

                                                                          2. 2

                                                                             the feature is built into X: https://en.wikipedia.org/wiki/Mouse_keys

                                                                          1. 3

                                                                            I assume this means we can have a std that is unsafe free, since that would effectively be “pushed down” one level into Rustix instead?

                                                                            1. 15

                                                                               Yes, this will reduce the amount of unsafe in std.

                                                                               On the other hand, my impression is that reducing the amount of unsafe in std is a non-goal, or at least a low priority. Some people even think the amount of unsafe in std should be increased (so that the unsafe code can be properly audited and the rest of the ecosystem can use less unsafe).

                                                                               std is very performance-sensitive and contains lots of unsafe for performance, even in code that could be written unsafe-free. BTree is a good example: it is known that an std-compatible BTree can be written without any unsafe, but it is a little bit slower. So the std BTree is full of unsafe.

                                                                              1. 4

                                                                                 Performance-related unsafe aside, surely it is a goal to have as small an amount of unsafe in std as possible?

                                                                                 Without knowing the exact details, my gut feeling is that merging Rustix into std would maybe make it easier to review the use of unsafe. Currently, reviewing something like std::fs means I need to know what libc does, verify that the mapping between std and libc is correct, and trust that libc does the right thing against the kernel (which hopefully isn’t too much of a problem).

                                                                                1. 12

                                                                                  I think people are unwilling to delete unsafe if it causes 1% performance regression. That’s what I meant about low priority. Priority is about tradeoff.

                                                                                  1. 1

                                                                                    According to the article, rustix-based std is actually faster than current unsafe for some APIs.

                                                                                    1. 2

                                                                                       For OS calls, yes. But you won’t get that speed for data structures like B-trees - and if you’re removing unsafe there, “why not in both places?” will be the question. I think Rust isn’t ready to abandon libc and its stable experience for something that’s weeks to months old and has to be maintained for every arch, while libc already works and has more eyes/testers on its code. Especially since Go already tried to replace libc and failed in the end. Maybe in the future; I could imagine it’ll be easy to switch to this as the std, but for now there are probably bigger issues.

                                                                                      1. 9

                                                                                        Also, Linux is the only OS where raw syscalls are The Official Public API. Pretty much everywhere else you are strongly advised NOT to abandon libc.

                                                                                        1. 1

                                                                                          I believe OpenBSD even kills the process if it tries to talk syscalls directly without going through its libc

                                                                                          1. 1

                                                                                            Yep

                                                                                  2. 3

                                                                                     One of the deciding factors of “does this go in std or not” is whether it needs unsafe to run well. This is one reason PRNGs and regexes aren’t in std.

                                                                              1. 1

                                                                                Yes, and throw in wanting different DPI scalings on the different screens as well (impossible by design under X, not sure about Wayland).

                                                                                1. 3

                                                                                  impossible by design under X

                                                                                  What design is that?

                                                                                   X, in fact, supports (at least) three different ways to do it:

                                                                                   1. Run separate “screens”. This is the original way, but it doesn’t let windows move from one to the other; they must be destroyed and recreated on the other one, meaning a window would jump across the transition instead of being simply dragged over, and it would require cooperation from the application. A bit of a pain.
                                                                                   2. Have xrandr connect the screens, plus use a compositor to upscale everything on the lower-DPI monitor to the higher one, or vice versa. (This, btw, is also what Wayland chose to do.) Relatively easy to set up, and it can be done without the application’s cooperation.
                                                                                   3. Have xrandr connect the screens and then present the size difference to the application and let it scale itself. This requires cooperation from the application again.
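
                                                                                   For example, the second approach can be approximated with something like this (output names and modes are whatever xrandr reports on your machine; exact steps vary by toolkit and desktop):

                                                                                   # tell toolkits to render everything at 2x (one common way; varies by desktop)
                                                                                   echo "Xft.dpi: 192" | xrdb -merge

                                                                                   # DP-1: 4K panel at native resolution; HDMI-1: 1080p panel scaled 2x2 so the
                                                                                   # 2x-rendered UI comes out normal-sized on it
                                                                                   xrandr --output DP-1 --mode 3840x2160 --pos 0x0 \
                                                                                          --output HDMI-1 --mode 1920x1080 --scale 2x2 --pos 3840x0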

                                                                                   It might not be easy, but it certainly isn’t impossible. I remember when getting X running at all was a bit of a pain. Then they did some work and actually made it easy.

                                                                                  1. 2

                                                                                    This very recently bit me and I ended up going back to Windows (among other reasons). I have a 1440p monitor and a 4K vertical monitor and trying to get mixed incremental scaling was a bit of a nightmare. And when I did manage to get it working under (GNOME) Wayland, it was horribly laggy on the 4K display (getting it going on X11 was basically a no go with the amount of effort required).

                                                                                  1. 1

                                                                                    Huh, I was just this morning thinking how this would be a great idea. It would be cool to add Dropbox as a backend!

                                                                                    1. 5

                                                                                      Site is unusable without Javascript, unfortunately

                                                                                        1. 4

                                                                                          I ignored the warnings and tried to run this in Firefox, and it caused a hard crash of the whole browser. I’m impressed.

                                                                                          1. 1

                                                                                            I wonder which of the many subsystems being abused failed.

                                                                                            1. 1

                                                                                              Worked on my Firefox/MacOS, but with very low framerate.

                                                                                              1. 1

                                                                                                It definitely will work on iOS/MobileSafari with a few tweaks, too. The video takes over the page, and input is definitely not meant for mobile. But if I go back to the page (which pauses the video and thus the game) the checkboxes are updated correctly.

                                                                                            1. 1

                                                                                              Oh hey, Eugenia Cheng taught me in my first year of my undergrad at Sheffield. I remember her dropping hints about intuitionism while teaching us number theory.

                                                                                              1. 5

                                                                                                I really dislike that this mixes the concerns of command parsing and command handling. For me, one of the main advantages of Haskell is that it allows you to express your program model as an ADT, so that you can do verification (by testing or by proof) based on the type-space of that ADT.

                                                                                                This eliminates the ADT in favour of OOP-style objects that encapsulate behaviour behind a shared interface (IO ()), which, you know, is sometimes the best tool for the job, but if that’s the case you might as well write it in a more suitable language.

                                                                                                1. 1

                                                                                                  What would be more suitable? I find Haskell is very good at this kind of thing.

                                                                                                  1. 1

                                                                                                    Well, you can write OOP Haskell if you like, I suppose it is a general-purpose language after all. But I mean, you could do it in Python, TypeScript, C#, Java… anything you like really

                                                                                                    1. 2

                                                                                         But none of those other options have the nice type system that Haskell does. Or the nice separation of pure/impure. Or the nice concurrency system from GHC. Etc etc

                                                                                                      1. 4

                                                                                           Yeah, don’t worry about the ivory tower people and just write it the way you want to. Haskell has had a culture of 1^👨🚢 that is not entirely productive and often intimidating for newcomers. The style of programming cyberia advocates is really nice for a variety of reasons but it is also a bit dry and boilerplate-y when you haven’t arrived there yet.

                                                                                                        1. 1

                                                                                                          1^👨🚢

                                                                                                          Ahh, one-upmanship. Took me a while to get that.

                                                                                                        2. 3

                                                                                                          OK but my point is if you’re just calling everything IO (), then you’re getting none of the benefits of the type system and everything is impure anyway, so why bother?

                                                                                                  1. 5

                                                                                                    Excited to use the pattern matching! I hadn’t been keeping up with where it was at, so I’m pleasantly surprised it’s landed.

                                                                                                    1. 4

                                                                                                      It’s funny, the iterator example with zip and flatmap etc. is much more clear to me than the suggested improvement of dancing around the 2D array indices in a nested for loop.

                                                                                                      A matter of taste, really.